top | item 39773342

(no title)

maxbrunsfeld | 1 year ago

Did you open an issue for the segfault you got, on the Tree-sitter repo or the repo for the particular parser you were using?

It's not really accurate to say that Tree-sitter segfaults constantly. The Tree-sitter CI performs randomized fuzz testing based on a bunch of popular grammars, randomly mutating entries from their test corpus. If you have a reproduction case for the segfault, it'd be useful to report.

Since you mention NPM, it sounds like you may be talking about a segfault that's specific to the Tree-sitter Node.js bindings, but it's hard to be sure.

discuss

order

dumbo-octopus|1 year ago

No, I did not open an issue. When I checked the issues for tree-sitter-javascript, I saw the most recent issue was one reporting the same thing (segfault) from some weeks ago, with no attention. It still has had no attention, so I don't think reporting would have done much.

Anyways, since you're here, I dug a bit more into this to make a more useful report. Starting with v0.20.2, the following file will cause a segfault when parsed using tree-sitter-javascript: https://github.com/tursodatabase/libsql/blob/main/libsql-sql... . It worked fine in v0.20.1, and it's still broken with the latest v0.20.4. Based on the diff here: https://github.com/tree-sitter/tree-sitter-javascript/compar..., I don't on the surface see a way to dig deeper into this than trying to read through a 170,000 line (!!!) diff to parser.c.

And looking at that diff raises another complaint: the names of parser nodes must be considered part of the public API, as they are exposed in descendantsOfType, .type, etc., but they are 100% not documented anywhere, and are liable to change without notice in patch version bumps. This makes developing against it a massive pain, as any version increase is liable to break any code expecting a particular nomenclature.

I don't mean to dunk on your project, I'm sure it solves some problems very well. But it is remarkably difficult to confidently depend on in a production environment.