maxbrunsfeld | 1 year ago
It's not really accurate to say that Tree-sitter segfaults constantly. The Tree-sitter CI performs randomized fuzz testing based on a bunch of popular grammars, randomly mutating entries from their test corpus. If you have a reproduction case for the segfault, it'd be useful to report.
Since you mention NPM, it sounds like you may be talking about a segfault that's specific to the Tree-sitter Node.js bindings, but it's hard to be sure.
dumbo-octopus | 1 year ago
Anyways, since you're here, I dug a bit more into this to make a more useful report. Starting with v0.20.2, parsing the following file with tree-sitter-javascript causes a segfault: https://github.com/tursodatabase/libsql/blob/main/libsql-sql... . It worked fine in v0.20.1, and it's still broken in the latest v0.20.4. Looking at the diff here: https://github.com/tree-sitter/tree-sitter-javascript/compar..., I don't see an obvious way to dig deeper short of reading through a 170,000-line (!!!) diff to parser.c.
And looking at that diff raises another complaint: the names of parser nodes must be considered part of the public API, as they are exposed in descendantsOfType, .type, etc., but they are 100% not documented anywhere, and are liable to change without notice in patch version bumps. This makes developing against it a massive pain, as any version increase is liable to break any code expecting a particular nomenclature.
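For what it's worth, the breakage can at least be caught early with a check along these lines. This is a rough sketch, not anything the project provides: it assumes the grammar package ships the generated src/node-types.json file (a JSON array of objects with "type" and "named" keys), and the function name, the sample data, and the expected set are all made up for illustration. In a real setup you would read the file from the installed grammar instead of the inline stand-in.

```python
import json


def missing_node_types(node_types_json: str, expected: set[str]) -> set[str]:
    """Return the expected named node types that the grammar does not define.

    node_types_json is the text of a grammar's node-types.json file.
    expected is the set of node type names our own code relies on.
    """
    entries = json.loads(node_types_json)
    # Keep only named nodes; anonymous tokens such as "(" are skipped.
    defined = {entry["type"] for entry in entries if entry.get("named")}
    return expected - defined


# Inline stand-in for a grammar's src/node-types.json (illustrative data only).
SAMPLE = json.dumps([
    {"type": "function_declaration", "named": True},
    {"type": "identifier", "named": True},
    {"type": "(", "named": False},
])

# Hypothetical list of node names that downstream code depends on.
EXPECTED = {"function_declaration", "identifier", "arrow_function"}

if __name__ == "__main__":
    # Any name reported here was renamed, removed, or never existed.
    print(sorted(missing_node_types(SAMPLE, EXPECTED)))
```

Running something like this in CI after every grammar upgrade turns a silent rename into a failing test, which is about the best you can do without a stability guarantee on the names.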
I don't mean to dunk on your project, I'm sure it solves some problems very well. But it is remarkably difficult to confidently depend on in a production environment.