These ideological decisions don't sound very pragmatic. There's a lot of open-source prior art in this space (OpenGrok, Kythe, Sourcegraph) that supports most major languages and has annotation output formats broadly similar to this JSON file, and you could still let users run indexers for smaller languages as part of CI.
> There does not exist any widely available standalone C parsing library to provide C programs with access to an AST. There’s LLVM, but I have a deeply held belief that programming language compiler and introspection tooling should be implemented in the language itself. So, I set about to write a C parser from scratch.
Even if you prefer to write your C indexer in C, you could use LLVM's C [1] or Python [2] APIs. Plus, you can handle C++ without having to implement your own C++ parser from scratch, which is a much larger undertaking than C99 plus a few GNU extensions.
One problem with OpenGrok et al. is scale. I already have a service which is designed to run arbitrary user tasks in an environment configured for their project's needs, so I wanted something that could take advantage of that.
As for parsing C++, since LLVM is written in C++, using it to write a C++ annotator would be a natural fit :) But C and C++ are different languages, and I don't wish to require LLVM to deal with C. LLVM is one of the largest open source projects on the net, and it requires a lot more complexity and compile time to utilize under these circumstances. On the other hand, I came up with a solution which is <1,300 lines of code and won't grow much more as it expands to support a broader set of C extensions.
There does exist prior art, but I deliberately chose to go with the lowest common denominator to provide support for a lot of use-cases we can't predict in an environment which gives users more control over its behavior. I think over time it will be pretty easy to plug the prior art into this system, but harder to plug their systems into novel use-cases. The existing solutions are not always the best, but I did put in a lot of research time to validate that assumption.
GitHub also recently open-sourced their Haskell-based Semantic, which annotates and cross-references a whole bunch of languages (all the languages any of our clients use), and is built on tree-sitter, so there's, like, several levels of prior art available here.
On SourceHut it's less language-aware and more generic, which makes it more useful for a wider range of use-cases. However, I could totally see a tool being possible which converts LSIF files into SourceHut annotations.
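To make the LSIF idea concrete, here is a rough sketch of such a converter. It assumes a simplified LSIF dump (one JSON object per line, using only `document` and `range` vertices and `contains` edges), and the output field names (`lineno`, `colno`, `len`) are hypothetical placeholders, not a confirmed SourceHut schema.

```python
import json


def lsif_to_annotations(lsif_lines):
    """Group single-line LSIF ranges by the document that contains them."""
    documents = {}  # vertex id -> document URI
    ranges = {}     # vertex id -> range vertex
    contains = []   # (source vertex id, [target vertex ids])

    for line in lsif_lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        kind, label = obj.get("type"), obj.get("label")
        if kind == "vertex" and label == "document":
            documents[obj["id"]] = obj["uri"]
        elif kind == "vertex" and label == "range":
            ranges[obj["id"]] = obj
        elif kind == "edge" and label == "contains":
            contains.append((obj["outV"], obj.get("inVs", [])))

    annotations = {}
    for source_id, target_ids in contains:
        uri = documents.get(source_id)
        if uri is None:
            continue  # edge does not start at a document vertex
        for target_id in target_ids:
            rng = ranges.get(target_id)
            if rng is None:
                continue
            start, end = rng["start"], rng["end"]
            if start["line"] != end["line"]:
                continue  # this sketch only handles single-line ranges
            annotations.setdefault(uri, []).append({
                # Hypothetical output fields; LSIF positions are zero-based.
                "lineno": start["line"] + 1,
                "colno": start["character"] + 1,
                "len": end["character"] - start["character"],
            })
    return annotations
```

A real converter would also need to follow the hover and definition edges to decide what each annotation links to; this only recovers where the ranges are.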
For Python, dig up the old PySonar project (the author took it down for some reason, but there are mirrors/archives where you can find versions). Oh hey, it looks like it has been resurrected: https://github.com/yinwang0/pysonar2. It might have been superseded by something else in the meantime, I dunno.
It was the basis for Google's internal Python annotations thingy, and it fscking rocks.
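For a sense of how small a user-run indexer for one language can be, here is a minimal sketch using only Python's standard `ast` module. It is not how PySonar works (PySonar does real type inference); it just lists where functions and classes are defined, and the record fields (`kind`, `name`, `lineno`) are hypothetical.

```python
import ast
import json


def index_definitions(source):
    """Return one record per function or class definition in `source`."""
    records = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            records.append({
                # Hypothetical field names, chosen for illustration only.
                "kind": "class" if isinstance(node, ast.ClassDef) else "function",
                "name": node.name,
                "lineno": node.lineno,  # 1-based line of the def/class statement
            })
    return sorted(records, key=lambda record: record["lineno"])


if __name__ == "__main__":
    sample = "class Repo:\n    def push(self):\n        pass\n\ndef main():\n    pass\n"
    print(json.dumps(index_definitions(sample), indent=2))
```

Something like this could run as a CI task and emit its JSON for the code browser to pick up, which is the "indexers running as part of CI" model discussed above.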
So excited to use this once my requirements are implemented (mostly just LFS, and to a lesser extent Merge(|Pull) Requests). Admittedly I don't need it, I just really appreciate the simplistic UI and straightforward pricing model.
LFS support is something I'd like to do, but It's Complicated(TM). Main challenges include finding a good place with ample bandwidth and storage, figuring out where/how to take backups of it, and measuring bandwidth and storage usage to integrate with billing. Not a priority right now, but may land between the beta and stable periods.
As for merge requests, don't hold your breath. SourceHut embraces the email-based model. A tutorial is available here to give you an idea of how it works: https://git-send-email.io and check out this video for the maintainer's side: https://aerc-mail.org/
The advantages of email include:
- It's based on venerable and well-understood standards, with ample open-source tooling available
- It's decentralized, federated, and highly fault tolerant
- It doesn't lock you into my platform; you have ownership over your content and can freely interact with projects anywhere
It's also easy and natural to review code by writing emails, and by far the most efficient workflow for git collaboration I've used (having extensively worked in email, GitHub, GitLab, and Gerrit). I think you should give it a chance!
Yep, you could. One thing which would be super cool is highlighting a snippet of code, entering an annotation, and having it uploaded to git.sr.ht right there.
It has a nice summary of what's cool about it. It's very lightweight on the UI but it's actually very featureful, and includes mailing lists, CI service, etc.
I assume you're the same person I've been talking to on Lobsters. Clarification: they're talking about git push over HTTPS, which is deliberately unsupported in favor of the more secure SSH push option. git.sr.ht doesn't even have access to your password, so if the server is compromised then the attacker can't dump password hashes.
[+] [-] Scaevolus|6 years ago|reply
[1]: https://github.com/llvm-mirror/clang/blob/fb2a26cc2e40e007f1... [2]: https://github.com/llvm-mirror/clang/blob/master/bindings/py...
[+] [-] Sir_Cmpwn|6 years ago|reply
[+] [-] tptacek|6 years ago|reply
[+] [-] iso-8859-1|6 years ago|reply
https://code.visualstudio.com/blogs/2018/12/04/rich-navigati...
[+] [-] Sir_Cmpwn|6 years ago|reply
[+] [-] carapace|6 years ago|reply
[+] [-] asdkhadsj|6 years ago|reply
[+] [-] Sir_Cmpwn|6 years ago|reply
[+] [-] rjeli|6 years ago|reply
[+] [-] nerdponx|6 years ago|reply
Conceivably someone could write an offline annotation viewer/editor as well? I would love for something like that to catch on.
Imagine Emacs and PyCharm plugins for viewing and editing these annotations, for example.
[+] [-] Sir_Cmpwn|6 years ago|reply
[+] [-] imagiko|6 years ago|reply
[+] [-] Sir_Cmpwn|6 years ago|reply
https://sourcehut.org
[+] [-] woodrowbarlow|6 years ago|reply
[+] [-] 0xDEFC0DE|6 years ago|reply
[+] [-] svnpenn|6 years ago|reply
[+] [-] Sir_Cmpwn|6 years ago|reply
[+] [-] jefurii|6 years ago|reply
[+] [-] crispyporkbites|6 years ago|reply
This random HN commenter says yes!
[+] [-] NetOpWibby|6 years ago|reply