Sourcegraph CEO here. Sourcegraph is now 2 separate products: code search and Cody (our code AI). Cody remains open source (Apache 2) in the client/cody* directories in the repository, and we're extracting that to a separate 100% OSS repository soon.
Sourcegraph only provided non-OSS images, and the build process was difficult and broken for a long time. The application itself was frequently broken in the OSS version as well; searching the issues for a few minutes brings up quite a few results. [1] [2] [3] [4]
For what it's worth, I'd advocated to adopt SourceGraph at work for a long time but the open-source version being impossible to deploy essentially blocked us from ever considering it.
> charge companies while making tools for individual devs open source
I am one of the few people who used the open source version and really liked it, and I'm disappointed by the changes.
Also, I appreciate all the comments here and find them fair and thoughtful, including the critical stuff. You can join our Discord at https://discord.gg/rDPqBejz93 to chat more after this goes off the HN frontpage. And if anyone wants to chat with me directly to share feedback or complaints, let me know (and we can share the recording publicly if you're OK with it).
I asked for an enterprise trial once because my company was seriously considering a purchase, despite not being able to properly evaluate the OSS version.
Great, straightforward, non-apologetic answer, props. Refreshing compared to what we often get here nowadays, and reminiscent of how CEOs used to reply here.
Do you see any potential trademark conflicts you may run into against Google due to Codey[0]? I don't know which was announced/branded first, but I imagine a big co like Google is tough to win against even when you're in the right.
https://oracle.github.io/opengrok/ is open source and very good with huge source bases, e.g. the whole of Android and the Linux kernel together. It is fast and useful.
I'll add something I have been working on https://github.com/boyter/cs which is aimed at a smaller scale. It works fine for multiple repositories so long as they aren't too large.
livegrep is... fine. It's literally what it says it is. It's a web version of grep. livegrep is definitely not a replacement for sourcegraph, which actually understands the underlying code, and lets you follow code paths, search for references, etc.
Never found a startup on the premise that someone else's product will be inadequate forever.
> The recent rewrite of github search has probably made sourcegraph irrelevant.
Did anyone actually use the open version? I dimly remember looking into it 2-3 years ago, but all the really interesting stuff was not included. The pricing for enterprise was absolutely bonkers, something like $100 per developer per month, which made it clear that they were only targeting big players with infinite budgets. The pricing seems to have changed, and it now "starts at 5k/year" for some "Enterprise Starter" edition, but despite lots of bullet points it is very unclear to me what the limitations really are. I'm actually really interested in this product and it might be a good addition to our tool set; it's a shame the pricing is so opaque.
Their support in the demo period sucked, their complex C++ support was lacking, they didn't integrate into modern C++ build systems well, and their prices were insane.
A few months back they removed the free enterprise license that allowed 10 dev seats. Some smaller companies were holding back updates and looking at the OSS version. I guess not anymore.
I've never in general been a fan of "open core" products.
What is a good open-source system for code search if I want to plug 100 or so git repos into it and have it available over the web? GH search is not desirable because it would search too broadly and would not cover repos on Gitlab etc.
Yet another concrete example of why copyleft licenses are better than pushover ones and why CLAs are bad. It would have been illegal for them to do this if the old license were copyleft and they accepted contributions without a CLA.
I don't understand your comment. What do you think would have happened if it was GPL instead of Apache? That a person would come out of nowhere willing to rewrite all the SourceGraph owned code in the repo?
The crude reality is that development tools cost real money to develop, and we are no longer in an environment where VCs shower companies with bags of money without some really down-to-earth, concrete plan for profitability.
Companies like Microsoft and Google can afford the luxury of keeping projects like VSCode and Golang open source. The economics make sense for them. Not all companies can do that, especially small startups.
I remember a time when buying a C compiler cost money, real money. I don’t think we are ever going back to that, but I also think that paid tools with enterprise pricing are back.
I don’t care about the morality of that, and I am arguing neither for nor against it. It is just a fact, a seismic shift that we can’t really stop.
I've worked at more than one place that considered sourcegraph and decided it was too expensive (and these were software shops with money to spend on good tooling). With language servers working so well now, I think SG may have already missed the boat and this looks like an early part of their death spiral.
I remember having used a Red Hat (?) tool back in 2002 for understanding the source code of the Brazilian voting machine so we could more easily port it to Windows CE (the 2002 model ran on it initially, then on Linux from 2004-ish). It had a very Motif-like interface. Does anyone else remember its name?
[+] [-] sqs|2 years ago|reply
Our licensing principle remains to charge companies while making tools for individual devs open source. Very few individual devs (or companies) used the limited-feature open-source variant of code search, so we decided to remove it. Usage of Sourcegraph code search was even more skewed toward our official non-OSS build than in other similar situations like Google Chrome vs. Chromium or VS Code vs. VSCodium. Maintaining 2 variants was a burden on our engineering team that had very little benefit for anyone.
You can see more explanation at https://github.com/sourcegraph/sourcegraph/issues/53528#issu.... The change was announced in the changelog and in a PR (all of our development occurs in public), and we will have a blog post this week after we separate our big monorepo into 2 repos as planned: the 100% OSS repo for Cody and the non-OSS repo for code search.
You can still use Sourcegraph code search for free on public code at https://sourcegraph.com and on our self-hosted free tier on private code (which means individual devs can still run Sourcegraph code search 100% for free). Customers are not affected at all.
[+] [-] CAP_NET_ADMIN|2 years ago|reply
It's no wonder that usage of the OSS version was pretty low when few were able to build it, and even if they managed that, the resulting application was broken every few releases.
Both VS Code and Chromium are easy to build and, due to their nature and popularity, are available prebuilt from many sources. I would install an "unofficial" Chromium build from my distribution's repository; I wouldn't keep my code in an unofficial Sourcegraph build from some random person on GitHub. Comparing them is rather unfair, but there's another issue that stopped OSS adoption.
For a long time, the official Sourcegraph Docker image came with a 10-seat free license, which suited many people, so they weren't looking for alternatives like the OSS build.
I would argue that announcing a license change and the closing of your product as a small block in a changelog file, or when someone mentions the problem in GitHub Issues, is not adequate for such a change.
Not using open-first principles, restricting the product with enterprise-only plugins (which others mentioned under this post), not providing open source builds, and changing the license without a preceding announcement, all while previously using open source terminology for some feel-good free marketing, leaves a bitter taste. Especially with so many companies doing this right now due to interest rates.
[1]: https://github.com/sourcegraph/sourcegraph/issues/43231
[2]: https://github.com/sourcegraph/sourcegraph/issues/43203
[3]: https://github.com/sourcegraph/sourcegraph/issues/6790
[4]: https://github.com/sourcegraph/sourcegraph/issues/6783
[+] [-] ynx|2 years ago|reply
I don't doubt that your perception of the OSS version's lack of success is accurate, and it is definitely easier to close off the source, but at the same time the outcome of this is that one funnel into it is closing with the calculation that the effort spent to keep that funnel open wasn't worth the people coming in through it.
The other possibility, and one that I subscribe to (not that it isn't self-serving) is that the funnel was never open enough to see success in the first place.
[+] [-] xenago|2 years ago|reply
Stop using the term Open Source. It's not open source if you apply restrictions like this, it's pretty easy to see that you're being disingenuous. These licenses are not OSI approved.
[+] [-] jwmcglynn|2 years ago|reply
The challenge I had with Sourcegraph is that it's out of reach of developers working on personal projects. There isn't a hosted plan, and for my projects I can't easily open source them due to my employer.
I was really excited when the Sourcegraph App was released, since it allowed me to give Sourcegraph a try on my project without going through the complex self-hosted setup. I went as far as getting scip-clang working with my Bazel-based project, and then tried out the docker-compose setup on my home lab.
Now that code search was removed from the app, and this change, I'm concerned that I won't be able to use Sourcegraph for my personal projects in the future.
This is a missed opportunity. I think individual developers using products for personal projects are powerful advocates, since those developers may convince their employer to purchase the product. If I could I'd gladly pay, but I'm just one person and can't justify $5k/year.
[+] [-] sqs|2 years ago|reply
[+] [-] linuxdude314|2 years ago|reply
Are you sure this isn’t just a way for you to crack down on license abuse?
[+] [-] HumanOstrich|2 years ago|reply
The response was basically "sorry, we're too busy".
Your business model is fascinating.
[+] [-] inglor|2 years ago|reply
(I don't like the news itself, obviously, but the delivery was good)
[+] [-] bilalq|2 years ago|reply
[0]: https://cloud.google.com/blog/products/ai-machine-learning/g...
[+] [-] hv42|2 years ago|reply
[+] [-] sensibleduck|2 years ago|reply
[+] [-] jasonmp85|2 years ago|reply
[deleted]
[+] [-] rattray|2 years ago|reply
https://github.com/livegrep/livegrep
Demo at https://livegrep.com/search/linux
We used it at Stripe and it was quite popular; often, searching even a single repo was faster on livegrep than with ripgrep locally.
A post reviewing it: https://www.alexdebrie.com/posts/faster-code-search-livegrep...
A post by its creator, nelhage, on its impact: https://blog.nelhage.com/post/reflections-on-performance/ and another on its architecture: https://blog.nelhage.com/2015/02/regular-expression-search-w...
[+] [-] synergy20|2 years ago|reply
[+] [-] boyter|2 years ago|reply
[+] [-] ryan_lane|2 years ago|reply
[+] [-] williamDafoe|2 years ago|reply
The recent rewrite of github search has probably made sourcegraph irrelevant. If you recall, the original github search used almost the most horrible algorithm possible. It dropped all punctuation and spacing and just searched for identifiers. No patterns allowed, no quoting allowed. One of the only meta-arguments was filename:xyz.
Now that github has improved its basic search functionality, sourcegraph might be doomed.
I used sourcegraph at Lyft which (at the time) had unlimited money to waste on software tools, and installed the open-source version at Databricks but nobody cared.
[+] [-] lopkeny12ko|2 years ago|reply
It only makes it irrelevant if all your code is hosted on GitHub.
I'm quite tired of GitHub-proprietary solutions being hailed as the "industry norm," and of vendors shipping products and integrations that only work with GitHub. Git is a decentralized protocol; please treat it like one.
[+] [-] ilyt|2 years ago|reply
[+] [-] me551ah|2 years ago|reply
1. GitHub isn’t free, especially for large private organisations
2. Sourcegraph has much better search functions compared to GitHub
[+] [-] __float|2 years ago|reply
My understanding is that GitHub's on premises version doesn't have any plans to include the new code search functionality.
[+] [-] deng|2 years ago|reply
[+] [-] CAP_NET_ADMIN|2 years ago|reply
It seems like the author of Sourcegraph OSS containers announced that his release train is now dead
https://github.com/jensim/sourcegraph-release-train/
[+] [-] isityouyesitsme|2 years ago|reply
They kept trying to push this "campaign" feature on us, which is an overly-complex auto-refactoring tool that couldn't even support our non-proprietary, well-known build system. For the cost of their license, we instead hired two developers for code refactors, who then went on to make other tooling, and we didn't need to hire someone to babysit their crappy service integration.
I would not say that they had found their niche when speaking to us. Perhaps it has gotten better.
[+] [-] caiusdurling|2 years ago|reply
[+] [-] rattray|2 years ago|reply
> We remain committed to Zoekt, the open source code search engine, and will continue to upstream changes to it.
https://github.com/sourcegraph/zoekt
> The source code will remain publicly available.
> Individual devs will still be able to use Sourcegraph for free on public code at sourcegraph.com and within our self-hosted free tier on private code.
> Very few individual devs or companies used the limited variant of code search that was open source. The vast majority (99.9%+) used the enterprise product. Maintaining two variants going forward was a big burden on our engineering team that had very little benefit for users.
[+] [-] CAP_NET_ADMIN|2 years ago|reply
[+] [-] bogwog|2 years ago|reply
It also says they offer a free self hosted version for individuals, but I couldn’t find that on their site.
[+] [-] zomglings|2 years ago|reply
As someone who builds things, it feels like poor craftsmanship to put obstacles in front of your users and limit the extent to which they can use your work.
It also feels like decisions to hamper how people use a product are driven purely by greed.
Let's imagine a world in which Sourcegraph were completely free software. They would probably still have enterprise customers pay them to securely host Sourcegraph on-premise. They wouldn't be able to charge per seat. They would have to make sure their product was cheap enough that their customers wouldn't save a ton of $$ by hiring engineers to maintain Sourcegraph on premise themselves.
I am curious if they (or anyone else running an open core business) have estimates for:
1. How many customers they would lose if they went fully free.
2. How much revenue they would lose if they went fully free.
Building free software and charging people to host it can be the foundation for a sustainable business, but it's unlikely to give VCs the kind of outcomes they want from a successful investment.
To be honest, I think it's fine for infrastructure to be closed/proprietary. There are good reasons to do this if you are writing programs for which security is important - releasing your infrastructure code freely gives attackers a lot of ammunition to work with.
If we believe in the power of automation and in building high quality software, it is possible to build free software that:
1. Is easy for you to deploy and maintain securely on customer infrastructure.
2. Requires very little operational overhead from you as the host (in terms of support).
3. For which the infrastructure code is proprietary.
This can lead to a very solid business.
Why don't we see more businesses like this?
[+] [-] sixhobbits|2 years ago|reply
[+] [-] CAP_NET_ADMIN|2 years ago|reply
[+] [-] smarx007|2 years ago|reply
I looked at the Debian code search [1] in the past, but for some reason thought it required a bit too much effort and didn't complete my investigation of it. Though [2] looks pretty approachable.
Sourcegraph mentioned Zoekt [3], but I am not sure how usable it is. If it was pretty good, why did Sourcegraph OSS exist?
Finally, from all the discussion how Sourcegraph OSS was very behind in the past few years, I guess there is no serious plan to fork it?
Edit: GCS release [4] seems to have been open-sourced without a frontend.
Edit2: Livegrep [5] and Opengrok [6] were recommended higher in the thread. Quite excited to try them out but if someone has working Docker Compose configs, I would be very thankful for the head start.
Edit3: there is also Eureka [7]. Seems less powerful but easier to deploy.
[1]: https://github.com/Debian/dcs
[2]: https://github.com/Debian/dcs/blob/main/howto/building.md
[3]: https://github.com/sourcegraph/zoekt
[4]: https://github.com/google/codesearch
[5]: https://github.com/livegrep/livegrep
[6]: https://oracle.github.io/opengrok/
[7]: https://github.com/Rajeev-K/eureka
[+] [-] __float|2 years ago|reply
[3] used to be a Google open source project as well, but it fell out of maintenance, and Sourcegraph took it over. It powers most of the basic regex/literal search in Sourcegraph.
Mozilla's code is searchable in Searchfox (https://searchfox.org/) which uses the indexer from Livegrep, combined with their own Git indexer and language-specific cross reference databases.
OpenGrok (https://github.com/oracle/opengrok) is also rather well known, but I have found it to have a slightly worse UI than alternatives.
[+] [-] smarx007|2 years ago|reply
[9]: https://github.com/boyter/cs
[+] [-] smarx007|2 years ago|reply
[8]: https://github.com/hound-search/hound
[+] [-] nullcipher|2 years ago|reply
[+] [-] CAP_NET_ADMIN|2 years ago|reply
Let's see if we get the same amount of upvotes their post got when they open sourced the thing.
[+] [-] josephcsible|2 years ago|reply
[+] [-] charcircuit|2 years ago|reply
[+] [-] elzbardico|2 years ago|reply
[+] [-] anotherhue|2 years ago|reply
[+] [-] Pet_Ant|2 years ago|reply
https://about.sourcegraph.com/blog/introducing-steve-yegge
[+] [-] jmclnx|2 years ago|reply
> This blob took too long to generate. But you can view the raw file.
[+] [-] gigatexal|2 years ago|reply
[+] [-] lallysingh|2 years ago|reply
[+] [-] rbanffy|2 years ago|reply
[+] [-] mechanicker|2 years ago|reply
I used it quite a lot before completely moving to Emacs.