thanatos_dem | 6 years ago
(See the "Theory: the search problem" section)
Size: This is only indexing ~500k public repos. A first-party solution would be expected to index every repo, public and private.
Indexing speed: This can take up to a few days to index. A first-party solution would be expected to have much lower indexing latency, on the order of seconds to minutes.
Query language: This can (and does) have its own simple query language. A first-party solution would need to be built into the current query language without breaking backwards compatibility with it.
Context-dependence: A first-party solution would be expected to index private repos as well, and now the query context (the logged-in user) becomes another variable in an already multivariate problem space.
Latency: Gets harder with scale, and a first-party solution would likely provide an SLA/SLO around latency.
Access control: Same issue as context-dependence, with private repos being included.
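To make the context-dependence and access control points concrete, here is a minimal sketch of a toy inverted index where the same query returns different results depending on who is asking. Everything in it (the `Repo` and `CodeIndex` names, the `members` field, the repo names) is made up for illustration; it is not how this tool or GitHub actually implements search.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Repo:
    name: str
    private: bool
    members: frozenset = frozenset()  # hypothetical: users allowed to read a private repo


class CodeIndex:
    """Toy inverted index: token -> set of (repo name, file path)."""

    def __init__(self):
        self.repos = {}
        self.postings = defaultdict(set)

    def add_file(self, repo, path, text):
        self.repos[repo.name] = repo
        # Naive whitespace tokenization, lowercased.
        for token in text.lower().split():
            self.postings[token].add((repo.name, path))

    def can_read(self, user, repo_name):
        repo = self.repos[repo_name]
        return (not repo.private) or (user is not None and user in repo.members)

    def search(self, token, user=None):
        # The logged-in user is part of the query: hits in repos the
        # user cannot read are dropped before results are returned.
        hits = self.postings.get(token.lower(), set())
        return sorted(h for h in hits if self.can_read(user, h[0]))


index = CodeIndex()
index.add_file(Repo("octo/public-lib", private=False), "a.py", "import requests")
index.add_file(
    Repo("octo/secret", private=True, members=frozenset({"alice"})),
    "b.py",
    "import requests",
)

anonymous_hits = index.search("requests")
alice_hits = index.search("requests", user="alice")
```

Here `anonymous_hits` only contains the public file, while `alice_hits` also includes the file from the private repo she belongs to. Filtering after lookup like this is the simplest option; at scale it interacts badly with ranking, pagination, and caching, which is part of why the problem gets harder once private repos are in the index.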
There are also unknown but likely considerations around compliance and internationalization, which are quite tricky problems.
Note - I don't mean for this to be critical of the author at all. This is an awesome and useful tool, with a fantastic UX. I just want to make it clear that search at scale is a lot harder than it seems at first glance, especially as the feature requirements increase.
fjania|6 years ago
sdesol|6 years ago
The more reasons you give people to go to GitHub, the better off they will be in the future. So I do agree with you that as a commercial solution, this may not be viable, but for GitHub's public repos, this can turn into a very positive thing.
marceloabsousa|6 years ago