top | item 22398625

thanatos_dem | 6 years ago

Next post from danfox - “how to get 3 job offers in 3 hours”.

He has already been publicly contacted by:

- GitHub CTO

- SerpApi CEO

- SourceGraph CEO

Search is hot right now!

swat535|6 years ago

Actually, it would be more like: "How I failed at 3 interviews, despite being directly contacted by execs."

wolco|6 years ago

Couldn't whiteboard a solution without the temp variable.

TheSpiceIsLife|6 years ago

New game show idea:

CTOs from software companies interview at other software companies.

nickthemagicman|6 years ago

Sure, you built an app on multiple 20-core machines with functionality to search hundreds of millions of lines of code almost instantaneously, but are you someone I'd drink a beer with?

runawaybottle|6 years ago

That would be quite the dystopian interview nightmare.

nonbirithm|6 years ago

If only the answer to "how" was as simple as "writing a web service for searching GitHub repos with regexes," even though the problem is probably in itself non-trivial if there's this much interest in search at all. At least the specification is clear enough.

I guess what I mean to ask is, how would people know this is a "correct" answer to the "how" question beforehand? Is the answer literally just "search" because that's simply what's trending right now?

sdesol|6 years ago

It also probably goes without saying that he should be careful about which details he shares.

Existenceblinks|6 years ago

I'm surprised as well. I wonder why big tech companies didn't have this awesome search already.

thanatos_dem|6 years ago

If this were to be offered by an actual company (a first party solution), there are some features that'd be expected that make the problem space a lot harder. Here's an "intro to search" article that's a good read, and I'll use it to highlight some of the things that'd be different in a first party solution - https://medium.com/startup-grind/what-every-software-enginee...

(See the "Theory: the search problem" section)

Size: This is only indexing ~500k public repos. A first party solution would be expected to index all of it, public and private.

Indexing speed: This can take up to a few days to index. A first party solution would be expected to have a much lower index latency - seconds to minutes.

Query language: This can (and does) have its own simple query language. A first party solution would need to be embedded into the current query language without breaking backwards compatibility with it.

Context-dependence: A first party solution would be expected to index private repos as well, and now the query context (logged in user) becomes another variable in an already multi-variate problem space.

Latency: Gets harder with scale, and a first party solution would likely provide an SLA/SLO around latency.

Access control: Same issue as context-dependence, with private repos being included.
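To make the context-dependence and access-control points concrete, here's a minimal sketch of how the logged-in user changes what a regex search is allowed to return. Everything in it is an assumption for illustration: `ToyIndex`, `Hit`, the `grants` mapping, and the repo names are hypothetical, and a brute-force scan over in-memory strings stands in for a real index. It says nothing about how the tool in question or any first party search is actually built.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hit:
    repo: str
    path: str
    line: str


@dataclass
class ToyIndex:
    # Hypothetical in-memory "index": repo name -> {path: file text}.
    repos: dict = field(default_factory=dict)
    # Repos marked private; everything else is treated as public.
    private: set = field(default_factory=set)
    # user -> set of private repos that user may read (assumed ACL model).
    grants: dict = field(default_factory=dict)

    def visible_repos(self, user):
        """Repos the given user may search; anonymous users see public only."""
        allowed = self.grants.get(user, set()) if user else set()
        return {r for r in self.repos if r not in self.private or r in allowed}

    def search(self, pattern, user=None):
        """Brute-force regex scan, restricted to repos visible to `user`."""
        regex = re.compile(pattern)
        hits = []
        for repo in sorted(self.visible_repos(user)):
            for path, text in sorted(self.repos[repo].items()):
                for line in text.splitlines():
                    if regex.search(line):
                        hits.append(Hit(repo, path, line))
        return hits


index = ToyIndex(
    repos={
        "acme/public-lib": {"a.py": "def parse_token(s):\n    return s\n"},
        "acme/secret-svc": {"b.py": "API_TOKEN = 'x'\ntoken_ttl = 60\n"},
    },
    private={"acme/secret-svc"},
    grants={"alice": {"acme/secret-svc"}},
)

anon_hits = index.search(r"(?i)token")                 # public repo only
alice_hits = index.search(r"(?i)token", user="alice")  # public + granted private
```

The same query gives different results depending on who asks, so results can't be cached per query alone. In this sketch the filter runs before the scan; a real system would also need to handle permissions that change after indexing.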

There are also unknown but likely considerations around compliance and internationalization, which are quite tricky problems.

Note - I don't mean for this to be critical of the author at all. This is an awesome and useful tool, with a fantastic UX. I just want to make it clear that search at scale is a lot harder than it seems at first glance, especially as the feature requirements increase.