samueldr | 4 years ago

Only thing missing is indexing of branches and forks.

My main use case for GitHub search is identifying provenance of misc. changes in vendor source code tarballs for e.g. Android kernel releases. It's hard, but sometimes possible to rehydrate most of the existing commits through cherry-picks and careful rebases.

The biggest problem with the lack of branch and fork indexing is that vendors sometimes make releases through branches, and sometimes the repos of interest are forks of e.g. `torvalds/linux`.

Hopefully we can see those being indexed in the future.

I'm also curious: has the plan to drop "less active" repos from the index gone through? Has anything changed?

alufers|4 years ago

> I'm also curious: has the plan to drop "less active" repos from the index gone through? Has anything changed?

Whaaat? I hope it doesn't go through. I use GitHub code search for clues when reverse engineering cheap Chinese IoT crap. Usually I can find some headers / SDKs accidentally uploaded and set to public by a random Chinese guy. Those repos usually have one commit and zero traffic, but they contain invaluable information about proprietary MCUs.

ihnorton|4 years ago

I would personally like to see less indexing of duplicate files! There are many things I’ve searched for which return 100s of results from independent checkin-uploads of big libraries like the Android SDK. It would be great if results were filtered by file similarity regardless of git history (if that is in fact the issue).
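
To make the idea concrete, here is a rough sketch of the simplest version: collapsing hits whose file contents are byte-for-byte identical (after normalizing line endings), ignoring git history entirely. The `(repo, path, content)` tuple shape and the `collapse_duplicates` name are assumptions made up for illustration, not anything GitHub exposes, and true "similarity" filtering would need fuzzier matching than an exact hash.

```python
import hashlib


def collapse_duplicates(results):
    """Group (repo, path, content) hits by a hash of their content.

    Hypothetical helper: returns one entry per distinct file content,
    as ((repo, path) of the first hit, number of other identical copies).
    """
    groups = {}
    for repo, path, content in results:
        # Normalize line endings so CRLF and LF copies of one file still match.
        normalized = content.replace("\r\n", "\n").encode("utf-8")
        digest = hashlib.sha256(normalized).hexdigest()
        groups.setdefault(digest, []).append((repo, path))
    # Dicts keep insertion order, so results stay in their original ranking.
    return [(hits[0], len(hits) - 1) for hits in groups.values()]


if __name__ == "__main__":
    hits = [
        ("a/sdk", "x.h", "int a;\n"),
        ("b/sdk", "x.h", "int a;\r\n"),  # same file, different line endings
        ("c/app", "y.h", "int b;\n"),
    ]
    for (repo, path), extra in collapse_duplicates(hits):
        print(f"{repo}:{path} (+{extra} identical copies)")
```

Something like this would only catch verbatim re-uploads; slightly modified copies of the same SDK would still show up as separate results.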

dstaheli|4 years ago

Nice idea, ihnorton. Thanks for the feedback.