(no title)
rw | 8 years ago
That is a clever way to make use of a search API like GitHub's. The principled way to do this, though, is to run LDA (latent Dirichlet allocation) over all descriptions on GitHub, then use the resulting per-repository topic vectors as a similarity index to find similar repositories. You could run LDA over code, too.
I'll note that there is a cold start problem with this implementation: using LDA on such a small set of short documents will often lead to uninformative topics made up of words that are too specific. You need a big corpus to capture, e.g., synonym relationships.
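For anyone who wants to see the shape of the LDA idea, here is a minimal sketch, not a real implementation. It assumes a tiny made-up list of descriptions, a hand-rolled collapsed Gibbs sampler in place of a proper library (in practice you would more likely reach for gensim or scikit-learn), and cosine similarity over the per-document topic vectors. The names `descriptions`, `lda_gibbs`, and `most_similar`, and all hyperparameter values, are hypothetical. With a corpus this small the topics will be poor, which is exactly the small-corpus problem mentioned above.

```python
import math
import random
import re
from collections import Counter

# Hypothetical repository descriptions, made up for illustration only.
descriptions = [
    "fast http server and web framework",
    "minimal web framework for http apis",
    "neural network training library",
    "deep learning library for neural network models",
]

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic vector per document."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # total tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed, normalized topic proportions for each document.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(theta, query):
    """Indices of all other documents, most similar to `query` first."""
    others = [j for j in range(len(theta)) if j != query]
    return sorted(others, key=lambda j: cosine(theta[query], theta[j]), reverse=True)

theta = lda_gibbs([tokenize(d) for d in descriptions])
ranked = most_similar(theta, 0)
print("Repositories ranked by similarity to #0:", ranked)
```

The topic vectors in `theta` play the role of the similarity index: once they are computed for the whole corpus, finding neighbors of a repository is just a nearest-neighbor lookup, and the same pipeline would apply to tokenized code instead of descriptions.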