tom_b | 6 years ago

We've found that graph-based community detection approaches have some really nice benefits for our bioinformatics data.

In particular, we have found that these approaches seem to preserve very small cluster structure "better" than traditional approaches. That is, when we have a small group of cells that we know belong in their own cluster, the graph-based community approaches keep that "small" group separate instead of absorbing it into other clusters.

But we have also noticed (and had some feedback) that we wind up with final modularity scores that are very high: greater than or equal to 0.90 (on a scale of -0.5 to 1). Applied math folks in the graph algorithms world kind of seem to look at that and go "eh, that is so high you should probably just do PCA and move on . . . "
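For anyone who wants to sanity-check what a score like 0.90 means, here is a minimal sketch of the standard Newman modularity calculation in plain Python. The helper names (`modularity`, `disjoint_cliques`) and the toy graphs are made up for illustration and are not our pipeline. It only shows that a partition into k equally sized, fully separated groups scores 1 - 1/k, so ten clean groups already give 0.90.

```python
from collections import defaultdict
from itertools import combinations


def modularity(edges, community_of):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping node -> community label.
    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2
    where L_c is the number of edges inside c, d_c is the summed
    degree of nodes in c, and m is the total number of edges.
    """
    edges = list(edges)
    m = len(edges)
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


def disjoint_cliques(k, size):
    """Hypothetical toy graph: k cliques of `size` nodes, no edges between them."""
    edges, community_of = [], {}
    for c in range(k):
        nodes = [(c, i) for i in range(size)]
        for n in nodes:
            community_of[n] = c
        edges.extend(combinations(nodes, 2))
    return edges, community_of


# Ten perfectly separated groups, each labelled as its own community.
edges, community_of = disjoint_cliques(k=10, size=5)
q_ten = modularity(edges, community_of)

# Two perfectly separated groups for comparison.
edges2, community_of2 = disjoint_cliques(k=2, size=5)
q_two = modularity(edges2, community_of2)

print(round(q_ten, 3), round(q_two, 3))
```

So a very high score is at least partly a statement about how many well-separated groups the graph has, which may be why the graph folks read it as "this structure is easy to find".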

Especially given that you could (and people seem to) use UMAP as a precursor to Louvain methods, I'll probably be looking into UMAP to see how it goes. But our current computational bottleneck is the clustering itself (community detection on the graph with the Louvain method), so we'd like to whittle that runtime down as much as possible.
