This has been going on for decades. It's called a GWAS (genome-wide association study) [1], and it has had a few successes but basically hasn't worked as well as everyone hoped it would in the 90s. The reason it doesn't work that well is that the human genome has ~3 billion letters and human physiology is complex, so establishing statistically significant correlations between genome variations and human physiology is hard and requires more than Excel. In fact, the computational tools that have been applied to this are incredibly sophisticated and are not the limiting factor. The limiting factor is that you probably need millions or billions of genomes to make it work, and we don't have that yet. Also, people are beginning to realize that many disease-relevant traits are caused by rare variants (rather than common variants with obvious statistically significant correlations), which are quite hard to detect this way.

So... anyway, you're right that this is a natural way to approach the question of understanding the genetic basis of disease and physiology. But it's been beaten to death and found to drive fewer insights than were hoped.
[1] https://en.wikipedia.org/wiki/Genome-wide_association_study
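
To put rough numbers on the sample-size point, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration, not something from a specific study: the function names are made up, the "about a million independent tests" figure is just the usual rule of thumb behind the 5e-8 genome-wide significance cutoff, and the sample-size formula is the standard normal approximation for a single-variant test on a quantitative trait (required N is roughly the non-centrality parameter divided by the fraction of trait variance the variant explains).

```python
import math
from statistics import NormalDist


def bonferroni_threshold(alpha=0.05, n_tests=1_000_000):
    """Per-test p-value cutoff after a Bonferroni correction.

    n_tests is an assumed count of roughly independent common variants.
    """
    return alpha / n_tests


def samples_needed(variance_explained, alpha=5e-8, power=0.8):
    """Approximate sample size to detect one variant at the given power.

    Uses the normal approximation: the test's non-centrality parameter
    must reach (z_alpha + z_power)^2, and it grows as N * variance_explained
    when the variant explains only a small fraction of trait variance.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_power = nd.inv_cdf(power)          # quantile for the desired power
    ncp = (z_alpha + z_power) ** 2
    return math.ceil(ncp / variance_explained)


if __name__ == "__main__":
    print("genome-wide threshold:", bonferroni_threshold())
    # Hypothetical effect sizes, as fraction of trait variance explained.
    for q2 in (1e-2, 1e-3, 1e-4, 1e-5):
        print(f"variance explained {q2:g}: ~{samples_needed(q2):,} samples")
```

Under these assumptions, a variant explaining 0.1% of trait variance needs on the order of tens of thousands of samples, and one explaining 0.001% needs millions. Rare variants are worse still, since so few carriers show up in any cohort that the variance explained at the population level is tiny even when the effect in a carrier is large.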
disgruntledphd2|4 years ago