This is a really interesting direction. There is a large field of cell-free RNA (cfRNA) cancer detection. We talked to a few people in the field and think that embedding sequences for this direction could be really valuable. One challenge here is that it's hard to set up evaluation tasks since public data is scarce.
carlsborg|6 months ago
antichronology|6 months ago
I find it takes a large amount of effort to parse what the authors are doing, whether the data is high quality, and how to pre-process it in a way that makes sense for the task at hand.
Would love to chat more about how you're thinking of evaluating the quality of these agents.