antichronology | 6 months ago
We think the situation is similar here - one of the challenges is aligning the benchmark with the function of the models. Genomic benchmarks for gLMs and RNA foundation models have been very resistant to saturation.
I think in NLP the problem is that benchmarks are victims of their own success: the models can be overfit to particular benchmarks really fast.
In genomics we're a bit behind. A good paper on this is DartEval, where they provide benchmark tasks at increasing levels of complexity: https://arxiv.org/abs/2412.05430
In RNA the models work much better than in DNA prediction, but it's key to have benchmarks to measure progress.