top | item 43691616

(no title)

miros_love | 10 months ago

>European versions of ARC

But ARC is an image-like benchmark. Has anyone looked at the EU-ARC paper? What is the difference, and why can't they measure on the regular one?

I skimmed it and didn't find the answer right away, but judging by their tokenizer, they are training from scratch. In general, I don't like this approach for the task at hand. For widely spoken languages, there are already good models that they don't want to compare against. And for low-resource languages, it is very important to include more languages from the same language group, which are not necessarily part of the EU.


whiplash451|10 months ago

You might be confusing ARC-AGI with EU-ARC, which is a language benchmark [1].

[1] https://arxiv.org/pdf/2410.08928

Etheryte|10 months ago

Why would they want more languages from outside of the EU when they've clearly stated they only target the 24 official languages of the European Union?

miros_love|10 months ago

For example, Slovene. There simply isn't enough data for it. But if you add all the data available for related languages, you get higher quality. Without that, LLMs do poorly on low-resource languages.