top | item 45575458

(no title)

shawntan | 4 months ago

This would not help if no proper constraints are established on what data can and cannot be trained on. And maybe just figuring out what the goal of the benchmark is.

If it is to test generalisation capability, then what data the model being evaluated is trained on is crucial to making any conclusions.

Look at the construction of this synthetic dataset for example: https://arxiv.org/pdf/1711.00350

discuss

No comments yet.