top | item 45508077

(no title)

shawntan | 4 months ago

That analysis provided a very non-abrasive wording of their evaluation of HRM and its contributions. The comparison with a recursive / universal transformer on the same settings is telling.

"These results suggest that the performance on ARC-AGI is not an effect of the HRM architecture. While it does provide a small benefit, a replacement baseline transformer in the HRM training pipeline achieves comparable performance."

discuss

order

No comments yet.