top | item 44114266

gommm | 9 months ago

This looks like LLM blog spam. To be taken seriously, they'd need to publish each benchmark's implementation in each language, which they didn't.

Instead, they show pseudocode with very vague descriptions of failure modes that don't really make sense: "Under our error cascade simulation, some low-level failures in unsafe code regions propagated in ways that eventually caused deadlocks in resource management." That gives no details, and "failures in unsafe code regions" doesn't sound like a realistic failure case either.
