smeeth | 26 days ago

> Model A consistently outperforms Model B under identical conditions, that tells you something meaningful about the model.

Not really! Sorry to harp on this, but there are two ways one model could outperform another:

1) It adheres to your strategy better

2) It improvises

If the prompt was "maximize money, here's inspiration," improvising is fine. If the prompt was "implement the strategy," improvising is failure.

Right now you have a leaderboard; you don’t yet have a benchmark, because you can’t tell whether high P&L reflects correctness.
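To make that concrete, here is a rough sketch of scoring adherence separately from P&L. It assumes you can compute what the written strategy would have done at each decision point; that reference action, the `Trade` record, the `score` function, and all the numbers are hypothetical and not taken from the actual benchmark.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    model_action: str     # what the model did: "buy", "sell", or "hold"
    strategy_action: str  # what the written strategy prescribes (hypothetical reference)
    pnl: float            # profit or loss attributed to this decision


def score(trades):
    """Return (total P&L, adherence rate, on-strategy P&L, improvised P&L)."""
    on = [t for t in trades if t.model_action == t.strategy_action]
    off = [t for t in trades if t.model_action != t.strategy_action]
    total = sum(t.pnl for t in trades)
    adherence = len(on) / len(trades) if trades else 0.0
    return total, adherence, sum(t.pnl for t in on), sum(t.pnl for t in off)


# Made-up data: Model A earns more, but mostly from off-strategy trades.
model_a = [
    Trade("buy", "buy", 10.0),
    Trade("sell", "hold", 40.0),
    Trade("buy", "hold", 30.0),
    Trade("hold", "hold", 0.0),
]
# Model B follows the strategy on every decision and earns less.
model_b = [
    Trade("buy", "buy", 10.0),
    Trade("hold", "hold", 0.0),
    Trade("hold", "hold", 0.0),
    Trade("hold", "hold", 0.0),
]

for name, trades in [("A", model_a), ("B", model_b)]:
    total, adherence, on_pnl, off_pnl = score(trades)
    print(f"Model {name}: P&L={total}, adherence={adherence:.0%}, "
          f"on-strategy={on_pnl}, improvised={off_pnl}")
```

With these invented numbers A tops the leaderboard on P&L while B is the one actually implementing the strategy. Which of those counts as "better" depends entirely on what the prompt asked for.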

porttipasi | 26 days ago

To be more specific: the prompt defines a trading philosophy and tells models what to look for in the charts. But the actual read and the decision are entirely up to the model. In your framing, it's closer to "here's inspiration, now maximize money" than "implement this exact strategy," which means improvisation within that framework is exactly what's being measured.

But yeah, it's closer to a leaderboard right now.