top | item 42566034


x_may | 1 year ago

I believe they are using scalable TTC. The o3 announcement released accuracy numbers for high and low compute usage, which I feel would be hard to do in the same model without TTC.

I also believe that the $200 subscription they offer is just them allowing the TTC to run for longer before forcing the model to answer.

If what you say is true, though, I agree that there is huge headroom for TTC to improve results, if the Hugging Face experiments on 1B/3B models are anything to go by.



ankit219 | 1 year ago

The other comment posted YT videos where OpenAI researchers are talking about TTC, so I was wrong. That $200 subscription is just because the number of tokens generated is huge when CoT is involved. Usually inference output is capped at 2000-4000 tokens (max of ~8192) or so, but they cannot do that with o1 and all the thinking tokens involved. This is true across all the approaches - next token prediction, TTC with beam/lookahead search, or MCTS + TTC. If you set the output token limit high and induce a model to think before it answers, you will get better results on smaller/local models too.
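The "spend more compute before committing to an answer" idea can be sketched with best-of-N sampling against a verifier, one of the simplest TTC strategies. This is a toy illustration, not anyone's actual setup: `toy_model` and `verifier` here are hypothetical stand-ins for an LLM sampler and a reward/outcome model.

```python
import random

def toy_model(prompt, rng):
    # Hypothetical stand-in for one stochastic LLM sample;
    # the returned float plays the role of a candidate answer.
    return rng.random()

def verifier(answer):
    # Hypothetical stand-in for an outcome/reward model that
    # scores a finished candidate (higher is better).
    return answer

def best_of_n(prompt, n, seed=0):
    # Spend test-time compute by drawing n candidates and
    # keeping the one the verifier likes best.
    rng = random.Random(seed)
    candidates = [toy_model(prompt, rng) for _ in range(n)]
    return max(candidates, key=verifier)
```

With a fixed seed, `best_of_n(prompt, 16)` can never score worse than `best_of_n(prompt, 1)`: the n=1 candidate is the first of the 16, so more samples only add options for the verifier to pick from. That monotonicity is the whole appeal of scaling TTC this way.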

> huge headroom for TTC to improve results ...1B/3B models

Absolutely. How this gets productized remains to be seen. I have high hopes for MCTS and Iterative Preference Learning, but it is harder to implement. Not sure if OpenAI has done that, though DeepMind's results are unbelievably good [1].

[1]: https://arxiv.org/pdf/2405.00451v2
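For a rough feel of the MCTS side of this, here is a toy UCT search over a 4-step binary sequence with a deterministic reward. It is a minimal sketch of the generic algorithm, assuming a made-up task (recover a hidden `TARGET` string); it is not OpenAI's or DeepMind's actual setup.

```python
import math
import random

TARGET = "1011"  # hypothetical hidden sequence the search must discover

def reward(seq):
    # Fraction of positions matching the target (scored at full length).
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)

class Node:
    def __init__(self, seq):
        self.seq = seq        # partial sequence built so far
        self.children = {}    # move ("0"/"1") -> Node
        self.visits = 0
        self.value = 0.0      # sum of rollout rewards backed up here

def uct_select(parent, c=1.4):
    # Standard UCT: exploit average value, explore under-visited children.
    return max(parent.children.values(),
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(parent.visits) / ch.visits))

def rollout(seq, rng):
    # Random playout to a full-length sequence, then score it.
    while len(seq) < len(TARGET):
        seq += rng.choice("01")
    return reward(seq)

def mcts(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node("")
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded.
        while len(node.seq) < len(TARGET) and len(node.children) == 2:
            node = uct_select(node)
            path.append(node)
        # Expansion: add one unexplored child, if any remain.
        if len(node.seq) < len(TARGET):
            for move in "01":
                if move not in node.children:
                    node.children[move] = Node(node.seq + move)
                    node = node.children[move]
                    path.append(node)
                    break
        # Simulation + backpropagation.
        r = rollout(node.seq, rng)
        for n in path:
            n.visits += 1
            n.value += r
    # Extract the answer greedily by visit count.
    node, best = root, ""
    while node.children:
        move, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        best += move
    return best
```

The key point for TTC: `iterations` is a pure compute knob; with more search the visit counts concentrate on the high-reward path. In real systems the rollout/reward would come from a model and a learned verifier rather than string matching.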

whimsicalism | 1 year ago

TTC is an incredibly broad term, and it is broadening further as the hype spreads. People are now calling CoT "TTC" simply because compute is spent on reasoning tokens before answering.