top | item 42851093

(no title)

bottled_poe | 1 year ago

It will be very interesting to see if they can reproduce a similar model on the shoestring budget claimed by Deepseek.

discuss

order

anshumankmr|1 year ago

but deepseek hasn't claimed the figure touted by everyone for this particular R1 model, cause that 5.6mn was apparently for Deepseek's coder model

boroboro4|1 year ago

5.6mn figure is for base Deepseek V3 model. Both instruction and reasoning tuning of it has neglectable cost in comparison with it.