top | item 42851093 (no title) bottled_poe | 1 year ago It will be very interesting to see if they can reproduce a similar model on the shoestring budget claimed by Deepseek. discuss order hn newest anshumankmr|1 year ago but deepseek hasn't claimed the figure touted by everyone for this particular R1 model, cause that 5.6mn was apparently for Deepseek's coder model boroboro4|1 year ago 5.6mn figure is for base Deepseek V3 model. Both instruction and reasoning tuning of it has neglectable cost in comparison with it.
anshumankmr|1 year ago but deepseek hasn't claimed the figure touted by everyone for this particular R1 model, cause that 5.6mn was apparently for Deepseek's coder model boroboro4|1 year ago 5.6mn figure is for base Deepseek V3 model. Both instruction and reasoning tuning of it has neglectable cost in comparison with it.
boroboro4|1 year ago 5.6mn figure is for base Deepseek V3 model. Both instruction and reasoning tuning of it has neglectable cost in comparison with it.
anshumankmr|1 year ago
boroboro4|1 year ago