maxrmk | 8 months ago

> While the specific internal workings of DeepSeek LLM are still being elucidated, it appears to maintain or approximate the self-attention paradigm to some extent.

Totally nonsensical. DeepSeek's architecture is well documented, and multiple implementations are available online.