rdli | 9 months ago

In this analogy, Dynamo is most definitely not like Django. It includes inference-aware routing, KV caching, etc. -- all the stuff you would need to run a modern SOTA inference stack.

qntty | 9 months ago

You're right, I was confusing TensorRT with Dynamo. It looks like the relationship between Dynamo and vLLM is actually the opposite of what I was thinking -- Dynamo can use vLLM as a backend rather than vice versa.