top | item 40814369

gabrielruttner | 1 year ago

Folks are using us both for long-lived tasks traditionally run as background jobs and for near-real-time background work. Our latency is low enough for requests where a user may still be waiting on the result, such as LLM/GPU inference. Some concrete examples:

1. Repository/document ingestion and indexing fanout for applications like code generation or legal tech LLM agents

2. Orchestrating cloud deployment pipelines

3. Web scraping and post-processing

4. GPU inference jobs requiring multiple steps, compute classes, or batches
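The fanout pattern in example 1 can be sketched with only the standard library. This is a hypothetical illustration, not any specific framework's API: `ingest_repo`, `index_document`, and `fanout_index` are made-up names, and a real orchestrator would add durability, retries, and distribution across workers.

```python
# Minimal sketch of an ingestion/indexing fanout: a parent step splits a
# repository into documents, then child steps index each one in parallel.
# All names here are illustrative assumptions, not a real library's API.
from concurrent.futures import ThreadPoolExecutor, as_completed


def ingest_repo(repo: str) -> list[str]:
    # Parent step: discover the documents in a repository.
    # A real implementation would walk the repo and chunk files.
    return [f"{repo}/doc_{i}" for i in range(8)]


def index_document(doc: str) -> dict:
    # Child step: index one document (parsing, embedding, etc. would go here).
    return {"doc": doc, "status": "indexed"}


def fanout_index(repo: str, max_workers: int = 4) -> list[dict]:
    docs = ingest_repo(repo)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(index_document, d) for d in docs]
        for fut in as_completed(futures):
            results.append(fut.result())
    return results


if __name__ == "__main__":
    print(len(fanout_index("example/repo")))
```

The appeal of running this on a task orchestrator rather than in-process is that each child step gets its own retries and visibility, and the fanout can span many machines instead of one thread pool.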
