rfgmendoza | 2 years ago
When a query is asked of an AI, it has to generate a response from all of that data, and the query and response themselves become data.
Running an LLM on local consumer hardware can take upwards of 20 minutes for a single query, so an AI service that may be responding to up to millions of requests a day would need a massive, hyper-parallelized server infrastructure.
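To make that scaling claim concrete, here is a back-of-envelope sketch (using Little's law) of how many queries would need to run in parallel to keep up with a given daily volume. The figures used (one million requests a day, 20 minutes per query) are illustrative assumptions taken from the comment, not measurements.

```python
# Rough estimate: how many queries must be in flight simultaneously
# to sustain a given daily request volume, assuming every query
# takes the same fixed amount of time (an illustrative simplification).

SECONDS_PER_DAY = 24 * 60 * 60

def concurrent_queries_needed(requests_per_day: int, seconds_per_query: float) -> float:
    """Average number of queries running in parallel (Little's law: L = lambda * W)."""
    arrival_rate = requests_per_day / SECONDS_PER_DAY  # queries arriving per second
    return arrival_rate * seconds_per_query

# 1 million requests/day at 20 minutes (1200 s) per query:
print(round(concurrent_queries_needed(1_000_000, 20 * 60)))  # ~13889 concurrent queries
```

Even under these toy numbers, a service would need on the order of ten thousand queries executing at once, which is why hyperscale providers rely on large fleets of accelerators rather than single machines.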