(no title)
davidsainez | 3 months ago
The real issue is that there is no reliable system currently in place for the end user (other than being willing to burn the cash and run your own benchmarks regularly) to detect changes in performance.
It feels to me like a perfect storm. A combination of high cost of inference, extreme competition, and the statistical nature of LLMs make it very tempting for a provider to tune their infrastructure in order to squeeze more volume from their hardware. I don't mean to imply bad faith actors: things are moving at breakneck speed and people are trying anything that sticks. But the problem persists, people are building on systems that are in constant flux (for better or for worse).
Wowfunhappy|3 months ago
There was one well-documented case of performance degradation which arose from a stupid bug, not some secret cost cutting measure.
davidsainez|3 months ago
I have seen multiple people mention openrouter multiple times here on HN: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Again, I'm not claiming malicious intent. But model performance depends on a number of factors and the end-user just sees benchmarks for a specific configuration. For me to have a high degree of confidence in a provider I would need to see open and continuous benchmarking of the end-user API.
anon7000|3 months ago
That’s not the point — it’s just a day in the life of ops to tweak your system to improve resource utilization and performance. Which can cause bugs you don’t expect in LLMs. it’s a lot easier to monitor performance in a deterministic system, but harder to see the true impact a change has to the LLM