aaa29292 | 4 months ago
https://docs.extend.ai/2025-04-21/product/general/how-credit...
Are those just different SLAs or different APIs or what?
serjester|4 months ago
kbyatnal|4 months ago
Our goal is to provide customers with as much transparency & flexibility as possible. Our pricing has 2 axes:
- the complexity of the task
- performance processing vs cost-optimized processing
Complexity matters because, for example, classification is much easier than extraction, so it should be cheaper. That unlocks a wide range of use cases, such as tagging and filtering pipelines.
A performance toggle is also important because not all use cases are created equal. Just as it's important to have a choice between cheaper foundation models and the best ones, the same applies to document tasks.
For certain use cases, you might be willing to take a slight hit to accuracy in exchange for better costs and latency. To support this, we offer a "light" processing mode (with significantly lower prices) that uses smaller models, fewer VLMs, and more heuristics under the hood.
For other use cases, you simply want the highest accuracy possible. Our "performance" processing mode is a great fit for that, which enables layout models, signature detection, handwriting VLMs, and the most performant foundation models.
In fact, most pipelines we've seen in production end up combining the two (cheap classification and splitting, paired with performance extraction).
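To make that combination concrete, here's a minimal sketch of the pattern. The function names, modes, and stand-in logic are purely illustrative (not Extend's actual API): a cheap "light" classification pass filters the stream, and the expensive "performance" extraction pass only runs on the documents that survive the filter.

```python
# Hypothetical sketch of a two-mode pipeline. classify() and extract()
# are stand-ins, not Extend's real API; the routing logic is the point.

def classify(doc: str, mode: str = "light") -> str:
    """Stand-in classifier: tags a document by type."""
    if "invoice" in doc.lower():
        return "invoice"
    return "other"

def extract(doc: str, mode: str = "performance") -> dict:
    """Stand-in extractor: pulls structured fields from a document."""
    return {"doc_type": classify(doc), "mode_used": mode, "fields": {}}

def pipeline(docs: list[str]) -> list[dict]:
    results = []
    for doc in docs:
        # Step 1: cheap, fast classification filters the stream.
        if classify(doc, mode="light") != "invoice":
            continue  # skip docs that don't need extraction
        # Step 2: expensive, accurate extraction only where it pays off.
        results.append(extract(doc, mode="performance"))
    return results
```

The cost structure falls out of the routing: every document pays the light-mode price once, but only the relevant fraction pays the performance-mode price.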
Without this level of granularity, we'd either be overcharging some customers or undercharging others. I definitely understand how this can be confusing though; we'll work on making our docs better!
cle|4 months ago
The amount that your users care about.
At a large enough scale, users will care about the cost differences between extraction and classification (very different!) and finding the right spot on the accuracy-latency curve for their use case.
kbyatnal|4 months ago
Our goal is to provide customers with as much flexibility as possible. For certain use cases, you might be willing to take a slight hit to accuracy in exchange for better costs and latency. To support this, we offer a "light" processing mode (with significantly lower prices) that uses smaller models, fewer VLMs, and more heuristics under the hood.
For other use cases, you simply want the highest accuracy possible. Our "performance" processing mode is a great fit for that, which enables layout models, signature detection, handwriting VLMs, and the most performant foundation models.
We back this up with a native evals experience in the product, so you can directly measure the % accuracy difference between the two modes for your exact use case.
aaa29292|4 months ago
kbyatnal|4 months ago
As a rule of thumb, light processing mode is great for (1) most classification tasks, (2) splitting on smaller docs, (3) extraction on simpler documents, or (4) latency-sensitive use cases.