aaa29292 | 4 months ago
https://docs.extend.ai/2025-04-21/product/general/how-credit...
Are those just different SLAs or different APIs or what?
serjester|4 months ago
kbyatnal|4 months ago
Our goal is to provide customers with as much transparency & flexibility as possible. Our pricing has 2 axes:
- the complexity of the task
- performance processing vs cost-optimized processing
Complexity matters because, for example, classification is much easier than extraction, so it should be cheaper. That unlocks a wide range of use cases, such as tagging and filtering pipelines.
A performance toggle is also important because not all use cases are created equal. Just as it's important to have a choice between cheaper foundation models and the best ones, the same applies to document tasks.
For certain use cases, you might be willing to take a slight hit to accuracy in exchange for better costs and latency. To support this, we offer a "light" processing mode (with significantly lower prices) that uses smaller models, fewer VLMs, and more heuristics under the hood.
For other use cases, you simply want the highest accuracy possible. Our "performance" processing mode is a great fit for that, which enables layout models, signature detection, handwriting VLMs, and the most performant foundation models.
In fact, most pipelines we've seen in production end up combining the two (cheap classification and splitting, paired with performance extraction).
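To make that combination concrete, here's a minimal sketch of the pattern. The function names, modes, and stand-in logic are purely illustrative (not Extend's actual API): a cheap "light" classification pass filters the stream, and the expensive "performance" extraction pass only runs on the documents that survive the filter.

```python
# Hypothetical sketch of a two-mode pipeline. classify() and extract()
# are stand-ins, not Extend's real API; the routing logic is the point.

def classify(doc: str, mode: str = "light") -> str:
    """Stand-in classifier: tags a document by type."""
    if "invoice" in doc.lower():
        return "invoice"
    return "other"

def extract(doc: str, mode: str = "performance") -> dict:
    """Stand-in extractor: pulls structured fields from a document."""
    return {"doc_type": classify(doc), "mode_used": mode, "fields": {}}

def pipeline(docs: list[str]) -> list[dict]:
    results = []
    for doc in docs:
        # Step 1: cheap, fast classification filters the stream.
        if classify(doc, mode="light") != "invoice":
            continue  # skip docs that don't need extraction
        # Step 2: expensive, accurate extraction only where it pays off.
        results.append(extract(doc, mode="performance"))
    return results
```

The cost structure falls out of the routing: every document pays the light-mode price once, but only the relevant fraction pays the performance-mode price.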
Without this level of granularity, we'd either be overcharging some customers or undercharging others. I definitely understand how this can be confusing though; we'll work on making our docs better!
cle|4 months ago
The amount that your users care about.
At a large enough scale, users will care about the cost differences between extraction and classification (very different!) and finding the right spot on the accuracy-latency curve for their use case.
kbyatnal|4 months ago
Our goal is to provide customers with as much flexibility as possible. For certain use cases, you might be willing to take a slight hit to accuracy in exchange for better costs and latency. To support this, we offer a "light" processing mode (with significantly lower prices) that uses smaller models, fewer VLMs, and more heuristics under the hood.
For other use cases, you simply want the highest accuracy possible. Our "performance" processing mode is a great fit for that, which enables layout models, signature detection, handwriting VLMs, and the most performant foundation models.
We back this up with a native evals experience in the product, so you can directly measure the % accuracy difference between the two modes for your exact use case.
aaa29292|4 months ago
kbyatnal|4 months ago
As a rule of thumb, light processing mode is great for (1) most classification tasks, (2) splitting on smaller docs, (3) extraction on simpler documents, or (4) latency-sensitive use cases.