DiederikVink | 1 year ago

1. Yes, that's correct to some degree. Depending on the model's details we might need some manual tweaking to get everything up and running, but we can generally get a model up within a day. We always like to run optimizations and tests before listing a model as publicly available, to ensure the best experience for our users.

If a fully self-serve system is something you would like to see, we would love to hear more!

2. Could you please elaborate on the 50% cheaper option? If you're referring to the line on our website, that comes from our efficiency at scale. This efficiency lets us offer the models at the prices we do without imposing rate limits to manage our costs. The same 50% improvement in GPU utilization also benefits anyone looking to use our infrastructure for on-prem solutions.

qeternity | 1 year ago

> Reduce AI GPU Infrastructure Bills by 50%

Ok so how does #2 help me do this?

DiederikVink | 1 year ago

If you deploy our solution on-prem, you can handle 2x the workload on the same amount of hardware. That means you need to scale up your hardware half as often, giving you a ~50% reduction in your GPU infrastructure bills.
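To make the arithmetic behind that claim concrete, here is a minimal sketch. The workload and throughput numbers are entirely hypothetical (they are not vendor benchmarks); the point is only that doubling per-GPU throughput halves the GPU count needed for a fixed workload:

```python
import math

def gpus_needed(workload: float, throughput_per_gpu: float) -> int:
    """GPUs required to serve a workload (e.g. requests/sec), rounded up."""
    return math.ceil(workload / throughput_per_gpu)

# Hypothetical numbers for illustration only.
baseline  = gpus_needed(workload=1000, throughput_per_gpu=10)  # 100 GPUs
optimized = gpus_needed(workload=1000, throughput_per_gpu=20)  # 2x throughput -> 50 GPUs

savings = 1 - optimized / baseline
print(f"baseline={baseline} GPUs, optimized={optimized} GPUs, savings={savings:.0%}")
# -> baseline=100 GPUs, optimized=50 GPUs, savings=50%
```

Note the caveat this sketch makes visible: the ~50% saving assumes GPU capacity is the binding constraint and that spend scales linearly with GPU count.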