item 35623193 | 90% savings using optimised Whisper API and distributed computing (qblocks.cloud) | 4 points | svij137 | 2 years ago | 1 comment
svij137 | 2 years ago

As a developer, I achieved a 90% cost reduction on transcription and translation by optimizing OpenAI Whisper and using an unconventional approach to sourcing GPUs.

TL;DR:
1. CTranslate2 is used to optimize the OpenAI Whisper model for efficient inference with Transformer models.
2. It can be installed on a Q Blocks GPU instance with a single command.
3. The optimized Whisper model runs on a Q Blocks decentralized 3090 GPU instance. A comparison showed that the optimized model on Q Blocks cut costs 12x compared to the default model on AWS.
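The steps above can be sketched with faster-whisper, an open-source CTranslate2-based reimplementation of Whisper (the post does not name the exact package, so this is an assumption); the model size, compute type, and audio path below are illustrative placeholders, not the poster's exact setup.

```shell
# Assumption: faster-whisper (built on CTranslate2) stands in for the
# poster's unspecified optimized-Whisper setup.
pip install faster-whisper

# Run the optimized model; "large-v2", float16, and audio.mp3 are placeholders.
python - <<'EOF'
from faster_whisper import WhisperModel

# float16 weights roughly halve GPU memory vs. the default fp32 model,
# which is part of how CTranslate2 speeds up inference on a 3090.
model = WhisperModel("large-v2", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator of segments plus audio metadata
segments, info = model.transcribe("audio.mp3", task="transcribe")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
EOF
```

The same `WhisperModel` call with `task="translate"` covers the translation use case the poster mentions.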