top | item 38634369


atty | 2 years ago

Our comparisons were done a while ago, so I apologize that I can’t remember whether we used a beam size of 1 or 5. Whichever we picked, we kept it consistent across models.

Insanely fast whisper (god I hate the name) is really a CLI around Transformers’ whisper pipeline, so you can just use the pipeline directly with any of the settings Transformers exposes, including beam size.

We also deal with very poor audio, which is one of the reasons we went with faster whisper. However, we have identified failure modes in faster whisper that only occur because of the conditioning on the previous segment, so everything is really a trade-off.
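
For anyone who wants to test that trade-off, here is a rough sketch using faster-whisper's `condition_on_previous_text` and `beam_size` arguments to `WhisperModel.transcribe`. The helper names, the `large-v3` model size, and the defaults are assumptions for illustration, not the setup used in the comparisons above.

```python
def transcribe_options(beam_size=5, condition_on_previous_text=False):
    """Build keyword arguments for WhisperModel.transcribe (hypothetical helper)."""
    if beam_size < 1:
        raise ValueError("beam_size must be at least 1")
    return {
        "beam_size": beam_size,
        # False stops each window from being prompted with the previous
        # window's text, at the cost of less consistency across segments.
        "condition_on_previous_text": condition_on_previous_text,
    }


def transcribe(audio_path, model_size="large-v3", **overrides):
    """Transcribe one file; model_size is an assumed default."""
    # Imported lazily so the options helper works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size)
    segments, _info = model.transcribe(audio_path, **transcribe_options(**overrides))
    # segments is a generator; decoding happens while it is consumed.
    return " ".join(segment.text.strip() for segment in segments)
```

Running `transcribe("audio.wav")` and `transcribe("audio.wav", condition_on_previous_text=True)` on the same poor-quality file is one way to see whether a given failure comes from the conditioning.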


sanchit-gandhi | 2 years ago

Indeed, insanely-fast-whisper supports beam search with a small modification to the code snippet at: https://huggingface.co/openai/whisper-large-v3

Just call the pipeline with:

result = pipe(sample, generate_kwargs={"num_beams": 5})
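
For context, a minimal end-to-end sketch of that call might look like the following. It assumes the `transformers` library is installed and uses the `openai/whisper-large-v3` checkpoint from the link above; the function names and defaults are hypothetical, and the model card's snippet sets additional options (device, dtype) that are left out here.

```python
def beam_search_kwargs(num_beams=5):
    """Build the generate_kwargs dict passed to the pipeline call (hypothetical helper)."""
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    # num_beams=1 is greedy decoding; larger values enable beam search.
    return {"num_beams": num_beams}


def transcribe(audio_path, model_id="openai/whisper-large-v3", num_beams=5):
    """Transcribe one audio file with beam search via the Transformers ASR pipeline."""
    # Imported lazily so the kwargs helper works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model=model_id)
    result = pipe(audio_path, generate_kwargs=beam_search_kwargs(num_beams))
    return result["text"]
```

Calling `transcribe("audio.wav", num_beams=1)` and `transcribe("audio.wav", num_beams=5)` on the same file gives a quick greedy versus beam-search comparison.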