I built a C++ inference engine for NVIDIA's Parakeet speech recognition models using Axiom (https://github.com/Frikallo/axiom), my tensor library.

What it does:
- Runs 7 model families: offline transcription (CTC, RNNT, TDT, TDT-CTC), streaming (EOU, Nemotron), and speaker diarization (Sortformer)
- Word-level timestamps
- Streaming transcription from microphone input
- Speaker diarization detecting up to 4 speakers

aaronbrethorst|3 days ago

I see a number of references to macOS support in your docs for Axiom. Can this run on iOS?

noahkay13|2 days ago

Theoretically, yes? This hasn't been tested, but Xcode has great C++ interop, and the goal with Axiom and now parakeet.cpp is portable deployment, so making that process easier is definitely on the roadmap.

computerex|2 days ago

Oh hey, I just implemented this in Go. My implementation is heavily optimized for CPU.

pdyc|2 days ago

Can you share your repo?