MichealCodes | 4 months ago
I don't think we've had the transformer moment for audio training yet, but yes, in theory audio-first models will be much more capable.
trollbridge | 4 months ago
Particularly interesting would be transformations between tokenised audio and tokenised text. I recall someone once telling me that up to 90% of communication can be non-verbal, so when an LLM sticks to just text, it's only getting 10% of the data.