I work at Google on the Gemma team, and while I'm not on the core team for this model, I participated a bit in this project.
I personally was happy to see this project get built. The dolphin researchers have been doing great science for years, and from the computational/mathematics side it was quite neat to see how that was combined with the Gemma models.
It's great that dolphins are getting audio decoders in language models first; does the Gemma team intend to roll that out for human speech at some point too?
This sounds very cool at a conceptual level, but the article left me in the dark about what they're actually doing with DolphinGemma. The closest to an answer is:
>By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins' natural communication — a task previously requiring immense human effort.
But this doesn't really tell me anything. What does it mean to "help researchers uncover" this stuff? What is the model actually doing?
As far as I can tell, it hasn't actually done anything yet.
The article reads like the press releases you see from academic departments, where an earth-shattering breakthrough is juuuuust around the corner. In every single department, of every single university.

It's more PR fluff than substance.
Cool to see the use of consumer phones as part of the setup. Having a suite of powerful sensors, processing, display, and battery in a single, compact, sealed package must be immensely useful for science.
Tangential, but this brings up a really interesting question for me.
LLMs are multi-lingual without really trying, assuming the languages in question are sufficiently well-represented in their training corpus.

I presume their ability to translate comes from the fact that there are lots of human-translated passages in their corpus; the same work in multiple languages, which lets them figure out the necessary mappings between semantic points (words).
But I wonder about the translation capability of a model trained on multiple languages but with completely disjoint documents (no documents that were translations of another, no dictionaries, etc).
Could the emerging latent "concept space" of two completely different human languages be similar enough that the model could translate well, even without ever seeing examples of how a multilingual human would do a translation?
I don't have a strong intuition here but it seems plausible. And if so, that's remarkable because that's basically a science-fiction babelfish or universal translator.
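As a toy illustration of that intuition (entirely made-up data, just a sketch of the structural idea, not anything DolphinGemma does): give two "languages" disjoint corpora generated from the same underlying world, and a word-for-word dictionary can sometimes be recovered purely by matching co-occurrence structure, with no parallel text at all.

```python
from itertools import permutations

# Two disjoint toy corpora drawn from the same underlying "world":
# sentence frequencies match across languages, but no sentence is ever
# paired with its translation and no dictionary is given.
corpus_en = ["cat eats"] * 2 + ["cat sleeps"] * 4 + ["dog eats"] * 3 + ["dog sleeps"]
corpus_es = ["gato come"] * 2 + ["gato duerme"] * 4 + ["perro come"] * 3 + ["perro duerme"]

def cooc(corpus):
    """Vocabulary plus symmetric within-sentence co-occurrence counts."""
    vocab = sorted({w for s in corpus for w in s.split()})
    idx = {w: i for i, w in enumerate(vocab)}
    m = [[0] * len(vocab) for _ in vocab]
    for s in corpus:
        ws = s.split()
        for a in ws:
            for b in ws:
                if a != b:
                    m[idx[a]][idx[b]] += 1
    return vocab, m

v_en, m_en = cooc(corpus_en)
v_es, m_es = cooc(corpus_es)

def cost(perm):
    """How badly the two co-occurrence structures disagree under a word mapping."""
    n = len(v_en)
    return sum((m_en[i][j] - m_es[perm[i]][perm[j]]) ** 2
               for i in range(n) for j in range(n))

# Brute-force the word mapping that best aligns the two structures.
best = min(permutations(range(len(v_es))), key=cost)
dictionary = {v_en[i]: v_es[best[i]] for i in range(len(v_en))}
print(dictionary)  # {'cat': 'gato', 'dog': 'perro', 'eats': 'come', 'sleeps': 'duerme'}
```

Real unsupervised translation works in continuous embedding spaces rather than brute-forcing permutations, but the underlying bet is the same one made above: the two languages' internal structures are similar enough to align.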
Check out this recent benchmark MTOB (Machine Translation from One Book) -- relevant to your comment, though the book does have parallel passages so not exactly what you have in mind: https://arxiv.org/pdf/2309.16575
In the case of non-human communication, I know there has been some fairly well-motivated theorizing about the semantics of individual whale vocalizations. You could imagine a first pass at something like this if the meaning of (say) a couple dozen vocalizations could be characterized with a reasonable degree of confidence.
Super interesting domain that's ripe for some fresh perspectives imo. Feels like at this stage, all people can really do is throw stuff at the wall. The interesting part will begin when someone can get something to stick!
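A first pass at the article's "recurring sound patterns, clusters" could be as simple as clustering per-call acoustic summaries. A minimal sketch with invented numbers (real pipelines would use spectrogram embeddings, not two hand-picked features):

```python
import random

def kmeans(points, k, iters=20, seed=1):
    """Minimal k-means: group feature vectors into k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # Move each center to the mean of its assigned points.
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Pretend each point is a (duration_s, peak_frequency_kHz) summary of one call.
calls = [(0.10, 9.0), (0.12, 9.2), (0.11, 8.9),     # one recurring call type
         (0.50, 15.0), (0.52, 15.2), (0.49, 14.8)]  # another
centers, groups = kmeans(calls, k=2)
print(sorted(len(g) for g in groups))  # [3, 3]: two recurring call types
```

The hard part isn't the clustering; it's whether the resulting "call types" correspond to anything meaningful to the dolphins, which is where the human researchers come back in.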
> that's basically a science-fiction babelfish or universal translator
Ten years ago I would have laughed at this notion, but today it doesn't feel that crazy.
I'd conjecture that over the next ten years, this general line of research will yield some non-obvious insights into the structure of non-human communication systems.
Increasingly feels like the sci-fi era has begun -- what a time to be alive.
>lots of human-translated passages in their corpus
Yes. I remember reading that the EU parliamentary proceedings in particular are used to train machine translation models. Unfortunately, I can't remember where I read that. I did find the dataset: https://paperswithcode.com/dataset/europarl
Languages encode similar human experiences, so their conceptual spaces probably have natural alignments even without translation examples. Words for common objects or emotions might cluster similarly.
But without seeing actual translations, a model would miss nuances, idioms, and how languages carve up meaning differently. It might grasp that "dog" and "perro" relate to similar concepts without knowing they're direct translations.
Wow, there's a lot of cynicism in this thread, even for HN.
Regardless of whether or not it works perfectly, surely we can all relate to having had, at one point or another, the childhood desire to 'speak' to animals?
You can call it a waste of resources or someone's desperate attempt at keeping their job if you want, but these are marine biologists. I imagine cross-species communication would be a major achievement, and it seems like a worthwhile endeavor to me.
I'm as cynical as the next guy, or more so - but it seems to me that being able to communicate with animals has high utility for humans. Partly from an emotional or companionship perspective, as we've been doing with dogs for a long time, but maybe even on purely utilitarian grounds.
If we want to know something about what's going on in the ocean, or high on a mountain or in the sky or whatever - what if we can just ask some animals about it? What about for things that animals can naturally perceive that humans have trouble with - certain wavelengths of light or magnetic fields for example? How about being able to recruit animals to do specific tasks that they are better suited for? Seems like a win for us, and maybe a win for them as well.
Not sure what else, but history suggests that the more people have been able to communicate with each other, the better the outcomes. I assume this holds true more broadly as well.
It's not even about the communication! Just having more insight into the brains and communication of other mammals has a ton of scientific value in its own right.
Sometimes it's good just to know things. If we needed to find a practical justification for everything before we started exploring it, we'd still be animals.
I for one am simply happy to see us trying to apply LLMs to something other than replacing call centers... humankind SHOULD be exploring and learning sometimes even when there isn't an ROI.
Don’t understand the cynicism either. Is this not way cooler than the latest pre-revenue series F marketing copy slop bot startup?
The cynicism on display here is little more than virtue signalling and/or upvote farming.

Sad to see such thoughtless behaviour has reached even this bastion of reason.
To me this task looks less like next token prediction language modeling and more like translating a single “word” at a time into English. It’s a pretty tractable problem. The harder parts probably come from all the messiness of hearing and playing sounds underwater.
I would imagine adapting to new vocab would be pretty clunky in an LLM based system. It would be interesting if it were able to add new words in real time.
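For a rough sense of why adding vocabulary to a trained model is clunky (a sketch of a common heuristic, not how DolphinGemma works): a newly added token has no trained representation, so one option is to initialize its embedding to the mean of the existing rows, which is "neutral" but carries no word-specific signal until further training.

```python
import random

class GrowableEmbedding:
    """Toy embedding table whose vocabulary can grow after training."""
    def __init__(self, vocab, dim=4, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Pretend these vectors were learned during training.
        self.table = {w: [rng.gauss(0, 1) for _ in range(dim)] for w in vocab}

    def add_word(self, word):
        # A new word has no trained vector. A common heuristic is to start
        # it at the mean of the existing rows, so it is "neutral" rather
        # than random noise -- but it stays uninformative until fine-tuned.
        rows = list(self.table.values())
        self.table[word] = [sum(r[i] for r in rows) / len(rows)
                            for i in range(self.dim)]

emb = GrowableEmbedding(["click", "whistle", "buzz"])
emb.add_word("novel-whistle")
print(len(emb.table))  # 4
```

In a real LLM stack the analogous steps are extending the tokenizer and resizing the embedding matrix, followed by additional training; "real time" vocabulary growth would mean doing that continuously, which is exactly the clunky part.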
Gemini supposedly allows for conversational speech w/your data. Have you tried it? We have; it's laughably bad and can't get the most basic stuff right from a well-crafted datamart.
If it can't do the most basic stuff, please explain to me how in the fuck it is going to understand dolphin language, and why we should believe its results anyway?
This crowd seems to be cross pollinated with the sci-fi / space exploration set. Communication with cetaceans seems like such an obvious springboard for developing frameworks and techniques for first contact with E.T. /If/ you believe they're out there... And if not, what an incredible opportunity in its own right.
But, since context is so important to communication, I think this would be easier to accomplish with carefully built experiments with captive dolphin populations first. Beginning with wild dolphins is like dropping a guy from New York City into rural Mongolia and hoping he'll learn the language.
I wonder what the status quo is on the non-LLM side; are we able to manually decode sound patterns to recognize dolphins' communication to some degree? If that's the case, I guess this may have a chance.
https://www.nytimes.com/2017/12/08/science/dolphins-machine-...
The experiment design sounds pretty cool. I hope they see some promising results. It would be wonderful if humans could talk to another intelligent creature here on earth. This is certainly a step on the way there.
The article says that they've only just begun deploying it, and that it will merely be used to speed up the process of recognizing patterns.
> WDP is beginning to deploy DolphinGemma this field season with immediate potential benefits. By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins' natural communication — a task previously requiring immense human effort. Eventually, these patterns, augmented with synthetic sounds created by the researchers to refer to objects with which the dolphins like to play, may establish a shared vocabulary with the dolphins for interactive communication.
My secret wish is that once they decode the language, they hear the dolphins say to themselves: look, it's again those idiot humans trying to bother us, why can't they just live happily like we do?

And then the world will suddenly understand...
Can a powerful model become a fantastic autocomplete for dolphins? Sure. Someday soon that's very likely to happen. But that alone would tell us almost nothing of what dolphin dialogue means.

To understand their language we need shared experiences, shared emotions, common internal worlds. Observation of dolphin-dolphin interaction would help, but to a limited degree.
It would help if the dolphins are also interested in teaching us. Dolphins, or we, could say to the other: '... that is how we pronounce sea-cucumber'. Shared nouns would be the easiest.
The next level, a far harder level, would be to reach the stage where we can say 'the emotion that you are feeling now, that we call "anger"'.
We will not quite have the right word for "anxiety that I feel when my baby's blood flow doesn't sound right in Doppler".
Teaching or learning 'ennui' and 'schadenfreude' would be a whole lot harder.
This raises a question: can one fully feel and understand an emotion we do not have a word for? Perhaps Wittgenstein has an answer.
Postscript: I seem to have triggered quite a few of you and that has me surprised. I thought this would be neither controversial nor unpopular. It's ironic in a way. If we can't understand each other, understanding dolphin "speech" would be a tough hill to climb.
The fact that you cannot wrap your head around something doesn't mean that it's not possible. I do not claim that it is surely possible nor that it isn't. But it sure as hell looks possible.
You also probably don't have kids.
For example: how do you teach a child to speak? Or someone a new language? You show them some objects and say their names. The same with the seagrass and/or a scarf. That's one way. Dolphins can also see (divers with) some objects and name them. We can also guess what they are saying from the sounds plus the actions they do. That's probably how we got "seagrass" in the first place.
For all the words that they don't have in their language, we/they can invent them. Just like we do all the time: artificial intelligence, social network, skyscraper, surfboard, tuxedo, black hole, whatever...
It might also be possible that dolphins' language uses the same patterns as our language(s) and that an LLM that knows both can manage to translate between the two.
I suggest a bit more optimistic outlook on the world, especially for something that pretty much can't have any negative consequences for humanity.
I think you are describing more of an edge case than you might think for a vertebrate, mammalian, social, warm-blooded, air-breathing, earth-dwelling pack hunter.
>To understand their language we need shared experiences, shared emotions, common internal worlds
Why? With modern AI there is unsupervised learning for translation, where you don't have to explicitly make translation pairs between the two languages. It seems possible to eventually create a way to translate without having a teaching process for individual words like you describe.
Not directly related, but one of those stories that is so bizarre you almost can't believe it isn't made up.
There was a NASA-funded attempt to communicate with dolphins. This eccentric scientist created a house that was half water (a series of connected pools) and half dry spaces. A woman named Margaret Howe Lovatt lived full-time with the dolphins, attempting to learn a shared language between them.

Things went completely off the rails in many, many ways. The lead scientist became obsessed with LSD and built an isolation chamber above the house. This was like the sensory deprivation tanks you get now (often called float tanks). He would take LSD and place himself in the tank and believed he was psychically communicating with the dolphins.

1. https://www.theguardian.com/environment/2014/jun/08/the-dolp...
>A woman named Margaret Howe Lovatt lived full-time with the Dolphins attempting to learn a shared language between them.
She also had sex with a male dolphin called Peter.
>He would take LSD and place himself in the tank and believed he was psychically communicating with the Dolphins.
He eventually came to believe he was communicating with a cosmic entity called ECCO (Earth Coincidence Control Office). The story of the Sega game "Ecco the Dolphin" [1] is a tongue-in-cheek reference to this. I recommend watching the Atrocity Guide episode on John C. Lilly and his dolphin "science" [2]. It's on par with The Men Who Stare at Goats (the non-fiction book [3], not the movie).

He has a website that looks like it's been untouched since his death in 2001: http://www.johnclilly.com/

[1] https://en.wikipedia.org/wiki/Ecco_the_Dolphin

[2] https://www.youtube.com/watch?v=UziFw-jQSks

[3] https://en.wikipedia.org/wiki/The_Men_Who_Stare_at_Goats
Paraphrasing Carl Sagan: "You don't go to Japan and kidnap a Japanese man, start jerking him off, give him fucking acid, and then ask him to learn English!"
It's funny you were thinking that, because I was thinking, "how would you teach a Japanese man English?" The obvious answer is to jerk him off and give him high doses of LSD first. I immediately came to the same conclusion with this AI-dolphin stuff. Have they tried jerking off the dolphin and giving it LSD first? Apparently - yes.
The bad outcome is the "AI" will translate our hellos as an insult, the dolphins will drop the masquerade, reveal themselves as our superiors and pound us into dust once and forever.
Picture the last surviving human surrounded by dolphins floating in the air with frickin laser beams coming out of their heads... all angrily asking "why did you say that about our mother?".
And in the background, ChatGPT is saying "I apologize if my previous response was not helpful".
They've been working on decoding dolphin sounds for a long time - Thad was telling me about this project in 2015, and it had been ongoing for a while before that. One challenge is that doing this in real time is extremely difficult because of the frequencies dolphin speech occupies, and they do want to do it in real time, which adds to the difficulty. The other challenge on the AI side is that traditional AI uses supervised learning, whereas dolphin speech would require unsupervised learning. It would be interesting to learn more about how Gemma is helping here.
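For a back-of-envelope sense of the frequency problem (the 130 kHz figure is a typical number from the bioacoustics literature, not from the article): dolphin clicks extend far above human hearing, so by the Nyquist criterion the recording hardware must sample at rates well beyond what ordinary speech audio uses.

```python
# Assumed figures, not from the article: dolphin clicks extend to roughly
# 130 kHz, versus the ~16 kHz sampling rate common for human-speech models.
max_click_hz = 130_000
nyquist_rate_hz = 2 * max_click_hz     # minimum sampling rate to capture the clicks
speech_model_rate_hz = 16_000

print(nyquist_rate_hz)                         # 260000
print(nyquist_rate_hz / speech_model_rate_hz)  # 16.25x the samples per second
```

That order-of-magnitude gap in data rate is one concrete reason real-time processing on phone-class hardware is a genuine engineering problem, not just a modeling one.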