I'm conflicted about this. On one hand, it makes content more accessible to a larger audience. On the other hand, it leverages copyrighted material without crediting or compensating creators, potentially puts those same creators out of work, and finally, reduces the likelihood of more such (human) creators arising in the future. My worry is that a few generations hence, human beings will forget many skills like this, and if model collapse occurs due to LLMs ingesting their own data over successive iterations, future generations will be in for a difficult time. Reminiscent of Asimov's "The Feeling of Power".
Wondercraft have been offering this service for a while, and they produce some of their own auto-generated podcasts, including the Hacker News Recap, which does an excellent job of summarizing the most engaged-with posts on HN.
https://www.wondercraft.ai/our-podcasts
I made one for fun last year. It was quite easy to get two hosts talking to each other in a natural manner. It's just a Python script where I tell it which Reddit discussion or other topic to make an episode segment about, and it works fine as long as I cherry-pick from a few generations.
Here's an example segment, demonstrating an extra feature where they can call an expert to weigh in on whatever they are talking about:
https://soundcloud.com/bemmu/19animals
I hate the robo voiced videos. I watch a lot of space content and run into them often on the homepage. Usually easy to spot with low views and 1k subs.
Very clever use case. I'm presuming the set up here is as follows:
- LLM-driven back and forth with the paper as context
- Text-to-speech
Pricing for high-quality text-to-speech with Google's Studio voices runs at USD 160.00 per 1M characters. A 10-minute recording at an average 130 WPM is 1,300 words, or about 6,500 characters at 5 characters per word, so we can estimate an audio cost of roughly $1. The LLM cost is probably about the same, given the research-paper processing and conversation.
So only costs about $2-3 per 10 minute recording. Wild.
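A quick sanity check of that back-of-the-envelope math (the per-character rate, speaking pace, and word length are the figures assumed above, not verified pricing):

```python
# Back-of-envelope cost for a 10-minute generated episode, using the
# figures assumed above: $160 per 1M characters for Studio-quality TTS,
# 130 words per minute, 5 characters per word.
TTS_USD_PER_CHAR = 160.00 / 1_000_000
MINUTES = 10
WORDS_PER_MINUTE = 130
CHARS_PER_WORD = 5

words = MINUTES * WORDS_PER_MINUTE   # 1,300 words
chars = words * CHARS_PER_WORD       # 6,500 characters
tts_cost = chars * TTS_USD_PER_CHAR  # about $1.04
llm_cost = tts_cost                  # assume the LLM side costs about the same

print(f"TTS ${tts_cost:.2f} + LLM ${llm_cost:.2f} = ${tts_cost + llm_cost:.2f}")
# prints: TTS $1.04 + LLM $1.04 = $2.08
```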
One problem I see with this is legitimizing LLM-extracted content as canon. The realistic human speech masks the fact that the LLM might be hallucinating or highlighting the wrong parts of a book/paper as important.
The top of the Apple Podcasts chart is full of real humans intentionally lying or manipulating information, so computer-generated lies worry me much less.
We'll have to see how it holds up for general books. The books they highlighted are all very old and very famous, so the training set of whatever LLM they use definitely has a huge amount of human-written content about them, and the papers are all relatively short.
We can find thousands of hours of discussions about popular papers such as "Attention is All You Need". It should be possible to generate something similar without using the paper as a source -- and I suspect that's what the AI is doing here.
In other words: it's not summarising the paper in a clever way, it is summarising all the discussions that have been made about it.
This is really cool, and it got me thinking - is there any missing piece to creating a full AI lecturer based on this?
What I'm thinking of is that I'd input a PDF, and the AI would do a bit of preprocessing to create learning outcomes, talking points, visual aids, and comprehension questions for me. Then, once it's ready, it would begin to lecture me on the topic, allowing me to interrupt at any point with my questions, after which it would resume the lecture while adapting to any new context from my interruptions.
Listening to an AI generated discussion-based podcast on the topic of anticipating the scraping of deceased people's digital footprint to create an AI copy of your loved one makes the cells that make up my body want to give up on fighting entropy.
With Google's 1-million-token and Sonnet 3.5's 200,000-token limits, is there any advantage to using this over just uploading the PDF files and asking questions about them?
I was under the impression that you will get more accurate results by adding the data in chat.
I've been using the ElevenLabs Reader app to read articles during my drive, and it's been amazing. It's great to be able to listen to Money Stuff whenever I want. The audio quality is about 90% there. Occasionally the tone of a sentence is wrong (surprised when it should be sad), or it picks the wrong pronunciation (bow, as in bowing down, when it should be bow, as in tying a bow), but it's still very listenable.
The reading is very natural overall, though sometimes the emphasis is a bit off. What catches my ear is when Word A in a sentence receives stronger stress than Word B, but the longer context suggests that actually it should be Word B with the greater emphasis. An inexperienced human reader might miss that as well, but a professional narrator who is thinking about the overall meaning would get it right.
I prefer professional human narration when it is available, but the Reader app’s ability to handle nearly any text is wonderful. AI-read narration can have another advantage: clarity of enunciation. Even the most skillful human narrator sometimes slurs a consonant or two; the ElevenLabs voices render speech sounds distinctly while still sounding natural.
1. Take a science book. I used one Einstein loved as a kid, in German. But I can also use Asimov in English. Or anything else. We’ll handle language and outdated information on the LLM level.
2. Extract the core ideas and narrative with an LLM and rewrite it into a conversation, say, between a curious 7 year old girl and her dad. We can take into account what my kids are interested in, what they already know, facts from their own life, comparisons with their surroundings etc. to make it more engaging.
3. Turn it into audio using Text-to-Speech (multiple voices).
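A minimal sketch of those three steps. The LLM and TTS calls themselves are left out (use whatever APIs you prefer); this shows only the deterministic glue, and the prompt wording and `GIRL`/`DAD` labels are made up for illustration:

```python
# Sketch of the three-step pipeline described above: prompt assembly for
# step 2, plus splitting the dialogue into per-voice turns for step 3's
# multi-voice TTS. The actual LLM/TTS calls are not shown.

def build_prompt(book_excerpt: str, child_interests: list[str]) -> str:
    """Step 2: ask an LLM to rewrite the material as a dialogue."""
    return (
        "Rewrite the following passage as a conversation between a curious "
        "7-year-old girl and her dad. Translate to English if needed, flag "
        f"outdated facts, and tie examples to: {', '.join(child_interests)}.\n\n"
        f"Passage:\n{book_excerpt}\n\n"
        "Format each line as 'GIRL:' or 'DAD:'."
    )

def split_turns(dialogue: str) -> list[tuple[str, str]]:
    """Step 3 prep: split LLM output into (speaker, line) pairs so each
    speaker can be rendered with a different TTS voice."""
    turns = []
    for line in dialogue.splitlines():
        speaker, _, text = line.partition(":")
        if speaker.strip() in ("GIRL", "DAD") and text:
            turns.append((speaker.strip(), text.strip()))
    return turns

# With canned LLM output:
sample = "GIRL: Why is the sky blue?\nDAD: Sunlight scatters off air molecules."
print(split_turns(sample))
# prints: [('GIRL', 'Why is the sky blue?'), ('DAD', 'Sunlight scatters off air molecules.')]
```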
While this is very nice, what I need is for my computer to take voice commands, read content in various formats and structures, and take dictation for all of my apps. I need this on my phone too. I can do this now, but I have to use a bunch of different tools that don't work seamlessly together. I need a voice and conversational user interface that is built into the operating system.
That sounds like a great broader vision, but let's also celebrate the significant step in that direction that this work presents. This appears to be very useful as is.
I like how it generates a conversation, rather than just "reading out" or simplifying the content. You can extend this idea to enhance the dynamics of agent interactions
I think the obvious next feature for this specific thing is to be able to click to begin asking questions in the context of the audio you just listened to. You can basically become one of the hosts: "You mentioned RNNs before, tell me more about that."
One useful application would be making academic papers more accessible: people could listen to arXiv papers that seem interesting. It would be a useful tool in the academic world, and for students it would offer a more accessible form of learning.
I already have a project idea: use the arXiv RSS API to fetch interesting papers based on keywords (or an LLM summary), pass them to something like Illuminate, and you have a listening queue for following the latest in the field. There will be some problems with formatting, but then you could just open the PDF to see the plots and equations.
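The fetching-and-filtering half of that idea is easy to sketch. The query URL below targets arXiv's public Atom API (per arXiv's API docs); the keyword filter is plain string matching over titles and abstracts:

```python
# Sketch of the keyword-filtered arXiv listening queue described above.
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def arxiv_query_url(category: str, max_results: int = 20) -> str:
    """Build a query for the newest submissions in a category, e.g. 'cs.LG'."""
    params = urllib.parse.urlencode({
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    })
    return f"http://export.arxiv.org/api/query?{params}"

def matching_papers(atom_xml: str, keywords: list[str]) -> list[str]:
    """Return titles of entries whose title or abstract mentions any keyword."""
    root = ET.fromstring(atom_xml)
    hits = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", "")
        summary = entry.findtext(f"{ATOM}summary", "")
        haystack = f"{title} {summary}".lower()
        if any(k.lower() in haystack for k in keywords):
            hits.append(title.strip())
    return hits
```

Each hit's PDF could then be fed to an Illuminate-style generator and the resulting audio appended to a private podcast feed.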
I can see this working reasonably for text that you can understand without referring to figures, and for texts for which there is external content available that such a conversation could be based on. For a new, say, math paper, without prose interspersed, I’d be surprised if the generated conversation will be worth much. On the other hand, that is a corner case and, personally, I suspect I will be using this for the many texts where all I need is a presentation of the material that is easy to listen to.
Occasionally there's a podcast or video I'd like to listen to, but one of the voices is either difficult to understand or in some way awful to listen to, or maybe the sound quality is really bad. It would be nice to have an option for automatically redubbed audio.
I sure do wish podcasters would learn about compression. I am constantly getting my ears blown out in the car from a podcast with multiple speakers who are at different volumes.
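The simplest version of the fix being asked for is loudness normalization rather than compression per se: bring each speaker's track to the same level before mixing. A toy sketch (real podcast tooling would normalize to a LUFS target, not raw RMS):

```python
# Normalize each speaker's track to the same RMS level before mixing,
# so no single voice blows your ears out.
import math

def normalize_rms(samples: list[float], target_rms: float = 0.1) -> list[float]:
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return list(samples)  # silent track: nothing to scale
    gain = target_rms / rms
    return [s * gain for s in samples]

quiet_guest = [0.01, -0.01, 0.01, -0.01]  # mumbling guest
loud_host = [0.8, -0.8, 0.8, -0.8]        # shouting host
mixed = [a + b for a, b in zip(normalize_rms(quiet_guest), normalize_rms(loud_host))]
```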
I listened to 5 minutes of this, and all I can feel is sadness at how cringe it is.
Please do not replace humanity with a faint imitation of what makes us human: actual spontaneity.
If you produce AI content, don't emulate small talk and quirky side jabs. It's pathetic.
This is just more hot garbage on top of a pile of junk.
I imagine a brighter future where we can choose to turn that off and remove it from search, like the low quality content it is. I would rather read imperfect content from human beings, coming from the source, than perfectly redigested AI clown vomit.
Note: I use AI tools every day. I have nothing against AI generated content, I have everything against AI advancements in human replacement, the "pretend" part. Classifying and returning knowledge is great. But I really dislike the trend of making AI more "human like", to the point of deceiving, such as pretending small talk and perfect human voice synthesis.
If AI-generated speech is robot-like, dull and monotonous, it will be boring.
I think we need human-like speech to make it interesting to listen to.
What's your solution to this problem?
OTOH, I think the AI-generated stuff should be clearly marked as such so there is no pretending.
I think they've set it up to sound like NPR meets patronizing customer support agent. They could easily set it up to sound exactly the way you / any listener would like to hear their podcasts.
But yeah - like electronic instruments, AI will take away the blue collar creative jobs, leaving behind a lot more noise and an even greater economic imbalance.
>don't emulate small talk and quirky side jabs. It's pathetic.
>all I can feel is sadness and how cringe it is.
Hm, really? I came to the opposite conclusion. I explained this to a friend who can see very little and usually relies on audio to experience a lot of the world and of written content; it is especially hard because a lot of written content isn't available in audio form or isn't talked about.
He was pretty excited about it, and so am I. Maybe it's not the use case for you, and that's fine, but going "this is pathetic, no one is using it, le cringe" is a bit far.
"Illuminate is an experimental technology that uses AI to adapt content to your learning preferences. Illuminate generates audio with two AI-generated voices in conversation, discussing the key points of select papers. Illuminate is currently optimized for published computer science academic papers.
As an experimental product, the generated audio with two AI-generated voices in conversation may not always perfectly capture the nuances of the original research papers. Please be aware that there may be occasional errors or inconsistencies and that we are continually iterating to improve the user experience."
Looks like you can generate from Website URLs if you add them as sources to your notebook, as well as Slides, Docs, PDFs etc. Anything NotebookLM supports.
What a fantastic idea!
Great way to learn about those pesky research papers I keep downloading (but never get around to reading).
I tried a few, e.g. Attention is All You Need, etc. The summary was fantastic, and the discussion was, well, informative.
Does anyone know how the summary was generated? (text summarization, I suppose?) Is there a bias towards "podcast-style discussion"? Not that I'm complaining about it - just that I found it helpful.
Like SSML? See Azure TTS or Google Cloud TTS, or IBM Watson, or even old-school system TTS like SAPI voices on Windows. But I hear you: in a typical VITS-based model system, SSML isn't standard. Piper TTS does have it on the roadmap.
AI voices sound particularly good at higher playback rates with silence removal. That, granted, is an acquired taste, but it's a common feature in podcast players, so there's an audience for it. Fast talkers feel more competent, and one kind of stops interrogating the quality of speech.
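Silence removal itself is simple enough to sketch: cap any run of near-silent samples at a maximum length. The threshold and gap values here are arbitrary illustrations; real players work on time windows, not individual samples:

```python
# Toy silence removal: keep at most max_gap consecutive near-silent
# samples, dropping the rest of each silent run.
def trim_silence(samples, threshold=0.02, max_gap=3):
    out, gap = [], 0
    for s in samples:
        if abs(s) < threshold:
            gap += 1
            if gap <= max_gap:   # keep a short pause, drop the excess
                out.append(s)
        else:
            gap = 0
            out.append(s)
    return out
```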
What does this accomplish? Who does this help? How does this make the world a better place?
This only seems like it would be useful for spammers trying to game platforms, which is silly because spam is probably the number one thing bringing down the quality of Google's own products and services.
How about making the program work in the other direction? It could take one of those 30-minute YouTube tutorial videos that is full of fluff and music and turn it into an Instructables-like text article with a few still pictures.
It also tells us something about humans, because it really does feel more engaging having two voices discussing a subject than simple text-to-speech, even though the information density is smaller.
The choice of intonation even mimics creatives, which I'm sure they'll love. The vocal fry, the talking through a forced smile, the bumbling host: it's all so typical. Only, no one minds demanding better from a robot, so it's even more excruciating fluff with no possible parasocial angle.
Limiting choice to frivolous voices is really testing the waters for how people will respond to fully acted voice gen from them, they want that trust from the creative guild first. But for users who run into this rigid stuff it's going to be like fake generated grandma pics in your google recipe modals.
Books I can understand, but I'm genuinely curious: would anyone here find it useful to hear scientific papers as narrated audio? Maybe it depends on the field, but when I read e.g. an ML paper, I almost always have to go through it line-by-line with a pen and scratchpad, jumping back and forth and taking notes, to be sure I've actually "got it". Sometimes I might read a paragraph a dozen times. I can't see myself getting any value out of this, but I'm interested if others would find it useful.
Maybe I'm the odd one out but "That's interesting. Can you elaborate more?", "Good question", "That sounds like a clever way" etc were annoying filler.
Synthesized voices are legitimately a great way to read more and give your eyes a break. I personally prefer just converting a page or book to an audiobook myself locally.
The new piper TTS models are easy to run locally and work very well. I made a simple CLI application and some other folks here liked it so figured I post it.
I'm fairly excited for this use case. I recently made the switch from Audible to Libby for my audiobook needs. Overall, it's been good/fine, but I get disappointed when the library only has text copies of a book I want to listen to. Often times they aren't especially popular books so it seems unlikely they'll get a voiceover anytime soon. Using AI to narrate these books will solve a real problem I experience currently :)
If I can tell where content came from, it's fine with me. If a host of paid spammers or bots can astroturf an opinion and fool me into thinking they are a wide demographic, that's a problem. And it is, but it predates LLMs.
I honestly don't think this is all that big. What we are seeing has been possible for more than 6 months now(?) with GPT-4 and ElevenLabs; it's just put together in a nice little demo website, with what seems like a multi-modal model(?) trained on NYT The Daily episodes lol. And no, I don't think this will gain all that much traction. We will keep valuing authentic human interaction more and more.
I think it's more likely this will end up merged as part of another offering. It feels more like a feature than a product, which I think is true of a lot of things on that list.
Works surprisingly well. I actually bothered to listen to "discussions" about these boring-looking papers.
English is particularly bad to read aloud because, like the programming language Fortran, it is based on immutable tokens. If you want tonal variety, you have to understand the content.
Some other languages modify the tokens themselves, so just one word can be pompous, comical, uneducated etc.
I'm bullish on podcasts as a passive-learning counterpart to the active learning style of traditional educational instruction. I will be releasing a general-purpose podcast generator for educational purposes on reasonote.com within the next few days, along with the rest of the core feature set.
Self-answer, but leaving this in case anyone else has the same question... it seems there are some new options in GCP TTS. Both "Studio" and "Journey" are new since I last checked (and I check pretty often).
We are working on something content-driven (for an ad or subscription model) with a lot of effort and time, and I am concerned about how this technology will affect all that effort and, eventually, our monetization ideas. But I can see how helpful this tool can be for learning new stuff.
I've been meaning to read the "Attention is All You Need" paper for years and never have. I finally listened to that little generated interview, their first example. I think this is going to be very, very useful to me!
Founder of podera.ai here. We're building this right now (turn anything into a podcast) with custom voices, customization, and more. Would love some HN feedback!
Amazing. I see great future ahead. We are already able to turn audiobooks into eBooks and Illuminate finally completes the circle of content regurgitation.
Many times I've wanted to listen to a summarization of a chapter from a textbook I'm reading. This can be useful in at least 3 ways:
1) It prepares me for the real studying. By being exposed to the gist of the material before actually studying it, I'm very confident the subsequent real study session would be more effective.
2) I can easily brush up on key concepts when I'm unable to sit properly, e.g. while commuting. Even when I can, a math textbook may be too dense for this purpose, and I often just want to refresh my memory of the key concepts. And often I'm tired of _reading_ symbols or words; that's when I'd prefer to _listen_, in a way using a muscle that isn't tired.
3) If I'm struggling with something, I can play a 5-minute chapter explanation multiple times a day throughout the week, while doing other stuff, engaging with it in a casual way. I think this would "soften" the struggle tremendously and increase the chances of grasping the thing the next time I tackle it.
I'd also like a "temperature" knob that I could tweak for how much detail I want it to go into.
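That detail knob could be as simple as mapping a 0-1 dial to explicit depth instructions in the summarization prompt sent to the LLM. A hypothetical sketch (all names and wording made up for illustration):

```python
# Map a 0-1 "detail" dial to depth instructions in the chapter prompt.
def chapter_prompt(chapter_text: str, detail: float) -> str:
    if detail < 0.33:
        depth = "a 2-minute overview of only the key ideas"
    elif detail < 0.66:
        depth = "a 5-minute summary with the main definitions and one example"
    else:
        depth = "a 15-minute walkthrough that includes the derivations"
    return (f"Summarize this textbook chapter as {depth}, "
            f"phrased for listening rather than reading:\n\n{chapter_text}")
```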
By now, we can find thousands of hours of discussions online about popular papers such as "Attention is All You Need". It should be possible to generate something similar without using the paper as a source -- and I suspect that's what the AI does.
In other words: I suspect that the output is heavily derivative from online discussions, and not based on the papers.
Of course, the real proof would be to see the output for entirely new papers.
Errata. Also, real humans often make mistakes in live interviews. The biggest difference is that eventually these fake humans will have lower error rates than real ones.
This is insane! To be able to listen to a conversation to learn about any topic is amazing. Maybe it's just me because I listen to so many podcasts but this is Planet Money or The Indicator from NPR about anything.
Definitely one of the coolest things I have seen an LLM do.
I guess I am in my grouchy-old-person phase, but all I could think of was the Gilfoyle quote from Silicon Valley when he was presented with a talking refrigerator.
> "Bad enough it has to talk, does it need fake vocal tics...?" - Gilfoyle
I think I just discovered a new emotion: simultaneous feelings of excitement and disappointment.
No matter how great the idea, it's hard to stay excited for more than a few microseconds at the sight of the word "Google". I can already hear the gravediggers' shovels preparing a plot in the Google graveyard, and hear the sobs of the people who built their lives, workflows, even jobs and businesses around something that will be tossed aside as soon as it stops being someone's pet play-thing at Google.
A strange ambivalent feeling of hope already tarnished with tragedy.
freefaler|1 year ago
Like with robovoiced videos on YT reading some scraped content.
[Attention is All You Need - 1:07]
> Voice A: How did the "Attention is All You Need" paper address this sequential processing bottleneck of RNNs?
> Voice B: So, instead of going step-by-step like RNNs, they introduced a model called the Transformer - hence the title.
What title? The paper is entitled "Attention is All You Need".
People are fooling themselves. These are stochastic parrots cosplaying as academics.
ec109685|1 year ago
It would be good to lead off with a disclaimer.
nine_k|1 year ago
In this regard, LLMs are imperfect like ourselves, just to a different extent.
Are we there yet?
marviel|1 year ago
Sign up and I'll let you in very soon.
gherkinnn|1 year ago
And before you know it, there is a story of David Cameron diddling a pig's head in his youth and now our deceased are being brought back to life.
Charlie Brooker was ahead of us all.
nxobject|1 year ago
I wish Google would make these experiments better known!
achow|1 year ago
https://cloud.google.com/text-to-speech/docs/voice-types#cha...
disqard|1 year ago
LLMs have "hacked" this channel, and can participate in a 1:1 conversation with a human (via text chat).
With good text <--> speech, machines can participate in a 1:1 oral conversation with a human.
I'm with you: this is hella scary and creepy.
[0] Walter J Ong: "Orality and Literacy".
C-Loftus|1 year ago
https://github.com/C-Loftus/QuickPiperAudiobook
frays|1 year ago
This is a very useful tool. I will star it and wait for Piper to support macOS.
colesantiago|1 year ago
Is this supposed to be a good thing that we want to accelerate (e/acc) towards?
ancorevard|1 year ago
I would like to send a text and then get back a podcast dialog between two people.
e12e|1 year ago
[1] https://illuminate.google.com/home?pli=1&play=SKUdNc_PPLL8
bluelightning2k|1 year ago
More of a tech demo than anything else.
What's wild about this is that the voices seem way better than GCP's TTS that I've seen. Any way to get those voices as an API?
oulipo|1 year ago
Also, it's weird that they focus only on AI papers in the demo, and not on more interesting social topics, like environmental protection, climate change, etc.
ants_everywhere|1 year ago
If it's just used for generating low quality robo content like we see on TikTok and YouTube then it's not so interesting.
yismail|1 year ago
[0] https://news.ycombinator.com/item?id=41020635
dpflan|1 year ago
Why would one prefer this AI conversation to the actual source?
Can these be agents and allow the listener to ask questions / interact?
GaggiX|1 year ago
It shouldn't be surprising that an LLM is able to understand a paper; just upload one to Claude 3.5 Sonnet.
danesparza|1 year ago
Building trust with your users is important, Google.
belval|1 year ago
> "Bad enough it has to talk, does it need fake vocal tics...?" - Gilfoyle
Found it: https://youtu.be/APlmfdbjmUY?si=b4-rgkxeXigU_un_&t=179
richardreeze|1 year ago
I saw they launched NotebookLM Audio Overview today: https://blog.google/technology/ai/notebooklm-audio-overviews...
So what the heck is Illuminate, and why would they simultaneously launch a competing product?