top | item 42431365

kemiller2002 | 1 year ago

We have a family friend who is blind and a programmer. It's interesting to hear his perspective. His hope and expectation are that it will greatly increase usability.

I've been thrown into the usability deep end due to my wife also losing her sight to an autoimmune disorder, and my dad losing his to macular degeneration. Honestly, it sucks, and I mean rage-quitting, phone-throwing sucks. (Try it: turn on voice assist and close your eyes.) If Apple can improve it through AI, so that someone can just talk to the phone to carry out a series of tasks, it will honestly change everything. The number of aging people in the U.S. who are going to lose their vision is set to go up exponentially in the coming years. This could be an unprecedented win for them, if Apple solves this issue with AI.

tkgally|1 year ago

A few days ago, OpenAI released live video integration with Advanced Voice mode for ChatGPT—point your phone at something and ask what it sees, and it will tell you pretty accurately. I thought it was just a cool trick until I read the top comment on their YouTube video announcement: “I'm screaming. As a visually impaired person, this is what I was eagerly waiting for. Still screaming! Thank you, Sam, Kev and the entire team over at OpenAI.”

https://www.youtube.com/live/NIQDnWlwYyQ

Google released a similar feature with Gemini 2.0 last week. While it doesn’t seem to be integrated with a smartphone app yet (at least on iOS), it can be used through the AI Studio browser interface.

https://news.ycombinator.com/item?id=42394998

MobiusHorizons|1 year ago

Is this feature somehow different than what Google has had with lens and what Apple has had with the info button in regular photos for a while now?

jprete|1 year ago

I don't have experience with this kind of problem. But I don't think GenAI is the best tool for this, at least not until it's so rock-solid trustworthy that everyone uses such an interface. Even leaving aside AI questions, if I'm looking for a human personal assistant for someone who's blind, and that person will have unlimited access to their electronic life, I'm going to vet that person very, very carefully.

unethical_ban|1 year ago

I don't understand the point.

Apple users already let Apple (or at least their device) know everything about them.

If a person is blind and can't read or type on their phone, a tool that can reliably pull up the Messages app and send Dad a letter is a godsend.

BrandiATMuhkuh|1 year ago

Sorry to hear about what is happening in your family.

I think your perspective is spot on. VUIs (voice user interfaces) will absolutely change the way we interact with computers. After all, talking comes naturally to humans.

The digit divide (old people, very young people, the illiterate) still exists, and will likely get bigger if VUIs don't see widespread adoption.

DiggyJohnson|1 year ago

  <<<<
  digit divide
  ====
  digital divide
  >>>>
For some reason I spent a few minutes trying to understand the digit divide before realizing it was a typo.

I do think VUI as a concept is in its infancy and will (like it or not) both hasten and address the decline of written communication.

pxmpxm|1 year ago

> Sorry to hear about what is happening in your family.

Non sequitur, but I can't be the only person who finds this sort of performative empathy odd and out of place in the context of an HID accessibility discussion.

brandon272|1 year ago

While I use LLMs, I also consider myself an LLM skeptic in terms of their role in upending the world and delivering the value promised by the folks hyping them up most aggressively.

However, using ChatGPT's voice mode and considering its impact on accessibility, especially if that quality of interactive voice functionality can be integrated well into the operating systems of the devices we use every day, is very exciting.

buryat|1 year ago

in order to cure Macular Degeneration we have to develop many different technologies that can be used for power control, it's inevitable as our history shows cyclical nature and behaviors of humans are predefined throughout the history because conceptually the same ideas and thoughts are being encoded and rehashed and decoded by newer generations.

MobiusHorizons|1 year ago

Is this generated by AI? Also how does power control or history cycles have anything to do with curing macular degeneration?

wizzwizz4|1 year ago

LLM-based AI is not needed, or even useful. We know how to make voice interfaces that work, and work well: have done since the 80s. It's just expensive; and it's an expense that nobody in the industry is willing to pay, therefore nobody needs to do it in order to differentiate their product.

Etheryte|1 year ago

What you're missing is that AI solves the expense problem. As the OS vendor you already have an overview and easy access to all interfaces that you expose and it's straightforward to feed that into an integrated AI agent. Add a bit of glue code here and there and a simple implementation is nearly free. Of course, the real value lies in ironing out all the edge cases, but compared to doing all of that manually, it should still be orders of magnitude cheaper.
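The "expose your interfaces to an integrated agent" idea can be sketched as a small tool registry: the OS registers each accessible action under a name, and the agent (an LLM in practice) dispatches a parsed intent to the matching handler. This is a minimal illustration only; the action names and handlers below are hypothetical, not any real Apple or Google API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ActionRegistry:
    """Maps intent names to handlers the agent is allowed to call."""
    actions: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[..., str]) -> None:
        self.actions[name] = handler

    def dispatch(self, intent: str, **kwargs) -> str:
        # In a real system, an LLM would turn the spoken request into
        # (intent, kwargs); here we call the handler directly.
        if intent not in self.actions:
            return f"Unknown action: {intent}"
        return self.actions[intent](**kwargs)


# Hypothetical actions the OS vendor already exposes internally.
registry = ActionRegistry()
registry.register("send_message",
                  lambda to, body: f"Sent to {to}: {body}")
registry.register("read_unread",
                  lambda: "You have 2 unread messages.")

print(registry.dispatch("send_message", to="Dad", body="Hello!"))
```

The "edge cases" mentioned above live in the gap between the spoken request and the structured `(intent, kwargs)` call, which is exactly where the LLM does the work.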