The transition from primarily visual UX towards an auditorial UX is really powerful.
Looking at screens to get key information distracts me from my surroundings and seems archaic.
My wife is a sound designer who has opened my eyes to the importance of sounds both in film and in the world. It's not that I was unaware of sounds, but I didn't realize how important they are to centering me in this world and the made up worlds of films and games. Try watching a scary movie with the sound turned off, it turns into a comedy.
I think it's unexplored territory that has huge potential to impact the way we interact with the real world, even more so than Glass or Hololens.
When I listen to music as I walk down the street I change: my mood, my posture, and the way I look at the world. The music augments the reality around me in a way that visual UX never can, because visual UX is a lens between my eyes and the world.
The problem is that voice interfaces break down pretty quickly once you try to do anything complicated. The Echo has pretty solid voice recognition--far better than anything else I've ever used--but it's still hard to get it to do anything useful once you get beyond a pretty narrow script (e.g. what's the weather forecast, play this artist, etc.).
I think the main problem with voice interfaces is that they're not discoverable. You need a good understanding of what the system can and cannot do, its current state, etc., before even speaking.
CLI has the same issue, but at least you can man-xxx, which I imagine works a lot better in text than it does in audio.
I really liked how this was done in the movie 'Her.' There's something especially nice about only having your attention distracted audibly and not visually, especially in public.
I wonder if the smartphone age will go away as quickly as it came. I picture a world where we just have smart wearables like a watch which has a tiny visual interface, but a powerful audio one (speaker, earpiece, put watch up to ear, etc). It seems a lot less intrusive. I imagine as we get better with AI and voice recognition, it'll be as practical as a phone. What I'm able to do with Google Now on my watch is fairly impressive today. We already have the technology to understand things in context: "Navigate to Katz's deli" brings up Google Maps directions to the deli, as opposed to a Google search results page about navigating to a cat-themed deli, which was the status quo not too long ago with voice search.
I imagine carrying this big selfie/Facebook machine around, constantly charging it, whipping it out all the time, etc. will be pretty gauche if wearable-only solutions become competitive.
> The transition from primarily visual UX towards an auditorial UX is really powerful.
It's also less accessible. I'm sure auditory UI is useful in many cases, but it also seems to be more cumbersome in others. In any case, I hope that pervasive auditory UI doesn't become any sort of standard without an accompanying visual/physical interface.
> Try watching a scary movie with the sound turned off, it turns into a comedy
Allow me to be pedantic and say that it is being fully immersed in the context of the movie that really matters. You could probably achieve a similar suspenseful effect with silence+subtitles, although I'm sure the experience isn't identical. Otherwise, the deaf could never enjoy scary movies, including me.
"If you have more than one Echo or Echo Dot, you can set a different wake word for each".
This is something I've been thinking is becoming more problematic as well as an opportunity for real ubiquity. I have 3 separate devices nearby that are Google Now voice activated (the newer devices support this even if the screen is off), and they will sometimes trigger at the same time accidentally.
Since the processing is cloud based, and they know my identity, why don't the devices recognize this fact and cooperate? Instead of just 7 beam forming mics in the Echo, if you have two within hearing distance you could have the benefit of 14 and a unified response. Don't tie the request & response to a particular device; instead think of it as a ubiquitous network that moves with you as you walk around the household. You should be able to continue your conversation from one room to the next seamlessly.
> Since the processing is cloud based, and they know my identity, why don't the devices recognize this fact and cooperate. Instead of just 7 beam forming mics in the Echo, if you have two within hearing distance you could have the benefit of 14 and a unified response.
The echo and noise reduction software that I'm aware of can't really do that in a reasonable fashion.
With current solutions, you've got one DSP that's receiving all the audio streams simultaneously, and they need to be exactly synchronized in time. Then, using basically pattern-matching, it figures out what direction the user's voice is coming from, and combines some/all of the audio streams together to eliminate environmental noise and make the speech as clear as possible.
To do this with separate devices, you'd want extremely precise time synchronization. Which is possible, but I wouldn't want to implement it.
The extra processing and synchronization would take longer, and delay input to the speech recognition engine. I don't think it would enhance the user experience.
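To put rough numbers on the synchronization requirement described above, here is a back-of-the-envelope sketch. It is not how any shipping device works. It assumes a 16 kHz capture rate and a speed of sound of about 343 m/s (both are assumptions for illustration, not figures from this thread), and the function names are hypothetical. It shows how far sound travels during a given clock disagreement between two devices, which is the apparent position error a combined beamformer would have to absorb.

```python
# Rough arithmetic only: how much does a clock offset between two
# devices matter when combining their microphone streams?

SPEED_OF_SOUND_M_S = 343.0   # assumed: dry air at roughly 20 degrees C
SAMPLE_RATE_HZ = 16_000      # assumed: a common rate for speech capture


def path_error_cm(clock_offset_s: float) -> float:
    """Distance sound travels during the clock offset, in centimeters."""
    return SPEED_OF_SOUND_M_S * clock_offset_s * 100.0


def offset_in_samples(clock_offset_s: float,
                      sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """The same clock offset expressed as a number of audio samples."""
    return clock_offset_s * sample_rate_hz


if __name__ == "__main__":
    # Hypothetical offsets: 1 microsecond up to 10 milliseconds.
    for offset_s in (1e-6, 100e-6, 1e-3, 10e-3):
        print(f"offset {offset_s * 1e6:>8.0f} us: "
              f"{offset_in_samples(offset_s):7.2f} samples, "
              f"{path_error_cm(offset_s):7.2f} cm of apparent path error")
```

Under these assumed numbers, a 1 ms disagreement between two devices looks the same as the talker moving by roughly a third of a meter, which gives a feel for why "extremely precise" is the operative phrase.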
I learned to not have the wake word be "Amazon" when I was watching online training for AWS. The Echo went nuts until I finally paused everything and changed the wake word back to "Alexa".
They really need to make it so that all of the Amazon Echos on the same network use a proximity algorithm to determine which one responds. Simply: The Echo that hears you best should be the one to respond.
I want to have an Echo in every room, and I don't want to have to remember all their different names!
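The "Echo that hears you best should respond" idea can be sketched in a few lines. This is only an illustration under stated assumptions: it uses RMS loudness of each device's captured audio as a stand-in for "hears you best", the device names and `pick_responder` function are hypothetical, and nothing here reflects how Amazon's devices actually arbitrate. A real system might prefer signal-to-noise ratio or wake-word confidence instead.

```python
import math
from typing import Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a chunk of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pick_responder(captures: Dict[str, Sequence[float]]) -> str:
    """Return the name of the device whose capture was loudest.

    `captures` maps a (hypothetical) device name to the samples it
    recorded for the same utterance.
    """
    if not captures:
        raise ValueError("no devices heard the wake word")
    return max(captures, key=lambda name: rms(captures[name]))


if __name__ == "__main__":
    heard = {
        "kitchen": [0.10, -0.10, 0.10, -0.10],
        "office": [0.50, -0.40, 0.50, -0.45],
    }
    print(pick_responder(heard))
```

With one wake word and a rule like this applied in the cloud, only the winning device would answer, so nobody has to remember per-room names.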
> I have 3 separate devices nearby that are Google Now voice activated (the newer devices support this even if the screen is off), and they will sometimes trigger at the same time accidentally.
> Since the processing is cloud based, and they know my identity,
Interesting, so everything said in that room gets processed and potentially sent to Google for indefinite storage? What a 1984-style luxury.
I agree with this entirely. I've been waiting patiently for a way to add microphone distance to my Echo and this is perfect for that... except it doesn't work that way.
I am very much hoping they fix it in the future and add a software layer to combine/route commands with one single wake word.
It's also a bit annoying that the Android Wear version of Now doesn't work the same as the regular Android version. For example, the full-sized one seems much more flexible with wording, and supports listening in several languages at once, while Wear is limited to one language.
When did I turn from the enthusiastic kid who dreamed of audio-controlled personal assistants like this to a cranky old man who doesn't want anything remotely spy-possible in his house?
I think when we were kids we didn't think that the personal assistant would have to communicate with the outside world via the internet in order to perform its function.
If all of "Alexa" was included in a disconnected local database I bet it would still be as appealing.
Couldn't the FBI legally get a court order to listen in on conversations in a room that has one of these? They already do that with car assistance services. [1]
I love "smart" devices, but hate "devices that needlessly insist on connecting to the Internet".
One of the worst offenders is Dropcam. They have a super camera, easy to set up and use. Great picture quality. Would be an awesome baby monitor or "closed circuit TV replacement". But why the goddamn hell does it need to connect to the Internet? Why is the only option available to needlessly stream video out of my home network to the cloud, only so that I can then stream it back into my home network for viewing??? WTF? That's both a waste of outbound bandwidth and a waste of inbound bandwidth. I should be able to put it on my network, switch off the cable modem, and still be able to view video locally. How hard is that? I could do that with a webcam and a really long USB cable!
This is probably a function of the amount of bad news you've read over the years about people getting exploited, taken advantage of, spied on, etc. When you're a kid it doesn't even really seem like a thing.
The enthusiastic kid would probably get distracted and discouraged when X cannot do "what I really want, like Ironman." Meanwhile the "cranky" old man has been mischaracterized, because wisdom and experience are often mistaken for crankiness.
When you realized that the government was making an all-out assault on the most fundamental American rights, and that the civilian sector did absolutely nothing to assure your privacy and anonymity, out of sheer greed and a narrow-minded failure to see that they would be undermining their own success.
I am sure you would not have a problem using these kinds of systems if it were assured that you could not be tracked or monitored because the devices and systems were secured in overlapping ways.
In hindsight it all sounds amazing and, ignoring the spy possibilities, it gets old fast. I don't use Siri, and I can do a lot of this with it. But since I got the first Siri-enabled device, I've used it mostly just for joking around, and my daughter asks it for hockey scores. That's the extent of it.
Because when we dreamed of this as kids, the thought of the corporations behind these technologies that harvest our data for their gain didn't come up.
There is something delightfully ballsy about making this only available to users of Alexa Voice Shopping:
"Echo Dot is available in limited quantities and exclusively for Prime members through Alexa Voice Shopping. To order your Echo Dot, use your Amazon Echo or Amazon Fire TV and just ask: 'Alexa, order an Echo Dot.'"
Also, this makes me sad. I'd kind of like to try this out, but I have no Alexa voice service currently (I don't think)
Somewhat related, but if I don't subscribe to any of the services listed, this is a pretty useless product for me. I don't listen to internet radio, I don't stream music, I don't order delivery, I don't use Uber, there are already 10 million ways to check the weather, and my life isn't busy enough to need a voice-activated calendar.
Is this the future of tech? Like do I need to have some kind of urban-go-getter lifestyle to find use in any of this? When can I get something useful, rather than "thing I already do, but in a new package"?
My problem with Alexa is, I don't want to invest in a new ecosystem. I'm fine with Amazon being the hub that connects all of my services, but I don't want to use Amazon To-Do List, Amazon Prime Radio, Amazon Traffic, Amazon Sports, Amazon Calendar, Amazon Weather.
That being said, they announce partnerships with more and more services every month. Things are looking up.
Obviously, actually having Bluetooth speakers with the Echo Dot is a much better solution, but after using the Sonos setup for 3-4 weeks I must say that it works surprisingly well, and despite the audio hack the sound quality is excellent on my Play 1's.
Be forewarned - if I am invited into your home for any reason, and I see an Alexa device, I will vocally add a large shopping list of nonsense to your Amazon cart :)
Will this be linked together with my Echo? One thing I do quite often, since my Echo is in my kitchen, is use it to set a timer. I'd like to be able to go to my office upstairs and ask it how much time is left. Today, I don't think that's possible even with a second Echo.
Amazon was the only Big Four company silent on the data privacy lawsuit with Apple. Why would I place one of their always-listening products in my living room?
I would love to have something similar as open source software. How can I trust this device if I can't examine the code used for hotword recognition?
Also, it would be great to be able to put the software on different hardware - something with digital audio output for example. The concept of Alexa is amazing, but distributing it as proprietary software limits its potential.
I'm not entirely clear on the difference between the regular Echo and the Echo Dot. It appears you have to have an original Echo in order to purchase a Dot. Is this simply an extension that proxies all of the requests back to the original Echo?
I love my echo! I probably use it 15-25 times a day.
1) Acts as my alarm
2) Turn on my favorite radio station while I make breakfast.
3) Timers for cooking breakfast.
4) Listen to flash news
5) Alarm again if I need a nap.
6) Timers for lunch meal
7) Add item to shopping list.
8) Add todo items.
9) Plays Spotify while I work on my computer from across the room.
10) More flash news (it's really quite extensive)
11) more naps
12) dinner timer
13) news
14) word definitions
15) Tell it to stop when it starts talking in the middle of a conversation (a bit annoying).
16) more todos
17) Order more dog treats
18) Play bedtime music
Worth every penny.
Where did the strange sense of "everyone is spying on you" come from? A bloated sense of self importance?
Still too expensive, imo. I've read a lot about "Alexa" and Echo... and besides the privacy issues, in many cases the Echo quickly becomes an expensive speaker (after the kids and everyone else get tired of asking "Alexa" questions).
$89 is not in my impulse buy price range. I may be in the minority on that though...
Echo Dot ($89.99) is available exclusively for Prime Members through Alexa Voice Shopping. To order your Echo Dot, use your Echo or Fire TV and just ask: “Alexa, order Echo Dot.”
Man... I had my audrey doing this in the '90s. I can't believe I missed the boat and somebody else is making a bajillion dollars. It's time to search through the archives of all the cool stuff we did 20 years ago and put it in a shiny new wrapper.
[+] [-] startupfounder|10 years ago|reply
[+] [-] ghaff|10 years ago|reply
[+] [-] Jack000|10 years ago|reply
[+] [-] drzaiusapelord|10 years ago|reply
[+] [-] jallmann|10 years ago|reply
[+] [-] bruceboughton|10 years ago|reply
Don't you mean oral or aural?
[+] [-] bckmn|10 years ago|reply
[+] [-] pbreit|10 years ago|reply
[+] [-] jordache|10 years ago|reply
no thank you.. i will use my hand
[+] [-] samstave|10 years ago|reply
[+] [-] Rezo|10 years ago|reply
[+] [-] ansible|10 years ago|reply
Edit: spelling.
[+] [-] mrbill|10 years ago|reply
[+] [-] t0mbstone|10 years ago|reply
[+] [-] jimktrains2|10 years ago|reply
[+] [-] masonhipp|10 years ago|reply
[+] [-] Nullabillity|10 years ago|reply
[+] [-] Touche|10 years ago|reply
[+] [-] caractacus|10 years ago|reply
[+] [-] fluxquanta|10 years ago|reply
Rosie on the Jetsons didn't have to "phone home".
[+] [-] sschueller|10 years ago|reply
[1] http://www.cnet.com/news/court-to-fbi-no-spying-on-in-car-co...
[+] [-] ryandrake|10 years ago|reply
[+] [-] visakanv|10 years ago|reply
[+] [-] danielrm26|10 years ago|reply
[+] [-] hyperbovine|10 years ago|reply
[+] [-] officemonkey|10 years ago|reply
These things would be a lot less "Big Brother" for me if I had a mic key in my pocket that would only turn the mic on when I squeezed it.
[+] [-] monkmartinez|10 years ago|reply
[+] [-] wahsd|10 years ago|reply
[+] [-] joshmanders|10 years ago|reply
[+] [-] runjake|10 years ago|reply
[+] [-] edw519|10 years ago|reply
[+] [-] revelation|10 years ago|reply
So now this cool audio controlled personal assistant is just another gadget to buy more stuff from Amazon, instead of something you control.
[+] [-] danesparza|10 years ago|reply
[+] [-] xauronx|10 years ago|reply
[+] [-] jbob2000|10 years ago|reply
[+] [-] xd1936|10 years ago|reply
[+] [-] rdl|10 years ago|reply
http://www.theverge.com/2016/3/3/11148776/amazon-echo-tap-sp...
Ahh -- the Tap is a portable device with wifi speaker.
(Probably wouldn't call an audio monitoring box the "tap".)
[+] [-] thecodemonkey|10 years ago|reply
I wrote up a little post on it here: https://medium.com/@MathiasHansen/hacking-an-amazon-echo-and...
[+] [-] binarymax|10 years ago|reply
[+] [-] swalsh|10 years ago|reply
[+] [-] Fluid_Mechanics|10 years ago|reply
[+] [-] nilsjuenemann|10 years ago|reply
"Requirements
* A U.S. Amazon account
* A U.S. shipping address (50 United States and the District of Columbia only)
* An annual Amazon Prime membership or 30-day Amazon Prime free trial
* A payment method issued by a U.S. bank with a U.S. billing address in your 1-Click settings
* A device with access to the Alexa Voice Service (such as Amazon Echo)"
[+] [-] pierrebeaucamp|10 years ago|reply
[+] [-] davis_m|10 years ago|reply
[+] [-] BatFastard|10 years ago|reply
[+] [-] rogerb|10 years ago|reply
[+] [-] monkmartinez|10 years ago|reply
[+] [-] tnorthcutt|10 years ago|reply
[+] [-] Gratsby|10 years ago|reply
[+] [-] gizmodo59|10 years ago|reply