Divergent Views on Communicating with Machines

By guest writer, Frank Tobe, Editor/Publisher, The Robot Report www.therobotreport.com

Much of what I saw at CES 2012 was about products being upgraded to “smart” under the premise that smart connectivity enables consumer convenience. It was definitely on the minds of most of those attending. That’s why the CES keynote speeches were so well attended: they were slated to offer insight into the near-term future. But this year there were competing visions of that future. The industry leaders seemed to have divergent approaches to the development and marketing of “smart.”

The Apple iPad, with its multi-touch capability, has already changed our expectations as to how we interact with our computers, tablets and phones. And Apple’s Siri and Microsoft’s Kinect are leading us on to even newer ways — smart ways of interaction between users and their devices.

To make my point I must first describe three seemingly disparate events:

Ford and NPR Press Conference

Ford and NPR held a joint press conference to launch NPR’s new app, which runs within the infotainment system, called Sync AppLink, on new Ford cars. NPR’s Morning Edition and All Things Considered rank first and second among U.S. news radio programs. The new app gives Ford drivers voice control over their NPR programming. Through a menu-driven series of commands, a driver can call up the latest news of the hour, select a live stream of his or her favorite station, or access programs or topics from NPR’s large library of podcasts by using simple commands like “hourly news,” “stations,” or “programs” followed by the name of the program.

The resulting selection may be playing on the FM band, streaming live, or streaming from the archives of NPR over the Internet. It could take as many as five commands to get the desired program. Underneath the Ford Sync system is Microsoft’s operating system. Executives from Ford and NPR, when asked about future improvements to the system, said that a more free-form natural language voice recognition system would be ideal but is not yet capable and reliable enough to work with safety and convenience in a car. But think how Siri would get to the same program in just one short sentence: “Find and play today’s ‘All Things Considered.’”
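To make the contrast concrete, here is a minimal Python sketch of the two interaction styles. It is not Ford's, NPR's, or Apple's implementation. Only the top-level command words ("hourly news", "stations", "programs") come from the press conference; the rest of the menu tree, the function names `menu_driven` and `free_form`, and the naive keyword matching that stands in for real natural language understanding are all assumptions made for illustration.

```python
# Hypothetical menu tree. Only the three top-level command words are taken
# from the article; every nested entry is an illustrative assumption.
MENU = {
    "hourly news": "latest hourly newscast",
    "stations": {
        "my station": "live stream of my station",
    },
    "programs": {
        "all things considered": {"today": "today's All Things Considered"},
        "morning edition": {"today": "today's Morning Edition"},
    },
}


def menu_driven(commands):
    """Walk the menu one spoken command at a time.

    Returns (selection, number_of_commands_used).
    """
    node = MENU
    for steps, command in enumerate(commands, start=1):
        if command not in node:
            raise ValueError(f"unrecognized command: {command!r}")
        node = node[command]
        if isinstance(node, str):  # reached something playable
            return node, steps
    raise ValueError("selection incomplete; more commands needed")


def free_form(utterance):
    """Resolve a whole sentence in one step using naive keyword spotting.

    A real assistant would use far more capable language understanding;
    substring matching is only a stand-in here.
    """
    text = utterance.lower()
    for program, editions in MENU["programs"].items():
        if program in text:
            for day, selection in editions.items():
                if day in text:
                    return selection, 1
    raise ValueError("could not understand request")


print(menu_driven(["programs", "all things considered", "today"]))
print(free_form("Find and play today's All Things Considered"))
```

In this toy menu the driver needs three separate commands to reach the program, while the single sentence gets there in one. A deeper real-world menu would widen that gap, which is the point of the Siri comparison.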

Keynote Speech by Microsoft CEO Steve Ballmer

Shortly after this presentation I went to the Steve Ballmer Microsoft Keynote Speech. Bizarre is a charitable word to describe this off-putting, fever-pitched yet unexciting sales pitch for everything Microsoft. Very little news, less information about new product introductions, and much puffery about the new Windows 8 Operating System coming sometime this year. Not a word about robotics even though Microsoft supports and sells a robotic operating system.

Ballmer presented Windows 8 and Metro (the same systems that are limiting NPR’s app by not having a capable and reliable free-form voice recognition system similar to Apple’s Siri) as the cat’s meow: the very highest tech and the best you can buy anywhere. I actually felt bad after the presentation, having watched such an unappealing sales pitch that left out Microsoft’s vision for the future. [MS announced that this would be Ballmer's and their last keynote at CES, a fact which underscores how the shift toward mobile devices has kept MS re-allocating talent and resources to adapt.]

As an aside, Bloomberg Businessweek Magazine just did a cover story about Ballmer turning the company into a more relevant powerhouse with cooler technology and also a serious player in cloud computing. The article goes on to describe Ballmer as pretty normal except in public presentations. Still, I left that night without any new information.

Keynote Speech by Mercedes Chairman Dieter Zetsche

The next morning I went to see Dr. Dieter Zetsche present Mercedes’ first-ever CES keynote speech, an inspiring, informative and well-thought-out “big picture” look at the next generation of connected cars. When asked whether cars were going to become autonomously driven commodities built to carry around consumer products, he responded that Mercedes builds cars that people want to drive and that this will continue, but when the traffic or the road is boring, there will be a switch to turn on a temporary autopilot. In his interesting and responsible presentation, Zetsche described the auto industry and Mercedes cars in terms of freedom and included new offerings within each of five “freedoms.”

  1. Freedom not only from the horses, buses and trains of the past, but also from the limits of distance and the tethers of things local, right down to distancing yourself from your parents.
  2. Freedom of time via connectivity, so that seamless updates are pushed to in-vehicle communication systems, negating the need to bring your car in for system updates. Their new MBrace2 system not only regularly updates and monitors their cars but also connects today’s digital lifestyle into a digital drive style.
  3. Freedom of speech to communicate with your car in the safest and most expeditious manner. The current iteration of MBrace2 has a much enhanced (but not yet free-form) voice recognition system, and in many instances the system will be proactive, e.g., choosing not to answer phone calls or read messages at times when the driver is fully occupied with hazardous driving situations.
  4. Freedom of energy, where Zetsche described new hydrogen-based fuel packs just waiting for the national (political) infrastructure to support them.
  5. Freedom of information, where car-to-car communication can provide alerts about road hazards and conditions by taking advantage of the already present in-car virtual private network system and link.

All three of these presentations occurred before the doors of CES opened, and when I walked the massive exhibition space, those visions colored what I saw and shaped what I believed to be the immediate future of mobility, communication and apps. It is clear to me that, despite Mr. Ballmer’s pitch to buy today’s systems and products because they are great, free-form voice recognition à la Siri is the future of communication with our machines, and “smart” is the pathway we are following to that end goal.

CES is a show focused on near-term product releases: those that will be launched later this year, in time for Christmas, and into next year. Throughout CES, almost all of the consumer products on display were smarter, in the sense that they are connected to the Internet or a local network and have sensors or artificial intelligence that gather and process data and make decisions based on that information, all to add value to the product by providing convenience or entertainment to the buyer.

The most advanced manner of communicating with smart products is by voice and gesture. Today’s technology is menu-driven, like the NPR example, but the future is free-form and natural: think IBM’s Watson or Apple’s Siri. Hence the flurry of acquisitions in the language processing space: Apple acquired Siri, Google just bought CleverSense, and Aldebaran recently purchased Karotz. And Nuance, of Dragon Dictate fame, is already established in this arena. Nuance voice processing is repackaged and used by many car companies, including both Ford and Mercedes, for their in-car systems.

Although it appeared to be an afterthought in Ballmer’s presentation (Microsoft has been slow to react to its popularity and multiple uses), Kinect, Microsoft’s voice and gesture recognition device, was the wonder of 2011 and was seen in many non-Microsoft booths at the 2012 show. Hacked from its Xbox gaming origins, it provides a low-cost alternative to expensive LIDAR and collision avoidance systems, and serves all sorts of other applications. It is a wonderful invention that other companies are hacking and incorporating into their products. PrimeSense, the Israeli company behind the Kinect’s depth-sensing technology, has been doing a booming business selling that technology for non-gaming applications, research and who knows what else.

Consequently, it was easy to see that at CES 2012 the path to the next level of “smart” products is through the use of better communication with those products — gesture and voice recognition, and natural language, to command and control them — just like Tom Cruise in Mission Impossible!

- Frank Tobe
