An intelligent virtual assistant (IVA) or intelligent personal assistant (IPA) is a software agent that can perform tasks or services for an individual based on commands or questions. Sometimes the term "chatbot" is used to refer to virtual assistants generally or specifically accessed by online chat. In some cases, online chat programs are exclusively for entertainment purposes. Some virtual assistants are able to interpret human speech and respond via synthesized voices. Users can ask their assistants questions, control home automation devices and media playback via voice, and manage other basic tasks such as email, to-do lists, and calendars with verbal (spoken?) commands.
As of 2017, the capabilities and usage of virtual assistants are expanding rapidly, with new products entering the market and a strong emphasis on both email and voice user interfaces. Apple and Google have large installed bases of users on smartphones. Microsoft has a large installed base of Windows-based personal computers, smartphones and smart speakers. Amazon has a large install base for smart speakers.Conversica has over 100 million engagements via its email and sms interface Intelligent Virtual Assistants for business.
Radio Rex was the first voice activated toy released in 1911.
Another early tool which was enabled to perform digital speech recognition was the IBM Shoebox, presented to the general public during the 1962 Seattle World's Fair after its initial market launch in 1961. This early computer, developed almost 20 years before the introduction of the first IBM Personal Computer in 1981, was able to recognize 16 spoken words and the digits 0 to 9. The next milestone in the development of voice recognition technology was achieved in the 1970s at the Carnegie Mellon University in Pittsburgh, Pennsylvania with substantial support of the United States Department of Defense and its DARPA agency. Their tool "Harpy" mastered about 1000 words, the vocabulary of a three-year-old. About ten years later the same group of scientists developed a system that could analyze not only individual words but entire word sequences enabled by a Hidden Markov Model. Thus, the earliest virtual assistants, which applied speech recognition software were automated attendant and medical digital dictation software. In the 1990s digital speech recognition technology became a feature of the personal computer with Microsoft, IBM, Philips and Lernout & Hauspie fighting for customers. Much later the market launch of the first smartphone IBM Simon in 1994 laid the foundation for smart virtual assistants as we know them today. The first modern digital virtual assistant installed on a smartphone was Siri, which was introduced as a feature of the iPhone 4S on October 4, 2011.Apple Inc. developed Siri following the 2010 acquisition of Siri Inc., a spin-off of SRI International, which is a research institute financed by DARPA and the United States Department of Defense.
Virtual assistants make work via:
Virtual assistants use natural language processing (NLP) to match user text or voice input to executable commands. Many continually learn using artificial intelligence techniques including machine learning.
To activate a virtual assistant using the voice, a wake word might be used. This is a word or groups of words such as "Hey Siri", "OK Google" or "Hey Google", "Alexa", and "Hey Microsoft".
Virtual assistants may be integrated into many types of platforms or, like Amazon Alexa, across several of them:
Virtual assistants can provide a wide variety of services. These include:
Conversational commerce is e-commerce via various means of messaging, including via voice assistants but also live chat on e-commerce Web sites, live chat on messaging apps such as WeChat, Facebook Messenger and WhatsApp and chatbots on messaging apps or Web sites.
Amazon enables Alexa "Skills" and Google "Actions", essentially apps that run on the assistant platforms.
Virtual assistants have a variety of privacy concerns associated with them. Features such as activation by voice pose a threat, as such features requires the device to always be listening. Modes of privacy such as the virtual security button have been proposed to create a multilayer authentication for virtual assistants.
Notable developer platforms for virtual assistants include:
In previous generations of text chat-based virtual assistants, the assistant was often represented by an avatar of (a.k.a. 'interactive online character or automated character) -- this was known as an embodied agent.
|Intelligent personal assistant||Developer||Free software||Free and open-source hardware||HDMI out||External I/O||IOT||Chromecast integration||Smart phone app||Always on||Unit to unit voice channel||Skill language|
|Alexa (a.k.a. Echo)||Amazon.com||No||No||No||No||Yes||No||Yes||Yes||?||?|
|BlackBerry Assistant||BlackBerry Limited||No||N/A||N/A||N/A||No||No||Yes||No||N/A||?|
|Evi||Amazon.com True Knowledge||No||N/A||N/A||N/A||No||No||Yes||No||N/A||?|
Digital experiences enabled by virtual assistants are considered to be among the major recent technological advances and most promising consumer trends. Experts claim that digital experiences will achieve a status-weight comparable to 'real' experiences, if not become more sought-after and prized. The trend is verified by a high number of frequent users and the substantial growth of worldwide user numbers of virtual digital assistants. In mid-2017, the number of frequent users of digital virtual assistants is estimated to be around 1bn worldwide. In addition, it can be observed that virtual digital assistant technology is no longer restricted to smartphone applications, but present across many industry sectors (incl. automotive, telecommunications, retail, healthcare and education). In response to the significant R&D expenses of firms across all sectors and an increasing implementation of mobile devices, the market for speech recognition technology is predicted to grow at a CAGR of 34.9% globally over the period of 2016 to 2024 and thereby surpass a global market size of USD 7.5 billion by 2024. According to an Ovum study, the "native digital assistant installed base" is projected to exceed the world's population by 2021, with 7.5 billion active voice AI-capable devices. According to Ovum, by that time "Google Assistant will dominate the voice AI-capable device market with 23.3% market share, followed by Samsung's Bixby (14.5%), Apple's Siri (13.1%), Amazon's Alexa (3.9%), and Microsoft's Cortana (2.3%)."
Taking into consideration the regional distribution of market leaders, North American companies (e.g. Nuance Communications, IBM, eGain) are expected to dominate the industry over the next years, due to the significant impact of BYOD (Bring Your Own Device) and enterprise mobility business models. Furthermore, the increasing demand for smartphone-assisted platforms are expected to further boost the North American Intelligent Virtual Assistant (IVA) industry growth. Despite its smaller size in comparison to the North American market, the intelligent virtual assistant industry from the Asia-Pacific region, with its main players located in India and China is predicted to grow at an annual growth rate of 40% (above global average) over the 2016-2024 period.
In May 2018, researchers from the University of California, Berkeley, published a paper that showed audio commands undetectable for the human ear could be directly embedded into music or spoken text, thereby manipulating virtual assistants into performing certain actions without the user taking note of it. The researchers made small changes to audio files, which cancelled out the sound patterns that speech recognition systems are meant to detect. These were replaced with sounds that would be interpreted differently by the system and command it to dial phone numbers, open websites or even transfer money. The possibility of this has been known since 2016, and affects devices from Apple, Amazon and Google.
In addition to unintentional actions and voice recording, another security and privacy risk associated with intelligent virtual assistants is malicious voice commands: An attacker who impersonates a user and issues malicious voice commands to, for example, unlock a smart door to gain unauthorized entry to a home or garage or order items online without the user's knowledge. Although some IVAs provide a voice-training feature to prevent such impersonation, it can be difficult for the system to distinguish between similar voices. Thus, a malicious person who is able to access an IVA-enabled device might be able to fool the system into thinking that he or she is the real owner and carry out criminal or mischievous acts.
YouTube title: Airline Information System, 1989 - AT&T Archives - speech recognition