
Voice XML: Chapter 2 Exercises

 


VoiceXML: Introduction to Developing
Speech Applications


In addition to the text areas below, a Microsoft Word version of the code is available through a link.

2-1 The grammar for telephoneNumber.grxml is:

This grammar is used by the VoiceXML code for Example 2.1:

An equivalent grammar that enables the caller to press the keys on a touch-tone phone follows:

This revised grammar is used by the following VoiceXML application. The only changes to the application are the prompt, which solicits the one-digit phone number from the caller, and the reference to the revised grammar.

2-2 A VoiceXML form that solicits a date by reusing the grammars month.grxml, day.grxml, and year.grxml.

2-3 See the resources web page for information about HTML verbal browsers

2-4 The W3C home page looks like this:

A. What information from this site should be extracted and included in a voice Web page?

  • Navigation bar. From the navigation bar at the top of the W3C home page, include each of the five links.
  • Working Group list. From the list of Working Groups in the left-hand column, include each of the Working Group links.
  • General information. From the General information section in the right-hand column, include each of the links.
  • News section. From the news section in the middle of the web page, include the text and links.

B. Briefly describe the menus and forms in the voice Web page that presents this information to the caller.

  • Navigation bar. A voice menu that presents five choices to the caller.
  • Working Group list. A voice menu that presents the list of Working Groups to the caller. Because this list is so long, consider using a scrollable voice menu, that is, a voice menu with commands for skip ahead and skip backward.
  • General information. This section lends itself to a two-level hierarchy of voice menus. The top level contains options for mission, contact, get involved, member area, W3C Team, and past news. The second level of menus contains the corresponding links.
  • News section. From the news section in the middle of the web page, include the text and links. Use a scrollable voice menu for the news items. Each news item is a sequence of prompts that read the text to the user. Between some of the prompts are one-option menus that enable the user to jump to another page.
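
The scrollable voice menu mentioned above can be thought of as a paging window over a long option list. The following Python sketch is only an illustration of that idea: the class name `ScrollableMenu`, the page size, and the exact command phrases are assumptions, and in a deployed application this logic would sit in the VoiceXML dialog or in the server code that generates it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScrollableMenu:
    """Presents a long list of options a few at a time (hypothetical helper)."""
    items: List[str]
    page_size: int = 4  # assumed number of options read out per prompt
    start: int = 0      # index of the first option on the current page

    def current_page(self) -> List[str]:
        return self.items[self.start:self.start + self.page_size]

    def prompt(self) -> str:
        options = ", ".join(self.current_page())
        return f"Say one of: {options}. Or say skip ahead or skip backward."

    def handle(self, utterance: str) -> Optional[str]:
        """Return the chosen item, or None after scrolling or a mismatch."""
        if utterance == "skip ahead":
            # Move forward only if another page exists.
            if self.start + self.page_size < len(self.items):
                self.start += self.page_size
            return None
        if utterance == "skip backward":
            self.start = max(0, self.start - self.page_size)
            return None
        for item in self.current_page():
            if item.lower() == utterance.lower():
                return item
        return None  # not on this page: treat as a mismatch
```

Only the options on the current page are accepted, which keeps each prompt short; a caller who already knows a group name still has to scroll to it, so a real design might accept every name at all times.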

C. Indicate the sequence in which the menus and forms from part (b) above should be presented to the caller.

First, welcome the user with the following prompt: "Welcome to the W3C, the World Wide Web Consortium, leading the web to its full potential."

The four main sections of the visual web page should be presented in order of how frequently they are accessed, most frequent first. For example, if most of the callers want to access the Working Group page to which they belong, then the Working Group list option should be presented next, followed by news, with general information last.

D. What prompt wording do you recommend for each menu and form from part (b)?

  • Navigation bar. Say the name of one of the following information categories: activities, technical reports, site index, about W3C, or contact us.
  • Working Group list. Speak the name of one of the following working groups: Accessibility, Amaya, ...
  • General information. Speak the name of one of the following categories: mission, contact us, get involved, member area, W3C team, or past news.
  • News section. News headlines. Say "this one" to hear details. W3C team presentations in January (pause), SVG 1.1 and Mobile SVG Profiles Working Drafts Published (pause), ...

E. What words should be in the grammar for each menu and form from part (b)?

  • Navigation bar. Activities, Technical reports, Site index, About W3C, contact us, and the appropriate synonyms.
  • Working Group list. The names of each of the Working Groups and appropriate synonyms (for example, speech is a frequent synonym for voice), plus browsing commands such as skip ahead and skip backward.
  • General information. Mission, contact us, get involved, member area, W3C Team, past news, and synonyms.
  • News section. This one, first, second plus the keywords in each headline.

F. What error handlers should be specified for each menu and form from part (b)?

Each menu and form field should have event handlers that respond to the three most common errors and events caused by users:

  • The user fails to respond (no response).
  • The user responds with a word or phrase not covered by the grammar (mismatch).
  • The user asks for assistance (help).
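
In VoiceXML these three situations correspond to the `noinput`, `nomatch`, and `help` events. One way to make sure that no menu is missing a handler is to generate every menu from a single template. The Python sketch below shows the idea under stated assumptions: the function name `render_menu`, the wording of the handler prompts, and the target URIs passed in are all placeholders, not text from the book.

```python
from xml.sax.saxutils import escape, quoteattr


def render_menu(menu_id: str, prompt: str, choices: dict) -> str:
    """Return a VoiceXML <menu> that always carries the three handlers.

    choices maps each spoken phrase to the URI the choice leads to.
    The handler wording here is a placeholder; a designer would
    normally write a specific message for each menu.
    """
    options = ", ".join(choices)
    lines = [
        f"<menu id={quoteattr(menu_id)}>",
        f"  <prompt>{escape(prompt)}</prompt>",
    ]
    for phrase, target in choices.items():
        lines.append(
            f"  <choice next={quoteattr(target)}>{escape(phrase)}</choice>"
        )
    lines += [
        # No response from the caller.
        "  <noinput>I did not hear you. <reprompt/></noinput>",
        # Response not covered by the grammar.
        "  <nomatch>I did not understand. <reprompt/></nomatch>",
        # Caller asked for assistance.
        f"  <help>You can say: {escape(options)}. <reprompt/></help>",
        "</menu>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_menu(
        "navigation",
        "Say the name of an information category.",
        {"activities": "#activities", "contact us": "#contact"},
    ))
```

Generating the handlers from one place keeps them consistent across menus, although the generic wording would still need review for each menu.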

2-5 Discuss the possibility of constructing a "transcoding" procedure that produces a VoiceXML document for the home page of the W3C.

A. Which of the steps (a) through (f) from question 2-4 do you think can be automated as part of the transcoding procedure?

(a) Determine what info to extract. While it is easy to automate the extraction of data from the W3C home page, it is not, in general, possible to automatically determine what to extract. For example, if most of the callers are members of Working Groups, then it may not be necessary to extract general information that all members already know. On the other hand, if most are new to the W3C, then the general information should be extracted, but not the detailed Working Groups.

(b) Determine whether to use menus or forms. It may be possible to automate heuristics that determine whether to use menus or forms for each information set. However, these decisions should be reviewed by a designer and then user tested to make sure each decision is correct.

(c) Determine the sequence in which the menus and forms should be presented to the user. Because this depends heavily on how callers will use the voice site, it may be impossible to predict automatically.

(d) Specify the grammar. The words used as link names on the visual page can be automatically extracted for use in the grammar. A grammar specialist will be necessary to extend the grammar to include synonyms and phrases that callers frequently speak yet do not appear on the home web page.

(e) Determine what event handlers to write. Every menu and form item should have the three major event handlers: mismatch, no response, and help. However, it may not be possible to automatically determine how to phrase each prompt in the error handlers.
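
The automatable part of step (d) can be sketched with Python's standard `html.parser` module. This is a minimal illustration, not a full transcoder: the names `LinkTextExtractor` and `grammar_phrases` are hypothetical, the sample HTML in the usage section is made up, and the output is only a starting word list that a grammar specialist would still extend with synonyms.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collects the visible text of every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # (link text, href) pairs in document order
        self._href = None  # href of the link currently open, if any
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text, self._href))
            self._href = None


def grammar_phrases(html: str) -> list:
    """Return lower-cased, de-duplicated link names in page order."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    seen, phrases = set(), []
    for text, _href in parser.links:
        key = text.lower()
        if key not in seen:
            seen.add(key)
            phrases.append(key)
    return phrases


if __name__ == "__main__":
    # Made-up sample markup, not the actual W3C home page.
    sample = '<a href="/a">Activities</a> <a href="/tr">Technical Reports</a>'
    print(grammar_phrases(sample))
```

The sketch extracts every link on the page without judging which ones matter to callers, which is the decision that step (a) leaves to a designer.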

B. Do you recommend implementing a transcoding procedure for the W3C web site?

It makes sense to implement a transcoding procedure only under two conditions:

(a) A designer fine-tunes the transcoding procedure, providing the information that cannot be derived automatically.

(b) The web site does not change, so the extraction task does not need to be modified.

C. Do you recommend implementing a transcoding procedure that can be applied to any visual web site?

No. Web sites that frequently change their layout or content will break transcoding procedures.

2-6 WML is a language for writing applications that display data, menus, and forms on the small screen of many of today's cell phones. See http://www.wapforum.org/ to download a copy of the WML language. The browser for WML resides on the cell phone itself, while the browser for VoiceXML resides on a server that is connected to the cell phone. By combining the functionality of VoiceXML and WML, it is possible to enable multimodal applications that can both speak and listen to the caller as well as display info and accept touch-tone button input. Discuss the advantages and disadvantages of each of the following architectural approaches, a, b, and c:

a. Integrate the WML and VoiceXML browsers for execution on the cell phone.
b. Integrate the WML and VoiceXML browsers for execution on a connected server.
c. Enable the synchronization between the WML browser executing on the cell phone and the VoiceXML browser executing on the server.

Disadvantage                                                            a    b    c
Service delays due to communication delays                              No   Yes  Yes
Service disruptions when the communication facility is "out of range"   No   Yes  Yes
Extra expense for memory and processing capability                      Yes  No   No
Extra complexity due to synchronization between client and server       No   No   Yes
