| 
 CS 410/510 SLA - Spoken Language Applications
 Fall Quarter 2006
 Saturdays 9:00-12:30
 Room 103, Engineering Building
 September 30-December 9, 2006
 (updated 25 May, 2006)
 Instructor
 Jim Larson
 jim@larson-tech.com
 (503) 645-3598
 Student Contest
 Submit Project 2 or Project 3 to the AVIOS Student Voice Application Programming Contest and possibly win up to $2000.
 Course Motivation
 The Internet is changing the very fabric of society and business. It enables people to communicate with each other, to access nearly unlimited information, and to do business with one another. Voice is the bridge connecting the Internet and the telephone network. With the widespread availability of telephones and the explosion of the number of cell phones, anyone can access the Internet from home, at work, away from their office, or on the road by speaking and listening. For these users, the phone IS the Internet.
 VoiceXML is affecting the speech industry dramatically. VoiceXML is changing how developers create speech-enabled Internet applications. Because VoiceXML hides many low-level details, developers can create speech-enabled applications by specifying high-level prompt messages for menus and forms rather than working at the detailed, procedural level of programming languages. Decreasing the programming time and effort enables developers to perform additional iterations of usability testing and design refinement. VoiceXML is lowering the entry barrier to creating speech applications. While VoiceXML makes it easy to create a speech-enabled application, it is difficult to create a good one. An HTML programmer can easily learn how to write VoiceXML scripts, but designing a usable VoiceXML form or menu is still an art more than a science. VoiceXML language manuals do not offer much in the way of advice for how to phrase a prompt, what to include in the grammar describing what a user may say in response to a prompt, and what to do if the user does not respond appropriately.
 This course answers fundamental speech user interface questions, including:
 - How to involve users in every stage of the design and implementation of speech-enabled applications
 - How to enable the computer to listen to users by writing speech grammars
 - How to enable the computer to speak to users by preparing textual prompts which are converted to speech by a speech synthesizer or are prerecorded by a professional voice actor
 - How to enable the computer to listen to the user's speech by creating grammars that guide a speech recognizer
 - How to write error handlers that deal with events such as no response by the user, unrecognizable words, and help
 - How to choose the appropriate speech dialog style and implement the style using VoiceXML
 - How to create new speech applications by reusing pieces of existing speech applications
 VoiceXML makes iterative design and testing of speech-enabled applications possible. Developers can quickly mock up designs for evaluation by prospective users, and then quickly identify and fix trouble spots. VoiceXML hides the complex programming details and enables the developer to concentrate on developing the overall design and refining the detailed wording of prompts and messages spoken to the user. VoiceXML does NOT displace the need for user testing; it makes it possible to perform more user testing.
 Designing voice user interfaces is still an art. This course presents numerous guidelines, suggestions, and conventional wisdom, but each day voice dialog designers are learning more, so guidelines are evolving quickly. The VoiceXML language itself is evolving. The W3C Voice Browser Working Group meets at least three hours each week to discuss and modify VoiceXML and its related languages. This course uses an early version of VoiceXML 2.0, so some of the examples may be out of date by the time you read them.
 For the latest language specifications of VoiceXML and its related languages, see http://www.w3.org/voice
 With the introduction of mobile devices that integrate the functionality of cell phones and PDAs, multimodal applications that provide both visual and verbal user interfaces will be popular. "X+V" for speech-enabling (X)HTML pages provides one approach for implementing this important and exciting new class of applications.
 Course Goal
 Prepare students to design, construct, and evaluate spoken language applications.
 Course Content
 This course will consist of a combination of four activities: lectures, projects, evaluations, and tests/quizzes.
 
 - Lectures: Summarize the state-of-the-art practices in constructing spoken language applications.
 - Projects: Design and implement three spoken-language applications, including a voice-driven fast food ordering application, a voice portal to a Web site, and an interactive-animated adventure story.
 - Evaluations: Demonstrations and usability tests to review, evaluate, and improve spoken language applications.
 - Tests/Quizzes: Midterm and final exams; surprise quizzes.
 Prerequisite
 Understanding of HTML or XHTML
 Grading  
 - The midterm and final exams are each worth 100 points.
 - Each surprise quiz is worth 10 points. (The lowest surprise quiz score is dropped.) There will usually be a surprise quiz each week.
 - Each homework assignment is worth 10 points.
 - Each of the three projects is worth 50 points.
 - Students earning more than 90% of the possible points will earn a course grade of A.
 - Students earning between 80% and 89% of the possible points will earn a course grade of B.
 - Students earning between 70% and 79% of the possible points will earn a course grade of C.
 - Students earning less than 70% will not earn credit for the course.
 - The exams will be closed book. Cheating during the quizzes or exams will result in no credit for the course.
 Text
 We will use the VoiceXML Guide, a CD-ROM home study guide, available from http://www.vxmlguide.com or the PSU bookstore.
 Course Schedule
          
            | Date | Material covered | Due on this date |  
            | September 30 | This syllabus |  |
            | Oct. 7 | Lesson 1: XML Background; Lesson 2: VoiceXML Background; Lesson 3: VoiceXML Application Structure; Lesson 4: Menus | Exercises from Lessons 1-4 |
            | Oct. 14 | Lesson 5: Forms and the Form Interpretation Algorithm (FIA); Lesson 6: Input Form Items: <field> and <record> Elements; Lesson 7: Executable Content and Navigation; Lesson 8: Procedural Elements | Exercises from Lessons 5-8; Project 1 |
            | Oct. 21 | Lesson 9: Input-form Items: <object>, <subdialog>, and <transfer> Elements; Lesson 10: Variables; Lesson 11: Events; Lesson 12: Resource Management | Exercises from Lessons 9-12; revised Project 1; Project 2 proposal |
            | Oct. 28 | Lesson 13: Properties; Lesson 14: Grammars; Lesson 15: Use of Grammars in VoiceXML; Lesson 16: Writing Complex Grammars | Exercises from Lessons 13-16 |
            | Nov. 4 | Lesson 17: Speech Synthesis Markup Language (SSML); Lesson 18: Introduction to Semantic Interpretation; Lesson 19: Semantic Interpretation: Towards Natural Language Understanding; Lesson 20: Dialog Design | Exercises from Lessons 17-20; Project 2 |
            | Nov. 11 | Multimodal user interfaces using "X+V" | Project 3 proposal |  
            | Nov. 18 | Multimodal user interfaces using "X+V" |  |  
            | Nov. 25 | No class, Thanksgiving weekend |  |  
            | Dec. 2 | Project 3 demonstrations | Project 3 |  
            | Dec. 9 | Final Exam |  |

 Project 1: Download the Prophecy voice platform (http://www.voxeo.com/prophecy/) to your PC and implement the voice equivalent of the following paper form:

 Ajax University
 Student Name ____________________
 Student ID _______________________
 Today's Date _____________________
 Course Number___________________
 Reason for dropping course ___________
 _______________________________
 Student signature _______________

 Project 2:
 Step 1: Turn in a written proposal describing (a) the purpose of your proposed speech application, (b) an example scenario of the use of your application, and (c) the target use of your application.
 Step 2: After instructor approval, implement and demonstrate your project.

 Project 3:
 Step 1: Turn in a written proposal describing (a) the purpose of your proposed multimodal application, (b) an example scenario of the use of your application, and (c) the target use of your application.
 Step 2: After instructor approval, implement and demonstrate your multimodal project.          |