Speech Recognition Technology Applications Denise Bilyeu, M.S. CCC-SLP Scottish Rite Computer Supported Literacy Program Munroe-Meyer Institute Omaha, NE
Speech Recognition • Utilizes hardware and software to transcribe spoken words into orthographic text • Allows users hands-free operation of computer systems
Applications for Persons with Disabilities • Academic opportunities • Vocational opportunities • Access to WWW
Implementation Issues • System Training Requirements • Dictation in Written Form • Absence of Graphical Representation • Functional Grade Level • Dictation Environment • Higher Order Organizational Skills/Strategies
System Training Requirements • Samples of training protocol text (500 words) were taken from each of the following programs • Dragon Naturally Speaking Standard* • Dragon Naturally Speaking Teen* • IBM Via Voice Gold • L & H Voice XPress *two samples were analyzed and averaged
Samples were analyzed using Readability Stack (Tice, B. 1990) • Flesch Index • Dale Index • Dale-Chall Formula • Fry Readability Graph
Flesch Index • RE = 206.835 - (1.015 x words/sentence) - (84.6 x syllables/word) • Rates text on a 100-point scale • High scores indicate easier reading levels • Reading Ease based upon • Mean Sentence Length • Syllables per 100 words
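As a minimal sketch, the Reading Ease formula on this slide could be computed as follows. The function name and the sample counts are hypothetical, chosen only for illustration, and the function assumes the words, sentences, and syllables have already been counted by hand or by another tool.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease (RE) from raw counts, per the formula on the slide."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences   # mean sentence length
    syllables_per_word = syllables / words   # syllables per 100 words / 100
    return 206.835 - (1.015 * words_per_sentence) - (84.6 * syllables_per_word)


# Hypothetical sample: 100 words, 10 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=10, syllables=150)
print(round(score, 1))
```

Shorter sentences and fewer syllables per word both raise the score, which matches the slide's note that high scores indicate easier reading.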
Dale Index • DI = 11.534 - (.053 x RE) • Based on the Flesch Index Reading Ease Score
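A short sketch of the Dale Index conversion as stated on this slide. The function name is hypothetical; the constants are taken directly from the slide and the input is assumed to be a Flesch Reading Ease score that has already been computed.

```python
def dale_index(reading_ease: float) -> float:
    """Dale Index (DI) from a Flesch Reading Ease (RE) score, per the slide."""
    return 11.534 - (0.053 * reading_ease)


# Hypothetical RE values, for illustration only.
for re_score in (30.0, 60.0, 90.0):
    print(re_score, round(dale_index(re_score), 2))
```

Because RE enters with a negative coefficient, easier text (higher RE) yields a lower DI.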
Dale-Chall Formula • Reading Grade Score (RGS) = .1579 x DS (Dale Score) + .0496 x SL (Sentence Length) + 3.6365 • Dale Score = % of words not on Dale list of 3000 • Sentence Length = average # of words per sentence
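The Dale-Chall calculation above can be sketched as below. The function names, the tiny `familiar` word set, and the sample words are all hypothetical; a real analysis would use the full Dale list of 3000 words, which is not reproduced here.

```python
def dale_score(words: list[str], familiar: set[str]) -> float:
    """Percentage of words NOT on the familiar-word list (the Dale Score, DS)."""
    if not words:
        raise ValueError("words must not be empty")
    unfamiliar = sum(1 for w in words if w.lower() not in familiar)
    return 100.0 * unfamiliar / len(words)


def dale_chall_rgs(ds: float, sentence_length: float) -> float:
    """Reading Grade Score from DS and average words per sentence (SL)."""
    return (0.1579 * ds) + (0.0496 * sentence_length) + 3.6365


# Hypothetical stand-in for the Dale list; NOT the real list.
familiar = {"the", "dog", "ran", "to", "house"}
sample = ["The", "dog", "ran", "to", "the", "veterinarian"]

ds = dale_score(sample, familiar)           # 1 of 6 words is off the list
print(round(dale_chall_rgs(ds, sentence_length=6.0), 2))
```

Keeping the word list as a parameter means the same functions work unchanged once the actual Dale list is supplied.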
Fry Readability Graph • Yields Readability Grade Score (RGS) based upon: • Syllables per 100 words • Sentences per 100 words • Average the RGS for 3+ random passages for reliable score
Conclusions • 4th grade minimum literacy level required to train voice recognition programs (most programs need 6th to 8th grade reading levels) • Respiratory support sufficient to produce sentences of M = 10.44 words • No statistically significant differences in training protocols
Dictation in Written Form • Dictation vs. Conversational speech • Children produce 86% more words in slow dictation than in writing and 163% more words in normal dictation than in writing (Bereiter & Scardamalia) • Process is vastly different • Dictation skills must be taught
Absence of Graphical Representation • Difficulty with dictation is often attributed to absence of graphical representation; may cause problems in text development and revision (Wetzel) • Speech Recognition has graphical representation, but often with a delay that interrupts the dictation process
Functional Grade Level • Classroom placement and curriculum demands contribute to written text needs • Written text requirements may not be extensive enough to warrant a Speech Recognition system • Consider cognitive and/or language skills
Dictating Environment • Voice recognition requires an environment relatively free of auditory stimuli • Ambient noise will affect the system’s ability to function well • Dictating may be disruptive to others • Removal from the environment may solve dictation problems, but may result in educational or vocational disruptions
Higher Order Organizational Skills / Strategies • Persons must have cognitive abilities to dictate and often need strategies to help with the process • Pre-Writing Strategies • Writing instruction • Planning • Outlining/Mapping • Inspiration
Evaluation • Intelligibility • Sentence Intelligibility Test (Yorkston, Beukelman & Tice, 1991) • Utilizes ten unrelated sentences • Transcribed by unfamiliar listeners • Variables elicited • Intelligibility (% of intelligible speech to unfamiliar listener without context) • Rate of speech • Grade/Literacy Level • Fluency of Dictation
Attention to task • Writing/Dictating environment
Trial with voice recognition system • set up microphone/sound system to see if voice is perceived • run system training session if user is capable • dictate known passages that require little cognitive demand (e.g., the Pledge of Allegiance) • dictate text that requires cognitive demand (e.g., a short expository passage)
Alternate means for training systems • Utilize another person with similar voice characteristics • Transcribe training protocols and allow user to learn and practice dictating • Transcribe training protocols and dictate to tape for user to listen to while dictating
Janae • 9 years old • Athetoid Cerebral Palsy • Sentence Intelligibility Test Score - 10% • Current System • Discover Board, Mouse Key • Reason for Referral • Mousing slow and fatiguing
Evaluation Tool • Dragon Dictate v. 3.0 • Evaluation Results • With no training, could utilize Mouse Grid with 80% accuracy; after a one-hour session, could utilize Mouse Grid with 95% accuracy • With extensive training, could dictate small amounts of text
Voice Recognition Status • Utilizing Dragon Dictate Mouse Grid on trial basis • Training on selected, commonly used words in progress to determine efficiency and fatigue effect of dictating text
James • 14 years old • Learning Disabled, reading and writing skills 4 years below grade level • Sentence Intelligibility Test score = 100% • Reason for Referral • Slow input method • Input impeded cognitive writing process • Inability to monitor written work
Evaluation Tool • Dragon Naturally Speaking Standard • Dragon Naturally Speaking Teen • Evaluation Results • Training materials printed and practiced before actual program training • Training required 2 weeks, 3 sessions/week • Needed alternate text program to review text • Worked on phrasing, assisted punctuation
Voice Recognition Status • Uses voice recognition at home for homework and correspondence • Does not use voice recognition at school
Brett • 18 years old • Quadriplegia, ventilator dependent • Sentence Intelligibility Test score = 100% • Current system • EZKeys for Windows with Morse Code input via pneumatic switch • Reason for referral • slow input method
Evaluation Tool • Kurzweil • Evaluation Results • Ventilator had to be physically blocked at Brett’s neck and in the back of the wheelchair • Training on segmentation of words and phrases was necessary • Training required one month
Voice Recognition Status • Able to use voice recognition at home for homework, correspondence and the internet • Unable to use voice recognition at school because of ambient noise and disruption that dictation causes
Katie • 16 years old • Traumatic brain injury • Sentence Intelligibility Test score = 89% • Current system • regular keyboard with track ball • Reason for referral • slow input method • fine motor movement fatiguing
Evaluation tool • Dragon Naturally Speaking Standard • Evaluation results • Unable to train system during evaluation, likely because of nasal emission on specific sounds affecting the intelligibility of surrounding sounds • Palatal lift was fitted subsequent to evaluation, but further voice recognition evaluation was not done
Voice Recognition Status • Unable to utilize voice recognition at time of evaluation • Further evaluation was not done as fine motor abilities were improving and alternate strategies (word prediction, abbreviation-expansion) were effective
John • 53 years old • Friedreich’s Ataxia • Sentence Intelligibility Test score = 53% • Current system • EZKeys for Windows scanning via pneumatic switch • Reason for referral • slow input method • alternate access for versatility and fatigue
Evaluation Tool • Dragon Naturally Speaking Standard • IBM Via Voice Gold • Evaluation Results • Unable to train system after extensive trial period (4 weeks, daily) • System would not “perceive” John’s voice
Voice Recognition Status • Unable to utilize voice recognition • Trial with Dragon Dictate scheduled
Clinical Implications • Decreased intelligibility results in decreased success with voice recognition • Intelligibility may NOT predict success with voice recognition • Rate of speech may affect success with voice recognition
Future Directions • New voice recognition programs require minimal training • New programs that do not learn as they are used are in development • New programs that utilize a standard set of distinct “sounds” are in development