A related FAQ, the a2x FAQ, contains information about using DragonDictate with the a2x program, which lets you connect a PC running DragonDictate to a workstation that uses the X Windows environment and control the workstation's graphical display and applications. For more information on speech recognition technology and research, try the comp.speech archive.
These pages are by now a little rusty -- for a start Dragon Systems have excellent on-line advice and their own FAQ pages. I will no longer try to keep my copy of the Dragon Dictate FAQ up to date (Apr 96) because (1) I don't work for them and (2) they can do it themselves very well and (3) a2x is what this is really all about -- ie getting voice users to be able to control X-based workstations (though a3x lets you control NT -- see below). There are loads of other voice products out there now, for example Kurzweil Voice, Articulate Systems, IBM VoiceType.
These FAQs are direct copies (shortened where possible) of postings to the a2x mailing list, the SOREHAND mailing list, and the C+HEALTH mailing list. Send contributions/corrections/updates/ideas to me.
Collections of DragonDictate macros, contributed mostly by a2x users, cover a wide range of tasks, including programming in various languages and using word processing programs, latex, emacs, vi, and other applications. They can be found off the a2x FAQ.
The macros and ASCII versions of the FAQs are also available by ftp. These will be updated periodically to reflect the current status and contents of these web pages.
Announcing: The voice-users mailing list.

This list is for discussing all aspects of using voice recognition input systems: DragonDictate, Kurzweil Voice, IBM Voicetype, IN3, and others. Sample topics might include:

- Using such systems safely, without muscle or voice strain
- Techniques for improving recognition accuracy
- How to set up the physical voice workstation optimally
- General tips for effective use of voice interfaces
- Configuration of specific systems, troubleshooting, etc

To subscribe, send mail to: email@example.com with subject line "subscribe" (without the quotes). Random administrivia should also be directed to the request address. Should you need to reach the list-maintainer non-electronically for some reason (limited ability to use a computer, for instance), his current work number is (312) 702-7142.

Posts to the list should go to: firstname.lastname@example.org
Return To Contents List
From: "Carlos M. Puig"
Date: Wed, 30 Nov 1994 20:46:01 -0800

The current issue of PC Magazine (December 20, 1994) has a major review section on voice recognition products (pages 203-219). The following products are covered in detail:

o Dragon Dictate for Windows
o IBM Personal Dictation System
o IBM Continuous Speech Series
o Kurzweil Voice for Windows
o Listen for Windows
o Phonetic Engine 500 Speech Recognition

In addition, there is a table of "Other Voice-Recognition Products" (p. 209) and a sidebar on "Navigation in a Mouseless World" (pp. 212-13) covering briefly:

o Voice Assist (bundled with the Sound Blaster 16)
o VoiceMouse
o IBM Navigation Product (name not finalized)
o QuickSwitch for OS/2

This time there is no editor's choice: "While voice-recognition technology for the PC has finally advanced enough to yield productive tools, we feel that it's still too early in the game to pick a clear winner in any of the three categories we examined [dictation, navigators, and application development]."
Return To Contents List
From: Dana Bergen
>I had a demo of DragonDictate today. I have a couple of questions,
>though. How easy is it to install and learn? The reseller wants what I
>consider a hefty chunk of money ($250 to install + ($250 x 2 for two
>4-hour training sessions) = $750 total) to install and train me on it.

This is insane. You don't need anyone to train you on it. It comes with a tutorial and it's not difficult to use. The "more advanced" features like creating macros and using dcoms are reasonably well explained in the manual.

>I'm pretty comfortable installing things in my PC (I put in a sound
>card and a tape backup unit myself), but he was talking about
>configuring interrupts, which I know nothing about.

I didn't install the card myself because my hands aren't up to using a screwdriver but I don't think it's any big deal. I think that configuring the interrupt is just a matter of setting a switch on the board. If you're on a network there are some additional issues. If you're concerned about this you might try negotiating a much lower price for him to just do the install. Better yet, try it yourself and only pay for help if you get stuck.

>following a tutorial and reading a manual. On the other hand, I want
>DD to work, and I want to feel comfortable using it, otherwise it's a
>waste of $$$.

With respect to the installation, it will either work or not work. It's not going to work badly or differently because you installed it wrong. I think your reseller is trying to recoup the money he's not making since they cut the price.
Return To Contents List
Date: Mon, 20 Mar 1995 08:54:13 -0500 From: croom
My DD for Windows is getting me through Law School. With my multiple RSI's, I have to tape record my classes and library research, and then transcribe them using DD. I do most of my research online to avoid handling large law books. I have written 50-100 page papers on DD and I take the computer to school to take my 3 hr final exams. I don't even need more time to take tests than the other students. The software cost $2000 last August (upgraded to Windows in October without charge). I have the Power edition which includes legal and medical terminology. I realize I sound like a commercial, but I lost my job and my former career due to RSI (still in litigation) but DD has given me a 2d chance.
DD can also be used for programming.
Return To Contents List
From: Jeff DelPapa
At my employer, a private office is now policy for dragon users (there are 3 of us now). Being in the same room with a dragon user is _WORSE_ than with a full time phone user, as the pauses trigger the "start of conversation" break... (most people can tune out continuous speech, the discrete speech drives people nuts.) This is recognized by "both sides" -- I did start out in a cube, and as soon as we moved into new digs, I got an office, with storage spaces on 3 sides. Whenever the subject of cubes as a space crunch cure arises, the others that remember me in the next cube say "give him an office, we won't be jealous".

From: Dana Bergen

I work in a cubicle, and most of the people around me say it doesn't bother them. One of my neighbors has complained, however, so I'm going to get a private office when our group moves. That move keeps getting postponed, however, and no one seems to consider it urgent that I be moved. I think sharing a cubicle or office would be intolerable for the other person, though.

Other people's ordinary noise and conversation doesn't cause me problems. Dragon only picks up sharp loud noises like a door slamming or construction sounds, and these are generally not mistaken for words so it doesn't cause problems either.

Sending email to friends about private matters -- now *that's* a problem I don't have a solution for!

From: "Eric S. Johansson"

>Sending email to friends about private matters -- now *that's* a problem
>I don't have a solution for!

1: have an email account at home with dd
2: train a "privacy" vocabulary/cypher

From: Ned

I work in a busy newsroom at a daily newspaper. I have two "pod mates" and a steady stream of coworkers who frequent my little area. My dictation is no louder than a normal telephone conversation, and my fellow workers tell me they don't even notice me babbling to myself anymore. :-}
Return To Contents List
From: email@example.com (Washington Taylor)

Joe Steffen writes:

> I moved my DragonDictate 2.0 voice files to a PC with DragonDictate 3.0 and converted them with the modvoc program that's used for upgrading to 3.0, then used DragonDictate in my normal work for a few hours. Modvoc copies all your words and macros and preserves your voice training, but it still has a few problems:
>
> 1. Word punctuation attributes are not preserved, both for words you added and special character punctuation you changed, e.g. I changed "*" to cling left and right so it's easy to use in file name and regular expressions.

Do you keep most of your macros in files? If you do, it is easier to keep track of and edit them. Furthermore, when you upgrade you can simply edit those files to the new format and fix punctuation automatically. I haven't upgraded to 3 yet, but even upgrading from 1 to 2 it only took a few minutes to write emacs macros to update my macro files.

> 2. Trailing spaces on words are removed. This is a problem for me because I define most computer command names to have a trailing space and cling right so I can say "*" after them.

Again, this problem would be fixed by using macro files. If the internal representation changed, you could use a command like

add-word /t /g "foo" "foo " r

(version 2 syntax)

> I went through the first part of my vocabulary online and counted over 170 words such as file suffixes like ".bat" and parts of file paths such as "../". I estimate I have over 200 such words and special characters with changed punctuation attributes, so unless modvoc is fixed I'm not spending the money to upgrade. Note that there is no way to dump words in DragonDictate 2.0 so all of the above has to be corrected by voice.

Again, using macro files would eliminate this problem.

> I had the same error rate (3%) as with DragonDictate 2.0. Has anyone seen an improvement in recognition accuracy after upgrading from 2.0 to 3.0?

I had 98% recognition with version 1. I have 98% recognition with 2. I expect to have the same with 3 and the windows version unless they have radically improved something. The only difference was the time to reach the plateau, which was months for 1 and days for 2.

> From: "Susan Maller ()"
> Has anyone out there upgraded from Dragon 2.0 or 2.01 to Dragon 3.0? How
> do you like it? Do you notice a significant difference? PLEASE REPLY.
> Susan Maller
> firstname.lastname@example.org

Yes, I'm on 3.0 Classic - still with a 30K vocabulary. I found that recognition has improved, particularly for words like "last" "cast" "castle" in which the British flat "a" caused a problem. I no longer have to try to sound like an American on these :-) Also slightly faster, I think, but that is subjective and I have no proof. So, no significant difference but I'm happy with the upgrade.
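Washington Taylor's suggestion above, keeping vocabulary in macro files so that trailing spaces and cling attributes can be regenerated after an upgrade, might be sketched as follows. This is only an illustration: the line shape is copied from the single add-word example in his message (version 2 syntax), the meaning of the /t and /g flags is not explained there and is simply carried over unchanged, and the helper names add_word_line and with_trailing_space are hypothetical. It does not handle words that themselves contain a double quote.

```python
def add_word_line(spoken: str, written: str, cling: str = "r") -> str:
    """Build one macro-file line shaped like the quoted example.

    Assumption: the '/t /g' flags and the trailing cling letter are
    copied verbatim from the one example in the message; their exact
    meaning is not documented here.
    """
    return f'add-word /t /g "{spoken}" "{written}" {cling}'


def with_trailing_space(names):
    """Emit one line per command name, with a trailing space on the
    written form so that a following word does not run into it."""
    return [add_word_line(name, name + " ") for name in names]


if __name__ == "__main__":
    # Hypothetical list of UNIX command names kept in a plain text file.
    for line in with_trailing_space(["grep", "ls", "chmod"]):
        print(line)
```

Keeping the word list in a plain file and regenerating the add-word lines means a format change between versions only requires editing add_word_line, not re-entering 200 words by voice.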
Return To Contents List
From: Scott@ccgate.dragonsys.com

>Also, (forgive my ignorance being a non-PC oriented person) could
>someone tell me whether there is a PCMCIA, or PCI M-ACPA card
>available, or a similar card which DD can use ?

There is not a PCMCIA version of the M-ACPA card. At this time, DragonDictate for DOS can only use this card. As you may have gathered from reading the mailing lists, DragonDictate for Windows can operate on a Windows sound card like a SoundBlaster 16, the MediaVision PAS16 and the Microsoft Windows Sound System to name a few.

From: Donald Hermes
Subject: Re: SoundBlaster settings

From my experience with it, the SoundBlaster seems to need more volume than the MACPA card. For what it's worth it seemed to have more of a 'typeahead' buffer than the MACPA card. Make sure the automatic gain control is off. That needs to be set from the recording settings in the mixer control. I had my microphone volume set at about 4/5th of the way up.

From: email@example.com (Martin Carroll)
Subject: SoundBlaster settings

For the past few months, I had been using a SM10A Shure microphone with a MACPA card, and DragonDictate picked up my soft speaking just fine. I just switched to a SoundBlaster 16 card, and now, no matter what settings I specify in the audio mixer tool, I cannot get DragonDictate to consistently hear every softly spoken word. Sometimes I have to downright shout to get DragonDictate to hear a word.
Return To Contents List
>> but in my opinion, the technology just isn't there to
>>give you both adequate performance and to also have the DragonDictate menu pop
>>up on the workstation screen.

>I'm puzzled by this remark. I have the DragonDictate menu on my workstation
>screen, and the performance seems quite adequate to me. (The accuracy, on the
>other hand...but that's a different topic.) Do you mean that
>you experienced a noticeable delay between saying a word and seeing it
>typed, or that you had to wait for menus to be drawn? This has not been
>my experience. Or do you mean that you can speak more quickly if the
>menu is not exported?

What I mean is the following. I regularly use Dragon to run an X display. The way I'm using it is to have a PC monitor next to my workstation monitor. After using this for about 1 1/2 years or so, (the point being that I had become accustomed to a certain level of performance) I saw a demo of the Windows product. Point blank the performance was visibly worse than what I am used to. I surmised that the reason for this was that the Windows display takes processing power to update, which is true. The Dragon Systems rep basically agreed with me, though he was surprised that I noticed.

The main thing is, anything you ask the PC to do other than voice recognition is taking cycles away from the recognition engine and giving them to some other task like updating the VU meter, for example, which is constantly responding to the volume of your voice.

Someone else said they didn't think that the Windows version was slower, and then went on to explain that they were already using an exported menu that appeared on their actual workstation display and explained that since the exported menu was already taking a certain amount of processing power, that the additional amount required to update and maintain the Windows display was insignificant compared to what they were already putting up with.

My problem is that I could easily see that the Windows version was slower than what I was using; since what I'm using needs to get faster, not slower, this is not a solution I am willing to consider. Also, since I rarely have to look at the PC screen anyway as I just watch what happens in my X display, the benefit of running on a single machine or of having an exported menu pop up on my workstation (and cause the problems mentioned on this list with blotches due to bugs in deskview/X) is of little or no use to me.

What I want is raw performance, period. Anything that will adversely affect that, and I could easily tell from the demo that the Windows display would, is just not going to make it with me. Added to that, the Dragon rep basically admitted that it *was* slightly slower, so I basically bailed on the Windows product. This is not to say that I'm not into DragonDictate; I am, it saved my career. I just won't do anything that will slow my scenario down.

From: firstname.lastname@example.org (Joe Steffen)

I installed DragonDictate for Windows last week. The first problem is you have to cancel the new user dialog and follow the README instructions to change the voice board default to the M-ACPA board.

The second and worst problem is you can only transfer some macros from DragonDictate for DOS; you lose everything else: voice model training and all added words (I have a significant number of acronyms and UNIX command names and options). After two calls to technical support it wasn't clear which macros can be imported into DragonDictate for Windows; the format of special keys like control keys may have changed, so you will need to write a conversion script. You can't import macros with dcom's because they are replaced by a scripting language.

The third problem was the tutorial quit after an internal error 4 times. After doing the quick training, the tutorial finally worked.

DragonDictate for Windows operates differently so retraining yourself will be necessary, e.g. you say Voice Menu instead of Voice Console. I'm sure you can define a compatibility macro, but the macro language documentation is on-line and not in the printed manual, which I consider to be a problem since it is easier for me to read a paper manual than control a help program by voice.

So now I'm considering upgrading to DragonDictate 3.0 since I will continue to use DragonDictate for DOS for most of my work, which is on UNIX and a Macintosh.

From: Sebastian Seung

I just got my DOS to Windows upgrade from Dragon Systems. The user interface is well done, but the documentation is not so good. My greatest disappointment is that it doesn't work with PC-XWARE. I called Dragon's technical support, but they were ignorant of the problem. Does anyone know the reason for the incompatibility? Or has anyone else found a compatible X terminal program for the PC?

From: Gary Shea
Date: Tue, 15 Nov 1994 14:46:50 -0700

Howdy folks! I finally got Lan WorkPlace going again (these Windows apps are mighty picky...), so now I can connect to my Unix box via the tnvt2270 app and dictate into that window.

In Windows, recognition isn't exactly peppy but it's maybe 20 words a minute? I'm guessing. But with 24M of memory, in the tnvt220 window I only get about 10-12 words/minute. I say my sentence, then sit there for 20 seconds while it gradually gets created in front of me. I figured it was memory when I had 16Meg, but now with 24 it's no better! Anyone else have this experience, or better yet anyone that's figured out that it's something simple? Oh, please???

Anyone that's used DDDOS and then switched to DDWIN? Notice any change in recognition accuracy? What I get is just terrible... no better than 75% in dictation mode, even when I'm really careful about corrections. Quite a bit better in command mode, but I'm never there so it doesn't help much...

From: Jack
Date: Wed, 30 Nov 1994 16:50:35 -0800

>i've got a 486 box and a sparc 20. i've been using dragondictate for
>windows for a couple of months now, and i am *nowhere* near normal speech
>rate. more like 15-20 words a minute i'd guess, taking corrections into
>account.
>am i doing something wrong

YES! You're using the Windows version! No, really though, no insult intended but it really does perform that much worse. There was a sizable flap regarding this issue on the list a while back, but if you think about it for a while it makes perfect sense. The PC processor has to manage the Windows display, thereby consuming cycles the recognition engine could otherwise have used.

The other part of the story is that I am well acquainted with this speech technology, which also improves performance. It won't account for the entire discrepancy in performance though; the main bulk is just that the Windows version is slower. For Unix users there is little if any reason I can think of to use the Windows version. If you're a DOS user, maybe it's a win...
Return To Contents List
From: Scott@ccgate.dragonsys.com

> Sender: email@example.com
> Does anyone have a clever idea for how to move the (windows) mouse
> in ddwin using a voice macro/script? I would really like to have a set of
> macros which either
> 1) move the mouse to one of a set of fixed points on the screen
> or
> 2) move the mouse a relative distance (like "[two inches up]").
> I tried using the scripting language to start the mouse moving, then wait,
> then stop, to achieve (2), but it gave me an error message about not
> being able to wait in that state.

Someone in my department here at Dragon has recently written a DLL that will allow you to do exactly what you want through the DLLCall scripting command in DDWin. Using it, you can move the mouse pointer to x,y coords relative to the screen or window and also offset the placement of the mouse pointer any number of pixels up, down, left or right.

I have to get the file from him and I will make it available to you somehow. I don't have an FTP server that you can connect to, maybe I can attach it to an e-mail (I think my mailer uuencodes) or I can post it on our BBS for you.
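The two kinds of macro asked for above (fixed screen points, and relative moves like "[two inches up]") both come down to computing a target x,y for the pointer. A minimal sketch of that arithmetic follows. It does not call the Dragon DLL or DLLCall, whose arguments are not given in the message; the pixels-per-inch figure, the screen size, and the function names are all assumptions chosen for illustration. Screen coordinates are taken to start at the top-left corner with y increasing downward.

```python
# Assumed values for illustration only; real ones depend on the display.
PIXELS_PER_INCH = 96
SCREEN_SIZE = (640, 480)

# Unit steps in screen coordinates (y grows downward).
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def relative_offset(inches, direction, ppi=PIXELS_PER_INCH):
    """Turn a spoken distance such as 'two inches up' into a pixel offset."""
    dx, dy = DIRECTIONS[direction]
    pixels = round(inches * ppi)
    return dx * pixels, dy * pixels


def move(position, offset, screen=SCREEN_SIZE):
    """Apply an offset to a pointer position, clamped to the screen edges."""
    x = min(max(position[0] + offset[0], 0), screen[0] - 1)
    y = min(max(position[1] + offset[1], 0), screen[1] - 1)
    return x, y


if __name__ == "__main__":
    start = (320, 240)
    print(move(start, relative_offset(2, "up")))
```

Computing the destination in one step avoids the start/wait/stop approach that produced the "not being able to wait in that state" error: the macro only needs a single call with the final coordinates.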
Return To Contents List
Here's a zipped up version of the mouse movement DLL and macros. I've renamed them from DDHELP to DDMOUSE for descriptive purposes.
There are three files in this DDMOUSE.ZIP file,
Wati has contributed these additional macros for mouse control in DD for windows, which make use of the DLL.
Return To Contents List
From: George Hu
Subject: Kurzweil Voice

I am fortunate to have both Dragon Dictate and Kurzweil Voice for Windows, and would like to provide some comparison. It seems that Dragon really dominates this mailing list, possibly due to having multiple dealers on the alias. While Dragon is a very good product which has developed many features which are very useful, Kurzweil has some advantages which everyone should examine.

In my opinion, Kurzweil has a much better recognition engine. Out of the box, Kurzweil beats a trained Dragon, and over time it still beats Dragon. When I use Dragon, I find it mistakes words a lot, and often doesn't have the right word in a choice list of 10 words. Kurzweil, however, very rarely makes an error; when it does make an error it is usually on a word which does sound ambiguous, and the choice list of 5 words has the correct word in the list much more often. Dictating this whole document, I probably made less than a dozen errors!

Kurzweil is also faster than Dragon on the exact same system. This may be due to Kurzweil still using a specific hardware board whereas Dragon is running off of a Windows sound system card. These two advantages mean that I want to use Kurzweil, but do not look forward to using Dragon.

Dragon has a much better set of macros and customization. If you intend to do programming, or other non-dictation activities, Dragon may be the only system which can work for you. Dragon allows you to control the mouse and buttons by voice whereas Kurzweil does not. Dragon can read text off of buttons and allow you to speak them whereas Kurzweil only has a few predefined buttons you can say. Dragon allows you to create hierarchical macros and has sophisticated things you can do inside macros such as control spacing and capitalization; Kurzweil does not.

Kurzweil is a simpler product to use. There is no command vs. dictate mode. There are no flags for spacing and capitalization you have to set before saying a word. Kurzweil allows you to correct spacing and capitalization afterwards. For correcting errors in previous words, you don't need to enter a special oops mode; you just say "backup 2" or whatever. Kurzweil has a manual of about 70 pages; Dragon comes with three manuals totaling over 315 pages. Kurzweil also comes with many application macros which I have found easier to use than for Dragon, although that probably varies a lot depending upon which programs you run.

Eventually, Kurzweil will probably have all the features of Dragon. It will be difficult, however, for Dragon to change their whole engine. Eventually, both these products will be put out to pasture by continuous recognition systems. Today, I think the choice is between recognition accuracy, and features. If you want a dictation system for actual English dictation and moderate application control, then Kurzweil deserves serious attention. If you intend to do programming, or other things which must be highly customized, then you probably need the features of Dragon.

Lastly, Kurzweil is retailing at about 1000, which is a bit more than Dragon. Both systems can run on similar platforms, but I think Kurzweil is faster. Sometimes, Kurzweil can require more virtual memory than Dragon, but both are basically memory hogs. I run both on a 66 megahertz 486 with 16 megabytes.

From: Gary R Noonan
Subject: Re: Kurzweil Voice

My approximately year long experience with the DOS version and approximately 2 month experience with the Windows version of Kurzweil support the comparison provided below. The Dragon dealer was unable to make his system perform with anywhere near the accuracy of Kurzweil when I examined DOS systems. The Windows version of Kurzweil is indeed very good at voice recognition--even without training.

I recently found a method to have Kurzweil produce mouse clicks on selected windows. Simply start the Windows Recorder macro system (found in accessories window) and assign a unique keystroke to the macro (such as Shift + Alt + Ctrl + a) and make the desired mouse clicks. Then create a macro in Kurzweil that issues the hotkey for the Windows Recorder macro. This procedure allows you to have the Kurzweil system perform mouse clicks by voice.

You can of course also create macros within individual programs and have Kurzweil call them. I have created numerous voice activated macros in WordPerfect for Windows and call them from Kurzweil, often performing complex tasks with a single voice command.
Return To Contents List
From: Dana Bergen
>> I've been looking into IN3 and am considering buying it. However, there >> is no demo version, and no money-back guarantee. So as far as the >> company is concerned, I have to put down $700 sight unseen with no >> guarantees. Is there anyone in the Boston area (I'm in Waltham) who is >> using the product and would be willing to give me a short demo so I can >> get a better feel for this product before plunking down $700? IN3 costs $700? Why spend $700 for a limited command set when you can get a full-fledged dictation system (DragonDictate) for $1000? I have no financial interest in this recommendation. I use DragonDictate and a coworker used to use IN3, and they are in two completely different leagues. I don't understand how IN3 can get away with charging that much given what DragonDictate costs now. From: Nelson Sproul >> I just got DD for Windows to use it with Unix. I don't know how smart that was >> (I'm beginning to wonder if I should have gotten IN3)... you are better off with DragonDictate than in3. when my hands first went bad, I wasn't aware of the DragonDictate/a2x option. Despite warnings from in3's customer service that I would "go crazy" using in3 as a substitute for typing (as opposed to just using it to handle mouse functions), I used in3 for one year, getting barely acceptable performance with a vocabulary of about 150 items. This vocabulary allowed me to spell words out, which was of course an excruciatingly slow method, though one I was still grateful for since it allowed me to continue work in this field. Now, using DragonDictate, I have superior performance AND a vocabulary of 30,000 items. I mentioned this to an in3 rep, and all he could say was that my Sun, a sparc ipc, is not a very strong machine. I don't think this accounts for a difference in performance/utility of two orders of magnitude. I haven't seen in3 run on a PC, but I called Command Corp. 
(in3's manufacturer) today and they say in3's performance is comparable on the two platforms. I also asked what was the largest vocabulary size they would expect, and I was told "at least a couple hundred." Needless to say, a vocabulary limited to hundreds of words is not acceptable for an English dictation system for adults. I have never flamed anyone/anything in my life, but it appears to me that, given the availability of DragonDictate and a2x, in3 is a hoax. From: Nick Parker Nelson Sproul wrote: > never flamed anyone/anything in my life, but it appears > to me that, given the availability of DragonDictate and a2x, > in3 is a hoax. I can see how using IN3 for text dictation would be very frustrating. I'd compare it to chopping down a tree with fingernail clippers. It's possible, but geeeeeeeeeeeez! It's certainly unfair to complain afterward about the design of the fingernail clippers... I've never seen INCUBED represented as a "dictation system." It certainly isn't called that in the Command Corp marketing literature I received, nor in the IN3 product documentation I received when I bought the package. It's all a matter of selecting the right tool for the job. Almost all applications require a large amount of command selection -- that's the nature of graphical user interfaces, and feature laden software. When you count up cuts and pastes, font changes, saves, moves, etc, even writing a simple text document has a LOT of command and data selection. In the case of CAD, DTP (which is almost indistinguishable from word processors these days), or other similar applications, the user-computer interface is comprised almost entirely of command and data selection. A good voice command system, like IN3, can perform these functions via voice, and drastically reduce the amount of button pushing you do in a day. A continuous dictation system might do the same, but at a much higher cost. 
Both dictation systems and command systems have their place -- it's a matter of selecting the right tool for the job. Command systems will benefit everyone, and dictation systems will give an additional benefit to those who need it. I don't think IN3 is a "hoax" -- at all. It is very robust and effective software. The last literature I saw lists the prices at $179 for the basic version, and $395 for the Pro version, which includes a very nice Audio Technica Pro 8 headset microphone. Not bad. And no, I don't have any affiliation with Command Corp. I'm just a satisfied customer: IN3 did what they said it would do. From: Ann Marie Lawler > IN3 costs $700? Why spend $700 for a limited command set when you can > get a full-fledged dictation system (DragonDictate) for $1000? Well, that's IN3 plus a top-of-the-line noise-canceling microphone. If you're on MS-Windows, IN3 doesn't cost anywhere near $700 and if you're on a SPARC station you've got to add the cost of the PC, the sound card, and possibly the communication connection to DD. Maybe you've got a free com port, maybe you don't, it's just that much more hardware. That plus find a place for that second box, and keyboard, and screen... Sigh... Then you also have to get a2x running. According to the documentation with the copy of a2x that I have (distributed with the X11R6 sources) it won't work with the Sun OpenLook version 3 (current release) out of the box. You have to have X11R6 running or a patched version of X11R5 with the XTEST extensions patched in. > I have no financial interest in this recommendation. I use DragonDictate > and a coworker used to use IN3, and they are in two completely different > leagues. I don't understand how IN3 can get away with charging that much > given what DragonDictate costs now. Well, IN3 and DD are in different leagues. They are different products designed to do different jobs. One is a command system, one is a dictation system. I type just great. 
In fact, some of my co-workers think I type faster than I talk. My typing was not where my problems were starting. If you can't type at all I guess you need a dictation system. If you can still type and want to prevent further damage, maybe a voice command system is a better choice. One product is not necessarily better than the other, they're just designed for different jobs. My problems started with that darn rat (ahhh... I mean mouse). When you squeeze both sides and hold down a button while dragging something around on a desktop, the stress goes right to the wrist. Added to that were multi key cords. ~F and ~B in vi as well as some of those lovely Alt-Function keys for Wordperfect had me stretched accross the keyboard like I was palming a basketball. Ouch! If your hands are already so injured that even normal typing is no longer possible, I'm sorry to hear that and I guess you don't have much choice. I prefer to not get that bad. I replaced all the fancy command operations with voice macros. Now my typing doesn't bother me at all. I can even type faster than I used to because I've reduced some of the distraction when doing commands. I caught my problem early enough so I guess I'm one of the lucky ones. I'd rather do the commands and functions by voice and still type rather than be forced to do all of my typing by voice. From: Simon Crosby > From: Ann Marie Lawler > Actually, I would not be suprised at all, having worked with a2x as > well as other products which require "clever window manager key bindings". > That's actually why I made the statement. There was just too much you cannot > do with key bindings. After a while you also get tired of cluttering up > configurations with obscure kludged together key bindings. And changing key > bindings "on the fly" is not a lot of laughs. Most people don't what to be > forced to be "clever" just to use something for a purpose to which it was > not designed. 
> Most people would rather use something that is easy to use and designed for the purpose to which they are applying it.
On the other hand, a2x is not a product, nor does it claim to be. It solves a problem, and does it exceedingly well. There are lots of people out there (like me) who cannot type. My career would be in the dustbin (US = garbage can) by now if I were not in a position to:
1. Write technical documents
2. Write programs in lisp, C, assembler and various other gunk
3. Manage a computer network
All by voice, on a unix workstation. DD is the ONLY product which I have seen which will currently allow all of these, in spite of its flaws.
> Actually, according to the sources, a2x has some pretty fancy features built into it that many people don't even know about. Maybe because they are so arcane to try and figure out. "That's a control T ?what? for a window class?" "But that command is supposed to do different things in those different windows." And once you're set up, changing them on the fly is difficult at best. And the recognizer still doesn't know what's happening on the screen or if something succeeded or not. Then there's that extra box and screen and keyboard and sound card...
Yes, a2x needs a language front end, and DD needs to be "application context aware". It also needs a *much* better undo mechanism than its current "throw out backspaces" idea. Ideally voice recognition/input should be an integral part of each application to guide recognition and so on ... One of these days perhaps... But remember, a2x is free. It also works well enough in conjunction with DD to be very useful.
> The thing about IN3 is that it is easy to setup and use and to change commands while remaining a powerful command processor. It just does a lot of things which cannot be done through key bindings. Things like warping to a location within a window of a particular title.
> Or combining functions where one function depends on the success or failure of a previous function.
Great. I have no problem with IN3 -- If it solves a problem or has a niche, then I'm 100% behind it. Remember though, that not everyone has a sparcstation. What do I run on my alpha box?
> It might be possible to combine key bindings with piles of shell scripts and come close to some of the advanced features of IN3, but not all. There are things which you still cannot do, even with key bindings and shell scripts. But do you really want to spend all of that time writing clever shell scripts to go with all those clever key bindings? Just to do a few of those things which key bindings cannot do? Just to use a product for something it was not designed for?
No I don't want to spend hours doing this. I spent about 1.5 weeks, once, and now I can do almost everything I need. Admittedly not very sophisticated, but it works. If you can add value in a product and people will buy it, then great. I wish you the very best of luck. And yes, I'm sure you have fantastic functions which I'd love to have.
> If you need a dictation system, get a dictation system. If you don't need a dictation system why get one to do something other than dictation? Different products for different jobs.
My DD world is more than a dictation system, it is my whole human-computer interface. This may well be true of IN3, now or someday, and I agree IN3 can help save people's hands by reducing hand strain.
Return To Contents List
IBM Announces the VoiceType Speech Recognition Family
IBM announced a new brand name for its family of speech recognition products--VoiceType. Included in this family is a new product, VoiceType Dictation, which is a high-accuracy, large-vocabulary speech recognition system for dictation. VoiceType Dictation was previously known as the IBM Personal Dictation System but has been enhanced. This system has the capability of recognizing 32,000 words at approximately 70 to 100 words per minute, with 97 percent accuracy. The system is compatible with many existing applications such as Lotus Notes, AmiPro, cc:Mail, Microsoft Excel, Word, and Quicken.
The initial product is based on OS/2, but IBM has also created a Windows version and a version for notebook and laptop computers that is supported through a PCMCIA digital signal adapter card. VoiceType Dictation for OS/2 is available in American English, British English, French, German, Italian, and Spanish. These languages for the Windows version will be available in 1995. The suggested retail price for the OS/2 product is $999; the PCMCIA model is $1,099.
Other products in this family include IBM's Continuous Speech Series, which is a speaker-independent continuous speech toolkit, and IBM VoiceType Control II, which provides speech navigation in selected models of the IBM ThinkPads.
DQ Take: The VoiceType family is yet another enhancement to the growing list of voice recognition products that we have seen for the consumer market. The benefit of using voice recognition in a PC environment is that it can greatly speed up the completion of forms and documents in which a small amount of information changes, and it can also allow users to create forms and documents in a hands-free environment. This is done through straight speech recognition dictation; speech recognition also drives the menu options of common word processing and other software packages. The target users for these products are typically lawyers, lab technicians, radiologists, surgeons, other professionals, and physically challenged individuals. [ed: is this us??]
A number of competing products are on the market. However, several features differentiate VoiceType from the competition. The first is the ability for the user to create an entire document before making corrections. Many other products make the user stop and correct the error immediately, potentially interrupting the thought process. The next is the ability for users to capture their voice for playback in case any input is in dispute. Finally, IBM has developed special vocabulary packages for radiology, emergency medicine, and journalism in order to further the accuracy of input in these segments.
Although the market for speech recognition is growing, the deterrents to that growth still include appropriate applications and the accuracy of the system. Dataquest sees any improvement to the use of speech recognition technology as a positive sign because improvements will eliminate the barriers to growth. Although the market is growing in several segments, such as voice dialing for telephony applications and speech recognition in voice processing applications, any push that creates user acceptance will also push sales of products that incorporate this technology.
Info: Nancy Jamison (408)437-8182 (firstname.lastname@example.org)
From: email@example.com
From: firstname.lastname@example.org (Dorab Patel)
Subject: what helps / hinders recognition
In your experience, what do you find that helps recognition accuracy? What seems to hinder recognition accuracy?
I keep my office door nearly all the way closed, otherwise the higher background noise increases errors. I'm vigilant about correcting misrecognitions, and if I ever leave the mike on during a conversation so that there are more false recognitions than I can correct, I revert to the last saved voice models; otherwise the voice models get messed up and the error rate goes up until they are retrained. I keep track of common misrecognitions of voice macros and change one of them, even if I've used it for a long time and the mental retraining is difficult. For example, [quit UNIX] often sounded like [quit emacs] so I changed the former macro. In 11 months my error rate dropped gradually from 8% to less than 3%.
Articulate Systems license DragonDictate for the Mac, and the product is called PowerSecretary.
Articulate Systems, 600 West Cummings Park, Suite 4500, Woburn, MA 01801, USA. +1(617)935-5656, Fax: +1(617)935-0490, +1(800) 443-7077
Here's the announcement from Articulate for version 2, passed on by email@example.com (Fred Rose) on Wed, 29 Mar 95 12:22:58 CST. Some of the comments which follow it are by now out of date. Anyone with a recent review please let me have it.
Direct Dictation into Applications - Version 2.0 allows you to directly dictate into almost all 3rd party applications such as WordPerfect, ClarisWorks, Excel, QuickMail, America On-line, QuarkXPress, and PageMaker to name just a few. In addition to dictating text, you can use bundled versions of AppleScript and QuicKeys to create whatever voice macros you want to control your favorite applications.
New Auto Correct Feature - Version 2.0 includes a very powerful new feature that enables the system to automatically correct words after you correct a previous word. For example, if you say "2.0" and the system hears "to point zero", when you correct the "to" to "2", the system will automatically correct the "point zero" to ".0". This feature increases the system's performance by automatically making corrections for you in many cases.
More Hands-free Control - For the disabled user, Version 2.0 provides significantly more hands-free control than previous versions. For example, you can now correct misrecognized commands entirely by voice from within the target application. In addition, there is more robust AppleScript control of the finder including moving and clicking the mouse.
PowerBook now Supported - Version 2.0 runs on the PowerBook 540 line of notebook computers without any additional hardware other than our PowerSecretary microphone. For those of you who need a mobile dictation solution, this is the configuration for you.
This is what others have to say
From: "Zack T. Smith"
Date: Tue, 13 Dec 1994 12:53:46 -0800
PowerSecretary costs about $1k more than DDwin. FYI, I recently called "articulate" systems, Inc., to ask the salespeople there whether they could articulate why their price is so much higher. I pretended that I have the option of using either Pentium or PowerPC and that the deciding issue is which product (PowerSecretary or DDwin) is cheaper. Rather than admit that their price is significantly higher, the salesperson actually lied to me and said that DDwin costs *more*. An entire rationale was provided as to why this is the case. I called Dragon to attempt to get verification of this new sales-insight, but was told that the 'articulate' person's numbers and rationale were all wrong, and somewhat troublingly so for the Dragon person I spoke with.
Date: Wed, 14 Dec 1994 08:01:25 -0500
From: Steve Larose
According to PC Magazine, the list price for DragonDictate for Windows, Classic Edition, Version 1.0 is $698. Everything I have seen from Articulate Systems indicates the list price for PowerSecretary is $2,500 (although they sometimes offer a discount, but I don't know how one qualifies for the so-called discount). While I'm a diehard Mac fan, it's hard to justify paying $1800 more for voice recognition on the Macintosh. Does anyone have any insight as to why Articulate feels it can charge so much more, and when the price of their product might become competitive with voice recognition products on the PC platform?
From: "Gary L. Karp"
Date: Tue, 1 Nov 1994 06:54:27 GMT
A brief update on my recent post on the PowerSecretary dictation system for the Mac from Articulate Systems. I had understood it was necessary to dictate into a separate window and then copy into the application. This has turned out to be only partially true. One may dictate into any application, but cannot correct by voice unless working in the separate window. The good news is, an update will be released in November which corrects that.
It will be possible to dictate and correct in any application window. I feel confident based on discussion with Articulate that they are committed to eventually meeting the full set of Dragon features on the Mac. I will be getting the system, and will report as I get it running. It will remain necessary to have a minimum 040 system. I am getting a PowerPC. It will also run on PowerBook 540s without added boards or peripherals - except the microphone, of course. A company in the Bay Area, Scott Tech, has a solution whereby Dragon is pumped through a small PC and out to a Mac in ADB format. This remains the answer for pre-040 Macs.
DragonDictate and Notebook Computers
From: Scott Jangro
To save you a toll call, here is the Windows Sound System compatible sound card that I think somebody here probably told Bob about:
WAV Jammer PCMCIA sound card
New Media Corporation
Irvine, CA
800-CARDS-4-U
This is not an entire notebook solution but a PCMCIA card that will plug into a notebook with a PCMCIA slot. We have tested the card here at Dragon and it does work very well with DragonDictate for Windows. It is a Windows Sound System compatible card and like the real Windows Sound System, it requires the use of a condenser type microphone (as opposed to a dynamic microphone which is what we've traditionally shipped with DragonDictate). We do provide either type of microphone with DragonDictate for Windows.
It is true that some sound card emulations do not do a good job at mimicking the real thing, but I would not say that it is necessarily *most*. The truth is that we don't yet know if most of them will work or not. The point is that one shouldn't assume that a card is truly Windows Sound System compatible (or SoundBlaster 16 compatible) just because it says it is. For example, it may allow you to run Windows Sound System's software drivers but the electronic characteristics of the hardware may not be of the same quality as the real Windows Sound System. While this may work well enough to do recording and playback in Windows or to play WAV files, it may not work well on a technology like speech recognition that relies heavily on good quality sound.
As far as information on any other sound cards goes, we are in data collection mode right now. We do rely quite a bit on user feedback for this information in addition to our own testing so if you have any information on notebooks or sound cards that work well or not with DragonDictate for Windows, I'd like to hear from you. Thanks.
From: Scott Jangro
We're currently putting together lists of laptop computers that work well with DragonDictate for Windows. The company that you called has no connection with Dragon.
That company is unique because they have a laptop that can accommodate an ISA card. This is important for people who use the ACPA DSP card for DragonDictate (required for the DOS product). Therefore, we refer customers to them as the only vendor that we know of with this kind of laptop. FWIW, I apologize for the difficulty you had in contacting them.
Now that DDWin can run on a standard Windows sound card, we have many more options. We're currently running experiments on different systems to put our seal of approval on some laptops that provide the best quality performance. Speech recognition is a very specialized use of sound input and requires high quality speech samples for good performance. Many sound cards just do not provide the quality sound that we feel is important for our product to perform the way we know it can. So while most if not all 16-bit sound cards would pass the "go-into-the-software-store-and-see-if-it-runs" test, only a subset of those would produce the quality sound for good speech recognition. It takes hours and hours of testing per card or laptop to determine this level of quality and we will provide this information to you as soon as it is known to us.
At this time I can tell you that users are running DDWin successfully on the following computers. We have purchased these machines and are in the process of certifying them. We DO NOT guarantee them at this time but preliminary tests are promising. Toshiba 4800 DX4-75MHz Everex Stepnote DX4-75MHz Ergo 100MHz IBM Thinkpad 755 WAVjammer (PCMCIA card)
How About DragonDictate for Windows NT?
With the alarming increase of Windows NT systems, people want to know if there is a DD solution for this setup. There are two solutions - hardware and software, as discussed below:
Hardware
From: Simon Crosby
A person in Cambridge suffering from early stages of RSI wants to know whether there is a DragonDictate in the pipeline for NT. Anybody know? Also, has anybody done the equivalent of a2x for NT? I mean, PC doing the recognition and an a2x-like thing on an NT box. Is this possible? (thankfully I have no knowledge of windows or nt)
From: Hans Heilman
>> Also, has anybody done the equivalent of a2x for NT?
There is a hardware solution called a TTAM that takes ascii output from one PC and converts it into the correct electrical interface to go into the keyboard port of a second PC. It also does mouse emulation going into the second PC's mouse port. I think this is the configuration:
+-----------------------+            +-------------------------------+
|PC running Dragon under|      +---->[mouse port] PC running windows |
|DOS, text goes out     |      |     |                               |
|port       [output port]-->[TTAM]-->[keyboard port]                 |
+-----------------------+    ^  ^    +-------------------------------+
                             |  |
                      keyboard  mouse
A friend of mine has been using this setup for a while so he could use Dragon for Windows development, before the DDWIN release (he was willing to live with having to have 2 PCs). As an experiment, he swapped the righthand PC to be one running NT and everything seemed to work fine. The TTAM is around $400 -- a company called something like Prentke-Romich sells it (along with other assistive technology). If you send me mail, I'll get the real name and address.
From: firstname.lastname@example.org (Jeff DelPapa)
If you have a PS/2 mouse, the ttam will control it. (takes special key sequences, but they can be done as voice macros) The one I had (it is currently on extended loan) also had a Mac ADB connector, and could spoof both keyboard and mouse on that machine. Mine was made by Words+ in northern CA (who make lots of adaptive products) but they have since stopped. The box was originally designed by the Trace Center at UWisc/Madison, and was licensed to several vendors.
Having said all that, there may be a minimal hardware solution. Under DOS, and Windoze 3.1 there is a "handicapped access pack" (available free for the asking from IBM and Microsoft respectively) One of the many things it provides is "serial keys" a setup that lets the keyboard and mouse be driven by data on the serial line. (you send "sentences" to access characters outside of the normal printing ones) I don't know if NT has such a facility, but since it was developed after the access pak became available, they may have built it in from the start. If someone has the NT doc set and wants to take a quick look, it would be appreciated. (I will be looking into this myself shortly, a co-worker will be in this spot soon) All of these presume a two computer solution. In the case of NT, with suitable software, you can use the DV/X trick and get the menu's to share the main screen. (dragon/dos won't run in a less than full screen ms-windows dos box, the video emulation isn't good enough). In the case of the TTAM, it does not support ps/2 scan code 3 keyboards (unfortunate, as that is the code set that many X terms that adopted PC keyboards use), so check yours. The vanilla PC (big 5 pin plug) keyboards are fine. The program to convert key events to the sentences needed by either ttam or the access pack does not exist in the public domain. The Word+ people used to include a copy if you bought a ttam, but they wouldn't sell it separately, and it may be essentially unavailable now. I don't know what the others do. From: email@example.com (Jeff DelPapa) I am glad we have the windoze NT documentation in machine searchable format. NT does indeed have a handicap access pack available for it, the page that tells you how to get it is in the appendix of the SNA Server development kit installation guide (I kid you not). Anyhow it is /softlib/mslfiles/wn0789.exe on ftp.microsoft.com. I got a copy, but haven't tried it yet. (toby will do so on monday). 
Since I have the disks, I suppose I should try installing dd/win on NT and see what happens.
From: Hans Heilman
One place I know the T-TAM can be ordered is:
Prentke Romich Company 1-800-262-1984
Their catalog carries a number of assistive devices, and also sells the T-TAM, apparently to connect their assistive devices to a PC (although it can also be used to connect 2 PCs together). They have a fact sheet on the T-TAM which can be included. They list several different T-TAM models on their price list, so be clear with them on the details.
Apparently, the T-TAM does have special input escape sequences defined which can cause it to generate mouse button actions, although my friend found that behavior to be somewhat flaky in different PC configurations (as opposed to the straight ASCII character input which worked fine). He was fairly frustrated with mouse emulation difficulties. He was running PROCOM+ on the Dragon PC to send Dragon-generated text out the COM port, but ran into a limitation. Apparently, PROCOM+ allows user-defined sequences to be bound to different keys, but the limit is 10 characters, which wasn't long enough for all of the special T-TAM sequences. He worked around this by coding a small ad-hoc program which reads input and sends it to the COM port.
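For readers wondering what such a work-around might involve, here is a minimal sketch in Python. It is not the program Hans's friend wrote, which is not available; the function name forward and the one-byte read size are assumptions made for illustration. It copies bytes unchanged from an input stream to an output stream, so an escape sequence of any length passes through, unlike a key binding capped at 10 characters.

```python
import io


def forward(src, dst, chunk_size=1):
    """Copy bytes from src to dst unchanged; return the number of bytes copied.

    chunk_size defaults to 1 so that each keystroke is passed on as soon as
    it arrives instead of waiting for a larger buffer to fill.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # an empty read means end of input
            break
        dst.write(chunk)
        dst.flush()            # push the data out promptly
        total += len(chunk)
    return total


# Demonstration with in-memory streams standing in for stdin and the COM port.
demo_in = io.BytesIO(b"a sequence that is longer than ten characters")
demo_out = io.BytesIO()
copied = forward(demo_in, demo_out)
```

In real use src would be something like sys.stdin.buffer and dst a serial device opened for binary writing, with the port speed and parity configured beforehand; those details depend on the machine and are left out here.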
The software solution is a3x, an equivalent to the a2x program used to connect DragonDictate to workstations using the X graphical interface. The program was written by Nelson Sproul (firstname.lastname@example.org), and works as follows: given DragonDictate running on PC #1 under DOS, and the MS Access Pack's SerialKeys (or axis, also by Nelson) running on PC #2 under Windows or Windows NT, it allows PC #1 to control PC #2 over a serial line. Here's what Nelson says about axis:
axis: a freeware replacement for Microsoft handicap access serial keys
This software is a stand-alone replacement for the serial keys functionality of the Microsoft handicap access pack. This functionality is required to allow an NT host to be controlled via a serial line, instead of by the keyboard and mouse. This software is required by a3x, a package which supports a2x-style control of an NT box by another PC running dos. I wrote this software because the Microsoft version, while working quite nicely for me under Windows NT 3.50, did not work under NT 3.51 (or windows 95, for that matter). After a half dozen phone calls failed to get me any support whatsoever, I gave up and put together my own version.
Send reports of problems to email@example.com.
a3x is a program which accepts keystrokes as input and generates output over a serial line to control a second PC running Windows.
The following elements are required for DOS PC #1 to control Windows PC #2:
1. PC #1
   a. DOS, with DragonDictate
   b. a3x for DOS
   c. com1 port attached to null modem cable
2. PC #2
   a. Windows
   b. MS Handicapped Access Pack for Windows
   c. com1 port attached to the other end of the null modem cable
This program's key handling interface on PC #1 is similar to what a2x expects. Its mouse control is governed by a different, simpler, less sophisticated scheme.
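To make the idea of turning keystrokes into serial-line output more concrete, here is a rough sketch in Python. It is not a3x's code. The key names and the escape format (ESC, a comma, the key name, a period) are placeholders chosen for illustration and should be checked against the Access Pack or a3x documentation before being relied on. The sketch only shows the general shape: printable characters are sent as themselves, and other keys are sent as a short "sentence".

```python
ESC = "\x1b"

# Hypothetical key names; the real names are defined by the receiving software.
NAMED_KEYS = {"enter", "tab", "escape", "f1", "pageup", "pagedown"}


def encode_key(key):
    """Return the text to send over the serial line for one key event."""
    if len(key) == 1 and key.isprintable():
        return key                      # ordinary characters are sent as themselves
    if key.lower() in NAMED_KEYS:
        return f"{ESC},{key.lower()}."  # placeholder "sentence" for a non-printing key
    raise ValueError(f"no encoding known for key {key!r}")


def encode_keys(keys):
    """Encode a sequence of key events into one string for the serial line."""
    return "".join(encode_key(k) for k in keys)


# Example: the keystrokes for typing "ls" followed by the Enter key.
example = encode_keys(["l", "s", "Enter"])
```

Raising an error for unknown keys, rather than silently dropping them, is a deliberate choice in this sketch: a voice macro that loses a keystroke is much harder to debug than one that fails loudly.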
Advantages versus DragonDictate for Windows:
1) works on Windows NT (as well as Windows for Workgroups)
2) offloads voice processing burden to another machine
Here is the a3x distribution with documentation, kindly provided by Nelson. Needless to say Nelson would appreciate any bug reports and positive feedback.
Troubleshooting
From: Scott Jangro
>>>>>>>>>>>>>>> has anyone used PC XWare with dragon? so far it looks like it might have problems... . <<<<<<<<<<<<<<< We have heard of problems between DragonDictate for Windows and PC-Xware and I have followed up with the publisher of PC-Xware. Unfortunately, they have confirmed that their PC-Xware will not be compatible with DragonDictate for Windows. Further explanation below for those interested... As you may know, DragonDictate inserts keystrokes into the active Windows application. DragonDictate uses the same Windows facility as the Windows Macro recorder does to play keystrokes into a Windows application. Any Windows application that follows the Windows application development guidelines will be able to get input from the Macro recorder and from DragonDictate for Windows. PC-Xware doesn't follow the Windows conventions for getting keystrokes. Instead, it goes down to the DOS level, below Windows, to get the keys. Therefore, they miss the keystrokes that DragonDictate for Windows is sending. They also told me that they're aware that they do not work with the Windows macro recorder. I wish there was something that we could do to make DragonDictate for Windows work with this application as it looks like a great solution for X/Windows users. They didn't say if they were planning on changing their application to conform to these Windows standards or not. If anybody does come across a Windows X/Windows emulation, I'd be very interested to hear about it. From: firstname.lastname@example.org (Chris Daft) Subject: Dragondictate and PC-Xware This isn't specifically to do with a2x, but I thought people using PC's and X window machines would be interested in this response from NCD's technical support on the incompatibility between DragonDictate and PC-Xware. Yes, it is true that the reason Dragon Dictate does not work is our fault. The engineers are supposed to be looking into changing this, but it is unlikely that it will be fixed soon. 
Also, we do not have a scheduled date for this fix. -Stephen Peters
PC-Xware is in other regards a very nice way of making an X window connection through a serial line or a LAN, from a PC. If you care about this, you might want to let them know (email@example.com) that they need to fix the problem.
From: Jeff DelPapa
I used Hummingbird's eXceed with DD/win. Worked fine. I will also give the DEC (eXcursion I think) a try (it is what the UK branch standardized on), when I am in the mood for shuffling floppies again.
From: Scott@ccgate.dragonsys.com
Subject: re: what do I need to run DragonDictate/a2x? -Reply
>>>>>>>>>>>>>>>
Has anyone had any trouble using DragonDictate and the Novell ethernet software? It worked fine for me under DragonDictate V1, but with V2(DOS) it is extremely unreliable causing DragonDictate to crash with some DOS error like:
Error General Protection Fault in DD30K.EXE at 00A8:DD50k
code=0000 ss=00D8 ds=00E8 es=0000 ax=06D4 bx=0130 cx=0011 dx=0008 sp=0604 bp=0614 si=88BE di=06D4
written over the top of a DD pop up box, whose only visible line is "Say OK to continue", presumably above it said: "An error has occurred".
<<<<<<<<<<<<<<<
Terry, You should definitely be able to get DD2 running in conjunction with a Novell Ethernet network. I've been doing it myself for two years. Now, having said those words of encouragement... The bad news is that countless things can cause a G.P. Fault. It basically means that something is stepping on something else in memory and the program that is running (usually DD because the DOS extender we use is usually the program that detects the problem) throws this error instead of hanging or otherwise going off into computer-never-never land. The good news is that this type of error is usually caused by a hardware conflict or a memory conflict. Here are a few things you can try...
HARDWARE: When you moved to Version 2 of DD, you changed hardware.
The ACPA card interacts differently with network cards than the old Dragon card did. We have found that you may need to set the ACPA card's IRQ level and IO address level LOWER than those of the ethernet card. So for example, if you have your ethernet card on IRQ3 and your ACPA card on IRQ 5, you may have problems. Switch them around if you can. The ACPA card manual has alternative settings for you to try. The manual doesn't indicate IO addresses lower than 310 but you can physically set the card lower than that. If you are interested in how to set the ACPA card's IO address into the 200s, let me know. My hardware settings that work well are:
ACPA IRQ3; IO 310 (Defaults)
Net IRQ5; IO 360
MEMORY: Try to clean out the config.sys and autoexec.bat files as much as possible. Some EMM386 settings can contribute to a memory conflict. A safe EMM386 configuration, with DOS 6 is:
DEVICE=C:\DOS\EMM386.EXE NOEMS
Includes and eXcludes can cause some problems. Also, remove as many device drivers and TSRs as possible. The most thorough way to troubleshoot is to REM out everything and see if that helps. Then put things back one at a time to find the offending program or settings.
From: Scott@ccgate.dragonsys.com
Subject: Deleting words
>>>>>>>>>>>>>>>
From: Simon.Crosby@cl.cam.ac.uk
I'm fed up with the ", (numeric)" version of the comma in DragonDictate. I've tried deleting it from the vocabulary, but it refuses to disappear. I'm using DragonDictate version 3.01 classic. Anybody got any bright ideas?
<<<<<<<<<<<<<<<
The reason that there are some words that you cannot delete is really because of an oversight during the design of the product. There are some groups of words (I think the same groups that you see in the train menu) that the delete function just cannot get to. No good reason, you just can't do it. When you delete one of these, it may appear that you're deleting it but you aren't.
You have perfectly valid reasons for wanting to do this and I have passed your comments on to the developers. Thanks for voicing your feedback. I have seen some good suggestions pass by here and I hope that they are acceptable workarounds for now. BTW, to set the record straight -- Simon is right, DragonDictate is not neural-net based, it is based on HMMs (Hidden Markov Models).
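The hardware rule of thumb from the Novell reply earlier in this section (keep the ACPA card's IRQ and IO address lower than the Ethernet card's) is easy to check mechanically. The following Python sketch is only an illustration: the function name is made up, and it assumes the IO addresses quoted in the posting (310, 360) are hexadecimal, as is usual for ISA cards.

```python
def check_acpa_settings(acpa_irq, acpa_io, net_irq, net_io):
    """Return a list of warnings; an empty list means the rule of thumb is met."""
    warnings = []
    if acpa_irq >= net_irq:
        warnings.append(
            f"ACPA IRQ {acpa_irq} is not lower than Ethernet IRQ {net_irq}"
        )
    if acpa_io >= net_io:
        warnings.append(
            f"ACPA IO {acpa_io:X} is not lower than Ethernet IO {net_io:X}"
        )
    return warnings


# Settings reported as working in the posting (IO addresses assumed to be hex).
good = check_acpa_settings(acpa_irq=3, acpa_io=0x310, net_irq=5, net_io=0x360)

# The problem case from the posting: Ethernet on IRQ3, ACPA on IRQ5.
# The IO addresses here are simply the same pair swapped, for illustration.
bad = check_acpa_settings(acpa_irq=5, acpa_io=0x360, net_irq=3, net_io=0x310)
```

Passing this check does not guarantee a conflict-free machine; the posting itself only says the ordering "may" matter, and other cards or memory managers can still cause a G.P. Fault.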
Do you get hoarse from using DragonDictate?
From: firstname.lastname@example.org (Robert B Dunn)
I am looking for some advice on using DragonDictate. The problem I have (which I think is fairly common) is that I get hoarse fairly quickly. Unfortunately this happens so quickly that I have not been able to make much use of DragonDictate and a2x, although I have had them for a few months. I now frequently get hoarse during normal conversations and occasionally have to stop talking to people. I use the following strategies to reduce the stress of using DragonDictate: 1) speak softly, 2) drink very frequently.
The following summary of a discussion of hoarseness was made by Scott@ccgate.dragonsys.com (Scott Jangro).
++++++++Original E-mail Posting+++++++++++++++++++++++++++++++++++++++++++++ I have been reading peoples' summaries of their experience with voice input for performing programming and related tasks, since I have considered use of this technology myself for a worsening RSI problem. It all sounds quite encouraging. However, at this time I have a few questions/concerns perhaps someone can address: 1) have the users of voice input at work found that their talking disturbs their officemates? Has anyone needed to arrange for having an office to themselves? It seems that when office space is limited this may pose a problem. 2) has anyone experienced hoarseness or similar problems attributable to the increased talking? Is it possible we may be partially substituting one problem for another? 3) speaking of hoarseness, how do Dragon and the other packages perform when the user has hoarseness due to voice strain or a cold? Does recognition suffer significantly? Any insight on these questions would be appreciated. Thanks! +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1) have the users of voice input at work found that their talking disturbs their officemates? Has anyone needed to arrange for having an office to themselves? It seems that when office space is limited this may pose a problem. ----------------------------- this can be a problem. 2) has anyone experienced hoarseness or similar problems attributable to the increased talking? Is it possible we may be partially substituting one problem for another? --------------------------- No problems... 3) speaking of hoarseness, how do Dragon and the other packages perform when the user has hoarseness due to voice strain or a cold? Does recognition suffer significantly? ------------------------------- I've found that if I already have a sore throat, working by voice can aggravate it. However, I have not had recognition problems even when *completely* stopped up (as in runny nose, congestion, etc.). 
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >their officemates? Has anyone needed to arrange for having an office to >themselves? It seems that when office space is limited this may pose a >problem. I work in a cubicle. Most of the people around me say that my talking is "white noise" that doesn't bother them. Apparently one person has complained, and as a result I may be getting a private office (which I wouldn't normally rate) when my group moves in a month or two. If this happens I'm certainly not going to complain :-) >2) has anyone experienced hoarseness or similar problems attributable >to the increased talking? Is it possible we may be partially substituting one >problem for another? I had really bad voice strain the first day or two, but I changed the way I was speaking and the problem went away. I had taken a singing class and been taught how to project without straining my voice. When I used that technique I stopped having voice strain. >3) speaking of hoarseness, how do Dragon and the other packages perform when >the user has hoarseness due to voice strain or a cold? Does recognition >suffer significantly? I have allergies and my level of congestion varies a lot. Mostly it's not a problem. Sometimes on a particularly bad day I notice the recognition being a bit worse, but it's not a real big difference. I have two frequently used macros, [sniffle] and [clear throat], which don't type anything. They work really well! Here's the mail I sent in response to the original inquiry, since people other than that person seem to be interested: I'm a Senior Software Engineer at Sybase. I do new feature development on the SQL Server, which is our core product. I investigate problems, write specifications documents, write code, and fix bugs. My work is in C in a Unix environment. I use DragonDictate all day long. I can't type at all without pain, so I do everything with DragonDictate. 
I think that the effect on my programming productivity is minimal; i.e. I get pretty much the same amount of work done with DragonDictate as I used to get done typing. It's a little more of an impediment when writing a document; I create documents somewhat more slowly than before. However, I'm perfectly capable of doing what's needed for my job. I was a very fast typist; DragonDictate makes me more like a fairly slow typist. There are plenty of engineers here who are not super fast typists who do a good job. My company expects the same amount of work from me that they expect from other people, and this has not been a problem. There was a pretty significant amount of time required to get going with DragonDictate in a Unix programming environment. Getting the various pieces of software and hardware working together correctly took some doing; getting macros and such set up took some doing; getting used to using DragonDictate took some doing. I spent a fair amount of time after hours working on my environment for the first month or two. During that period of time, I was doing my job using DragonDictate but wasn't as productive as I am now with it. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1) have the users of voice input at work found that their talking disturbs their officemates? Has anyone needed to arrange for having an office to themselves? It seems that when office space is limited this may pose a problem. > > > I have found that I can dictate pretty softly, and it seems to work better that way anyhow. Outside noise doesn't seem to disturb DragonDictate much, though. Classical music is just fine, as are conversations in the hallway, but it doesn't seem to like Devo music! As for disturbing my neighbors, as long as I talk relatively softly, that doesn't seem to be a problem (no worse than having a neighbor who spends all day on the phone, which is normal for some environments). 
I think privacy may be more of an issue: you need to realize that now people can hear you writing your email! (Already a problem with phone conversations). 2) has anyone experienced hoarseness or similar problems attributable to the increased talking? Is it possible we may be partially substituting one problem for another? > > > Yes, hoarseness is a problem. Much better when I talk softly, which is desirable anyway. I also drink a lot of water and tea. You should treat it like typing -- don't do it for hours without stopping once in a while for a break. When I first got the system, my head, neck, and shoulders bothered me more, until I realized that I was essentially shouting at the system (like teaching a class without a microphone and trying to project all the time). This is less of a problem now that I've learned that it works just fine with soft speech. 3) speaking of hoarseness, how do Dragon and the other packages perform when the user has hoarseness due to voice strain or a cold? Does recognition suffer significantly? > > > Yes, it does suffer. It's really fun to dictate while crying! :-( You need to be sure not to save your voice files after such a session so as not to screw it up. It does still work, though. Any insight on these questions would be appreciated. Thanks! ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ hoarseness has not been a problem for me, even when I was pulling a lot of all-nighters last summer/fall. the following will help keep you going longer: 1) sip water frequently, and don't wait for your throat to get dry before you get something. 2) speak softly and steadily 3) avoid drinks with caffeine or the harsher pops. (I don't know what the mechanism is, but avoiding caffeine was suggested by a friend who is a singer. I did find that my throat would feel dryer and more irritated if I had had a lot of pop and I had been working at the computer.) There are many professions where people use their voice all day. 
It's good to get suggestions from them. We have an advantage, because our voices must only carry to the microphone. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1) have the users of voice input at work found that their talking disturbs their officemates? Has anyone needed to arrange for having an office to themselves? It seems that when office space is limited this may pose a problem. DragonDictate is sensitive to noise, particularly to drawers and binders closing; although you can train an empty macro to these sounds, it does interrupt and slow you down. I think a private office is important; don't be afraid to mention the Americans with Disabilities Act. 2) has anyone experienced hoarseness or similar problems attributable to the increased talking? Is it possible we may be partially substituting one problem for another? I sip water all day and have had problems with hoarseness only when I let my cup stay empty. Drinking water also forces me to take breaks to go to the restroom. 3) speaking of hoarseness, how do Dragon and the other packages perform when the user has hoarseness due to voice strain or a cold? Does recognition suffer significantly? DragonDictate works surprisingly well when my voice sounds different from normal. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 2) has anyone experienced hoarseness or similar problems attributable > to the increased talking? Is it possible we may be partially substituting one > problem for another? I am responding to this question because I have had a particularly bad experience in this area. I began using Dragon extensively at the beginning of this year. After a couple of months, I noticed my throat was getting sore each time I used it. Being a performer in amateur theatre, I found that I was having to quit using Dragon by Wednesdays so that my throat could recover for a weekend performance. 
(In a dozen years of singing and acting, I have never experienced voice problems from performing.) The problems persisted and worsened. I asked our medical department if a voice therapist could be brought in to see what I was doing wrong. Nine weeks of aggravating bureaucracy later, I saw an ENT (ear, nose, throat) doctor who told me I had nodules on my vocal cords. I immediately stopped using the system. While the nodules have since faded away, I continue to have pain in my throat from speaking (regular speaking). So now I can neither type nor talk comfortably. There is one other person here that uses Dragon, and though he is able to use it, he, too, had some much less severe throat problems along the way. Even now, if his allergies act up, he has some difficulty. I am glad to hear that others have not had much problem with this, but be advised that it can occur. This possibility is also noted in the best book on RSI we have found so far, on p. 168 (RSI A Computer User's Guide, by Pascarelli and Quilter). From: email@example.com (Duncan A. Buell) Subject: hoarseness and speech coaches I realize that the discussion about hoarseness has already come and gone, but I thought I would add my comments, since they may help people. I had been suffering from a chronic irritated throat from speaking to Dragon, so as well as seeing a doctor and getting medical treatment, I spent about 10 weeks seeing a voice coach. My coach is a choral music conductor, has 10 years experience as a public radio announcer, and teaches voice and public speaking privately and in classes. Here is the summary of my experience with her; I think this is highly relevant since I can now speak at great length to Dragon without suffering irritation in my throat. I found that once I recognized what I was doing that was causing the problems, it wasn't all that difficult to change my habits. 
Although it is desirable to change habits in everything regarding speaking, I found that the relatively controlled environment of speaking to Dragon made it easy to practice what I was supposed to be doing. And since speaking in general had never given me problems (I spent 15 years teaching), fixing the irritation caused by speaking to Dragon solved nearly all my problems. Two things were paramount in my case. First, I wasn't getting enough air, and second, I was ``squeezing'' my words out. This is a different way of stating the comments made in Pascarelli's book about shallow breathing. First things first: air. Don't breathe from the top of your chest but from your abdomen. Don't tense up. Treat your body as the instrument it must be -- stretch your arms up and rotate your shoulders before you sit down for a session. Roll your head to stretch your neck and jaw muscles. Singers would do this before a practice or performance, and athletes do this before exercise. You should do the same. Posture is important to being able to breathe properly and get enough air. Sit up; don't lean forward. When you lean forward you close off the abdominal part of your breathing apparatus and cannot help but breathe shallowly from your upper chest. Better yet -- stand. I have the good fortune to be able both at work and at home to stand when I dictate to Dragon. At work I have both the Dragon computer and my workstation elevated (using the simplest of support materials -- unopened reams of paper) and a copy holder coming up from between the two machines. This allows me to stand and get plenty of air, as well as to fidget, reach both keyboards and my cup of water, and even stretch while speaking. I had been speaking with short, shallow bursts of air. This was less of a problem with dictating ordinary text than with dictating computer commands, programming constructs, or the detailed stuff of mathematical equations in TeX, since ordinary text has longer words and has a certain flow to it. 
My coach listened to me speaking and suggested I was probably starting my utterances with my vocal cords closed and then requiring that they produce sound from that position. Most of the changes I have made have been to try to change that. One key to not squeezing the sound is to speak words by letting the air start flowing first and then starting the sound. "Speak on a sigh" my coach says. I now speak words like "one", "as", "of", "and", and "in" much more as if I were singing them. Think of reaching for a high note to sing such a word. Open your throat and behind your palate, start the air going and then let the sound come. Very gently aspirate an "h" before such a word. Linger on the vowel to round it out. I do find that Dragon has a harder time distinguishing "as" from "has" (but not the other way around), but aside from that there seem to be no problems. The choice list is a little different from before ("Juan" now often appears when I say "one", but it's never the first choice). Another problem I had was my Midwestern "a"; too much nasal raspiness. Words like "as", "has", "that", were hard on my throat. I am much more careful to be more expansive with such words. I have even at times resorted to chanting to Dragon instead of speaking because it is so much more natural for me to sing with rounder, more open tones than to speak thus. Dragon has never seemed to notice at all that I have begun chanting instead of speaking. The bottom lines: posture and correct breathing to get enough air, and making sure to use enough of that air when you are speaking.