The DragonDictate / Speech Recognition FAQ

Currently maintained by Simon Crosby entirely by voice, using DragonDictate and a2x.

A related FAQ, the a2x FAQ, contains information about using DragonDictate with the a2x program, which allows you to connect a PC running DragonDictate to a workstation running the X Window System and control the workstation's graphical display and applications. For more information on speech recognition technology and research, try the comp.speech archive.

These pages are by now a little rusty. For a start, Dragon Systems have excellent on-line advice and their own FAQ pages. I will no longer try to keep my copy of the DragonDictate FAQ up to date (Apr 96) because (1) I don't work for them, (2) they can do it themselves very well, and (3) a2x is what this is really all about, i.e. getting voice users to be able to control X-based workstations (though a3x lets you control NT; see below). There are loads of other voice products out there now, for example Kurzweil Voice, Articulate Systems, and IBM VoiceType.

These FAQs are direct copies (shortened where possible) of postings to the a2x mailing list, the SOREHAND mailing list, and the C+HEALTH mailing list. Send contributions/corrections/updates/ideas to me.

Collections of DragonDictate macros, contributed mostly by a2x users, are linked from the a2x FAQ. They cover a wide range of tasks, including programming in various languages and using word processing programs, LaTeX, Emacs, vi, and other applications.

The macros and ASCII versions of the FAQs are also available by ftp. These will be updated periodically to reflect the current status and contents of these web pages.



What is DragonDictate?

DragonDictate is a speech recognition system which runs on a PC. It is a discrete utterance system, which means that you have to pause between words. It runs under DOS or Windows. DragonDictate has a large English vocabulary, and also allows the definition of macros, which are basically spoken phrases with associated sets of keystrokes. For example, to control your editor you might define a macro "open file", which presses the magic key sequence to load a file into your editor. Although DragonDictate is PC based, it is possible to direct its output to a sensible machine, such as a workstation, by putting an Ethernet or serial interface onto the PC, and using a telnet program to send the keystrokes from DragonDictate to the workstation. A piece of public domain software called a2x allows you to convert ASCII keystrokes into events on an X display, so it is possible to "talk to X".
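
The idea of a macro as a spoken phrase bound to keystrokes can be sketched in a few lines of Python. This is only an illustration of the concept, not DragonDictate's macro format: the table name, the function name, and the key sequences (Emacs-style control characters) are all assumptions chosen for the example.

```python
# Hypothetical macro table: spoken phrase -> keystrokes to send.
# The key sequences are made-up examples, not DragonDictate defaults.
MACROS = {
    "open file": "\x18\x06",   # e.g. Ctrl-X Ctrl-F in an Emacs-style editor
    "save file": "\x18\x13",   # e.g. Ctrl-X Ctrl-S
}


def expand(utterances, macros=MACROS):
    """Turn a list of recognized utterances into one keystroke string.

    Utterances that match a macro are replaced by its keystrokes;
    anything else is treated as a dictated word followed by a space.
    """
    out = []
    for utterance in utterances:
        if utterance in macros:
            out.append(macros[utterance])
        else:
            out.append(utterance + " ")
    return "".join(out)


if __name__ == "__main__":
    # The resulting string is what would be sent down the telnet or
    # serial link for a program such as a2x to turn into X events.
    print(repr(expand(["open file", "notes.txt"])))
```

Keeping the table as plain data mirrors the advice given later in this FAQ about keeping macros in files, where they are easy to edit and convert.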

Voice Users Mailing Lists

Try reading comp.speech if you're interested in the technical stuff, and the a2x mailing list for a2x technical details. For more general discussion of the use of speech recognition, subscribe to the voice-users mailing list:

Announcing: The voice-users mailing list.  This list is for discussing
all aspects of using voice recognition input systems; DragonDictate,
Kurzweil Voice, IBM Voicetype, IN3, and others.  Sample topics might
include:

- Using such systems safely, without muscle or voice strain
- Techniques for improving recognition accuracy
- How to set up the physical voice workstation optimally
- General tips for effective use of voice interfaces
- Configuration of specific systems, troubleshooting, etc


To subscribe, send mail to:

        voice-users-request@lists.uchicago.edu

with subject line "subscribe" (without the quotes).  Random
administrivia should also be directed to the request address.  Should
you need to reach the list-maintainer non-electronically for some
reason (limited ability to use a computer, for instance), his current
work number is (312) 702-7142.

Posts to the list should go to:

        voice-users@lists.uchicago.edu



Other Speech Recognition Products


From: "Carlos M. Puig" 
Date: Wed, 30 Nov 1994 20:46:01 -0800

     The current issue of PC Magazine (December 20, 1994) has a major
review section on voice recognition products (pages 203-219).  The
following products are covered in detail:
 
     o Dragon Dictate for Windows
     o IBM Personal Dictation System
     o IBM Continuous Speech Series
     o Kurzweil Voice for Windows
     o Listen for Windows
     o Phonetic Engine 500 Speech Recognition
 
In addition, there is a table of "Other Voice-Recognition Products" (p.
209) and a sidebar on "Navigation in a Mouseless World" (pp. 212-13)
covering briefly:
 
     o Voice Assist (bundled with the Sound Blaster 16)
     o VoiceMouse
     o IBM Navigation Product (name not finalized)
     o QuickSwitch for OS/2
 
This time there is no editor's choice:
 
"While voice-recognition technology for the PC has finally advanced enough
to yield productive tools, we feel that it's still too early in the game to
pick a clear winner in any of the three categories we examined
[dictation, navigators, and application development]."



DragonDictate support on the net

DragonDictate support is now available over the internet at support@dragonsys.com



How Easy is it to use DragonDictate?


From: Dana Bergen 

>I had a demo of DragonDictate today. I have a couple of questions,
>though. How easy is it to install and learn? The reseller wants what I
>consider a hefty chunk of money ($250 to install + ($250 x 2 for two
>4-hour training sessions) = $750 total) to install and train me on it.
 
This is insane.  You don't need anyone to train you on it.  It comes with
a tutorial and it's not difficult to use.  The "more advanced" features
like creating macros and using dcoms are reasonably well explained in
the manual.
 
>I'm pretty comfortable installing things in my PC (I put in a sound
>card and a tape backup unit myself), but he was talking about
>configuring interrupts, which I know nothing about.
 
I didn't install the card myself because my hands aren't up to using
a screwdriver but I don't think it's any big deal.  I think that
configuring the interrupt is just a matter of setting a switch on
the board.  If you're on a network there are some additional issues.
If you're concerned about this you might try negotiating a much lower
price for him to just do the install.  Better yet, try it yourself
and only pay for help if you get stuck.
 
>following a tutorial and reading a manual. On the other hand, I want
>DD to work, and I want to feel comfortable using it, otherwise it's a
>waste of $$$.
 
With respect to the installation, it will either work or not work.  It's
not going to work badly or differently because you installed it wrong.
 
I think your reseller is trying to recoup the money he's not making
since they cut the price.



Who can use DD?

Many people are concerned that they will be unable to use DD. In my experience it can help almost everyone. Here's just one testimonial:

Date:    Mon, 20 Mar 1995 08:54:13 -0500
From:    croom 

My DD for Windows is getting me through Law School.  With my multiple
RSI's, I have to tape record my classes and library research, and then
transcribe them using DD.  I do most of my research online to avoid
handling large law books.  I have written 50-100 page papers on DD and I
take the computer to school to take my 3 hr final exams.  I don't even
need more time to take tests than the other students.  The software cost
$2000 last August (upgraded to Windows in October without charge).
  I have the Power edition which includes legal and medical terminology.
I realize I sound like a commercial, but I lost my job and my former
career due to RSI (still in litigation) but DD has given me a 2d chance.

DD can also be used for programming.



Working with DD in a Shared Environment

Many people are curious how you deal with "eavesdroppers" when using DD. Do you have to have your own office? What if you're working in a shared environment, e.g. a cubicle, possibly with cubemates?

From:    Jeff DelPapa 
  
At my employer, a private office is now policy for dragon users (there
are 3 of us now).  Being in the same room with a dragon user is
_WORSE_ than with a full time phone user, as the pauses trigger the
"start of conversation" break... (most people can tune out continuous
speech, the discrete speech drives people nuts.)  This is recognized
by "both sides" -- I did start out in a cube, and as soon as we moved
into new digs, I got an office, with storage spaces on 3 sides.
Whenever the subject of cubes as a space crunch cure arises, the
others that remember me in the next cube say "give him an office, we
won't be jealous"
 
From:    Dana Bergen 
 
I work in a cubicle, and most of the people around me say it doesn't bother
them.  One of my neighbors has complained, however, so I'm going to get a
private office when our group moves.  That move keeps getting postponed,
however, and no one seems to consider it urgent that I be moved.
 
I think sharing a cubicle or office would be intolerable for the other
person, though.
 
Other people's ordinary noise and conversation doesn't cause me problems.
Dragon only picks up sharp loud noises like a door slamming or construction
sounds, and these are generally not mistaken for words so it doesn't cause
problems either.
 
Sending email to friends about private matters -- now *that's* a problem
I don't have a solution for!

From:    "Eric S. Johansson" 
 
>Sending email to friends about private matters -- now *that's* a problem
>I don't have a solution for!
 
1: have an email account at home with dd
2: train a "privacy" vocabulary/cypher
 
From:    Ned 
 
I work in a busy newsroom at a daily newspaper. I have two "pod mates"
and a steady stream of coworkers who frequent my little area. My dictation
is no louder than a normal telephone conversation, and my fellow workers
tell me they don't even notice me babbling to myself anymore. :-}



DragonDictate Upgrades v2 to v3


From: taylorw@marie.mit.edu (Washington Taylor)

Joe Steffen writes:

   I moved my DragonDictate 2.0 voice files to a PC with DragonDictate 3.0 and
   converted them with the modvoc program that's used for upgrading to 3.0,
   then used DragonDictate in my normal work for a few hours.  Modvoc copies
   all your words and macros and preserves your voice training, but it still
   has a few problems:

   1.  Word punctuation attributes are not preserved, both for words you added
   and special character punctuation you changed, e.g. I changed "*" to cling
   left and right so it's easy to use in file name and regular expressions.

Do you keep most of your macros in files?  If you do, it is easier to
keep track of and edit them.  Furthermore, when you upgrade you can
simply edit those files to the new format and fix punctuation
automatically.  I haven't upgraded to 3 yet, but even upgrading from 1
to 2 it only took a few minutes to write emacs macros to update my
macro files.

   2.  Trailing spaces on words are removed.  This is a problem for me because
   I define most computer command names to have a trailing space and cling
   right so I can say "*" after them.

Again, this problem would be fixed by using macro files.  If the
internal representation changed, you could use a command like
add-word /t /g "foo" "foo " r
(version 2 syntax)

   I went through the first part of my vocabulary online and counted over 170
   words such as file suffixes like ".bat" and parts of file paths such as
   "../".  I estimate I have over 200 such words and special characters with
   changed punctuation attributes, so unless modvoc is fixed I'm not spending
   the money to upgrade.  Note that there is no way to dump words in
   DragonDictate 2.0 so all of the above has to be corrected by voice.

Again, using macro files would eliminate this problem.

   I had the same error rate (3%) as with DragonDictate 2.0.  Has anyone seen
   an improvement in recognition accuracy after upgrading from 2.0 to 3.0?

I had 98% recognition with version 1.  I have 98% recognition with 2.
I expect to have the same with 3 and the windows version unless they
have radically improved something.   The only difference was the time
to reach the plateau, which was months for 1 and days for 2.
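
As a rough illustration of the macro-file approach suggested above, the following Python sketch generates one add-word line per command name, each with a trailing space in the written form. It copies the shape of the version 2 example quoted in the post verbatim; the meaning of the /t and /g flags and of the final "r" is not explained here, so treat the template as an assumption to check against the DragonDictate manual. The function names are hypothetical.

```python
def add_word_line(word, trailing_space=True):
    """Format one line in the shape quoted in the post:

        add-word /t /g "foo" "foo " r

    The flags and the final 'r' are copied verbatim from that example;
    their exact meaning is an assumption left to the DragonDictate manual.
    """
    written = word + (" " if trailing_space else "")
    return 'add-word /t /g "{}" "{}" r'.format(word, written)


def build_macro_file(words):
    """Return macro-file text with one add-word line per word."""
    return "\n".join(add_word_line(w) for w in words) + "\n"


if __name__ == "__main__":
    # A few UNIX command names of the kind mentioned in the post.
    print(build_macro_file(["ls", "grep", "chmod"]), end="")
```

With the word list kept in a file like this, a format change between versions becomes a matter of editing the template and regenerating, rather than correcting some 200 entries by voice.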

> From:    "Susan Maller"

> Has anyone out there upgraded from Dragon 2.0 or 2.01 to Dragon 3.0?  How
> do you like it?  Do you notice a significant difference?  PLEASE REPLY.
> Susan Maller
> maller@madonna.coedu.usf.edu

Yes,  I'm on 3.0 Classic - still with a 30K vocabulary.  I found that
recognition has improved, particularly for words like "last" "cast"
"castle" in which the British flat "a" caused a problem.  I no longer
have to try to sound like an American on these :-)

Also slightly faster, I think, but that is subjective and I have no
proof.   So, no significant difference but I'm happy with the
upgrade. 



What Sound Cards Can I Use?

DragonDictate for DOS uses the IBM M-Audio Capture and Playback Adapter (M-ACPA), which is an ISA-based card. This card digitizes the audio and performs FFTs on the digitized samples to assist recognition and reduce the CPU load. Read on for further information about these cards and the cards used by the Windows version...

From: Scott@ccgate.dragonsys.com

>Also, (forgive my ignorance being a non-PC oriented person) could
>someone tell me whether there is a PCMCIA, or PCI  M-ACPA card
>available, or a similar card which DD can use ?

There is not a PCMCIA version of the M-ACPA card.  At this time,
DragonDictate for DOS can only use this card.  As you may have
gathered from reading the mailing lists, DragonDictate for Windows can
operate on a Windows sound card like a SoundBlaster 16, the
MediaVision PAS16 and the Microsoft Windows Sound System to name a
few.

From: Donald Hermes 
Subject: Re: SoundBlaster settings

From my experience with it, the SoundBlaster seems to need more volume
than the MACPA  card. For what it's worth it seemed to have more of
a 'typeahead' buffer than the MACPA card.
Make sure the automatic gain control is off. That needs to be set
from the recording settings in the mixer control. I had my microphone
volume set at about 4/5th of the way up.

	From: carroll@research.att.com (Martin Carroll)
	Subject: SoundBlaster settings

	For the past few months, I had been using a SM10A Shure microphone
	with a MACPA card, and DragonDictate picked up my soft speaking
	just fine.  I just switched to a SoundBlaster 16 card, and now,
	no matter what settings I specify in the audio mixer tool, I cannot
	get DragonDictate to consistently hear every softly spoken word.
	Sometimes I have to downright shout to get DragonDictate to hear a
	word.



DragonDictate for Windows or DOS?


From: Jack 

>> but in my opinion, the technology just isn't there to
>>give you both adequate performance and to also have the DragonDictate menu pop
>>up on the workstation screen.

>I'm puzzled by this remark.  I have the DragonDictate menu on my workstation
>screen, and the performance seems quite adequate to me. (The accuracy, on the
>other hand...but that's a different topic.)  Do you mean that
>you experienced a noticeable delay between saying a word and seeing it
>typed, or that you had to wait for menus to be drawn?  This has not been
>my experience.  Or do you mean that you can speak more quickly if the
>menu is not exported?


What I mean is the following.  I regularly use Dragon to run an X display.
The way I'm using it is to have a PC monitor next to my workstation monitor.
After using this for about 1 1/2 years or so, (the point being that I had
become accustomed to a certain level of performance) I saw a demo of the
Windows product.  Point blank the performance was visibly worse than what
I am used to.  I surmised that the reason for this was that the Windows display
takes processing power to update, which is true.  The Dragon Systems rep
basically agreed with me, though he was surprised that I noticed.  The main
thing is, anything you ask the PC to do other than voice recognition is taking
cycles away from the recognition engine and giving them to some other task
like updating the VU meter, for example, which is constantly responding to
the volume of your voice.

Someone else said they didn't think that the Windows version was slower, and
then went on to explain that they were already using an exported menu that
appeared on their actual workstation display and explained that since the
exported menu was already taking a certain amount of processing power, that
the additional amount required to update and maintain the Windows display
was insignificant compared to what they were already putting up with.

My problem is that I could easily see that the Windows version was slower
than what I was using; since what I'm using needs to get faster, not slower,
this is not a solution I am willing to consider.  Also, since I rarely have to
look at the PC screen anyway as I just watch what happens in my X display, the
benefit of running on a single machine or of having an exported menu pop up on
my workstation (and cause the problems mentioned on this list with blotches
due to bugs in deskview/X) is of little or no use to me.  What I want is raw
performance, period.  Anything that will adversely affect that, and I could
easily tell from the demo that the Windows display would, is just not going
to make it with me.  Added to that, the Dragon rep basically admitted that it
*was* slightly slower, so I basically bailed on the Windows product.  This is
not to say that I'm not into DragonDictate; I am, it saved my career.  I just
won't do anything that will slow my scenario down.



From: steffen@iexist.att.com (Joe Steffen)

I installed DragonDictate for Windows last week.  The first problem is you
have to cancel the new user dialog and follow the README instructions to
change the voice board default to the M-ACPA board.  

The second and worst problem is you can only transfer some macros from
DragonDictate for DOS; you lose everything else: voice model training and
all added words (I have a significant number of acronyms and UNIX command
names and options).  After two calls to technical support it wasn't clear
which macros can be imported into DragonDictate for Windows; the format of
special keys like control keys may have changed, so you will need to write
a conversion script.  You can't import macros with dcom's because they are
replaced by a scripting language.

The third problem was the tutorial quit after an internal error 4 times.
After doing the quick training, the tutorial finally worked.  DragonDictate
for Windows operates differently so retraining yourself will be necessary,
e.g. you say Voice Menu instead of Voice Console.  I'm sure you can define
a compatibility macro, but the macro language documentation is on-line and
not in the printed manual, which I consider to be a problem since it is
easier for me to read a paper manual than control a help program by voice.

So now I'm considering upgrading to DragonDictate 3.0 since I will continue
to use DragonDictate for DOS for most of my work, which is on UNIX and a
Macintosh. 

From: Sebastian Seung 

I just got my DOS to Windows upgrade from Dragon Systems.  The user
interface is well done, but the documentation is not so good.  My
greatest disappointment is that it doesn't work with PC-XWARE.  I
called Dragon's technical support, but they were ignorant of the
problem.  Does anyone know the reason for the incompatibility?  Or has
anyone else found a compatible X terminal program for the PC?
 
From: Gary Shea 
Date: Tue, 15 Nov 1994 14:46:50 -0700

Howdy folks!  I finally got Lan WorkPlace going again (these Windows
apps are mighty picky...), so now I can connect to my Unix box
via the tnvt220 app and dictate into that window.
In Windows, recognition isn't exactly peppy but it's maybe
20 words a minute?  I'm guessing.
But with 24M of memory, in the tnvt220 window I only
get about 10-12 words/minute.  I say my sentence,
then sit there for 20 seconds while it gradually gets created in front
of me.

I figured it was memory when I had 16Meg, but now with 24 it's
no better!

Anyone else have this experience, or better yet anyone that's
figured out that it's something simple?  Oh, please???

Anyone that's used DDDOS and then switched to DDWIN?  Notice
any change in recognition accuracy?  What I get is just
terrible... no better than 75% in dictation mode, even when I'm
really careful about corrections.  Quite a bit better in command
mode, but I'm never there so it doesn't help much...

From: Jack 
Date: Wed, 30 Nov 1994 16:50:35 -0800

>i've got a 486 box and a sparc 20.  i've been using dragondictate for
>windows for a couple of months now, and i am *nowhere* near normal speech
>rate.  more like 15-20 words a minute i'd guess, taking corrections into
>account.

>am i doing something wrong

YES!  You're using the Windows version!

No, really though, no insult intended but it really does perform that much
worse, there was a sizable flap regarding this issue on the list a while
back, but if you think about it for a while it makes perfect sense.  The
PC processor has to manage the Windows display, thereby consuming cycles
the recognition engine could otherwise have used.

The other part of the story is that I am well acquainted with this speech
technology which also improves performance.  It won't account for the entire
discrepancy in performance though, the main bulk is just that the Windows
version is slower.  For Unix users there is little if any reason I can think
of to use the Windows version.  If you're a DOS user, maybe it's a win...




DragonDictate Tips


From: Scott@ccgate.dragonsys.com

> Sender: taylorw@marie.mit.edu
> Does anyone have a clever idea for how to move the (windows) mouse
> in ddwin using a voice macro/script?  I would really like to have a set of
> macros which either

> 1) move the mouse to one of a set of fixed points on the screen

> or

> 2) move the mouse a relative distance (like "[two inches up]").

> I tried using the scripting language to start the mouse moving, then wait,
> then stop, to achieve (2), but it gave me  an error message about not
> being able to wait in that state.

Someone in my department here at Dragon has recently written a DLL
that will allow you to do exactly what you want through the DLLCall
scripting command in DDWin.

Using it, you can move the mouse pointer to x,y coords relative to the
screen or window and also offset the placement of the mouse pointer
any number of pixels up, down, left or right.

I have to get the file from him and I will make it available to you somehow.
 I don't have an FTP server that you can connect to, maybe I can attach it
to an e-mail (I think my mailer uuencodes) or I can post it on our BBS for
you.
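
The arithmetic behind the two kinds of mouse macro asked about above (fixed points, and relative moves like "[two inches up]") can be sketched as follows. This is not the interface of the DLL mentioned in the reply, which is not documented here; the function names, parameters, and the 96 dpi figure are assumptions for illustration only.

```python
def target_position(x, y, window_origin=None, offset=(0, 0)):
    """Return absolute screen coordinates for the pointer.

    x, y           -- coordinates relative to the screen, or to the
                      window when window_origin is given
    window_origin  -- (left, top) of the window in screen pixels, or None
    offset         -- (dx, dy) in pixels; negative dy means "up"
    """
    base_x, base_y = window_origin if window_origin else (0, 0)
    return (base_x + x + offset[0], base_y + y + offset[1])


def inches_to_pixels(inches, dpi=96):
    """Convert a spoken distance such as "two inches" to pixels.

    The 96 dpi default is an assumption; real displays vary.
    """
    return round(inches * dpi)


if __name__ == "__main__":
    # "[two inches up]" from a pointer currently at (400, 300).
    dy = -inches_to_pixels(2)
    print(target_position(400, 300, offset=(0, dy)))
```

A macro for a fixed point would call target_position with constant coordinates, while a relative macro would pass the current pointer position plus an offset.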



Controlling the mouse in DragonDictate for Windows

I've never used DragonDictate for Windows, and this information comes directly from an a2x user, Wati (Washington Taylor), and Dragon Systems. The contribution is a Windows DLL which enables you to control the mouse in DragonDictate for Windows using voice, and a set of mouse control macros contributed by Wati. Here's what Scott Jangro of Dragon Systems had to say about the DLL:

Here's a zipped up version of the mouse movement DLL and macros. I've renamed them from DDHELP to DDMOUSE for descriptive purposes.

There are three files in this DDMOUSE.ZIP file,

Wati has contributed these additional macros for mouse control in DD for Windows, which make use of the DLL.



DragonDictate vs Kurzweil Voice


From: George Hu 
Subject: Kurzweil Voice

I am fortunate to have both Dragon Dictate and Kurzweil Voice for
Windows, and would like to provide some comparison.  It seems that
Dragon really dominates this mailing list, possibly due to having
multiple dealers on the alias.  While Dragon is a very good product
which has developed many features which are very useful, Kurzweil has
some advantages which everyone should examine.
 
In my opinion, Kurzweil has a much better recognition engine.  Out of
the box, Kurzweil beats a trained Dragon, and over time it still beats
Dragon.  When I use Dragon, I find it mistakes words a lot, and often
doesn't have the right word in a choice list of 10 words.  Kurzweil,
however, very rarely makes an error, when it does make an error it is
usually on a word which does sound ambiguous, and the choice list of 5
words has the correct word in the list much more often.   Dictating
this whole document, I probably made less than a dozen errors!
Kurzweil is also faster than Dragon on the exact same system.  This may
be due to Kurzweil still using a specific hardware board whereas Dragon
is running off of a Windows sound system card.  These two advantages
mean that I want to use Kurzweil, but do not look forward to using dragon.
 
Dragon has a much better set of macros and customization.  If you
intend to do programming, or other non-dictation activities, Dragon may
be the only system which can work for you.  Dragon allows you to
control the mouse and buttons by voice whereas Kurzweil does not.
Dragon can read text off of buttons and allow you to speak them whereas
Kurzweil only has a few predefined buttons you can say.  Dragon allows
you to create hierarchical macros and has sophisticated things you can
do inside macros such as control spacing and capitalization, Kurzweil does not.
 
Kurzweil is a simpler product to use.  There is no command vs. dictate
mode.  There are no flags for spacing and capitalization you have to
set before saying a word.  Kurzweil allows you to correct spacing and
capitalization afterwards.  For correcting errors in previous words,
you don't need to enter a special oops mode; you just say " backup 2"
or whatever.   Kurzweil has a manual of about 70 pages; Dragon comes
with three manuals totaling over 315 pages.  Kurzweil also comes with
many application macros which I have found easier to use than for
Dragon, although that probably varies a lot depending upon which
programs you run.
 
Eventually, Kurzweil will probably have all the features of Dragon.  It
will be difficult, however, for Dragon to change their whole engine.
Eventually, both these products will be put out to pasture by
continuous recognition systems.  Today, I think the choice is between
recognition accuracy, and features.  If you want a dictation system for
actual English dictation and moderate application control, then
Kurzweil deserves serious attention.  If you intend to do programming,
or other things which must be highly customized, then you probably need
the features of Dragon.
 
Lastly, Kurzweil is retailing at about 1000, which is a bit more than
Dragon.  Both systems can run on similar platforms, but I think
Kurzweil is faster.  Sometimes, Kurzweil can require more virtual
memory than Dragon, but both are basically memory hogs.  I run both on
a 66 megahertz 486 with 16 megabytes.

From: Gary R Noonan 
Subject: Re: Kurzweil Voice

        My approximately year long experience with the DOS version and
approximately 2 month experience with the Windows version of Kurzweil
support the comparison provided below.  The Dragon dealer was unable to make
his system perform with anywhere near the accuracy of Kurzweil when I
examined DOS systems.  The Windows version of Kurzweil is indeed very good
at voice recognition--even without training.

        I recently found a method to have Kurzweil produce mouse clicks on
selected windows.  Simply start the Windows Recorder macro system (found in
accessories window) and assign unique keystroke to the macro (such as Shift +
Alt + Ctrl + a) and make the desired mouse clicks.  Then create a macro in
Kurzweil that issues the hotkey for the Windows Recorder macro.  This
procedure allows you to have the Kurzweil system perform mouse clicks by
voice.  You can of course also create macros within individual programs and
have Kurzweil call them.  I have created numerous voice activated macros in
WordPerfect for Windows and call them from Kurzweil, often performing
complex tasks with a single voice command.



DragonDictate vs IN3


From: Dana Bergen 

>> I've been looking into IN3 and am considering buying it. However, there
>> is no demo version, and no money-back guarantee. So as far as the
>> company is concerned, I have to put down $700 sight unseen with no
>> guarantees. Is there anyone in the Boston area (I'm in Waltham) who is
>> using the product and would be willing to give me a short demo so I can
>> get a better feel for this product before plunking down $700?
 
IN3 costs $700?  Why spend $700 for a limited command set when you can
get a full-fledged dictation system (DragonDictate) for $1000?
 
I have no financial interest in this recommendation.  I use DragonDictate
and a coworker used to use IN3, and they are in two completely different
leagues.  I don't understand how IN3 can get away with charging that much
given what DragonDictate costs now.

From: Nelson Sproul 

>> I just got DD for Windows to use it with Unix.  I don't know how smart that
>> was (I'm beginning to wonder if I should have gotten IN3)...
 
you are better off with DragonDictate than in3.
 
when my hands first went bad, I wasn't aware of the DragonDictate/a2x
option.  Despite warnings from in3's customer service that I would
"go crazy" using in3 as a substitute for typing (as opposed to just
using it to handle mouse functions), I used in3 for one year, getting
barely acceptable performance with a vocabulary of about 150 items.
This vocabulary allowed me to spell words out, which was of course
an excruciatingly slow method, though one I was still grateful for
since it allowed me to continue work in this field.
 
Now, using DragonDictate, I have superior performance AND a vocabulary
of 30,000 items.  I mentioned this to an in3 rep, and all he could
say was that my Sun, a sparc ipc, is not a very strong machine.  I
don't think this accounts for a difference in performance/utility
of two orders of magnitude.  I haven't seen in3 run on a PC, but
I called Command Corp. (in3's manufacturer) today and they say in3's
performance is comparable on the two platforms.  I also asked what
was the largest vocabulary size they would expect, and I was told
"at least a couple hundred."
 
Needless to say, a vocabulary limited to hundreds of words is not
acceptable for an English dictation system for adults.  I have
never flamed anyone/anything in my life, but it appears
to me that, given the availability of DragonDictate and a2x, in3 is
a hoax.

From: Nick Parker 

Nelson Sproul  wrote:
> never flamed anyone/anything in my life, but it appears
> to me that, given the availability of DragonDictate and a2x,
> in3 is a hoax.
 
I can see how using IN3 for text dictation would be very
frustrating. I'd compare it to chopping down a tree with
fingernail clippers. It's possible, but geeeeeeeeeeeez! It's
certainly unfair to complain afterward about the design of
the fingernail clippers...
 
I've never seen INCUBED represented as a "dictation system."
It certainly isn't called that in the Command Corp marketing
literature I received, nor in the IN3 product documentation
I received when I bought the package.
 
It's all a matter of selecting the right tool for the job.
Almost all applications require a large amount of command
selection -- that's the nature of graphical user interfaces,
and feature laden software.  When you count up cuts and
pastes, font changes, saves, moves, etc, even writing a
simple text document has a LOT of command and data
selection. In the case of CAD, DTP (which is almost
indistinguishable from word processors these days), or other
similar applications, the user-computer interface is
comprised almost entirely of command and data selection. A
good voice command system, like IN3, can perform these
functions via voice, and drastically reduce the amount of
button pushing you do in a day. A continuous dictation
system might do the same, but at a much higher cost.  Both
dictation systems and command systems have their place --
it's a matter of selecting the right tool for the job.
Command systems will benefit everyone, and dictation systems
will give an additional benefit to those who need it.
 
 
I don't think IN3 is a "hoax" -- at all. It is very robust
and effective software. The last literature I saw lists the
prices at $179 for the basic version, and $395 for the Pro
version, which includes a very nice Audio Technica Pro 8
headset microphone. Not bad. And no, I don't have any
affiliation with Command Corp.  I'm just a satisfied
customer: IN3 did what they said it would do.
 
From: Ann Marie Lawler 

> IN3 costs $700?  Why spend $700 for a limited command set when you can
> get a full-fledged dictation system (DragonDictate) for $1000?
 
      Well, that's IN3 plus a top-of-the-line noise-canceling microphone.
 
      If you're on MS-Windows, IN3 doesn't cost anywhere near $700 and if
you're on a SPARC station you've got to add the cost of the PC, the sound
card, and possibly the communication connection to DD.  Maybe you've got a
free com port, maybe you don't, it's just that much more hardware.  That
plus find a place for that second box, and keyboard, and screen...  Sigh...
 
      Then you also have to get a2x running.  According to the documentation
with the copy of a2x that I have (distributed with the X11R6 sources) it
won't work with the Sun OpenLook version 3 (current release) out of the box.
You have to have X11R6 running or a patched version of X11R5 with the
XTEST extensions patched in.
 
> I have no financial interest in this recommendation.  I use DragonDictate
> and a coworker used to use IN3, and they are in two completely different
> leagues.  I don't understand how IN3 can get away with charging that much
> given what DragonDictate costs now.
 
      Well, IN3 and DD are in different leagues.  They are different
products designed to do different jobs.  One is a command system, one
is a dictation system.  I type just great.  In fact, some of my co-workers
think I type faster than I talk.  My typing was not where my problems
were starting.  If you can't type at all I guess you need a dictation
system.  If you can still type and want to prevent further damage, maybe
a voice command system is a better choice.  One product is not necessarily
better than the other, they're just designed for different jobs.
 
      My problems started with that darn rat (ahhh... I mean mouse).  When
you squeeze both sides and hold down a button while dragging something
around on a desktop, the stress goes right to the wrist. Added to that were
multi-key chords.  ~F and ~B in vi as well as some of those lovely Alt-Function
keys for WordPerfect had me stretched across the keyboard like I was palming
a basketball.  Ouch!
 
      If your hands are already so injured that even normal typing is no
longer possible, I'm sorry to hear that and I guess you don't have much
choice.  I prefer to not get that bad.  I replaced all the fancy command
operations with voice macros.  Now my typing doesn't bother me at all.  I
can even type faster than I used to because I've reduced some of the
distraction when doing commands.  I caught my problem early enough so
I guess I'm one of the lucky ones.  I'd rather do the commands and functions
by voice and still type rather than be forced to do all of my typing by voice.


From: Simon Crosby 

> From:    Ann Marie Lawler 
 
>       Actually, I would not be surprised at all, having worked with a2x as
> well as other products which require "clever window manager key bindings".
> That's actually why I made the statement.  There was just too much you cannot
> do with key bindings.  After a while you also get tired of cluttering up
> configurations with obscure kludged together key bindings.  And changing key
> bindings "on the fly" is not a lot of laughs.  Most people don't want to be
> forced to be "clever" just to use something for a purpose to which it was
> not designed.  Most people would rather use something that is easy to use
> and designed for the purpose to which they are applying it.
 
On the other hand, a2x is not a product, nor does it claim to be.  It
solves a problem, and does it exceedingly well.  There are lots of
people out there (like me) who cannot type.  My career would be in the
dustbin (US = garbage can) by now if I were not in a position to:
1. Write technical documents
2. Write programs in lisp, C, assembler and various other gunk
3. Manage a computer network
All by voice, on a unix workstation.  DD is the ONLY product which I
have seen  which will currently allow all of these, in spite of its flaws.
 
>
>       Actually, according to the sources, a2x has some pretty fancy features
> built into it that many people don't even know about.  Maybe because they
> are so arcane to try and figure out.  "That's a control T ?what? for a
> window class?"  "But that command is supposed to do different things in
> those different windows."  And once you're set up, changing them on the fly
> is difficult at best.  And the recognizer still doesn't know what's happening
> on the screen or if something succeeded or not.  Then there's that extra
> box and screen and keyboard and sound card...
 
Yes, a2x needs a language front end, and DD needs to be "application
context aware". It also needs a *much* better undo mechanism than its current
"throw out backspaces" idea. Ideally voice recognition/input should be
an integral part of each application to guide recognition and so on
... One of these days perhaps... But remember, a2x is free.  It also
works well enough in conjunction with DD to be very useful.
 
>
>       The thing about IN3 is that it is easy to setup and use and to change
> commands while remaining a powerful command processor.  It just does a lot
> of things which cannot be done through key bindings.  Things like warping
> to a location within a window of a particular title.  Or combining functions
> where one function depends on the success or failure of a previous
> function.
 
Great.  I have no problem with IN3 -- If it solves a problem or has a
niche, then I'm 100% behind it.  Remember though, that not everyone
has a sparcstation.  What do I run on my alpha box ?
 
>
>       It might be possible to combine key bindings with piles of shell
> scripts and come close to some of the advanced features of IN3, but not all.
> There are things which you still cannot do, even with key bindings and shell
> scripts.  But do you really want to spend all of that time writing clever
> shell scripts to go with all those clever key bindings?  Just to do a few
> of those things which key bindings cannot do?  Just to use a product for
> something it was not designed for?
 
No I don't want to spend hours doing this.  I spent about 1.5 weeks,
once, and now I can do almost everything I need.  Admittedly not very
sophisticated, but it works.  If you can add value in a product and
people will buy it, then great.  I wish you the very best of luck.
And yes, I'm sure you have fantastic functions which I'd love to have.
 
>
>       If you need a dictation system, get a dictation system.  If you don't
> need a dictation why get one to do something other than dictation?  Different
> products for different jobs.
 
My DD world is more than a dictation system, it is my whole
human-computer interface.  This may well be true of IN3, now or
someday, and I agree IN3 can help save people's hands by reducing hand
strain.
 

Return To Contents List


IBM VoiceType

From: Diana Carroll

IBM Announces the VoiceType Speech Recognition Family

IBM announced a new brand name for its family of speech recognition products--VoiceType. Included in this family is a new product, VoiceType Dictation, which is a high-accuracy, large-vocabulary speech recognition system for dictation. VoiceType Dictation was previously known as the IBM Personal Dictation System but has been enhanced. This system has the capability of recognizing 32,000 words at approximately 70 to 100 words per minute, with 97 percent accuracy. The system is compatible with many existing applications such as Lotus Notes, AmiPro, cc:Mail, Microsoft Excel, Word, and Quicken.

The initial product is based on OS/2, but IBM has also created a Windows version and a version for notebook and laptop computers that is supported through a PCMCIA digital signal adapter card. VoiceType Dictation for OS/2 is available in American English, British English, French, German, Italian, and Spanish. These languages for the Windows version will be available in 1995. The suggested retail price for the OS/2 product is $999; the PCMCIA model is $1,099.

Other products in this family include IBM's Continuous Speech Series, which is a speaker-independent continuous speech toolkit, and IBM VoiceType Control II, which provides speech navigation in selected models of the IBM ThinkPads.

DQ Take: The VoiceType family is yet another enhancement to the growing list of voice recognition products that we have seen for the consumer market. The benefit of using voice recognition in a PC environment is that it can greatly speed up the completion of forms and documents in which a small amount of information changes, and it can also allow users to create forms and documents in a hands-free environment. This is done through straight speech recognition dictation; speech recognition also drives the menu options of common word processing and other software packages. The target users for these products are typically lawyers, lab technicians, radiologists, surgeons, other professionals, and physically challenged individuals. [ed: is this us??]

A number of competing products are on the market. However, several features differentiate VoiceType from the competition. The first is the ability for the user to create an entire document before making corrections. Many other products make the user stop and correct the error immediately, potentially interrupting the thought process. The next is the ability for users to capture their voice for playback in case any input is in dispute. Finally, IBM has developed special vocabulary packages for radiology, emergency medicine, and journalism in order to further the accuracy of input in these segments.

Although the market for speech recognition is growing, the deterrents to that growth still include appropriate applications and the accuracy of the system. Dataquest sees any improvement to the use of speech recognition technology as a positive sign because improvements will eliminate the barriers to growth. Although the market is growing in several segments, such as voice dialing for telephony applications and speech recognition in voice processing applications, any push that creates user acceptance will also push sales of products that incorporate this technology.

Info: Nancy Jamison (408)437-8182 (njamison@dataquest.com)

Return To Contents List


What helps / hinders recognition ?


From: steffen@iexist.att.com

    From: dorab@twinsun.com (Dorab Patel)
    Subject: what helps / hinders recognition
    
    In your experience, what do you find that helps recognition accuracy?  What
    seems to hinder recognition accuracy? 

I keep my office door nearly all the way closed; otherwise the higher
background noise increases errors.  I'm vigilant about correcting
misrecognitions and if I ever leave the mike on during a conversation so
there are more false recognitions than I can correct, then I revert to the
last saved voice models; otherwise your voice models get messed up and your
error rate goes up until they are retrained.  I keep track of common
misrecognitions of voice macros and change one of them even if I've used
one of them for a long time so my mental retraining is difficult.  For
example, [quit UNIX] often sounded like [quit emacs] so I changed the
former macro.  In 11 months my error rate dropped gradually from 8% to less
than 3%.
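
The bookkeeping described above (noting which voice macros get mistaken
for which) could be kept as a simple tally.  A minimal sketch in Python,
assuming you jot down each misrecognition by hand as a (said, heard) pair;
the log format and the function name are made up for illustration and are
not part of DragonDictate:

```python
from collections import Counter

def confusable_pairs(log, min_count=2):
    """Return ((said, heard), count) for pairs seen at least min_count times,
    most frequent first."""
    counts = Counter(log)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]

# Hypothetical hand-kept log: what was said, what was recognized.
log = [
    ("quit UNIX", "quit emacs"),
    ("quit UNIX", "quit emacs"),
    ("open file", "open mail"),
    ("quit UNIX", "quit emacs"),
]

for (said, heard), n in confusable_pairs(log):
    print(f"[{said}] heard as [{heard}] {n} times: consider renaming one")
```

Pairs that keep turning up are the candidates for renaming, as with
[quit UNIX] and [quit emacs] above.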

Return To Contents List


How About DragonDictate for the Mac?


Articulate Systems license DragonDictate for the Mac, and the product
is called PowerSecretary.   Anyone with a review please let me have it.
Articulate Systems
600 West Cummings Park
Suite 4500
Woburn, MA   01801, USA
+1(617)935-5656 Fax: +1(617)935-0490
+1(800) 443-7077
Here's the announcement from Articulate for version 2, passed on by rose@src.honeywell.com (Fred Rose) on Date: Wed, 29 Mar 95 12:22:58 CST. Some of the comments which follow it are by now out of date. Anyone with a recent review please let me have it.

Direct Dictation into Applications - Version 2.0 allows you to directly dictate into almost all 3rd party applications such as WordPerfect, ClarisWorks, Excel, QuickMail, America On-line, QuarkXPress, and PageMaker to name just a few. In addition to dictating text, you can use bundled versions of AppleScript and QuicKeys to create whatever voice macros you want to control your favorite applications.

New Auto Correct Feature - Version 2.0 includes a very powerful new feature that enables the system to automatically correct words after you correct a previous word. For example, if you say "2.0" and the system hears "to point zero", when you correct the "to" to "2", the system will automatically correct the "point zero" to ".0". This feature increases the system's performance by automatically making corrections for you in many cases.

More Hands-free Control - For the disabled user, Version 2.0 provides significantly more hands-free control than previous versions. For example, you can now correct misrecognized commands entirely by voice from within the target application. In addition, there is more robust AppleScript control of the finder including moving and clicking the mouse.

PowerBook now Supported - Version 2.0 runs on the PowerBook 540 line of notebook computers without any additional hardware other than our PowerSecretary microphone. For those of you who need a mobile dictation solution, this is the configuration for you.

This is what others have to say



From: "Zack T. Smith" 
Date: Tue, 13 Dec 1994 12:53:46 -0800

Power Secretary costs about $1k more than DDwin. FYI, I recently called
"articulate" systems, Inc., to ask the salespeople there whether
they could articulate why their price is so much higher. I pretended
that I have the option of using either Pentium or PowerPC and that
the deciding issue is which product (PowerSecretary or DDwin) is
cheaper. Rather than admit that their price is significantly higher, the
salesperson actually lied to me and said that DDwin costs *more*.
An entire rationale was provided as to why this is the case.
I called Dragon to attempt to get verification of this new sales-insight,
but was told that the 'articulate' person's numbers and rationale
were all wrong, and somewhat troublingly so for the Dragon person I spoke
with.

Date: Wed, 14 Dec 1994 08:01:25 -0500
From: Steve Larose 

According to PC Magazine, the list price for DragonDictate for Windows,
Classic Edition, Version 1.0 is $698. Everything I have seen from Articulate
Systems indicates the list price for PowerSecretary is $2,500 (although they
sometimes offer a discount, but I don't know how one qualifies for the
so-called discount). While I'm a diehard Mac fan, it's hard to justify paying
$1800 more for voice recognition on the Macintosh. Does anyone have any
insight as to why Articulate feels it can charge so much more, and when the
price of their product might become competitive with voice recognition
products on the PC platform? 


From: "Gary L. Karp" 
Date: Tue, 1 Nov 1994 06:54:27 GMT

A brief update on my recent post on the PowerSecretary dictation system for
the Mac from Articulate Systems.
 
I had understood it was necessary to dictate into a separate window and then
copy into the application.  This has turned out to be only partially true.
One may dictate into any application, but cannot correct by voice unless they
work in the separate window.
 
The good news is, an update will release in November which corrects that.  It
will be possible to dictate and correct in any application window.
 
I feel confident based on discussion with Articulate that they are committed
to eventually meeting the full set of Dragon features on the Mac.  I will be
getting the system, and will report as I get it running.
 
It will remain necessary to have a minimum 040 system.  I am getting a
PowerPC.  It will also run on PowerBook 540s without added boards or
peripherals - except the microphone, of course.
 
A company in the Bay Area, Scott Tech, has a solution whereby Dragon is
pumped through a small PC and out to a Mac in ADB format.  This remains the
answer for pre-040 Macs.

Return To Contents List


DragonDictate and Notebook Computers


From: Scott Jangro 

To save you a toll call, here is the Windows Sound System compatible
sound card that I think somebody here probably told Bob about:
 
WAV Jammer PCMCIA sound card
New Media Corporation
Irvine, CA
800-CARDS-4-U
 
This is not an entire notebook solution but a PCMCIA card that will plug
into a notebook with a PCMCIA slot.  We have tested the card here at
Dragon and it does work very well with DragonDictate for Windows.  It is
a Windows Sound System compatible card and like the real Windows
Sound System, it requires the use of a condenser type microphone (as
opposed to a dynamic microphone which is what we've traditionally
shipped with DragonDictate).  We do provide either type microphone with
the DragonDictate for Windows.
 
It is true that some sound card emulations do not do a good job at
mimicking the real thing, but I would not say that it is necessarily *most*.
The truth is that we don't yet know if most of them will work or not.  The
point  is that one shouldn't assume that a card is truly Windows Sound
System compatible (or SoundBlaster 16 compatible) just because it says
it is.  For example, it may allow you to run Windows Sound System's
software drivers but the electronic characteristics of the hardware may
not be of the same quality as the real Windows Sound System.  While
this may work well enough to do recording and playback in Windows or to
play WAV files, this may not work well on a technology like speech
recognition that relies heavily on good quality sound.
 
As far as information on any other sound cards goes, we are in data
collection mode right now. We do rely quite a bit on user feedback for
this information in addition to our own testing so if you have any
information on notebooks or sound cards that work well or not with
DragonDictate for Windows, I'd like to hear from you.  Thanks.
 
From: Scott Jangro 
 
We're currently putting together lists of laptop computers that work well
with DragonDictate for Windows.
 
The company that you called has no connection with Dragon.  That
company is unique because they have a laptop that can accommodate an
ISA card.  This is important for people who use the ACPA DSP card for
DragonDictate (required for the DOS product).  Therefore, we refer
customers to them as the only vendor that we know of with this kind of
laptop.  FWIW, I apologize for the difficulty you had in contacting them.
 
Now that DDWin can run on a standard Windows sound card, we have
many more options.  We're currently running experiments on different
systems to put our seal of approval on some laptops that provide the
best quality performance.  Speech recognition is a very specialized use
of sound input and requires high quality speech samples for good
performance.  Many sound cards just do not provide the quality sound
that we feel is important for our product to perform the way we know it
can.  So while most if not all 16-bit sound cards would pass the
"go-into-the-software-store-and-see-if-it-runs" test, only a subset of
those would produce the quality sound for good speech recognition.  It
takes hours and hours of testing per card or laptop to determine this level
of quality and we will provide this information to you as soon as it is
known to us.
 
At this time I can tell you that users are running DDWin successfully on
the following computers.  We have purchased these machines and are in
the process of certifying them.  We DO NOT guarantee them at this time
but preliminary tests are promising.
 
Toshiba 4800 DX4-75MHz
Everex Stepnote DX4-75MHz
Ergo 100MHz
IBM Thinkpad 755
WAVjammer (PCMCIA card)
 

Return To Contents List


How About DragonDictate for Windows NT?

With the alarming increase of Windows NT systems, people want to know if there is a DD solution for this setup. There are two solutions - hardware and software, as discussed below:

Hardware


From: Simon Crosby 

A person in Cambridge suffering from early stages of RSI wants to know
whether there is a DragonDictate in the pipeline for NT.  Anybody
know?

Also, has anybody done the equivalent of a2x for NT?  I mean, PC doing
the recognition and an a2x-like thing on an NT  box.  Is this possible
(thankfully I have no knowledge of windows or nt)

From: Hans Heilman 

>> Also, has anybody done the equivalent of a2x for NT? 

There is a hardware solution called a TTAM that takes ascii output from
one PC and converts it into the correct electrical interface to go into the
keyboard port of a second PC. It also does mouse emulation going into the 
second PC's mouse port. I think this is the configuration:

+-----------------------+                   +-------------------------------+
|PC running Dragon under|         +---->[mouse port]     PC running windows |
|DOS, text goes out     |         |         |                               |
|port          [output port]-->[TTAM]-->[keyboard port]                     |
+-----------------------+        ^ ^        +-------------------------------+
                                 | |
                          keyboard mouse


A friend of mine has been using this setup for a while so he could use Dragon
for Windows development, before the DDWIN release (he was willing to live with
having to have 2 PC's). As an experiment, he swapped the righthand PC to be one 
running NT and everything seemed to work fine.

The TTAM is around $400 -- a company called something like Prentke-Romich sells
it (along with other assistive technology). If you send me mail, I'll get the
real name and address.


From: dp@world.std.com (Jeff DelPapa)

If you have a PS/2 mouse, the ttam will control it. (takes special key
sequences, but they can be done as voice macros).  The one I had (it is
currently on extended loan) also had a MAC ADB connector, and could
spoof both keyboard and mouse on that machine.  Mine was made by
Words+ in northern CA (who make lots of adaptive products) but they
have since stopped.  The box was originally designed by the Trace
Center at UWisc/Madison, and was licensed to several vendors.

Having said all that, there may be a minimal hardware solution. Under
DOS, and Windoze 3.1 there is a "handicapped access pack" (available
free for the asking from IBM and Microsoft respectively)  One of the
many things it provides is "serial keys" a setup that lets the keyboard
and mouse be driven by data on the serial line.  (you send "sentences"
to access characters outside of the normal printing ones)

I don't know if NT has such a facility, but since it was developed
after the access pack became available, they may have built it in from
the start.  If someone has the NT doc set and wants to take a quick
look, it would be appreciated.  (I will be looking into this myself
shortly, a co-worker will be in this spot soon)

All of these presume a two computer solution. In the case of NT, with
suitable software, you can use the DV/X trick and get the menus to
share the main screen. (dragon/dos won't run in a less than full
screen ms-windows dos box, the video emulation isn't good enough).

In the case of the TTAM, it does not support ps/2 scan code 3
keyboards (unfortunate, as that is the code set that many X terms that
adopted PC keyboards use), so check yours.  The vanilla PC (big 5 pin
plug) keyboards are fine.  The program to convert key events to the
sentences needed by either ttam or the access pack does not exist in
the public domain.  The Words+ people used to include a copy if you
bought a ttam, but they wouldn't sell it separately, and it may be
essentially unavailable now.  I don't know what the others do.

From: dp@world.std.com (Jeff DelPapa)

I am glad we have the windoze NT documentation in machine searchable
format.  NT does indeed have a handicap access pack available for it,
the page that tells you how to get it is in the appendix of the SNA
Server development kit installation guide (I kid you not).  Anyhow it
is /softlib/mslfiles/wn0789.exe on ftp.microsoft.com.

I got a copy, but haven't tried it yet.  (toby will do so on monday).
Since I have the disks, I suppose I should try installing dd/win on NT
and see what happens.

From: Hans Heilman 

One place I know the T-TAM can be ordered is:

 Prentke Romich Company
 1-800-262-1984

Their catalog carries a number of assistive devices, and also sells
the T-TAM, apparently to connect their assistive devices to a PC
(although it can also be used to connect 2 PC's together).
They have a fact sheet on the T-TAM which can be included.

They list several different T-TAM models on their price list, so
be clear with them on the details.

Apparently, the T-TAM does have special input escape sequences
defined which can cause it to generate mouse button actions, although
my friend found that behavior to be somewhat flaky in different
PC configurations (as opposed to the straight ASCII character input
which worked fine). He was fairly frustrated with mouse emulation
difficulties.

He was running PROCOM+ on the Dragon PC to send Dragon-generated
text out the COM port, but ran into a limitation. Apparently,
PROCOM+ allows user-defined sequences to be bound to different keys,
but the limit is 10 characters, which wasn't long enough for all
of the special T-TAM sequences. He worked around this by coding
a small ad-hoc program which reads input and sends it to the COM
port.
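
A relay of the kind described above can be very small.  The following is
only a sketch in Python of the general idea, not the program Hans's friend
wrote: it copies bytes from an input stream to an output stream and
substitutes longer sequences for chosen bytes.  The function name, the
table, and the sequence in it are placeholders for illustration; they are
NOT real T-TAM sequences.  In-memory streams stand in for the keyboard
input and the COM port so the sketch runs anywhere.

```python
import io

def relay(src, dst, expansions=None):
    """Copy bytes from src to dst one at a time, replacing any byte that
    appears in expansions with its (possibly much longer) sequence."""
    expansions = expansions or {}
    while True:
        b = src.read(1)
        if not b:              # end of input
            break
        dst.write(expansions.get(b, b))
        dst.flush()            # push each keystroke out immediately

# Placeholder table: one control byte expands to a long sequence,
# with no 10-character limit.  Not a real T-TAM sequence.
table = {b"\x01": b"<long-sequence>"}

port = io.BytesIO()            # stands in for the serial port
relay(io.BytesIO(b"hi\x01"), port, table)
print(port.getvalue())
```

On a real system, src would be the keystroke source and dst the opened
serial device; both are assumptions here.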

Software

The software solution is a3x, an equivalent of the a2x program used to connect DragonDictate to workstations using the X graphical interface. The program was written by Nelson Sproul (sproul@sybase.com) and works as follows: with DragonDictate running under DOS on PC #1, and SerialKeys from the MS Handicapped Access Pack (or axis, also by Nelson) running on PC #2 under Windows or Windows NT, it allows PC #1 to control PC #2 over a serial line. Here's what Nelson says about axis:


 axis: a freeware replacement for Microsoft handicap access serial keys
 This software is a stand-alone replacement for the serial keys
 functionality of the Microsoft handicap access pack.
 
 This functionality is required to allow an NT host
 to be controlled via a serial line, instead of by the keyboard
 and mouse.
 
 This software is required by a3x, a package which supports
 a2x-style control of an NT box by another PC running dos.
 
 I wrote this software because the Microsoft version, while
 working quite nicely for me under Windows NT 3.50, did not
 work under NT 3.51 (or windows 95, for that matter).  After
 a half dozen phone calls failed to get me any support 
 whatsoever, I gave up and put together my own version.

Send reports of problems to sproul@sybase.com.

a3x is a program which accepts as input keystrokes and generates output over a serial line to control a second PC, running Windows.

The following elements are required for DOS PC #1 to control Windows PC #2:

	1. PC #1
		a. DOS, with DragonDictate
		b. a3x for DOS
		c. com1 port attached to null modem cable
	2. PC #2
		a. Windows 
		b. MS Handicapped Access Pack for Windows 
		c. com1 port attached to the other end of the null modem cable

This program's key handling interface on PC #1 is similar to what a2x expects. Its mouse control is governed by a different, simpler, less sophisticated scheme.

Advantages versus DragonDictate for Windows: 1) works on Windows NT (as well as Windows for Workgroups) 2) offloads voice processing burden to another machine

Here is the a3x distribution with documentation, kindly provided by Nelson. Needless to say Nelson would appreciate any bug reports and positive feedback.

Return To Contents List


Troubleshooting


From: Scott Jangro 

>>>>>>>>>>>>>>>
 
has anyone used  PC XWare with dragon? so far it looks like it might
have problems...
.
 
<<<<<<<<<<<<<<<
We have heard of problems between DragonDictate for Windows and
PC-Xware and I have followed up with the publisher of PC-Xware.
Unfortunately, they have confirmed that their PC-Xware will not be
compatible with DragonDictate for Windows.  Further explanation below
for those interested...
 
As you may know, DragonDictate inserts keystrokes into the active
Windows application.  DragonDictate uses the same Windows facility as
the Windows Macro recorder does to play keystrokes into a Windows
application.  Any Windows application that follows the Windows
application development guidelines will be able to get input from the
Macro recorder and from DragonDictate for Windows.
 
PC-Xware doesn't follow the Windows conventions for getting
keystrokes.  Instead, it goes down to the DOS level, below Windows, to
get the keys.  Therefore, they miss the keystrokes that DragonDictate for
Windows is sending.  They also told me that they're aware that they do
not work with the Windows macro recorder.
 
I wish there was something that we could do to make DragonDictate for
Windows  work with this application as it looks like a great solution for
X/Windows users.  They didn't say if they were planning on changing
their application to conform to these  Windows standards or not.  If
anybody does come across a Windows X/Windows emulation, I'd be
very interested to hear about it.
 
From: daft@debussy.crd.ge.com (Chris Daft)
Subject: Dragondictate and PC-Xware


This isn't specifically to do with a2x, but I thought people using PC's
and X window machines would be interested in this response from NCD's
technical support on the incompatibility between DragonDictate and
PC-Xware.

  Yes, it is true that the reason Dragon Dictate does not work is our fault.  The
  engineers are supposed to be looking into changing this, but it is unlikely
  that it will be fixed soon.  Also, we do not have a scheduled date for this
  fix.

  -Stephen Peters

PC-Xware is in other regards a very nice way of making an X window
connection through a serial line or a LAN, from a PC.  If you care
about this, you might want to let them know (support@pcx.ncd.com) that
they need to fix the problem.

From: Jeff DelPapa 
 
I used Hummingbird's eXceed with DD/win.  Worked fine.  I will also
give the DEC (eXcursion I think) a try (it is what the UK branch
standardized on), when I am in the mood for shuffling floppies again.

From: Scott@ccgate.dragonsys.com
Subject: re: what do I need to run DragonDictate/a2x? -Reply

>>>>>>>>>>>>>>>
Has anyone had any trouble using DragonDictate and the Novell ethernet
software?  It worked fine for me under DragonDictate V1, but with
V2(DOS) it is extremely unreliable causing DragonDictate to crash with
some DOS error like:

Error [35] General Protection Fault in DD30K.EXE at 00A8:DD50k
code=0000 ss=00D8 ds=00E8 es=0000 ax=06D4 bx=0130 cx=0011
dx=0008 sp=0604 bp=0614 si=88BE di=06D4

written over the top of a DD pop up box, whose only visible line is
"Say OK to continue", presumably above it said: "An error has occurred".


<<<<<<<<<<<<<<<
Terry,
You should definitely be able to get DD2 running in conjunction with a
Novell Ethernet network.  I've been doing it myself for two years.  Now,
having said those words of encouragement...

The bad news is that countless things can cause a G.P. Fault.  It
basically means that something is stepping on something else in memory
and the program that is running (usually DD because the DOS extender
we use is usually the program that detects the problem) throws this
error instead of hanging or otherwise going off into
computer-never-never land.  The good news is that this type of  error is
usually caused by a hardware conflict or a memory conflict.  Here are a
few things you can try...

HARDWARE:  When you moved to Version 2 of DD, you changed
hardware.  The ACPA card interacts differently with network cards than
the old Dragon card did.  We have found that you may need to set the
ACPA card's IRQ level and IO address level LOWER than those of the
ethernet card.  So for example, if you have your ethernet card on IRQ3
and your ACPA card on IRQ 5, you may have problems. Switch them
around if you can.  The ACPA card manual has alternative settings for
you to try.  The manual doesn't indicate IO addresses lower than 310 but
you can physically set the card lower than that.  If you are interested in
how to set the ACPA card's IO address into the 200s, let me know.  My
hardware settings that work well are:
ACPA   IRQ3;  IO 310 (Defaults)
Net    IRQ5;  IO 360
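
The rule of thumb above (ACPA card set LOWER than the ethernet card on
both IRQ and IO address) can be written down as a quick check.  This is
only a sketch: CardSettings and acpa_conflicts are hypothetical names
made up for illustration, not part of DragonDictate or any driver, and
the check encodes nothing beyond the advice in this posting.

```python
from typing import List, NamedTuple


class CardSettings(NamedTuple):
    """Hypothetical record of one card's jumper settings."""
    irq: int      # interrupt line, e.g. 3 for IRQ3
    io_base: int  # base IO address, written in hex as in the card manual


def acpa_conflicts(acpa: CardSettings, net: CardSettings) -> List[str]:
    """List the ways the ACPA card is NOT set lower than the network card."""
    warnings = []
    if acpa.irq >= net.irq:
        warnings.append(
            f"ACPA IRQ{acpa.irq} is not lower than network IRQ{net.irq}")
    if acpa.io_base >= net.io_base:
        warnings.append(
            f"ACPA IO {acpa.io_base:X} is not lower than network IO {net.io_base:X}")
    return warnings


# The working settings quoted above: ACPA IRQ3 / IO 310, Net IRQ5 / IO 360.
good = acpa_conflicts(CardSettings(3, 0x310), CardSettings(5, 0x360))

# The problem case from the text: ethernet on IRQ3, ACPA on IRQ5.
bad = acpa_conflicts(CardSettings(5, 0x310), CardSettings(3, 0x360))

print(good)
print(bad)
```

An empty list means the settings follow the advice; it does not prove
that the two cards will coexist, since other devices may claim the same
IRQ or address.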

MEMORY:
Try to clean out the config.sys and autoexec.bat files as much as
possible.  Some EMM386 settings can contribute to a memory conflict.  A
safe EMM386 configuration, with DOS 6 is:
   DEVICE=C:\DOS\EMM386.EXE NOEMS
Includes and eXcludes can cause some problems.
Also, remove as many device drivers and TSRs as possible.  The most
thorough way to troubleshoot is to REM out everything and see if that
helps.  Then put things back one at a time to find the offending program
or settings.
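
The "REM out everything, then put things back one at a time" procedure
can be sketched as a list of trial configurations.  This is an
illustration under assumptions: rem_out and trial_configs are
hypothetical helper names, and the two driver lines other than the
EMM386 one are invented placeholders, not recommended settings.

```python
from typing import List


def rem_out(line: str) -> str:
    """Disable one CONFIG.SYS or AUTOEXEC.BAT line by prefixing it with REM."""
    if line.upper().startswith("REM "):
        return line  # already disabled, leave it alone
    return "REM " + line


def trial_configs(lines: List[str]) -> List[List[str]]:
    """Trial 0 disables every line; each later trial restores one more line."""
    trials = []
    for restored in range(len(lines) + 1):
        trial = lines[:restored] + [rem_out(l) for l in lines[restored:]]
        trials.append(trial)
    return trials


# Illustrative startup lines.  Only the EMM386 line comes from the text;
# NETDRV.SYS and SOMETSR.SYS are hypothetical placeholders.
example = [
    r"DEVICE=C:\DOS\EMM386.EXE NOEMS",
    r"DEVICE=C:\NET\NETDRV.SYS",
    r"DEVICE=C:\UTIL\SOMETSR.SYS",
]

trials = trial_configs(example)
for number, trial in enumerate(trials):
    print(f"--- trial {number} ---")
    for line in trial:
        print(line)
```

Boot with each trial in turn; the first trial at which the G.P. Fault
comes back points at the line restored in that trial.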



From: Scott@ccgate.dragonsys.com
Subject: Deleting words

>>>>>>>>>>>>>>>

From: Simon.Crosby@cl.cam.ac.uk
I'm fed up with the ", (numeric)" version of the comma  in
DragonDictate.  I've tried deleting it from the vocabulary, but it refuses to
disappear.  I'm using DragonDictate version 3.01 classic.
Anybody got any bright ideas?

<<<<<<<<<<<<<<<

The reason that there are some words that you cannot delete is really 
because of an oversight during the design of the product.  There are
some groups of words (I think the same groups that you see in the train
menu) that the delete function just cannot get to.  No good reason, you
just can't do it.  When you delete one of these, it may appear that you're
deleting it but you aren't.

You have perfectly valid reasons for wanting to do this and I have
passed your comments on to the developers.  Thanks for voicing your
feedback.

I have seen some good suggestions pass by here and I hope that they
are acceptable workarounds for now.

BTW, to set the record straight --  Simon is right,  DragonDictate is not
neural-net based, it is based on HMMs (Hidden Markov Models).

Return To Contents List


Do you get hoarse from using DragonDictate ?


From: rbd@sst.ll.mit.edu (Robert B Dunn)

	I am looking for some advice on using DragonDictate.  The problem
 I have (which I think is fairly common) is that I get  hoarse fairly
 quickly.  Unfortunately this happens so quickly that I have not been
 able to make much use of DragonDictate and a2x,  although I have had
 them for a few months. I now frequently get hoarse during normal conversations
 and occasionally have to stop talking to people.  I use the following
 strategies to reduce the stress of using DragonDictate:

	1) speak softly,		

	2) drink very frequently.

The following summary of a discussion of hoarseness was made by Scott@ccgate.dragonsys.com (Scott Jangro).

++++++++Original E-mail Posting+++++++++++++++++++++++++++++++++++++++++++++
I have been reading peoples' summaries of their experience with voice
input for performing programming and related tasks, since I have
considered use of this technology myself for a worsening RSI problem. It
all sounds quite encouraging. However, at this time I have a few
questions/concerns perhaps someone can address:

1) have the users of voice input at work found that their talking disturbs 
their officemates? Has anyone needed to arrange for having an office to 
themselves? It seems that when office space is limited this may pose a
problem.

2) has anyone experienced hoarseness or similar problems attributable
to the increased talking? Is it possible we may be partially substituting
one problem for another?       

3) speaking of hoarseness, how do Dragon and the other packages
perform when the user has hoarseness due to voice strain or a cold?
Does recognition suffer significantly?

Any insight on these questions would be appreciated. Thanks!

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1) have the users of voice input at work found that their talking disturbs
their officemates? Has anyone needed to arrange for having an office to
themselves? It seems that when office space is limited this may pose a
problem.
----------------------------- this can be a problem.


2) has anyone experienced hoarseness or similar problems attributable
to the increased talking? Is it possible we may be partially substituting
one problem for another?
---------------------------
No problems...


3) speaking of hoarseness, how do Dragon and the other packages
perform when the user has hoarseness due to voice strain or a cold?
Does recognition suffer significantly?
-------------------------------
I've found that if I already have a sore throat, working by voice can
aggravate it.  However, I have not had recognition problems even when
*completely* stopped up (as in runny nose, congestion, etc.).

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>1) have the users of voice input at work found that their talking disturbs
>their officemates? Has anyone needed to arrange for having an office to
>themselves? It seems that when office space is limited this may pose a
>problem.

I work in a cubicle.  Most of the people around me say that my talking is
"white noise" that doesn't bother them.  Apparently one person has
complained, and as a result I may be getting a private office (which
I wouldn't normally rate) when my group moves in a month or two.
If this happens I'm certainly not going to complain :-)

>2) has anyone experienced hoarseness or similar problems attributable
>to the increased talking? Is it possible we may be partially substituting
>one problem for another?

I had really bad voice strain the first day or two, but I changed the way
I was speaking and the problem went away.  I had taken a singing class
and been taught how to project without straining my voice.  When I used
that technique I stopped having voice strain.

>3) speaking of hoarseness, how do Dragon and the other packages
>perform when the user has hoarseness due to voice strain or a cold?
>Does recognition suffer significantly?

I have allergies and my level of congestion varies a lot.  Mostly it's not a
problem.  Sometimes on a particularly bad day I notice the recognition
being a bit worse, but it's not a real big difference.  I have two frequently
used macros, [sniffle] and [clear throat], which don't type anything.
They work really well!

Here's the mail I sent in response to the original inquiry, since people
other than that person seem to be interested:

I'm a Senior Software Engineer at Sybase.  I do new feature
development on the SQL Server, which is our core product.  I investigate
problems, write specifications documents, write code, and fix bugs.  My
work is in C in a Unix environment.

I use DragonDictate all day long.  I can't type at all without pain, so
I do everything with DragonDictate.  I think that the effect on my
programming productivity is minimal; i.e. I get pretty much the same
amount of work done with DragonDictate as I used to get done typing.
It's a little more of an impediment when writing a document; I create
documents somewhat more slowly than before.  However, I'm perfectly
capable of doing what's needed for my job.  I was a very fast typist;
DragonDictate makes me more like a fairly slow typist. There are plenty
of engineers here who are not super fast typists who do a good job.  My
company expects the same amount of work from me that they expect
from other people, and this has not been a problem.

There was a pretty significant amount of time required to get going with
DragonDictate in a Unix programming environment.  Getting the various
pieces of software and hardware working together correctly took some
doing; getting macros and such set up took some doing; getting used to
using DragonDictate took some doing.  I spent a fair amount of time after
hours working on my environment for the first month or two.
During that period of time, I was doing my job using DragonDictate but
wasn't as productive as I am now with it.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


1) have the users of voice input at work found that their talking disturbs 
their officemates? Has anyone needed to arrange for having an office to 
themselves? It seems that when office space is limited this may pose a
problem.

> > > I have found that I can dictate pretty softly, and it seems to work
better that way anyhow.  Outside noise doesn't seem to disturb
DragonDictate much, though.  Classical music is just fine, as are
conversations in the hallway, but it doesn't seem to like Devo music!
As for disturbing my neighbors, as long as I talk relatively softly, that
doesn't seem to be a problem (no worse than having a neighbor who
spends all day on the phone, which is normal for some environments).
I think privacy may be more of an issue: you need to realize that now
people can hear you writing your email!  (Already a problem with phone
conversations).

2) has anyone experienced hoarseness or similar problems attributable
to the increased talking? Is it possible we may be partially substituting
one problem for another?       

> > > Yes, hoarseness is a problem.  Much better when I talk softly,
which is desirable anyway.  I also drink a lot of water and tea.  You
should treat it like typing -- don't do it for hours without stopping once in a
while for a break.  When I first got the system, my head, neck, and
shoulders bothered me more, until I realized that I was essentially
shouting at the system (like teaching a class without a microphone and
trying to project all the time).  This is less of a problem now that I've
learned that it works just fine with soft speech. 

3) speaking of hoarseness, how do Dragon and the other packages
perform when the user has hoarseness due to voice strain or a cold?
Does recognition suffer significantly?

> > > Yes, it does suffer.  It's really fun to dictate while crying!
:-(  You need to be sure not to save your voice files after such a session
so as not to screw it up.  It does still work, though.

Any insight on these questions would be appreciated. Thanks!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

hoarseness has not been a problem for me, even when I was pulling a
lot of all-nighters last summer/fall.

the following will help keep you going longer:

1) sip water frequently, and don't wait for your throat to get dry before
you get something.

2) speak softly and steadily

3) avoid drinks with caffeine or the harsher pops.  (I don't know what
the mechanism is, but avoiding caffeine was suggested by a friend who
is a singer.  I did find that my throat would feel dryer and more irritated if I
had had a lot of pop and I had been working at the  computer.)

There are many professions where people use their voice all day.  It's
good to get suggestions from them.  We have an advantage, because
our voices must only carry to the microphone.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    1) have the users of voice input at work found that their talking disturbs
    their officemates? Has anyone needed to arrange for having an office to
    themselves? It seems that when office space is limited this may pose a
    problem.

DragonDictate is sensitive to noise, particularly to drawers and binders
closing.  Although you can train an empty macro to these sounds, it does
interrupt and slow you down.  I think a private office is important; don't
be afraid to mention the Americans with Disabilities Act.
    
    2) has anyone experienced hoarseness or similar problems attributable
    to the increased talking? Is it possible we may be partially substituting
    one problem for another?

I sip water all day and have had problems with hoarseness only when I
let my cup stay empty.  Drinking water also forces me to take breaks to
go to the restroom.
    
    3) speaking of hoarseness, how do Dragon and the other packages
    perform when the user has hoarseness due to voice strain or a cold?
    Does recognition suffer significantly?

DragonDictate works surprisingly well when my voice sounds different
from normal.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

>
> 2) has anyone experienced hoarseness or similar problems attributable
> to the increased talking? Is it possible we may be partially substituting
> one problem for another?

I am responding to this question because I have had a particularly bad
experience in this area.

I began using Dragon extensively at the beginning of this year.  After a
couple of months, I noticed my throat was getting sore each time I used
it.  Being a performer in amateur theatre, I found that I was having to quit
using Dragon by Wednesdays so that my throat could recover for a
weekend performance.  (In a dozen years of singing and acting, I have
never experienced voice problems from performing.)

The problems persisted and worsened.  I asked our medical department
if a voice therapist could be brought in to see what I was doing wrong. 
Nine weeks of aggravating bureaucracy later, I saw an ENT (ear, nose,
throat) doctor who told me I had nodules on my vocal cords.  I
immediately stopped using the system.

While the nodules have since faded away, I continue to have pain in my
throat from speaking (regular speaking).  So now I can neither type nor
talk comfortably.

There is one other person here that uses Dragon, and though he is able
to use it, he, too, had some much less severe throat problems along the
way.  Even now, if his allergies act up, he has some difficulty.

I am glad to hear that others have not had much problem with this, but be
advised that it can occur.  This possibility is also noted in the best book
on RSI we have found so far, on p. 168 (RSI A Computer User's Guide, by
Pascarelli and Quilter).

From: duncan@super.org (Duncan A. Buell)
Subject: hoarseness and speech coaches

I realize that the discussion about hoarseness has already
come and gone, but I thought I would add my comments, since
they may help people.  I had been suffering from a chronic
irritated throat from speaking to Dragon, so as well as
seeing a doctor and getting medical treatment, I spent about
10 weeks seeing a voice coach.  My coach is a choral music
conductor, has 10 years' experience as a public radio
announcer, and teaches voice and public speaking privately
and in classes.  Here is the summary of my experience with
her; I think this is highly relevant since I can now speak
at great length to Dragon without suffering irritation in my
throat.  I found that once I recognized what I was doing
that was causing the problems, it wasn't all that difficult
to change my habits.  Although it is desirable to change
habits in everything regarding speaking, I found that the
relatively controlled environment of speaking to Dragon made
it easy to practice what I was supposed to be doing.
And since speaking in general had never given me problems
(I spent 15 years teaching), fixing the irritation caused
by speaking to Dragon solved nearly all my problems.

Two things were paramount in my case.  First, I wasn't
getting enough air, and second, I was ``squeezing'' my words
out.  This is a different way of stating the comments made
in Pascarelli's book about shallow breathing.

First things first: air.  Don't breathe from the top of your
chest but from your abdomen.  Don't tense up.  Treat your
body as the instrument it must be -- stretch your arms up
and rotate your shoulders before you sit down for a session.
Roll your head to stretch your neck and jaw muscles.
Singers would do this before a practice or performance, and
athletes do this before exercise.  You should do the same.

Posture is important to being able to breathe properly and
get enough air.  Sit up; don't lean forward.  When you lean
forward you close off the abdominal part of your breathing
apparatus and cannot help but breathe shallowly from your
upper chest.  Better yet -- stand.  I have the good fortune
to be able both at work and at home to stand when I dictate
to Dragon.  At work I have both the Dragon computer and my
workstation elevated (using the simplest of support
materials -- unopened reams of paper) and a copy holder
coming up from between the two machines.  This allows me to
stand and get plenty of air, as well as to fidget, reach
both keyboards and my cup of water, and even stretch while
speaking.

I had been speaking with short, shallow, bursts of air.
This was less of a problem with dictating ordinary text than
with dictating computer commands, programming constructs, or
the detailed stuff of mathematical equations in TeX, since
ordinary text has longer words and has a certain flow to it.
My coach listened to me speaking and suggested I was
probably starting my utterances with my vocal cords closed
and then requiring that they produce sound from that
position.  Most of the changes I have made have been to try
to change that.

One key to not squeezing the sound is to speak words by
letting the air start flowing first and then starting the
sound.  "Speak on a sigh" my coach says.  I now speak words
like "one", "as", "of", "and", and "in" much more as if I
were singing them.  Think of reaching for a high note to
sing such a word.  Open your throat and behind your palate,
start the air going and then let the sound come.  Very
gently aspirate an "h" before such a word.  Linger on the
vowel to round it out.

I do find that Dragon has a harder time distinguishing "as"
from "has" (but not the other way around), but aside from
that there seem to be no problems.  The choice list is a
little different from before ("Juan" now often appears when
I say "one", but it's never the first choice).

Another problem I had was my Midwestern "a"; too much nasal
raspiness.  Words like "as", "has", "that", were hard on my
throat.  I am much more careful to be more expansive with
such words.  I have even at times resorted to chanting to
Dragon instead of speaking because it is so much more
natural for me to sing with rounder, more open tones than to
speak thus.  Dragon has never seemed to notice at all that I
have begun chanting instead of speaking.

The bottom lines: posture and correct breathing to get
enough air, and making sure to use enough of that air when
you are speaking.

Return To Contents List


Simon.Crosby@cl.cam.ac.uk