VoiceCode 1.0 User Guide

Last updated 16 October, 1996

Contents

1. VoiceCode overview
2. Using VoiceCode
2.1 Compiling a vocabulary
2.1.1 Compiling a symbols vocabulary for a source file
2.1.2 Compiling a file names vocabulary for a directory
2.1.3 Compiling a file names vocabulary for a directory tree
2.1.4 Compiling a symbols and file names vocabulary for a project
2.2 Defining new symbols and file names
2.3 Dictating symbols and file names
2.3.1 Enabling dictation of symbols or file names in an application
2.3.2 Disabling dictation of symbols and file names in an application
3. Customizing the vocabulary compiler
3.1 Vocabulary compiler options
3.2 Creating expansions for frequently used abbreviations
3.3 Automatically setting compiler options for files of certain types
3.4 Adding support for new languages




1. VoiceCode overview
VoiceCode is add-on software for the DragonDictate speech recognition system. It is 
designed to help programmers who use DragonDictate to dictate source code. 
Basically, VoiceCode takes the hassle out of dictating special expressions like variable, 
function and file names which are normally not recognized by DragonDictate. It does this 
by compiling sentences for all special expressions contained in source files and directories. 
Provided that certain conditions are met, the sentence compiled for a given special 
expression is easy to remember and is recognized by DragonDictate without requiring 
explicit training. Those conditions are:
1. the special expression must be made up of dictation terms or frequently used 
abbreviations of such terms
2. the end of a term must be indicated by either a change in case or a 
non-alphabetical character
Below are examples of special expressions that meet both requirements, along with the 
sentences generated by the VoiceCode vocabulary compiler. All these sentences are 
easy-to-remember aliases for the original expressions. Also, since they contain only 
dictation terms, they can be recognized by DragonDictate without explicit training.
Special expression          Compiled sentence
aHungarianStyleVariable     [a Hungarian Style Variable]
a_C_style_variable          [a C style variable]
AN_UPPERCASE_ONE            [AN UPPERCASE ONE]
a-lisp-style-variable       [a lisp style variable]
Unfortunately, special expressions are usually not like the above examples in that they 
contain some terms which are abbreviations of dictation terms. When this is the case, 
DragonDictate will not be able to recognize the compiled sentence without explicit 
training because it contains words that are not in the dictation vocabulary. 
However, VoiceCode provides a workaround for this problem in the form of an 
abbreviations file. This is a file where you specify expansions for abbreviations which you 
frequently use. The sentence compiled for a given special expression will then be 
recognized by DragonDictate as long as you have defined expansions for all the 
abbreviations it contains. 
For example, suppose you have defined document as the expansion for the abbreviation 
doc, and temporary as the expansion for tmp. The vocabulary compiler will then generate 
the following sentences for the following special expressions:
Special expression          Compiled sentence
tmp_doc                     [temporary document]
aNewDoc                     [a New Document]
THIS_DOC                    [THIS DOCUMENT]
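The splitting and expansion steps described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual code of the vocabulary compiler (which is a perl script). The names ABBREVIATIONS, split_terms, expand and compile_sentence are hypothetical, and the rule that an expansion takes on the case of the abbreviation it replaces is inferred from the examples in the table.

```python
import re

# Hypothetical in-memory table; VoiceCode reads expansions from its abbreviations file.
ABBREVIATIONS = {"doc": "document", "tmp": "temporary"}

def split_terms(expression):
    """Split an expression at non-alphabetical characters and at changes in case."""
    terms = []
    for chunk in re.split(r"[^A-Za-z]+", expression):
        # Either a run of uppercase letters not followed by a lowercase letter,
        # or a lowercase word with an optional leading capital.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", chunk))
    return terms

def expand(term):
    """Replace an abbreviation by its expansion, keeping the case of the original term."""
    expansion = ABBREVIATIONS.get(term.lower())
    if expansion is None:
        return term
    if term.isupper():
        return expansion.upper()
    if term[0].isupper():
        return expansion[0].upper() + expansion[1:]
    return expansion

def compile_sentence(expression):
    """Build the bracketed sentence for one special expression."""
    return "[" + " ".join(expand(t) for t in split_terms(expression)) + "]"

for name in ("tmp_doc", "aNewDoc", "THIS_DOC"):
    print(name, compile_sentence(name))
```

With the two expansions defined above, this sketch reproduces the three compiled sentences listed in the table.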
With the abbreviations file, you can define expansions for different types of abbreviations 
like standard file extensions (e.g. .c, .h, .pl), standard prefixes for variable types (e.g. pt 
for pointer in C programs) and abbreviations of domain specific terms (e.g. doc for 
Document in a word processing application).
2. Using VoiceCode
Dictation of symbols and file names with VoiceCode involves three types of tasks. These 
are: 
  Compiling a vocabulary 
  Defining new symbols and file names
  Dictating symbols and file names
The first type of task involves the creation of sentences for symbols used in existing source 
files (variables, functions and data structure names) and names of existing files and 
directories. These sentences are then loaded into two special DragonDictate vocabularies 
named VoiceCode-Symbols and VoiceCode-Files. These are called the VoiceCode 
vocabularies and are limited to 1000 sentences each.
The second type of task involves the creation of sentences for new symbols or file names 
that do not exist in any source file or directory.
The third type of task involves choosing one of the two VoiceCode vocabularies and 
dictating sentences from it.
The remainder of this section gives more detailed explanations on how to accomplish 
those various types of tasks.
2.1 Compiling a vocabulary
2.1.1 Compiling a symbols vocabulary for a source file
VoiceCode allows you to compile a single source file to produce a vocabulary containing 
sentences for every symbol used in that file.
To do this, simply open the source file in your editor. Then invoke the DragonDictate 
macro [Load Vocabulary]. Your editor will then run the VoiceCode vocabulary 
compiler on that particular file. A message box will also be displayed with the following 
message:
"Please click OK when compilation is done." 
Once the compilation is complete, simply click the OK button and VoiceCode will start 
transferring the vocabulary file from the remote machine to the local machine.
If you installed VoiceCode with the Transfer through FTP setup (see section 2.5.3.1 of 
the installation procedure described in file install.ps), a second message box will be 
displayed with the following message:
"Please click OK when download is complete"
Once the vocabulary file has been transferred to the local machine, simply click the OK 
button to start loading the vocabulary into DragonDictate. Note that if you installed 
VoiceCode with the Transfer through file system mounting setup, the transfer and 
loading of the vocabulary file will happen automatically after you have clicked OK in the 
first message box. 
Once [Load Vocabulary] has finished running, the vocabulary VoiceCode-Symbols 
should contain a sentence for every symbol in the source file.
2.1.2 Compiling a file names vocabulary for a directory
VoiceCode also allows you to compile a directory to create a vocabulary with sentences 
for the name of every file and subdirectory it contains.
To do this with the Emacs editor, simply open the directory using the dired command 
and invoke the [Load Vocabulary] DragonDictate macro. Emacs will then ask you if 
you want to recursively compile subdirectories. You must answer n.
To do this with the vi editor, open the directory and invoke the [Load Directory 
Vocabulary] DragonDictate macro.
As with compilation of a source file, one or two message boxes may be displayed 
depending on whether you use the Transfer through FTP or Transfer through file 
system mounting setup (see section 2.1.1).
Once [Load Directory Vocabulary] or [Load Vocabulary] has finished running, the 
vocabulary VoiceCode-Files should contain a sentence for the name of every file and 
subdirectory of the directory which was compiled.
2.1.3 Compiling a file names vocabulary for a directory tree
VoiceCode also allows you to recursively compile a directory and all its subdirectories to 
create a vocabulary with sentences for the name of every file and subdirectory they contain. 
To do this with the Emacs editor, simply open the root of the directory tree using the 
dired command and invoke the [Load Vocabulary] DragonDictate macro. Emacs will 
then ask you if you want to recursively compile subdirectories. You must answer y.
To do this with the vi editor, simply open the root of the directory tree and invoke the 
[Load Vocabulary] DragonDictate macro.
As with compilation of a source file, one or two message boxes may be displayed 
depending on whether you use the Transfer through FTP or Transfer through file 
system mounting setup (see section 2.1.1).
Once [Load Vocabulary] has finished running, the vocabulary VoiceCode-Files should 
contain a sentence for the name of every file and subdirectory of the directory tree which 
was compiled.
2.1.4 Compiling a symbols and file names vocabulary for a project
VoiceCode also allows you to compile a project file. This is a file containing the names of 
various source files and directories related to a particular project. For each source file in 
the project file, VoiceCode creates sentences for every symbol it contains. Similarly, for 
each directory in the project file, VoiceCode creates sentences for the name of every file 
and subdirectory contained in the tree rooted at that directory.
To compile a project, simply open the project file in your editor (Emacs or vi) and invoke 
the [Load Vocabulary] DragonDictate macro. As with compilation of a source file, 
one or two message boxes may be displayed depending on whether you use the Transfer 
through FTP or Transfer through file system mounting setup (see section 2.1.1). 
2.2 Defining new symbols and file names
VoiceCode allows you to define new symbols that do not appear in any source file, as well 
as names of files and directories that do not yet exist.
To do this, you first invoke the sentence: 
[<VoiceCode-Macros/Styles> <VoiceCode-Macros/File or Symbol>]. This is a macro that 
allows you to create new symbols or file names with various formatting styles. The first 
word in this sentence is the formatting style you want to use (Capital, Hungarian, 
Lowercase or Uppercase) while the second word indicates whether you want to create a 
symbol or file name. 
When this sentence is invoked, DragonDictate opens the Add Word dialogue box, sets the 
To Vocabulary/Group field to VoiceCode-Symbols or VoiceCode-Files (depending 
on the second word used), puts the cursor in the Word Name field, puts DragonDictate 
into dictation mode and sets the dictation flags according to the formatting style specified 
as the first word. 
You can then specify the name of the symbol/file name by dictating the sequence of terms 
you want to use to create it. 
Next, you create the actual symbol or file name by invoking the sentence [<VoiceCode-
Macros/Join Operation> <Number/1 to 9>]. This sentence copies terms from the 
Word Name text box to the Resulting Action text box, then joins them using a 
particular formatting convention. The first word in the sentence corresponds to the 
formatting convention (Hyphenate, Join or Underscore) while the second word is 
the number of terms contained in the Word Name field. Note that you must put 
DragonDictate in command mode before you can invoke the [<VoiceCode-Macros/Join 
Operation> <Number/1 to 9>] macro.
For example, suppose you want to create the file name aNewFile. You must first say 
[Hungarian File Name], followed by a, new and file. You would then 
say [Command Mode] followed by [Join 3]. 
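The effect of combining a formatting style with a join operation can be pictured with the Python sketch below. It does not describe how the DragonDictate macros are implemented. Only the Hungarian style with the Join operation is confirmed by the aNewFile example; the behavior shown for the Capital, Lowercase and Uppercase styles is an assumption based on their names, and build_name, STYLES and SEPARATORS are hypothetical names.

```python
# Assumed effect of each formatting style on the dictated terms.
STYLES = {
    "Capital": lambda terms: [t.capitalize() for t in terms],
    # First term lowercase, every following term capitalized (as in aNewFile).
    "Hungarian": lambda terms: [t.lower() for t in terms[:1]]
                               + [t.capitalize() for t in terms[1:]],
    "Lowercase": lambda terms: [t.lower() for t in terms],
    "Uppercase": lambda terms: [t.upper() for t in terms],
}

# Separator inserted between terms by each join operation.
SEPARATORS = {"Hyphenate": "-", "Join": "", "Underscore": "_"}

def build_name(style, operation, terms):
    """Apply a formatting style to the terms, then join them with the chosen separator."""
    return SEPARATORS[operation].join(STYLES[style](terms))

print(build_name("Hungarian", "Join", ["a", "new", "file"]))
print(build_name("Uppercase", "Underscore", ["an", "uppercase", "one"]))
```

The first call corresponds to saying [Hungarian File Name], dictating a, new and file, and then saying [Join 3].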
2.3 Dictating symbols and file names
2.3.1 Enabling dictation of symbols or file names in an application
Before you can dictate symbols or file names, you must tell DragonDictate to listen for 
sentences from the proper vocabulary. This is done by invoking the sentence:
[<VoiceCode-Macros/File or Symbol> <VoiceCode-Macros/Application> 
<VoiceCode-Macros/Window>]
The first word in this sentence specifies whether you want DragonDictate to listen for 
symbols or file names. The second word is the name of the application in which you want 
to dictate them. The third word is the title (or the beginning of the title) of the window in 
that application where you want to dictate symbols or file names.
For example, suppose you want to dictate symbols in an Emacs window displayed on a PC 
through the Exceed application. You would have to say: [Symbols Exceed Emacs].
Once DragonDictate has been put into the proper mode, it will listen for symbols or file 
names, on top of the usual active vocabularies. This allows you to dictate symbols and file 
names in either command or dictation mode.
Note that you cannot dictate both symbols and file names at the same time in a given 
window. You can, however, dictate symbols in one window of one application and dictate 
file names in another window of possibly the same application.
2.3.2 Disabling dictation of symbols and file names in an application
If you want to disable dictation of symbols and file names in a particular application, 
simply invoke the sentence: [Normal <VoiceCode-Macros/Application>], where 
the second term is the name of the application. 
3.  Customizing the vocabulary compiler
The VoiceCode vocabulary compiler can be configured to deal with files of various types 
written in different languages and containing symbols that make use of various 
abbreviations. This can be done in different ways:
  overriding compiler options through explicit command line arguments
  modifying some configuration files (abbreviations file, automatic options file and 
language definition files)
This section gives more details on how to configure the vocabulary compiler to deal with 
certain situations.
3.1 Vocabulary compiler options
The vocabulary compiler is a perl script with path $VCHOME/vocCompile.pl. This script 
can be invoked explicitly from a shell, although it is typically more convenient to call it 
using the DragonDictate macros provided with VoiceCode. 
The syntax of the vocCompile.pl script is as follows:
vocCompile.pl [-o<operation> -l<language> -c<case>] <file>
The <file> argument gives the path of the source file, project file or directory to be 
compiled. The various options are described below.
-o:  the <operation> modifier specifies the operation to be performed on <file>
     <operation> = s   compile <file> as a source file
     <operation> = p   compile <file> as a project file
     <operation> = d   compile <file> as a directory (do not recursively compile 
                       subdirectories)
     <operation> = t   compile <file> as a directory tree (recursively compile 
                       subdirectories)

-l:  the <language> modifier specifies the language in which the file is written
-c:  the <case> modifier specifies where to split an uppercase term that is 
     followed by a non-uppercase term
     <case> = u   separate before the last uppercase letter (e.g. split variable 
                  myFILEHandle into terms my, FILE and Handle)
     <case> = l   separate before the first lowercase letter (e.g. split variable 
                  myFILEhandle into terms my, FILE and handle)
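The difference between the two <case> rules can be illustrated with a short Python sketch. This is not the compiler's own code; split_on_case is a hypothetical name. It also assumes that under rule l a single capital letter followed by lowercase letters (as in Handle) stays attached to its word, which the text above does not state.

```python
import re

# Rule u: an uppercase run stops before its last letter when lowercase follows.
RULE_U = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")
# Rule l: an uppercase run of two or more letters stops at the first lowercase letter.
RULE_L = re.compile(r"[A-Z]{2,}|[A-Z]?[a-z]+|[A-Z]")

def split_on_case(name, case_rule):
    """Split a name made of letters into terms using case rule 'u' or 'l'."""
    pattern = RULE_U if case_rule == "u" else RULE_L
    return pattern.findall(name)

print(split_on_case("myFILEHandle", "u"))
print(split_on_case("myFILEhandle", "l"))
```

Applying the wrong rule to a name gives awkward terms (for example, rule u splits myFILEhandle before the E), which is why the option exists.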
The default compiler options for a file are -os -lC -cu which means compile as a C 
source file, separating before the last uppercase letter of an uppercased term.
The default compiler options for a directory are -ot -lunix -cu which means compile 
as a directory tree (recursively compile subdirectories) with unix style file names, 
separating before the last uppercase letter of an uppercased term.
These default values can be overridden in two ways. The first way is to specify different 
options at the command line. The second way is to use auto options. These are file name 
patterns that you associate with certain options so that whenever a file whose name 
matches the pattern is compiled, those options are used automatically (see section 3.3 for 
more information on how to set auto options). 
The order of priority of the various sets of option values is (from the highest priority to 
the lowest):
  explicit command line options
  auto options
  default options
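The priority order above amounts to layering three sets of option values. The Python sketch below shows one way to picture it; the dictionaries and the resolve_options function are hypothetical and say nothing about how vocCompile.pl stores its options internally.

```python
# Default options from section 3.1, keyed by option letter.
DEFAULT_FILE_OPTIONS = {"o": "s", "l": "C", "c": "u"}
DEFAULT_DIR_OPTIONS = {"o": "t", "l": "unix", "c": "u"}

def resolve_options(defaults, auto_options, command_line_options):
    """Later layers override earlier ones: defaults < auto options < command line."""
    resolved = dict(defaults)
    resolved.update(auto_options)
    resolved.update(command_line_options)
    return resolved

# An auto option selects perl; the command line overrides only the case rule.
print(resolve_options(DEFAULT_FILE_OPTIONS, {"l": "perl"}, {"c": "l"}))
```

Any option not mentioned at a higher priority level keeps the value from the level below it.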
3.2 Creating expansions for frequently used abbreviations
VoiceCode creates sentences for special expressions by splitting them into the various 
terms they contain. Unfortunately, many special expressions contain terms which are 
abbreviations of dictation terms. In such cases, the resulting sentence is not recognized by 
DragonDictate unless it is explicitly trained by the user.
However, VoiceCode provides a workaround for that problem in the form of an 
abbreviations file. In this file, you define expansions for abbreviations which you use 
frequently. When VoiceCode creates a sentence for a special expression, it replaces all 
abbreviations by their expansions so that the resulting sentence contains only dictation 
terms.
The abbreviations file has the following path: $VCHOME/Configuration/default.abr. 
You create a new abbreviation simply by adding a line to this file with the abbreviation 
followed by its expansion.
For example, suppose your source files contain many symbols where document is 
abbreviated by doc. You would then add the following line to your abbreviations file:
doc document
From then on, variables like: aDoc, my_doc, THE_DOC get compiled to the following 
sentences: [a Document], [my document] and [THE DOCUMENT].
Note that the expansion for an abbreviation can contain more than one word. For example, 
you could define the expansion of ps as Post Script.
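The line format described above (an abbreviation followed by its expansion) is simple enough to read with a few lines of Python. The sketch below is illustrative only: parse_abbreviations is a hypothetical name, and it assumes that the first whitespace-separated word on a line is the abbreviation and that everything after it is the expansion.

```python
def parse_abbreviations(text):
    """Return a dict mapping each abbreviation to its (possibly multi-word) expansion."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and lines without an expansion
        table[fields[0]] = " ".join(fields[1:])
    return table

sample = "doc document\nps Post Script\n"
print(parse_abbreviations(sample))
```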
3.3 Automatically setting compiler options for files of certain types
In section 3.1 we discussed the various options that can be used to modify the 
behavior of the vocabulary compiler. You generally do not want to specify those 
options explicitly every time you compile a file, project or directory. A more 
practical way of specifying options is to use auto options. These are file name patterns 
that you associate with certain options so that whenever a file whose name matches the 
pattern is compiled, those options are used automatically.
You create new auto options by editing the auto options file: 
$VCHOME/Configuration/default.aut. All you need to do is to add a line to this file 
with a file name pattern followed by the set of options you want to use for files that match 
that pattern. You can use the * wild card in the file name pattern. Note also that the 
compiler options must be enclosed by single quotes.
For example, suppose you want all files ending with the extension .pl to be compiled 
as perl source files. All you have to do is add the following line to the auto options file:
*.pl '-os -lperl'
By default, VoiceCode comes with an auto options file which assigns the following 
options to the following file name patterns:
File name pattern   Compiler options
*.pl                compile as a perl source file
*.c                 compile as a C source file
*.h                 compile as a C source file
*.cpp               compile as a C source file (C and C++ look the same as far as 
                    the VoiceCode compiler is concerned)
*/                  compile as a directory tree with unix file names
*.prj               compile as a project file
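Matching a file name against auto options can be sketched in Python as follows. This is an illustration rather than the compiler's actual logic: parse_auto_options and auto_options_for are hypothetical names, the sample lines are written from the option descriptions in section 3.1, and the choice to let the first matching pattern win is an assumption.

```python
import fnmatch
import shlex

def parse_auto_options(text):
    """Read lines of the form: <pattern> '<options>' into (pattern, options) pairs."""
    rules = []
    for line in text.splitlines():
        fields = shlex.split(line)  # honors the single quotes around the options
        if len(fields) < 2:
            continue
        flags = " ".join(fields[1:]).split()
        # "-lperl" becomes {"l": "perl"}: option letter, then its modifier.
        options = {f[1]: f[2:] for f in flags if f.startswith("-") and len(f) > 2}
        rules.append((fields[0], options))
    return rules

def auto_options_for(path, rules):
    """Return the options of the first pattern matching path, or {} if none match."""
    for pattern, options in rules:
        if fnmatch.fnmatchcase(path, pattern):
            return options
    return {}

rules = parse_auto_options("*.pl '-os -lperl'\n*.prj '-op'\n*/ '-ot -lunix'\n")
print(auto_options_for("vocCompile.pl", rules))
print(auto_options_for("notes.txt", rules))
```

A file that matches no pattern gets no auto options, so the default options apply unless the command line says otherwise.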
3.4 Adding support for new languages
VoiceCode comes with support for many of the commonly used programming languages. 
If you are using a language which is not currently supported by the package, it is fairly 
simple to add that support yourself.
All you need to do is create a language definition file for this new language. This file 
should have the following name: $VCHOME/Configuration/<language>.lng, where 
<language> is the modifier you want to use with the -l option to tell the VoiceCode 
vocabulary compiler to compile a file written in that language.
The language definition file contains all the information needed by the vocabulary 
compiler to extract valid symbols from a source file. This information takes the form of 
named perl regular expressions, which are described below:
Regular expression name       Description
COMMENTS_SINGLE_LINE_START    regular expression matching the string used to mark 
                              the beginning of single-line comments in the 
                              particular language
COMMENTS_BRACKETED_START      regular expression matching the string used to mark 
                              the beginning of multi-line comments in the 
                              particular language
COMMENTS_BRACKETED_END        regular expression matching the string used to mark 
                              the end of multi-line comments in the particular 
                              language
SYMBOLS                       regular expression matching any valid symbol in the 
                              particular language
For example, the language definition file for the C++ language might read something like 
this:
COMMENTS_SINGLE_LINE_START 	//
COMMENTS_BRACKETED_START  	/\*
COMMENTS_BRACKETED_END    	\*/
SYMBOLS	[\w_]+
The first expression means that C++ single line comments always start with the string 
//.
The second and third expressions mean that C++ multi-line comments start with /* and 
end with */. Note the use of the backslash in front of the asterisk in those regular 
expressions. This is because some characters like *, . and ? have special meanings 
inside a regular expression. If you want the regular expression to treat these as literal 
characters, you must precede them with a backslash.
The last expression is a little more complex. It means that valid C++ symbols 
consist of one or more alphanumeric or _ characters. In this expression, [] means a 
set of allowable characters and + means one or more of those characters. Within the 
brackets, \w means any alphanumeric character (in perl, \w also matches the underscore, 
so listing _ explicitly is harmless). Other possible symbols you could use inside the 
brackets are: 

\W      non alphanumeric
\d      digit
\D      non digit
\s      white space (includes newline, carriage return and tab)
a-z     lowercase letter
A-Z     uppercase letter

The SYMBOLS regular expression for the definition file of most languages will be a variation 
of the C++ regular expression using the special symbols described above. You do not need 
to learn the full flexibility of perl regular expressions to write a new language 
definition file.
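To see how the four named expressions work together, the Python sketch below uses the C++ definitions from the example above to strip comments and collect symbols. It is only an approximation of what a vocabulary compiler might do with a definition file, not the actual perl implementation. The extract_symbols function is hypothetical, and it makes no attempt to handle comment markers inside string literals or to filter out keywords and numbers.

```python
import re

# The C++ definitions from the example above, as a dictionary.
CPP_DEFINITION = {
    "COMMENTS_SINGLE_LINE_START": r"//",
    "COMMENTS_BRACKETED_START": r"/\*",
    "COMMENTS_BRACKETED_END": r"\*/",
    "SYMBOLS": r"[\w_]+",
}

def extract_symbols(source, definition):
    """Remove comments, then return the sorted set of strings matching SYMBOLS."""
    bracketed = re.compile(
        definition["COMMENTS_BRACKETED_START"] + r".*?"
        + definition["COMMENTS_BRACKETED_END"],
        re.DOTALL,  # let a bracketed comment span several lines
    )
    single_line = re.compile(definition["COMMENTS_SINGLE_LINE_START"] + r".*")
    code = single_line.sub(" ", bracketed.sub(" ", source))
    return sorted(set(re.findall(definition["SYMBOLS"], code)))

source = "int tmp_doc = 0; // aNewDoc\n/* THIS_DOC\n spans lines */ int count;"
print(extract_symbols(source, CPP_DEFINITION))
```

Symbols that appear only inside comments (aNewDoc and THIS_DOC here) are not collected, while everything matching SYMBOLS in the remaining code is, including keywords such as int.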



