AMSTERDAM SOFTWARE (Contact: Piek Vossen, Piek.Vossen@let.uva.nl)

- Parser for the English definitions in LDOCE
  Description: Gives the constituent structure and predicate-argument relations of the definitions in the form of a labelled and bracketed tree. Three versions have been developed, for the definitions of nouns, verbs and adjectives. Applied to all senses of LDOCE. Output is stored as an LDOCE-derived dictionary in the LDB.
  Technicalities: Parser developed with the Atlas parser-generator; runs on VAX VMS and uses the Atlas run-system and a lexicon.

- Parser for the Dutch definitions in Van Dale
  Description: Same as above but restricted to noun definitions. Has been applied to 2000 food- and drink-denoting senses and a random sample of 3000 noun senses. Output is stored as a Van Dale-derived dictionary in the LDB.
  Technicalities: Parser developed with the Atlas parser-generator; runs on VAX VMS and uses the Atlas run-system and a lexicon.

- Extract
  Description: Converts parse trees, in the form of labelled and bracketed trees, into flat relational lists that express the logical form underlying the definitions and abstract away from non-semantic surface syntax. Interactive and batch modes are provided. In interactive mode, any relation specified through the menus is extracted from a parse tree; in batch mode, all relations present are extracted.
  Technicalities: Developed in Pascal on VAX VMS and in Procyon Common Lisp (Macintosh, LDB).

- Word Devil
  Description: Browses through hierarchical relations in dictionaries. Compares hierarchies cross-linguistically, disambiguates senses of genus words, detects circularities and genus-word gaps, and outputs taxonomy structures.
  Technicalities: Developed in Pascal on VAX VMS using L-tree lexicons, in Procyon Common Lisp (Macintosh, LDB), and in C on Unix using L-tree lexicons.
- Trans
  Description: Creates tlinks between LKB fragments using mono- and bilingual dictionaries and genus lexicons loaded in the LDB.
  Technicalities: Procyon Common Lisp (Macintosh, LDB).

All software available.

BARCELONA SOFTWARE (Contact: Horacio Rodriguez, horacio@lsi.upc.es)

MACO: Morphological Analyser Corpus-Oriented
Description: MACO is a tool for the morphological analysis of corpora. It is designed to attach as much morphological information as possible to every word in the input text: the part of speech and, depending on the linguistic source, further information. MACO was conceived as a general-purpose morphological tool, although the current implementation (and the data sources involved) is devoted to the morphological analysis of Spanish texts. MACO may be considered a toolbox that users can tailor to their working environment: it allows different sets of morphological analysers, with different coverage and usually adapted to specialized tasks, to be integrated in different ways. The current Spanish version includes SegWord, Amcas, Formario, Number, Accumulate, Proper-noun, Initial and Default-cats.
Software is fully available. Data sources are available, with the exception of the Vox dictionary, which would require an agreement from Biblograf.

LDB/LKB Integration Software
Description: The central aim of this software is to provide tools for loading intermediate and relatively stable versions of lexicons developed in the LKB into the LDB, thereby allowing flexible database-like access to, and search of, entries based on any aspect of their contents. The system is fully compatible with LKB functionalities, and its display capabilities are adapted to the special characteristics of the material (FS) to be displayed. Fully available.

TGE: Tlinks Generation Environment
Description: TGE is a software system for constructing tlinks semi-automatically from LKB data and bilingual dictionaries loaded in the LDB. The system allows several forms of extraction, depending on the classes of tlinks to be produced, the data sources involved and the degree of human intervention. Fully available.

SAIBT (Semi Automatic Index Building Tool)
Description: SAIBT helps users index MRDs (Machine Readable Dictionaries) within the LDB environment in a user-friendly way, in response to the difficulties users have had in building the dictionary and interface definitions. A good knowledge of the LDB software and environment is nevertheless essential to fully understand SAIBT's functionality and to use it. More specifically, SAIBT computes the *interface-definitions* and the *dictionary-definitions* of the dictionary to be indexed and saves them in a file that must be loaded when indexing the dictionary. It also offers the user some help in designing the extract functions and the display function. Fully available.

CAMBRIDGE SOFTWARE (Contact: Ted Briscoe, ejb@cl.cam.ac.uk)

The Lexical Data Base (LDB) System
Description: The LDB is a specialised database system implemented in Common Lisp with a graphical user interface (X-windows, Macintosh), enabling fast and flexible access to MRDs via index files. It is fully described in Boguraev, B. and Briscoe (eds), "Computational Lexicography for Natural Language Processing", Longman, 1989, and by Carroll, J. in Sanfilippo (ed.), "The (Other) Acquilex Papers", Cambridge Computer Laboratory, TR-253, 1992.
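
As a rough illustration of what access to an MRD "via index files" can look like, the sketch below builds a headword index of byte offsets over a small in-memory file and uses it to fetch entries without scanning the whole file. The entry layout (one tab-separated entry per line), the function names `build_index` and `lookup`, and the sample entries are all assumptions made for this example; they do not reflect the LDB's actual file formats or its Common Lisp interface.

```python
import io


def build_index(mrd):
    """Map each headword to the byte offsets of its entries.

    Assumes a hypothetical layout: one entry per line, with the
    headword as the first tab-separated field.
    """
    index = {}
    mrd.seek(0)
    while True:
        offset = mrd.tell()
        line = mrd.readline()
        if not line:
            break
        headword = line.split(b"\t", 1)[0].decode("utf-8")
        index.setdefault(headword, []).append(offset)
    return index


def lookup(mrd, index, headword):
    """Fetch all entries for a headword by seeking to the indexed offsets."""
    entries = []
    for offset in index.get(headword, []):
        mrd.seek(offset)
        entries.append(mrd.readline().decode("utf-8").rstrip("\n"))
    return entries


# Invented sample data standing in for a dictionary file on disk.
SAMPLE = (
    b"bank\tn\tland along the side of a river\n"
    b"bank\tn\ta place where money is kept\n"
    b"wine\tn\tan alcoholic drink made from grapes\n"
)

mrd_file = io.BytesIO(SAMPLE)
headword_index = build_index(mrd_file)
for entry in lookup(mrd_file, headword_index, "bank"):
    print(entry)
```

In a real setting the index would be written to its own file and built once per dictionary, so that later sessions only pay for the seeks.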
The Lexical Knowledge Base (LKB) System
Description: The LKB is an implementation of a multilingual lexicon system in Common Lisp with a graphical user interface (Allegro CL with Common Windows, Procyon CL, MCL). The lexicon is structured as a multiple default inheritance hierarchy of typed feature structures to which lexical and morphological rules and translation links can be applied. It is fully described in Briscoe, Copestake and de Paiva (eds), "Inheritance, Defaults and the Lexicon", CUP, 1993, and in various working papers. A standalone version (Stuffit archive) is currently available for older Macs; a new MCL version will replace it shortly.

The Acquilex PoS Tagger
Description: An HMM (bigram) part-of-speech tagger implemented in C, with options to apply the Viterbi or Forward-Backward algorithms for direct or maximum likelihood estimation of transition and/or lexical probabilities from tagged or untagged training data. It is fully described in Elworthy, D., "Part of Speech Tagging", Acquilex-II Working Paper 10, and in Elworthy, D., "Does Baum-Welch Re-estimation Help Taggers?", Proc. of ANLP-94.

Lucifer
Description: A system implemented in Common Lisp for automatically assigning tlinks between monolingual LKB entries from lexicons for different languages developed using a common type system. The system uses the delta rule to find the best match between alternative candidate pairs, training itself on unambiguous cases. It requires a bilingual MRD or a word list pairing word forms that are potential translation equivalents in the two languages. It is fully described in Copestake et al., "Multilingual Lexical Representation", Acquilex-I Working Paper 43.

Availability: All Cambridge software is available with full documentation. It is free to bona fide researchers based in non-profit educational organisations for approved research, on condition that its use is acknowledged and that it is not distributed further.
It is available to commercial organisations for research and commercial use by negotiation, which usually involves a donation to the Computer Laboratory.

PISA SOFTWARE (Contact: Nicoletta Calzolari, glottolo@icnucevm.cnuce.cnr.it)

SO-extractor
Description: Parsing system for the extraction of typical subjects and objects from the definitions and the example sentences contained in Italian monolingual and bilingual machine dictionaries.
Operating environment: Written in a proprietary code for the IBM-VM operating system.
[NOT available]

SO-identifier
Description: Automatic self-learning system for the identification of subject and object in Italian. It makes use of morphological, syntactic, lexico-semantic and pragmatic knowledge. It is based on principles of linguistic analogy.
Operating environment: Written in C for the Unix environment.
Training input: output list produced by the SO-extractor.
Test input: list of Italian sentences preprocessed by a proprietary Italian grammar [MON95].
Output: identification of subject and object relations.
[Available]

SO-disambiguator
Description: Automatic self-learning system for the disambiguation of subject and object assignment in Italian. It makes use of lexico-semantic knowledge and taxonomical generalizations. It is based on principles of linguistic analogy. Input and output are structurally similar to those of the SO-identifier, except for a specific focus on lexico-semantic knowledge supplemented by taxonomical information.
Operating environment: Written in C for the Unix environment.
[Available]

GENUS-extractor
Description: Semantic parsing system for the extraction of the genus terms and the semantic relations linking them to the definiendum. It operates on noun and verb definitions of Italian monolingual dictionaries.
Operating environment: Written in a proprietary code for the IBM-VM operating system.
Input: output of a proprietary Italian grammar [S. Montemagni, 1995, Subject and Object Assignment in Italian.
PhD dissertation, UMIST Manchester, in preparation]
Output: genus/relation lists.
[NOT available]

PALCO: Phrasal Analyzer for Large COrpora
Description: Core of a parsing system for the analysis of real texts in terms of their syntactic features at the phrasal level. It consists of:
- a grammar doing the analysis work
- an interface that allows the user to customize the parsing session.
It is based on PGDE [AITech, 1992, PGDE User Manual, AIT.TR]. It is interfaced with the DMI (Italian Machine Dictionary) of the ILC.
Operating environment: Written in MacCommonLisp for the Macintosh operating system.
Input: Italian text.
Output: syntactic structures of the text in terms of its phrasal constituents.
[Available on conditions to be stipulated]

VAN DALE SOFTWARE (Contact: Margreet Moerland, margreet@cobra.vdl.nl)

Co-Co...
Description: Co-Co... is a small, fast, full-screen multi-file editor with built-in corpus tools. The editor allows you to create, edit and save ASCII text files. The corpus tools allow you to create dictionary entries, compute word frequencies and find collocations. Co-Co... is a graphical user interface for:
- KWIC lists
- Frequency lists
- Z-score calculation
- Complex collocations calculation [Stassen:10]
- Viewing:
  BVD-list [Stassen:7]
  FRQ-list [Stassen:7]
  F-score list [Stassen:8]
Technicalities: Co-Co... runs on the IBM-PC family of computers. It requires at least 640K to run smoothly and runs on any 80-column monitor. The minimum requirement is one VDL-microCorpus and a hard disk; the power of Co-Co... can be extended with further VDL-microCorpora. MicroCorpora have to be tagged with part of speech and lemma. Co-Co... also supports a mouse.
Availability: Available at no cost to research institutions and to institutions participating in CEC projects, as indicated in ESPRIT contracts; available at commercial conditions to all other third parties.
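
For readers unfamiliar with KWIC (keyword in context) lists, the following minimal sketch shows the general idea: every occurrence of a keyword is listed with a fixed window of words to its left and right. The whitespace tokenization, the function names `kwic` and `format_kwic`, the window size and the sample sentence are assumptions chosen for illustration; this is not a description of how Co-Co... itself is implemented.

```python
def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence.

    Matching is case-insensitive; `window` is the number of tokens
    kept on each side of the keyword.
    """
    hits = []
    target = keyword.lower()
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def format_kwic(hits, width=30):
    """Right-align the left context so that keywords line up in a column."""
    return [f"{left:>{width}}  {kw}  {right}" for left, kw, right in hits]


# Invented sample text; a real corpus would be tokenized more carefully.
TEXT = "the bank of the river was steep and the bank in town was closed"

for line in format_kwic(kwic(TEXT.split(), "bank")):
    print(line)
```

Aligning the keyword in a fixed column is what makes recurring neighbours easy to spot by eye, which is the usual first step before any collocation statistics are computed.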