Identification and Transformation of Terminal Morphemes in Medical English
The system for identifying and subsequently transforming terminal morphemes in medical English is part of the information system for processing pathology data that was developed at the National Institutes of Health.

The recognition and transformation of terminal morphemes is restricted to classes of adjectivals (including the -ING and -ED forms), nominals, and homographic adjective/noun forms. The adjective-to-noun and noun-to-noun transforms consist essentially of substituting adjectival and certain nominal suffixes with suffixes that indicate the corresponding nominal form(s).

An adjectival or nominal suffix has a polymorphosyntactic transformational function if it can be transformed into more than one nominalizing suffix (e.g., the adjectival suffix -IC can be replaced by any of the nominalizing suffixes -0, -A, -E, -Y, -IS, -IA, -ICS). An adjectival suffix has a monomorphosyntactic transformational property if there is only one admissible transform (e.g., -CIC → -X).

The morphological segmentation and the subsequent transformations are based on the following principles:

a. The word form is segmented according to the principle of »double consonant cut,« i.e., the terminal characters following the last set of double consonants are analyzed and treated as a potential suffix. For practical purposes, only terminal suffixes of at most four characters have been analyzed.
b. The largest segment of a word form common to both adjective and noun, or to both noun stems, is retained as the word base for transformational operations, and the non-identical segment is considered to be a »suffix.«

The backward, right-to-left character search is initiated by identifying the terminal grapheme of the given word form and is extended to certain admissible sequences of immediately preceding graphemes.

The nodes, which represent fixed sequences of graphemes, are labeled according to their recognition and/or transformation properties. The tree nodes are divided into two groups:

a. productive or activated
b. non-productive or non-activated

The productive (activated) nodes are sequences of sets of graphemes that possess certain properties, such as an indication of part-of-speech class membership, transformation properties, or both. The non-productive (non-activated) nodes function as connectors, i.e., they specify the admissible path to the productive nodes.

The computer program for the identification and transformation of the terminal morphemes is open-ended and already operational. It will be extended to other subfields of medicine in the near future.
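
The segmentation principles, the right-to-left search, and the suffix substitution described above can be illustrated with a minimal Python sketch. It is not the original program. All names (Node, add_suffix, find_suffix, nominal_candidates, split_on_common_base, double_consonant_cut) are hypothetical, and several readings are assumptions: »double consonants« is taken to mean two adjacent consonant letters, Y is counted as a vowel, -0 is read as the empty (zero) suffix, and -CIC is read as transforming into -X. The generated nominal forms are unvalidated candidates; a real system would presumably check them against a dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional

VOWELS = set("AEIOUY")  # assumption: Y is treated as a vowel


@dataclass
class Node:
    """One grapheme in the right-to-left suffix tree."""
    children: dict = field(default_factory=dict)
    pos: Optional[str] = None   # part-of-speech label, if any
    transforms: tuple = ()      # admissible nominalizing suffixes

    @property
    def productive(self) -> bool:
        # Productive (activated) nodes carry a POS label, transforms, or both;
        # every other node is only a connector on the path to one.
        return self.pos is not None or bool(self.transforms)


def add_suffix(root, suffix, pos, transforms):
    """Store a suffix in the tree, terminal grapheme first."""
    node = root
    for ch in reversed(suffix):
        node = node.children.setdefault(ch, Node())
    node.pos, node.transforms = pos, tuple(transforms)


def find_suffix(root, word):
    """Scan right to left; return (suffix, node) for the longest productive match."""
    node, best = root, None
    for depth, ch in enumerate(reversed(word), start=1):
        node = node.children.get(ch)
        if node is None:
            break
        if node.productive:
            best = (word[-depth:], node)
    return best


def nominal_candidates(root, word):
    """Replace the recognized suffix by each admissible nominalizing suffix."""
    match = find_suffix(root, word)
    if match is None:
        return []
    suffix, node = match
    base = word[: len(word) - len(suffix)]
    return [base + t for t in node.transforms]


def split_on_common_base(a, b):
    """Principle b: longest shared initial segment is the base, the rest are suffixes."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n], a[n:], b[n:]


def double_consonant_cut(word, max_len=4):
    """Principle a (assumed reading): characters after the last adjacent consonant pair."""
    for i in range(len(word) - 1, 0, -1):
        if word[i] not in VOWELS and word[i - 1] not in VOWELS:
            suffix = word[i + 1:]
            return suffix if 0 < len(suffix) <= max_len else None
    return None


# Illustrative table built from the two examples given in the text.
ROOT = Node()
add_suffix(ROOT, "IC", "ADJ", ["", "A", "E", "Y", "IS", "IA", "ICS"])  # polymorphosyntactic
add_suffix(ROOT, "CIC", "ADJ", ["X"])                                  # monomorphosyntactic

print(double_consonant_cut("GASTRIC"))
print(split_on_common_base("THORACIC", "THORAX"))
print(nominal_candidates(ROOT, "THORACIC"))
print(nominal_candidates(ROOT, "GASTRIC"))
```

In this sketch the node reached by the terminal grapheme C alone is a connector, while the nodes reached via C-I (suffix -IC) and C-I-C (suffix -CIC) are productive. Preferring the longest productive match is a design choice of the sketch, made so that the more specific -CIC rule takes precedence over -IC.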