Conference Publication Details
Mandatory Fields
QasemiZadeh, Behrang; Handschuh, Siegfried
COLING Workshop on Synchronic and Diachronic Approaches to Analyzing Technical Language
Investigating Context Parameters in Technology Term Recognition
Optional Fields
Technology Term Recognition; Term classification; supervised machine method; k-nearest-neighbors regression; k-nn; information extraction; random indexing; random projection; distributional semantics; corpus-based linguistics; computational terminology
Adam Meyers, Yifan He and Ralph Grishman
Dublin, Ireland
We propose and evaluate the task of technology term recognition: a method to extract technology terms at a synchronic level from a corpus of scientific publications. The proposed method is built on the principles of terminology extraction and distributional semantics. It is realized as a regression task in a vector space model. In this method, candidate terms are first extracted from text. Subsequently, using the random indexing technique, the extracted candidate terms are represented as vectors in a Euclidean vector space of reduced dimensionality. These vectors are derived from the frequency of co-occurrences of candidate terms and words in windows of text surrounding candidate terms in the input corpus (context windows). The constructed vector space and a set of manually tagged technology terms (reference vectors) are then used in a k-nearest neighbours regression framework to identify terms that signify technology concepts. We examine a number of factors that affect the performance of the proposed method, namely the configuration of context windows, the choice of neighbourhood size (k), and the reference vector size.
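The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dimensionality (64), the number of non-zero elements per index vector (4), the symmetric context window, and the use of single tokens as candidate terms are all assumptions made for illustration, and the candidate extraction step is skipped entirely.

```python
import math
import random


def index_vector(word, dim=64, nonzero=4):
    """Sparse ternary index vector for a context word (hypothetical helper).

    Seeding the generator with the word makes the vector reproducible.
    """
    rng = random.Random(word)
    vec = [0.0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzero)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def build_context_vectors(tokens, candidates, window=2, dim=64):
    """Random indexing: sum the index vectors of the words that co-occur
    with each candidate inside a symmetric context window."""
    vectors = {c: [0.0] * dim for c in candidates}
    for i, tok in enumerate(tokens):
        if tok not in vectors:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            iv = index_vector(tokens[j], dim)
            vec = vectors[tok]
            for d in range(dim):
                vec[d] += iv[d]
    return vectors


def knn_score(vec, references, k=3):
    """k-nearest-neighbours regression: mean label of the k reference
    vectors closest to `vec` in Euclidean distance.

    `references` is a list of (vector, label) pairs, with label 1.0 for a
    manually tagged technology term and 0.0 otherwise (assumed encoding).
    """
    nearest = sorted(references, key=lambda r: math.dist(vec, r[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Toy usage: score an unlabelled candidate against two reference terms.
tokens = ("the parser uses grammar rules the tagger uses grammar rules "
          "the river flows past hills").split()
vectors = build_context_vectors(tokens, {"parser", "tagger", "river"}, window=1)
references = [(vectors["parser"], 1.0), (vectors["river"], 0.0)]
score = knn_score(vectors["tagger"], references, k=1)
```

The three parameters the abstract singles out map onto `window` (context window configuration), `k` (neighbourhood size), and the length of `references` (reference vector size). A candidate would be accepted as a technology term when its score exceeds some threshold, which is left open here.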
Science Foundation Ireland
ISBN: 978-1-873769-46-1
Grant Details
Publication Themes