Natural Language Theory and Technology
Intelligent Systems Laboratory
Palo Alto Research Center
The Natural Language Theory and Technology group develops
broad-coverage theories and technologies for efficiently converting
between natural language expressions and machine-interpretable
canonical representations. This ability is required for machines to
perform such tasks as content analysis, fact-finding, question
answering, inference, translation, and sense making, and to
communicate naturally with people.
Our approach involves investigating the features that all human
languages have in common and developing universal theoretical and
computational frameworks that apply easily to languages of widely
varying types. The theories of Lexical Functional Grammar (LFG) and
Finite-State Morphology are examples of frameworks that we originated
and that linguists and computational linguists around the world have
adopted and are extending. We and our close collaborators
also construct broad-coverage linguistic databases (grammars,
lexicons, morphologies, semantics) that describe the distinctive
characteristics of particular languages (Arabic, Chinese, English, Japanese,
Turkish, Urdu, ...). Finally, we investigate and implement efficient
algorithms for processing text and meaning and embed those algorithms
in prototypical applications. The research disciplines in NLTT include
linguistics, computer science, logic, and statistics.
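Finite-State Morphology, mentioned above, treats a word's analysis as a transduction between a surface string and a lexical string of stems and tags. The following toy transducer is an illustrative sketch only (the arc format and tag names are invented here; PARC's finite-state tools are far richer):

```python
# Toy finite-state transducer for morphological analysis:
# maps surface forms to lexical analyses, e.g. "cats" -> "cat+N+Pl".
# Arcs are (state, surface_symbol, lexical_symbol, next_state);
# "" on the surface side is an epsilon arc (consumes no input).

ARCS = [
    (0, "c", "c", 1), (1, "a", "a", 2), (2, "t", "t", 3),
    (3, "", "+N", 4),        # end of stem: emit the part of speech
    (4, "", "+Sg", 5),       # bare form: singular
    (4, "s", "+Pl", 5),      # surface "s": plural
]
FINAL = {5}

def analyze(surface, state=0, pos=0, out=""):
    """Return all lexical analyses of `surface` accepted by the FST."""
    results = []
    if pos == len(surface) and state in FINAL:
        results.append(out)
    for (s, inp, lex, nxt) in ARCS:
        if s != state:
            continue
        if inp == "":                                  # epsilon: emit only
            results += analyze(surface, nxt, pos, out + lex)
        elif pos < len(surface) and surface[pos] == inp:
            results += analyze(surface, nxt, pos + 1, out + lex)
    return results

print(analyze("cats"))   # ['cat+N+Pl']
print(analyze("cat"))    # ['cat+N+Sg']
```

Because transducers are declarative, the same machine run in the other direction performs generation (lexical string in, surface string out), which is one reason the finite-state approach has proved so portable across languages.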
A list of previous members and associates is available. See also the PARC pages on our natural language processing activities.
Ambiguity-enabled, Scalable Knowledge Repository (Asker)
Asker supports natural language question answering and retrieval over massive data
collections. It builds on our Two-way Bridge
between Language and Logic to provide a robust, broad-coverage
mapping between natural language strings and an abstract Knowledge
Representation (AKR). The forward mapping uses the XLE parser and the English ParGram
grammar, followed by semantic and AKR rules using XLE's ordered
rewrite system (XFR). This process is reversed for generation of
grammatical, natural language strings from AKR. This research is used
in part by Powerset for their indexed search products.
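The ordered-rewrite step in this pipeline applies rules in a fixed sequence, so later rules operate on the output of earlier ones. The sketch below illustrates the idea with invented rule syntax and predicate names; it is not XFR's actual notation:

```python
import re

# Toy ordered rewrite system: each rule is a (pattern, replacement)
# pair applied to the whole term in sequence.  Rule order matters:
# the third rule can only match after the first two have fired.

RULES = [
    (r"verb\((\w+)\)", r"event(\1)"),          # verbs become events
    (r"noun\((\w+)\)", r"entity(\1)"),         # nouns become entities
    (r"event\((\w+)\), entity\((\w+)\)",       # attach a default role
     r"event(\1), entity(\2), role(Agent,\1,\2)"),
]

def rewrite(term):
    for pattern, replacement in RULES:
        term = re.sub(pattern, replacement, term)
    return term

print(rewrite("verb(bark), noun(dog)"))
# event(bark), entity(dog), role(Agent,bark,dog)
```

Running such rule sets in the opposite direction over an AKR, down toward strings, is the intuition behind using one rule formalism for both analysis and generation.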
The XLE project investigates algorithms for
efficient parsing and generation with broad-coverage LFG
grammars, paying particular attention to the problem of ambiguity
management. These algorithms are embedded in a high-performance
implementation for grammar development and natural language
applications. XLE includes an ordered rewrite rule component (XFR)
that is used for deeper processing (semantics, knowledge
representation) and for a variety of applications.
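A core idea in ambiguity management is packing: rather than enumerating every reading of an ambiguous sentence, shared material is stored once and alternatives are recorded as local choices. XLE's packing operates over chart and constraint structures; the toy sketch below packs only word-level alternatives, to show how two binary choices stand in for four readings:

```python
from itertools import product

# "I saw her duck" has two local ambiguities.  The packed form
# stores 2 + 2 alternatives instead of 2 * 2 full readings.
packed = [
    ["I"], ["saw"], ["her/PRON", "her/DET"], ["duck/V", "duck/N"],
]

def unpack(packed):
    """Enumerate the full readings a packed structure stands for."""
    return [" ".join(choice) for choice in product(*packed)]

readings = unpack(packed)
print(len(readings))   # 4 readings from 2 binary choices
```

The payoff grows with sentence length: with n independent binary ambiguities, the packed form stays linear in n while the enumerated readings grow as 2^n.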
ParGram and ParSem are part of a research and development consortium that
produces large-scale LFG grammars for several languages. These
grammars are developed with and interpreted by the XLE system. They
incorporate statistical machine
learning techniques to induce disambiguation routines for
broad-coverage constraint-based parsing.
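Such learned disambiguation can be pictured as a log-linear model that ranks the candidate parses a grammar licenses by a weighted sum of parse features. The features and weights below are invented for illustration; real systems estimate weights from treebank data:

```python
import math

# Sketch of statistical disambiguation with a log-linear model:
# each candidate parse is scored by the sum of its feature weights,
# and the highest-scoring analysis is chosen.

WEIGHTS = {"attach_low": 0.9, "attach_high": -0.3, "passive": -0.1}

def score(features):
    """Unnormalized log-linear score: sum of active feature weights."""
    return sum(WEIGHTS.get(f, 0.0) for f in features)

def disambiguate(parses):
    """parses: list of (name, feature-list); return (best name, prob)."""
    exps = [math.exp(score(f)) for _, f in parses]
    z = sum(exps)                        # normalizing constant
    best = max(parses, key=lambda p: score(p[1]))
    return best[0], max(exps) / z

parses = [
    ("parse-1", ["attach_low"]),
    ("parse-2", ["attach_high", "passive"]),
]
name, prob = disambiguate(parses)
print(name)   # parse-1
```

Because the grammar proposes the candidates and the model only ranks them, this division of labor keeps the precision of constraint-based parsing while resolving ambiguity statistically.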
ParTrans develops new strategies and algorithms for
translation that take advantage of the ParGram grammars, the XLE
parser, generator, and rewrite system (hybridizing hand-writing and
induced rules), and our general techniques for managing ambiguity.
We continue our theoretical work on grammatical formalisms (such as Lexical
Functional Grammar), syntactic, semantic, and knowledge
representation issues, parsing and generation algorithms, and
finite-state morphology and phonology. For more information, see the
selected bibliography of NLTT.
Downloads for the NLP community
For research and education we also make available:
The Grammar Writer's
Workbench for Lexical Functional Grammar, an earlier,
Lisp-based implementation of LFG theory. The current XLE system is available under
license for research and commercial purposes.
The PARC700 dependency bank and related tools.