Philipp J. Stolka
Practical Natural Language Processing / Proseminar Künstliche Intelligenz / SS 1998 / Philipp Stolka




3 - Syntax: How to Put It

Grammar, Efficiency and Recovering Unknown Words

We will now take a closer look at the NLP way of analyzing texts; that is, we move away from the probabilistic, unstructured IR approach and concentrate on the grammatical structure of text.

As [AI22] puts it, any communication or speech act is built from seven distinct processes: intention, generation and synthesis, followed by perception, analysis, disambiguation and incorporation. Of these, the first three take place in the speaker or sender, and the last four happen in the hearer's mind. In the context of this paper, only generation, analysis and disambiguation are of interest to us.

During generation, the speaker chooses the words or symbols appropriate to what he wants to convey to the hearer. This happens in accordance with the grammar rules of the selected language.
Analysis means that the hearer processes the perceived string in order to extract its possible meanings. This consists of both syntactic interpretation (also called parsing) and semantic interpretation, which takes into account the words' general meaning as well as their meaning in the current situation. The result of analyzing a syntactically correct sentence is something equivalent to a parse tree, a tree-shaped data structure that represents the sentence as words grouped into phrases (which will be explained later on).
Disambiguation, finally, picks out the meaning that the sender most likely intended, as some syntactically correct constructs allow for more than one semantic interpretation. As mentioned before, you cannot know exactly what the sender wanted to express without having direct access to his knowledge; disambiguation is therefore a "process that relies heavily on uncertain reasoning" [AI23].
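
To make the notions of parse tree and disambiguation more concrete, here is a minimal Python sketch. It is an illustration only, not a method taken from [AI22] or [AI23]. The example sentence, the category labels (S, NP, VP, PP, ...), the helper names `Node`, `t`, `pp` and `disambiguate`, and above all the plausibility scores are assumptions made up for this example. The sentence "I saw the man with the telescope" is syntactically correct but allows two readings; analysis yields both trees, and disambiguation picks the one judged more likely.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a parse tree: a phrase category, or a word at a leaf."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the words of the sentence, left to right."""
        if not self.children:
            return [self.label]
        return [word for child in self.children for word in child.leaves()]

    def bracketed(self) -> str:
        """Render the tree in bracket notation, e.g. [NP [Det the] [N man]]."""
        if not self.children:
            return self.label
        inner = " ".join(child.bracketed() for child in self.children)
        return "[" + self.label + " " + inner + "]"


def t(label: str, *children) -> Node:
    """Hypothetical helper: build a node; plain strings become word leaves."""
    return Node(label, [c if isinstance(c, Node) else Node(c) for c in children])


def pp() -> Node:
    """The prepositional phrase 'with the telescope'."""
    return t("PP", t("P", "with"), t("NP", t("Det", "the"), t("N", "telescope")))


# Reading A: the seeing was done with the telescope (PP attached to the VP).
parse_a = t("S", t("NP", "I"),
            t("VP", t("V", "saw"), t("NP", t("Det", "the"), t("N", "man")), pp()))

# Reading B: the man has the telescope (PP attached to the object NP).
parse_b = t("S", t("NP", "I"),
            t("VP", t("V", "saw"), t("NP", t("Det", "the"), t("N", "man"), pp())))

# Made-up plausibility scores, for illustration only. A real hearer would
# derive them from context and world knowledge, i.e. by uncertain reasoning.
candidates: List[Tuple[Node, float]] = [(parse_a, 0.7), (parse_b, 0.3)]


def disambiguate(cands: List[Tuple[Node, float]]) -> Node:
    """Pick the candidate reading with the highest plausibility score."""
    return max(cands, key=lambda pair: pair[1])[0]


chosen = disambiguate(candidates)
print(chosen.bracketed())
```

Both trees have exactly the same leaves, namely the words of the sentence; they differ only in how these words are grouped into phrases. This is why a choice between them cannot be made on syntactic grounds alone.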



Last modified: Tue Sep 22 21:21:55 MEST 1998