| Practical Natural Language Processing / Proseminar Künstliche Intelligenz / SS 1998 / Philipp Stolka |
"Communicating with natural language, whether as text or as speech, depends heavily on knowledge of the domain of discourse. Understanding is not merely the transmission of words; it also requires inferences about the speaker's goals and assumptions about the context of the interaction. Implementing a natural language understanding program requires that we represent large amounts of knowledge and reason effectively with it. We must consider such issues as nonmonotonicity, belief revision, metaphor, planning, learning, and the practical complexities of human interaction. These are the central problems of artificial intelligence."
-- Luger and Stubblefield
In virtually every niche of our planet, there is some interaction.
Whether at the level of atoms and molecules, at the cellular level, or among more developed forms such as plants, animals, and humans, everywhere we see entities exchanging information in order to rearrange themselves or to achieve a goal. Of course, the amount, quality, and means of transmission vary greatly, but at each of these levels there is effectively a common protocol by which distinct things can interact.
As [AI22] puts it, "communication is the intentional exchange of information brought about by the production and perception of signs drawn from a shared system of conventional signs."
Although there are as many different definitions of information and communication as there are books about them, they all ultimately come down to stating that an exchange of information between a sender and a hearer is involved.
Now, we do not want to be held up here by looking at how this might be physically accomplished. Rather, we focus on what is communicated and in which way - in terms of syntax and semantics - the information is actually transferred from A to B. Abstracting away from the internal concepts that might represent the sender's beliefs, from acoustics (or something similar) as the means of transmission, and from the hearer's internal representations, we are left with communication stripped to the aforementioned common core: information wrapped in a language, which is in turn wrapped in signs from a more or less obscure alphabet.
It is this information that we will concentrate on in order to exploit it for our language processing applications.
Language is a "complex, structured system of signs (...) that enables humans to communicate" [AI22].
We already know that communication aims at moving information from the sender to the hearer. Generating sensible utterances (that is, spoken or written strings or texts in a given language) poses no great problem for humans, and to a limited degree machines can also tell us what we should know about their work.
But what about recovering the information from the message? Again, no problem for humans. Machines, on the other hand, need explicit guidance to understand what they are told. But why should machines understand language - natural language - at all?
Clearly, user interfaces to computers and other artifacts have improved notably in recent years, but it is still difficult for certain user groups to gain satisfactory access to the services offered. Thus, easier and more intuitive ways of providing this access are needed - natural language, for example.
This is the point where Natural Language Processing enters the scene. It grabs the input string (suppose it is already built to order in the way the system likes it - just a string of signs extracted from the user's speech) and meddles around with it, and in the end, we hope to get a record of what was said and what it means. And, of course, this meddling around and distilling of a record is the tricky part.
We are now going to see what is to be done, what can be done and how a structure is imposed on the raw input that eventually leads us to an understanding of the string.