| Practical Natural Language Processing / Proseminar Künstliche Intelligenz / SS 1998 / Philipp Stolka |
Most times, the parser doesn't know how far he has to backtrack, so there may be a lot of duplicate work in this approach. Having recognized "the" as a DET is correct, of course, so dropping this result would be inefficient as it would have to be parsed a second time only to return the same result (in this example where only one single word would have to be re-analyzed it isn't so severe, but for a large NP consisting of many words this might be another thing).
Here we invent another data storage structure, the chart, together with another parsing algorithm which is more suitable for this specific task.
In an ordinary parse tree, we would not be able to react to changes of this type - changes in the parse structure - in an easy way. Every difference would call for a major reorganization of the tree, and complex sentences would require many of them. Therefore we devise the chart which, for a n-symbol sentence, consists of n+1 vertices placed between the symbols, and possible edges connecting the vertices.
This shows the starting configuration of the chart parser working on our example sentence. Only the vertices do exist yet, no edges have been introduced so far.
The edges will have labels describing their status. An edge labeled[0, 5, S - NP VP °]states that there is a NP VP combination forming an S that would extend from 0 down to 5. The "°" is a separator that separates the already-found symbols from the yet-to-find ones. The above edge would be called a complete edge, as there are no symbols to be found anymore (no symbols after the °). These stand in contrast to incomplete edges like[0, 3, S - NP ° VP]which are still in search of the symbols after the °; this example says that there is an NP spanning from 0 to 3 which, if followed by a VP, would combine to form a complete S. (Thus, the indexing numbers indicate the place of the symbols already recognized.)
The chart parser performs a cycle of four different procedures over and over again during the parsing process: the initializer, the predictor, the scanner, and the completer.
The initializer is employed only once at the beginning of the parsing process and provides us a starting edge from the bare original chart, that is, it gives us a[0, 0, S' - S °]edge. This means we have a provisional S' symbol indicating we are still missing an S whilst nothing has been found yet.
The second procedure is the predictor that takes an incomplete edge, looks at what is missing to complete this edge and adds new incomplete edges that make up the missing symbols.
The completer takes two edges, an incomplete one that is still searching for a certain symbol and a complete one that provides exactly this symbol and is starting at where the first one ends, and combines them into a new edge that incorporates the complete into the incomplete edge (this new edge may or may not be complete itself).
The scanner has a similar task: It takes an incomplete edge that is searching for a certain word type (noun, verb etc.) and a word of this type at the ending position of the edge and combines them to form a complete edge. That is, it works like the completer, but is operating on word level instead of phrase level.
By following this method, you do not have to do any backtracking. The parsed symbols get stored in the chart and can be used later on, partly or wholly, in any context.
If we are given the rewrite rules1. S - NP VPwe can get the following chart trace for our sentence "The Chinese signs are manifold.":
2. NP - Det NOUN
3. NOUN - Adj Noun
4. VP - Verb Adj
5. Det - "the"
6. Noun - "signs"
7. Noun - "Chinese"
8. Verb - "are"
9. Adj - "Chinese"
10. Adj - "manifold"
edge procedure derivation comment a initializer [0, 0, S' - ° S] get the start symbol b predictor(a) [0, 0, S - ° NP VP] expand this to NP VP (rule 1) c predictor(b) [0, 0, NP - ° Det NOUN] (rule 2) d predictor(c) [0, 0, Det - ° "the"] predicting the first word to be "the" e scanner(d) [0, 1, Det - "the" °] finding "the" f completer(c,e) [0, 1, NP - Det ° NOUN] integrate this into our NP g predictor(f) [1, 1, NOUN - ° Adj Noun] go on (rule 3) h predictor(g) [1, 1, Adj - ° "Chinese"] predict the Adj to be "Chinese" (*) i scanner(h) [1, 2, Adj - "Chinese"] finding the Adj to be "Chinese" j completer(g, i) [1, 2, NOUN - Adj ° Noun] integrate... k predictor(j) [2, 2, Noun - ° "signs"] predict... (*) l scanner(k) [2, 3, Noun - "signs" °] find... m completer(j, l) [1, 3, NOUN - Adj Noun °] integrate... n completer(b, n) [0, 3, NP - Det NOUN °] integrate... o completer(b,n) [0, 3, S - NP ° VP] integrate... p predictor(o) [3, 3, VP - ° Verb Adj] go on (rule 4) q predictor(p) [3, 3, Verb - ° "are"] predict... r scanner(q) [3, 4, Verb - "are" °] find... s completer(p, r) [3, 4, VP -Verb ° Adj] integrate... t predictor(s) [4, 4, Adj - ° "manifold"] predict... (*) u scanner(t) [4, 5, Adj - "manifold" °] find... v completer(s, u) [3, 5, VP -Verb Adj °] integrate... w completer(v, o) [0, 5, S - NP VP °] and finally integrate.
At certain points, there are many different possibilities to predict the next edges; e.g. at the line where edge h (*) is predicted as "Chinese" it might as well be predicted as "manifold", as this is an Adj, too (the same happens in k and t). Of course, this is done, too; but the scanner in the next line finds "Chinese" instead of "manifold", thus only edge h with "Chinese" is relevant. (The other edges have simply been omitted in the trace above due to place requirements.)
Now we are left with a complete chart like the one that follows, wherew:X/Y Zmeans that edge w is still looking for a Y-Z combination that would form a complete X if encountered.When you want to extract a single parse tree from this chart, you face a problem: You can very well go bottom-up, from the single words up to the S, but not the other way round. This is quite displeasing for any tree-building algorithm. Thus, you need a way to come top-down, from S down to the words.
This might be accomplished by noting the children edges for each edge in the label; for example, the n:NP edge this might look like this:n:NP (e, m)(it is combined from the e and the m edge, which can be represented in an analogous manner). Now, you can simply construct a parse tree by starting at the w:S edge and descending recursively down the chart edges:This is an intuitive way to proceed when you have an unambiguous sentence. However, when it is unclear which meaning was intended, you would have to build many parse tree (the amount of trees would equal the amount of ambiguities multiplied by the amount of possible interpretations in each case - an exponential number).
- w:S (n ,v)
- n:NP (e, m)
- e:Det ("the")
- m:NOUN (i, l)
- i:Adj ("Chinese")
- l:Noun ("signs")
- v: VP (r, u)
- r:Verb ("are")
- u:Adj ("manifold")
Take a part of the example in [AI23]: For the sentence "Fall leaves fall." there are two possible syntactic interpretations, eitherS [NP ["Fall leaves"] VP ["fall"]]orS [NP ["Fall"] VP ["leaves fall"]].You can represent this in one single structure that embraces all the possibilities:S [This way of representing multiple parse trees is called packed forest, as many trees are enclosed in one main composition.].
| prev: | 3.2.1 - A Simple Parser |
| this: | 3.2.2 - The Chart Parser |
| next: | 3.3 - Unknown Words |