Related Experiments. This work was part of an unpublished MPhil dissertation. • ▇▇▇▇, ▇▇▇▇▇▇. Using Lexical Resources to Improve Parsing Accuracy. MPhil Dissertation, 2011, University of Cambridge.
This project uses VALEX, an automatically extracted wide-coverage subcategorization resource, in an attempt to improve the C&C parser. The stages are as follows. VALEX subcategorization frames (SCFs) are mapped to the CCG categories used in the C&C parser through a common formalism called Grammatical Relations. The mapping scheme is used to convert the VALEX subcategorization lexicon into a tag dictionary that relates words to their corresponding CCG categories. Experiments are conducted on combining the generated tag dictionary with the original parser's dictionary under various word-frequency settings. The performance of the parser is evaluated on CCGbank and a Wikipedia dataset under different experimental settings. Finally, a pilot study investigates enhancing features in the original parsing model by training with artificially generated data.
Mapping is the most fundamental step in this project, as it establishes how VALEX SCFs relate to CCG categories. Before the actual mapping, an initial investigation of the two formalisms reveals that the mapping is inevitably many-to-many. Multiple SCFs are mapped to one category because VALEX and CCG take different approaches to modeling the types of arguments verbs can take. CCG models only shallow surface syntax, while VALEX has more fine-grained frames that model underlying syntax such as raising and control. For example, "sank" in "His reputation sank low" and "appears" in "He appears crazy" have the same category, (S\NP)/(S[adj]\NP), but two different SCFs, because the subject is raised in the second sentence. As a result, the two SCFs are mapped to one category.
The other direction of the many-to-many mapping results from sentence features, which can, for example, specialize a generic (S\NP)/NP category into (S[dcl]\NP)/NP, (S[pt]\NP)/NP, (S[ng]\NP)/NP, (S[b]\NP)/NP and the passive voice category S[pss]\NP. Those categories are all mapped to one SCF for transitive verbs because they encode tense and voice, which VALEX does not distinguish.
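The many-to-many mapping and its conversion into a tag dictionary can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the SCF labels (SCF_ADJ_COMPLEMENT, SCF_ADJ_COMPLEMENT_RAISING, SCF_TRANSITIVE), the function names and the toy lexicon are all hypothetical, chosen only to mirror the examples in the text. VALEX's actual frame identifiers and the real mapping scheme via Grammatical Relations are not reproduced here.

```python
import re
from collections import defaultdict

# Hypothetical SCF labels (assumption): the text does not give VALEX's real
# frame identifiers, so these names are for illustration only.
SCF_TO_CATEGORIES = {
    "SCF_ADJ_COMPLEMENT": {r"(S\NP)/(S[adj]\NP)"},
    "SCF_ADJ_COMPLEMENT_RAISING": {r"(S\NP)/(S[adj]\NP)"},
    "SCF_TRANSITIVE": {
        r"(S[dcl]\NP)/NP",
        r"(S[pt]\NP)/NP",
        r"(S[ng]\NP)/NP",
        r"(S[b]\NP)/NP",
        r"S[pss]\NP",
    },
}


def invert(mapping):
    """Turn an SCF -> categories mapping into a category -> SCFs mapping."""
    inverse = defaultdict(set)
    for scf, categories in mapping.items():
        for category in categories:
            inverse[category].add(scf)
    return dict(inverse)


def strip_verb_features(category):
    """Drop the tense/voice features listed in the text, keeping others such as [adj]."""
    return re.sub(r"\[(?:dcl|pt|ng|b|pss)\]", "", category)


def build_tag_dictionary(lexicon, mapping):
    """Convert a word -> SCFs lexicon into a word -> CCG categories tag dictionary."""
    tag_dict = {}
    for word, scfs in lexicon.items():
        categories = set()
        for scf in scfs:
            # SCFs without a known mapping contribute no categories.
            categories |= mapping.get(scf, set())
        tag_dict[word] = categories
    return tag_dict


# Toy lexicon (assumption) built from the two example verbs in the text.
TOY_LEXICON = {
    "sank": {"SCF_ADJ_COMPLEMENT"},
    "appears": {"SCF_ADJ_COMPLEMENT_RAISING"},
}

CATEGORY_TO_SCFS = invert(SCF_TO_CATEGORIES)
TAG_DICTIONARY = build_tag_dictionary(TOY_LEXICON, SCF_TO_CATEGORIES)
```

In this sketch, CATEGORY_TO_SCFS shows two SCFs collapsing onto the single category (S\NP)/(S[adj]\NP), while SCF_TO_CATEGORIES shows one transitive SCF fanning out to five feature-specific categories. Applying strip_verb_features to the four active transitive categories recovers the generic (S\NP)/NP.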