Relation Extraction Models
Models for relation extraction from QnA data incorporate the topic of the question and can be represented as a graphical model (Figure 3.1). Each mention of a pair of entities is represented by a set of mention-based features x_i and question-based features x_q. A multinomial latent variable z_i represents a relation (or none) expressed in the mention and depends on the features x_i and x_q and a set of weights w_m for mention-based and w_q for question-based features:

    ẑ_i = arg max_{z ∈ P ∪ {∅}} p(z_i | x_q, x_i, w_q, w_m)

Figure 3.1: QnA-based relation extraction model plate diagram. N - number of different entity pairs, M - number of mentions of an entity pair, |Q| - number of questions where an entity pair is mentioned, x_i and x_q - mention-based and question-based features, w_m and w_q - corresponding feature weights, latent variables z_i - relation expressed in an entity pair mention.

To estimate this variable we use an L2-regularized multinomial logistic regression model, trained using the distant supervision approach to relation extraction [138], in which mentions of entity pairs related in Freebase are treated as positive instances for the corresponding predicates, and negative examples are sampled from mentions of entity pairs that are not related by any of the predicates of interest. Finally, to predict the set of possible relations y between a pair of entities we take the logical OR of the individual mention variables z_i, i.e., y_p = ∨_{i=1..M} [z_i = p], p ∈ P, where M is the number of mentions of this pair of entities.

Existing sentence-based relation extraction models can be applied to individual sentences of a QnA pair and will work well for complete statements, e.g., "Who did ▇▇▇▇ ▇▇▇▇ marry? ▇▇▇▇ ▇▇▇▇ and ▇▇▇▇▇▇▇▇ ▇▇▇▇▇ married at a secret ceremony ...". In the sentence-based scenario, when the set of question-based features is empty, the above model corresponds to the Mintz++ baseline described in [188], which was shown to be superior to the original model of [138], is easier to train than some other state-of-the-art distant supervision models, and produces comparable results. In many cases, however, an answer statement is hard to interpret correctly without knowing the corresponding question.
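The per-mention prediction and the pair-level OR aggregation can be sketched as follows. This is a minimal illustration, not the thesis implementation: the feature vectors, weight matrices, and label names are toy values, and the log-linear score w_m·x_i + w_q·x_q stands in for the trained multinomial logistic regression.

```python
# Sketch: score each mention of an entity pair with mention features x_i and
# question features x_q under weights w_m and w_q; the per-mention label z_i
# is the argmax over the predicate set P plus a NONE class; pair-level
# relations y_p are the OR over mentions. All values below are illustrative.
import numpy as np

LABELS = ["NONE", "spouse", "birthplace"]  # P ∪ {∅}; toy label set

def predict_mention(x_i, x_q, w_m, w_q):
    """z_i = argmax_z ( w_m[z]·x_i + w_q[z]·x_q ), an unnormalized log-linear score."""
    scores = w_m @ x_i + w_q @ x_q          # shape: (|LABELS|,)
    return LABELS[int(np.argmax(scores))]

def predict_pair(mentions, w_m, w_q):
    """y_p = OR over mentions of [z_i = p], for every predicate p except NONE."""
    z = [predict_mention(x_i, x_q, w_m, w_q) for x_i, x_q in mentions]
    return {p for p in z if p != "NONE"}

# Toy setup: 2 mention features, 2 question features, 3 labels.
w_m = np.array([[1.0, 1.0],    # NONE
                [2.0, 0.0],    # spouse
                [0.0, 2.0]])   # birthplace
w_q = np.array([[1.0, 1.0],
                [1.5, 0.0],
                [0.0, 1.5]])
mentions = [
    (np.array([1.0, 0.0]), np.array([1.0, 0.0])),  # strong "spouse" evidence
    (np.array([0.1, 0.1]), np.array([0.1, 0.1])),  # weak evidence -> NONE
]
print(predict_pair(mentions, w_m, w_q))  # → {'spouse'}
```

The OR aggregation mirrors the distant supervision assumption: a predicate holds for a pair as soon as at least one mention expresses it.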
To give the baseline model some knowledge about the question, we include question features (Table 3.1), which are based on the dependency tree and surface patterns of the question sentence. This information can help the model account for the question topic and improve pre...
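A surface-pattern feature extractor in this spirit can be sketched as below. The actual feature templates are those of Table 3.1 and also include dependency-tree features (which require a parser); the wh-word, wh-bigram, and last-word features here are illustrative stand-ins only.

```python
# Sketch of surface-pattern question features: wh-word, the bigram starting
# at the wh-word, and the final content word of the question. These templates
# are illustrative; the thesis's actual feature set is given in Table 3.1.
import re

WH_WORDS = {"who", "what", "when", "where", "which", "how", "why"}

def question_features(question):
    """Return a set of string-valued surface features for a question sentence."""
    tokens = re.findall(r"[a-z']+", question.lower())
    feats = set()
    for i, tok in enumerate(tokens):
        if tok in WH_WORDS:
            feats.add("wh=" + tok)
            if i + 1 < len(tokens):
                feats.add("wh_bigram=" + tok + "_" + tokens[i + 1])
            break
    if tokens:
        feats.add("last_word=" + tokens[-1])
    return feats

print(question_features("Who did Jane Doe marry?"))
# → {'wh=who', 'wh_bigram=who_did', 'last_word=marry'}
```

Features like these would be concatenated with the mention-based features as the x_q block scored by the question weights w_q.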
