Disambiguation by maximum overlap

Disambiguation by maximum overlap. The other case is shown in Figure 2. Here there are two mentions in File A, m-892 (TALIBAN MILITIA) and m-905 (TALIBAN), both overlapping with a single mention in File B, m-788 (THE NOW-OUSTED TALIBAN MILITIA), so it is not possible to match all of the mentions one-to-one. We choose the mapping with the greatest overlap, measured in characters, so m-892 and m-788 are taken to match, while m-905 is left without a match. In such cases of disambiguation by maximum overlap, a different matching, the one with less overlap, might be a better fit for one of the higher levels of annotation. This issue will be resolved in the future by using ENTITIES rather than ENTITY MENTIONS as the units to compare for the RELATION and EVENT levels.
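
The selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy strategy, the function names, and the character offsets are all assumptions made for illustration. The text gives only the mention strings, so the offsets below are hypothetical positions within "THE NOW-OUSTED TALIBAN MILITIA", counted from 0.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def char_overlap(a: Span, b: Span) -> int:
    """Number of characters shared by two half-open spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def match_by_max_overlap(
    file_a: Dict[str, Span], file_b: Dict[str, Span]
) -> Dict[str, str]:
    """Greedy one-to-one matching (an assumed strategy): the pairs with
    the largest character overlap are claimed first."""
    candidates: List[Tuple[int, str, str]] = []
    for id_a, span_a in file_a.items():
        for id_b, span_b in file_b.items():
            overlap = char_overlap(span_a, span_b)
            if overlap > 0:
                candidates.append((overlap, id_a, id_b))

    # Largest overlap first; ties are broken by mention id for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    matched: Dict[str, str] = {}
    used_b = set()
    for _, id_a, id_b in candidates:
        if id_a not in matched and id_b not in used_b:
            matched[id_a] = id_b
            used_b.add(id_b)
    return matched


# Hypothetical offsets: TALIBAN MILITIA, TALIBAN, and the full File B mention.
file_a = {"m-892": (15, 30), "m-905": (15, 22)}
file_b = {"m-788": (0, 30)}

mapping = match_by_max_overlap(file_a, file_b)
unmatched = [m for m in file_a if m not in mapping]
```

With these offsets, m-892 shares 15 characters with m-788 and m-905 shares only 7, so `mapping` pairs m-892 with m-788 and `unmatched` contains m-905, mirroring the outcome described for Figure 2.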