Dictionary level evaluation.

a. The aligned corpus will be used as the data source for running the dictionary creation and annotation tools. From the result, a random sample of 500 to 1000 bilingual entries will be inspected.

b. The entries will be evaluated according to the criteria explained above. The evaluation dimensions will be:

• Is the character code correct? Does the file contain only legal characters?

• Are all obligatory annotations available?

• Do the annotations contain legal values (data type; correct values for member-typed annotations; correct spelling / lemma presentation for string-typed values)? These criteria can be evaluated by a validation program (a sketch of such a check is given below); there should not be any obvious errors in the data.[23]

• Are the monolingual entries properly annotated? E.g. are all nouns annotated as "feminine" really feminine? Are the inflectional patterns correctly assigned? Are the parts of speech right? And so on. Annotation errors are wrong, incomplete, or missing values.

• Are the proposed translations correct? This will be evaluated by searching for the entries in other dictionaries (see the lookup sketch at the end of this section); an error means that the claimed translation cannot be found anywhere.

[22] The Alignment Error Rate (AER) is defined on word level, not on sentence level, cf. ▇.▇. ▇▇▇▇▇▇/▇▇▇▇▇ 2007.
[23] There can be unclear cases, e.g. in spelling; these should not count as errors.
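The first three dimensions lend themselves to automatic checking. The sketch below illustrates what such a validation program might look like; the entry format (one dictionary per bilingual entry) and the concrete field names and value sets are assumptions made for this example, not part of the agreed specification.

```python
import random
import unicodedata

# Assumed obligatory annotations and closed value sets; the actual
# specification is given elsewhere in the project documentation.
OBLIGATORY = {"lemma", "pos", "translation"}
VALUE_SETS = {
    "pos": {"noun", "verb", "adjective", "adverb"},
    "gender": {"feminine", "masculine", "neuter"},
}

def legal_characters(text):
    """Reject control and unassigned characters (Unicode category 'C*'), except tab/newline."""
    return all(
        unicodedata.category(ch)[0] != "C" or ch in "\t\n"
        for ch in text
    )

def validate_entry(entry):
    """Return a list of formal errors for one bilingual entry (a plain dict here)."""
    errors = []
    # Character code: only legal characters in string-valued fields.
    for key, value in entry.items():
        if isinstance(value, str) and not legal_characters(value):
            errors.append(f"illegal character in field '{key}'")
    # Obligatory annotations must be present.
    for field in OBLIGATORY - entry.keys():
        errors.append(f"missing obligatory annotation '{field}'")
    # Member-typed annotations must use legal values.
    for field, allowed in VALUE_SETS.items():
        if field in entry and entry[field] not in allowed:
            errors.append(f"illegal value '{entry[field]}' for '{field}'")
    return errors

def sample_and_validate(entries, sample_size=500, seed=1):
    """Draw the random sample (500-1000 entries) and run the formal checks."""
    random.seed(seed)
    sample = random.sample(entries, min(sample_size, len(entries)))
    report = []
    for entry in sample:
        errors = validate_entry(entry)
        if errors:
            report.append((entry.get("lemma", "<no lemma>"), errors))
    return report
```

Spelling questions and other genuinely unclear cases would still have to be resolved by the human evaluators, as noted in footnote 23.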
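For the last dimension, the manual search in other dictionaries could be supported by a simple lookup, sketched below under the assumption that each reference dictionary can be read into a mapping from source lemma to its attested translations; the function names and data layout are illustrative only.

```python
# Hypothetical cross-check of the proposed translations: each reference
# dictionary is modelled as a mapping from source lemma to a set of
# attested translations.

def translation_attested(lemma, translation, reference_dicts):
    """True if at least one reference dictionary lists this translation."""
    return any(translation in ref.get(lemma, set()) for ref in reference_dicts)

def unattested_translations(sample, reference_dicts):
    """Collect sampled entries whose claimed translation is found nowhere."""
    return [
        entry for entry in sample
        if not translation_attested(entry.get("lemma"), entry.get("translation"), reference_dicts)
    ]
```

Entries flagged by such a lookup would still need to be confirmed by the evaluators before being counted as errors.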