Segmenter Evaluation Clause Samples

Segmenter Evaluation. To evaluate the punctuation recovery and recasing functionalities of the segmenter, we calculate the Word Error Rate (WER) with and without taking case and punctuation into account (see Table 14).

Case         Punctuation   WER (%)   I      D      S      Recall (%)   Precision (%)
insensitive  insensitive   0.1       38     12     5      -            -
insensitive  sensitive     7.7       1526   1521   1916   54.9         55.0
sensitive    insensitive   5.3       38     12     2964   52.4         51.5
sensitive    sensitive     12.3      1504   1499   4892   -            -

Table 14: Word error rate (%) for the translation of simulated ASR output into transcription style, with and without taking case and punctuation into account; Corrected WER is obtained by subtracting the WER value when case and punctuation are not considered (i.e. 1.9%).

We extracted the number of punctuation marks in the reference (Nref) and in the hypothesis (Nhyp) by evaluating each file against the same file with punctuation marks removed; the number of punctuation marks is then the number of insertions. From these numbers we deduced the recall R and precision P of punctuation recovery in the following way (I, S and D stand respectively for the number of insertions, substitutions and deletions of the punctuation-sensitive evaluation in Table 14):

We extracted the number of upper-cased words in the reference (Nref) and in the hypothesis (Nhyp) by evaluating each file against the same file with case removed; the number of upper-cased words is then the number of substitutions.
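The counting procedure described above can be illustrated with a minimal Python sketch. It is not the evaluation tool used for Table 14: the function names (tokenize, edit_counts, wer, count_punctuation, count_uppercased), the whitespace tokenization, and the rule that a token made only of ASCII punctuation characters counts as a punctuation mark are all assumptions made for illustration, and the two sample strings are toy inputs.

```python
import string

PUNCT = set(string.punctuation)  # assumption: ASCII punctuation only


def tokenize(text, case_sensitive=True, punct_sensitive=True):
    """Split on whitespace; optionally drop punctuation tokens and lower-case."""
    tokens = text.split()
    if not punct_sensitive:
        # Assumption: punctuation marks are separate, punctuation-only tokens.
        tokens = [t for t in tokens if not all(c in PUNCT for c in t)]
    if not case_sensitive:
        tokens = [t.lower() for t in tokens]
    return tokens


def edit_counts(ref, hyp):
    """Return (insertions, deletions, substitutions) of a Levenshtein alignment."""
    n, m = len(ref), len(hyp)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(diag, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    # Backtrace from the bottom-right corner to count each operation type.
    ins = dele = subs = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            subs += ref[i - 1] != hyp[j - 1]
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            ins += 1
            j -= 1
        else:
            dele += 1
            i -= 1
    return ins, dele, subs


def wer(ref_text, hyp_text, case_sensitive=True, punct_sensitive=True):
    """Return (WER in %, I, D, S) under the chosen case/punctuation setting."""
    ref = tokenize(ref_text, case_sensitive, punct_sensitive)
    hyp = tokenize(hyp_text, case_sensitive, punct_sensitive)
    ins, dele, subs = edit_counts(ref, hyp)
    rate = 100.0 * (ins + dele + subs) / len(ref) if ref else 0.0
    return rate, ins, dele, subs


def count_punctuation(text):
    """Evaluate a text against itself without punctuation; count insertions."""
    stripped = tokenize(text, punct_sensitive=False)
    return edit_counts(stripped, tokenize(text))[0]


def count_uppercased(text):
    """Evaluate a text against itself lower-cased; count substitutions."""
    lowered = tokenize(text, case_sensitive=False)
    return edit_counts(lowered, tokenize(text))[2]


if __name__ == "__main__":
    reference = "This crate will travel by truck . Exports matter ."
    hypothesis = "this crate will travel by truck exports matter ."
    for case in (False, True):
        for punct in (False, True):
            print(case, punct, wer(reference, hypothesis, case, punct))
    print(count_punctuation(reference), count_punctuation(hypothesis))
    print(count_uppercased(reference), count_uppercased(hypothesis))
```

Because the punctuation-stripped file is a subsequence of the original file, the only optimal alignment between the two consists of insertions, so the insertion count equals the number of punctuation tokens. Likewise, the lower-cased file has the same length as the original, so every upper-cased word shows up as exactly one substitution.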
From these numbers we deduced the recall R and precision P of recasing in the following way (here we only consider case, so there can be no insertions or deletions, and S stands for the number of substitutions of the case-sensitive evaluation in Table 14):

The baseline segmenter translated the above example in this way:

Another product made in Germany , being packed for export this crate will travel by truck and then by ship to North Africa exports represent a huge part of Germany 's economy many engineering companies make most of their sales to foreign customers and that 's sad to increase even further next year . The German ▇▇▇▇▇▇▇▇ of commerce and industry forecasts Germany well export about 1.4 five trillion euros worth of goods in 20 14 , that would be over four percent more than Germany 's 20 13 exports , a big reason for the expected growth is the gradual recovery of eurozone economies . That means new investments and new orders for machinery from Germany , while that 's good news for German companies many foreign countries grumble that Germany isn 't returning the favor by sp...