SMT-based segmenter Clause Samples

SMT-based segmenter.

8.2.2.1. Monolingual SMT system

We built the SMT system with the Moses open-source toolkit, using the default options except for the following changes, which constrain the translation to differ from the original only in punctuation marks and case:

• The maximum-sentence-length filter was disabled, allowing sentences as long as our maximum segment size (500 words).
• Reordering was disabled, so translation was monotonic.
• The extracted phrase pairs were filtered to keep only those whose target side differs from the source side solely in punctuation marks and case.
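The phrase-pair filter described above could be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the phrase table is in the Moses text format, with fields separated by ` ||| ` and the source and target phrases in the first two fields. It also assumes that "punctuation" means any character in a Unicode punctuation category. The function names are hypothetical.

```python
import unicodedata
from typing import Iterable, Iterator, List


def normalize(phrase: str) -> List[str]:
    """Lowercase the phrase, drop punctuation characters, and tokenize.

    Assumption: punctuation is any character whose Unicode category
    starts with "P" (for example ".", ",", "?", quotes, dashes).
    """
    kept = "".join(
        ch for ch in phrase.lower()
        if not unicodedata.category(ch).startswith("P")
    )
    return kept.split()


def differs_only_by_punct_and_case(source: str, target: str) -> bool:
    """True if both sides contain the same words once case and punctuation are ignored."""
    return normalize(source) == normalize(target)


def filter_phrase_table(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the phrase-table lines whose target side passes the check.

    Assumption: each line looks like "source ||| target ||| scores ...".
    Lines with fewer than two fields are skipped.
    """
    for line in lines:
        fields = line.split(" ||| ")
        if len(fields) < 2:
            continue
        if differs_only_by_punct_and_case(fields[0], fields[1]):
            yield line
```

With this check, a pair such as `hello world ||| Hello , world .` is kept, while `hello world ||| hello there` is dropped. Note that stripping punctuation character by character also treats word-internal marks (such as the apostrophe in "don't") as removable. A stricter variant could compare only whole punctuation tokens.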