Implementation and performance. For certain languages targeted by PANACEA, standalone tools for the individual processing stages are not available. Instead, pipelines or comprehensive one-does-it-all tools cover all processing stages up to and including syntactic analysis (e.g. the Italian SynSg System, or the Charniak and RASP parsers). Almost half of the tools are implemented in Java, with C/C++ coming second together with some hybrid systems involving Perl, Python, etc. (Figure 2). More than half of the tools claim to be OS independent or, at least, to operate on both Windows and Linux (Figure 3).

[Licensing chart labels: Proprietary 38%, GPL 16%, Open source 9%]

[Figure 2: Programming Languages. Figure 3: Operating Systems. Chart labels: Java 48%, C/C++ 22%, Hybrid 22%, Windows 32%, Linux 13%, Other 9%]

It is not easy to compare the processing speed of different categories of NLP tools, especially when pipelines of tools rather than standalone versions are documented. The majority of the most basic tools (i.e. up to lemmatization) seem to perform relatively fast (Figure 5). Higher-level tools, such as constituency parsers for English, are much slower in comparison (Figure 5).

[Figure 5. Tools shown: ▇▇▇▇ Processing Tool, ▇▇▇▇ POS Tagger, ILSP SST, ILSP FBT, ILSP Lemmatizer, ▇▇▇▇ ▇▇▇▇▇▇▇, ILSP MENER, LT-Lemmatiser, LT-Decomposer, LT-MonoTermExtract, TreeTagger, Charniak, ▇▇▇▇▇, Berkeley, Stanford]