Dataset Sample Clauses

Dataset. Information or data from a database Service primarily devoted to market sizing by market segment and country, derived from a single (a) market segment and (b) country.
Dataset. The database we used was developed within the subalpine GIG. Samples were collected between 1997 and 2010 in the sublittoral zone of 19 Austrian, 25 German, 21 French and 28 Italian subalpine lakes. Ten of those lakes were sampled in 2 different years and 1 lake was sampled in 3 different years; each lake-year combination was treated as an independent sample unit. Invertebrates were identified to the lowest taxonomic level possible, mostly to genus/species level. Data gathered at more than one sampling site were aggregated to lake-year level.
Environmental variables. We considered climatic and morphological environmental variables (table 1): precipitation, mean annual temperature, difference between temperature in July and in January, lake surface area, lake mean depth and catchment area. The climatic data were gathered from the Climatic Research Unit (CRU) model (New et al. 2002; xxxx://xxx.xxx.xxx.xx.xx/).
Table 1: Ranges of the environmental variables.
Variable                      min      max
Mean annual Prec. (cm)        60.16    162.67
Mean annual Temp. (°C)        5.17     12.99
T(July)-T(January) (°C)       17.40    21.60
surface (km²)                 0.04     79.90
mean depth (m)                3.20     53.21
catchment (km²)               1.01     4551.60
Dataset. Dataset description. The data basis was compiled within the lake macroinvertebrate groups of the AL-GIG and CB-GIG. Seven countries with existing assessment systems for eulittoral macroinvertebrates and 4 additional countries contributed data. Nine countries are represented in the CB-GIG (table 1) and 3 in the AL-GIG (table 2).
Dataset. The conversational dataset used is the Friends TV show transcript¹. The Friends TV show transcript is a multiparty dialogue dataset with speaker names annotated, which contains the scripts from the ten seasons of the Friends TV show. Each season contains 24 episodes with around 6,000 utterances. The corpus, in total, includes 67,373 utterances, 126,059 sentences and 1,110,936 tokens. The training, validation and test sets were split in a 19:2:3 ratio where, for each season, the first 19 episodes were used as the training set, the 20 and […]
¹ xxxxx://xxxxxx.xxx/emorynlp/character-mining
Figure 3.1: Overview of a sequence tagging model with the bottom part representing ELMo embedding structure (before the top bi-RNN networks are applied) [35]
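The per-season 19:2:3 split described in this clause can be sketched as follows. This is a minimal illustration, not the authors' code: the source sentence is cut off after the training episodes, so assigning the next 2 episodes to validation and the last 3 to test is an assumption inferred from the ratio, and the function name `split_season` and the episode identifiers are hypothetical.

```python
from typing import Dict, List


def split_season(
    episodes: List[str], n_train: int = 19, n_val: int = 2
) -> Dict[str, List[str]]:
    """Split one season's episodes (in broadcast order) into three sets.

    Assumption: validation takes the n_val episodes right after the
    training episodes, and test takes whatever remains.
    """
    return {
        "train": episodes[:n_train],
        "validation": episodes[n_train:n_train + n_val],
        "test": episodes[n_train + n_val:],
    }


# Hypothetical episode identifiers for a 24-episode season.
season = [f"s01_e{i:02d}" for i in range(1, 25)]
parts = split_season(season)
print({name: len(eps) for name, eps in parts.items()})
# → {'train': 19, 'validation': 2, 'test': 3}
```

Splitting by whole episodes within each season, rather than by shuffled utterances, keeps every dialogue intact inside a single set.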
Dataset. The dataset used contains information taken from the accounts of 118 Instagram fashion influencers, amounting to 2703 lines of text. For each fashion influencer, the pseudonym, the link to the profile picture, the biography and the 100 most recent posts are provided. Each post consists of a link to the published picture and the textual caption of the post, together with some basic metrics such as the number of comments and likes. We extract the textual data from this dataset, which revolves around the roughly 100 textual captions of each fashion influencer's posts, and we apply the following NLP tools.
Dataset. Any data you provide to the Project is subject to the license agreement indicated in the Project’s source repository for the Materials.
Dataset. The primary focus of the biometrics research we perform is to develop algorithms, techniques and tools for automatic recognition of humans. As part of this research work, we are building this face database. The database enables researchers to develop, test and publish human recognition algorithms.
Dataset. The dataset used for the study reported in this section is the same used by Xxxxx et al. (2009b) to derive ITA08, consisting of NR=553 strong motion recordings from a total of NE=106 earthquakes with moment magnitude MW varying from 4.0 to 6.9 and recorded at epicentral distances up to about 100 km (see Figure 2.3). The considered sites (total number NS=206) were initially classified using the soil classification proposed by Xxxxxxx and Xxxxxxxx (1996), where Vs30 has been introduced for the sake of clarity.
Dataset. For Chinese-English (ZH-EN) translation, our training data for the translation task consists of 1.25M Chinese-English sentence pairs extracted from LDC corpora¹. The NIST02 testset is chosen as the development set, and the XXXX00, 0Xxx corpora include LDC2002E18, LDC2003E07, LDC2003E14, Hansards portion of LDC2004T07, LDC2004T08 and LDC2005T06.
#   Model                       NIST / WMT
    Existing NMT Systems
1   EDR (Tu et al., 2017)       N/A      N/A      33.73    34.15    N/A      N/A
2   DB (Xxxxx et al., 2018)     38.02    40.83    X/X      X/X      X/X      X/X
    Xxx XXX Systems
3   Transformer(Base)           45.57    46.40    46.11    44.92    45.75    27.28
4   +lossmse                    46.71†   47.23†   47.12†   45.78†   46.71    28.11†
5   +lossmse + enhanced         46.94†   47.52†   47.43†   46.04†   46.98    28.38†
6   Transformer(Big)            46.73    47.36    47.15    46.82    47.01    28.36
7   +lossmse                    47.43†   47.96    47.78    47.39    47.74    28.71
8   +lossmse + enhanced         47.68†   48.13†   47.96†   47.56†   47.83    28.92†
Dataset. The dataset is divided into two groups: native prose and translated prose. Native prose comprises the so-called Mabinogion corpus, found in the White Book of Rhydderch (NLW Peniarth 4 & 5, dated c. 1350) and the slightly later Red Book of Hergest (Xxxxx College Oxford 111, dated c. 1385). This corpus consists of eleven narrative tales, usually dated ‘between the end of the eleventh and the beginning of the fourteenth centuries’; the details of their date remain debated (Xxxxxx 1998: 134; Rodway 2013: 1). The first four tales are known as the Pedeir Keinc, the ‘Four Branches’ (Xxxxxxxx 1930). These include narratives named after the four major characters: Xxxxx, Xxxxxxx, Xxxxxxxxx and Math. Then there are three Arthurian tales about Xxxxxxx, Xxxxx and Xxxxxxx, traditionally labelled the ‘Three Romances’, although this label is contested (see esp. Xxxxx-Xxxxxx 2004). Arthurian literature of this kind featuring the same protagonists is found in other European languages, including Xxxxxxxx xx Xxxxxx’s French versions. The relationship of the Welsh Romances to their French counterparts has long been debated, and the current consensus treats them as native compositions, but ones influenced to some extent by the French versions (see e.g. Xxxxx-Xxxxxx 1991; see also Xxxxx & Xxxx 2006 & 2008, Xxxx 2010, and Xxxxx-Xxxxx 2014 on the Welsh Charlemagne cycle). We include the Romances tentatively in the native corpus, but will also discuss them separately in section 4 to see whether they differ significantly from the other native texts.
We then have four further native tales: Culhwch ac Olwen, Breudwyt Maxen (‘The Dream of Macsen’), Breudwyt Ronabwy (‘The Dream of Rhonabwy’) and Cyfranc Lludd a Llefelys (‘The Tale/Encounter of Lludd and Llefelys’, which also occurs inserted into the NLW Llanstephan 1 version of Brut y Brenhinedd and other later versions). Culhwch ac Olwen is usually taken to be somewhat earlier linguistically than the other tales of the Mabinogion corpus (see Rodway 2013: 1, fn. 2) and we analyse this tale separately in section 4 to see whether it differs from the other texts as regards adjectival agreement. For our annotated corpus, only the edited versions of the White Book of Rhydderch texts have been used.