Dataset Statistics Sample Clauses

Dataset Statistics. To get an idea of the distribution of atoms in the datasets, the number of unique atoms in the subject, predicate, and object positions was computed and tabulated in Table 2.1. Additionally, the number of atoms that could possibly take part in each of the 3 possible joins between different positions was calculated and tabulated in Table 2.2. The abbreviations S, P, and O denote subject, predicate, and object, respectively.

Dataset     Uniq Subj   Uniq Pred   Uniq Obj
DBpedia     136108      8878        282199
Uniprot     592639      79          294676
SP2Bench    31629       61          81919
Table 2.1: Subject, Predicate, Object statistics

Dataset     SP      SO        PO
DBpedia     0       48085     0
Uniprot     0       105560    8
SP2Bench    0       14816     0
Table 2.2: Join statistics
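The counts described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: it assumes the triples are available as (subject, predicate, object) string tuples, and it assumes that "atoms that could take part in a join" means atoms occurring in both positions, i.e. the intersection of the two position sets. The function name `position_stats` and the toy triples are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def position_stats(triples: Iterable[Triple]) -> Dict[str, int]:
    """Count unique atoms per position and atoms shared between positions.

    Assumption: an atom can take part in an S-O join (for example) if it
    occurs at least once as a subject and at least once as an object.
    """
    subjects, predicates, objects = set(), set(), set()
    for s, p, o in triples:
        subjects.add(s)
        predicates.add(p)
        objects.add(o)
    return {
        "uniq_subj": len(subjects),
        "uniq_pred": len(predicates),
        "uniq_obj": len(objects),
        "SP": len(subjects & predicates),  # atoms seen as subject and predicate
        "SO": len(subjects & objects),     # atoms seen as subject and object
        "PO": len(predicates & objects),   # atoms seen as predicate and object
    }


# Hypothetical toy data, not taken from any of the datasets above.
toy_triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:name", '"Carol"'),
    ("ex:knows", "rdfs:label", '"knows"'),
]
toy_stats = position_stats(toy_triples)
print(toy_stats)
```

In the toy data, `ex:knows` is the only atom used both as a subject and as a predicate, while `ex:bob` and `ex:carol` appear as both subject and object. For datasets of the size reported in Table 2.1, holding three sets of strings in memory may or may not be practical, so a streaming or dictionary-encoded variant might be preferable.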
Dataset Statistics. The new dataset was created from a pool of 6512 application resumes from the School of Nursing at Emory University. All of these resumes were submitted for one specific job, Clinical Research Coordinator (CRC), which is divided into four levels: CRC I, CRC II, CRC III, and CRC IV. Each CRC level may include multiple distinct jobs: there are 108 jobs for CRC I, 88 jobs for CRC II, 29 jobs for CRC III, and 6 jobs for CRC IV. Each of the 6512 unique resumes corresponds to one applicant, and some applicants applied for multiple jobs at the same level or across levels, for a total of 25027 applications. Because one applicant can have multiple applications, one more cleaning step was necessary: applicants were grouped by the highest level they applied for. For example, an applicant who applied for both CRC I and CRC II was grouped as a CRC II applicant. Under this grouping, the ratio of applicants across the four levels was 28:12:8:2. Then, 2025 resumes were randomly selected to form the dataset: 1134 resumes from CRC I applicants, 486 from CRC II applicants, 324 from CRC III applicants, and 81 from CRC IV applicants. The annotation of these 2025 resumes is explained in more detail in Section 3.2.
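The grouping rule and the per-level sample sizes can be checked with a short Python sketch. It is a hedged illustration only: the applicant identifiers, the `(applicant, level)` record layout, and the helper names `highest_level` and `stratum_sizes` are assumptions made for the example, and the passage does not say how the random selection itself was implemented.

```python
from typing import Dict, Iterable, Tuple

LEVELS = ["CRC I", "CRC II", "CRC III", "CRC IV"]
RANK = {level: i for i, level in enumerate(LEVELS)}


def highest_level(applications: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Assign each applicant to the highest CRC level they applied for."""
    best: Dict[str, str] = {}
    for applicant, level in applications:
        if applicant not in best or RANK[level] > RANK[best[applicant]]:
            best[applicant] = level
    return best


def stratum_sizes(total: int, ratio: Dict[str, int]) -> Dict[str, int]:
    """Split `total` across levels in proportion to `ratio`.

    Integer division is exact for the numbers in the text; other inputs
    would need an explicit rounding rule.
    """
    weight = sum(ratio.values())
    return {level: total * share // weight for level, share in ratio.items()}


# Hypothetical applications: (applicant id, level applied for).
toy_applications = [
    ("a1", "CRC I"), ("a1", "CRC II"),
    ("a2", "CRC I"),
    ("a3", "CRC III"), ("a3", "CRC I"),
]
groups = highest_level(toy_applications)

# Ratio and total as stated in the text.
sizes = stratum_sizes(2025, {"CRC I": 28, "CRC II": 12, "CRC III": 8, "CRC IV": 2})
print(groups)
print(sizes)
```

With the stated ratio 28:12:8:2 and a total of 2025, the split gives 1134, 486, 324, and 81, matching the per-level counts reported above.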

Related to Dataset Statistics

  • Statistics The Parties shall endeavour to promote, in accordance with existing statistical cooperation activities between the Union and ASEAN, the harmonisation of statistical methods and practices including the gathering and dissemination of statistics, thus enabling them to use, on a mutually acceptable basis, statistics on trade in goods and services, foreign direct investment and, more generally, on any other area covered by this Agreement which lends itself to statistical data collection, processing, analysis and dissemination.

  • Usage Statistics The Distributor shall ensure that the Publisher will provide access to both composite system-wide use data and itemized data for the Licensee, the Participating Institutions, individual campuses and labs, on a monthly basis. The statistics shall meet or exceed the most recent project Counting Online Usage of NeTworked Electronic Resources ("COUNTER") Code of Practice Release, including but not limited to its provisions on customer confidentiality. When a release of a new COUNTER Code of Practice is issued, the Distributor shall ensure that the Publisher will comply with the implementation time frame specified by COUNTER to provide usage statistics in the new standard format. It is more than desirable that the Standardized Usage Statistics Harvesting Initiative (SUSHI) Protocol is available for the Licensee to harvest the statistics.

  • Financial Statements Statistical Data 2.6.1. The financial statements, including the notes thereto and supporting schedules included in the Registration Statement and the Prospectus, fairly present the financial position and the results of operations of the Company at the dates and for the periods to which they apply. Such financial statements have been prepared in conformity with generally accepted accounting principles of the United States, consistently applied throughout the periods involved, and the supporting schedules included in the Registration Statement present fairly the information required to be stated therein. No other financial statements or supporting schedules are required to be included in the Registration Statement. The Registration Statement discloses all material off-balance sheet transactions, arrangements, obligations (including contingent obligations), and other relationships of the Company with unconsolidated entities or other persons that may have a material current or future effect on the Company's financial condition, changes in financial condition, results of operations, liquidity, capital expenditures, capital resources, or significant components of revenues or expenses. There are no pro forma or as adjusted financial statements which are required to be included in the Registration Statement and the Prospectus in accordance with Regulation S-X which have not been included as so required. 2.6.2. The statistical, industry-related and market-related data included in the Registration Statement and the Prospectus are based on or derived from sources which the Company reasonably and in good faith believes are reliable and accurate, and such data agree with the sources from which they are derived.

  • Statistical Data The statistical, industry-related and market-related data included in the Registration Statement, the Sale Preliminary Prospectus, and/or the Prospectus are based on or derived from sources that the Company reasonably and in good faith believes are reliable and accurate, and such data materially agree with the sources from which they are derived.

  • Statistical Sampling Documentation a. A copy of the printout of the random numbers generated by the “Random Numbers” function of the statistical sampling software used by the IRO. b. A description or identification of the statistical sampling software package used by the IRO.