{"component": "clause", "props": {"groups": [{"snippet_links": [{"key": "employees-may", "type": "clause", "offset": [8, 21]}, {"key": "hours-per-day", "type": "clause", "offset": [39, 52]}, {"key": "a-school", "type": "clause", "offset": [78, 86]}, {"key": "day-probationary-period", "type": "clause", "offset": [161, 184]}, {"key": "contractual-benefits", "type": "clause", "offset": [292, 312]}], "samples": [{"hash": "9DRbaio7rUY", "uri": "/contracts/9DRbaio7rUY#cleaner", "label": "Collective Bargaining Agreement", "score": 29.3004779816, "published": true}, {"hash": "9iY1TYK4fAf", "uri": "/contracts/9iY1TYK4fAf#cleaner", "label": "Collective Bargaining Agreement", "score": 22.7146072388, "published": true}], "size": 4, "snippet": "Cleaner employees may be hired up to 8 hours per day and may be scheduled for a school-year or up to a 52-week duration. Cleaners will have a sixty (60) working day probationary period. Seniority shall be accrued. The following provisions apply to this classification and supersede any other contractual benefits:", "hash": "c4e2fa6057d4d836af659c6a4745a9e5", "id": 3}, {"snippet_links": [{"key": "employed-by-the-company", "type": "definition", "offset": [3, 26]}, {"key": "associated-equipment", "type": "clause", "offset": [168, 188]}], "samples": [{"hash": "f40abklxZR3", "uri": "/contracts/f40abklxZR3#cleaner", "label": "Collective Bargaining Agreement", "score": 30.0671768188, "published": true}, {"hash": "1EFo3o6zTtc", "uri": "/contracts/1EFo3o6zTtc#cleaner", "label": "Collective Bargaining Agreement", "score": 29.61003685, "published": true}, {"hash": "tP9ldVaL16", "uri": "/contracts/tP9ldVaL16#cleaner", "label": "Collective Bargaining Agreement", "score": 28.2759227753, "published": true}], "size": 16, "snippet": "Is employed by the Company to clean aircraft exteriors, interiors (including furnishings), perform other operational cleaning functions relative to aircraft, parts and associated equipment.", "hash": 
"96d13c185f2ef5c1e33d893919f7b0f6", "id": 1}, {"snippet_links": [{"key": "work-of-the", "type": "clause", "offset": [4, 15]}], "samples": [{"hash": "1iEAsteOidw", "uri": "/contracts/1iEAsteOidw#cleaner", "label": "Collective Bargaining Agreement", "score": 25.0540409088, "published": true}, {"hash": "dMNe4ct9te5", "uri": "/contracts/dMNe4ct9te5#cleaner", "label": "Collective Bargaining Agreement", "score": 24.9445457458, "published": true}, {"hash": "lIs7rOnMux8", "uri": "/contracts/lIs7rOnMux8#cleaner", "label": "Collective Bargaining Agreement", "score": 23.0068454742, "published": true}], "size": 4, "snippet": "The work of the Cleaner classification, depending upon assignment, may include the following:", "hash": "a90aef5d2b8183ac47d5e59066238775", "id": 2}, {"snippet_links": [{"key": "mailing-address", "type": "definition", "offset": [7, 22]}, {"key": "city-of", "type": "clause", "offset": [47, 54]}, {"key": "state-of", "type": "clause", "offset": [76, 84]}, {"key": "service-provider", "type": "clause", "offset": [119, 135]}], "samples": [{"hash": "gmZAWraTXk3", "uri": "/contracts/gmZAWraTXk3#cleaner", "label": "Cleaning Services Contract", "score": 29.320892334, "published": true}, {"hash": "6HcKhE0CXDz", "uri": "/contracts/6HcKhE0CXDz#cleaner", "label": "Cleaning Services Contract", "score": 23.1615333557, "published": true}, {"hash": "9XlljZzErXP", "uri": "/contracts/9XlljZzErXP#cleaner", "label": "Cleaning Services Contract", "score": 21.9199180603, "published": true}], "size": 3, "snippet": "with a mailing address of ___________________, City of ___________________, State of ___________________, (\u201cCleaner\u201d). 
Service Provider and Client are each referred to herein as a \u201cParty\u201d and, collectively, as the \"Parties.\"", "hash": "95121876cbcfbe1d7ae7c9faab63f84a", "id": 4}, {"snippet_links": [{"key": "web-service", "type": "definition", "offset": [72, 83]}, {"key": "the-service", "type": "clause", "offset": [157, 168]}, {"key": "optional-parameters", "type": "clause", "offset": [292, 311]}, {"key": "type-of", "type": "definition", "offset": [350, 357]}, {"key": "content-of-the", "type": "clause", "offset": [578, 592]}, {"key": "according-to", "type": "definition", "offset": [676, 688]}, {"key": "for-example", "type": "clause", "offset": [702, 713]}, {"key": "to-apply", "type": "clause", "offset": [815, 823]}, {"key": "the-source", "type": "clause", "offset": [1003, 1013]}, {"key": "the-user", "type": "definition", "offset": [1080, 1088]}, {"key": "structural-information", "type": "clause", "offset": [1242, 1264]}, {"key": "the-method", "type": "definition", "offset": [1394, 1404]}, {"key": "terms-of", "type": "definition", "offset": [2010, 2018]}, {"key": "in-short", "type": "clause", "offset": [2085, 2093]}, {"key": "the-value", "type": "clause", "offset": [2113, 2122]}, {"key": "the-default-value", "type": "definition", "offset": [2288, 2305]}, {"key": "the-current", "type": "clause", "offset": [2367, 2378]}, {"key": "supported-languages", "type": "clause", "offset": [2405, 2424]}, {"key": "a-list", "type": "definition", "offset": [3124, 3130]}, {"key": "the-sub", "type": "clause", "offset": [3210, 3217]}, {"key": "of-terms", "type": "clause", "offset": [3308, 3316]}, {"key": "a-sub", "type": "clause", "offset": [3475, 3480]}, {"key": "in-addition", "type": "clause", "offset": [3543, 3554]}, {"key": "these-terms", "type": "clause", "offset": [3584, 3595]}], "samples": [{"hash": "iekqAdfiy4S", "uri": "/contracts/iekqAdfiy4S#cleaner", "label": "Grant Agreement", "score": 23.9125595093, "published": true}, {"hash": "cio9hfFSNnc", "uri": 
"/contracts/cio9hfFSNnc#cleaner", "label": "Grant Agreement", "score": 23.9125595093, "published": true}], "size": 2, "snippet": "The Cleaner module described in 2.1.3 is also available as a standalone web service accessible from \u2587\u2587\u2587\u2587://\u2587\u2587\u2587.\u2587\u2587\u2587\u2587.\u2587\u2587/soaplab2-axis/#ilsp.ilsp_cleaner_row. The service has one mandatory parameter:\n1. The input parameter is the URL of a web document to be cleaned. The Cleaner also uses five optional parameters:\n1. The outputType parameter sets the type of the output. It can be: i) a text file containing only the clean text, ii) an XML file containing metadata of the web document and the clean text only, and iii) an XML file containing metadata of the web document and the content of the web document annotated as boilerplate or text. Users can select the type of output according to their needs. For example, the first type might be useful for somebody who has already downloaded web documents and would like to apply de-duplication on document level by using only the clean text of the downloaded web documents. The second type could be useful for someone who would like to extract metadata from the source web documents and keep only the clean text from these sources. If the user is interested in both boilerplate and clean text, the third type should be selected. It is worth mentioning that both the second and third types provide structural information about the web document, by using the attribute type and the values title, heading or listitem.\n2. The methodsList parameter sets the method for removing boilerplate. Boilerpipe provides six methods: ArticleExtractor, ArticleSentencesExtractor, DefaultExtractor, KeepEverythingExtractor, LargestContentExtractor, and NumWordsRulesExtractor (default). 
Short descriptions of the methods are reported at \u2587\u2587\u2587\u2587://\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587.\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587\u2587.\u2587\u2587\u2587/ svn/trunk/boilerpipe-core/ javadoc/1.0/index.html. The attribute crawlinfo with value boilerplate will be added to every paragraph of the web document which has been classified as boilerplate. Remaining paragraphs constitute the clean text.\n3. The minimumLength parameter defines the minimum accepted length in terms of tokens for each paragraph of the clean text. Users not interested in short paragraphs can set the value of this parameter accordingly. The attribute crawlinfo with value ooi-length will be added to every paragraph of the clean text with length less than minimumLength. The default value is 10.\n4. The language parameter sets the targeted language. The current list of ISO 639 codes for supported languages includes en, el, es, fr, it and de. Selecting one of these languages implies that the user is only interested in content in this language. Therefore, the embedded language identifier will be applied on each \u201caccepted\u201d paragraph (i.e. each paragraph that has not been classified as boilerplate and has length over the minimumLength), and a crawlinfo attribute with value \u2587\u2587\u2587-\u2587\u2587\u2587\u2587 will be added to every paragraph that is not in the targeted language. If there is no targeted language (default), the embedded language identifier will be applied on the main content (clean text) of the web document, and the ISO code of the identified language code will fill the element <language>.\n5. The termList is a list of triplets (<relevance weight, term, topic-class>) that define the domain, or the sub-domains. This parameter can be provided by uploading an already existing file with a list of terms as described in section 2.1.10 above. 
The embedded text to topic classifier will be applied on the document and, if the document is classified as relevant to a sub-domain, the <subdomain> container will be filled accordingly. In addition, the Cleaner will search for these terms in each \u201caccepted\u201d paragraph. If one or more terms are found in a paragraph, the attribute topic will be added to this paragraph, and found terms will be stored as the attribute value.", "hash": "1502a3ed5662a9ac956ddf9a9841e39d", "id": 5}, {"snippet_links": [{"key": "web-page", "type": "clause", "offset": [42, 50]}, {"key": "to-ensure", "type": "clause", "offset": [288, 297]}, {"key": "the-production", "type": "clause", "offset": [298, 312]}, {"key": "modified-version", "type": "definition", "offset": [374, 390]}, {"key": "structural-information", "type": "clause", "offset": [452, 474]}], "samples": [{"hash": "iekqAdfiy4S", "uri": "/contracts/iekqAdfiy4S#cleaner", "label": "Grant Agreement", "score": 23.9125595093, "published": true}, {"hash": "cio9hfFSNnc", "uri": "/contracts/cio9hfFSNnc#cleaner", "label": "Grant Agreement", "score": 23.9125595093, "published": true}], "size": 2, "snippet": "Apart from its textual content, a typical web page also contains certain \u201cnoise\u201d elements including navigation links, advertisements, disclaimers, etc. (often called boilerplate) of only limited or no use for linguistic purposes. Such irrelevant parts should be removed or marked as such to ensure the production of good-quality language resources. For this task FMC uses a modified version of Boilerpipe9 (Kohlsch\u00fctter et al, 2010) that also extracts structural information like title, heading and list item. It also segments text in paragraphs exploiting the presence of specific HTML tags like <p>, </br> and <li>. Paragraphs judged to be boilerplate and/or detected as titles, etc. 
are properly annotated (see subsection 2.1.8)", "hash": "aadd6549d76848ed978680eb59bf729c", "id": 6}, {"snippet_links": [{"key": "to-ensure", "type": "clause", "offset": [0, 9]}, {"key": "common-areas", "type": "clause", "offset": [30, 42]}, {"key": "lift-lobbies", "type": "clause", "offset": [56, 68]}, {"key": "parking-area", "type": "definition", "offset": [79, 91]}, {"key": "the-complex", "type": "definition", "offset": [167, 178]}], "samples": [{"hash": "lIjv9JnvKup", "uri": "/contracts/lIjv9JnvKup#cleaner", "label": "Lease Agreement (Augmedix, Inc.)", "score": 33.2313499451, "published": true}, {"hash": "hZNkbGS3yKn", "uri": "/contracts/hZNkbGS3yKn#cleaner", "label": "Lease Agreement (Augmedix, Inc.)", "score": 33.2313499451, "published": true}], "size": 2, "snippet": "To ensure hygiene cleaning of common areas, stairwells, lift lobbies, rooftop, parking area, garden/garden area, void areas, security post and usable areas outside of the complex", "hash": "7edeeb2193dca942dea7cb62a72373d1", "id": 7}, {"snippet_links": [{"key": "the-parties", "type": "definition", "offset": [0, 11]}, {"key": "agree-to", "type": "clause", "offset": [19, 27]}, {"key": "provisions-regarding", "type": "clause", "offset": [42, 62]}], "samples": [{"hash": "2VoLY3Nw2wO", "uri": "/contracts/2VoLY3Nw2wO#cleaner", "label": "Collective Agreement", "score": 31.3425521851, "published": true}, {"hash": "16yik4ZDdMa", "uri": "/contracts/16yik4ZDdMa#cleaner", "label": "Collective Agreement", "score": 29.7631568909, "published": true}], "size": 2, "snippet": "The parties hereby agree to the following provisions regarding the Cleaner position:", "hash": "c99d5f280aff1fd7175f84669de3ce15", "id": 8}, {"snippet_links": [{"key": "employees-shall", "type": "clause", "offset": [8, 23]}], "samples": [{"hash": "abNnlaDtSgM", "uri": "/contracts/abNnlaDtSgM#cleaner", "label": "Labor Agreement", "score": 24.6143417358, "published": true}, {"hash": "fDKq7pkZFh7", "uri": "/contracts/fDKq7pkZFh7#cleaner", 
"label": "Labor Agreement", "score": 24.4488334656, "published": true}], "size": 2, "snippet": "Cleaner employees shall clean the interior and exterior of equipment such as service cars, buses, trolleys, cutaways, trucks and vans.", "hash": "dc34725a2bb9f7219d4541a6f383198b", "id": 9}, {"snippet_links": [{"key": "related-to", "type": "definition", "offset": [77, 87]}, {"key": "extension-of-the", "type": "clause", "offset": [194, 210]}], "samples": [{"hash": "caB1wGb0Zb1", "uri": "/contracts/caB1wGb0Zb1#cleaner", "label": "Grant Agreement", "score": 22.8285617828, "published": true}, {"hash": "5ay6FIyNrTu", "uri": "/contracts/5ay6FIyNrTu#cleaner", "label": "Grant Agreement", "score": 22.8285617828, "published": true}], "size": 2, "snippet": "The Cleaner aims to detect and remove boilerplate text that typically is not related to the main content (e.g. navigation links, advertisements, disclaimers, etc.) from a web document. It is an extension of the Boilerplate remover service described in subsection 3.3 of D4.2", "hash": "a323c32589e8b4578271bd658694c93d", "id": 10}], "next_curs": "ClASSmoVc35sYXdpbnNpZGVyY29udHJhY3RzciwLEhZDbGF1c2VTbmlwcGV0R3JvdXBfdjU2IhBjbGVhbmVyIzAwMDAwMDBhDKIBAmVuGAAgAA==", "clause": {"children": [["", ""], ["tech-ii-paint-and-body-employee", "Tech II Paint and Body Employee"], ["master-paint-body-technician-employee", "Master Paint & Body Technician Employee"], ["tech-i-paint-and-body-employee", "Tech I Paint and Body Employee"], ["tech-iii-paint-and-body-employee", "Tech III Paint and Body Employee"]], "title": "Cleaner", "parents": [["travel-mileage", "Travel/Mileage"], ["military-leave", "Military Leave"], ["vacations", "VACATIONS"], ["vacation-bump-downs", "Vacation Bump Downs"], ["union-viewing-of-recorded-data", "Union Viewing of Recorded Data"]], "size": 65, "id": "cleaner", "related": [["cleaning", "Cleaning", "Cleaning"], ["wash", "Wash", "Wash"], ["trash", "Trash", "Trash"], ["welding", "Welding", "Welding"], ["containers", 
"Containers", "Containers"]], "related_snippets": [], "updated": "2025-07-24T06:49:08+00:00", "also_ask": ["What are the essential duties and standards that must be specified for the Cleaner role?", "How can liability for damage or loss during cleaning be allocated or limited?", "What termination or replacement rights should be included for non-performance?", "How does this clause compare to industry-standard cleaning service agreements?", "What are the main enforceability challenges if the Cleaner fails to meet obligations?"], "drafting_tip": "Specify the cleaner's duties and standards to ensure clarity of expectations; define payment terms to prevent disputes; require insurance coverage to mitigate liability risks.", "explanation": ""}, "json": true, "cursor": ""}}