Hash join Sample Clauses

Hash join. The hash join requires the creation of a hash table for the triples of one rela- tion, with the key being the atom and the values being the triples containing that atom. Looking up an atom in the hash table can be achieved in amor- tized O(l) time. To minimize collisions, it is better to hash the triples of the smaller list. Since all triples in both relations have to be visited, it does not help to hash the triples of the larger relation. Once the hash table has been built for the smaller relation, the triples of the larger relation are iterated and use the hash table to see if they satisfy the join condition. Each time two triples satisfy the condition, they are both are added to the result set. The I/O cost for the hash join is #blocks(Rl) + #blocks(R2), where Rl and R2 are the joining relations. All blocks of both relations have to be visited, so the hash join has a very fixed I/O performance profile.