Storage cont. #
Sorted files cont. #
Since inserting into a ordered structure is expensive, we keep track of new inserts in a separate file.
Master files are ordered, overflow file are unordered.
New inserts go to the overflow file.
When we search:
- Start by searching the ordered master file,
\( O(\lg n) \)
- If we don’t find it, it is possible that it is in the overflow file
- Search the unordered overflow file, \( O(n) \) however it is a small file
If at any point during the search process we find a match, the search returns. Periodically, the overflow file is merged into the master file.
Hashing functions #
- Applies hashing techniques to DBMS
- Hashing for files on disk is called external hashing
In summary #
There are 3 ways to order files
- unordered
- sorted
- organized by hash
We can order the same table in multiple ways by using the tuple’s indexes. That way we can do a fast search on any attribute.
Indexing structures #
File: Indexing slides
DBMS can take care of building indexes for specific attributes for you.
B-Tree #
A lot of DBMS use B+ Tree for indexing files.