File

In a database, a file is collection of pages which contain records. More specifically, a file is a doubly-linked list where a page is a node connected to other pages. Pages can be considered as arrays of records.

%%🖋 Edit in Excalidraw, and the light exported image%%

A DBMS can implement the CRUD (Create Read Update Delete) principle only if it can:

  • Read a particular record (using it’s record id)
  • Insert, delete & modify a record
  • Scan all records. It should also be able to use conditions to scan particular records (such as when constructing SQL queries)

The common file structures are:

Analysing Runtimes

In order to properly analyse what makes a file structure optimized, we first need to identify a basic operation, similar to analysing time complexity of algorithms.

The basic operation for database systems is bringing a page from HDD to RAM (which is require in order to process it). This is also known as the number of page accesses also known as disk I/O operations.

Heap File

  • Simplest of the three file structures
  • Pages are allocated (and de-allocated) as records grow and shrink. That is, it is a dynamic data structure
  • Not sorted

%%🖋 Edit in Excalidraw, and the light exported image%%

OperationSpeedWhy?
InsertionFastBecause it is basically an unsorted linked list, has very fast insertion and deletion operations
as we can simply add new pages to the end of the file
DeletionFastSame process as insertion
Querying an attribute /
Return pages satisfying some condition
SlowNeed to linear search through all pages.

Sorted File

  • Like heap file, but records (and hence pages) are sorted based on some attribute.
  • Very fast for range queries, but hard to maintain, since records (and sometimes pages) need to be reordered every time a new record is inserted

%%🖋 Edit in Excalidraw, and the light exported image%%

OperationSpeedWhy?
InsertionSlowNeed to resort the file every time a new record is inserted
DeletionFastDeleting a record still keeps the file sorted.
Querying an attribute /
Return pages satisfying some condition
FastCan binary search through records (and pages)

Indexed File

Another file structure uses indexes to allow faster file access.

Database Indexing