- PageId is a class containing the table name, base RID, column index, a base-or-tail flag, and the page's sequential ID (within the column)
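A minimal sketch of what such a class could look like, assuming a frozen dataclass so instances are hashable and can serve as keys in the bufferpool's dicts and sets. The field names and types here are illustrative assumptions, not settled decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageId:
    """Identifies one physical page; frozen so it is hashable and usable as a dict key."""
    table_name: str
    base_rid: int      # base RID anchoring the page range (assumed to be an int)
    column_index: int
    is_tail: bool      # False for a base page, True for a tail page
    page_seq: int      # sequential ID of the page within its column


a = PageId("grades", 0, 2, False, 0)
b = PageId("grades", 0, 2, False, 0)
cache = {a: b"page bytes"}
print(cache[b])  # b has the same field values as a, so it finds a's entry
```

Value-based equality matters here: two lookups for the same logical page must land on the same frame even if the PageId objects were built separately.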
- Bufferpool
- maintain in-memory cache of pages (frames)
pages: dict[PageId, Page]
dirty_pages: set[PageId]
range_refcount: dict[int, int]
tail_index: for each page range, stores the RID mapping. When range_refcount for that page range reaches zero, delete the range's RID index after a delay, or sooner if we are nearing the memory limit.
- when len(frames) exceeds the configured threshold (e.g. a percentage of memory, or a fixed budget such as 1 GiB), evict frames using an LFU policy
- handle read & prefetch & write requests and automatically pin frames when they are requested
- Reads can run concurrently, but a write requires exclusive access: all earlier reads and writes must finish before it starts.
- handle unpin requests for a given PageId
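The pin/unpin and LFU behaviour described for the bufferpool could be sketched roughly as follows. This is a single-threaded toy under several assumptions: capacity is counted in frames rather than bytes, `load_page` is a hypothetical callback standing in for PageReader, dirty frames are simply skipped during eviction instead of being written back, and the read/write exclusivity rule is left out.

```python
from collections import Counter


class Bufferpool:
    """Toy page cache with pin counts and least-frequently-used eviction."""

    def __init__(self, capacity, load_page):
        self.capacity = capacity      # maximum number of cached frames (assumed unit)
        self.load_page = load_page    # hypothetical callback: page_id -> page
        self.pages = {}
        self.dirty_pages = set()
        self.pin_count = Counter()
        self.use_count = Counter()

    def pin(self, page_id, for_write=False):
        """Return the page, loading it if needed, and pin it automatically."""
        if page_id not in self.pages:
            self._evict_if_full()
            self.pages[page_id] = self.load_page(page_id)
        self.pin_count[page_id] += 1
        self.use_count[page_id] += 1
        if for_write:
            self.dirty_pages.add(page_id)
        return self.pages[page_id]

    def unpin(self, page_id):
        if self.pin_count[page_id] <= 0:
            raise ValueError(f"page {page_id!r} is not pinned")
        self.pin_count[page_id] -= 1

    def _evict_if_full(self):
        if len(self.pages) < self.capacity:
            return
        # Only unpinned, clean frames are candidates in this sketch.
        candidates = [p for p in self.pages
                      if self.pin_count[p] == 0 and p not in self.dirty_pages]
        if not candidates:
            raise RuntimeError("every frame is pinned or dirty")
        victim = min(candidates, key=lambda p: self.use_count[p])
        del self.pages[victim]
        del self.use_count[victim]
        del self.pin_count[victim]


pool = Bufferpool(capacity=2, load_page=lambda pid: f"page-{pid}")
pool.pin("a"); pool.pin("a"); pool.unpin("a"); pool.unpin("a")
pool.pin("b"); pool.unpin("b")
pool.pin("c")  # pool is full: "b" (1 use) is evicted before "a" (2 uses)
print(sorted(pool.pages))  # ['a', 'c']
```

A real pool would flush a dirty victim through PageWriter before dropping it, and would need a lock or queue per frame to enforce the one-writer rule.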
- PageReader
- reads a page given a PageId and database base path; used by bufferpool only
- PageWriter
- writes a page given a PageId and database base path; used by bufferpool management only
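One way the reader and writer could map a PageId plus a base path onto disk is sketched below. The layout is purely an assumption for illustration: one file per table, column, and base/tail kind, with pages at `page_seq * PAGE_SIZE`. The 4096-byte page size and the helper names `page_file`, `write_page`, and `read_page` are hypothetical, and a namedtuple stands in for the real PageId class.

```python
import os
import tempfile
from collections import namedtuple

PAGE_SIZE = 4096  # assumed page size in bytes
PageId = namedtuple("PageId", "table_name base_rid column_index is_tail page_seq")


def page_file(base_path, pid):
    """Path of the file holding this page (assumed layout, see above)."""
    kind = "tail" if pid.is_tail else "base"
    return os.path.join(base_path, pid.table_name, f"col{pid.column_index}.{kind}")


def write_page(base_path, pid, data):
    if len(data) != PAGE_SIZE:
        raise ValueError("a page must be exactly PAGE_SIZE bytes")
    path = page_file(base_path, pid)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # "r+b" keeps existing pages intact; "w+b" creates the file on first write.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(pid.page_seq * PAGE_SIZE)
        f.write(data)


def read_page(base_path, pid):
    with open(page_file(base_path, pid), "rb") as f:
        f.seek(pid.page_seq * PAGE_SIZE)
        return f.read(PAGE_SIZE)


with tempfile.TemporaryDirectory() as root:
    pid = PageId("grades", 0, 2, False, 1)
    write_page(root, pid, b"\x07" * PAGE_SIZE)
    restored = read_page(root, pid)
print(restored == b"\x07" * PAGE_SIZE)  # True
```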
- RecordReader
- should it be a context manager?
- can be configured to read all or select columns of a table
- functionality
- prefetch for linear scan
- prefetch for indexed scan
- allow for predicates
- unpin pages
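On the context-manager question: one argument in favour is that `__exit__` can guarantee every pinned page is unpinned, even if the scan raises. The sketch below illustrates that under simplifying assumptions: pages are modelled as lists of row tuples rather than columnar data, `FakePool` is a hypothetical stand-in for the bufferpool, and prefetching is omitted.

```python
class FakePool:
    """Hypothetical stand-in for the bufferpool; only tracks pin counts."""

    def __init__(self, pages):
        self.pages = pages
        self.pins = {}

    def pin(self, pid):
        self.pins[pid] = self.pins.get(pid, 0) + 1
        return self.pages[pid]

    def unpin(self, pid):
        self.pins[pid] -= 1


class RecordReader:
    def __init__(self, pool, page_ids, columns=None, predicate=None):
        self.pool = pool
        self.page_ids = page_ids
        self.columns = columns        # None means "all columns"
        self.predicate = predicate    # None means "every row matches"
        self._pinned = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release every page this reader pinned, whether or not the scan failed.
        for pid in self._pinned:
            self.pool.unpin(pid)
        self._pinned.clear()
        return False  # never swallow exceptions

    def rows(self):
        for pid in self.page_ids:
            page = self.pool.pin(pid)
            self._pinned.append(pid)
            for row in page:
                if self.predicate is None or self.predicate(row):
                    if self.columns is None:
                        yield row
                    else:
                        yield tuple(row[c] for c in self.columns)


pool = FakePool({0: [(1, 90, 5), (2, 40, 7)], 1: [(3, 75, 9)]})
with RecordReader(pool, [0, 1], columns=[0, 1],
                  predicate=lambda r: r[1] >= 50) as reader:
    result = list(reader.rows())
print(result)     # [(1, 90), (3, 75)]
print(pool.pins)  # {0: 0, 1: 0}
```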
- Query implementation
- Select (match search key via an index)
- Select (match search key via linear scan)
- Insert
- Update
- Sum
- Delete
- Rename Record to TableRow
- Write-Ahead Log
- writes the records changed by each transaction to disk
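As a rough illustration of the log, the sketch below appends one JSON line per changed record and a separate commit marker, forcing each batch to disk with `os.fsync`. The JSON-lines format, the commit marker, and the method names are all assumptions made for the example; the notes do not fix a log format.

```python
import json
import os
import tempfile


class WriteAheadLog:
    def __init__(self, path):
        self.path = path

    def _write(self, entries):
        with open(self.path, "a", encoding="utf-8") as f:
            for entry in entries:
                f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make sure the entries reach the disk

    def log_changes(self, txn_id, changes):
        """changes is a list of (rid, column_values) pairs."""
        self._write([{"txn": txn_id, "rid": rid, "columns": cols}
                     for rid, cols in changes])

    def commit(self, txn_id):
        self._write([{"txn": txn_id, "commit": True}])

    def committed_changes(self):
        """Return logged changes, skipping transactions that never committed."""
        with open(self.path, encoding="utf-8") as f:
            entries = [json.loads(line) for line in f if line.strip()]
        committed = {e["txn"] for e in entries if e.get("commit")}
        return [(e["txn"], e["rid"], e["columns"])
                for e in entries if "rid" in e and e["txn"] in committed]


with tempfile.TemporaryDirectory() as root:
    log = WriteAheadLog(os.path.join(root, "wal.jsonl"))
    log.log_changes(1, [(10, [5, 6]), (11, [7, 8])])
    log.commit(1)
    log.log_changes(2, [(12, [9, 9])])  # txn 2 never commits
    replayed = log.committed_changes()
print(replayed)  # [(1, 10, [5, 6]), (1, 11, [7, 8])]
```

Ignoring uncommitted entries on replay is what lets single operations be wrapped implicitly as transactions without special recovery cases.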
- Writer
- periodically saves dirty pages to disk
- Create a new QueryResult class that doesn’t have metadata columns
- Implicit wrapping of all single operations as transactions
- Page Merge Worker
- What do we need to store in addition to the data? Sequence number? How do we represent & track merged pages on disk?
- So base pages are all stored in a single file. When base pages and tail pages are merged, the merged pages are stored at a fixed offset determined by the merge sequence number. Alternatively, to avoid wasting space when a merge covers less than a full page range, we can store the file offset of the start of each merge.
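The two layout options can be compared with a little arithmetic. The sketch below assumes 4096-byte pages, 16 pages per full page range, and merge sequence numbers starting at 0; all three are illustrative values, and both function names are hypothetical.

```python
PAGE_SIZE = 4096       # assumed page size in bytes
PAGES_PER_RANGE = 16   # assumed number of pages in a full page range


def fixed_merge_offset(merge_seq):
    """Option 1: every merge gets a full-range slot at a computable offset."""
    return merge_seq * PAGES_PER_RANGE * PAGE_SIZE


def packed_merge_offsets(pages_per_merge):
    """Option 2: merges are packed back to back; return each merge's start offset."""
    offsets, position = [], 0
    for page_count in pages_per_merge:
        offsets.append(position)
        position += page_count * PAGE_SIZE
    return offsets


merges = [16, 3, 16]  # the second merge covers only 3 pages
fixed = [fixed_merge_offset(seq) for seq in range(len(merges))]
packed = packed_merge_offsets(merges)
print(fixed)   # [0, 65536, 131072]
print(packed)  # [0, 65536, 77824]
```

With fixed slots no offset table is needed, but the partial merge wastes 13 pages; the packed layout saves that space at the cost of persisting the offset list.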