RookieDB

A Java-based SQL database system

Overview

RookieDB is a bare-bones database implementation supporting the execution of simple transactions in series. Throughout this project, I added support for B+ tree indices, efficient join algorithms, query optimization, multigranularity locking to allow concurrent execution of transactions, and database recovery.

Project Details

1. B+ Tree Indices

In the first part, I implemented B+ tree indices, which are crucial for efficient data retrieval. The key components include:

  • DataBox: Represents data types (Boolean, Int, Float, Long, String) in RookieDB.
  • RecordId: Identifies a record by its page and entry number.
  • BPlusTree, BPlusNode, LeafNode, InnerNode: Classes to manage the structure and operations of the B+ tree.

2. Efficient Join Algorithms

Implemented join algorithms to optimize query execution:

  • Block Nested Loop Join (BNLJ): Efficiently joins large tables by using blocks of data pages.
  • Grace Hash Join (GHJ): Uses hash partitioning to manage memory efficiently during join operations.
  • Sort Merge Join (SMJ): Sorts and merges tables to perform joins.

3. Query Optimization

Developed query optimization techniques to improve the efficiency of database operations:

  • Single Table Access Selection: Chooses the optimal way to access single tables.
  • Join Selection: Selects the best join order and method.
  • Cost-Based Optimization: Utilizes cost estimates to determine the most efficient query execution plan.

4. Multigranularity Locking

Implemented a locking mechanism to ensure transaction isolation and consistency:

  • Lock Types: S (shared), X (exclusive), IS (intent shared), IX (intent exclusive), SIX (shared with intent exclusive).
  • Lock Manager: Manages locks and enforces multigranularity constraints.
  • LockContext: Represents resources in the hierarchy and handles locking at different levels.

5. Database Recovery

Ensured the database can recover from crashes while maintaining ACID properties:

  • Write-Ahead Logging (WAL): Logs changes before applying them to the database.
  • Checkpointing: Periodically saves the state of the database to reduce recovery time.
  • ARIES Recovery Algorithm: Utilizes analysis, redo, and undo phases to restore the database to a consistent state.

6. NoSQL Features

Explored NoSQL features by working with a document-based database:

  • Handling Documents: Managed collections of documents with fields that are not primitive data types.
  • Aggregation Framework: Used MongoDB’s aggregation framework to perform complex data queries and transformations.
  • Query Examples: Demonstrated complex queries involving match, group, sort, limit, lookup, and project stages.

Conclusion

RookieDB provided a comprehensive experience in building a fully functional database system. From implementing fundamental data structures like B+ trees to ensuring robust transaction management with locking and recovery mechanisms, this project encompassed a wide range of essential database concepts and techniques.

Acknowledgments