Tags: concept database dbms dbmsarchitecture queryprocessor storagemanager Status: 🟩
DBMS Architecture
Summary
The architecture of a DBMS consists of key components that handle data storage, processing, and interaction. These include:
- Interfaces for user and application communication.
- Query Processor for query analysis, optimization, and execution.
- Storage Manager for data handling, consistency, and recovery.
- Database with stored data, indexes, and metadata.
Data moves through a storage hierarchy (files, buffers, cache) before reaching the CPU core for processing. While some databases can fit entirely in memory, growing data may eventually require disk-based storage management.
Details
The architecture of a DBMS can be split into four main components:
- Interfaces: These allow users or applications to communicate with the database using queries, tools, or commands.
- Query Processor: This part handles queries by analyzing, optimizing, and executing them to retrieve the requested data.
- Storage Manager: This manages how data is stored, retrieved, and updated in the database. It ensures data consistency, manages transactions, and handles backups and recovery.
- Database: This includes the actual stored data, indexes for fast searches, and a catalog containing metadata about the database. It can be hosted in the cloud or on a physical machine.
In addition, a security manager authenticates users before they are allowed to manipulate tables.
Query Processor
The query processor parses an SQL query into a plan that can be executed. It then optimizes that plan and finally executes the optimized plan using relational operators (join, group by, etc.).
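The parse, optimize, and execute steps can be observed in a real engine. The following is a minimal sketch using SQLite through Python's standard `sqlite3` module; the `users` table, the `idx_users_age` index, and the sample rows are made up for illustration. `EXPLAIN QUERY PLAN` asks SQLite to report the plan its optimizer chose instead of running the query.

```python
import sqlite3

# In-memory database with a hypothetical table and index (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_users_age ON users (age)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("Ada", 36), ("Linus", 28), ("Grace", 36)],
)

query = "SELECT name FROM users WHERE age = ? ORDER BY name"

# Parse + optimize: ask for the chosen plan without executing the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (36,)).fetchall()
plan_details = [row[3] for row in plan]  # 4th column holds the readable plan step
for detail in plan_details:
    print(detail)

# Execute: run the same query and collect the result rows.
names = [row[0] for row in conn.execute(query, (36,))]
print(names)
```

The exact wording of the plan steps differs between SQLite versions, so it is better to look for the index name in the output than to rely on a fixed string.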
Storage Manager
In a disk-based architecture, a DBMS stores a database as one or more files on disk, usually in a proprietary format that only that DBMS can read.
The storage manager is responsible for maintaining the files of a database.
It organizes the files as a collection of pages. It tracks what data has been read from and written to pages and how much free space is left within those pages.
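As a rough illustration of "files as a collection of pages", the sketch below treats one file as an array of fixed-size pages and tracks the free space in each. The class name `PagedFile`, the 4096-byte page size, and the first-fit placement are assumptions made for this example, not how any particular DBMS lays out its files.

```python
import io

PAGE_SIZE = 4096  # assumed page size for this sketch


class PagedFile:
    """Toy storage manager: one file treated as an array of fixed-size pages."""

    def __init__(self, file):
        self.file = file
        self.used = []  # used[i] = number of bytes occupied in page i

    def allocate_page(self):
        """Append one zero-filled page to the file and return its id."""
        page_id = len(self.used)
        self.file.seek(page_id * PAGE_SIZE)
        self.file.write(bytes(PAGE_SIZE))
        self.used.append(0)
        return page_id

    def free_space(self, page_id):
        return PAGE_SIZE - self.used[page_id]

    def append_record(self, record):
        """Store record in the first page with room; return (page_id, offset)."""
        if len(record) > PAGE_SIZE:
            raise ValueError("record larger than a page")
        for page_id in range(len(self.used)):
            if self.free_space(page_id) >= len(record):
                break
        else:
            page_id = self.allocate_page()  # no existing page has enough room
        offset = self.used[page_id]
        self.file.seek(page_id * PAGE_SIZE + offset)
        self.file.write(record)
        self.used[page_id] += len(record)
        return page_id, offset

    def read_page(self, page_id):
        self.file.seek(page_id * PAGE_SIZE)
        return self.file.read(PAGE_SIZE)


pf = PagedFile(io.BytesIO())            # BytesIO stands in for a file on disk
loc_a = pf.append_record(b"a" * 3000)   # goes to page 0
loc_b = pf.append_record(b"b" * 2000)   # does not fit in page 0, new page 1
loc_c = pf.append_record(b"c" * 1000)   # still fits in page 0
print(loc_a, loc_b, loc_c, pf.free_space(0))
```

Real systems store the free-space bookkeeping inside the pages or in dedicated directory pages so that it survives a restart; here it lives only in a Python list to keep the idea visible.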
Data Movement through Storage Hierarchy
From bottom to top:
- Files are made up of pages.
- The pages are loaded into a buffer.
- The data in the buffer manager is then loaded into cache.
- The core processes the data from the cache.
- Files and Pages: Files in a DBMS are typically divided into pages (fixed-size chunks of data) for storage and management efficiency.
- Buffer Manager: The buffer manager loads pages from disk (secondary storage) into main memory (RAM). This helps reduce the need to repeatedly access slower disk storage.
- Cache: When the core (CPU) needs to process data, it first accesses it from a cache, a smaller and faster memory layer closer to the CPU. The cache contains data frequently accessed or recently used.
- Core: The CPU core processes data directly from the cache whenever possible, as cache memory is faster than main memory.
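The buffer manager step can be sketched as a small page cache in front of the disk. This is a simplified example under stated assumptions: the class name `BufferPool` is hypothetical, a plain dictionary stands in for the database file, and least-recently-used (LRU) eviction is chosen only because it is easy to show; the note does not say which replacement policy a given DBMS uses.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager: keeps at most `capacity` pages in memory (LRU)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # stand-in for the database file: page_id -> bytes
        self.capacity = capacity
        self.frames = OrderedDict()   # page_id -> page bytes, least recently used first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            # Hit: serve from memory and mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Pool is full: evict the least recently used page.
            self.frames.popitem(last=False)
        # Miss: fetch the page from (simulated) disk.
        self.disk_reads += 1
        page = self.disk[page_id]
        self.frames[page_id] = page
        return page


disk = {i: f"page-{i}".encode() for i in range(4)}
pool = BufferPool(disk, capacity=2)
for page_id in [0, 1, 0, 2, 0, 1]:
    pool.get_page(page_id)
print(pool.disk_reads, list(pool.frames))
```

Six page requests cause only four disk reads here because page 0 stays in the pool. A real buffer manager also has to track dirty pages and write them back before eviction, which this sketch leaves out.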
Fit into RAM?
Sometimes all of the data fits into RAM. There are even database servers with terabytes of main memory.
In that case the DBMS never has to wait for disk reads while executing tasks, although it still has to wait for disk writes whenever data must be made persistent.
However, there comes a day when the data no longer fits in memory. For this, read more about scaling out and scaling up.