DATABASE - 5
Distributed Database Management Systems
| Question | Answer |
|---|---|
| is a collection of multiple, logically interrelated databases spread across physically interconnected locations (nodes), appearing as a single, unified system to the user. | distributed database |
| architecture consists of multiple interconnected nodes, where each node has its own local database and DBMS, but all nodes work together as a single system. | Distributed Database Management System (DDBMS) |
| Basic Structure of DDBMS Architecture | Client → Network → Multiple Database Nodes |
| Components of DDBMS | Query Processor, Transaction Manager, Data Manager, Communication Manager |
| Handles user queries and decides where (which node) the data should be retrieved from | Query Processor |
| Manages transactions across multiple nodes and ensures consistency during execution | Transaction Manager |
| Responsible for storing, retrieving, and updating data within each node | Data Manager |
| Handles communication between nodes to exchange data and coordinate operations. | Communication Manager |
| Importance of Distributed Database Systems | Scalability, High Availability, Fault Tolerance |
| Ability to add more servers (nodes) to handle growth | Scalability |
| System remains accessible even if some nodes fail; users can still access data at any time | High Availability |
| System can continue operating despite failures | Fault Tolerance |
| Types of Distributed Database Systems | Homogeneous Database, Heterogeneous Database, Client-Server Distributed Database System, Peer-to-Peer Distributed Database System, Cloud-Based Distributed Database System |
| All sites use the same DBMS, data model, and structure, making communication and data sharing easier. The data may be distributed, but the system is uniform across all locations. | Homogeneous Database |
| Different sites use different DBMSs, schemas, or data models, which makes integration and query processing more complex. Special middleware or translators are needed for communication between systems. | Heterogeneous Database |
| The server manages and stores the database, while clients send requests and receive results over a network. It provides centralized control with distributed access. | Client-Server Distributed Database System |
| All nodes have equal roles, and each can store data and process requests without a central server. This structure improves fault tolerance and decentralization. | Peer-to-Peer Distributed Database System |
| These databases are hosted on cloud platforms and distributed across multiple regions for scalability and availability. They are often provided as managed services, reducing infrastructure management. | Cloud-Based Distributed Database System |
| A relation is divided into smaller parts called fragments, and each fragment is stored at different sites where it is needed. The fragments must be designed so that the original relation can be reconstructed without losing any data. | Data Fragmentation |
| Divides a table into groups of rows (tuples). Each fragment contains a subset of records. Commonly based on: Location, Conditions (e.g., age > 18). | Horizontal Fragmentation |
| Divides a table into groups of columns (attributes). Each fragment contains selected attributes plus the primary key, so the original rows can be reconstructed by joining the fragments. | Vertical Fragmentation |
| ensure that data is available across multiple locations while remaining correct and synchronized. | Replication and Data Consistency |
| is the process of creating and maintaining copies of data across multiple nodes in a distributed system to improve availability, fault tolerance, and performance. | Replication |
| stores a copy of all data on every node, ensuring high availability and reliability but increasing storage and update overhead. | Full Replication |
| stores only selected data on certain nodes, reducing storage costs while requiring careful planning to maintain data availability. | Partial Replication |
| ensures that all copies of data in a distributed database have the same value at a given time. It is difficult to maintain due to network delays, system failures, and simultaneous updates. | Data Consistency |
| ensures that every read returns the most recently written value, no matter which replica serves it. This guarantees accurate results but may reduce system performance due to synchronization delays. | Strong Consistency |
| allows temporary differences between replicas but ensures they become consistent over time. It improves performance and availability but may return outdated data during synchronization. | Eventual Consistency |
| is a transaction that involves multiple databases or systems, ensuring that all operations either succeed or fail together. It is essential for maintaining data integrity across different locations. | Distributed Transaction |
| Need for Distributed Transactions | Atomicity, Consistency, Isolation & Durability (ACID). |
| is a protocol that ensures all participating systems agree before committing a transaction. It consists of a prepare phase and a commit/rollback phase to guarantee atomic execution. | Two-Phase Commit (2PC) |
| Steps of Two-Phase Commit (2PC) | Step 1: Start Transaction (ST); Step 2: Execute Operations (EO); Step 3: Prepare Phase, Voting (PP); Step 4: Commit Decision (CD); Step 5: Final Execution (FE) |
| Three-Phase Commit (3PC) | 3PC improves on 2PC by adding a pre-commit phase between voting and the final commit, reducing the chance that participants block when the coordinator fails. Separating the decision into more steps gives better fault tolerance. |
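
The five 2PC steps in the table above can be sketched as a small in-memory simulation. This is only an illustration: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names invented for this example, and a real system would add durable logging, timeouts, and network messaging.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


@dataclass
class Participant:
    """A hypothetical database node taking part in one transaction."""
    name: str
    can_commit: bool = True            # assumption: simulates local success or failure
    state: str = "idle"
    log: list = field(default_factory=list)

    def execute(self, operation: str) -> None:
        # Step 2: do the work tentatively, without making it permanent.
        self.log.append(operation)
        self.state = "executed"

    def prepare(self) -> Vote:
        # Step 3: vote YES only if this node can guarantee a later commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return Vote.YES if self.can_commit else Vote.NO

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.log.clear()               # discard the tentative work
        self.state = "aborted"


def two_phase_commit(participants: list, operation: str) -> str:
    """Run one transaction as coordinator; return 'commit' or 'rollback'."""
    # Steps 1-2: start the transaction and execute the operation on every node.
    for p in participants:
        p.execute(operation)

    # Step 3: prepare phase, collect one vote per node.
    votes = [p.prepare() for p in participants]

    # Step 4: commit only if every node voted YES.
    decision = "commit" if all(v is Vote.YES for v in votes) else "rollback"

    # Step 5: apply the same decision on all nodes.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision
```

Calling `two_phase_commit([Participant("A"), Participant("B", can_commit=False)], "debit 100")` rolls back on both nodes, because a single NO vote is enough to abort the whole transaction. That all-or-nothing outcome is the atomicity property the protocol is meant to provide.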