Distributed Database Systems | Database Management System

A Distributed Database is stored on several computers that:

  1. Are geographically dispersed
  2. Vary in size and function (from workstations to mainframes)
  3. Are separately administered
  4. Are interconnected by a WAN or LAN, possibly using different communication media

Node or Site

A computer forming part of a Distributed Database System is called a Node or a Site.

Characteristics of a Distributed DBMS

A Distributed DBMS is characterized by:

(a) Sharing of Data: Users at one site can access data residing at other sites. For example, in a banking database, funds can be transferred from an account maintained at one branch to an account maintained at another branch.
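
The banking example can be sketched in a few lines of Python. This is only an illustration: the Site class, the transfer function, the branch names and the account numbers are all hypothetical, and each site is modelled as an in-memory dictionary rather than a real database. A real distributed DBMS would rely on an atomic commit protocol instead of the simple undo step shown here.

```python
class Site:
    """A hypothetical site (node) holding the accounts of one bank branch."""

    def __init__(self, name, accounts):
        self.name = name
        self.accounts = dict(accounts)  # account id -> balance

    def debit(self, account, amount):
        if self.accounts[account] < amount:
            raise ValueError(f"insufficient funds in {account} at {self.name}")
        self.accounts[account] -= amount

    def credit(self, account, amount):
        # Raises KeyError if the account does not exist at this site.
        self.accounts[account] += amount


def transfer(src_site, src_account, dst_site, dst_account, amount):
    """Move funds between accounts held at two different sites.

    If the credit fails after the debit succeeded, the debit is undone
    so that no money is lost.
    """
    src_site.debit(src_account, amount)
    try:
        dst_site.credit(dst_account, amount)
    except Exception:
        src_site.credit(src_account, amount)  # compensate the debit
        raise


# Two branches, each storing its own accounts locally.
delhi = Site("Delhi", {"A-101": 500})
mumbai = Site("Mumbai", {"B-202": 100})

# A user moves 200 from the Delhi account to the Mumbai account.
transfer(delhi, "A-101", mumbai, "B-202", 200)
```

The point of the sketch is that a single operation touches data stored at two separately managed sites.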

(b) Autonomy: A Global Database Administrator is responsible for the entire system. In addition, the Local Database Administrator of each site retains a degree of control over that site's database.

(c) Improved System Availability: In a distributed system, data items can be replicated at multiple sites, so that the failure of some sites does not result in failure of the whole system. A global transaction needing certain data may find it at one of several sites despite the failure of others. When a failed site recovers, its database is brought up to date before the site is reintegrated into the system.
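
A minimal sketch of this behaviour follows, assuming every site holds a full copy of the data. The Replica class and the functions write_all_available, read_any and recover are hypothetical names introduced for illustration. Recovery here simply copies the whole state from a live peer, which is an assumption; a real system would more likely replay the updates the site missed.

```python
class Replica:
    """A hypothetical site holding a copy of the data (key -> value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


def write_all_available(replicas, key, value):
    """Apply an update at every site that is currently up."""
    for r in replicas:
        if r.up:
            r.data[key] = value


def read_any(replicas, key):
    """Return the value from the first available site holding the key."""
    for r in replicas:
        if r.up and key in r.data:
            return r.data[key]
    raise RuntimeError("no available site holds " + repr(key))


def recover(site, replicas):
    """Bring a failed site up to date from a live peer, then reintegrate it."""
    peer = next(r for r in replicas if r.up and r is not site)
    site.data = dict(peer.data)  # catch up before rejoining
    site.up = True


sites = [Replica("S1"), Replica("S2"), Replica("S3")]
write_all_available(sites, "x", 1)

sites[0].up = False                          # S1 fails
write_all_available(sites, "x", 2)           # the system keeps running
value_during_failure = read_any(sites, "x")  # served by S2

recover(sites[0], sites)                     # S1 is updated, then rejoins
```

While S1 is down, reads and writes are still served by S2 and S3. S1 missed the second update, so it must be refreshed before it can answer queries again.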

Advantages of Data Replication

(a) Higher Availability of System: If one site containing a relation r fails, r may be found replicated at other sites; thus the system continues to process queries on r despite the failure of some sites.

(b) Increased Parallelism: Read operations on a relation can be processed locally, and in parallel, at the sites where it is replicated. Such reads avoid data movement across sites.
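
The idea can be illustrated with a small simulation in which threads stand in for sites. The relation contents, the site names S1 to S3 and the local_read function are assumptions made for this sketch; in a real system each query would run on a separate machine against its own stored copy.

```python
from concurrent.futures import ThreadPoolExecutor

# Relation r(account, balance), fully replicated at three hypothetical sites.
r = [("A-101", 500), ("B-202", 100), ("C-303", 750)]
replicas = {"S1": list(r), "S2": list(r), "S3": list(r)}


def local_read(site, min_balance):
    """Evaluate a selection on the copy of r stored at `site`.

    Only the local copy is consulted, so no data moves between sites.
    """
    return site, [row for row in replicas[site] if row[1] >= min_balance]


# Three read queries, each routed to a different site.
queries = [("S1", 200), ("S2", 600), ("S3", 0)]

# The queries run concurrently; pool.map returns results in query order.
with ThreadPoolExecutor(max_workers=len(queries)) as pool:
    results = list(pool.map(lambda q: local_read(*q), queries))
```

Because each query is answered entirely from one site's copy, adding replicas adds capacity for reads.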

Disadvantages of Data Replication

Increased Overhead on Update: The system has to ensure that all copies of replicated data are updated to keep the distributed database consistent. This results in increased movement of data across sites and increased processing overhead at the sites holding a replica of the relation being updated.

Hence, in general, replication enhances the performance of Read operations but incurs greater update overheads.
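
The trade-off can be made concrete with a rough cost model. The sketch below assumes a read-one, write-all policy and queries that arrive uniformly at all sites; both are assumptions for illustration, and the function remote_accesses is hypothetical. It counts only accesses that cross the network and ignores everything else (locking, message size, failures).

```python
def remote_accesses(n_sites, n_replicas, reads, writes):
    """Estimate remote accesses for a relation with n_replicas copies.

    Assumed policy: a read uses the local copy if the issuing site has one,
    otherwise one remote copy; a write must reach every copy.
    """
    p_local = n_replicas / n_sites      # chance the issuing site holds a copy
    read_cost = reads * (1 - p_local)   # only non-local reads cross the network
    # A write reaches every copy; one of them is local with probability p_local.
    write_cost = writes * (n_replicas - p_local)
    return read_cost + write_cost


# Read-heavy and write-heavy workloads on a 5-site system.
read_heavy = [remote_accesses(5, n, reads=100, writes=20) for n in (1, 3, 5)]
write_heavy = [remote_accesses(5, n, reads=100, writes=100) for n in (1, 3, 5)]
```

Under these assumptions, with 100 reads and 20 writes the estimate falls from about 96 remote accesses with one copy to about 80 with five copies. With 100 reads and 100 writes it rises from about 160 to about 400, so heavy update traffic quickly outweighs the gain on reads.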