Study Notes on Database Design

By BYJU'S Exam Prep

Updated on: September 25th, 2023

Welcome to our comprehensive study notes on database design! This resource is designed to provide you with a solid foundation in the principles and practices of designing efficient and robust database systems. Whether you’re a student studying computer science, an aspiring database administrator, or a professional looking to enhance your skills, these study notes will guide you through the essential concepts and techniques of database design.

Database design is a critical aspect of information management, as it involves structuring and organizing data in a way that facilitates efficient storage, retrieval, and manipulation. These study notes will cover key topics such as entity-relationship modelling, normalization techniques, database schema design, and data integrity constraints. By understanding these fundamental principles, you will be equipped with the knowledge and skills necessary to design databases that meet the specific requirements of various applications and industries. So, let’s dive in and explore the fascinating world of database design.

Download Formulas for GATE Computer Science Engineering – Programming & Data Structures

Types of keys

There are basically four types of keys:

• Candidate Key: Minimal Set of attributes that can be used to identify data records uniquely.
• Super key: Set of attributes that can be used to identify data records uniquely (may or may not be minimal).
• Foreign Key: Foreign key is basically a set of attributes referencing the primary key of the same relation or some other relation. It allows null values.
• Alternative key: Candidate key other than Primary key. it also allows null values.

Schema Refinement/Normalization

Decomposition of complex records into simple records. Normalization reduces redundancy using the non-loss decomposition principle.

Decomposition

Splitting a relation R into two or more sub relations R1 and R2 such that redundancies can be removed. A fully normalized relation must have all the functional dependencies such that the determiner is a primary key or a super key.

Decomposition should satisfy: (i) Lossless join, and (ii) Dependency preserve

Lossless Join Decomposition

Join between the sub relations should not create any additional tuples( Spurious tuples) or there should not be a case such that more tuples in R1 than R2.

R R1  R2 (Lossy)

R ≡ R1  R2 (Lossless)

Dependency Preservence: Because of decomposition, there must not be loss of any single dependency i.e., if F1 and F2 be the decomposed FD set for R1 and R2, then F= F1 U F2.

Functional Dependency (FD): Dependency between the attribute is known as functional dependency. Let R be the relational schema and X, Y be the non-empty sets of attributes and t1, t2, …,tn are the tuples of relation R. X Y {values for X functionally determine values for Y}

Trivial Functional Dependency: If X Y, then X Y will be trivial FD.

Here, X and Y are set of attributes of a relation R.

In trivial FD, there must be a common attribute at both the sides of ‘’ arrow.

Non-Trivial Functional Dependency: If X Y = φ (no common attributes) and X Y satisfies FD, then it will be a non-trivial FD.

(no common attribute at either side of the ‘’ arrow)

Case of semi-trivial FD

Sid Sid Sname (semi-trivial)

Because on decomposition, we will get

Sid Sid (trivial FD) and

Sid Sname (non-trivial FD)

Properties of Functional Dependence (FD)

• Reflexivity If X Y, then X Y (trivial)
• Transitivity If X Y and Y Z, then X Z
• Augmentation If X Y, then XZ YZ
• Splitting or Decomposition If X YZ, then X Y and X Z
• Union If X Y and X Z, then X YZ

Attribute Closure: Suppose R(X, Y, Z) be a relation having a set of attributes i.e., (X, Y, Z), then (X+) will be an attribute closure which functionally determines other attributes of the relation (if not all then at least itself).

Normal Forms/Normalization: In relational database design, normalization is the process of organizing data to minimize redundancy. Normalization usually involves dividing a database into two or more tables and defining the relationship between the tables. The normal forms define the status of the relation between the individuated attributes. There are five types of normal forms

First Normal Form (1NF): Relation should not contain any multivalued attributes or relation should contain atomic attribute. The main disadvantage of 1NF is high redundancy.

Second Normal Form (2NF): Relation R is in 2NF if and only if R should be in 1NF, and R should not contain any partial dependency.

Partial Dependency: Let R be the relational schema having X, Y, and A, which are non-empty sets of attributes, where X = Any candidate key of the relation, Y = Proper subset of any candidate key, and A = Non-prime attribute (i.e., A doesn’t belong to any candidate key)

In the above example, X A already exists and if Y A will exist, then it will become a partial dependency, if and only if

• Y is a proper subset of the candidate key.
• A should be a non-prime attribute.

If any of the above two conditions fail, then Y A will also become a fully functional dependency.

Full Functional Dependency: A functional dependency P Q is said to be a fully functional dependency if the removal of any attribute S from P means that the dependency doesn’t hold any more.

Suppose, the above functional dependency is a fully functional dependency, then we must ensure that there are no FDs as below.

Third Normal Form (3NF): Let R be a relational schema, then any non-trivial FD X Y over R is in 3NF, if X should be a candidate key or super key or Y should be a prime attribute.

• Either both of the above conditions should be true or one of them should be true.
• R should not contain any transitive dependency.
• For a relation schema R to be a 3NF, it is necessary to be in 2NF.

Transitive Dependency: A FD, P Q in a relation schema R is transitive if

• There is a set of attributes Z that is not a subset of any key of R.
• Both X Z and Z Y hold

• The above relation is in 2NF.
• In relation to R1, C is not a candidate key and D is a non-prime attribute. Due to this, R1 fails to satisfy the 3NF condition. Transitive dependency is present here.

AB C and C D, then AB D will be transitive.

Boyce Codd Normal Form (BCNF): Let R be the relation schema and X Y be any non-trivial FD over R in BCNF if and only if X is the candidate key or super key.

If R satisfies this dependency, then of course it satisfies 2NF and 3NF.

Summary of 1 NF, 2 NF and 3 NF:

Fourth Normal Form (4NF): 4NF is mainly concerned with multivalued dependency A relation is in 4NF if and only if for every one of its non-trivial multivalued dependencies X →→Y, X is a super key (i.e., X is either a candidate key or a superset) and all the multivalued dependencies are in BCNF.

Fifth Normal Form (5NF): It is also ‘known as Project Join Normal From (PJ/NF). 5NF reduces redundancy in relational database recording multivalued facts by isolating semantically related multiple relationships. A table or relation is said to be in the 5NF, if and only if every join dependency in it is implied by the candidate keys.

You can follow the detailed champion study plan for GATE CS 2021 from the following link:

Detailed GATE CSE 2021 Champion Study Plan

Candidates can also practice 110+ Mock tests for exams like GATE, and NIELIT with Test Series check the following link: