1. Why Databases Matter in Data Analysis

Databases are essential tools for data analysts. Much real-world data is stored in databases because they provide an efficient way to store, organize, manage, and access large volumes of data.

Using databases allows analysts to:

  • Retrieve information quickly
  • Analyze data efficiently
  • Support data-driven decisions
  • Solve complex business problems

2. What Is a Database?

A database is an organized collection of data stored electronically.
Data in a database is typically arranged in tables, which store related information in a structured format.

Each table contains:

  • Rows (records)
  • Columns (fields)
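
As a minimal sketch of rows and columns, the snippet below builds one small table with Python's built-in sqlite3 module. The table name, column names, and sample values are made up for illustration and are not part of any real dataset.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical dealership table: each column is a field.
conn.execute("CREATE TABLE dealership (branch_id INTEGER, city TEXT, manager TEXT)")

# Each inserted tuple becomes one row (record).
conn.executemany(
    "INSERT INTO dealership VALUES (?, ?, ?)",
    [(1, "Austin", "Rivera"), (2, "Denver", "Chen")],
)

cursor = conn.execute("SELECT * FROM dealership ORDER BY branch_id")
columns = [description[0] for description in cursor.description]  # field names
rows = cursor.fetchall()  # records

print(columns)  # ['branch_id', 'city', 'manager']
print(rows)     # [(1, 'Austin', 'Rivera'), (2, 'Denver', 'Chen')]
```

Each tuple in rows is one record, and each position in the tuple corresponds to one of the fields listed in columns.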

3. Relational Databases

A relational database is a type of database that contains multiple related tables. These tables can be connected using shared fields.

Key idea

  • Tables are linked through relationships
  • Relationships exist when tables share one or more common fields

Example structure

A car manufacturer database might include:

  • A dealership table
  • A product details table
  • A repair parts table

Each table stores different information but can be connected through shared identifiers.


4. Keys in Relational Databases

Keys are fields that uniquely identify the records in a table and connect tables to one another.

There are two main types of keys:

  • Primary keys
  • Foreign keys

5. Primary Keys

A primary key is a field that uniquely identifies each record (row) in a table.

Characteristics of a primary key

  • Each value must be unique
  • Cannot contain null or blank values
  • Only one primary key is allowed per table, although it may be made up of more than one column (a composite key)

Examples

  • Branch_ID uniquely identifies each dealership branch
  • VIN uniquely identifies each car in the product details table
  • Part_ID uniquely identifies each repair part

Primary keys ensure data integrity within a table.
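
The characteristics above can be checked with a short sketch using Python's built-in sqlite3 module. The product_details table and its sample values are hypothetical. One SQLite-specific assumption is labeled in the code: NOT NULL is written out explicitly because SQLite, unlike most database systems, does not reject nulls in a non-integer primary key on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical product details table with VIN as the primary key.
# NOT NULL is stated explicitly because SQLite would otherwise
# accept NULL in a TEXT primary key column.
conn.execute(
    "CREATE TABLE product_details (vin TEXT PRIMARY KEY NOT NULL, model TEXT)"
)
conn.execute("INSERT INTO product_details VALUES ('VIN001', 'Sedan')")

# A second row with the same VIN violates uniqueness.
try:
    conn.execute("INSERT INTO product_details VALUES ('VIN001', 'Coupe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A row with no VIN violates the no-null rule.
try:
    conn.execute("INSERT INTO product_details VALUES (NULL, 'Truck')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product_details").fetchone()[0]
print(duplicate_rejected, null_rejected, row_count)  # True True 1
```

Only the first insert succeeds, so the table still holds exactly one record for VIN001.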


6. Foreign Keys

A foreign key is a field in one table that refers to the primary key in another table.

Purpose of a foreign key

  • Creates a relationship between tables
  • Allows data to be connected across tables

Example

  • The repair parts table includes:
    • VIN as a foreign key referencing the product details table
    • Branch_ID as a foreign key referencing the dealership table

Key rules

  • A table can have multiple foreign keys
  • Foreign key values do not need to be unique, so the same value may appear in many rows

Foreign keys enable relational databases to link related information efficiently.
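
The repair parts example can be sketched as follows with Python's built-in sqlite3 module. Column names other than VIN, Branch_ID, and Part_ID, along with all sample values, are assumptions made for illustration. The PRAGMA line is specific to SQLite, which does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off by default.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE dealership (branch_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    "CREATE TABLE product_details (vin TEXT PRIMARY KEY NOT NULL, model TEXT)"
)
# The repair parts table carries two foreign keys.
conn.execute(
    """
    CREATE TABLE repair_parts (
        part_id   INTEGER PRIMARY KEY,
        part_name TEXT,
        vin       TEXT    REFERENCES product_details (vin),
        branch_id INTEGER REFERENCES dealership (branch_id)
    )
    """
)

conn.execute("INSERT INTO dealership VALUES (1, 'Austin')")
conn.execute("INSERT INTO product_details VALUES ('VIN001', 'Sedan')")

# Two parts for the same car at the same branch:
# the foreign key values repeat, which is allowed.
conn.executemany(
    "INSERT INTO repair_parts VALUES (?, ?, ?, ?)",
    [(100, "Brake pad", "VIN001", 1), (101, "Oil filter", "VIN001", 1)],
)

# A part pointing at a VIN that does not exist is rejected.
try:
    conn.execute("INSERT INTO repair_parts VALUES (102, 'Wiper blade', 'VIN999', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM repair_parts").fetchone()[0]
print(orphan_rejected, part_count)  # True 2
```

The duplicate VIN001 and branch 1 values are accepted, while the row that references an unknown car is refused, which is how a foreign key keeps links between tables valid.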


7. Primary Key vs. Foreign Key Summary

Primary Key

  • Uniquely identifies a record within a table
  • Must be unique
  • Cannot be null
  • Only one per table

Foreign Key

  • Connects one table to another
  • Refers to a primary key in a different table
  • Multiple foreign keys allowed per table

Understanding the difference between these keys is critical for working with relational databases.


8. Why Keys Are Important

Keys make it possible to:

  • Maintain data accuracy and consistency
  • Avoid duplicate records
  • Join tables for analysis
  • Organize complex datasets logically

Without keys, there would be no reliable way to tell records apart or to match rows across tables.
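
Joining tables for analysis might look like the sketch below, again using Python's built-in sqlite3 module. The tables are trimmed-down, hypothetical versions of the dealership and repair parts tables, and the question being answered (how many parts each city's branch holds) is only an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dealership (branch_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    """
    CREATE TABLE repair_parts (
        part_id   INTEGER PRIMARY KEY,
        part_name TEXT,
        branch_id INTEGER REFERENCES dealership (branch_id)
    )
    """
)

conn.executemany(
    "INSERT INTO dealership VALUES (?, ?)",
    [(1, "Austin"), (2, "Denver")],
)
conn.executemany(
    "INSERT INTO repair_parts VALUES (?, ?, ?)",
    [(100, "Brake pad", 1), (101, "Oil filter", 1), (102, "Wiper blade", 2)],
)

# Match each part to its branch through the shared branch_id,
# then count parts per city.
parts_per_city = conn.execute(
    """
    SELECT d.city, COUNT(*) AS part_count
    FROM repair_parts AS r
    JOIN dealership AS d ON d.branch_id = r.branch_id
    GROUP BY d.city
    ORDER BY d.city
    """
).fetchall()

print(parts_per_city)  # [('Austin', 2), ('Denver', 1)]
```

The city name lives only in the dealership table, yet it appears next to the part counts because the foreign key in repair_parts matches the primary key in dealership.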


9. Key Takeaways

  • Databases store and organize large amounts of data
  • Relational databases consist of related tables
  • Tables are connected using shared fields
  • Primary keys uniquely identify records
  • Foreign keys link tables together
  • Each table has one primary key
  • A table may have multiple foreign keys

One-sentence summary

Relational databases organize data into connected tables using primary and foreign keys, enabling efficient storage, retrieval, and analysis of complex datasets.