SQLAdvanced SQL — Database Design, Relationships, & JOINs
In ba-5, you learned basic SQL (SELECT, INSERT, UPDATE, DELETE). Now it's time to go deeper: designing database schemas, understanding relationships between tables, writing JOIN queries, and learning about indexes for performance.
You can copy these SQL statements and run them in any database tool (PostgreSQL, MySQL, SQLite) to practice.
Creating Tables (The Blueprint)
Before you can store data, you must define the table's structure. Copy this code into your database to create a users table:
-- Create a users table
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT NOT NULL UNIQUE,
email TEXT NOT NULL UNIQUE,
age INTEGER,
is_verified BOOLEAN NOT NULL DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert some sample users
INSERT INTO users (username, email, age) VALUES
('alice', 'alice@example.com', 25),
('bob', 'bob@example.com', 32),
('charlie', 'charlie@example.com', 28);
-- See what you created
SELECT * FROM users;
When you run this code, you should see:
id | username | email | age | is_verified | created_at
1 | alice | alice@example.com | 25 | 0 | 2024-...
2 | bob | bob@example.com | 32 | 0 | 2024-...
3 | charlie | charlie@example.com | 28 | 0 | 2024-...
Key constraints explained:
- PRIMARY KEY: Uniquely identifies each row. No two rows can have the same primary key value.
- NOT NULL: The database will refuse to insert a row if this column is missing.
- UNIQUE: No two rows can have the same value in this column. Perfect for email addresses and usernames.
- DEFAULT: If no value is provided, use this default.
DEFAULT CURRENT_TIMESTAMP automatically timestamps when a row was created.
Primary Keys & Foreign Keys (Relationships)
A Primary Key uniquely identifies each row. A Foreign Key is a column that stores the Primary Key value of a row in another table — this is how relationships between tables are formed.
-- The parent table
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT NOT NULL
);
-- The child table — each post BELONGS TO a user
CREATE TABLE posts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
body TEXT,
user_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
-- Insert a user first, then their posts
INSERT INTO users (username) VALUES ('ama_asare'); -- Gets id=1
INSERT INTO posts (title, body, user_id)
VALUES ('My First Post', 'Hello world!', 1); -- Linked to user 1
💡ON DELETE CASCADE: When you delete a user, what should happen to all their posts? CASCADE means "delete them too." The alternative is SET NULL (set the user_id to null) or RESTRICT (refuse to delete the user if they have any posts).
The Three Relationship Types
Every database relationship falls into one of three patterns:
1️⃣One-to-Many (1:N)One user has many posts. One order has many items. The Foreign Key lives on the "many" side (the posts table, not the users table).
🔗Many-to-Many (M:N)A student has many classes AND a class has many students. Requires a third "join table" with two Foreign Keys pointing at both sides.
🔒One-to-One (1:1)One user has exactly one settings record. Implemented by adding a UNIQUE constraint to the Foreign Key column.
-- Many-to-Many: Students and Classes
-- The "enrollment" join table links them
CREATE TABLE students (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
CREATE TABLE classes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT
);
-- The join table
CREATE TABLE enrollments (
student_id INTEGER REFERENCES students(id) ON DELETE CASCADE,
class_id INTEGER REFERENCES classes(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (student_id, class_id) -- Prevents duplicate enrollments
);
JOINs — Fetching Related Data
When data is split across tables (as it should be), you need JOINs to reassemble it. A JOIN temporarily stitches tables together in memory based on matching key values.
-- INNER JOIN: Return only rows where the condition matches in BOTH tables
SELECT
users.username,
posts.title,
posts.created_at
FROM posts
INNER JOIN users ON posts.user_id = users.id
ORDER BY posts.created_at DESC;
-- LEFT JOIN: Return all posts, even if the user doesn't exist
-- (user columns will be NULL for orphaned posts)
SELECT
posts.title,
users.username
FROM posts
LEFT JOIN users ON posts.user_id = users.id;
-- Joining three tables at once: posts + users + comments
SELECT
users.username,
posts.title,
COUNT(comments.id) AS comment_count
FROM posts
JOIN users ON posts.user_id = users.id
LEFT JOIN comments ON comments.post_id = posts.id
GROUP BY users.username, posts.title
ORDER BY comment_count DESC;
Indexes — Making Queries Fast
By default, a query like SELECT * FROM users WHERE email = 'alice@example.com' must scan every single row until it finds a match. With 1 million users, that's 1 million comparisons. This is called a sequential scan and it's extremely slow.
An Index is a separate data structure (usually a B-Tree) that the database maintains alongside your table. It keeps a sorted copy of the indexed column's values with direct pointers to the actual rows. A lookup on an indexed column takes milliseconds, not seconds.
-- Create an index on email (because we query by email constantly)
CREATE INDEX idx_users_email ON users(email);
-- Index on a frequently-filtered column
CREATE INDEX idx_posts_user_id ON posts(user_id);
-- Compound index for multi-column queries
CREATE INDEX idx_posts_user_date ON posts(user_id, created_at DESC);
-- Check if a query is using your index (look for 'Index Scan' vs 'Seq Scan')
EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = 'alice@example.com';
🚨 SQL Injection — The Most Critical Security Issue!NEVER concatenate user input directly into SQL strings. If a user types '; DROP TABLE users; -- into a search box and you run "SELECT * FROM users WHERE name = '" + input + "'", your entire users table gets deleted. Always use parameterized queries or an ORM!
// ❌ DANGEROUS — SQL Injection vulnerable
const query = "SELECT * FROM users WHERE email = '" + req.body.email + "'";
// ✅ SAFE — Parameterized query (the $1 is a placeholder)
const result = await db.query(
'SELECT * FROM users WHERE email = $1',
[req.body.email] // Input is sanitized and escaped automatically
);