Databases Lesson 4: Indexes & Performance
Indexes make queries fast. Without them, every query scans the entire table. With them, the database jumps directly to the data — like an index in a book.
Creating Indexes
-- Without index: scans ALL rows to find matches O(n)
SELECT * FROM users WHERE email = 'alice@example.com';
-- Create index on email
CREATE INDEX idx_users_email ON users(email);
-- Now the query uses the index: O(log n)
-- Composite index: for queries filtering on multiple columns
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Perfect for: WHERE user_id = 5 AND status = 'pending'
-- Unique index: enforces uniqueness AND speeds up lookups
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);
-- Full-text search index (PostgreSQL)
CREATE INDEX idx_posts_search ON posts USING gin(to_tsvector('english', title || ' ' || content));
EXPLAIN — See What the Database Does
-- Analyze query performance
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 5;
-- Look for:
-- "Seq Scan" = no index, slow (bad for large tables)
-- "Index Scan" = using index, fast
-- "cost=X..Y rows=N" = estimated rows
-- Signs you need an index:
-- Queries filtering on a column frequently
-- JOIN ON columns
-- ORDER BY columns
-- Columns in WHERE clause with high cardinality
🏋️ Practice Task
Create a table “logs” with 100,000 rows (use generate_series in PostgreSQL or a loop). Measure query time WITHOUT index: SELECT * FROM logs WHERE ip_address = “1.2.3.4”. Add index. Measure again. Share the before/after times.
💡 Hint: In PostgreSQL: INSERT INTO logs SELECT generate_series(1,100000), concat(floor(random()*255)::text,’.’,floor(random()*255)::text,’.’,’1.1′), NOW()