This blog post was inspired by Bill Karwin’s book SQL Antipatterns:Avoiding the Pitfalls of Database Programming. I read this over Thanksgiving weekend. If you are a developer, and you do database work, this is on the must-read list.

I agreed with just about everything in his book except a few little details. In Chapter 4, “ID Required,” Mr. Karwin lays out two antipatterns.

1. The primary key is always an autoincrementing column named id.
2. The primary key is not necessary when there is another unique restriction present naturally in the data.

### Regarding the First

The first condition is something like this.

CREATE TABLE employees (
id INT NOT NULL IDENTITY(1, 1),
last_name VARCHAR(64) NOT NULL,
first_name VARCHAR(64) NOT NULL,
department_id INT NOT NULL,
-- ...
CONSTRAINT [pk_employees] PRIMARY KEY CLUSTERED ([id] ASC)
);


I agree that not every table requires a primary key named id. There are lots of situations where not using id or not having autoincrementing values is helpful. However, understand that with technologies like NHibernate with fluent mapping or Rails’ ActiveRecord, it is terribly damn convenient for the developer.

NHibernate with explicit mapping won’t much matter, since you must define the exact purpose of each and every column. It’s only when you’re using some of the implicit mapping magic that this comes into play.

Also, if you’re using PostgreSQL, there is some syntactic sugar you can use when joining tables.

-- Using id
SELECT e.id AS employee_id, e.first_name, e.last_name, SUM(te.total_hours) AS sum_total_hours
FROM employees e
INNER JOIN time_entries te ON e.id = te.employee_id
WHERE te.entry_date = N'2013-06-16'
GROUP BY e.id, e.first_name, e.last_name

-- Using table-specific columns
SELECT e.employee_id
SELECT e.id AS employee_id, e.first_name, e.last_name, SUM(te.total_hours) AS sum_total_hours
FROM employees e
INNER JOIN time_entries te USING (employee_id)
WHERE te.entry_date = N'2013-06-16'
GROUP BY e.id, e.first_name, e.last_name


In the above example, you can use the USING keyword to join two tables on columns when the names match. If you’re explicit about calling the Employee ID employee_id instead of id, you can take advantage of syntax.

Consider your data access. If you think you’ll be handcrafting a lot of SQL, then maybe this syntactic sugar is worth it. Are you using a database where this even matters? Are you using a data access tool (NHibernate with fluent mapping, ActiveRecord) that takes advantages of the primary key being called id? There’s a lot to consider here beyond, “Don’t do it.”

### Regarding the Second

Mr. Karwin also describes the following as an antipattern.

CREATE TABLE users (
user_id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,

He says that this table has two unique identifiers, since both user_id and username unique identify a row. So I tried this in my Portfolio project. I have the ability to create tags and apply these tags to events. The slug for these tags are each unique. So I tried dropping the id column in favor of just using the slug column as the primary key. I made sure to turn on cascading updates every place a tag was referenced. Imagine my surprise when I saw this error message when I tried to edit a tag.