Productive Linux


 Subscribe via Feedburner in a reader

Enter your email address:

Delivered by FeedBurner


Don't have an account?
Sign up to
Forgot your password?

Is almost every popular web content management system just plain wrong?
8 November 2010 @ 19:54 GMT
by Paul

There was an interesting discussion on perlmonks about the use of primary keys in databases.

The purpose of a primary key is to identify a row of data (a tuple) in a unique way. To achieve this end, many website back-ends identify each database row with a unique integer that is automatically inserted every time a new row is saved into the database.

This is guaranteed to give each row a unique identifier and therefore, superficially, it sounds like an good idea. However:

The essential problem is that an autonumber "id" column contains no information about the record to which it's connected, and tells you nothing about that record. It could be a duplicate, it could be unique, it could have ceased to exist…

That quote is from here. Here is a discussion of reasons often given for using unique integers to identify database rows. The passage below is particularly interesting because the use of integer primary keys is standard practice on websites and web-based content management systems:

Bad. Both the SQL Standard and the Design Principles arguments are based on willful ignorance. Generally the developer using these rationales heard from a friend of a collegue who read someone's blog who took a course at the University that ID columns were a good idea, and is adhering to this piece of pseudo-knowledge like a limpet, lest he or she be required to go out and get some real education. That some of these self-blinded designers are also book and article authors is really tragic. For the record, neither the SQL Standard nor relational theory compel the use of surrogate keys. In fact, the papers which established relational theory don't even mention surrogate keys.

So, if you aren't supposed to use integer primary keys in a web application, what are you supposed to use? Something that really does uniquely and naturally identify each and every database row? What might that be?

And if we can't answer that question, perhaps the RDMS is the wrong solution for serving web pages.

Just a thought.

I never took a single class in database administration. It was only through the force and clarity of raw intuition did I realize that using an autonumber as a key is stupid. I believe MySQL allows the creation of indexes based on arbitrary fields, which does speed the lookup time.

Posted by Brother Mark on 2010-11-11 14:41:36.

@Brother Mark

Mysql, postgresql and other databases allow a 'UNIQUE' constraint which ensures the the values in a column are unique. Any values that can be guaranteed to be unique can be used as the 'primary key'.

Posted by Paul (registered user) on 2010-11-11 16:45:11.
Comments disabled