ACPI administration advocacy advocacy advocacy opinion alsa amarok apache apple apt aptitude audio audo authentication automount avi awk bash BIOS boot business cache calendar calibre cdr cdrecord censorship commandline computerscience console convert cron cut database date debian degree design desktop development disk dpkg dvd economics education emacs email europe exim faad ffmpeg file files firefox firewall flash foss freedom ftp fun fuse git gnumeric graphics grep growisofs grub gtkpod hardware hardware html idiocy image imagemagick images installation ip iphone ipod iptables iso itunes ivman kde kernel keyboard knoppix lame laptop latex linux locale lockin longlines m4a microsoft mimetypes minitab mount mp3 mp4 mplayer multimedia music mysql network nfs nfs4 nmap openbox openoffice opinion opinion partition pdf perl php politics postgresql printing privacy programming rant remote rhythmbox rss rsync rxvt scp screengrab screenshot script scripting scsi security sed server shell siteadmin sitenews sitesoftware skype skype slackware sound sox spam spreadsheet ssh statistics subversion sudo svk swap t23 t43 terminal text thinkpad thunderbird time timezone ubuntu udev upgrade usb usbmount users uuid versioncontrol vfat video vnc windows wine wordpress wordprocessing X40 xwindows xwindows youtube
There was an interesting discussion on perlmonks about the use of primary keys in databases.
The purpose of a primary key is to identify a row of data (a tuple) in a unique way. To achieve this end, many website back-ends identify each database row with a unique integer that is automatically inserted every time a new row is saved into the database.
This is guaranteed to give each row a unique identifier and therefore, superficially, it sounds like an good idea. However:
The essential problem is that an autonumber "id" column contains no information about the record to which it's connected, and tells you nothing about that record. It could be a duplicate, it could be unique, it could have ceased to exist…
That quote is from here. Here is a discussion of reasons often given for using unique integers to identify database rows. The passage below is particularly interesting because the use of integer primary keys is standard practice on websites and web-based content management systems:
Bad. Both the SQL Standard and the Design Principles arguments are based on willful ignorance. Generally the developer using these rationales heard from a friend of a collegue who read someone's blog who took a course at the University that ID columns were a good idea, and is adhering to this piece of pseudo-knowledge like a limpet, lest he or she be required to go out and get some real education. That some of these self-blinded designers are also book and article authors is really tragic. For the record, neither the SQL Standard nor relational theory compel the use of surrogate keys. In fact, the papers which established relational theory don't even mention surrogate keys.
So, if you aren't supposed to use integer primary keys in a web application, what are you supposed to use? Something that really does uniquely and naturally identify each and every database row? What might that be?
And if we can't answer that question, perhaps the RDMS is the wrong solution for serving web pages.
Just a thought.
@Brother Mark
Mysql, postgresql and other databases allow a 'UNIQUE' constraint which ensures the the values in a column are unique. Any values that can be guaranteed to be unique can be used as the 'primary key'.
I never took a single class in database administration. It was only through the force and clarity of raw intuition did I realize that using an autonumber as a key is stupid. I believe MySQL allows the creation of indexes based on arbitrary fields, which does speed the lookup time.
Posted by Brother Mark on 2010-11-11 14:41:36.