ACPI administration advocacy advocacy advocacy opinion alsa amarok apache apple apt aptitude audio audo authentication automount avi awk bash BIOS boot business cache calendar calibre cdr cdrecord censorship commandline computerscience console convert cron cut database date debian degree design desktop development disk dpkg dvd economics education emacs email europe exim faad ffmpeg file files firefox firewall flash foss freedom ftp fun fuse git gnumeric graphics grep growisofs grub gtkpod hardware hardware html idiocy image imagemagick images installation ip iphone ipod iptables iso itunes ivman kde kernel keyboard knoppix lame laptop latex linux locale lockin longlines m4a microsoft mimetypes minitab mount mp3 mp4 mplayer multimedia music mysql network nfs nfs4 nmap openbox openoffice opinion opinion partition pdf perl php politics postgresql printing privacy programming rant remote rhythmbox rss rsync rxvt scp screengrab screenshot script scripting scsi security sed server shell siteadmin sitenews sitesoftware skype skype slackware sound sox spam spreadsheet ssh statistics subversion sudo svk swap t23 t43 terminal text thinkpad thunderbird time timezone ubuntu udev upgrade usb usbmount users uuid versioncontrol vfat video vnc windows wine wordpress wordprocessing X40 xwindows xwindows youtube
I've been agonising about normalising a database system, in fact, the database system this web site runs on.
When I first considered splitting out some table attributes into seperate tables, I didn't realise how completely un-normalized the database was, how much redundant data there are, how many integrity checks where being pushed up into the application logic layer and so on. Basically, it's a much bigger task than I had previously imagined.
To save the work, there was one group of field that I was considering not normalising. Consider a table with these fields:
| node_id | author_permissions | group_permissions | other_permissions | ... other attribtutes |
Each of the permissions is a string of four characters. Each character can either be 'i', the permission is inherited, '-' the permission is disabled or one of 'rwdx', the permission is enabled.
So, each permission is really a boolean and should be represented as such. More than that, each permission is a trit, where it could take one of three values NULL, 0 or 1 meaning 'inherit', 'disable' or 'enable' respectively.
To normalise the above database table properly, I'd have to create a permissions table like this:
CREATE TABLE "access" ( usertype varchar(6) NOT NULL, permission varchar(10) NOT NULL, access boolean NOT NULL, node_id bigint REFERENCES node(node_id) ON DELETE CASCADE, PRIMARY KEY (usertype, permission, node_id) )
In this table, usertype is one of 'author', 'group', 'guest' or 'other' and permission is one of 'read', 'write', 'execute' or 'delete'. The table is normalised because the access value is dependent only on the key (which consists of the other three columns). The permission value has a 'NOT NULL' constraint, because if the permission is inherited we don't set the permission field to 'NULL', we delete the row from the table - hence on LEFT JOIN queries 'NULL' is returned.
This all makes logical sense. However, I wasn't going to do it because, who wants a big table of boolean values? that are going to be joined anyway? especially when the application layer still expects a string 'i', '-' and 'rwxd'? So, I was going to create a fudge.
Thanks to Phil Factor, I won't now:
OTLT [One True Lookup Table] syndrome comes about usually when one is forced to create a number of simple lookup tables which contain a code and a name. One gets the sudden flash of inspiration that one could roll them all into one lookup table, with a unique identifier for each row, and a type field, thereby saving the chore of creating and maintaining a number of tables. EAV disease, like OTLT syndrome both tend to be the result of pressure from programmers to push their OO designs back into the relational database, like a bathyscaphe in an alien environment . There is loose talk of persisting object data in a data store, as though one was storing apples in a shed. It also comes about when a development team attempts an Agile development before they've fully understood the data model.
Thanks you too, Fabian Pascal on the dangers of a not normalized database. And the problems of integrity if you don't normalise. Indeed, there have already been problems with integrity in the database system as a whole including the permissions system.
Normalization, here I come!