Personal Links Minimize

Search Blogs Minimize

Blog Archives Maximize

 OldBlog      
Location: BlogsDataZulu    
Posted by: Eric Wilson 2/25/2005

Terminology confusion abounds in the software industry and I believe the worst examples are in database management in particular. So I shouldn't have been surprised to read the nonsense I discuss here... but still I was.

The following quote opens up edition 6.14 of Woody’s Access Watch (“WOW”), a normally fine e-newsletter focused on building Microsoft Access solutions:

Access is a normalized database (except for purists who don’t count a database as normalized unless it adheres to all 13 of the Codd & Date rules), and that means that you should store linked data in different tables.

If you don’t reel from an affront to your sensibilities from that one sentence, you should. This is no different, to me, than saying:

Microsoft Word is a legal document.

Sounds silly, right?

But did the WOW quote sound as odd to you? It should. It reveals many misunderstandings about database technology and practices and is confused and muddled, something we of all people should avoid.

Ignoring the poor grammar, here are the technical reasons this WOW statement should offend you as a technology professional:

1. Microsoft Access is not a database.

Read that again. Access is not a database. A database is an organized collection of information accessible for some purpose. Is Access an organized collection of information? No! It is a piece of software. Many people are far too cavalier with the word “database.” Database professionals—those in our society who are responsible for (and presumably trained for) organizing and classifying the world’s information—should be the least so.

Fundamentally, Access is a “database management system” or “DBMS” for short. You could call it “database software” or even, loosely, a “database system.” But it not a database. “Northwind” and “pubs” are databases. IMDB.com is (largely) a database. The card catalog at the library is a database. But Microsoft Access is a DBMS.

Calling Access a database is like calling Microsoft Word a “document.”

2. Nothing guarantees that any particular database implemented within Access will be normalized or not normalized...

... any more than using Word to write documents guarantees they will be legally enforceable documents.

In my experience databases implemented with Access frequently violate all sorts of normalization rules. One recent example I saw had a single column, “[City, State]” (yes, including the comma and space!). This clearly violates first normal form (“1NF”).

And don’t mistake this for an Access limitation or problem. No DBMS forces you to follow good database design practices. I have seen equally good and bad designs in Access, Oracle, and SQL Server. (Okay, the more advanced tools are somewhat more likely to be used by better trained professionals. But the edge is slight.)

To think of Access as a “normalized” database tool or even a “relational” DBMS would be akin to thinking of Word as a tool that automatically produced legally enforceable documents regardless of what you typed.

Being an expert at Access (or any database tool) no more qualifies you to implement correct, high quality database solutions than being a Word expert would qualify you to be a Pulitzer prize-winning author. It helps get the job done faster, but you still have to know how to do the job. (Most hiring managers do not understand this!)

3. Normalization does not mean adhering to Codd’s 12 rules.

Note: the guidelines are normally listed as "12" rules and were proposed by Ted Codd (not "and Date"). They are much quoted, including of course in Date's standard database text. See this site for a quick summary. Since there is a "Rule Zero," it would be fine to say the "13 Rules," but most texts do not call them such.

Codd’s 12 rules for relational database systems were geared toward those hoping to implement a relational DBMS and, as such, dictate certain behaviors of those software systems.

Normalization, in stark contrast, applies to the table-level design of one particular database.

See why it’s so important to be clear about “database” v. “DBMS”? I hope so.

(As an aside, being “normalized” also does not mean being “relational.” Adhering to 1NF is required. Beyond that they are simply design recommendations. See C.J. Date’s excellent “What First Normal Form Really Means” article for details.)

4. Normalization does not per se mean storing data in different tables and then linking back together.

Granted, such a design is usually a consequence of a normalized design. But normalization rules are strictly about removing certain types of redundancy from a certain table. Usually this is accomplished by decomposing the table in question into two or more related tables, for example splitting out a “Children” column into a separate “Child” table containing one row per child. But it’s important to note the difference.

It is especially important to note that just because you have separated data tables and relationships between them does not mean you have a normalized database or a relational database.

Legal jargon is probably necessary to produce a legal document, but just a glut of legalese scattered in my document does not mean it will be legally enforceable.

But possibly the biggest problem with the WOW quote is that it comes from a supposed industry “expert.” Unfortunately such ignorance of our field is rampant. You must be careful and pay attention to the fundamentals.

Keep in mind:

If you do not know the history of your own profession you can never hope to be but an amateur.
—anonymous

Note: For much deeper and longer treatment of many of the industry's database confusions, see Fabian Pascal's site, www.dbdebunk.com. If you regularly work with databases, you owe it to yourself to be familiar with his and Chris Date’s writings available at this site.
Permalink |  Trackback

Comments (1)  
Re: Microsoft Word is a Legal Document    By ewilson on 2/3/2007
Note: this post was copied from an earlier blog format and site. The original post mentioned that "my video library" was also a database. One comment suggested it was not a database, but the information about that library would be. I could rebut that an organized library of media is in fact a database, but it distracts from the point of the entry, so I removed that phrase.

Blog OldBlog Downloads
Copyright 2005-2009 Eric M. Wilson and Reimage LLC Terms Of Use Privacy Statement