[BiO BB] RE: Normalization WAS: database design question

Svensson, B.A.T. (HKG) B.A.T.Svensson at lumc.nl
Wed Mar 3 07:23:07 EST 2004

Dear Jordi,

It seams to me that your "beginner" is a quite advance fellow(?).
I still argue that one need to crawl before one can start to walk.

Even in advanced text book on database system, normalization
(i.e. a database design issues) are not described until the
basic of database system has been covered. Like conceptual
data models, schema's, keys attributes, relationships, roles,
indexes, domains, anomalies, data definition/manipulation
languages, etc, etc, etc...

In my opinion starting with normalization without learning
the basic concept of the above will result in nothing, since
you wont know what it is you want to decompose anyway.

But on the other hand learning the above will implicit teach
you how to do normalization - namely with the common sense
approach. Normalization theory is just a formal way to (read:
a way to make a machines be able to) do what we human already
know by heart and guts.

But even before learning these concept you still need other
foundation to stand on. In my opinion it is simply not enough
to buy an a desktop DBMS and think everything will solve it
self once one know normalization theory - it a little bit
tricker than that.

Put this another way; let say the novice know how to do
normalization. Further assume the novice have normalized
some data set. And now then? What to do next? How do the
novice get raw data in the right format to load it into
the RDBMS? And what means should be used to load data?
Does data need to be updated/replaced frequently - are
we talking genomic databases then we are into a nightmare.
But even if these things are are achieved how will the
novice know how to query the data? Etc, etc, etc...

Yes, there are many ways to Rome, but they are all long...

  * * *

You also claim there is 1 to 100 success ratio in diverting
from normalization. As you might understand, referring to your
experience can't be regarded as a solid proof of your claim.

Especially when there are unanswered question mark, as to what
normalization you are referring to, what is being normalized,
the expertise level of the designer, a comparison with the
unknown, etc, etc... 

The 1st primary goal of /any/ software development is:

		*** MAKE IT WORK! ***

This is the primary goal for the beginner to achieve, because
it doesn't matter how fancy your design is if it does not work
at the end of the day. By creating things that works, the novice
will learn from experience what will work smoothly and what does
not, and eventually become a prof.

If the novice just get the data into an RDBMS - in what-
so-ever-normal-form, he will eventually be able to construct
a query that can solves a specific problem for him, simply
because humans are creative beings. However an expert may
say that "the solution was a bit unusual".

In fact quite a lot novices creates, or uses, one single tab
delimited files (with God know what kind of normalization forms),
dump them into some kind of database manger (Excel, MS Access),
and then happily ignore any database design issues and just uses
a "flat" query to get what ever data they need. And this works
in most simplistic cases for the novices. In this way there will
never be a need to changes the layout, just pick what they need,
if something missing, another column will fix that.

I assume you never counted these guys in your 99% failure ratio?

  * * *

On database views:

as you know, view are only predefined queries in the system,
and is only of limited help (a database system is dynamic
in time), since they wont address the issues I already been
mentioning in longer terms.

  * * * 

I am satisfied with your last remark, that queries drives the
design - that is to identify the mini-world we want to model.

And when one design like this, a "good design" will no longer
be measured by the normalization level of the system, but by
the real usability of the system, i.e. according to the
primary goal: make it work!

Kind regards,


On Tue, 2004-03-02 at 19:36, JRambla wrote:
> Svensson,
> For me, learning normalization is like learning to drive.
> When you're experienced, you can make a comfortable use of driving rules:
> semaphore timing, how to measure incoming cars speed, safe car to car
> distance according to speed, when your car will fit in a parking space and
> so on. 
> However, if you're a beginner, you better stay with strict rules and driving
> professor rules-of-thumb. 
> I will advise any beginner to stay with normalization except if the problem
> domain mandates otherwise. From my experience, when the designer wants to be
> smarter than that, they fail, 99% of times. This means having to redesign
> the database after lots of data coherence pains. Most of times, you will
> need to edit some data, also.
> Some of your issues with normalization (too much joins, etc.) can be
> addressed by database objects like views. That's a principle for me: if you
> stay normalized, database engine can probably help you with your troubles,
> if you stay apart, you're usually on your own. You're a "desperado".
> When not talking to beginners, and they can't hear us, we can use another
> principle: "queries drives database design". You must design your database
> (or a version of it) optimizing against the queries you will perform the
> most usually or the most critical.
> In my humble opinion,
> Jordi Rambla
> Barcelona (Spain)
> -----Mensaje original-----
> De: bio_bulletin_board-admin at bioinformatics.org
> [mailto:bio_bulletin_board-admin at bioinformatics.org] En nombre de Michael
> Gruenberger
> Enviado el: martes, 02 de marzo de 2004 18:50
> Para: bio_bulletin_board at bioinformatics.org
> Asunto: RE: [BiO BB] RE: Normalization WAS: database design question
> Thanks for the clarifications! I pretty much agree with everything you
> said. Of course you need a lot of experience to be able to design good
> data models and once you have the experience you probably don't have to
> go through the formal process of normalizing a database, because you
> just instinctively know what works best.
> But still in order to get the experience you have to start somewhere and
> you have to give a beginner an entry point and showing them
> normalization with some good examples has worked for me in most cases...
> This thread started by someone asking for some guidance on how to design
> his database and I still think that pointing that guy to normalization
> was a good idea. It would be interesting to get some feedback though and
> to know whether reading about normalization helped that guy!
> Cheers,
> Michael.
> On Tue, 2004-03-02 at 17:00, Svensson, B.A.T. (HKG) wrote:
> > That's not what I tried to say; Applying the theory makes things
> > more complicated then they are. In short: it's a waste of time
> > in everyday work.
> _______________________________________________
> BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

More information about the BBB mailing list