Monday, 20 July 2015

Normalization/ De Normalization in data warehousing

Now lot of places you will find of normalization definitions some of them are very technical and some seems to go beyond radar make you more confuse –
Here I have given it a try for making it simple –

For me normalization is simple, reduce redundancy, save some storage space and maintain consistency.
Removing redundancy helps OLTP system, as they have lot of UPDATES. Less rows to updates gives offcourse more performance.

De Normalization 
Though we don’t have any proper definition of term “De normalization” yet it’s been used frequently.
Let say de normalization will involve creating more attributes (columns) in one table. This will reduce the number join and can give better performance.

Star/Snow Flake and Normalized/De Normalized – What is what??

For me Star is less normalized so I’ll call it De Normalized.
If in real time system you have to maintain a Star Schema you will have to manage big/ huge dimensions.

Snow Flake is a Normalized;
You decompose relation ships here and create smaller structures. Example – Adventure work DW Product dimension.

So Big Question? - What to do when design for DWH –

Well I prefer to go with……….wait for it……………Reporting. Yes you heard it.
It’s not about normalization or de normalization. It’s about reporting and its performance.
Your End users don’t care whether you normalized or de normalized data, they just want to see their reports well with right numbers and fastest way possible.  In today’s time very handful of organization/customer/user will care for storage/space, all they want now is quickest reporting or analytics.

There are no fix rules of going Normalized will be better or vice a versa. Hence it is always based on your reporting and analytic requirement.  

No comments:

Post a Comment