The Data Warehousing Page |
  |
This Pages describes some basic DW terms.
We have defined the Steps in a Method to planning a Data Warehousing Project.
Be sure to check out our Data Warehouse Reference Page.
Data Warehousing is concerned with the activities involved with creating a Data Repository where you store all the facts you know about some large enterprise or activity, and then analyse the data in different ways,to look for patterns on the data. Typically this is done by large retail organisations, trying to identify patterns in buying behaviour so they can sell more stuff to the poor unsuspecting customer. The most commonly quoted example is that somewhere in the States they discovered that on Friday nights the sales of beer and diapers/nappies went up. The explanation seemed to be that men would call in to buy beer for the weekend and the wives would say "While you are in the supermarket,please pick up some diapers/nappies for the baby". As a result of this analysis they started putting diapers/nappies next to the beer and increased the sales of both items. TERMS IN COMMON USE in DW are :- 1) Facts - for example, sales in a shop for the beer and nappies example. 2) Dimensions - for example, date and time of purchase, product and product category. Then DW involves analysing Facts by different combinations of Diumensions. For example, "When do we sell most beer ?" and "When do we sell most nappies" - then they find the answer to both of these is Friday evening -and hey presto they can make mnore money by making it easier for the beer-buyers to buy nappies(and please the wife), and vice versa. From this example, you can see that most DW examples and work are pretty boring. A major growth area for DW at present is the storage and analysis of 'clickstream data' of traffic on Internet Web Sites. This is typically used to identify hits and page impressions to determine how many people will see a particular banner advertisement on the Web Site. An excellent book covering this very new area is "The Data Webhouse" by Ralph Kimball. 3) OLAP :- OLAP stands for 'On-Line Analytical Processing, and the term was first used by Dr.Ted Codd, of IBM Research, who was the father of the relational database and SQL. The beer and nappies example is a case of OLAP in practice. What it means is that you are online analysing masses of data. Whereas the common use of databases is to store details of 'transactions' such as Bank account details. OLAP requirements are not very well met by SQL so extensions have been developed by the vendors of OLAP products, like Brio or Cognos.