Monday, February 20, 2012

Rapidly Changing Dimension

Hi All,
I'm trying to figure out how I would best model the following situation
:
I'm trying to model a retailing case which is fairly easy except for
one mind boggling thing (at least for me). I'm having a SKU dimension
of around 150.000 unique products which is already a SCD Type 2 for
some attributes. In addition I'm willing to track changes of the sales
and purchase price. However these prices change almost weekly for quite
a lot of these products leading to a huge dimensional table when using
type 2.
As this is a numerical attribute I'm thinking about putting it into the
fact table; However first of all my fact table will grow with a couple
of gigs (around 1 billion rows) and secondly as not every product is
sold every day I do not have the possibility to view price over a
period on a day to day basis.
A second option would be to have a separate fact table (and olap cube)
and making a linked measure for both prices. However I don't know how
to fetch the correct price in the basic cube when the price-cube does
not have the same granularity of the date dimension but more of a
start-end date structure.
Anyone some brilliant ideas? I ran out of mind juice on this one.Hello DePuurt,
I would put the sales price in the fact table, and use type 1 to track
the current price in the product dimension. This should solve the day
to day price reporting and give you the changes of sales price over
time.
I also put a post together ages ago on different forms of type 2
implementation.
Check out:
http://bi-on-sql-server.blogspot.co...r.blogspot.com/|||A good idea, certainly valid.
However, I still have the issue on reporting the sales price over a
period of time. I want to be able to give a full price history of a
specific product over time; even if the thing didn't sell at all;
Unless I do you a complete full blown type 2 on the product dimension
I'm still not able to do this. An option would be to split the price of
the product dimension and have a subdimension tacking it. This tracking
dimension would have the natural key and the start/end date and both
prices. Creating a join with the original product dimension would yield
the exact information BUT now it's not in the cube if I go for OLAP.
Actually it all comes down to the desired functionality. Having a big
dimension table is the meast desired option in the main cube (sales
analysis), but is actually achievable when I accept the performance
drop. A second option yielding the same functionality would be to
create a second cube on prices. This means building one on the product
dimension with tracking dimension on price and this for the same date
granularity as the main cube. Using the lookup function the user
wouldn't notice it and performance would not be hindered when running
SQL reports. MDX is actually still quite fast, so I can take a small
hit (llokup) there. The last thing would be to go for your option, thus
limiting the possibilities for reporting price over time.
Thanks a lot,
DP.|||Hi DePuurt,
150K rows in a dimension table is nothing to be worried about....on a
recent project we had a 20M row dimension table...LOL!
But you are seeing one of the problems with type 2 dimensions when they
change quickly.....one client of mine had 90M rows in his customer
dimension table linking to 6B rows in a summary fact table...obviously
every question was slowed down...
The answer is to maintain history for type 2 dimensions without
maintaining it in the type 2 dimension table...
We do this all the time for big clients.
Peter
www.peternolan.com

No comments:

Post a Comment