Adventures in Time: The Date Dimension

In Business Intelligence, I think one of the hardest things for people to get a handle on is our good friend the date dimension. It  should always be the very first dimension you create in a warehouse or mart, and it’s either going to make the rest of your tables and cubes much easier to report off of or much harder. Sorry Microsoft, but the 20 columns or so you give us in Adventure Works just doesn’t really cover all of the different angles, as we all know that even though Sally in accounting likes to see dates by fiscal month and quarter, John in sales like to see dates by calendar week, month and year. When building a date dimension, there are several very distinct things we need to consider when creating it:

  1.  What are the different ways business users will want to look at it? Long month names, short month names, weekdays, weekends, holidays, etc. ?
  2. How can I structure my table so it’s efficient for all of my BI applications to look at it?
  3. Can this actually be easy to maintain if I have to build a humongous date table to answer all of the questions above?

Let’s start with Question 1. To really get an answer for this question, the best person to ask is the person who writes the reports. What do they typically see from the business? Who asks for what?

Question 2 is a bit longer than the answer to Question 1. The answer is yes, we can. By building all of our columns into our table, we can make it extremely easy and efficient for us to report off of them from SQL, Analysis Services, SSRS, and SharePoint. SSRS and SharePoint wind up deriving off of Analysis Services, so that’s a little bit of duplication there. 🙂 But we COULD go back straight to the date dimension if we wanted to.

And question 3 is yes again. The trick is to just build up your shortcuts up front to make Analysis Services, etc. quick when they process, but to do it in the table directly up front and not want and do some funky concatenation in AS.

A friend of mine gave me a date dimension script a while ago that generated the actual table and the initial population of it. I added some extra columns to it, wrapped it in a proc, and put it below. There are quite a few different pieces to it, I’ll outline them below:

The proc at the end that gets created sets the current day flag equal to today’s date. If you put that in a job and execute at midnight every night, your current day flag will update every day.

In the proc Generate Date, it’s set to check and make sure that the default member exists while it’s running. That way if it’s missed when the table is initially built, or is truncated for some reason, it will always be added. I generally use a default date dimension member as 19000101 (01/01/1900), but if you use something else feel free to modify it.

The function came from Tim Cullen over at MSSQLTips (http://www.mssqltips.com/sqlservertip/1537/tsql-function-to-determine-holidays-in-sql-server/) to calculate Easter. Thank you Tim, you have no idea how much time/frustration you saved me.

The Set Holidays proc is wrapped so it’s run as a part of the Generate Date dimension, so as new dates are added the holiday’s are automatically populated.

Thanks all!

 

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[DimDate]’) AND type in (N’U’))
DROP TABLE [Common].[DimDate]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[GenerateDate]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[GenerateDate]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[SetCurrentDay]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[SetCurrentDay]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[SetHolidays]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[SetHolidays]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [Common].[DimDate](
[DateKey] [int] NOT NULL,
[FullDate] [datetime] NOT NULL,
[DisplayDate] [char](10) NOT NULL,
[DayShortName] [char](3) NULL,
[DayName] [varchar](9) NULL,
[CalendarYear] [smallint] NOT NULL,
[CalendarQuarterKey] [int] NOT NULL,
[CalendarQuarterNumber] [int] NOT NULL,
[CalendarQuarterShortName] [varchar](10) NULL,
[CalendarQuarterName] [char](9) NULL,
[CalendarQuarterYearName] [char](7) NULL,
[CalendarMonthKey] [int] NOT NULL,
[CalendarMonthNumber] [tinyint] NULL,
[CalendarMonthShortName] [char](3) NULL,
[CalendarMonthName] [varchar](9) NULL,
[CalendarMonthYearName] [char](8) NULL,
[CalendarWeekKey] [int] NOT NULL,
[CalendarWeekNumber] [char](2) NULL,
[CalendarWeekShortName] [char](3) NULL,
[CalendarWeekName] [varchar](10) NULL,
[CalendarWeekInMonth] [tinyint] NULL,
[CalendarWeekYearName] [varchar](12) NULL,
[CalendarDayOfYear] [smallint] NOT NULL,
[CalendarDayOfQuarter] [smallint] NOT NULL,
[CalendarDayOfMonth] [tinyint] NOT NULL,
[CalendarDayOfWeek] [tinyint] NOT NULL,
[IsWeekDay] [char](1) NULL,
[IsCurrentDay] [char](1) NULL,
[IsHoliday] [char](1) NULL,
[HolidayName] [varchar](25) NULL,
PRIMARY KEY CLUSTERED
(
[DateKey] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

CREATE FUNCTION [dbo].[fn_EasterSundayByYear]
(@Year char(4))
RETURNS smalldatetime
AS
BEGIN
http://aa.usno.navy.mil/faq/docs/easter.php
http://www.mssqltips.com/sqlservertip/1537/tsql-function-to-determine-holidays-in-sql-server/
declare
@c int
, @n int
, @k int
, @i int
, @j int
, @l int
, @m int
, @d int
, @Easter datetime

set @c = (@Year / 100)
set @n = @Year – 19 * (@Year / 19)
set @k = (@c – 17) / 25
set @i = @c – @c / 4 – ( @c – @k) / 3 + 19 * @n + 15
set @i = @i – 30 * ( @i / 30 )
set @i = @i – (@i / 28) * (1 – (@i / 28) * (29 / (@i + 1)) * ((21 – @n) / 11))
set @j = @Year + @Year / 4 + @i + 2 – @c + @c / 4
set @j = @j – 7 * (@j / 7)
set @l = @i – @j
set @m = 3 + (@l + 40) / 44
set @d = @l + 28 – 31 * ( @m / 4 )

set @Easter = (select right(‘0’ + convert(varchar(2),@m),2) + ‘/’ + right(‘0’ + convert(varchar(2),@d),2) + ‘/’ + convert(char(4),@Year))

return @Easter
END
GO

CREATE PROCEDURE Common.SetHolidays
AS
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Thanksgiving’
WHERE CalendarMonthNumber = 11
AND CalendarWeekInMonth = 4
AND CalendarDayOfWeek = 5
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Independence Day’
WHERE CalendarMonthNumber = 7
AND CalendarDayOfMonth = 4
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Christmas Day’
WHERE CalendarMonthNumber = 12
AND CalendarDayOfMonth = 25
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Christmas Eve’
WHERE CalendarMonthNumber = 12
AND CalendarDayOfMonth = 24
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘New Year”s Day’
WHERE CalendarMonthNumber = 1
AND CalendarDayOfMonth = 1
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Valentine”s Day’
WHERE CalendarMonthNumber = 2
AND CalendarDayOfMonth = 14
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘St. Patrick”s Day’
WHERE CalendarMonthNumber = 3
AND CalendarDayOfMonth = 17
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Memorial Day’
WHERE DateKey In (SELECT MAX(DateKey)
FROM Common.DimDate
WHERE CalendarMonthNumber = 5
AND CalendarDayOfWeek = 2)
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Labor Day’
WHERE DateKey In (SELECT MIN(DateKey)
FROM Common.DimDate
WHERE CalendarMonthNumber = 9
AND CalendarDayOfWeek = 2)
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Easter’
WHERE DateKey In (SELECT DISTINCT
cast(convert(char(8), dbo.fn_EasterSundayByYear(CalendarYear), 112) as int)
FROM Common.DimDate)
UPDATE Common.DimDate
SET IsHoliday = ‘N’
WHERE IsHoliday IS NULL

/****** Object:  StoredProcedure [dimensions].[GenerateDate]    Script Date: 07/18/2012 08:51:20 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [Common].[GenerateDate]
@StartDate DATE
,@EndDate DATE
AS
–SET @StartDate = ‘1/1/2004′
–SET @EndDate = ’12/31/2013′
SET NOCOUNT ON;

IF NOT EXISTS(SELECT DateKey FROM [Common].[DimDate] WHERE DateKey = 19000101)
BEGIN
INSERT INTO [Common].[DimDate]
([DateKey]
,[FullDate]
,[DisplayDate]
,[DayShortName]
,[DayName]
,[CalendarYear]
,[CalendarQuarterKey]
,[CalendarQuarterNumber]
,[CalendarQuarterShortName]
,[CalendarQuarterName]
,[CalendarQuarterYearName]
,[CalendarMonthKey]
,[CalendarMonthNumber]
,[CalendarMonthShortName]
,[CalendarMonthName]
,[CalendarMonthYearName]
,[CalendarWeekKey]
,[CalendarWeekNumber]
,[CalendarWeekShortName]
,[CalendarWeekName]
,[CalendarWeekInMonth]
,[CalendarWeekYearName]
,[CalendarDayOfYear]
,[CalendarDayOfQuarter]
,[CalendarDayOfMonth]
,[CalendarDayOfWeek]
,[IsWeekDay]
,[IsCurrentDay]
,[IsHoliday]
,[HolidayName])
VALUES(
19000101 –[DateKey]
,’1900-01-01′–[FullDate]
,’01/01/1900′–[DisplayDate]
,’Mon’–[DayShortName]
,’Monday’–[DayName]
,1900–[CalendarYear]
,190001–[CalendarQuarterKey]
,1–[CalendarQuarterNumber]
,’Q1′–[CalendarQuarterShortName]
,’Quarter 1′–[CalendarQuarterName]
,’Q1 1900′–[CalendarQuarterYearName]
,190001–[CalendarMonthKey]
,1–[CalendarMonthNumber]
,’Jan’–[CalendarMonthShortName]
,’January’–[CalendarMonthName]
,’Jan 1900′–[CalendarMonthYearName]
,190001–[CalendarWeekKey]
,’01’–[CalendarWeekNumber]
,’W1′–[CalendarWeekShortName]
,’Week 1′–[CalendarWeekName]
, 1 –[CalendarWeekInMonth]
,’W1 1900′–[CalendarWeekYearName]
,1–[CalendarDayOfYear]
,1–[CalendarDayOfQuarter]
,1–[CalendarDayOfMonth]
,2–[CalendarDayOfWeek]
,’Y’–[IsWeekDay]
,’N’–[IsCurrentDay]
,’Y’–[IsHoliday]
,’New Year”s Day’) –[HolidayName]
END
WHILE @StartDate <= @EndDate

BEGIN
IF NOT EXISTS(SELECT DateKey FROM [Common].[DimDate] WHERE DateKey=CONVERT(VARCHAR,@StartDate,112))

BEGIN

DECLARE @Year CHAR(4)
DECLARE @Quarter CHAR(2)
DECLARE @Month CHAR(2)
DECLARE @Week CHAR(2)
DECLARE @Day CHAR(2)

SET @Year = CAST(DATEPART(YEAR, @StartDate) AS VARCHAR)
SET @Quarter = RIGHT(‘0’ + CAST(DATEPART(QUARTER, @StartDate) AS VARCHAR), 2)
SET @Month = RIGHT(‘0’ + CAST(DATEPART(MONTH, @StartDate) AS VARCHAR), 2)
SET @Week = RIGHT(‘0’ + CAST(DATEPART(WEEK, @StartDate) AS VARCHAR), 2)
SET @Day = RIGHT(‘0’ + CAST(DATEPART(DAY, @StartDate) AS VARCHAR), 2)

INSERT INTO [Common].[DimDate](
DateKey
,FullDate
,DisplayDate
,DayShortName
,DayName
,CalendarYear
,CalendarQuarterKey
,CalendarQuarterNumber
,CalendarQuarterShortName
,CalendarQuarterName
,CalendarQuarterYearName
,CalendarMonthKey
,CalendarMonthNumber
,CalendarMonthShortName
,CalendarMonthName
,CalendarMonthYearName
,CalendarWeekKey
,CalendarWeekNumber
,CalendarWeekShortName
,CalendarWeekName
,CalendarWeekInMonth
,CalendarWeekYearName
,CalendarDayOfYear
,CalendarDayOfQuarter
,CalendarDayOfMonth
,CalendarDayOfWeek
,IsWeekDay
,IsCurrentDay
)
VALUES(
@Year + @Month + @Day–DateKey
,@StartDate –FullDate
,@Month + ‘/’ + @Day + ‘/’ + @Year –DisplayDate
,LEFT(DATENAME(WEEKDAY, @StartDate), 3) –DayShortName
,DATENAME(WEEKDAY, @StartDate) –DayName
,@Year –CalendarYear
,@Year + @Quarter –CalendarQuarterKey
,@Quarter –CalendarQuarterNumber
,’Q’ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) –CalendarQuarterShortName
,’Quarter ‘ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) –CalendarQuarterName
,’Q’ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) + ‘ ‘ + @Year –CalendarQuarterYearName
,@Year + @Month –CalendarMonthKey
,@Month –CalendarMonthNumber
,LEFT(DATENAME(MONTH, @StartDate), 3) –CalendarMonthShortName
,DATENAME(MONTH, @StartDate) –CalendarMonthName
,LEFT(DATENAME(MONTH, @StartDate), 3) + ‘ ‘ + @Year –CalendarMonthYearName
,@Year + @Week –CalendarWeekKey
,@Week –CalendarWeekNumber
,’W’ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) –CalendarWeekShortName
,’Week ‘ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) –CalendarWeekName
,DATEDIFF(WEEK, DATEADD(MONTH, DATEDIFF(MONTH, 0, @StartDate), 0), @StartDate) + 1 –CalendarWeekInMonth
,’W’ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) + ‘ ‘ + @Year –CalendarWeekYearName
,DATEPART(DAYOFYEAR, @StartDate) –CalendarDayOfYear
,DATEDIFF(DAY, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, @StartDate), 0), @StartDate) + 1 –CalendarDayOfQuarter
,DATEPART(DAY, @StartDate) –CalendarDayOfMonth
,DATEPART(WEEKDAY, @StartDate) –CalendarDayOfWeek
,CASE WHEN DATEPART(WEEKDAY, @StartDate) IN (1,7) THEN ‘N’ ELSE ‘Y’ END –WeekDay
,’N’ –CurrentDay
)
END

SET @StartDate = DATEADD(d, 1, @StartDate)

EXEC Common.SetHolidays

END

GO

CREATE procedure [Common].[SetCurrentDay]
AS

UPDATE Common.DimDate
SET IsCurrentDay = ‘N’
WHERE IsCurrentDay = ‘Y’

UPDATE Common.DimDate
SET IsCurrentDay = ‘Y’
WHERE DateKey = ((Year(GETDATE())*10000)+(Month(GETDATE())*100)+(Day(GETDATE())))

GO

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s