PowerView as a SharePoint web part

Today I got a very interesting request from the client I’ve been working with. They loved the new features of PowerView, and wanted to embed the image of their powerview report on the home page of their BI Center site. My first reaction was, this is SharePoint 2010 and PowerView came out in 2012, so no way, ain’t gonna happen. But then I thought about a few more minutes and thought, why not? All PowerView really is to SharePoint is a Silverlight plug in, so it should be doable, right?

The answer lies in the “Other Settings” property in the Silverlight web part, which just requires a little bit of custom configuration to tell it to play nicely with the other kids. I’m going to go through creating it here:

On your site homepage (or wherever you want to create the web part, click on the page tab and then click edit. If there is no browse or page tab, click on the Site Actions dropdown and select “Show Ribbon” to make it appear.

Once you’ve clicked on edit, click the “Add a web part link”

Under “Media and Content”, select the Silverlight web part component, and click Add:

2012-12-04T18-17-22_0

After you do that, you’ll get a pop up asking you to enter the Silverlight web part URL. To find that URL, you’ll need to open up another browser tab, go to your PowerPivot gallery, and open the PowerView report. Once it opens, click on Edit Report, and once it loads, go up to (in IE, Page –> View Source). Look for a larger paragraph towards the bottom, and find the tag that starts with param name = “source”) Right after that it will say Value = “something”, where the something in quotes will end with a .Bootstrapper.xap? with a guid at the end of it. Put your Sharepoint browser connection string in front of it, and then copy it into your web part URL you still have open on your other tab.

2012-12-04T18-17-22_1

So your final connection string will look something like: http://MySharePointBIPortal/_layouts/ReportServer/My.Bootstrapper.XAP file.

After you’ve pasted it in you’ll get an error message that says “An Unexpected error has occurred.” That’s expected, so don’t worry about it for right now.

2012-12-04T18-17-22_2

Click on the dropdown and select “Edit Web Part” and a properties window will show up. At the bottom there is a + button with Other Settings on it, but all that’s there is a blank text box with an ellipses. Click the ellipses and then go back to your “view source” window that you have up from earlier.

2012-12-04T18-17-22_3

This time, look for the tag that says something like Init Parameters or Initialize Parameters and copy the entire piece out the value section. It will end after it says something like ReportSection=ReportSection. Paste that into the the text editor, making sure to replace the special characters with the correct colons and whatnot. I found it easier to just copy the URL for the Sharepoint site and kind of massage it a little.

After you’re done with that, click Apply and Voila! You have an incredibly awesome Power View report embedded as a SharePoint web part for your page. If it doesn’t work right away check the ItemPath and ReportServerURI settings, they’re usually the big culprits of it.

Advertisements

Sorting Month Names in DAX

I was recently helping a client set up their PowerPivot workbook for the first time and when they brought the months over on the pivot table, the results of the month weren’t sorted, so for example, April and August were the first months the list. Typically, users want to see months in sequential order (January, February, etc.). As DAX doesn’t have a key order by field, it’s a little bit different than in SSAS, where you can just order by the different key field. The solution is to create a separate column which is a concatenation of your Month number and month name fields so PowerPivot will display it in order.

 

For example, in your date dimension you have  a column called Month Number and a column called Month name

image

 

When you drag it over in excel, it’s completely out of order:

 

April 90
August 124
December 93
February 85
January 93
July 124
June 90
March 93
May 93
November 120
October 93
September 90

 

To sort it, add a column to the end of your PowerPivot workbook in the row context:

 

image

 

In the column, type up your formula. Concatenate the month number and the month number with a formula. In between the quotations, you can set it to be whatever you, or your business users, would like to see.

 

=[MonthNumberOfYear]& “-“&[EnglishMonthName]

 

Your new column will look like:

 

image

 

Right click and click rename, give your column a name, and go back to Excel. Drag over your new column in your PivotTable, and Voila! Months ordered in PowerPivot.

 

01-January 93
02-February 85
03-March 93
04-April 90
05-May 93
06-June 90
07-July 124
08-August 124
09-September 90
10-October 93
11-November 120
12-December 93

A view to a…warehouse

As I sit here in Boston, I can’t help but find myself pondering views. At its core, what is a view, and specifically in regards to business intelligence, why is it such good practice to create views on our data warehouse tables? So as any sane person would, I immediately pulled out my computer and starting looking up best practices and asking the all important question: “Why?”

Immediately, I stumbled across a Microsoft whitepaper: http://msdn.microsoft.com/en-us/library/aa902663%28v=SQL.80%29.aspx

 

And in reading it, 1 particular section stuck out to me more than anything else:

“Using views is good system design for a number of reasons:

  • Using views improves flexibility by providing a buffer between the physical database and downstream systems, like user access and cube population.
  • It increases the likelihood that you can make significant physical changes to the system while minimizing disruption to other applications. “

In particular, one thing I’ve really begun to preach is the first bullet point in particular – “Providing a buffer between the physical database and downstream systems.”

It used to be that you would build your data warehouse, build your SSAS cube on top of it, and build your SSRS reports off of either a) your SSAS cubes, b) your data warehouse, or c) both. However, with the recent advent of Power Pivot, Power View, and the BISM tabular model, times are changing. Before if you ignored this practice, it wasn’t that bad, because more likely than not, it was just a field in a DSV in Analysis Services you had to change. But now, with all of the additions to the family, that is A LOT more stuff you have to update if you ignore it. Nowadays, ignoring and physically tying to the table creates dependencies on an almost infinite amount of systems, which could have you transforming from a BI developer into a DBA to watch all of those tables and make sure they perform. Views are  a tool in the BI developers arsenal to give systems, and business users, insight into what’s going on while still keeping the structure in relative obscurity.

The other item I really like about is the “significant physical changes” piece. I’ve seen time and time again where business try to create something themselves, it doesn’t work, so they call someone, but their are already reports and cubes in production that reference the data warehouse. So it’s not an option to just “blow it away” and start over again. Views give us that functionality to support existing processes while we rebuild the inside.

Adventures in Time: The Date Dimension

In Business Intelligence, I think one of the hardest things for people to get a handle on is our good friend the date dimension. It  should always be the very first dimension you create in a warehouse or mart, and it’s either going to make the rest of your tables and cubes much easier to report off of or much harder. Sorry Microsoft, but the 20 columns or so you give us in Adventure Works just doesn’t really cover all of the different angles, as we all know that even though Sally in accounting likes to see dates by fiscal month and quarter, John in sales like to see dates by calendar week, month and year. When building a date dimension, there are several very distinct things we need to consider when creating it:

  1.  What are the different ways business users will want to look at it? Long month names, short month names, weekdays, weekends, holidays, etc. ?
  2. How can I structure my table so it’s efficient for all of my BI applications to look at it?
  3. Can this actually be easy to maintain if I have to build a humongous date table to answer all of the questions above?

Let’s start with Question 1. To really get an answer for this question, the best person to ask is the person who writes the reports. What do they typically see from the business? Who asks for what?

Question 2 is a bit longer than the answer to Question 1. The answer is yes, we can. By building all of our columns into our table, we can make it extremely easy and efficient for us to report off of them from SQL, Analysis Services, SSRS, and SharePoint. SSRS and SharePoint wind up deriving off of Analysis Services, so that’s a little bit of duplication there. 🙂 But we COULD go back straight to the date dimension if we wanted to.

And question 3 is yes again. The trick is to just build up your shortcuts up front to make Analysis Services, etc. quick when they process, but to do it in the table directly up front and not want and do some funky concatenation in AS.

A friend of mine gave me a date dimension script a while ago that generated the actual table and the initial population of it. I added some extra columns to it, wrapped it in a proc, and put it below. There are quite a few different pieces to it, I’ll outline them below:

The proc at the end that gets created sets the current day flag equal to today’s date. If you put that in a job and execute at midnight every night, your current day flag will update every day.

In the proc Generate Date, it’s set to check and make sure that the default member exists while it’s running. That way if it’s missed when the table is initially built, or is truncated for some reason, it will always be added. I generally use a default date dimension member as 19000101 (01/01/1900), but if you use something else feel free to modify it.

The function came from Tim Cullen over at MSSQLTips (http://www.mssqltips.com/sqlservertip/1537/tsql-function-to-determine-holidays-in-sql-server/) to calculate Easter. Thank you Tim, you have no idea how much time/frustration you saved me.

The Set Holidays proc is wrapped so it’s run as a part of the Generate Date dimension, so as new dates are added the holiday’s are automatically populated.

Thanks all!

 

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[DimDate]’) AND type in (N’U’))
DROP TABLE [Common].[DimDate]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[GenerateDate]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[GenerateDate]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[SetCurrentDay]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[SetCurrentDay]
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Common].[SetHolidays]’) AND type in (N’P’, N’PC’))
DROP PROCEDURE [Common].[SetHolidays]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [Common].[DimDate](
[DateKey] [int] NOT NULL,
[FullDate] [datetime] NOT NULL,
[DisplayDate] [char](10) NOT NULL,
[DayShortName] [char](3) NULL,
[DayName] [varchar](9) NULL,
[CalendarYear] [smallint] NOT NULL,
[CalendarQuarterKey] [int] NOT NULL,
[CalendarQuarterNumber] [int] NOT NULL,
[CalendarQuarterShortName] [varchar](10) NULL,
[CalendarQuarterName] [char](9) NULL,
[CalendarQuarterYearName] [char](7) NULL,
[CalendarMonthKey] [int] NOT NULL,
[CalendarMonthNumber] [tinyint] NULL,
[CalendarMonthShortName] [char](3) NULL,
[CalendarMonthName] [varchar](9) NULL,
[CalendarMonthYearName] [char](8) NULL,
[CalendarWeekKey] [int] NOT NULL,
[CalendarWeekNumber] [char](2) NULL,
[CalendarWeekShortName] [char](3) NULL,
[CalendarWeekName] [varchar](10) NULL,
[CalendarWeekInMonth] [tinyint] NULL,
[CalendarWeekYearName] [varchar](12) NULL,
[CalendarDayOfYear] [smallint] NOT NULL,
[CalendarDayOfQuarter] [smallint] NOT NULL,
[CalendarDayOfMonth] [tinyint] NOT NULL,
[CalendarDayOfWeek] [tinyint] NOT NULL,
[IsWeekDay] [char](1) NULL,
[IsCurrentDay] [char](1) NULL,
[IsHoliday] [char](1) NULL,
[HolidayName] [varchar](25) NULL,
PRIMARY KEY CLUSTERED
(
[DateKey] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

CREATE FUNCTION [dbo].[fn_EasterSundayByYear]
(@Year char(4))
RETURNS smalldatetime
AS
BEGIN
http://aa.usno.navy.mil/faq/docs/easter.php
http://www.mssqltips.com/sqlservertip/1537/tsql-function-to-determine-holidays-in-sql-server/
declare
@c int
, @n int
, @k int
, @i int
, @j int
, @l int
, @m int
, @d int
, @Easter datetime

set @c = (@Year / 100)
set @n = @Year – 19 * (@Year / 19)
set @k = (@c – 17) / 25
set @i = @c – @c / 4 – ( @c – @k) / 3 + 19 * @n + 15
set @i = @i – 30 * ( @i / 30 )
set @i = @i – (@i / 28) * (1 – (@i / 28) * (29 / (@i + 1)) * ((21 – @n) / 11))
set @j = @Year + @Year / 4 + @i + 2 – @c + @c / 4
set @j = @j – 7 * (@j / 7)
set @l = @i – @j
set @m = 3 + (@l + 40) / 44
set @d = @l + 28 – 31 * ( @m / 4 )

set @Easter = (select right(‘0’ + convert(varchar(2),@m),2) + ‘/’ + right(‘0’ + convert(varchar(2),@d),2) + ‘/’ + convert(char(4),@Year))

return @Easter
END
GO

CREATE PROCEDURE Common.SetHolidays
AS
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Thanksgiving’
WHERE CalendarMonthNumber = 11
AND CalendarWeekInMonth = 4
AND CalendarDayOfWeek = 5
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Independence Day’
WHERE CalendarMonthNumber = 7
AND CalendarDayOfMonth = 4
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Christmas Day’
WHERE CalendarMonthNumber = 12
AND CalendarDayOfMonth = 25
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Christmas Eve’
WHERE CalendarMonthNumber = 12
AND CalendarDayOfMonth = 24
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘New Year”s Day’
WHERE CalendarMonthNumber = 1
AND CalendarDayOfMonth = 1
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Valentine”s Day’
WHERE CalendarMonthNumber = 2
AND CalendarDayOfMonth = 14
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘St. Patrick”s Day’
WHERE CalendarMonthNumber = 3
AND CalendarDayOfMonth = 17
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Memorial Day’
WHERE DateKey In (SELECT MAX(DateKey)
FROM Common.DimDate
WHERE CalendarMonthNumber = 5
AND CalendarDayOfWeek = 2)
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Labor Day’
WHERE DateKey In (SELECT MIN(DateKey)
FROM Common.DimDate
WHERE CalendarMonthNumber = 9
AND CalendarDayOfWeek = 2)
UPDATE Common.DimDate
SET IsHoliday = ‘Y’,
HolidayName = ‘Easter’
WHERE DateKey In (SELECT DISTINCT
cast(convert(char(8), dbo.fn_EasterSundayByYear(CalendarYear), 112) as int)
FROM Common.DimDate)
UPDATE Common.DimDate
SET IsHoliday = ‘N’
WHERE IsHoliday IS NULL

/****** Object:  StoredProcedure [dimensions].[GenerateDate]    Script Date: 07/18/2012 08:51:20 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [Common].[GenerateDate]
@StartDate DATE
,@EndDate DATE
AS
–SET @StartDate = ‘1/1/2004′
–SET @EndDate = ’12/31/2013′
SET NOCOUNT ON;

IF NOT EXISTS(SELECT DateKey FROM [Common].[DimDate] WHERE DateKey = 19000101)
BEGIN
INSERT INTO [Common].[DimDate]
([DateKey]
,[FullDate]
,[DisplayDate]
,[DayShortName]
,[DayName]
,[CalendarYear]
,[CalendarQuarterKey]
,[CalendarQuarterNumber]
,[CalendarQuarterShortName]
,[CalendarQuarterName]
,[CalendarQuarterYearName]
,[CalendarMonthKey]
,[CalendarMonthNumber]
,[CalendarMonthShortName]
,[CalendarMonthName]
,[CalendarMonthYearName]
,[CalendarWeekKey]
,[CalendarWeekNumber]
,[CalendarWeekShortName]
,[CalendarWeekName]
,[CalendarWeekInMonth]
,[CalendarWeekYearName]
,[CalendarDayOfYear]
,[CalendarDayOfQuarter]
,[CalendarDayOfMonth]
,[CalendarDayOfWeek]
,[IsWeekDay]
,[IsCurrentDay]
,[IsHoliday]
,[HolidayName])
VALUES(
19000101 –[DateKey]
,’1900-01-01′–[FullDate]
,’01/01/1900′–[DisplayDate]
,’Mon’–[DayShortName]
,’Monday’–[DayName]
,1900–[CalendarYear]
,190001–[CalendarQuarterKey]
,1–[CalendarQuarterNumber]
,’Q1′–[CalendarQuarterShortName]
,’Quarter 1′–[CalendarQuarterName]
,’Q1 1900′–[CalendarQuarterYearName]
,190001–[CalendarMonthKey]
,1–[CalendarMonthNumber]
,’Jan’–[CalendarMonthShortName]
,’January’–[CalendarMonthName]
,’Jan 1900′–[CalendarMonthYearName]
,190001–[CalendarWeekKey]
,’01’–[CalendarWeekNumber]
,’W1′–[CalendarWeekShortName]
,’Week 1′–[CalendarWeekName]
, 1 –[CalendarWeekInMonth]
,’W1 1900′–[CalendarWeekYearName]
,1–[CalendarDayOfYear]
,1–[CalendarDayOfQuarter]
,1–[CalendarDayOfMonth]
,2–[CalendarDayOfWeek]
,’Y’–[IsWeekDay]
,’N’–[IsCurrentDay]
,’Y’–[IsHoliday]
,’New Year”s Day’) –[HolidayName]
END
WHILE @StartDate <= @EndDate

BEGIN
IF NOT EXISTS(SELECT DateKey FROM [Common].[DimDate] WHERE DateKey=CONVERT(VARCHAR,@StartDate,112))

BEGIN

DECLARE @Year CHAR(4)
DECLARE @Quarter CHAR(2)
DECLARE @Month CHAR(2)
DECLARE @Week CHAR(2)
DECLARE @Day CHAR(2)

SET @Year = CAST(DATEPART(YEAR, @StartDate) AS VARCHAR)
SET @Quarter = RIGHT(‘0’ + CAST(DATEPART(QUARTER, @StartDate) AS VARCHAR), 2)
SET @Month = RIGHT(‘0’ + CAST(DATEPART(MONTH, @StartDate) AS VARCHAR), 2)
SET @Week = RIGHT(‘0’ + CAST(DATEPART(WEEK, @StartDate) AS VARCHAR), 2)
SET @Day = RIGHT(‘0’ + CAST(DATEPART(DAY, @StartDate) AS VARCHAR), 2)

INSERT INTO [Common].[DimDate](
DateKey
,FullDate
,DisplayDate
,DayShortName
,DayName
,CalendarYear
,CalendarQuarterKey
,CalendarQuarterNumber
,CalendarQuarterShortName
,CalendarQuarterName
,CalendarQuarterYearName
,CalendarMonthKey
,CalendarMonthNumber
,CalendarMonthShortName
,CalendarMonthName
,CalendarMonthYearName
,CalendarWeekKey
,CalendarWeekNumber
,CalendarWeekShortName
,CalendarWeekName
,CalendarWeekInMonth
,CalendarWeekYearName
,CalendarDayOfYear
,CalendarDayOfQuarter
,CalendarDayOfMonth
,CalendarDayOfWeek
,IsWeekDay
,IsCurrentDay
)
VALUES(
@Year + @Month + @Day–DateKey
,@StartDate –FullDate
,@Month + ‘/’ + @Day + ‘/’ + @Year –DisplayDate
,LEFT(DATENAME(WEEKDAY, @StartDate), 3) –DayShortName
,DATENAME(WEEKDAY, @StartDate) –DayName
,@Year –CalendarYear
,@Year + @Quarter –CalendarQuarterKey
,@Quarter –CalendarQuarterNumber
,’Q’ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) –CalendarQuarterShortName
,’Quarter ‘ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) –CalendarQuarterName
,’Q’ + CAST(CAST(@Quarter AS TINYINT) AS VARCHAR) + ‘ ‘ + @Year –CalendarQuarterYearName
,@Year + @Month –CalendarMonthKey
,@Month –CalendarMonthNumber
,LEFT(DATENAME(MONTH, @StartDate), 3) –CalendarMonthShortName
,DATENAME(MONTH, @StartDate) –CalendarMonthName
,LEFT(DATENAME(MONTH, @StartDate), 3) + ‘ ‘ + @Year –CalendarMonthYearName
,@Year + @Week –CalendarWeekKey
,@Week –CalendarWeekNumber
,’W’ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) –CalendarWeekShortName
,’Week ‘ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) –CalendarWeekName
,DATEDIFF(WEEK, DATEADD(MONTH, DATEDIFF(MONTH, 0, @StartDate), 0), @StartDate) + 1 –CalendarWeekInMonth
,’W’ + CAST(CAST(@Week AS TINYINT) AS VARCHAR) + ‘ ‘ + @Year –CalendarWeekYearName
,DATEPART(DAYOFYEAR, @StartDate) –CalendarDayOfYear
,DATEDIFF(DAY, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, @StartDate), 0), @StartDate) + 1 –CalendarDayOfQuarter
,DATEPART(DAY, @StartDate) –CalendarDayOfMonth
,DATEPART(WEEKDAY, @StartDate) –CalendarDayOfWeek
,CASE WHEN DATEPART(WEEKDAY, @StartDate) IN (1,7) THEN ‘N’ ELSE ‘Y’ END –WeekDay
,’N’ –CurrentDay
)
END

SET @StartDate = DATEADD(d, 1, @StartDate)

EXEC Common.SetHolidays

END

GO

CREATE procedure [Common].[SetCurrentDay]
AS

UPDATE Common.DimDate
SET IsCurrentDay = ‘N’
WHERE IsCurrentDay = ‘Y’

UPDATE Common.DimDate
SET IsCurrentDay = ‘Y’
WHERE DateKey = ((Year(GETDATE())*10000)+(Month(GETDATE())*100)+(Day(GETDATE())))

GO

SSIS Configurations

SSIS has a really versatile set of configuration options that make it (almost) easy to promote packages from dev to production. For anyone in the Denver CO area I’ll also be doing a presentation on this on Saturday at the SQL Saturday there. If you haven’t registered yet, please do so! It really is a great way to network with your peers on a local and national basis. To register, go here:

http://sqlsaturday.com/169/eventhome.aspx

 

There are 5 different ways we can promote our code through our different environments. They are, in no particular order:

  • registry key
  • xml configuration file
  • sql config table
  • parent child connection
  • environment variable

Over the next couple of weeks, I’ll outline each of these in detail. Today we’ll cover XML Configuration file

XML configuration files can be incredibly powerful and your worst enemy at the same time. We can use XML configuration files to configure multiple properties from file that exists on the server, or we can use it as a master config file to point the rest of our configuration options in our packages to a SQL table to resolve the rest of our connections. To access Package Configurations in BIDS 2008, go to the menu bar at the top of the page and select “package configurations” from the list.

Check the box in the package configurations pane that appears and click “Add”

A screen similar to the below will appear:

 

Select “XML Configuration file” from the dropdown and then select the Browse button that is active. Select a location and give your folder a name.

 

Click next.
Check the box next to the property you wish to change.

 

Click next again, give it a name that makes sense so you can reference it. As a standard, I usually give it the same name as the connection (or whatever it is) that I’m configuring. Then it’s easy to reference if you want it again later.

 

In a future post, I’ll go over the additional ones. 🙂

 

~Schmidty

 

 

 

Enhancing the Data Dictionary

So if you’ve read my previous post (and hopefully used it :)), you’ll have a nice data dictionary you can use to query to provide to your business and other IT members to use when they have questions. Querying system DMV’s is great, but unfortunately it doesn’t have the capability to track some additional features that aren’t contained by default. (Source table, source column, description, etc.) Our answer for this lies in extended properties. For anyone who hasn’t used them before, extended properties are incredibly powerful self-created metadata. If you just want to add one extended property to a column,  it’s pretty easy. Just call the sys.sp_addextendedproperty proc with the following parameters:

EXEC sys.sp_addextendedproperty
@name= ‘Name’,
@value = ‘text of what Im adding’,
@level0type=N’SCHEMA’,
@level0name = ‘SchemaName’,
@level1type=N’TABLE’,
@level1name=’TableName’,
@level2type=N’COLUMN’,
@level2name=’ColumnName’
This works great if you want to add just 1 property. However, how many of us actually get the luxury of only needing to track 1 extended property? I really wish Microsoft would make it easier to manager this across multiple extended properties (like a sp_addmultipleextendedproperties proc or something). But they don’t, so here we are. 🙂 In my shop, our requirements are to track 6 different pieces of data for a column. They are: source table, source column, column description, null value, sample values, and any associated business logic. The plan is to eventually expose the business logic as a master data management layer that they can use to custom set the rules as SSIS is pulling it in. So SSIS would pull that column from the view for the logic. With that many values, executing an addextendedproperties proc for each column 6 different times is, I feel, unrealistic, for a developer. It would be nice if we just had a row input like we do in Excel or a SQL table that we could use. 🙂 Which is where the table “DataDictionaryInserts” comes into play. Create a table (this is using our 6 different values) with the below script:

CREATE TABLE [dbo].[DataDictionaryInserts](
[RecordNumber] [int] IDENTITY(1,1) NOT NULL,
[TableName] [sysname] NOT NULL,
[ColumnName] [sysname] NULL,
[SourceTable] [sql_variant] NULL,
[SourceColumn] [sql_variant] NULL,
[ColumnDescription] [sql_variant] NULL,
[NullValue] [sql_variant] NULL,
[SampleValues] [sql_variant] NULL,
[BusinessLogic] [sql_variant] NULL
)

You’ll see that each of the 6 different values we want to track is normalized out as it’s own separate column. The only values we need to pass for addextenedproperties to be able to read it is table and column name (schema is looked up in a DMV and passed into the proc as it runs).  This makes as little as is needed to actually be typed inserted, which makes us devs happy. 🙂
In our proc, we’re going to pivot these different values out to a denormalized approach and then use a cursor to loop through the result set and keep calling the proc. Put this on a nightly job, and you can see the advantages. It automatically runs at night and all of your inserts go. Normally I would advise against a cursor, but in this case our row sets are almost always really small so the performance impact is almost nonexistent. The longest I’ve ever seen it take to run is 2 seconds. We’re going to do 2 steps in our proc: the first will pivot out our inserts table, and the second will actually do the insert. The syntax for the pivot would be:

SELECT SchemaName,
TableName,
ColumnName,
Type,
Value
INTO #DataDictionaryValues
FROM
(SELECT s.Name AS ‘SchemaName’, d.TableName, d.ColumnName, d.SourceTable, d.SourceColumn, d.ColumnDescription, d.NullValue, d.SampleValues, d.BusinessLogic
FROM admin.DataDictionaryInserts d
JOIN sys.tables t ON
d.TableName=t.name
JOIN sys.schemas s ON
t.schema_id=s.schema_id
) p
UNPIVOT
(Value FOR Type IN
(SourceTable, SourceColumn, ColumnDescription, NullValue, SampleValues, BusinessLogic)
)AS unpvt;

 

And the syntax for our inserts would be:

DECLARE @schema VARCHAR(50);
DECLARE @table VARCHAR(50);
DECLARE @Column VARCHAR(50);
DECLARE @name VARCHAR(100);
DECLARE @value SQL_VARIANT

DECLARE c_ins CURSOR LOCAL FOR
SELECT DISTINCT SchemaName, TableName, ColumnName, Type, Value FROM #DataDictionaryValues

OPEN c_ins
FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

WHILE @@FETCH_STATUS = 0

BEGIN
IF EXISTS(SELECT * FROM sys.extended_properties ep
JOIN sys.columns c ON
c.object_id=ep.major_id
AND c.column_id=ep.minor_id
JOIN sys.tables t ON
t.object_id=c.object_id
WHERE c.name = @Column
AND t.name = @table
AND ep.name=@name)
BEGIN
EXEC sys.sp_dropextendedproperty @name = @name,
@level0type = N’SCHEMA’, — varchar(128)
@level0name = @schema, — sysname
@level1type = N’TABLE’, — varchar(128)
@level1name = @table, — sysname
@level2type = N’COLUMN’, — varchar(128)
@level2name = @Column — sysname
END

EXEC sys.sp_addextendedproperty
@name= @name,
@value = @value,
@level0type=N’SCHEMA’,
@level0name = @schema,
@level1type=N’TABLE’,
@level1name=@table,
@level2type=N’COLUMN’,
@level2name=@Column

DELETE FROM admin.DataDictionaryInserts
WHERE TableName = @table
AND ColumnName = @column

FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

END

CLOSE c_ins
DEALLOCATE c_ins

Now, we’ll also want to wrap this in a transaction, and declare a statement to drop our temp table in case it exists. So our complete proc/insert would look like:
CREATE PROCEDURE [dbo].[insert_ColumnProperty]
AS

SET XACT_ABORT ON

IF OBJECT_ID(‘tempdb..#DataDictionaryValues’) IS NOT NULL
BEGIN
DROP TABLE #DataDictionaryValues
END

SELECT SchemaName,
TableName,
ColumnName,
Type,
Value
INTO #DataDictionaryValues
FROM
(SELECT s.Name AS ‘SchemaName’, d.TableName, d.ColumnName, d.SourceTable, d.SourceColumn, d.ColumnDescription, d.NullValue, d.SampleValues, d.BusinessLogic
FROM admin.DataDictionaryInserts d
JOIN sys.tables t ON
d.TableName=t.name
JOIN sys.schemas s ON
t.schema_id=s.schema_id
) p
UNPIVOT
(Value FOR Type IN
(SourceTable, SourceColumn, ColumnDescription, NullValue, SampleValues, BusinessLogic)
)AS unpvt;

BEGIN TRY
DECLARE @schema VARCHAR(50);
DECLARE @table VARCHAR(50);
DECLARE @Column VARCHAR(50);
DECLARE @name VARCHAR(100);
DECLARE @value SQL_VARIANT

DECLARE c_ins CURSOR LOCAL FOR
SELECT DISTINCT SchemaName, TableName, ColumnName, Type, Value FROM #DataDictionaryValues

OPEN c_ins
FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

WHILE @@FETCH_STATUS = 0

BEGIN
IF EXISTS(SELECT * FROM sys.extended_properties ep
JOIN sys.columns c ON
c.object_id=ep.major_id
AND c.column_id=ep.minor_id
JOIN sys.tables t ON
t.object_id=c.object_id
WHERE c.name = @Column
AND t.name = @table
AND ep.name=@name)
BEGIN
EXEC sys.sp_dropextendedproperty @name = @name,
@level0type = N’SCHEMA’, — varchar(128)
@level0name = @schema, — sysname
@level1type = N’TABLE’, — varchar(128)
@level1name = @table, — sysname
@level2type = N’COLUMN’, — varchar(128)
@level2name = @Column — sysname
END

EXEC sys.sp_addextendedproperty
@name= @name,
@value = @value,
@level0type=N’SCHEMA’,
@level0name = @schema,
@level1type=N’TABLE’,
@level1name=@table,
@level2type=N’COLUMN’,
@level2name=@Column

DELETE FROM admin.DataDictionaryInserts
WHERE TableName = @table
AND ColumnName = @column

FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

END

CLOSE c_ins
DEALLOCATE c_ins

END TRY

BEGIN CATCH
DECLARE @ErrMsg NVARCHAR(4000);
DECLARE @ErrSeverity INT;
DECLARE @ErrorState INT;

SELECT
@ErrMsg = ERROR_MESSAGE(),
@ErrSeverity = ERROR_SEVERITY(),
@ErrorState = ERROR_STATE();

RAISERROR(@ErrMsg, @ErrSeverity, @ErrorState);
END CATCH

 

 

Now we just need to add it into our view. Quick alter statement to query extended properties and join it to our other system DMV data points, and whoa-la! The semi-annoying part is we insert our extended properties in a normalized fashion, denormalize them to insert them, and then renormalize them on the other side. However a quick CTE with a pivot statements pivots our properties back out into columns.

ALTER VIEW [admin].[DataDictionary] AS

WITH CTEPivot
AS
(
SELECT
major_id,
minor_id,
[SourceTable] AS ‘SourceTable’,
[SourceColumn] AS ‘SourceColumn’,
[ColumnDescription] AS ‘ColumnDescription’,
[NullValue] AS ‘NullValue’,
[SampleValues] AS ‘SampleValues’,
[BusinessLogic] AS ‘BusinessLogic’

FROM
(SELECT
ep.major_id, ep.minor_id, ep.name,ep.value
FROM sys.extended_properties ep
WHERE minor_id <> 0) AS Query
PIVOT
(
MAX(Query.Value)
FOR
[name]
IN
([SourceTable],
[SourceColumn],
[ColumnDescription],
[NullValue],
[SampleValues],
[BusinessLogic])
)
AS pvt
)

SELECT
s.name AS ‘SchemaName’,
t.name AS ‘TableName’,
c.name AS ‘ColumnName’,
CASE
WHEN c.is_nullable = 0 THEN ‘No’
WHEN c.is_nullable = 1 THEN ‘Yes’
END AS ‘Nullable’,
st.name AS ‘DataType’,
c.max_length AS ‘Length’,
c.precision AS ‘Precision’,
c.scale AS ‘Scale’,
CASE
WHEN c.is_identity = 0 THEN ‘N’
WHEN c.Is_identity = 1 THEN ‘Y’
END AS ‘IdentityColumn’,
cp.SourceTable,
cp.SourceColumn,
cp.ColumnDescription,
cp.NullValue,
cp.SampleValues,
cp.BusinessLogic
FROM [sys].[tables] t WITH(NOLOCK)
LEFT JOIN sys.columns c WITH(NOLOCK) ON
t.object_id=c.object_id
LEFT JOIN sys.schemas s WITH(NOLOCK) ON
t.schema_id=s.schema_id
LEFT JOIN CTEPivot cp WITH(NOLOCK) ON
c.object_id=cp.major_id
AND c.column_id=cp.minor_id
LEFT JOIN [sys].[systypes] st WITH(NOLOCK) ON
c.system_type_id=st.xtype
WHERE s.NAME NOT LIKE ‘admin’

GO

 

 

 

 

Sorry for the long post as well.

 

Hope it helps you in your travels. 🙂

 

 

Starting a data dictionary

One of the most important things I’ve encountered as a business intelligence developer is having a data dictionary. Without the business fully understanding the fields in their data warehouse, they will not be as inclined to use this magnificent new reporting tool they have. A challenge we have as developers is that business users love excel, but maintaining something like a data dictionary in excel is cumbersome and extremely….painful. That’s where system DMV’s come into play and can really help us. 🙂

To make this easier, I’m going to do this as a 2 part post – part 1 is going to be just pulling the data out of the system we need, and part 2 will be utilizing extended properties and some custom procs to really enhance our dictionary and make it pop. So with that being said:

System DMV’s:

The data we need to access exists on the object level, so we need to look at the objects, types, and indexes section of the system views. Specifically, we’re looking for data around columns, which lives in sys.columns. We’ll also need sys.schemas, sys.tables and sys.systypes. sys.columns and sys.tables join through an equijoin on object_id and sys.columns and sys.systypes join on a non-equijoin between system_type_id and xtype. Putting this together quickly we can generate a simple quick query which gives us all of the columns and tables and what schema they’re in.

So our query shapes up as:

SELECT
s.name AS ‘SchemaName’,
t.name AS ‘TableName’,
c.name AS ‘ColumnName’,
CASE
WHEN c.is_nullable = 0 THEN ‘No’
WHEN c.is_nullable = 1 THEN ‘Yes’
END AS ‘Nullable’,
st.name AS ‘DataType’,
c.max_length AS ‘Length’,
c.precision AS ‘Precision’,
c.scale AS ‘Scale’,
CASE
WHEN c.is_identity = 0 THEN ‘N’
WHEN c.Is_identity = 1 THEN ‘Y’
END AS ‘IdentityColumn’
FROM [sys].[tables] t
LEFT JOIN sys.columns c  ON
t.object_id=c.object_id
LEFT JOIN sys.schemas s  ON
t.schema_id=s.schema_id
LEFT JOIN [sys].[systypes] st  ON
c.system_type_id=st.xtype

Now we have a really nice query that tells us every column, what table they’re in, what schema, and then whether it’s nullable, the data type, precision, scale, and if it’s an identity column.

In part 2 I’ll post leveraging this query to use extended properties and then eventually, creating an SSRS report to provide out.