Enhancing the Data Dictionary

So if you’ve read my previous post (and hopefully used it :)), you’ll have a nice data dictionary you can use to query to provide to your business and other IT members to use when they have questions. Querying system DMV’s is great, but unfortunately it doesn’t have the capability to track some additional features that aren’t contained by default. (Source table, source column, description, etc.) Our answer for this lies in extended properties. For anyone who hasn’t used them before, extended properties are incredibly powerful self-created metadata. If you just want to add one extended property to a column,  it’s pretty easy. Just call the sys.sp_addextendedproperty proc with the following parameters:

EXEC sys.sp_addextendedproperty
@name= ‘Name’,
@value = ‘text of what Im adding’,
@level0type=N’SCHEMA’,
@level0name = ‘SchemaName’,
@level1type=N’TABLE’,
@level1name=’TableName’,
@level2type=N’COLUMN’,
@level2name=’ColumnName’
This works great if you want to add just 1 property. However, how many of us actually get the luxury of only needing to track 1 extended property? I really wish Microsoft would make it easier to manager this across multiple extended properties (like a sp_addmultipleextendedproperties proc or something). But they don’t, so here we are. 🙂 In my shop, our requirements are to track 6 different pieces of data for a column. They are: source table, source column, column description, null value, sample values, and any associated business logic. The plan is to eventually expose the business logic as a master data management layer that they can use to custom set the rules as SSIS is pulling it in. So SSIS would pull that column from the view for the logic. With that many values, executing an addextendedproperties proc for each column 6 different times is, I feel, unrealistic, for a developer. It would be nice if we just had a row input like we do in Excel or a SQL table that we could use. 🙂 Which is where the table “DataDictionaryInserts” comes into play. Create a table (this is using our 6 different values) with the below script:

CREATE TABLE [dbo].[DataDictionaryInserts](
[RecordNumber] [int] IDENTITY(1,1) NOT NULL,
[TableName] [sysname] NOT NULL,
[ColumnName] [sysname] NULL,
[SourceTable] [sql_variant] NULL,
[SourceColumn] [sql_variant] NULL,
[ColumnDescription] [sql_variant] NULL,
[NullValue] [sql_variant] NULL,
[SampleValues] [sql_variant] NULL,
[BusinessLogic] [sql_variant] NULL
)

You’ll see that each of the 6 different values we want to track is normalized out as it’s own separate column. The only values we need to pass for addextenedproperties to be able to read it is table and column name (schema is looked up in a DMV and passed into the proc as it runs).  This makes as little as is needed to actually be typed inserted, which makes us devs happy. 🙂
In our proc, we’re going to pivot these different values out to a denormalized approach and then use a cursor to loop through the result set and keep calling the proc. Put this on a nightly job, and you can see the advantages. It automatically runs at night and all of your inserts go. Normally I would advise against a cursor, but in this case our row sets are almost always really small so the performance impact is almost nonexistent. The longest I’ve ever seen it take to run is 2 seconds. We’re going to do 2 steps in our proc: the first will pivot out our inserts table, and the second will actually do the insert. The syntax for the pivot would be:

SELECT SchemaName,
TableName,
ColumnName,
Type,
Value
INTO #DataDictionaryValues
FROM
(SELECT s.Name AS ‘SchemaName’, d.TableName, d.ColumnName, d.SourceTable, d.SourceColumn, d.ColumnDescription, d.NullValue, d.SampleValues, d.BusinessLogic
FROM admin.DataDictionaryInserts d
JOIN sys.tables t ON
d.TableName=t.name
JOIN sys.schemas s ON
t.schema_id=s.schema_id
) p
UNPIVOT
(Value FOR Type IN
(SourceTable, SourceColumn, ColumnDescription, NullValue, SampleValues, BusinessLogic)
)AS unpvt;

 

And the syntax for our inserts would be:

DECLARE @schema VARCHAR(50);
DECLARE @table VARCHAR(50);
DECLARE @Column VARCHAR(50);
DECLARE @name VARCHAR(100);
DECLARE @value SQL_VARIANT

DECLARE c_ins CURSOR LOCAL FOR
SELECT DISTINCT SchemaName, TableName, ColumnName, Type, Value FROM #DataDictionaryValues

OPEN c_ins
FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

WHILE @@FETCH_STATUS = 0

BEGIN
IF EXISTS(SELECT * FROM sys.extended_properties ep
JOIN sys.columns c ON
c.object_id=ep.major_id
AND c.column_id=ep.minor_id
JOIN sys.tables t ON
t.object_id=c.object_id
WHERE c.name = @Column
AND t.name = @table
AND ep.name=@name)
BEGIN
EXEC sys.sp_dropextendedproperty @name = @name,
@level0type = N’SCHEMA’, — varchar(128)
@level0name = @schema, — sysname
@level1type = N’TABLE’, — varchar(128)
@level1name = @table, — sysname
@level2type = N’COLUMN’, — varchar(128)
@level2name = @Column — sysname
END

EXEC sys.sp_addextendedproperty
@name= @name,
@value = @value,
@level0type=N’SCHEMA’,
@level0name = @schema,
@level1type=N’TABLE’,
@level1name=@table,
@level2type=N’COLUMN’,
@level2name=@Column

DELETE FROM admin.DataDictionaryInserts
WHERE TableName = @table
AND ColumnName = @column

FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

END

CLOSE c_ins
DEALLOCATE c_ins

Now, we’ll also want to wrap this in a transaction, and declare a statement to drop our temp table in case it exists. So our complete proc/insert would look like:
CREATE PROCEDURE [dbo].[insert_ColumnProperty]
AS

SET XACT_ABORT ON

IF OBJECT_ID(‘tempdb..#DataDictionaryValues’) IS NOT NULL
BEGIN
DROP TABLE #DataDictionaryValues
END

SELECT SchemaName,
TableName,
ColumnName,
Type,
Value
INTO #DataDictionaryValues
FROM
(SELECT s.Name AS ‘SchemaName’, d.TableName, d.ColumnName, d.SourceTable, d.SourceColumn, d.ColumnDescription, d.NullValue, d.SampleValues, d.BusinessLogic
FROM admin.DataDictionaryInserts d
JOIN sys.tables t ON
d.TableName=t.name
JOIN sys.schemas s ON
t.schema_id=s.schema_id
) p
UNPIVOT
(Value FOR Type IN
(SourceTable, SourceColumn, ColumnDescription, NullValue, SampleValues, BusinessLogic)
)AS unpvt;

BEGIN TRY
DECLARE @schema VARCHAR(50);
DECLARE @table VARCHAR(50);
DECLARE @Column VARCHAR(50);
DECLARE @name VARCHAR(100);
DECLARE @value SQL_VARIANT

DECLARE c_ins CURSOR LOCAL FOR
SELECT DISTINCT SchemaName, TableName, ColumnName, Type, Value FROM #DataDictionaryValues

OPEN c_ins
FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

WHILE @@FETCH_STATUS = 0

BEGIN
IF EXISTS(SELECT * FROM sys.extended_properties ep
JOIN sys.columns c ON
c.object_id=ep.major_id
AND c.column_id=ep.minor_id
JOIN sys.tables t ON
t.object_id=c.object_id
WHERE c.name = @Column
AND t.name = @table
AND ep.name=@name)
BEGIN
EXEC sys.sp_dropextendedproperty @name = @name,
@level0type = N’SCHEMA’, — varchar(128)
@level0name = @schema, — sysname
@level1type = N’TABLE’, — varchar(128)
@level1name = @table, — sysname
@level2type = N’COLUMN’, — varchar(128)
@level2name = @Column — sysname
END

EXEC sys.sp_addextendedproperty
@name= @name,
@value = @value,
@level0type=N’SCHEMA’,
@level0name = @schema,
@level1type=N’TABLE’,
@level1name=@table,
@level2type=N’COLUMN’,
@level2name=@Column

DELETE FROM admin.DataDictionaryInserts
WHERE TableName = @table
AND ColumnName = @column

FETCH NEXT FROM c_ins INTO @schema, @table, @Column, @name, @value

END

CLOSE c_ins
DEALLOCATE c_ins

END TRY

BEGIN CATCH
DECLARE @ErrMsg NVARCHAR(4000);
DECLARE @ErrSeverity INT;
DECLARE @ErrorState INT;

SELECT
@ErrMsg = ERROR_MESSAGE(),
@ErrSeverity = ERROR_SEVERITY(),
@ErrorState = ERROR_STATE();

RAISERROR(@ErrMsg, @ErrSeverity, @ErrorState);
END CATCH

 

 

Now we just need to add it into our view. Quick alter statement to query extended properties and join it to our other system DMV data points, and whoa-la! The semi-annoying part is we insert our extended properties in a normalized fashion, denormalize them to insert them, and then renormalize them on the other side. However a quick CTE with a pivot statements pivots our properties back out into columns.

ALTER VIEW [admin].[DataDictionary] AS

WITH CTEPivot
AS
(
SELECT
major_id,
minor_id,
[SourceTable] AS ‘SourceTable’,
[SourceColumn] AS ‘SourceColumn’,
[ColumnDescription] AS ‘ColumnDescription’,
[NullValue] AS ‘NullValue’,
[SampleValues] AS ‘SampleValues’,
[BusinessLogic] AS ‘BusinessLogic’

FROM
(SELECT
ep.major_id, ep.minor_id, ep.name,ep.value
FROM sys.extended_properties ep
WHERE minor_id <> 0) AS Query
PIVOT
(
MAX(Query.Value)
FOR
[name]
IN
([SourceTable],
[SourceColumn],
[ColumnDescription],
[NullValue],
[SampleValues],
[BusinessLogic])
)
AS pvt
)

SELECT
s.name AS ‘SchemaName’,
t.name AS ‘TableName’,
c.name AS ‘ColumnName’,
CASE
WHEN c.is_nullable = 0 THEN ‘No’
WHEN c.is_nullable = 1 THEN ‘Yes’
END AS ‘Nullable’,
st.name AS ‘DataType’,
c.max_length AS ‘Length’,
c.precision AS ‘Precision’,
c.scale AS ‘Scale’,
CASE
WHEN c.is_identity = 0 THEN ‘N’
WHEN c.Is_identity = 1 THEN ‘Y’
END AS ‘IdentityColumn’,
cp.SourceTable,
cp.SourceColumn,
cp.ColumnDescription,
cp.NullValue,
cp.SampleValues,
cp.BusinessLogic
FROM [sys].[tables] t WITH(NOLOCK)
LEFT JOIN sys.columns c WITH(NOLOCK) ON
t.object_id=c.object_id
LEFT JOIN sys.schemas s WITH(NOLOCK) ON
t.schema_id=s.schema_id
LEFT JOIN CTEPivot cp WITH(NOLOCK) ON
c.object_id=cp.major_id
AND c.column_id=cp.minor_id
LEFT JOIN [sys].[systypes] st WITH(NOLOCK) ON
c.system_type_id=st.xtype
WHERE s.NAME NOT LIKE ‘admin’

GO

 

 

 

 

Sorry for the long post as well.

 

Hope it helps you in your travels. 🙂

 

 

Starting a data dictionary

One of the most important things I’ve encountered as a business intelligence developer is having a data dictionary. Without the business fully understanding the fields in their data warehouse, they will not be as inclined to use this magnificent new reporting tool they have. A challenge we have as developers is that business users love excel, but maintaining something like a data dictionary in excel is cumbersome and extremely….painful. That’s where system DMV’s come into play and can really help us. 🙂

To make this easier, I’m going to do this as a 2 part post – part 1 is going to be just pulling the data out of the system we need, and part 2 will be utilizing extended properties and some custom procs to really enhance our dictionary and make it pop. So with that being said:

System DMV’s:

The data we need to access exists on the object level, so we need to look at the objects, types, and indexes section of the system views. Specifically, we’re looking for data around columns, which lives in sys.columns. We’ll also need sys.schemas, sys.tables and sys.systypes. sys.columns and sys.tables join through an equijoin on object_id and sys.columns and sys.systypes join on a non-equijoin between system_type_id and xtype. Putting this together quickly we can generate a simple quick query which gives us all of the columns and tables and what schema they’re in.

So our query shapes up as:

SELECT
s.name AS ‘SchemaName’,
t.name AS ‘TableName’,
c.name AS ‘ColumnName’,
CASE
WHEN c.is_nullable = 0 THEN ‘No’
WHEN c.is_nullable = 1 THEN ‘Yes’
END AS ‘Nullable’,
st.name AS ‘DataType’,
c.max_length AS ‘Length’,
c.precision AS ‘Precision’,
c.scale AS ‘Scale’,
CASE
WHEN c.is_identity = 0 THEN ‘N’
WHEN c.Is_identity = 1 THEN ‘Y’
END AS ‘IdentityColumn’
FROM [sys].[tables] t
LEFT JOIN sys.columns c  ON
t.object_id=c.object_id
LEFT JOIN sys.schemas s  ON
t.schema_id=s.schema_id
LEFT JOIN [sys].[systypes] st  ON
c.system_type_id=st.xtype

Now we have a really nice query that tells us every column, what table they’re in, what schema, and then whether it’s nullable, the data type, precision, scale, and if it’s an identity column.

In part 2 I’ll post leveraging this query to use extended properties and then eventually, creating an SSRS report to provide out.

 

Truncating a table with foreign keys

So  your in development, and you have a need to truncate a table.  You write your all-too-familiar “TRUNCATE TABLE MySchema.MyTable” But wait, you get this error message when you try and execute

Msg 4712, Level 16, State 1, Line 1
Cannot truncate table ‘MySchema.MyTable’ because it is being referenced by a FOREIGN KEY constraint.

What? Why? This error is telling you that your table is being referenced by another table in the database, and cannot be truncated. But what if you have already truncated the other table and you really do want to empty it?

To get around this, run a DELETE FROM command and do not put in a where clause. So your command would be:

DELETE FROM Myschema.MyTable

Now you have an empty table, but your identity column, if you have one, is going to be off. To correct it, run a dbcc command to reset it:

DBCC CHECKIDENT (‘MyTable’, RESEED, 0)

Where MyTable is the name of your table. Please note, if your table is not in the dbo schema, you will have to specify the schema name and then the table name. Like (MySchema.MyTable). And 0 is the number you want to reseed to -1 (meaning that, in this case, the next row inserted would have an identity value of 1)

Hope it helps you in your travels.

Using the netonly switch

The company I currently work for keeps all of their environments in separate domains. Not an uncommon scenario, so OK, we can work with it. However, the challenge has always been how to have a shared database development location that everyone can use without remoting over to the machine. The problem with that is handling multiple people. It works fine until you reach a predetermined number of developers (2). Since the development machines are in a different domain, simply right-clicking, selecting runas, and passing new credentials doesn’t work. The answer, as with most things, lies in a simple change. It’s called netonly, and it’s extremely handy.

To use it, open up notepad, type in the code, and then save it as a batch file.  Then you can authenticate across! I use this regularly to connect across the domain to our servers in our development domain while staying on my local machine. DOMAIN\username is the domain\username that you want to pass in.

runas /netonly /user:DOMAIN\username  “Program File Path”

For example, for SQL Server Management Studio 2008 R2 the file path would be below (on 32 bit windows xp, don’t ask :)):

runas /netonly /user:DOMAIN\username “C:\Program Files\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\ssms.exe”

NOW we’re getting somewhere. Next point is what if you want to pass the DOMAIN\username in as a variable(so you can pass this awesome new tool you found around to everyone in your shop)?

Again, the answer is just a quick change to our code.

@echo off

set /p user=”Username: ”

runas /netonly /user:%user% “C:\Program Files\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\Ssms.exe”

Now we can take this and use to authenticate quickly and easily across domains using the credentials we want, even when crossing to a different domain.

 

Hope it helps you in your travels.

 

One more quick note: This is my first blog post, so feedback is appreciated. 🙂