Starting a data dictionary

One of the most important things I’ve encountered as a business intelligence developer is having a data dictionary. Without the business fully understanding the fields in their data warehouse, they will not be as inclined to use this magnificent new reporting tool they have. A challenge we have as developers is that business users love excel, but maintaining something like a data dictionary in excel is cumbersome and extremely….painful. That’s where system DMV’s come into play and can really help us. 🙂

To make this easier, I’m going to do this as a 2 part post – part 1 is going to be just pulling the data out of the system we need, and part 2 will be utilizing extended properties and some custom procs to really enhance our dictionary and make it pop. So with that being said:

System DMV’s:

The data we need to access exists on the object level, so we need to look at the objects, types, and indexes section of the system views. Specifically, we’re looking for data around columns, which lives in sys.columns. We’ll also need sys.schemas, sys.tables and sys.systypes. sys.columns and sys.tables join through an equijoin on object_id and sys.columns and sys.systypes join on a non-equijoin between system_type_id and xtype. Putting this together quickly we can generate a simple quick query which gives us all of the columns and tables and what schema they’re in.

So our query shapes up as:

SELECT
s.name AS ‘SchemaName’,
t.name AS ‘TableName’,
c.name AS ‘ColumnName’,
CASE
WHEN c.is_nullable = 0 THEN ‘No’
WHEN c.is_nullable = 1 THEN ‘Yes’
END AS ‘Nullable’,
st.name AS ‘DataType’,
c.max_length AS ‘Length’,
c.precision AS ‘Precision’,
c.scale AS ‘Scale’,
CASE
WHEN c.is_identity = 0 THEN ‘N’
WHEN c.Is_identity = 1 THEN ‘Y’
END AS ‘IdentityColumn’
FROM [sys].[tables] t
LEFT JOIN sys.columns c  ON
t.object_id=c.object_id
LEFT JOIN sys.schemas s  ON
t.schema_id=s.schema_id
LEFT JOIN [sys].[systypes] st  ON
c.system_type_id=st.xtype

Now we have a really nice query that tells us every column, what table they’re in, what schema, and then whether it’s nullable, the data type, precision, scale, and if it’s an identity column.

In part 2 I’ll post leveraging this query to use extended properties and then eventually, creating an SSRS report to provide out.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s