SSAS: Remote Partitions

UPDATE 5/29/2014: I noticed today on MSDN that Microsoft has posted that remote partitions will be deprecated in a future release of SQL Server. So while this is a great tip for performance tuning now, know that in the future Microsoft is planning on discontinuing this functionality (although personally I think I would disagree with that decision, unless we're getting something better for performance tuning! :) For more information, see here)


Last week I blogged about setting the partition slice in SSAS and how it can help with performance (see here). Sometimes, for scalability reasons, it may be helpful to store partitions on a different server or instance where you can take advantage of additional CPU and memory in order to gain faster performance. This functionality does exist, in a feature referred to as remote partitions. Remote partitions reside on a server different from the main cube. For simpler administration, remote partitions are still managed on the instance where the parent cube is defined. One key caveat to this awesome functionality is that both servers must be running the same version and service pack level (in order to ensure compatibility). To start, you will need at least 2 instances of SSAS in multidimensional mode. Remote partitions are much easier to set up than you might think, and if you've already created partitions you've already passed the configuration screen.

First, let's start out with why you would want to create remote partitions. Remote partitions allow the remote instance (the secondary database) to perform most of the memory and CPU work, allowing the master (primary) instance to keep its resources free. As per BOL:

When remote partitions are included in a measure group, the memory and CPU utilization of the cube is distributed across all the partitions in the measure group. For example, when a remote partition is processed, either alone or as part of parent cube processing, most of the memory and CPU utilization for that partition occurs on the remote instance of Analysis Services.

Additionally, this allows us to scale out Analysis Services across multiple instances. This can be represented via the below diagram:


Since "the memory and CPU utilization of the cube is distributed across all of the partitions", we can have multiple AS servers working in parallel to answer a query as needed. Quite handy! For example, if I query the Adventure Works cube for Internet Gross Profit Amount by country by year, it takes approximately 96 milliseconds (on my local SSD):

SELECT NON EMPTY Hierarchize({DrilldownLevel({[Date].[Calendar].[All Periods]},,,INCLUDE_CALC_MEMBERS)})
NON EMPTY Hierarchize({DrilldownLevel({DrilldownLevel({[Customer].[Customer Geography].[All Customers]},,,INCLUDE_CALC_MEMBERS)},
[Customer].[Customer Geography].[Country],INCLUDE_CALC_MEMBERS)})
FROM [Adventure Works]
WHERE ([Measures].[Internet Gross Profit])

Moving the 2005, 2006, and 2007 partitions to a remote instance, this same query runs in about 46 milliseconds from a cold cache. So we gained around a 50% improvement! Nice! So how do we configure these remote partitions? For the remainder of this post, I'm going to walk through the setup. Referring to the diagram above, before we can store anything on a secondary database, we first need to create it. Open up SSDT, and create a new Analysis Services project. Give it a name, and create a data source pointing to your master Analysis Services database. You are going to want to use the Native OLE DB driver for Analysis Services. Deploy it to the secondary instance that you're going to be using for your remote partition storage location. On my laptop, I have an instance called .\REMOTE, which is where I'll be deploying this to:


Now there are 2 server-level properties we need to adjust, on both the master and the secondary server. Click on the server name, and go to properties. On the general page, you are looking for the Feature \ LinkFromOtherInstanceEnabled and Feature \ LinkToOtherInstanceEnabled properties. By default these are set to false; we want to change them to true. Once you finish the first server, make sure you change them on the second server as well.


Now we want to go back into our remote database that we just created. Right click on it, and select SCRIPT DATABASE AS –> ALTER TO –> NEW QUERY WINDOW. After the data source impersonation info tag closes, you want to insert the following line:
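The line to insert points the secondary database at the data source for the master instance. A sketch of it (the value shown is a placeholder for the ID of the data source you created against the master instance):

```xml
<MasterDataSourceID>YourMasterDataSourceID</MasterDataSourceID>
```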


So my complete script executing this against an AdventureWorks database I have on my local machine looks like the following:
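As a sketch, the complete Alter command has a shape like the following (the IDs, names, and data source body here are placeholders; keep the ones your own scripting action generated):

```xml
<Alter ObjectExpansion="ExpandFull"
       xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>AdventureWorksRemote</DatabaseID>
  </Object>
  <ObjectDefinition>
    <Database>
      <ID>AdventureWorksRemote</ID>
      <Name>AdventureWorksRemote</Name>
      <DataSourceImpersonationInfo>
        <ImpersonationMode>Default</ImpersonationMode>
      </DataSourceImpersonationInfo>
      <!-- the inserted line, pointing at the master's data source -->
      <MasterDataSourceID>YourMasterDataSourceID</MasterDataSourceID>
      <DataSources>
        <!-- the data source created in SSDT pointing at the master instance -->
      </DataSources>
    </Database>
  </ObjectDefinition>
</Alter>
```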


Run it, and you should get your success XML return. Finally, we're ready to change our partition location in our primary database. Open up SSDT, and open the project for your original solution (with all your dimensions, cubes, and other stuff). Open up the cube, and go to the partitions tab. If you have an existing partition that you want to move, click on it, go to properties, and under the Configurable section select Storage Location and expand the ellipsis:


Select "Remote Analysis Services instance", and then select the database name. The first time you do this you will need to click "New" and create an Analysis Services connection to your remote instance. Then select whether the storage location should be the default or another location. It's generally a good rule of thumb that the more current a partition's data is, the faster the drive it should sit on. For example, if I have a 2013 partition and another partition for older data, it's OK to have my older partition on a slower SATA drive, but my 2013 partition should be on something significantly quicker, as I'm accessing it more often. Remember that the goal here is to get speed and performance up. We are moving this to a separate instance in order to allow Analysis Services to compute millions of records over a different partition in addition to what the master server is doing; moving all of this data and then sticking it on the old drive you found in the basement isn't going to give you the lift you want. Click OK.

If you are creating a new partition, select New partition, and select the Measure group you wish to partition. Once you click Next, select whether the partition is query bound or table bound, and click Next. If you followed along in the last exercise the next screen will look quite familiar:


Click Next, and give your new partition a name. That's all there is to it! Make sure you do a process full on your master database the first time after you initially configure it. Administration and management of the new remote partition is all done via the master database. You can scale this out to as many secondary databases as needed. Be careful with over-saturation, though. I didn't test it, but I would imagine there is a point of diminishing returns, similar to SSIS's.
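The process full can be run from the UI, or via an XMLA command along these lines (the database ID here is a placeholder for your own):

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
  </Object>
  <Type>ProcessFull</Type>
</Process>
```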

Happy tuning!

SSAS: Partition Slicing

In SSAS multidimensional there is a concept within creating partitions for your measure groups called slicing. Slicing can be thought of as the definition Analysis Services uses to find data within partitions at query time. We are all aware of the benefit that creating multiple partitions can have in improving Analysis Services multidimensional performance, but how can partition slices help us further? As per BOL, “Explicitly setting the Slice property can improve query performance by overriding default slices generated for MOLAP and HOLAP partitions”. This means that slicing partitions ourselves could have a profound impact on query performance.

Let’s take a look in more detail. Starting off, we’ll turn SQL Server Profiler on against our SSAS instance. For this post, I’m just going to use the Adventure Works 2012 cube. My partitions currently in my SSAS solution appear like the below for Internet Orders, partitioned by year:



Now let's run an MDX query to show the problem:

SELECT [Measures].[Internet Order Count] ON 0,
    [Date].[Calendar].[Calendar Year].&[2008] ON 1 -- 2008
FROM [Adventure Works]


Running a profiler trace against this clearly shows our problem:


We are looking for data in one specific partition (2008), yet the formula engine thought that it needed to look through every partition to find the answer! In a large cube, the impact of this can be disastrous. Scanning every partition is very inefficient when we know from a logical standpoint that our partition strategy should answer our question.

During query time, SSAS internally reads the partition header files to determine which partitions contain the data it is trying to read. Navigate to a deployed cube on an Analysis Services instance in File Explorer (typically located at C:\Program Files\Microsoft SQL Server\MSAS11.MSSQLSERVER\OLAP\Data); under the database folder and the cube name.cub folder, all of the measure group partitions are located. Inside of these folders, generally a few levels down, lives an info.*.xml file, which contains the relevant information for Analysis Services on what is contained in that partition. Frequently, it seems, the automatic slice detection available in MOLAP storage mode doesn't correctly detect where the data belongs, and ALL partitions are scanned. So how can we fix this issue? Fortunately, there is a property available in SSAS that tells the engine specifically what a partition is sliced by, allowing us greater control over performance by targeting where query scans are sent during execution. Opening up our project in data tools, on the partition screen inside of the cube, I am going to select each partition and set the partition slice property:


Re-processing the project and re-executing the query yields the following results in profiler:



Voila, this is what we are looking for! Now our queries will be much more efficient when being sent from client applications, only hitting relevant partition data.

So we've learned that it is very important to set the slice property when creating partitions. Keep in mind that each slice should correspond to the WHERE clause in that partition's source query.
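For the yearly partitioning scheme above, the slice is just the member whose data the partition holds. As a sketch (the exact attribute hierarchy name depends on your date dimension), the Slice property for the 2008 partition would be:

```mdx
[Date].[Calendar Year].&[2008]
```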

BI Administration: The Business Intelligence Management Data Mart

Firstly, I'd like to thank everyone who attended my session this morning for Pragmatic Works Training on the T's. I always enjoy presenting. Today's topic was the management and administration of business intelligence solutions. The theme of the overall presentation was that although there are many ways to collect the metadata around our business intelligence solution, the data itself still lends itself to a dimensional structure that we can model into various facts and dimensions. Below is a data model that represents performance between SSIS, SSAS, and SSRS. These are key performance metrics surrounding these 3 development tools. As developers/administrators, it is our responsibility to maintain and monitor this information to adhere to SLAs and ensure timely performance. An overall diagram of the architecture would look like the below:




A complete recording of the webcast is available here, and the complete solution and database backup is available here.

Power Pivot: Adding tooltips to your Power View report

In Excel 2013, did you know that you can make descriptions available to users as they’re browsing the model in Power View? I recently discovered this, and thought it was a really nice feature that you can use to add some extensibility to your Power View reports. In this blog post, I’d like to walk through and show how easy it is to get this set up for use in your environment.


Let’s start with a blank Excel workbook, and open up the Power Pivot window:





Let’s import some data into Power Pivot. For this example, I’m just going to import in the Internet Sales table from Adventure Works:




Now that it is loaded, find the UnitPrice column and right click on it. Look for the Description field.



Select it, and the Description dialog window will appear. Type in "The price per unit that was charged." and click OK. That's it! Now let's see it in action. Switch back over to the Excel view, and insert a Power View sheet.




Expand the Internet Sales table, and hover over the Unit Price column that you added the description to.




Your description now shows as a tooltip for report users as they navigate, giving them additional information. What's even better is that this works exactly the same way in a tabular database when you're building it. To demonstrate, open up the Adventure Works 2012 tabular database, and find the Internet Sales table again. Find the Unit Price column, right click, and go to Description.




In description, type in “The price per unit that was charged.” and click OK. Then build and deploy your solution. After it finishes deploying, open up Excel and connect to your tabular database.




After you click Finish, select Power View report under the Import Data dialog box and click Ok.




After your Power View sheet loads, open up the table list and find the Unit Price column again. Notice that your description shows as a tooltip this way too.





That’s all for now! Short blog post, but very useful!

SSAS Clearing the cache

In SSAS, when you're trying to do performance tests, it's important to know what performance is from a cold cache. You definitely don't want a warm cache skewing the measurements you're trying to collect! An important item of consideration is that SSAS is NOT like the database engine side of SQL Server, where you run one command and you're done. Since SSAS has 3 distinct parts (the formula engine, the storage engine, and the file system), you want to ensure that you clear the cache in all 3 places in order to get a clean run.

Clearing the formula engine and the storage engine is pretty simple; just open up an XMLA query window and run the following command:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ClearCache>
    <Object>
      <DatabaseID>Your Database Name Here</DatabaseID>
    </Object>
  </ClearCache>
</Batch>

In addition to this, there is also the small matter of the actual physical files that are created for the cache. Just running the above is going to clear out half of your cache, but remember that since Analysis Services creates the files physically, just clearing out the logical layer isn't going to get us the right results. Luckily, we have a CodePlex project for that! It's the Analysis Services Stored Procedure (ASSP) project, located here. Download it, install the assembly into Analysis Services, and then run your Analysis Services MDX script like normal!

Once you’ve registered the assembly, the actual script is below. Each of these calls needs to be executed separately.

//checks the size of the Windows File System Cache

call ASSP.GetFileSystemCacheBytes()

Running this should yield something that looks like the below:


//clears the size of the Windows File System Cache

call ASSP.ClearFileSystemCache()

And you’ll get a window that looks like the below:


Sorry, nothing exciting here. :) Just a note that the command completed successfully.

Now let’s run our first command again:

//checks the size of the Windows File System Cache

call ASSP.GetFileSystemCacheBytes()

Our new results window looks like:


Notice how much of an improvement we gained. This was just using my local multidimensional Adventure Works project. An important item to note is that the ASSP CodePlex project is not officially supported by Microsoft, as it's a CodePlex project. The only other alternative that I know of for clearing the file system cache is the fsutil.exe utility, which is supported by Microsoft. However, this utility requires that 1) Analysis Services is on its own separate drive and mount point and 2) you run it outside of regular business hours, as the utility causes disruption to anyone currently using the cube. In my opinion, this assembly with the right command does the exact same thing with much less hassle and no un-mounting required.

So in conclusion, when clearing the cache in Analysis Services, keep in mind that you need to clear it everywhere, not just the formula and storage engine.

Dynamic Security in Tabular database

Tabular databases, like multidimensional, have the ability to create roles for additional security against our databases for individual business users. What's more, it's even easier to set up. Like other things in tabular, gone are the 9 or 10 different tabs, basic/advanced toggling, etc. Roles in tabular use two DAX functions to establish and manage security: USERNAME and LOOKUPVALUE. LOOKUPVALUE is quickly becoming one of my favorite DAX functions, but that's a different story. :) Something important to consider: USERNAME returns the current user's login in DOMAIN\username format, which is exactly what we'll store in our security table.

First we're going to go into our database and add a table that we can use to map to what we're securing by. For this example, I'm going to map between the employee dimension and the sales territory dimension. To keep it simple, though, I'm going to list out the employee names, rather than having an EmployeeKey/SalesTerritoryKey bridge table. Start by creating a table called UserSecurity with the employee key, name, and user logon fields. My script is below:

CREATE TABLE dbo.UserSecurity
    ([EmployeeKey] [int] IDENTITY(1,1) NOT NULL,
    [SalesTerritoryKey] [int] NOT NULL,
    [FirstName] [varchar](50) NULL,
    [LastName] [varchar](50) NULL,
    [UserName] [varchar](50) NULL)


Then let’s input some values into it.

INSERT INTO [dbo].[UserSecurity]
VALUES    (1, 'Administrator', '', 'DATA0\administrator')
INSERT INTO [dbo].[UserSecurity]
VALUES    (2, 'Administrator', '', 'DATA0\administrator')
INSERT INTO [dbo].[UserSecurity]
VALUES    (3, 'Bob', 'Williams', 'DATA0\bob')
INSERT INTO [dbo].[UserSecurity]
VALUES    (4, 'Steve', 'Smith', 'DATA0\steve')


Now that we have our table created, we can do the actual security implementation. Open up the tabular database that you wish to apply the security to, and import your table. We'll select and open our existing connection:



Then select the user security table that we created in Management studio and click Finish.



Once it’s imported, we then want to join it to the appropriate tables so we can relate to our security table. For this example we’re just joining to the Sales Territory. A snapshot of this piece of the model looks like:



Right click on the User Security table and click “Hide from Client Tools”. Now we can create our role. Scroll up to the top and click on Roles: (screenshot below if you need it)



Create a new role, and give it a name. In this case, I'm calling it "Sales Territory Users" and giving it Read access under the permissions. Once you've done that, click on the Members tab, and add the users we have in our table. Below is a screenshot of what mine looks like with the 3 members I have in my table:



Go back to the Row Filters tab, and find the User Security table listed. Under the DAX Filter column, type in the following:
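One common pattern here is to set the row filter on the User Security table itself to deny everything, so that no user can ever browse the mapping rows directly. A minimal sketch:

```dax
=FALSE()
```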


Now find the Sales Territory table, and type in the following formula:

='Sales Territory'[SalesTerritoryKey]=LOOKUPVALUE(UserSecurity[SalesTerritoryKey], UserSecurity[UserName], USERNAME(), UserSecurity[SalesTerritoryKey], 'Sales Territory'[SalesTerritoryKey])

What this formula says is, "for each value in the Sales Territory[SalesTerritoryKey] column, find the rows in UserSecurity where the UserName column matches the current logged-in Windows user name, and only show the territories whose keys match those rows."


My Row Filters section looks like the following:




Click Ok, and it will validate and finish your changes. Then go up to “Analyze in Excel”. Select “Other Windows user” and pick one of the users you have in your list. For this example, I picked Bob:




Click Ok, and then drag over your sales territory in the pivot table that appears. Drag over a measure so you get some data, and WHAMMY! some dynamic security awesome-ness:



Tabular Hierarchies Across Multiple Tables

Recently I was looking at the Adventure Works database and I noticed how Product, ProductCategory, and ProductSubcategory are their own separate tables. It got me thinking about how someone might want to take attributes from each of those tables and create one hierarchy for reporting. In DAX, this can be accomplished by using the RELATED() function. By using the RELATED function in the table with our lowest-level attributes, we can bring in the higher levels and create our hierarchy. This is really useful, especially if our data source isn't necessarily a data warehouse and may be something more transactional, with many tables split all over the place.


The RELATED function accepts one argument, which is the table/column that we want to pull back. Let's start with a simple diagram of our 3 tables:




Since our Product dimension is our lowest level of granularity of these attributes, what we want to do is bring the Product SubCategory and Product Category fields into the product dimension. Change your view to the data view in the product dimension and add a calculated column to pull in the subcategory:


You can see our RELATED function at work. What we're telling the function to do is find the relationship that exists between product and product subcategory, and bring back the product subcategory name. Once you hit Enter, rename the column to Sub Category Name. Next we want to bring in the product category. Add a new calculated column, only this time our RELATED function is going to point one table farther, into the Product Category table. The RELATED function is really useful for this kind of analysis, especially since it can traverse multiple tables. For any old-school Excel pros out there, take that, VLOOKUP! Below is our new column with both Product Sub Category and Product Category added:
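As a sketch, the two calculated columns use formulas along these lines (the table and column names here follow the Adventure Works sample, so adjust them to match your own model):

```dax
=RELATED('Product Subcategory'[Product Subcategory Name])
=RELATED('Product Category'[Product Category Name])
```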



Hide your key column attribute, and then go to diagram view and hide your subcategory and category tables. To hide an entire table, right click on the table itself and select Hide from Client Tools. Create a new hierarchy in the dimension where you added your related columns, and add the category and subcategory to it. The final design looks like the below:




And that’s it! Now we have all of our related attributes in one table that we can create hierarchies off of. This is very useful for multiple snowflakes in a warehouse or transactional databases.

Discretization Buckets in Tabular

A little while ago I blogged about how to create discretization buckets in a multidimensional cube. The nice thing about it is that it was just a few properties that you needed to modify, and it was pretty straightforward. On the flip side of that, I see a lot of companies wanting to take advantage of the new tabular Analysis Services model and the performance you get from it. The good news is that you can do the same thing in tabular, but it's not properties you set. To set it up, first you'll need to identify what is going to create your buckets. Are you going to use a linked table? A SQL view? Or maybe just a hard-coded SQL SELECT statement? For this example, I'm going to assume that the business user wants the ability to change the buckets easily, so I'm going to use a linked table. I'm also using the Adventure Works 2012 tabular demo project. One way to approach this dilemma is that you could just create an IF statement in the table that groups your items for you, but what if a business user wants to change the values dynamically? Re-writing your calculation every time would be difficult, and frustrating to the end user. They want the ability to modify it as business needs change.


First, create your table in Excel. When you're done, save it and give it a name. It should look something like:




You can see from the picture, nothing fancy. Just a quick table, took me longer to write this sentence than create it. :)


Then go into SSDT, and in your project select "Import from Data Source." If you created a view in your source or are pulling your table from another source, this is where you would configure it. Assuming you're using the Excel table, select Excel File from under the text file options:




Click Next, and navigate to the file path from earlier. Make sure to give your imported table a friendly name, as it will just be named "Excel" by default:




Click Next, and then click Finish. It should import your table in a few seconds. After you've imported it, if you view the model in diagram view, it should not be related to anything. This is OK; there are no columns in any of the other tables that would join to it. The calculation we're going to write here in a minute is going to give us what we're looking for. :)




In order to solve this and give us our proper values, we're going to take the values from the banding table and bring them directly into our Internet Sales fact table. Going to the Internet Sales table, we're going to create a calculated column with the following formula:

=CALCULATE(VALUES('Banding Matrix'[Band]), 'Banding Matrix'[High Value] >= EARLIER('Internet Sales'[Unit Price]), 'Banding Matrix'[Low Value] < EARLIER('Internet Sales'[Unit Price]))

What this calculation does is take the value (the name) of the pricing band where the value of the unit price column falls between the low and the high amounts we've specified. But this formula has one problem. Since our key business requirement here was that users can input the band values themselves, what if one of the values in our fact table is missed? What we really need to do is write this in a way that protects the formula against an error. The IFERROR() function can help us here. So a better formula is:


=CALCULATE(IFERROR(VALUES('Banding Matrix'[Band]), "No Band Found"), 'Banding Matrix'[High Value] >= EARLIER('Internet Sales'[Unit Price]), 'Banding Matrix'[Low Value] < EARLIER('Internet Sales'[Unit Price]))

Now we are well protected and business users are happy. :) Before you finish, click on the banding configuration table's tab and select "Hide from Client Tools". Preview the model in Excel, pick any measure that you want out of the fact table, and select your new price banding matrix. The view below shows total sales amount by unit price group (band).




Now if a user wants to change the values for their groupings or just wants to know what they are, they have the control. There is no dependency on you or anyone else in IT to change it for them. Viva La Self Service BI Revolution!

Tabular Databases: The workspace server was not found

Sometimes, due to environment changes or some other reason, you want to change the workspace database that you are developing your tabular model against. I recently had a need to do this as my old tabular database instance was no longer active. When I went to open up the bim file, I received the following error message:




To fix it, click on the .bim file once to select it and then go to properties (F4). The last property on the page is called "Workspace Server" and specifies where the workspace database server is. Change this field to the name of your new server, and the following confirmation screen will appear:




Simply click “Apply Change” and you’re done. Short and simple.

Partitions in Tabular

Continuing the tabular kick that I've been on lately: tabular databases allow partitioning of tables to break down larger fact tables into smaller pieces for the engine to manage. This can be very beneficial for us, especially if we have large groups of data that don't change much from day to day. It is very wasteful to reprocess something if it's never going to change, and it creates a lot of unneeded pressure on the server. We can alter the table to only process current data when needed. There are 2 ways to access the partitioning dialog box. The first is through the menu bar, by going to Table -> Partitions, but an easier way (one less mouse click) is to just click on the Partitions button from the main design screen, located towards the top left corner (I've circled it in this diagram):




After we click it, we get an options box that looks like the below:


At the top we have a table selector that we can use to select which table we want to partition. For this, I'm going to use the classic FactInternetSales table in Adventure Works 2012 DW. Directly underneath that is a list of the current partitions that are associated with that table. We are going to split the one existing partition on this table into two: one for orders before 2006, and another for everything after. To start, we're going to modify the existing partition by changing it from being table-bound (as it is by default) to being query-bound. To do this, on the right-hand side about halfway down you'll see the toggle, as indicated by the icons in red below:



Click on the SQL icon, and you will get the query window. Assuming you are not importing every column from that table into your model (which you shouldn't be anyway), you will have a query that lists out all of the columns, ending with the FROM clause. Here, we're just hard-coding to before a certain date, basing it on the order date, so our WHERE clause would be WHERE OrderDateKey <= 20051231. Give the partition a new name, and that will do it for the first one.


After we've modified the query for the original partition, click on New at the top underneath the partitions list to add a new partition, and click down underneath in the SQL view. Change your less-than sign to a greater-than sign, and then give it a new name. Clicking on either partition allows you to toggle between them and check your work. Your final partitions list should look like the below:
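As a sketch, the two partition queries end up along these lines (the column list is abbreviated here; keep the full list the designer generates for you, and note that pairing <= with > keeps the boundary date in exactly one partition):

```sql
-- Partition 1: orders through the end of 2005
SELECT ProductKey, OrderDateKey, CustomerKey, SalesAmount
FROM dbo.FactInternetSales
WHERE OrderDateKey <= 20051231;

-- Partition 2: everything after
SELECT ProductKey, OrderDateKey, CustomerKey, SalesAmount
FROM dbo.FactInternetSales
WHERE OrderDateKey > 20051231;
```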



Click Ok, do a complete Deploy and Process, and you’re ready to go.