SSAS: Remote Partitions

UPDATE 5/29/2014: I noticed today on MSDN that Microsoft has posted that remote partitions will be deprecated in a future release of SQL Server. So while this is a great tip for performance tuning now, know that in the future Microsoft is planning on discontinuing this functionality (although personally I would disagree with that decision, unless we’re getting something better for performance tuning! 🙂 ). For more information, see here.

 

Last week I blogged about setting the partition slice in SSAS and how it can help with performance (see here). Sometimes, for scalability reasons, it may be helpful to store partitions on a different server or instance where you can take advantage of additional CPU and memory in order to gain faster performance. This functionality does exist, in a feature referred to as remote partitions. Remote partitions reside on a server different from the main cube, but for simpler administration they are still managed on the instance where the parent cube is defined. One key caveat to this awesome functionality is that both servers must be running the same version and service pack level (in order to ensure compatibility). To start, you will need at least 2 instances of SSAS in multidimensional mode. Remote partitions are much easier to set up than you might think, and if you’ve already created partitions, you’ve already seen most of the configuration screens.

First, let’s start out with why you would want to create remote partitions. Remote partitions allow the remote instance (the secondary database) to take on most of the memory and CPU utilization, leaving the master (primary) database free to use its resources elsewhere. As per BOL:

When remote partitions are included in a measure group, the memory and CPU utilization of the cube is distributed across all the partitions in the measure group. For example, when a remote partition is processed, either alone or as part of parent cube processing, most of the memory and CPU utilization for that partition occurs on the remote instance of Analysis Services.

Additionally, this allows us to scale out Analysis Services across multiple instances, which can be represented with the diagram below:

[Diagram: a master SSAS instance whose cube’s partitions are distributed across secondary SSAS instances]

Since “the memory and CPU utilization of the cube is distributed across all the partitions in the measure group,” we can have multiple AS servers working in parallel to answer a query as needed. Quite handy! For example, if I query the Adventure Works cube for Internet Gross Profit by country by year, it takes approximately 96 milliseconds on my local SSD:

SELECT NON EMPTY Hierarchize({DrilldownLevel({[Date].[Calendar].[All Periods]},,,INCLUDE_CALC_MEMBERS)})
ON COLUMNS ,
NON EMPTY Hierarchize({DrilldownLevel({DrilldownLevel({[Customer].[Customer Geography].[All Customers]},,,INCLUDE_CALC_MEMBERS)},
[Customer].[Customer Geography].[Country],INCLUDE_CALC_MEMBERS)})
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME,[Customer].[Customer Geography].[State-Province].[Country] ON ROWS
FROM [Adventure Works]
WHERE ([Measures].[Internet Gross Profit])
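As an aside: when comparing timings like these, you’ll want to clear the Storage Engine cache between runs so that both measurements start cold. The standard XMLA ClearCache command does exactly that; in the sketch below, the DatabaseID assumes the Adventure Works sample database, so substitute your own:

<!-- Clears the caches for one database so the next query runs cold -->
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
  </Object>
</ClearCache>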

Moving the 2005, 2006, and 2007 partitions to a remote instance, this same query runs in about 46 milliseconds from a cold cache. So we gained around a 50% improvement! Nice! So how do we configure these remote partitions? For the remainder of this post, I’m going to walk through exactly that. Referring to the diagram above, before we can store anything on a secondary database, we first need to create it. Open up SSDT and create a new Analysis Services project. Give it a name, and create a data source pointing to your master Analysis Services database; you are going to want to use the native OLE DB provider for Analysis Services. Deploy the project to the secondary instance that you’re going to be using for your remote partition storage. On my laptop, I have an instance called .\REMOTE, which is where I’ll be deploying this to:

[Screenshots: the project’s deployment properties, with the target server set to the .\REMOTE instance]

Now there are 2 server-level properties we need to adjust, on both the master and the secondary server. In SSMS, right-click on the server name and go to Properties. On the General page, you are looking for the Feature \ LinkFromOtherInstanceEnabled and Feature \ LinkToOtherInstanceEnabled properties. By default these are set to false; we want to change them to true. Once you finish the first server, make sure you change them on the second server as well.

[Screenshot: the Analysis Services server properties dialog, showing the LinkFromOtherInstanceEnabled and LinkToOtherInstanceEnabled settings]
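If you would rather script this change than click through the dialog, SSMS can generate an XMLA Alter against the server object. The sketch below shows roughly what that looks like; the server ID and name are placeholders for your own instance, and remember it needs to be run against both servers:

<!-- Enables linked (remote) objects in both directions on this instance.
     Run the equivalent against the master AND the secondary server. -->
<Alter AllowCreate="true" ObjectExpansion="ObjectProperties"
       xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object />
  <ObjectDefinition>
    <Server>
      <ID>YOURSERVER</ID>
      <Name>YOURSERVER</Name>
      <ServerProperties>
        <ServerProperty>
          <Name>Feature\LinkToOtherInstanceEnabled</Name>
          <Value>true</Value>
        </ServerProperty>
        <ServerProperty>
          <Name>Feature\LinkFromOtherInstanceEnabled</Name>
          <Value>true</Value>
        </ServerProperty>
      </ServerProperties>
    </Server>
  </ObjectDefinition>
</Alter>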

Now we want to go back to the remote database we just created. Right-click on it, and select SCRIPT DATABASE AS –> ALTER TO –> NEW QUERY WINDOW. After the database’s data source impersonation info tag closes, you want to insert the following line, where the value is the ID of the data source you created earlier (the one pointing back at the master database):

<MasterDataSourceID>YourDataSourceIDHere</MasterDataSourceID>

So, executed against an AdventureWorks database on my local machine, my complete script looks roughly like the sketch below.

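To be clear about what’s assumed here: the remote database is named AdventureWorksRemote and the data source pointing back at the master has an ID of Adventure Works DW. Both IDs are illustrative, your scripted output will contain your own object definitions, and most elements are omitted for brevity:

<Alter ObjectExpansion="ExpandFull"
       xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>AdventureWorksRemote</DatabaseID>
  </Object>
  <ObjectDefinition>
    <Database xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ID>AdventureWorksRemote</ID>
      <Name>AdventureWorksRemote</Name>
      <DataSourceImpersonationInfo>
        <ImpersonationMode>Default</ImpersonationMode>
      </DataSourceImpersonationInfo>
      <!-- The hand-inserted line: the ID of the data source below,
           which points back at the master database -->
      <MasterDataSourceID>Adventure Works DW</MasterDataSourceID>
      <DataSources>
        <DataSource xsi:type="RelationalDataSource">
          <ID>Adventure Works DW</ID>
          <Name>Adventure Works DW</Name>
          <ConnectionString>Provider=MSOLAP;Data Source=localhost;Initial Catalog=Adventure Works DW</ConnectionString>
        </DataSource>
      </DataSources>
    </Database>
  </ObjectDefinition>
</Alter>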

Run it, and you should get your success XML return. Finally, we’re ready to change our partition location in our primary database. Open up SSDT, and open the project for your original solution (with all your dimensions, cubes, and other objects). Open your cube, and go to the Partitions tab. If you have an existing partition that you want to move, click on it, go to Properties, and under the Configurable section select Storage Location and click the ellipsis:

[Screenshot: the partition’s Storage Location property in the Properties pane]

Select “Remote Analysis Services instance”, and then select the database name. The first time you do this you will need to click “New” and create an Analysis Services connection to your remote instance. Then select whether the storage location should be the default or another location. A good rule of thumb is that the more current a partition’s data, the faster the drive it should live on. For example, if I have a 2013 partition and another partition for older data, it’s OK to have my older data partition on a slower SATA drive, but my 2013 partition should be on something significantly quicker, as I’m accessing it more often. Remember that the goal here is to get speed and performance up: we are moving this to a separate instance in order to allow Analysis Services to compute millions of records over a different partition in addition to what the master server is doing. Moving all of this data and then sticking it on the old drive you found in the basement isn’t going to give you the lift you want. Click OK.
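Under the covers, what this dialog does is set the partition’s RemoteDatasourceID property to the ID of the Analysis Services connection you just created. A rough sketch of the resulting partition definition follows; the IDs, the source query, and the connection name are all illustrative, and most elements are omitted:

<Partition xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ID>Internet_Sales_2005</ID>
  <Name>Internet_Sales_2005</Name>
  <Source xsi:type="QueryBinding">
    <DataSourceID>Adventure Works DW</DataSourceID>
    <QueryDefinition>
      SELECT * FROM dbo.FactInternetSales WHERE OrderDateKey &lt;= 20051231
    </QueryDefinition>
  </Source>
  <!-- Points at the Analysis Services connection for the remote instance;
       this is what makes the partition remote -->
  <RemoteDatasourceID>Remote AS Instance</RemoteDatasourceID>
  <StorageMode>Molap</StorageMode>
</Partition>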

If you are creating a new partition instead, select New Partition, and select the measure group you wish to partition. Once you click Next, select whether the partition is query bound or table bound, and click Next again. If you followed along in the last exercise, the next screen will look quite familiar:

[Screenshot: the Partition Wizard’s processing and storage location page]

Click Next, and give your new partition a name. That’s all there is to it! Make sure you do a process full on your master database the first time after you initially configure everything. Administration and management of the new remote partition is all done via the master database. You can scale this out to as many secondary databases as needed, but be careful with over-saturation: I didn’t test it, but I would imagine there is an ROI point, similar to SSIS’, where you reach diminishing returns.
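If you want to script that initial full process rather than run it from the SSMS UI, a standard XMLA Process command against the master database does the job (the DatabaseID is assumed; substitute your own):

<!-- Fully processes the master database; per BOL, remote partitions are
     processed on their remote instance as part of parent cube processing -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>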

Happy tuning!


3 thoughts on “SSAS: Remote Partitions”

  1. I had implemented this with a small data set as a proof of concept on two separate physical servers.
    Could you please tell me how I can be sure that, while querying the master database, the remote server’s resources (i.e. processor and memory) are being used in parallel with the master server’s resources?
    Is there any way to check this before implementing it on huge data?

    1. Hi Bikram,
      Apologies for the late response. I’ve been moving across the country and my responses have been delayed. The only way that I know of to monitor it on the remote instance would be to remote desktop in directly and set up some perfmon counters to log the performance, or you could set up a trace in Profiler. As the master instance controls the secondary node, all of the memory and CPU allocations are determined by the primary server. I will try to do some more research and *hopefully* a blog post on it soon.

      Thanks!
      Chris

  2. Hi Chris,
    Thanks for the response. I traced the Profiler events on both servers. What I found while processing the partition is that it takes more than the usual time before initiating the actual read, execute, and write operations, and it continuously fires Alter XMLA commands. So, in my case it became slower than usual. Is there anything that should be cross-checked in the SSAS configuration?
