SSAS Tabular: Internal Physical Files

SSAS Tabular stores data mostly in memory. However, it still consists of two main engines to drive and respond to user queries, the formula engine and the storage engine. This, theoretically, is very similar to how queries were answered in multidimensional models. The overall query request follows a pattern that can be visually represented using the below diagram:





Just like in multidimensional, the formula engine takes the request, determines the optimal query plan, and then answers the query or requests the data that it needs from the storage engine. The storage engine runs against the data that the xVelocity engine stores in the cache. However, to me, 1 major question has bugged me about this from the beginning. In the case of a restart, or memory clearing of some way, how does the tabular model store it’s relationships between everything. The data itself may be stored in memory, but the relationships and the structure surely must have some reference files on disk that xVelocity and Tabular can use. Today while working and playing with it tonight, I think I’ve figured it out.


Each tabular model creates a folder in the data drive that we created and configured during installation, typically located at $InstallLocation/Microsoft SQL Server\MSAS11.TABULAR\OLAP\Data. Inside of this folder, each tabular model creates a folder, appended with .db, and an XML file. The XML file contains relevant information about the database, and the folder contains all the “good stuff”. Opening the folder against the Adventure Works tabular database, for example, gives me an image that looks like the below:



Similar to the database level, each table in tabular consists of an XML file, ended with .dim. Each table then also contains a folder. The .dim XML file contains relevant information about everything the table needs and the pointers that exist between the different columns to store the relationships for storage. Inside of the folder, each column consists of a tbl xml file, an IDF file, and potentially an HIDX file or DICTIONARY file, depending on the data type. Everything except for the tbl xml file are for tabular internal use, but we can get some good information about what tabular did during processing for each attribute from the tbl file. Opening up the tbl file, shows us some really good information about that column. We can see things like the min and max data id, and the number of segments:



Min and max segment and data id’s are things that the storage engine uses during query time to identify what data exists in the proper segment and data file. This is quite useful information! I’ll have to remember this for the future and when it comes to performance tuning tabular models.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s