SSIS Performance Tuning: Max Default Buffer Rows

UPDATE 01/19/2015: As this is by far one of the most popular posts on ssis performance tuning on my blog, you may enjoy the dynamic configuration follow up post here.

When SSIS caches data in memory as it’s going through the ETL process, it uses something called buffers to control how much data in can pick up and transform/modify at once. Buffers are what SSIS uses to cache each set of data as it transforms it through the pipeline. For example, let’s take a look at the below scenario:

image

What’s going on right now in this package? This package is set to the defaults right now, which is 10,000 rows or 10 MB, whichever comes first. We can see that SSIS has picked up 5 buffers of ~10K rows each, and the first buffer of 9,947 rows waiting for me. In this case, it’s because I just have a data viewer on the data flow so I can see what is passing through. As this is a 20 million row fact table, my issue here is that 20,000,000 divided by 10,000 rows per buffer is, roughly, 2,000 buffers. That’s quite a bit. The last task at the end of my package is just a Row Count component. Using 2,000 buffers at 10,000 rows per buffer this package takes about 1 minute and 8 seconds. What I can do is adjust the size of the buffer that SSIS uses to process each set. Changing the value to 873,000 rows per buffer and 100 MB default buffer max size moves my performance to about 50 seconds. A boost of about 30%, and it could have been a lot more if I was using actual server resources and not my little laptop here.

The problem we face as ETL developers is that without changing these from the defaults, we’re allowing SSIS to run at a much smaller capacity than what it’s capable of. It would be like buying an Aston Martin DB7 (my favorite car :)) and never moving it out of 2nd gear. Painful. To help with this, I’ve created a very simple excel spreadsheet to calculate how many rows should be put into a buffer at one time in order to maximize SSIS’ efficiency. The buffer size is customizable based on how much memory you want SSIS to use for each pass it uses through the data. The maximum amount SSIS can use at once is 100 MB, or 104857600 bytes), and the spreadsheet will take what you input, up to the maximum SSIS allowed limit. Any field highlighted in yellow is where you input your values in, and each step to the left of the value denotes either step by step instructions or what the step is doing.

image

The complete file is available for download here.

Enjoy!

Advertisements

6 thoughts on “SSIS Performance Tuning: Max Default Buffer Rows”

  1. Sweet spreadsheet.

    I think I’d compare SSIS to a Subaru Impreza WRX with cloth seats, Sync and no cup holders, but still a good post!

  2. Hi,
    Thanks a lot for sharing all these informations
    The excel link to skydrive is broken, can you give us a fresh link ?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s