Hello everyone,
I have encountered a problem recently while tuning the performance of a Data Flow Task (DFT). The DFT uses a Flat File Source and an OLE DB Destination to load data into a staging table. It doesn't contain any asynchronous transformations; the Flat File Source is the only asynchronous component. The overview of the DFT is as follows:
[Screenshot of the DFT: Flat File Source → OLE DB Destination]
The package runs in 64-bit runtime mode. The DFT takes about 50 minutes to process 40 to 50 million rows.
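(In case anyone wants to verify the bitness at run time rather than trusting the project setting: `Environment.Is64BitProcess` is standard .NET, so a minimal Script Task sketch like the following can log it. The subcomponent name is just illustrative.)

```csharp
// Inside an SSIS Script Task (C#): log whether the package is actually
// executing as a 64-bit process. Environment.Is64BitProcess is standard .NET;
// Dts and ScriptResults come from the Script Task template.
public void Main()
{
    bool fireAgain = true;
    Dts.Events.FireInformation(0, "BitnessCheck",
        "64-bit process: " + Environment.Is64BitProcess,
        string.Empty, 0, ref fireAgain);
    Dts.TaskResult = (int)ScriptResults.Success;
}
```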
By enabling the BufferSizeTuning logging event on the DFT, I am able to obtain the buffer-related information. First, I used the default settings for DefaultBufferMaxRows (10000) and DefaultBufferSize (10 MB). However, the package log shows both 10 MB buffers and 64 KB buffers being allocated. Please see the following log info:
User:BufferSizeTuning,TestDW,Test\administrator,Insert into Staging Table and Production Table,{feb3e0dc-f5db-400f-87f5-1561d93bc30f},{FFFC0B10-7BC3-43BC-85AC-6557CA654FA0},2014/12/15 19:34:45,2014/12/15 19:34:45,0,0x,Rows in buffer type 0 would cause a buffer size greater than the configured maximum. There will be only 3901 rows in buffers of this type.
……
User:BufferSizeTuning,TestDW,Test\administrator,Insert into Staging Table and Production Table,{feb3e0dc-f5db-400f-87f5-1561d93bc30f},{FFFC0B10-7BC3-43BC-85AC-6557CA654FA0},2014/12/15 19:34:46,2014/12/15 19:34:46,0,0x,Rows in buffer type 9 would cause a buffer size less than allocation minimum, which is 65536 bytes. There will be 1638 rows in buffers of this type.
…….
User:BufferSizeTuning,TestDW,Test\administrator,Insert into Staging Table and Production Table,{feb3e0dc-f5db-400f-87f5-1561d93bc30f},{FFFC0B10-7BC3-43BC-85AC-6557CA654FA0},2014/12/15 19:34:47,2014/12/15 19:34:47,0,0x,Rows in buffer type 10 would cause a buffer size less than allocation minimum, which is 65536 bytes. There will be 2730 rows in buffers of this type.
……
Buffer manager is throttling allocations to keep in-memory buffers around 850MB.
……
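To make my mental model concrete, here is a small sketch (my own illustration, not the actual engine code) of the buffer sizing rule as it is commonly described for the SSIS pipeline; the constants mirror the values from the log above. It reproduces the 3901-row figure for buffer type 0 almost exactly, but notably it cannot reproduce the 1638 or 2730 rows reported for the 64 KB buffers, which is part of what I am asking about below.

```csharp
// Back-of-the-envelope sketch (hypothetical, not SSIS source code) of the
// commonly described buffer sizing rule. Only the constants come from the
// log above; RowsPerBuffer is my own name.
using System;

class BufferMath
{
    const int DefaultBufferMaxRows = 10000;             // package default
    const int DefaultBufferSize    = 10 * 1024 * 1024;  // 10 MB package default
    const int MinBufferSize        = 65536;             // 64 KB minimum, per the log

    static int RowsPerBuffer(int estimatedRowWidth)
    {
        long requested = (long)estimatedRowWidth * DefaultBufferMaxRows;
        if (requested > DefaultBufferSize)    // "greater than the configured maximum"
            return DefaultBufferSize / estimatedRowWidth;
        if (requested < MinBufferSize)        // "less than allocation minimum"
            return MinBufferSize / estimatedRowWidth;
        return DefaultBufferMaxRows;
    }

    static void Main()
    {
        // Rows of ~2688 bytes reproduce the figure reported for buffer type 0:
        Console.WriteLine(RowsPerBuffer(2688));  // 3900 (the log reports 3901)

        // But the rule cannot reproduce 1638 rows in a 64 KB buffer:
        // 65536 / 1638 ≈ 40 bytes per row, and 40 * 10000 = 400,000 bytes,
        // which is neither above 10 MB nor below 64 KB.
        Console.WriteLine(65536 / 1638);         // ≈ 40 bytes per row
    }
}
```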
I also watched Performance Monitor while the package ran, and found that the number of buffers in use hovers around 19000, and the number of buffers spooled is a little larger than 19000. I understand this happens because of memory pressure, which causes buffers to be spooled to disk.
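For anyone reproducing this, the same counters can also be sampled programmatically; below is a minimal console sketch using System.Diagnostics. The performance object name is version-specific, so "SQLServer:SSIS Pipeline 11.0" (SSIS 2012) is an assumption here; adjust it to your installation.

```csharp
// Minimal console sketch that samples the SSIS Pipeline performance counters
// mentioned above. The category name varies by SSIS version; the one below
// is an assumption for SSIS 2012.
using System;
using System.Diagnostics;
using System.Threading;

class PipelineCounters
{
    static void Main()
    {
        const string category = "SQLServer:SSIS Pipeline 11.0"; // assumption
        string[] counters = { "Buffers in use", "Buffers spooled", "Buffer memory" };

        for (int sample = 0; sample < 60; sample++)   // ~5 minutes of samples
        {
            foreach (string name in counters)
            {
                using (var pc = new PerformanceCounter(category, name))
                    Console.WriteLine("{0}: {1}", name, pc.NextValue());
            }
            Thread.Sleep(5000);                       // sample every 5 seconds
        }
    }
}
```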
Now the questions are:
- The flat file source doesn't have any BLOB-type columns, and the sizes of different rows should be roughly the same. Why are some buffers 10 MB while others are 64 KB?
- For the 64 KB buffers (which implies estimated row size * 10000 < 64 KB), shouldn't they hold more than 10000 rows? Why are only 1638 or 2730 rows placed in each buffer?
- When I change DefaultBufferMaxRows or DefaultBufferSize, it only affects the number of rows placed in the large buffers (I understand this is because the size of those buffers changes); 64 KB buffers are still allocated. So I am still not clear why and how the 64 KB buffers are created. In case the cause is a shortage of large (10 MB) memory chunks available for buffer allocation, I also tried setting DefaultBufferSize to 1 MB and to 5 MB, but that does not change the 64 KB allocations (see the property-setting sketch after this list).
- Why is the memory used by all the buffers limited to 800-900 MB? I suspect that most of the buffers are 64 KB ones and that the SSIS pipeline caps the number of buffers, so the total size of all buffers stays below 1000 MB.
- Another package runs at the same time when this package runs as a job. I changed the BufferTempStoragePath of this package to a folder other than the TEMP folder, but it does not seem to affect the package performance much.
- Could you give some suggestions to significantly improve the performance of the package without increasing physical memory?
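For completeness, here is a sketch of how the properties mentioned above can be set through the SSIS runtime object model (Microsoft.SqlServer.Dts.Runtime). The package path and spool folder are placeholders; the task-name match reuses the DFT name from the log excerpt.

```csharp
// Sketch: setting the buffer-related properties of a Data Flow Task through
// the SSIS runtime object model (Microsoft.SqlServer.ManagedDTS assembly).
// The package path and spool folder below are placeholders for illustration.
using Microsoft.SqlServer.Dts.Runtime;

class TunePackage
{
    static void Main()
    {
        Application app = new Application();
        Package pkg = app.LoadPackage(@"C:\Packages\LoadStaging.dtsx", null);

        foreach (Executable exe in pkg.Executables)
        {
            TaskHost th = exe as TaskHost;
            if (th == null || !th.Name.Contains("Insert into Staging")) continue;

            // The Data Flow Task exposes these as task properties:
            th.Properties["DefaultBufferMaxRows"].SetValue(th, 10000);
            th.Properties["DefaultBufferSize"].SetValue(th, 10 * 1024 * 1024);
            th.Properties["BufferTempStoragePath"].SetValue(th, @"D:\SSISBufferSpool");
        }

        app.SaveToXml(@"C:\Packages\LoadStaging.dtsx", pkg, null);
    }
}
```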
Thank you very much for taking the time to review this scenario and to clear up some of my doubts or offer some suggestions.
Mike Yin
TechNet Community Support