Hi
I have read about Rows per Batch (RPB) and Maximum Insert Commit Size (MICS) in online articles as well as a research paper from Microsoft, but I am still not 100% clear on them.
I have a data warehouse on an Azure virtual machine (size A7). It has a Stage DB and an ODS DB.
There is a table on our production server without a timestamp column. It's silly, but it leaves me no choice other than a full load: about 35 million rows from 18 databases every day, and it keeps growing.
I created a heap table in the Stage DB.
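For reference, this is roughly the shape of the staging table; the table and column names here are made up for illustration. The point is that it has no clustered index, so it stays a heap:

-- Hypothetical staging table in the Stage DB (illustrative names only).
-- No PRIMARY KEY and no clustered index, so the table is a heap,
-- which allows minimally logged bulk inserts with a table lock.
CREATE TABLE dbo.StageProductionTable
(
    SourceDatabaseName sysname      NOT NULL, -- which of the 18 source databases the row came from
    Id                 int          NOT NULL,
    SomeValue          varchar(100) NULL
);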
The article http://blogs.msdn.com/b/sqlcat/archive/2013/09/16/top-10-sql-server-integration-services-best-practices.aspx suggests:

"Commit size 0 is fastest on heap bulk targets, because only one transaction is committed."
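As I understand it (please correct me if I am wrong), MICS = 0 on the OLE DB Destination fast load behaves roughly like a BULK INSERT with no BATCHSIZE, i.e. the whole load is one transaction. A sketch with a made-up file path, against the staging table above:

-- With no BATCHSIZE, the entire file is loaded and committed as ONE
-- transaction, which is my understanding of what commit size 0 does.
BULK INSERT dbo.StageProductionTable
FROM 'C:\loads\production_extract.dat'  -- hypothetical extract file
WITH (
    TABLOCK,               -- table lock, needed for minimal logging on a heap
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n'
);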
What I am not sure about is this: if everything is committed as one transaction, does the data spill into tempdb when memory cannot hold the entire load?
Would it be better to commit the transactions and clear the pipeline before the source data fills it up? If the commit size is 10,000, then once 10,000 rows have arrived, SSIS commits them to the database and clears that part of the pipeline. Am I right? Also, as far as I remember there won't be any logging in the transaction log for a bulk insert. Is that right?
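To test my understanding, I was planning to compare the two settings along these lines (file path is again made up, and sys.fn_dblog is undocumented, so only on a test box):

-- Same load, but committed every 10,000 rows instead of all at once;
-- my understanding of what a commit size of 10000 does.
BULK INSERT dbo.StageProductionTable
FROM 'C:\loads\production_extract.dat'
WITH (
    TABLOCK,
    BATCHSIZE = 10000,     -- commit after every 10,000 rows
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n'
);

-- Rough check of how much was written to the active transaction log:
SELECT COUNT(*)                 AS log_records,
       SUM([Log Record Length]) AS log_bytes
FROM sys.fn_dblog(NULL, NULL);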
And what really happens when the commit size is 0 for a load this big, in terms of the transaction log and tempdb?
Kind regards