I'm setting up a small-to-mid-sized data warehouse load and I'm getting out-of-memory exceptions all over the place. I usually do my heavy data lifting in T-SQL, but that's not an option here because the architectural design doesn't allow any staging data on the data warehouse server. All comparisons have to be done via SSIS on a separate ETL/staging DB server.
How does SSIS handle memory? Do the various tasks just use as much as they need and throw an exception if there isn't enough? Is there any way to put a hard limit on the memory a task uses?
I'm trying to set up a Type 2 SCD on a wide table. (I'm using the Kimball SCD CodePlex data flow task, but I'm posting here because it no longer appears to be supported. I'm hoping someone here who uses it can give me some quicker answers.)
The task is fine for small numbers of rows (less than 1 million in both source and destination), but if I attempt a bulk load of 8 million source rows into a 2 million row destination dimension table, the reads come in at about 3-4 times the rate at which rows leave the SCD task. This causes the number of buffers to grow until all my memory is gone.
Both sources are SQL Server 2008 R2 (on separate servers).
Both sources are sorted on the business key.
I've tried sending the new rows to a dummy destination to see if the blockage was at the insert stage, but it's not. The reads going into the SCD task still outpace the unmatched rows coming out several times over.
One thing I've noticed is that the Kimball SCD task is blocking as well. I would have thought that, since the inputs are sorted, new/changed rows should pop out fairly quickly, but I never get any output until one of the sources has been completely loaded.
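To show what I mean by rows "popping out" of sorted inputs, here is a minimal Python sketch of the streaming behaviour I expected. It is only an illustration of a merge over two inputs sorted on the business key, not how the Kimball SCD task actually works internally. The function name `classify_sorted`, the row shape, and the sample data are all made up, and it assumes each business key appears at most once per input.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical row shape: (business_key, attribute dict)
Row = Tuple[int, Dict[str, object]]


def classify_sorted(source: Iterable[Row],
                    dimension: Iterable[Row]) -> Iterator[Tuple[str, int]]:
    """Walk two inputs sorted ascending on business key.

    Each result is yielded as soon as it is known, so only one row
    per input is held in memory at any time.
    Assumes business keys are unique within each input.
    """
    src = iter(source)
    dim = iter(dimension)
    s = next(src, None)
    d = next(dim, None)
    while s is not None or d is not None:
        if d is None or (s is not None and s[0] < d[0]):
            # Key exists only in the source: a new dimension member.
            yield ("new", s[0])
            s = next(src, None)
        elif s is None or d[0] < s[0]:
            # Key exists only in the dimension: missing from the source.
            yield ("deleted", d[0])
            d = next(dim, None)
        else:
            # Same key on both sides: compare attributes.
            yield ("changed" if s[1] != d[1] else "unchanged", s[0])
            s = next(src, None)
            d = next(dim, None)


source_rows = [(1, {"status": "open"}), (2, {"status": "closed"}), (4, {"status": "open"})]
dimension_rows = [(1, {"status": "open"}), (2, {"status": "open"}), (3, {"status": "open"})]

results = list(classify_sorted(source_rows, dimension_rows))
print(results)
```

With a merge like this the first result is available after reading just one row from each side, which is why the fully blocking behaviour surprised me.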
I don't think there'd be any issues with regular trickle-feed daily delta loads, but I want a solution that allows a reload of historical data by simply changing a delta date and rerunning.
This is the error:
Error: 0x0 at Publish - EDW - WorkOrder, Merge WorkOrder: Internal error (Error building work units: Error getting next work unit - marking all keys matched: Exception of type 'System.OutOfMemoryException' was thrown.) in ProcessCache_Thread_MatchKeys.
Can anyone suggest any solutions that might work? Is the 'double lookup' SCD type 2 approach more robust for larger data sets?