Hi All,
Any help with this would be very much appreciated.
Problem:
I am trying to execute multiple instances of the same SSIS package in parallel to reduce the load time.
Introduction:
We have a src system that delivers 7000 tiny files of 72 rows each. The SSIS package uses a For Each Loop task to iterate through the files and load the data. We have a Process table that keeps track of the status of the SRC process and the ETL load process.
When the src process starts, for each row in the process table it sets the row status to 'Assigned', brings in the flat file of 72 rows, and updates the status to 'Complete'. When the ETL starts, for each file in the shared directory it sets the status to 'Running', loads the data, and updates the status to 'Complete'. The file is then moved to a separate processed folder. The table acts as a bridge between the two processes.
Bridge table format: Table_PK (identity column); (DATE, CITY) is the natural key. The table is a cross join of dates and cities, so the process gets one file per city per day. Both statuses start as 'Queued'.
--------------------------------------------------------------------
Table_PK  DATE        CITY               SrcProcStatus  ETLStatus
--------------------------------------------------------------------
1         03/17/2007  Abingdon           Queued         Queued
2         03/17/2007  Albion             Queued         Queued
3         03/17/2007  Aledo              Queued         Queued
4         03/17/2007  Altamont           Queued         Queued
5         03/17/2007  Alton              Queued         Queued
6         03/17/2007  Amboy              Queued         Queued
7         03/17/2007  Anna               Queued         Queued
8         03/17/2007  Antioch            Queued         Queued
9         03/17/2007  Arcola             Queued         Queued
10        03/17/2007  Arlington Heights  Queued         Queued
11        03/17/2007  Ashley             Queued         Queued
....      ....
11        03/17/2007  Zeigler            Queued         Queued
11        03/17/2007  Zion               Queued         Queued
--------------------------------------------------------------------
Since the bridge table is prepopulated, the src process (which runs on Unix) starts multiple threads and fetches all the files within 30 minutes. The SSIS load, however, is a serial process and takes 2-3 hours to load the files. Most of that time is spent on file operations, and SSIS starts only one thread.
Future Plan:
To bring down the processing time, we want to drive the SSIS package from the bridge table instead of from the shared folder, i.e. for each row in the bridge table where SrcProcStatus is 'Complete' and ETLStatus is 'Queued', start the SSIS process for that src file. Since our src files are named "CityName_Date.csv", working out the file name from the row will not be difficult. We then want to start multiple threads so that the load runs in parallel and finishes faster.
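To make the plan concrete, here is a rough sketch of the row selection and file naming, written in Python purely for illustration. The `BridgeRow` class, the helper names, and the `%Y%m%d` date format in the file name are all assumptions; the real date format in "CityName_Date.csv" may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BridgeRow:
    """One row of the bridge table (hypothetical in-memory shape)."""
    table_pk: int
    load_date: date
    city: str
    src_status: str
    etl_status: str


def is_ready(row: BridgeRow) -> bool:
    # A row is ready for ETL once the src side is done and ETL has not started.
    return row.src_status == "Complete" and row.etl_status == "Queued"


def source_file_name(row: BridgeRow, date_format: str = "%Y%m%d") -> str:
    # Follows the "CityName_Date.csv" pattern; the date format is an assumption.
    return f"{row.city}_{row.load_date.strftime(date_format)}.csv"


def ready_files(rows: list[BridgeRow]) -> list[str]:
    # File names for every row that the ETL side may pick up now.
    return [source_file_name(r) for r in rows if is_ready(r)]
```

In practice the rows would come from a query against the bridge table, and each selected row would be marked 'Running' before its file is handed to a loader so that two loaders never pick the same file.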
Implementation:
In the T-SQL loop we want to use 'xp_cmdshell' to call the DTEXEC utility, passing the src file name as a variable. However, a DTEXEC call made this way is synchronous: the loop waits for each package to finish before starting the next. I am looking for a way to do this asynchronously, just like using "nohup executionscript &" in Unix.
Any ideas on how to implement this? I looked on the web and found some material about Service Broker, but all of it is about sending messages and queuing. Any light on how to implement this on Windows Server would be a life saver.
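One possible shape of the asynchronous launch is sketched below. Note that it swaps the T-SQL loop and 'xp_cmdshell' for an external Python dispatcher, which is an assumption and not the approach described above. The package path, the `User::SrcFileName` variable name, and the `max_workers` value are hypothetical; `/F` and `/SET` are dtexec's options for the package file and for overriding a property value.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_dtexec_command(package_path, src_file, variable="User::SrcFileName"):
    # /F names the .dtsx file; /SET overrides a package variable.
    # The variable name is hypothetical and must match the package.
    return [
        "dtexec",
        "/F", package_path,
        "/SET", f"\\Package.Variables[{variable}].Properties[Value];{src_file}",
    ]


def run_package(package_path, src_file, runner=subprocess.run):
    # Each call blocks its own worker thread, not the dispatcher as a whole.
    completed = runner(build_dtexec_command(package_path, src_file))
    return src_file, completed.returncode


def load_in_parallel(package_path, src_files, max_workers=4, runner=subprocess.run):
    # Run up to max_workers dtexec processes at once; map file -> exit code.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_package, package_path, f, runner) for f in src_files
        ]
        return dict(f.result() for f in futures)
```

A call such as `load_in_parallel("LoadCityFile.dtsx", files, max_workers=8)` would return one exit code per file, where dtexec reports 0 for a successful run, so the dispatcher can update ETLStatus accordingly. The `runner` parameter exists only so the launcher can be tried out without dtexec installed.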
Thanks a lot,
Venkat