Environment: Windows 7, SQL Server 2008 R2
Application: SQL Server Management Studio 2008 R2, Business Intelligence Development Studio 2008 (SSIS)
SSIS competency level: Novice
ETL process (works fine, no issues): The following flowcharts illustrate a basic ETL process in which the data is transformed and loaded into a staging table (the destination table). The staging table consists of the following fields:
(id, ssn, Fname, Lname, Subject_cd, Test_dt, Score, comments, ind_response)
![]()
![]()
After running the ETL package in SSIS, the data is loaded into the staging table (destination table). The following code creates the staging table and sample data, which is used later for another data transaction.
Code:
CREATE TABLE Staging_Table (
id CHAR(9)
,ssn CHAR(9) NOT NULL
,Fname VARCHAR(50) NOT NULL
,Lname VARCHAR(50) NOT NULL
,Subject_cd CHAR(2)
,Test_dt DATETIME
,Score CHAR(2)
,comments VARCHAR(250)
,ind_response VARCHAR(250)
);
-- Test_dt is omitted because the sample row supplies no test date (it stays NULL)
INSERT INTO Staging_Table (
id
,ssn
,Fname
,Lname
,Subject_cd
,Score
,comments
,ind_response
)
VALUES (
'123456781','123549874'
,'Sally','Johnson'
,'QB','3','N/A','1243212221144121321411123332411121'
);
-- id and ssn are quoted so the leading zeros of the CHAR(9) values are preserved
INSERT INTO Staging_Table (
id
,ssn
,Fname
,Lname
,Subject_cd
,Score
,comments
,ind_response
)
VALUES (
'123456792','003549874'
,'Will','Smith'
,'AD','3','Test was good','1231121223334121334121412'
);
INSERT INTO Staging_Table (
id
,ssn
,Fname
,Lname
,Subject_cd
,Score
,comments
,ind_response
)
VALUES (
'120056783','993549800'
,'William','Wahab'
,'FR','1','no comments','111111111111222224121312144412'
);
INSERT INTO Staging_Table (
id
,ssn
,Fname
,Lname
,Subject_cd
,Score
)
VALUES (
'213450081','128749890'
,'Douglas','Mike'
,'CH','+2'
);
Problem: How do I load the Staging_Table data into three entity tables that have referential integrity constraints?
- If matched (the SSN already exists in the SSN table): insert the record (id, Subject_cd, Score, Test_dt) into [ind_subject_scores].
- If not matched: insert the record into the following tables: SSN, individual, then [ind_subject_scores].
- If the record already exists in [ind_subject_scores]: update the additional elements (comments, ind_response) in that table.
Table #1: Parent
CREATE TABLE SSN (
id CHAR(9)
,ssn CHAR(9) NOT NULL
,CONSTRAINT [FK_individual] FOREIGN KEY ([id])
REFERENCES [individual] ([id])
);
INSERT INTO ssn (
id
,ssn
)
VALUES (
'12001212','993549800'
);
Table #2: child
CREATE TABLE individual
(
id CHAR(9) NOT NULL PRIMARY KEY -- assumed key: a FOREIGN KEY must reference a PRIMARY KEY or UNIQUE column
,Fname VARCHAR(50) NOT NULL
,Lname VARCHAR(50) NOT NULL
, email VARCHAR(50) NOT NULL
);
INSERT INTO individual (
id
,Fname
,Lname
,email
)
VALUES (
'12001212','William', 'Wahab', 'fake@yahoo.com'
);
Table #3
CREATE TABLE [dbo].[ind_subject_scores](
[ind_scr_id] [int] IDENTITY(1,1) NOT NULL,
[id] [char](9) NULL,
[subject_cd] [char](2) NULL,
[score] [varchar](2) NULL,
[test_dt] datetime
)
INSERT INTO [dbo].[ind_subject_scores] (
id
,Test_dt
,Score
,Subject_cd
)
VALUES (
897841239, '20110101'
,'2'
,'FR'
);
INSERT INTO [dbo].[ind_subject_scores] (
id
,Test_dt
,Score
,Subject_cd
)
VALUES (
80041239, '20110115'
,'2'
,'CH'
);
Table #3 - additional elements were requested later. These columns can be updated from the staging table when a match occurs.
CREATE TABLE [dbo].[ind_subject_scores](
[ind_scr_id] [int] IDENTITY(1,1) NOT NULL,
[id] [char](9) NULL,
[subject_cd] [char](2) NULL,
[score] [varchar](2) NULL,
[test_dt] datetime,
comments [varchar](250),
ind_response [varchar](250)
)
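To make the three matching rules in the Problem statement concrete, here is a rough T-SQL sketch of the load, which could be run from an Execute SQL Task after the staging load. It is only one possible approach and rests on several assumptions that are not stated above: individual.id is unique, the empty-string email is a placeholder (the staging table has no email column, but individual.email is NOT NULL), a score row is identified by (id, subject_cd, test_dt), and the staging table holds at most one row per such key.

```sql
BEGIN TRANSACTION;

-- 1. SSN not matched: create the individual first, because SSN.id
--    references individual.id through FK_individual.
INSERT INTO individual (id, Fname, Lname, email)
SELECT s.id, s.Fname, s.Lname, ''              -- '' is a placeholder email (assumption)
FROM Staging_Table AS s
WHERE NOT EXISTS (SELECT 1 FROM SSN AS n WHERE n.ssn = s.ssn)
  AND NOT EXISTS (SELECT 1 FROM individual AS i WHERE i.id = s.id);

-- 2. SSN not matched: register the new SSN for that individual.
INSERT INTO SSN (id, ssn)
SELECT s.id, s.ssn
FROM Staging_Table AS s
WHERE NOT EXISTS (SELECT 1 FROM SSN AS n WHERE n.ssn = s.ssn);

-- 3. Every staging SSN now exists in SSN. Resolve the id through the SSN
--    table, then update existing score rows or insert new ones.
--    Rows with a NULL test date never match and are always inserted.
MERGE dbo.ind_subject_scores AS t
USING (
    SELECT n.id, s.Subject_cd, s.Test_dt, s.Score, s.comments, s.ind_response
    FROM Staging_Table AS s
    INNER JOIN SSN AS n ON n.ssn = s.ssn
) AS src
ON  t.id = src.id
AND t.subject_cd = src.Subject_cd
AND t.test_dt = src.Test_dt                    -- assumed match key
WHEN MATCHED THEN
    UPDATE SET t.comments = src.comments,
               t.ind_response = src.ind_response
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, subject_cd, score, test_dt, comments, ind_response)
    VALUES (src.id, src.Subject_cd, src.Score, src.Test_dt,
            src.comments, src.ind_response);

COMMIT TRANSACTION;
```

Note that step 3 takes the id from the SSN table rather than from the staging row, since the two can differ for an SSN that already exists (as with 993549800 in the sample data).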