I've done some research on this problem and haven't found a solution that works. Essentially, I have an end user who creates an Excel file (.xls) by copying data from various places and pasting it into two columns. One column stores mobile phone numbers without dashes, so every number looks like this: 2064337873. After she pastes the data and saves the file, she drops it into a folder where my SSIS package picks it up and inserts the data into a table.
I'm sure you know where this is going: when the data is inserted into the SQL table, some of the mobile phone numbers come through as NULL. I understand that this happens because some numbers are stored in Excel as TEXT while others are stored as NUMBER.
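To illustrate what I believe is going on, here is a simplified Python model of the behavior. It is only a sketch under my own assumptions, not the Excel driver's actual code: I assume the column type is guessed from the majority type in the first few rows (`sample_rows` is a made-up parameter name), and that any cell not matching the guessed type is returned as NULL. The 555 numbers are invented sample values.

```python
from collections import Counter


def guess_column_type(values, sample_rows=8):
    """Pick the majority type ("number" or "text") among the first sample_rows values."""
    sample = values[:sample_rows]
    counts = Counter(
        "number" if isinstance(v, (int, float)) else "text" for v in sample
    )
    return counts.most_common(1)[0][0]


def read_column(values, sample_rows=8):
    """Return the column as the guessed type; cells of the other type become None (NULL)."""
    col_type = guess_column_type(values, sample_rows)
    result = []
    for v in values:
        is_number = isinstance(v, (int, float))
        if (col_type == "number") == is_number:
            result.append(v)
        else:
            result.append(None)  # type mismatch: this is the NULL I see in the table
    return result


# Three cells stored as numbers, one stored as text (hypothetical sample data).
cells = [2064337873, 2065550101, "2065550102", 2065550103]
print(read_column(cells))
```

In this model the one phone number stored as text is the one that gets lost, which matches the pattern I'm seeing in the table.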
Here's what I have tried so far:
Saving the Excel file as a .csv file. Unfortunately, I get 65,000 rows inserted into the table when there are normally fewer than 100 rows in the file. I'm not sure what causes this. When I copy the data from Excel, paste it into a fresh spreadsheet, and save that as .csv, the preview in the Flat File task shows my two column names PLUS several additional columns (column 0, column 1, etc.) which seem to hold additional mobile phone numbers. Again, I'm not sure what causes this.
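If the extra rows turn out to be blank, one workaround might be to clean the .csv before SSIS reads it. The sketch below is only an idea under that assumption: the function name `clean_rows`, the `wanted_columns` parameter, and the sample header names are all hypothetical. Note that it throws away anything beyond the first two columns, so if those extra columns really do hold phone numbers they would have to be dealt with first.

```python
import csv
import io


def clean_rows(csv_text, wanted_columns=2):
    """Keep the first wanted_columns fields of each row and skip rows that are entirely blank."""
    reader = csv.reader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        kept = [field.strip() for field in row[:wanted_columns]]
        if any(kept):  # at least one non-empty field in the columns we keep
            cleaned.append(kept)
    return cleaned


# Hypothetical export: a header, one real row, and blank rows with trailing empty columns.
sample = "Name,Mobile,,\nAnn,2064337873,,\n,,,\n,,,\n"
print(clean_rows(sample))
```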
Using IMEX=1 in the Excel connection string. This does not work, and I cannot change the registry settings because I do not have the privileges. It's really weird to me that Microsoft hasn't found a resolution for this problem, particularly since Excel is created there. Excel is crap in SSIS...weird that two products from the same company can't play nicely together. What do we have to do to get the two teams talking over there in Redmond?
I digress, LOL.
So my question:
How the hell do I resolve this problem?
Do I have to run this file through some sort of parser?
Does anyone have any ideas?
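To make the parser idea concrete, this is roughly what I imagine: a small step that turns every phone cell into a plain string of digits before the load, regardless of whether Excel stored it as TEXT or as a NUMBER. It is a rough sketch, not a tested solution. `normalize_phone` is a name I made up, and how the cells are actually read out of the .xls file is left out.

```python
import re


def normalize_phone(value):
    """Return the cell value as a string of digits, or None if nothing is left."""
    if value is None:
        return None
    if isinstance(value, float) and value.is_integer():
        value = int(value)  # a numeric cell may arrive as 2064337873.0
    digits = re.sub(r"\D", "", str(value))  # drop dashes, spaces, and other non-digits
    return digits or None


# The same number as it might arrive from a numeric cell, a text cell, and a dashed text cell.
for cell in (2064337873.0, "2064337873", " 206-433-7873 "):
    print(normalize_phone(cell))
```

If every value reached the table as text like this, the mixed TEXT/NUMBER storage in Excel shouldn't matter anymore.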
Thanks!