Wednesday, March 21, 2012

read an xls file from a script task

I'm trying to read an xcel file from within the script task of SSIS. I've seen posts that kind of steer me in the right direction, but none to do exactly what I'm looking for. I am loading data from flat files into a database where the source file has text longer than the target column. In order to get the text to "fit" i would like to apply a series of standard abbreviations, for example replace "DOCTOR" with "DR". I started off by simply writing a series of if-then logic; if length (<source>) > <targetlengh>, then replace <long> with <short>. After writing this for 40+ abbreviations I soon realized that this is not the best way to do this. So I came up with the idea of storing all the abbreviations in a .xls file which would be opened and applied via a script task. The problem is that I'm new to VB.net (actually any flavor of VB) and have not been able to read the abbreviations file as of yet.

This is what I've tried thus far, which results with this error: Exception from HRESULT: 0xC001400B

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

'

' Add your code here

'

Dim local_SQLConnectionManager As New Microsoft.SqlServer.Dts.Runtime.Wrapper.ConnectionManagerAdoNet()

Dim local_SQLConnection As New System.Data.SqlClient.SqlConnection

Dim local_SQLDataReader As System.Data.SqlClient.SqlDataReader

Dim local_SQLCommand As New System.Data.SqlClient.SqlCommand

'local_SQLConnectionManager = Microsoft.SqlServer.Dts.Runtime.Wrapper.cConnections("Name of ConnectionManager")

local_SQLConnectionManager.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Departments\DataServices\Projects\QNXT Conversion\XWalks\DesciptionAbbreviations.xls;Extended Properties=Excel 8.0;HDR=YES;"

local_SQLConnection = CType(local_SQLConnectionManager.AcquireConnection(Nothing), System.Data.SqlClient.SqlConnection)

local_SQLConnection.Open()

local_SQLCommand.Connection = local_SQLConnection

local_SQLCommand.CommandText = "SELECT * from [sheet1$]"

local_SQLDataReader = local_SQLCommand.ExecuteReader()

If local_SQLDataReader.Read Then

'Microsoft.SqlServer.Dts.Runtime.Wrapper.Variable("Just a variable").Value = local_SQLDataReader.GetString(0)

End If

local_SQLDataReader.Close()

End Sub

I think a better way to do this would be to import the XLS file into a SQL table, quite easy to do via SSIS. This forms a reference table in SQL of abbreviations and expanded terms. Then use a Lookup after your Flat File source to return the replacement value based on the reference table. Set the lookup to ignore errors, no lookup value found and then use a derived column to merge the two values into one column.

This will only work if your term is the whole column value.

Off the top of my head I'm also thinking the Term Extraction/Lookup transform may be of use if your terms are in longer fields. Check out the whitepaper on MSDN.

|||

I dont that that would for for me because I need to look at each "record" in the abbreviations.xls file and apply it to a column on the input buffer. Furthermore I also need to only apply the abbreviation if the value i'm shortening is longer than a pre-defined length.

Here's the pseudocode for what I'm tring to do:

shortDescr = row.longDescr

For each xcelRow in xcel file

If length(shortDesc) > row.maxLen then

shortDesc.replace(xcelRow.longWord,xcelRow.abbrevWord)

end if

end loop

If length(shortDesc) > row.maxLen then

shortDesc = shortDesc.substring(0,row.maxLen)

end if

row.outputShortDesc = shortDesc

|||

You can't use SqlClient with the Jet OLE DB Provider. Use the same connection string and similar code with System.Data.OleDb instead.

However I believe that you will need double double quotes around the multi-value string for Extended Properties:

local_OleDbConnectionManager.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Departments\DataServices\Projects\QNXT Conversion\XWalks\DesciptionAbbreviations.xls;Extended Properties=""Excel 8.0;HDR=YES;"""

-Doug

http://msdn2.microsoft.com/en-us/library/ms403358(SQL.90).aspx

|||

Thanks for pointing me in the right direction, i've got it working now

No comments:

Post a Comment