SSIS Data flow


“Why does my data flow take so long to execute when I use an OLE DB Command component?
“I need to update values in another table for every row in my data flow.”

These are common questions I see when dealing with “slow” performance of an SSIS package. Sometimes though, the developer of the package does not even understand which component is taking so long to process say, thousands of records in their data flow, so it is understandable why this component may proliferate in many shops. (Sometimes it is needed, of course, but you need to always ask the question - do I need the updates to happen RIGHT NOW?)

If you have a need to update data in another table, while still pushing data from a source to a destination, the OLE DB Command is often used because of the thought that it is part of the data flow, so why not use it.

Read on for more….
(more…)

When using a Merge or Merge Join component in a data flow, your incoming data is required to be sorted. While it may be easy to drop a Sort component into your data flow, it may make more sense to perform the sort in your source query (if you are using a source such as an OLE DB Source component).

If you decide to use an ORDER BY in your source and want to tell the SSIS Data Flow that your data is sorted, follow the below steps:

(more…)

There are many, many resources out on the ‘Net regarding SSIS and the Lookup component and what each of its cache modes are and how to implement them in your own package. This is going to be a technical post, for those of you interested in what each cache mode does behind the scenes.

(more…)

I know, this isn’t “breaking” news or anything, but what is new is the white paper detailing how Microsoft was able to achieve this record breaking speed using SSIS. Check it out below as its a very interesting read, and it may help generate some new ideas for your implementations.

http://blogs.msdn.com/sqlperf/archive/2009/03/03/an-etl-world-record-revealed-finally.aspx

So, one of the pain points in SSIS is that the Lookup Component performs CaSE sensitive matching, as well as retaining any trailing spaces that may be in the data. So be forewarned that when using the lookup component, it may behoove you to use a derived column upstream to TRIM([SSISColumn]) your data to get rid of any leading/trailing whitespaces and to perform the same (if needed) on the SQL Server side via an SQL statement - ltrim(rtrim(SQLServerColumn)). Also, if you have differing CaSEs among the incoming data and the reference data, you may need to convert one or the other, or both to UPPER/lower case before performing the lookup.

If you’d at least like to see CaSE INsensitive searches, please visit my submission and lets see if we can get this feature added to SSIS in SQL Server 2008.

One workaround for you, should you want to use it, is to use the Fuzzy Lookup Component instead. This has its own problems, as I’ve made mention to in the above feature request submission.

Edit: Jamie Thomson submitted an identical request before me but I’ve made reference to that submission in mine. The idea was to get it in SSIS 2008. Never-the-less, Michael Entin claims this feature request has been dropped due to lack of time.

Many users have issues when trying to use a stored procedure in an OLE DB Source component. Here are some tips:

  • Ensure that there is a final select statement at the end of the stored procedure to ensure the correct resultset is returned
  • At the top of the stored procedure as part of the procedure code, add the following line: SET NOCOUNT ON
  • At the top of the OLE DB Source SQL statement and before you issue the EXEC YourSproc line, add the following: SET FMTONLY OFF.

This should ensure that when you click on the column mapping tab, the columns are correctly displayed.

Also another approach is to use a table variable in the stored proc. Then at the end, select from that table variable.

Jamie Thomson has posted four SSIS lookup component feature requests that have been recently closed as “fixed” by Microsoft. Hopefully these will make it into the next CTP of SQL Server 2008. Take a look because they all have very high potentials to increase your productivity and useability of SSIS.

See them here: http://blogs.conchango.com/jamiethomson/archive/2007/08/21/SSIS_3A00_-Big-improvements-to-Lookup-in-SQL-Server-2008.aspx

Just a quick tutorial on using a script component as a source… Follow the link for screenshots.

(more…)

Over on the forums today a user, Pra Rav, has asked how do we work with mainframe zoned decimals. I have the answer, at least on how *I* deal with them. Take it or leave it as there may be other ways to do this, but it works slick enough for me to blog about it and to share it with Pra. Read on….

(more…)

It is one of the more common questions over on the SSIS Forums. How can I check if a record exists and if so, how can I check (quickly) if it has changed. Jamie Thomson has a blog post that outlines this, and I’ve made it a sticky post on the forums. However, if you need to check many, if not all of the columns in a table for changes, Jamie’s blogged method might be a bit laborious. This is where the Konesan’s Checksum transformation comes in. Read on…

(more…)

Next Page »