Thursday, 30 July 2015

SSIS : Lookup Vs Merge Join


In many interviews you will be seeing interview asking what is difference between merge join and lookup component.
Most of the people think Merge and lookup are same, well if that was the case than Microsoft would have not created two separate component.

Lookup and Merge join are two different activities they perform in different way and are meant of different purposes.  

Area
Merge Join
Lookup
Work function
Joins two data set based on one or more key columns. This joining is just like T SQL join (actually a hash join)
It reference data for a one and only one possible match of the lookup table (or query)
Input
  • Input needs to be sorted
  • Need 2 input source
  • Input need not to be sorted
  • Need one input
Output
Only one output with merging (joining) of two input data sets
Two outputs Matched and non matched
On Match
Matched rows will flow into output pipeline. All matching rows of both the source will be flown

Match rows will into Matched output pipeline.
Here if multiple matching rows are there in lookup table than only first matched row from lookup table will be selected.
On Non Match
It gets define by merge join settings-
  • Inner  - In case of inner join, non matched rows will be discarded
  • Left – In cases on left join, non matched rows of left of sources (first input) will be considered and other source non matched rows will be discarded.
  • Right - In cases on right join, non matched rows of right of sources (second input) will be considered and other source non matched rows will be discarded.
  • Full – both source non matched rows will be considered in output
Non match setting is available, when enabled-

  • Only non match rows of source will be considered and will be flown in output.
  • Lookup tables’s non matched out put is will not be considered at all.


Null Values
You have setting to specify if NULL to treated as equal. This will treat NULL value equal to other NULL value.  Else it will discard the row as non matched one.
Null values in source will match to null values in lookup if both tables has null, else of only source has null it will return an error. Null in lookup table will not have any impact.
Usage
When two data set needs to be joined (left, right, full)
When data needs to referenced like get PK from master tables based on name.  
Specific requisite
Input needed to be sorted
NA

I hope above comparison will help you understand the difference between SSIS Lookup and Merge join.

3 comments:

  1. thanks for sharing it's very useful to msbi learners.we are providing msbi online training.
    msbi ONLINE TRAINING

    ReplyDelete
  2. Thank you for sharing this kind of useful information,
    It is very useful mostly useful to who are searching for Msbi online training.

    ReplyDelete