1. Suppose you are inserting an (in humanly eyes) large dataset (say 200.000 records) into a target table and this gives a primary key violation. The question is which record(s) cause this violation.
e.g. insert into target select * from source
Troubleshoot technique I)
Insert the dataset into a table wihout constraints (no prim key) and
e.g. insert into test_target select * from source
Ia) select on the fields that make the primary key with a having count(*)>1 option to get the duplicate records.
select column1, column2 from test_target group by column1, column2 having count(*)>1
Ib) Inner join the result with the target table to get already present records.
select * from test_target t
inner join target on t.column1=target.column1 and t.column2 =target.column2
2. How to find out which records caused an error in an SSIS Data flow task.
3. Suppose you have a very large primary key (say 10 columns). Is there a solution to the annoying fact that you have to join every column every time.