Different types of merging using Data Step or Proc SQL in SAS

This post shows different type of merging using Data step and Proc SQL in SAS. Use the following two datasets (A & B) and give it a try!!

Dataset A = Red color;
Dataset B = Green color;
Common observations in A and B = Yellow color (Red + Green = Yellow!)

DATA ADATA B
Data A;
input ID Gender$;
Datalines;
1 M
2 M
3 M
4 M
5 M
6 F
7 F
8 M
9 M
10 M
;
run;
Data B;
input ID Sex$;
Datalines;
6 F
7 F
8 M
9 M
10 M
11 F
12 F
13 F
14 F
15 F
;
run;
What do you want?Graphical representationData StepProc SQL
All observationsdata M;
merge A (in = x) B (in = y);
by id;
if x = 1 or y = 1;
run;
Proc sql;
create table M2 as
select coalesce(a.id, b.id) as id, gender, sex
from A full join B
on a.id = b.id;
quit;
All matching obs and non-matching from A data N;
merge A (in = x) B (in = y);
by id;
if x = 1;
run;
Proc sql;
create table N2 as
select coalesce(a.id, b.id) as id, gender, sex
from A left join B
on a.id = b.id;
quit;
All matching obs and non-matching from Bdata O;
merge A (in = x) B (in = y);
by id;
if y = 1;
run;
Proc sql;
create table O2 as
select coalesce(a.id, b.id) as id, gender, sex
from A right join B
on a.id = b.id;
quit;
Only matching observationsdata P;
merge A (in = x) B (in = y);
by id;
if x = 1 and y = 1;
run;
Proc sql;
create table P2 as
select coalesce(a.id, b.id) as id, gender, sex
from A inner join B
on a.id = b.id;
quit;
Only nonmatching from Adata Q;
merge A (in = x) B (in = y);
by id;
if x = 1 and y = 0;
run;
Proc sql;
create table Q2 as
select coalesce(a.id, b.id) as id, gender, sex
from A full join B
on a.id = b.id
where b.id is null;
quit;

More efficient code:
Proc Sql;
create table Q2 as
select coalesce(a.id, b.id) as id, gender, sex
from A left join B
on a.id = b.id
where b.id is null;
quit;
Only nonmatching from Bdata R;
merge A (in = x) B (in = y);
by id;
if x = 0 and y = 1;
run;
Proc sql;
create table R2 as
select coalesce(a.id, b.id) as id, gender, sex
from A full join B
on a.id = b.id
where a.id is null;
quit;

More efficient code:
Proc Sql;
create table R2 as
select coalesce(a.id, b.id) as id, gender, sex
from A right join B
on a.id = b.id
where a.id is null;
quit;
Only nonmatching from A and Bdata S;
merge A (in = x) B (in = y);
by id;
if (x = 1 and y = 0) or (x=0 and y = 1);
run;
Proc sql;
create table S2 as
select coalesce(a.id, b.id) as id, gender, sex
from A full join B
on a.id = b.id
where a.id is null or b.id is null;
quit;
Posted in SAS Tagged with:
  • shambhu verma

    Good Explanation……………Thanks

  • Sneha Mehta

    I like the way you have presented everything in one doc for sas merging..Thank you very much…

  • raj

    can we merge tables also with merge statement or we can merge only datasets.

    • hbmehta

      To me, tables and datasets are same.

  • Archita Bhansali

    excellent way to put it !! a very clear way to understand 🙂

  • Bobby

    I did a ton of searching on this, and your page has by far the best explanation. Thanks for your help!

  • Glenn

    Thanks ! Great explanation and comparison.

  • Nasochkas

    This is great. There is no need for the coalesce statement though. Can just say select id, gender, sex.

  • Swati

    Very clearly described…Thanks!

  • Prashant Chegoor

    Good

  • Prashant Chegoor

    Excellent @hbmehta:disqus . One small correction . The PROC SQL Join for the “Only nonmatching from A ” should have the Table name as Q2 instead of R2 to make sure the table naming convention above is Consistent.

    Proc Sql;
    create table Q2 as

    select coalesce(a.id, b.id) as id, gender, sex

    from A full join B

    on a.id = b.id

    where b.id is null;
    quit;

    Also as is very much evident from the Graphical Representation the below 2 scenarios can also be represented by these PROC SQL joins as well.

    a) Only nonmatching from A
    ************************************
    Proc Sql;
    create table Q2 as

    select coalesce(a.id, b.id) as id, gender, sex

    from A left join B

    on a.id = b.id

    where b.id is null;
    quit;

    b) Only nonmatching from B
    ************************************
    Proc Sql;
    create table R2 as

    select coalesce(a.id, b.id) as id, gender, sex

    from A right join B

    on a.id = b.id

    where a.id is null;
    quit;

    • hbmehta

      Thank you Prashant. Corrected Q2 and your suggestions on a) and b) are well received. Changed as per your codes.

      • Prashant Chegoor

        Thanks @hbmehta:disqus for promptly making the Changes. Much Appreciated.

  • Himadri

    Great!!