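I have two datasets in SAS that I want to merge, but they have no variable in common. One dataset has a "subject_id" variable and the other has a "mom_subject_id" variable. Both variables are 9-digit codes, and only the 3 digits in the middle of the code have a shared meaning; those 3 digits are what I need to match on when merging the two datasets.
What I would like to do is create a new common variable in each dataset that holds just those 3 digits from the subject ID. The 3 digits are always in the same position within the 9-digit subject ID, so I am wondering whether there is a way to extract them from the variable to create a new variable.
Thanks!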
SQL (using the sample data from the Data Step code):
proc sql;
create table want2 as
select a.subject_id, a.other, b.mom_subject_id, b.misc
from have1 a JOIN have2 b
on(substr(a.subject_id,4,3)=substr(b.mom_subject_id,4,3));
quit;
Data step:
data have1;
length subject_id $9;
input subject_id $ other $;
datalines;
abc001def other1
abc002def other2
abc003def other3
abc004def other4
abc005def other5
;
data have2;
length mom_subject_id $9;
input mom_subject_id $ misc $;
datalines;
ghi001jkl misc1
ghi003jkl misc3
ghi005jkl misc5
;
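/* Add the common key: characters 4-6 of each 9-character ID */
data have1;
length id $3;
set have1;
id=substr(subject_id,4,3);
run;
data have2;
length id $3;
set have2;
id=substr(mom_subject_id,4,3);
run;
/* A BY-merge requires both datasets to be sorted by the key */
proc sort data=have1;
by id;
run;
proc sort data=have2;
by id;
run;
/* The in= flags are not used for filtering here, so ids found in either dataset are kept */
data work.want;
merge have1(in=a) have2(in=b);
by id;
run;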
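The merge above keeps every id that appears in either dataset, while the SQL join returns only ids present in both. If the matched-only result is what you want from the data step as well, the in= flags can be used as a filter. A minimal sketch, assuming have1 and have2 already contain the derived id variable and are sorted by id; the output name want_matched is just an illustrative choice:

```sas
/* Assumes have1 and have2 already have the id variable and are sorted by id */
data work.want_matched;
    merge have1(in=a) have2(in=b);
    by id;
    /* Subsetting IF: keep a row only when the id exists in both datasets */
    if a and b;
run;
```

With the sample data this keeps the three rows whose ids (001, 003, 005) occur in both datasets.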