|
I have duplicate client records within my data file. I need to determine the level of agreement between the records. I would like to create a variable that identifies if there is agreement within the RecordID for a variable of interest (like Gender). All variables are numeric.
The number of duplicates range from 2-17 records within a RecordID. I would like the output files to look like this:
|
|
Christy, You can use the magic of SQL!
To run the duplicates check for a variable just execute the macro:
The default value for Missing is -1 you can change that by adding it as a parameter;
If you're processing a character variable, you'll need to change the missing value as such:
|
|
You can also do it using a standard DATA step and BY processing:
|
|
proc sql; create table output as select RecordID, case when count(distinct Gender) = 1 then '1-yes' when count(distinct Gender) > 1 then '0-no' else ' ' end as Gender_agreement , count(*) as Count from datasetname group RecordID; quit; |