R - Subset with unique cases, based on multiple columns
I'd like to subset a data frame to include only rows that have unique combinations of three columns. My situation is similar to the one presented in this question, but I'd like to preserve the other columns in the data as well. Here's an example:
> df
  v1 v2 v3  v4 v5
1  7  1    100 98
2  7  2     98 97
3  8  1  c  NA 80
4  8  1  c  78 75
5  8  1  c  50 62
6  9  3  c  75 75
The requested output would look like this, since I'm looking for unique cases based on v1, v2, and v3 only:
> df.new
  v1 v2 v3  v4 v5
1  7  1    100 98
2  7  2     98 97
3  8  1  c  NA 80
6  9  3  c  75 75
If I could also recover the non-unique rows, that would be great too:
> df.dupes
  v1 v2 v3  v4 v5
3  8  1  c  NA 80
4  8  1  c  78 75
5  8  1  c  50 62
I saw a related question about how to do this in SQL (here), but I can't work out how to do it in R. I'm sure it's simple, but messing with unique() and subset() hasn't been fruitful. Thanks in advance.
You can use the duplicated() function to find the unique combinations:
> df[!duplicated(df[1:3]),]
  v1 v2 v3  v4 v5
1  7  1    100 98
2  7  2     98 97
3  8  1  c  NA 80
6  9  3  c  75 75
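duplicated() flags only the second and later occurrences of a combination, so to get all of the duplicated rows you can check in both directions: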
> df[duplicated(df[1:3]) | duplicated(df[1:3], fromLast=TRUE),]
  v1 v2 v3  v4 v5
3  8  1  c  NA 80
4  8  1  c  78 75
5  8  1  c  50 62
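For anyone who wants to run this end to end, here is a minimal reproducible sketch of both subsets. It rebuilds the example data by hand; the v3 values for rows 1 and 2 are not visible in the post, so the "a" used here is only a placeholder assumption, and the helper names key, is_dup and in_dup_group are made up for illustration.

```r
# Rebuild the example data frame from the question.
# ASSUMPTION: v3 for rows 1 and 2 is not shown in the post; "a" is a placeholder.
df <- data.frame(
  v1 = c(7, 7, 8, 8, 8, 9),
  v2 = c(1, 2, 1, 1, 1, 3),
  v3 = c("a", "a", "c", "c", "c", "c"),
  v4 = c(100, 98, NA, 78, 50, 75),
  v5 = c(98, 97, 80, 75, 62, 75)
)

# Columns that define a "case" (selecting by name instead of df[1:3]).
key <- c("v1", "v2", "v3")

# TRUE for the second and later occurrences of each key combination.
is_dup <- duplicated(df[key])

# Keep the first row of every combination; all other columns are preserved.
df.new <- df[!is_dup, ]

# Scanning from the end as well flags the first occurrence too,
# so the union marks every row that belongs to a repeated combination.
in_dup_group <- is_dup | duplicated(df[key], fromLast = TRUE)
df.dupes <- df[in_dup_group, ]

print(df.new)
print(df.dupes)
```

Selecting the key columns by name rather than by position is just a design choice that keeps the code working if the column order changes. Note that df.new keeps whichever row of a repeated combination comes first, so sort the data beforehand if a particular row should win.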