r - Subset with unique cases, based on multiple columns

I'd like to subset a dataframe to include only rows that have unique combinations of three columns. My situation is similar to the one presented in this question, but I'd like to preserve the other columns in the data as well. Here's my example:

> df
  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97
3  8  1  C  NA 80
4  8  1  C  78 75
5  8  1  C  50 62
6  9  3  C  75 75

The requested output would be something like this, where I'm looking for unique cases based on v1, v2, and v3 only:

> df.new
  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97
3  8  1  C  NA 80
6  9  3  C  75 75

If I could recover the non-unique rows, that would be great too:

> df.dupes
  v1 v2 v3 v4 v5
3  8  1  C NA 80
4  8  1  C 78 75
5  8  1  C 50 62

I saw a related question on how to do this in SQL (here), but I can't get it to work in R. I'm sure it's something simple, but messing with unique() and subset() hasn't been fruitful. Thanks in advance.

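You can use the duplicated() function to find the unique combinations. It flags every row whose combination of v1, v2, and v3 has already appeared in an earlier row, so negating it keeps the first row of each combination: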

> df[!duplicated(df[1:3]),]
  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97
3  8  1  C  NA 80
6  9  3  C  75 75
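For anyone who wants to run this from scratch, here is a minimal sketch of the same idea. The data frame is re-typed from the printout in the question, so the values (in particular the letters in v3) are an assumption, and `key_cols` is just an illustrative name. Selecting the key columns by name instead of `1:3` means the result does not depend on column order.

```r
# Example data re-typed from the question's printout (values are assumed).
df <- data.frame(
  v1 = c(7, 7, 8, 8, 8, 9),
  v2 = c(1, 2, 1, 1, 1, 3),
  v3 = c("A", "A", "C", "C", "C", "C"),
  v4 = c(100, 98, NA, 78, 50, 75),
  v5 = c(98, 97, 80, 75, 62, 75)
)

# Columns that define a "case" (hypothetical helper name).
key_cols <- c("v1", "v2", "v3")

# duplicated() returns TRUE for each row whose key combination already
# appeared in an earlier row; negating it keeps the first row per combination.
df.new <- df[!duplicated(df[key_cols]), ]
print(df.new)
```

Note that unique(df[key_cols]) would return the same combinations but drop v4 and v5, which is why the logical index is applied to the full data frame.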

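To get only the duplicates, you can check in both directions. A single call to duplicated() never flags the first occurrence, so a second pass with fromLast=TRUE is needed to catch it: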

> df[duplicated(df[1:3]) | duplicated(df[1:3], fromLast=TRUE),]
  v1 v2 v3 v4 v5
3  8  1  C NA 80
4  8  1  C 78 75
5  8  1  C 50 62
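The two-direction check can be sketched step by step as below. Again, the data frame is re-typed from the question's printout (assumed values), and `key_cols`, `keys`, `is_dupe`, and `df.singles` are illustrative names that do not appear in the original posts.

```r
# Example data re-typed from the question's printout (values are assumed).
df <- data.frame(
  v1 = c(7, 7, 8, 8, 8, 9),
  v2 = c(1, 2, 1, 1, 1, 3),
  v3 = c("A", "A", "C", "C", "C", "C"),
  v4 = c(100, 98, NA, 78, 50, 75),
  v5 = c(98, 97, 80, 75, 62, 75)
)

key_cols <- c("v1", "v2", "v3")
keys <- df[key_cols]

# Forward pass flags rows 4 and 5 (repeats of row 3).
# Backward pass flags rows 3 and 4 (repeats of row 5, scanning from the end).
# Their union marks every row that shares its key with some other row.
is_dupe <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

df.dupes   <- df[is_dupe, ]   # all members of repeated combinations
df.singles <- df[!is_dupe, ]  # combinations that occur exactly once
print(df.dupes)
print(df.singles)
```

Keep in mind that df.singles is not the same as the requested df.new: df.new still contains row 3 as the representative of the repeated combination, while df.singles drops that combination entirely.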
