selection - R: Choosing a specific number of combinations from all possible combinations


Let's say we have the following dataset:

set.seed(144)
dat <- matrix(rnorm(100), ncol=5)

The following code creates all possible combinations of columns and removes the first row (the one in which no column is selected):

(cols <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(dat)))[-1,])
#     Var1  Var2  Var3  Var4  Var5
# 2   TRUE FALSE FALSE FALSE FALSE
# 3  FALSE  TRUE FALSE FALSE FALSE
# 4   TRUE  TRUE FALSE FALSE FALSE
# ...
# 31 FALSE  TRUE  TRUE  TRUE  TRUE
# 32  TRUE  TRUE  TRUE  TRUE  TRUE

My question is: how can I compute only the single, pairwise, and triple combinations?

Choosing the rows that contain no more than 3 TRUE values works for a small number of columns: cols[rowSums(cols) < 4L, ]. However, with more columns it fails with the following error, because expand.grid cannot build such a long result:

Error in rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
  invalid 'times' value
In addition: Warning message:
In rep.fac * nx : NAs produced by integer overflow

Any suggestions that would allow me to compute only the single, pairwise, and triple combinations?

You can use this solution:

col.i <- do.call(c, lapply(1:3, combn, x=5, simplify=FALSE))
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2
#
# <...skipped...>
#
# [[24]]
# [1] 2 4 5
#
# [[25]]
# [1] 3 4 5

Here, col.i is a list, every element of which contains the column indices of one combination.

How it works: combn generates all combinations of the numbers from 1 to 5 (requested by x=5) taken m at a time (simplify=FALSE ensures that the result has a list structure). lapply runs an implicit loop that iterates m from 1 to 3 and returns a list of lists. do.call(c, ...) converts the list of lists into a plain list.
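To make the steps above concrete, here is a small sketch that runs them one at a time and compares the number of combinations with the size of the full expand.grid table. The intermediate name by_size is only illustrative, and the 70-column figure is used purely to show how the two counts diverge.

```r
# Step 1: one list of combinations per size m = 1, 2, 3
by_size <- lapply(1:3, combn, x = 5, simplify = FALSE)
lengths(by_size)            # 5 10 10 combinations of size 1, 2 and 3

# Step 2: flatten the list of lists into one plain list
col.i <- do.call(c, by_size)
length(col.i)               # 25

# The same count follows directly from binomial coefficients
sum(choose(5, 1:3))         # 25, versus 2^5 - 1 = 31 rows from expand.grid

# With 70 columns the gap is enormous:
sum(choose(70, 1:3))        # 57225 combinations of size 1 to 3
2^70 > .Machine$integer.max # TRUE: the full grid cannot be indexed with R integers
```

Only the combinations that are actually needed get generated, so the size of the result grows polynomially in the number of columns instead of doubling with every added column.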

You can use col.i to get the corresponding columns from dat using e.g. dat[,col.i[[1]],drop=FALSE] (1 is the index of the column combination, so you can use any number from 1 to 25; drop=FALSE makes sure that when you pick just one column from dat, the result is not simplified to a vector, which might cause unexpected program behavior). Another option is to use lapply, e.g.

lapply(col.i, function(cols) dat[,cols,drop=FALSE])

which will return a list of matrices, each containing a subset of the columns of dat.
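As a usage illustration, the sketch below builds that list and labels each element with the column indices it contains, so a subset can be looked up by name. The labelling scheme ("2-4-5" and so on) and the name subsets are assumptions made for this example, not part of the original answer.

```r
set.seed(144)
dat <- matrix(rnorm(100), ncol = 5)

# All column combinations of size 1, 2 and 3
col.i <- do.call(c, lapply(1:3, combn, x = 5, simplify = FALSE))

# One sub-matrix per combination; drop = FALSE keeps single columns as matrices
subsets <- lapply(col.i, function(cols) dat[, cols, drop = FALSE])

# Hypothetical labels such as "1", "1-2", "2-4-5"
names(subsets) <- vapply(col.i, paste, character(1), collapse = "-")

dim(subsets[["2-4-5"]])   # 20 3
dim(subsets[["1"]])       # 20 1
```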

In case you want the column indices as a boolean matrix, you can use:

col.b <- t(sapply(col.i, function(z) 1:5 %in% z))
#       [,1]  [,2]  [,3]  [,4]  [,5]
# [1,]  TRUE FALSE FALSE FALSE FALSE
# [2,] FALSE  TRUE FALSE FALSE FALSE
# [3,] FALSE FALSE  TRUE FALSE FALSE
# ...
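For a small case it is possible to check that this boolean matrix holds the same rows as the expand.grid approach from the question, just in a different order. The sketch below does that for 5 columns; the helper row_keys is a name made up for this check.

```r
# expand.grid approach from the question, restricted to at most 3 TRUE values
cols <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), 5))[-1, ]
grid.b <- as.matrix(cols[rowSums(cols) < 4L, ])

# combn approach from the answer
col.i <- do.call(c, lapply(1:3, combn, x = 5, simplify = FALSE))
col.b <- t(sapply(col.i, function(z) 1:5 %in% z))

# Hypothetical helper: turn every row into a string such as "01011" and sort,
# so that the two matrices can be compared regardless of row order
row_keys <- function(m) sort(unname(apply(m + 0L, 1, paste, collapse = "")))

identical(row_keys(grid.b), row_keys(col.b))   # TRUE
```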

[Update]

A more efficient implementation:

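library("gRbase")

coli <- function(x=5, m=3) {
    # all combinations of 1..x taken 1, 2, ..., m at a time, as one plain list
    col.i <- do.call(c, lapply(1:m, combnPrim, x=x, simplify=FALSE))
    # positions of the TRUE cells in a row-major vector with x entries per row
    z <- lapply(seq_along(col.i), function(i) x*(i-1) + col.i[[i]])
    v.b <- rep(FALSE, x*length(col.i))
    v.b[unlist(z)] <- TRUE
    matrix(v.b, ncol=x, byrow=TRUE)
}

coli(70, 5) # takes 30 sec on a desktop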
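If gRbase is not available, the same matrix can be built with base R only. The following is a sketch under that assumption, not the code from the answer: it swaps combnPrim for base combn (which is likely slower for large inputs) and fills the matrix through a two-column (row, column) index instead of a row-major vector. The name coli_base is hypothetical.

```r
coli_base <- function(x = 5, m = 3) {
  # All combinations of 1..x taken 1, 2, ..., m at a time, as one plain list
  col.i <- do.call(c, lapply(seq_len(m), combn, x = x, simplify = FALSE))
  # Row number of every selected cell: combination i contributes lengths(col.i)[i] cells
  rows <- rep(seq_along(col.i), lengths(col.i))
  out <- matrix(FALSE, nrow = length(col.i), ncol = x)
  # Matrix indexing with a two-column (row, column) index
  out[cbind(rows, unlist(col.i))] <- TRUE
  out
}

b <- coli_base(5, 3)
dim(b)                    # 25 5
range(rowSums(b))         # 1 3
nrow(coli_base(70, 3))    # 57225
```

For x = 5 and m = 3 the result has the same rows, in the same order, as the col.b matrix built with sapply.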
