lists <- lapply(vector("list", 5), function(x) sample(1:100, 50, replace = TRUE))

How can I extract all values that are present in at least n (2, 3, 4, 5) of the vectors inside lists (or, more generally, in a population of vectors)?
For n = 5, this question already gives a solution (e.g. intersect()), but it is unclear to me how to handle cases of n < m, where m is the number of vectors. In this particular case, I perform 5 variants of mean comparisons between two groups and want to extract a consensus between the 5 tests (e.g. significantly different in at least 3 tests).

pogibas
nouse

2 Answers


If I understand correctly, you can do it as follows. Assume that you are interested in values that are shared between at least 3 list elements.

combos <- combn(seq_along(lists), 3, simplify = FALSE)
lapply(combos, function(i) Reduce(intersect, lists[i]))

And if you're just interested in the actual values,

unique(unlist(lapply(combos, function(i) Reduce(intersect, lists[i]))))

In combos we store all possible combinations of n (here, 3) positions of lists; each combination is then reduced to the values its n vectors have in common.
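To make the combination step concrete, here is a minimal sketch on a small hand-made list. The name toy and its contents are hypothetical stand-ins for lists, chosen so the result can be checked by eye; they are not part of the question's data.

```r
# Hypothetical toy data: five small vectors instead of the random `lists`.
toy <- list(c(1, 2, 3, 4), c(2, 3, 5), c(3, 4, 5), c(3, 6), c(2, 7))
n <- 3

# Every way of choosing n of the five list positions: choose(5, 3) = 10.
combos <- combn(seq_along(toy), n, simplify = FALSE)

# Values common to all n vectors of each combination (often empty).
per_combo <- lapply(combos, function(i) Reduce(intersect, toy[i]))

# Union over all combinations: values present in at least n vectors.
shared <- sort(unique(unlist(per_combo)))
shared
```

Here 2 occurs in three of the vectors and 3 in four of them, so both survive; every other value occurs in at most two vectors and is dropped. Note that the number of combinations grows as choose(m, n), so this approach is best suited to a small number of vectors.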

talat
  • The line starting with "unique" is what I wanted. This should give me the values that appear in at least 3 of the 5 list elements, correct? – nouse Feb 09 '18 at 13:54
  • You can also do it all inside `combn`, i.e. `combn(seq_along(lists), 3, simplify = FALSE, FUN = function(i) Reduce(intersect, lists[i]))` – Sotos Feb 09 '18 at 13:54

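You can simply reduce each element of lists to its unique values, combine the results into one vector with unlist, and count how often each value occurs with table. Because duplicates within a vector are removed first, each count equals the number of vectors that contain the value.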

n <- 3
names(which(table(unlist(lapply(lists, unique))) >= n))

The output of this code is a character vector of names (the values as strings); wrap it in as.numeric() if you need numbers.
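As a rough illustration of why the unique step matters, the sketch below runs the counting approach on a small hand-made list. The names toy, counts, in_n and naive are hypothetical and used only for this example; the first vector deliberately repeats the value 1 three times.

```r
# Hypothetical toy data: the value 1 is repeated within the first vector only.
toy <- list(c(1, 1, 1, 2, 3, 4), c(2, 3, 5), c(3, 4, 5), c(3, 6), c(2, 7))
n <- 3

# De-duplicate within each vector, then count in how many vectors a value occurs.
counts <- table(unlist(lapply(toy, unique)))

# Keep values found in at least n vectors, converted back to numbers.
in_n <- as.numeric(names(counts)[counts >= n])

# Without unique(), repeats inside a single vector inflate the counts.
naive_counts <- table(unlist(toy))
naive <- as.numeric(names(naive_counts)[naive_counts >= n])

in_n
naive
```

With unique, only 2 and 3 are reported. Without it, 1 would also be reported although it occurs in a single vector, which is not what "present in at least n vectors" means.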

pogibas