4

Is there a Haskell function that generates all the unique combinations of a given length from a list?

Source = [1,2,3]

uniqueCombos 2 Source = [[1,2],[1,3],[2,3]]

I tried looking in Hoogle but could not find a function that did this specifically. Permutations does not give the desired result.

Has anybody used a similar function before?

jpmarinier
  • 4,087
  • 1
  • 9
  • 22
newb346
  • 151
  • 1
  • 6

6 Answers6

15

I don't know a predefined function either, but it's pretty easy to write yourself:

-- Every set contains a unique empty subset.
subsets 0 _ = [[]]

-- Empty sets don't have any (non-empty) subsets.
subsets _ [] = []

-- Otherwise we're dealing with non-empty subsets of a non-empty set.
-- If the first element of the set is x, we can get subsets of size n by either:
--   - getting subsets of size n-1 of the remaining set xs and adding x to each of them
--     (those are all subsets containing x), or
--   - getting subsets of size n of the remaining set xs
--     (those are all subsets not containing x)
subsets n (x : xs) = map (x :) (subsets (n - 1) xs) ++ subsets n xs
melpomene
  • 81,915
  • 7
  • 76
  • 137
  • 1
    It's interesting to note that if the order of appearance of `subsets 0 _ = [[]]` and `subsets _ [] = []` changes it breaks. – Redu Dec 22 '19 at 19:50
7

Using Data.List:

import Data.List
combinations k ns = filter ((k==).length) $ subsequences ns

Reference: 99 Haskell Problems

There are quite a few interesting solutions in the reference, I just picked a concise one.

cmaher
  • 4,853
  • 1
  • 20
  • 34
dopamane
  • 1,263
  • 2
  • 16
  • 26
3

It is not clear to me how concerned you would be about performance.

If it can be of any use, back in 2014, somebody posted some sort of performance contest of various Haskell combinations generating algorithms.

For combinations of 13 items out of 26, execution times varied from 3 to 167 seconds ! The fastest entry was provided by Bergi. Here is the non-obvious (for me at least) source code:

subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if (n > l) then []
                             else subsequencesBySize xs !! (l-n)
 where
   subsequencesBySize [] = [[[]]]
   subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                               in zipWith (++)
                                    ([]:next)
                                    ( map (map (x:)) next ++ [[]] )                 

More recently, the question has been revisited, in the specific context of picking a few elements from a large list (5 out of 100). In that case, you cannot use something like subsequences [1 .. 100] because that refers to a list whose length is 2100 ≃ 1.26*1030. I submitted a state machine based algorithm which is not as Haskell-idiomatic as I would like, but is reasonably efficient for that sort of situations, something around 30 clock cycles per output item.

Side note: Using multisets to generate combinations ?

Also, there is a Math.Combinatorics.Multiset package available. Here is the documentation. I have only briefly tested it, but it can be used to generate combinations.

For example, the set of all combinations of 3 elements out of 8 are just like the "permutations" of a multiset with two elements (Present and Absent) with respective multiplicities of 3 and (8-3)=5.

Let us use the idea to generate all combinations of 3 elements out of 8. There are (876)/(321) = 336/6 = 56 of them.

*L M Mb T MS> import qualified Math.Combinatorics.Multiset as MS
*Math.Combinatorics.Multiset L M Mb T MS> pms = MS.permutations
*Math.Combinatorics.Multiset L M Mb T MS> :set prompt "λ> "
λ> 
λ> pms38 = pms $ MS.fromCounts [(True, 3), (False,5)]
λ> 
λ> length pms38
56
λ>
λ> take 3 pms38
[[True,True,True,False,False,False,False,False],[True,True,False,False,False,False,False,True],[True,True,False,False,False,False,True,False]]
λ> 
λ> str = "ABCDEFGH"
λ> combis38 = L.map fn pms38 where fn mask = L.map fst $ L.filter snd (zip str mask)
λ> 
λ> sort combis38
["ABC","ABD","ABE","ABF","ABG","ABH","ACD","ACE","ACF","ACG","ACH","ADE","ADF","ADG","ADH","AEF","AEG","AEH","AFG","AFH","AGH","BCD","BCE","BCF","BCG","BCH","BDE","BDF","BDG","BDH","BEF","BEG","BEH","BFG","BFH","BGH","CDE","CDF","CDG","CDH","CEF","CEG","CEH","CFG","CFH","CGH","DEF","DEG","DEH","DFG","DFH","DGH","EFG","EFH","EGH","FGH"]
λ>
λ> length combis38
56
λ>

So functionally at least, the idea of using multisets to generate combinations works.

jpmarinier
  • 4,087
  • 1
  • 9
  • 22
2

There is no such operation in lib, but you can easily implement it yourself:

import Data.List

main = putStrLn $ show $ myOp 2 [1, 2, 3]

myOp :: Int -> [a] -> [[a]]
myOp 0 _ = []
myOp 1 l = map (:[]) l
myOp c l = concat $ map f $ tails l
    where
        f :: [a] -> [[a]]
        f []     = []
        f (x:xs) = map (x:) $ myOp (c - 1) xs
talex
  • 16,886
  • 2
  • 27
  • 60
2

@melpomene's answer is generic and very concise. That's probably what you see in many places over the internet where a combinationsOf function is required.

However hidden behind the double recursion it does tons of needless recursive calls those can be avoided, yielding a much efficient code. i.e we don't need to make any calls if the length of the list is shorter than k.

I would propose a double termination check.

combinationsOf :: Int -> [a] -> [[a]]
combinationsOf k xs = runner n k xs
                      where
                      n = length xs
                      runner :: Int -> Int -> [a] -> [[a]]
                      runner n' k' xs'@(y:ys) = if k' < n'      -- k' < length of the list
                                                then if k' == 1
                                                     then map pure xs'
                                                     else map (y:) (runner (n'-1) (k'-1) ys) ++ runner (n'-1) k' ys
                                                else pure xs'   -- k' == length of the list.

λ> length $ subsets 10 [0..19] -- taken from https://stackoverflow.com/a/52602906/4543207
184756
(1.32 secs, 615,926,240 bytes)

λ> length $ combinationsOf 10 [0..19]
184756
(0.45 secs, 326,960,528 bytes)

So the above code, while optimized as much as possible, is still inefficient mainly due to double recursion from within. As a rule of thumb, in any algorithm a double recursions should best be avoided or be considered under very careful inspection.

The following algorithm on the other hand, is a very efficient way of doing this job both on speed and memory consumption.

combinationsOf :: Int -> [a] -> [[a]]
combinationsOf k as@(x:xs) | k == 1    = map pure as
                           | k == l    = pure as
                           | k >  l    = []
                           | otherwise = run (l-1) (k-1) as $ combinationsOf (k-1) xs
                             where
                             l = length as

                             run :: Int -> Int -> [a] -> [[a]] -> [[a]]
                             run n k ys cs | n == k    = map (ys ++) cs
                                           | otherwise = map (q:) cs ++ run (n-1) k qs (drop dc cs)
                                           where
                                           (q:qs) = take (n-k+1) ys
                                           dc     = product [(n-k+1)..(n-1)] `div` product [1..(k-1)]

λ> length $ combinationsOf 10 [0..19]
184756
(0.09 secs, 51,126,448 bytes)
Redu
  • 22,595
  • 5
  • 50
  • 67
  • With the second code block in GHCi, `combinationsOf 3 [1 .. 3] == [1, 2, 3, 2, 3, 3]`, which is incorrect, and `combinationsOf 4 [1 .. 3]` froze my machine, I think due to swapping from memory exhaustion. – Chai T. Rex May 14 '21 at 21:42
  • @Chai T. Rex Thank you for heads up. I tried to fix it. However this is not a code for combinations with repetitions so the updated code won't allow `k > l`. – Redu May 15 '21 at 13:09
0

Monadic solution for unique combinations:

cb _ 0 = [[]]
cb xs n = (nxs >>= (\(nx, x) -> (x:) <$> (cb [z | (n,z) <- nxs, n>nx] (n-1)) )) where nxs = zip [1..] xs
Evg
  • 2,698
  • 4
  • 38
  • 56