3

I apologize for the extremely vapid question, but I am trying to teach myself statistics and don't know of too many resources. through google searches I've found how to count combinations, how to count permutations, and several other things, but what I am looking for is somewhat specific and I haven't been able to find the formula.

If I have a set X = {a,b,c,d} I want to find how many lists (without repetition) can exists, when size of list can be <= size of X

eg {a},{b},{c},{d},{ab},{abc},......{dcba}

Again, I apologize for such a benign question, but I've lost count of all of the ways I've seen to count without finding what I'm looking for!

EDIT: In response to the comment left, {abcd} and {dcba} are both different lists (same set, but different lists). I am basically wanting to know if there is a formula for calculating combinations when the size of combination (which I've read is a list in this context) can be less than or equal to the entire set size. I know combination with replacement is (X size)^(possibilities) and combination without replacement is (X size)! but both of these examples are when the list size is the same as the set size.

Furthermore, empty collection isn't important, I could just add or subtract 1 as need be. And as for the connection: being able to count combinations is an integral part of statistics. It's like how integrals are, well, like integral to calculus :) But specifically, I am creating my own decision tree algorithm and want to know the total possible combinations of features that can be strung together. Hope this clarifies things! Thanks again

MORE EDIT: To those who put my question on hold. Please look at the last paragraph of my previous edit. Being able to count is the first chapter of all probability and statistics books. And specific to my needs, I need to know how many combinations of features I am looking at. As I am constantly changing and looking at new features it would be nice to have a formula that says I have x features so there are y different combinations.

  • 2
    Please tell us what a "list" is. For instance (in light of your use of the [tag:subset] tag) would "{dcba}" and "{acdb}" be considered the same or different lists? Would an empty collection {} be considered a list? And what is the connection between this question and statistics or machine learning? – whuber Jan 18 '15 at 16:00
  • Re whether this question is on topic: Although we use math a lot and like to answer math questions when they have clear statistical interest, this is not the math site. Questions that are only about mathematics really belong there. For a longer explanation of these guidelines, please see http://meta.stats.stackexchange.com/a/2352. – whuber Jan 19 '15 at 01:47
  • I'd regard counting combinations and permutations as part of "how to calculate probability" and so on topic, ... but it looks like few people would agree with me. If you flag your post and ask for it to be moved, it might be acceptable at math.SE (though I'd correct the spelling error in the title first). – Glen_b Jan 19 '15 at 17:49

1 Answers1

3

When order doesn't matter, your problem would be that of finding the cardinality of the power set of $S$, for $S=\{a,b,c,d\}$, which would be $2^S$.

When order matters, you're looking at counting permutations of each distinct subset; which would be the sum (over $k$) of the number of $k$-permutations of $n$.

Which is to say $a_n=\sum_{k=0}^n \frac{n!}{(n-k)!}$ (if we include the empty set; otherwise the sum runs from 1).

n     a(n) = sum of number of 
      k-permutations of n

0         1 
1         2 
2         5 
3        16 
4        65
5       326
6      1957
...

Another way to write this is $a_n=n!(\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+...+\frac{1}{n!})$ (note there's a nice upper bound $a_n< n!(\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+...) = n!\cdot e$).

As $n$ becomes large, the difference between the sum of the first $n$ terms inside the parentheses and the full series $\Sigma \frac{_1}{^{k!}}$ in the upper bound becomes quite small (at such a pace that even when multiplying by $n!$ it still decreases), and it turns out that for $n>1$ you can simply take $\lfloor n!\, e\rfloor$ to yield the same result as the count, $a_n$.

Edit: This sequence is A000522 in the On-Line Encyclopedia of Integer Sequences.

It refers to it as the number of arrangements of any subset of $n$ distinct objects. Note that at that page (the sequence's wiki page) they mention the (fairly obvious) approximation I found earlier ($n!\,e$). It also mentions a nice recurrence relationship ($a_n=n\,a_{n-1}+1$).

Glen_b
  • 282,281