13

Could anyone please recommend a good frequent itemset package in Python? I only need to find frequent itemsets; I don't need the association rules.

Pluviophile
Edamame
  • In my personal experience, R's apriori and FP-growth implementations are much better than their Python alternatives. So, if you're open to considering R, you should try them :) – Dawny33 Mar 09 '17 at 06:09

3 Answers

11

I also recommend the MLXtend library for finding frequent itemsets.

Usage example:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

# One-hot encode the transactions: one boolean column per distinct item
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Keep itemsets that occur in at least 10% of the transactions
frequent_itemsets = apriori(df, min_support=0.1, use_colnames=True)

print(frequent_itemsets)
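
To make it clearer what min_support is filtering on, here is a minimal sketch in plain Python that counts support by brute-force enumeration on the same toy data. It is not how MLXtend implements apriori; the function name `frequent_itemsets_bruteforce` and the `max_len` parameter are made up for illustration, and the approach is exponential in transaction length, so it only suits small examples.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets_bruteforce(transactions, min_support, max_len=None):
    """Return {frozenset(items): support} for itemsets with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    Hypothetical helper for illustration only; not part of any library.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # drop repeated items within a transaction
        limit = len(items) if max_len is None else min(max_len, len(items))
        # Count every non-empty subset of this transaction once
        for k in range(1, limit + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

result = frequent_itemsets_bruteforce(dataset, min_support=0.6)

# Print the most frequent itemsets first
for itemset, support in sorted(result.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), support)
```

With min_support=0.6, an itemset must appear in at least 3 of the 5 transactions to be kept. Raising the threshold shrinks the result, while a low value such as 0.1 keeps every itemset that occurs at all in this five-row dataset.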

MoAdel
  • This package raises a memory error when there are too many distinct items. Not recommended for big data. – Snow Jul 07 '20 at 12:40
4

The Orange3-Associate package provides a frequent_itemsets() function based on the FP-growth algorithm.

K3---rnc
3

The MLXtend library has been really useful for me. Its documentation includes an Apriori implementation that outputs the frequent itemsets.

Please check the first example available at http://rasbt.github.io/mlxtend/user_guide/frequent_patterns/apriori/.

tbnsilveira