Keep Python from Providing Duplicate Outputs

Question

My goal is to generate as many outputs as possible from a set of lists, but without duplicate outputs. The code below is a small sample of what I'm working on. With my larger data set, duplicate outputs plague the CSV file when the script runs, so I was wondering: is there a way to prevent duplicates while keeping the high range values (100, 250, 400, etc.)?

import random
Saying = ["I Like"]

Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

for x in range(10):
    One = random.choice(Saying)
    Two = random.choice(Food)
    Three = random.choice(Holiday)
    print(f'{One} {Two} {Three}')

Thanks for the help!

Possible duplicate of [All combinations of a list of lists](https://stackoverflow.com/questions/798854/all-combinations-of-a-list-of-lists) — DSC, Jul 02 '19 at 07:56
Possible duplicate of [how to sample from cartesian product without repetition in python?](https://stackoverflow.com/questions/48686767/how-to-sample-from-cartesian-product-without-repetition-in-python) — Georgy, Jul 02 '19 at 08:35

Afik Friedberg · Answer 1 · 2019-07-02T08:00:18.787

0

You can keep a set of the outputs you have already seen and check each new output against it. A membership check on a set is O(1) on average.
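A minimal sketch of the set-based idea, using the lists from the question. The helper name unique_phrases and the cap on the requested count are assumptions added for illustration. The cap matters because only len(Saying) * len(Food) * len(Holiday) distinct phrases exist, so asking for more would loop forever.

```python
import random

Saying = ["I Like"]
Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']


def unique_phrases(count):
    """Hypothetical helper: draw random phrases, skipping any already seen."""
    # Only this many distinct phrases exist, so never ask for more.
    limit = len(Saying) * len(Food) * len(Holiday)
    count = min(count, limit)

    seen = set()      # O(1) average membership checks
    phrases = []      # keeps the order in which phrases were drawn
    while len(phrases) < count:
        phrase = f'{random.choice(Saying)} {random.choice(Food)} {random.choice(Holiday)}'
        if phrase not in seen:
            seen.add(phrase)
            phrases.append(phrase)
    return phrases


for phrase in unique_phrases(10):
    print(phrase)
```

As the requested count approaches the limit, most random draws are repeats and get thrown away, so this works best when you need only a fraction of the possible combinations.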

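Another option is to shuffle your list and pop elements from it one at a time:

import random

lst = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']  # example data
random.shuffle(lst)  # shuffles the list in place

while lst:
    element = lst.pop()  # each element is taken exactly once
    print(element)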

edited Jul 02 '19 at 08:00

answered Jul 02 '19 at 07:55

Afik Friedberg


Still remains inefficient since you'll create many duplicates – DSC Jul 02 '19 at 07:56

score 0 · Answer 2 · answered Jul 02 '19 at 08:13

You can use np.random.choice with the parameter replace=False. The size argument sets how many samples to draw; with replace=False it cannot exceed the length of the list.

import numpy as np

Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

np.random.choice(Food, size=4, replace=False)
>>> array(['Avocado', 'Coffee', 'Pineapples', 'Bacon'], dtype='<U10')

np.random.choice(Holiday, size=4, replace=False)
>>> array(['on April Fools', 'on the 4th of July', 'during Autumn',
       'on Christmas'], dtype='<U18')
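The calls above sample each list on its own. To get unique whole phrases, one possibility is to draw distinct indices into the full set of combinations. This is a sketch under that assumption; building the combinations with itertools.product and the sample size of 10 are illustrative choices, not part of the original answer.

```python
import itertools

import numpy as np

Saying = ["I Like"]
Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

# Every possible (saying, food, holiday) combination: 1 * 4 * 4 = 16.
combos = list(itertools.product(Saying, Food, Holiday))

# Draw 10 distinct indices; replace=False guarantees no index repeats.
indices = np.random.choice(len(combos), size=10, replace=False)

phrases = [' '.join(combos[i]) for i in indices]
for phrase in phrases:
    print(phrase)
```

Because the indices are unique, the phrases are unique too, and no draws are wasted on repeats.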

score 0 · Accepted Answer · answered Jul 02 '19 at 08:17

The issue is that your bot (I guess?) has no memory of what the outputs have been so far, so there's no way for the code you have to detect a repeat.

Try this instead:

import random
Saying = ["I Like"]

Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

memory=[]
done = False

while not done:
    One = random.choice(Saying)
    Two = random.choice(Food)
    Three = random.choice(Holiday)

    if f'{One} {Two} {Three}' not in memory:
        memory.append(f'{One} {Two} {Three}')
        if len(memory) == 10:
            done = True

for item in memory:
    print(item)

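So now, instead of taking 10 potshots at creating 10 phrases, we take as many as it takes to create 10 different ones. Note that only 16 distinct phrases exist here (1 x 4 x 4), so the loop would never finish if the target were set above 16.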

Thanks everyone! I really appreciate the help from this community for looking into my question. It works, yay! — Doug Morrow, Jul 02 '19 at 16:50

score 0 · Answer 4 · answered Jul 02 '19 at 08:33

You can generate random output without repeats as follows.

First, create a list, permutations, that holds the Cartesian product of the lists to be combined.

permutations = list(itertools.product(*Statements))
## Example - [('I Like', 'Coffee', 'on the 4th of July'), ('I Like', 'Coffee', 'on April Fools'), ('I Like', 'Coffee', 'during Autumn'), ('I Like', 'Coffee', 'on Christmas')]

Then pick an element from permutations by choosing a random index, and print it.

num = int(random.random() * total_elements)
print('{} {} {}'.format(*permutations[num]))

Next, remove that element from permutations so it cannot be picked again.

del permutations[num]

Complete code:

import itertools, random
Saying = ["I Like"]

Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

Statements = [Saying, Food, Holiday]

permutations = list(itertools.product(*Statements))

random.seed()

total_elements = len(Saying) * len(Food) * len(Holiday)

while total_elements > 0:
    num = int(random.random() * total_elements)
    print('{} {} {}'.format(*permutations[num]))
    del permutations[num]
    total_elements = total_elements - 1
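The complete code above picks and deletes one item per iteration. As a shorter alternative (not part of the original answer), the standard library's random.sample picks k distinct items from a sequence in one call. In this sketch k = 10 is an illustrative value; it must not exceed the 16 available combinations, or random.sample raises ValueError.

```python
import itertools
import random

Saying = ["I Like"]
Food = ['Coffee', 'Pineapples', 'Avocado', 'Bacon']
Holiday = ['on the 4th of July', 'on April Fools', 'during Autumn', 'on Christmas']

# Build every combination once, then sample without replacement.
combos = list(itertools.product(Saying, Food, Holiday))

k = 10  # illustrative count; must be <= len(combos)
picked = random.sample(combos, k)

for saying, food, holiday in picked:
    print(f'{saying} {food} {holiday}')
```

Setting k = len(combos) returns every combination exactly once in random order, which matches what the loop above prints.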
