
I have a small problem with a Python exercise. The task is as follows:

"In NLP, stop words are commonly used words like 'a', 'is', and 'the'. They are typically filtered out during processing.

Implement a function that takes a string `text` and an integer `k`, and returns the list of words that occur in the text at least `k` times. The words must be returned in the order of their first occurrence in the text."

And here's my code:

#!/bin/python3

import math
import os
import random
import re
import sys

def stopWords(text, k):
    stop_words = ['and','fox','jumps','over','dog','runs','away','to','a','house','lazy','quick']
    text = text.split()
    text = [word for word in text if word not in stop_words]
    text = [word for word in text if len(word) > k]
    return text

if __name__ == '__main__':
    fptr = open(os.environ['OUTPUT_PATH'], 'w')

    text = input()

    k = int(input().strip())

    result = stopWords(text, k)

    fptr.write('\n'.join(result))
    fptr.write('\n')

    fptr.close()

Here is my input:

Input
text = the quick brown fox jumps over the lazy dog runs away a brown house
k = 2

I want output like this:

Output:
the
brown

But my result is:

Output:
the
brown
the
brown
brown

How can I fix it?

woyanas
  • Hi. Welcome to StackOverflow. You presented your desired output, and your actual output, but what's your input? For the sake of the question, please replace `text = input()` with `text = 'Some hardcoded text here so we can all have the same text'` and `k = int(input().strip())` with `k = 3` (or `k = 15` or whatever - again, a hardcoded integer value so we can all have the same value rather than depend on user input). – Stef May 05 '22 at 14:33
  • Does this answer your question? [Removing duplicates in lists](https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists) – Stef May 05 '22 at 14:36
  • Sorry, wrong link. I meant, does this answer your question? [How do you remove duplicates from a list whilst preserving order?](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order) – Stef May 05 '22 at 14:38
  • @Stef: sorry I forgot to add input. I've added the input below. – woyanas May 06 '22 at 07:30
  • Did you look at the question I linked? Take function `f7` from the accepted answer, and add `text = f7(text)` in your `stopWords` function. – Stef May 06 '22 at 08:35
  • Or you can combine the three lines into one: `seen = set(); text = [word for word in text if len(word) > k and word not in stop_words and not (word in seen or seen.add(word))]` – Stef May 06 '22 at 08:37
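
For reference, here is a minimal sketch of what the task statement seems to ask for, combined with the order-preserving deduplication suggested in the comments. It assumes that "occurs at least k times" means counting whole whitespace-separated words, so it drops the hard-coded `stop_words` list and the `len(word) > k` check from the question. Reusing the name `stopWords` is only a guess at what the test harness expects.

```python
from collections import Counter


def stopWords(text, k):
    # Split on whitespace; assumes the input has no punctuation to strip.
    words = text.split()
    # Count how many times each word occurs in the text.
    counts = Counter(words)

    seen = set()
    result = []
    for word in words:
        # Keep a word only on its first occurrence, and only if it is frequent enough.
        if counts[word] >= k and word not in seen:
            seen.add(word)
            result.append(word)
    return result


sample = "the quick brown fox jumps over the lazy dog runs away a brown house"
print(stopWords(sample, 2))  # ['the', 'brown']
```

With the input from the question and `k = 2`, only `the` and `brown` occur twice, so they are returned once each, in the order they first appear.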
