I have a sorted list of float values and want to group the values that lie within a certain tolerance of each other (a very basic kind of clustering). I start at list[0], collect the values within the tolerance of it, delete all of those elements, and then repeat from the element that follows the last value collected.

The toy sample below works, but on a long real-life list of numbers the code produces junk, apparently picking values at unpredictable positions in the list. I suspect this happens because I'm deleting list elements while traversing the list. I tried to work around it by deleting the indices in reverse sorted order, but still no luck.

Can someone suggest a better solution to this problem?

a = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 3.0, 3.1]
tolerance = 0.40
while len(a) != 0:
    r = a[0]  # reference value: first remaining element
    li = [i for i, v in enumerate(a) if abs(v - r) <= tolerance]
    print(str(r) + "<>" + ", ".join(str(a[x]) for x in li) + "\n")
    # delete from the highest index down so the remaining indices stay valid
    for z in sorted(li, reverse=True):
        del a[z]

Output:

1.0<>1.0, 1.1, 1.2, 1.3, 1.4

1.5<>1.5

2.0<>2.0, 2.1, 2.2

3.0<>3.0, 3.1
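
For comparison, here is a minimal sketch of the same grouping done in a single pass, without deleting anything from the list while scanning it. The function name `group_within_tolerance` is made up for illustration, and the sketch assumes each group is anchored at its first (smallest) value, as in the loop above. It sorts a copy of the input first, in case the real data is not actually sorted.

```python
def group_within_tolerance(values, tolerance):
    """Group values so each group spans at most `tolerance` from its first value.

    Hypothetical helper: builds new lists instead of mutating the input.
    """
    groups = []
    for v in sorted(values):  # sorted() returns a new list; input is untouched
        # groups[-1][0] is the anchor (smallest value) of the current group
        if groups and abs(v - groups[-1][0]) <= tolerance:
            groups[-1].append(v)
        else:
            groups.append([v])  # v is too far from the anchor: start a new group
    return groups


a = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 3.0, 3.1]
for group in group_within_tolerance(a, 0.40):
    print(str(group[0]) + "<>" + ", ".join(str(v) for v in group))
```

On the toy list this yields the same four groups as the output above. Values that sit right at the tolerance boundary may land on either side because of floating-point rounding, so a slightly padded tolerance might be worth considering for real data.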
The August