I have a sorted list of float values, and I want to group the values that lie within a certain tolerance of each other (a very basic kind of clustering). I start at list[0], collect the values within the tolerance of it, delete all of those elements, and then continue from the first value after the last one collected.
The toy sample below works, but on a long real-life list the code produces junk, apparently picking values at unpredictable positions in the list. I suspect this happens because I'm deleting list members while traversing the list. I tried to work around that by deleting the indices in reverse sorted order, but still no luck.
Can someone suggest a better solution to this problem?
a = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 3.0, 3.1]
tolerance = 0.40
while len(a) != 0:
    r = a[0]
    li = [i for i, v in enumerate(a) if abs(v - r) <= tolerance]
    print(str(r) + "<>" + ", ".join(str(a[x]) for x in li) + "\n")
    for z in sorted(li, reverse=True):
        del a[z]
Output:
1.0<>1.0, 1.1, 1.2, 1.3, 1.4
1.5<>1.5
2.0<>2.0, 2.1, 2.2
3.0<>3.0, 3.1
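For comparison, here is a minimal sketch of the same grouping done in a single pass, building new lists instead of deleting from the original. It assumes each group is anchored at its first (smallest) value, as in the loop above. `group_by_tolerance` is a hypothetical helper name, and the `sorted()` call is only a safeguard in case the input is not actually sorted.

```python
def group_by_tolerance(values, tolerance):
    """Group values so each group lies within `tolerance` of its first element."""
    groups = []
    for v in sorted(values):  # sorted() returns a new list; the input is untouched
        # Start a new group when v is too far from the current group's first value.
        if not groups or abs(v - groups[-1][0]) > tolerance:
            groups.append([v])
        else:
            groups[-1].append(v)
    return groups


a = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 3.0, 3.1]
for group in group_by_tolerance(a, 0.40):
    print(str(group[0]) + "<>" + ", ".join(str(v) for v in group))
```

On the toy list this gives the same four groups as the output above, and since nothing is deleted, the indices can never shift during the traversal.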