
I'm working with a point dataset that sometimes contains (almost) repeated features:

data_id year    event_type  assoc_actor_1
5270361 2000    Riots   Madinka Ethnic Group
5270361 2000    Riots   Balanta Ethnic Group

As they share the same id, I'm looking for a way to select just one of those redundant features (any of them). It can be via the expression builder or Python.

I've seen that set(list) does the trick, but how can I integrate that in selectByExpression()?

In the expression builder, count_distinct() identifies unique ids, but how could I use it to select features?

In "Getting list of distinct values from shapefile field using QGIS?", the user is looking to retrieve a text file with the results. I wanna do it... graphically (?).

pkry

1 Answer


I'm sure there is an easier way, but if not:

You need a unique id field for this to work: start editing, add an integer field for unique values, and calculate it using @row_number (see "Filling column with consecutive numbers in QGIS?"). Then:

import processing  # usually already available in the QGIS Python console

layer = iface.activeLayer()

field_with_duplicates = 'LANSKOD'  # Change
unique_id_field = 'uniqueID'  # Change

fieldlist = [field_with_duplicates, unique_id_field]

# Create a list of lists, each sublist looking like [someduplicatevalue, uniqueidvalue]
all_rows = [[feat[f] for f in fieldlist] for feat in layer.getFeatures()]

# A dictionary cannot have duplicate keys (key is field_with_duplicates, value is
# unique_id_field), so only the last unique id seen for each duplicated value is kept
d = {key: value for key, value in all_rows}

# Join the ids explicitly: str(tuple) gives "(5,)" for a single id, which is not a valid expression
id_list = ', '.join(str(v) for v in d.values())

processing.run("qgis:selectbyexpression", {'INPUT': layer,
                'EXPRESSION': '"{0}" IN ({1})'.format(unique_id_field, id_list), 'METHOD': 0})
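
The deduplication step can also be tried outside QGIS. Below is a minimal sketch in plain Python, assuming the rows have already been read as (duplicated value, unique id) pairs. The helper names `first_id_per_value` and `build_in_expression` and the sample ids are made up for illustration and are not part of the QGIS API. Unlike the dictionary comprehension, this variant keeps the first id seen for each value rather than the last.

```python
def first_id_per_value(rows):
    """Return one unique id per duplicated value (the first one seen).

    rows: iterable of (duplicated_value, unique_id) pairs.
    """
    keep = {}
    for dup_value, uid in rows:
        # setdefault only stores uid if dup_value has not been seen yet
        keep.setdefault(dup_value, uid)
    return list(keep.values())


def build_in_expression(field, ids):
    """Build an expression string of the form "field" IN (1, 2, 3)."""
    id_list = ", ".join(str(i) for i in ids)
    return '"{}" IN ({})'.format(field, id_list)


# Hypothetical sample: two features share data_id 5270361
rows = [(5270361, 1), (5270361, 2), (5270362, 3)]
ids = first_id_per_value(rows)
expression = build_in_expression("uniqueID", ids)
print(expression)  # "uniqueID" IN (1, 3)
```

The resulting string can then be passed as the 'EXPRESSION' parameter in place of the tuple-based one.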


BERA