I'm new to QGIS (version 3.4.15) and typically use ArcGIS. I'm attempting to change attribute values, and also add a field, for a collection of shapefiles based on certain conditions; the largest has approximately 4 million features. I have translated my ArcPy code into PyQGIS and have clearly done something wrong: it usually runs in an hour or so with ArcPy, but doesn't finish after days with PyQGIS, though I can see it is running through the code.

Below is some sample code, though there are typically about 20 if conditions within each loop.

from qgis.core import QgsVectorLayer, QgsField
from qgis.PyQt.QtCore import QVariant

roads1 = input_dir+"/roads1.shp"
roads2 = input_dir+"/roads2.shp"
roads = [roads1, roads2]
for road in roads:
    # edit max speeds
    layer = QgsVectorLayer(road,"roads","ogr")
    features = layer.getFeatures()
    layer.startEditing()
    for f in features:
        if f['fclass'] == "footway":
            id = f.id()
            layer.changeAttributeValue(id,6,6)
        if f['fclass'] == "steps":
            id = f.id()
            layer.changeAttributeValue(id,6,2)
        if f['fclass'] == "track":
            id = f.id()
            layer.changeAttributeValue(id,6,32)
        if f['maxspeed'] == 0:
            if f['fclass'] == "living_street":
                id = f.id()
                layer.changeAttributeValue(id,6,48)
            if f['fclass'] == "motorway":
                id = f.id()
                layer.changeAttributeValue(id,6,112)
    layer.commitChanges()
    # add in hierarchies
    layer_provider = layer.dataProvider()
    layer_provider.addAttributes([QgsField("Hierarchy",QVariant.Double)])
    layer.updateFields()
    features = layer.getFeatures()
    layer.startEditing()
    for f in features:
        if f['fclass'] == "footway":
            id = f.id()
            layer.changeAttributeValue(id,10,1)
        if f['fclass'] == "steps":
            id = f.id()
            layer.changeAttributeValue(id,10,2)
    layer.commitChanges()
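
Since the real script has about 20 conditions per loop, the chain of if statements could be expressed as lookup tables instead. The following is only a sketch of that idea in plain Python: the names `ALWAYS`, `WHEN_ZERO` and `new_max_speed` are hypothetical, and the values are copied from the sample code above. It does not address the run time by itself; it only makes the rules easier to extend.

```python
# Overrides applied regardless of the current maxspeed (values from the sample code).
ALWAYS = {"footway": 6, "steps": 2, "track": 32}

# Overrides applied only when the current maxspeed is 0.
WHEN_ZERO = {"living_street": 48, "motorway": 112}


def new_max_speed(fclass, maxspeed):
    """Return the replacement speed, or None if the feature should be left alone.

    Hypothetical helper: `fclass` and `maxspeed` are the attribute values
    read from one feature, e.g. f['fclass'] and f['maxspeed'].
    """
    if fclass in ALWAYS:
        return ALWAYS[fclass]
    if maxspeed == 0:
        # dict.get returns None for classes that have no rule.
        return WHEN_ZERO.get(fclass)
    return None
```

Inside the feature loop this would replace the if blocks with a single call, followed by `layer.changeAttributeValue(f.id(), 6, value)` only when the returned value is not None.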

EDIT: I discovered that when running the code on a smaller shapefile without calling layer.commitChanges() at the end, the code runs in 22 seconds. When I do call layer.commitChanges() at the end, it takes about 30 minutes. If I understand correctly, commitChanges() writes to the underlying data. Is this an I/O issue?

Nik-D
  • Look at this answer: https://gis.stackexchange.com/questions/200997/is-there-a-faster-process-to-update-one-column-for-all-features/215464#215464 You should build a dictionary of values to be updated (e.g., attrsMap) and only at the end call the changeAttributeValues(attrsMap). – Germán Carrillo Mar 03 '21 at 13:39
  • Thanks for the input! But I've found the issue is coming from the .commitChanges() command at the end. – Nik-D Mar 03 '21 at 16:00
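
The dictionary approach suggested in the first comment can be sketched roughly as follows. The map has the shape `{feature_id: {field_index: value}}` and is passed to the data provider's `changeAttributeValues()` in one call, which writes through the provider directly rather than through the layer's edit buffer, so no `startEditing()`/`commitChanges()` pair is involved. The helper names `build_attr_map` and `apply_attr_map` are hypothetical, the field index 10 is copied from the question, and whether this is actually faster on a 4-million-feature shapefile is untested here.

```python
# Hierarchy values per road class, taken from the question's second loop.
HIERARCHY = {"footway": 1, "steps": 2}


def build_attr_map(rows, field_index, values_by_class):
    """Build {feature_id: {field_index: value}} for rows that match a rule.

    `rows` is any iterable of (feature_id, fclass) pairs, for example
    ((f.id(), f['fclass']) for f in layer.getFeatures()).
    """
    attr_map = {}
    for fid, fclass in rows:
        value = values_by_class.get(fclass)
        if value is not None:
            attr_map[fid] = {field_index: value}
    return attr_map


def apply_attr_map(provider, attr_map):
    """Send all collected changes to the provider in a single call.

    `provider` is expected to be layer.dataProvider(); its
    changeAttributeValues() returns True on success.
    """
    return provider.changeAttributeValues(attr_map)


# Usage inside QGIS (not run here, assumes `layer` is a valid QgsVectorLayer):
# rows = ((f.id(), f["fclass"]) for f in layer.getFeatures())
# ok = apply_attr_map(layer.dataProvider(), build_attr_map(rows, 10, HIERARCHY))
```

Keeping the map-building step free of QGIS calls means it can be checked on a small list of pairs before it is pointed at the full layer.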
