I am wondering what rule is used to truncate a field name to the 10-character limit of ESRI shapefiles when an object is saved as a shapefile with the writeOGR function in R. For example, the attribute field "Ablepharus kitaibelii" from a spatial data frame becomes "Ablphrk" when saved as a shapefile. How is the truncation done?

MrXsquared
Teuz
  • Which software is that? – Taras Apr 02 '20 at 07:51
  • QGIS perhaps follows the GDAL method (https://gdal.org/drivers/vector/shapefile.html): "Starting with version 1.7, the OGR Shapefile driver tries to generate unique field names. Successive duplicate field names, including those created by truncation to 10 characters, will be truncated to 8 characters and appended with a serial number from 1 to 99." This software must be something else. – user30184 Apr 02 '20 at 08:14
  • It is not a limitation at the software level but rather a hard-and-fast limitation of the shapefile format, due to dbf (dBase V). Many software packages account for this known limitation at the code level to avoid crashes or to output a rational truncation. – Jeffrey Evans Apr 02 '20 at 14:03
  • This is worth re-opening because the name-munging has to be done by the code creating the shapefile, and different codes might use different algorithms. Certainly ogr2ogr creates different names to R's st_write. – Spacedman Apr 02 '20 at 16:05

1 Answer

sf::st_write in R uses the abbreviate function from R's base package to create unique names of the right length for a shapefile.

If I have a spatial object with these two long names (plus "geometry"):

> names(p)
[1] "longnamehereplease" "longnamehereaswell" "geometry"          

then writing them gives:

> st_write(p,"p.shp")
Writing layer `p' to data source `p.shp' using driver `ESRI Shapefile'
Writing 10 features with 2 fields and geometry type Point.
Warning message:
In abbreviate_shapefile_names(obj) :
  Field names abbreviated for ESRI Shapefile driver

and the names in the shapefile will be:

> abbreviate(names(p)[1:2], minlength=5)
longnamehereplease longnamehereaswell 
        "lngnmhrp"         "lngnmhrs" 

Note that it's the writing program's job to truncate the field names, and different software does it differently. For example, given a GeoPackage with the same long names, the well-known GDAL/OGR conversion tool ogr2ogr reports the following when converting to shapefile:

$ ogr2ogr short.shp p.gpkg 
Warning 6: Normalized/laundered field name: 'longnamehereplease' to 'longnamehe'
Warning 6: Normalized/laundered field name: 'longnamehereaswell' to 'longname_1'

Note that the laundered field names are different from the ones created by R.
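To make the difference concrete, here is a minimal sketch of a truncate-then-deduplicate scheme that reproduces the two ogr2ogr names shown above. It is written in Python purely for illustration and is not GDAL's actual code: the function name `launder_field_names` is hypothetical, the `_N` / `NN` suffix format is an assumption inferred from the 'longname_1' output and the driver documentation quoted in the comments, and details such as case-insensitive comparison are ignored.

```python
def launder_field_names(names, width=10):
    """Hypothetical sketch: truncate names to `width` characters and make
    duplicates unique with a serial number from 1 to 99.

    This mimics the behaviour visible in the ogr2ogr warnings; it is an
    assumption, not a port of the GDAL source.
    """
    used = set()
    result = []
    for name in names:
        candidate = name[:width]
        if candidate in used:
            stem = name[:width - 2]  # leave room for a two-character suffix
            for serial in range(1, 100):
                # Assumed suffix format: "_1".."_9", then "10".."99"
                suffix = f"_{serial}" if serial < 10 else str(serial)
                candidate = stem + suffix
                if candidate not in used:
                    break
            else:
                raise ValueError(f"no unique short name left for {name!r}")
        used.add(candidate)
        result.append(candidate)
    return result


print(launder_field_names(["longnamehereplease", "longnamehereaswell"]))
# ['longnamehe', 'longname_1']
```

By contrast, R's abbreviate drops characters from inside the name (lower-case vowels first) rather than cutting off the end, which is why the two tools produce different field names from the same input.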

Spacedman