My Issue

According to the docs, gdalwarp supports many more resampling methods than gdal_translate. Can anybody explain why the resampling methods can't be used with both tools? Alternatively, can you explain how to achieve the same result?

I have a large VRT that I'm trying to crop into a raster with a lower resolution covering a smaller geographic region. gdal_translate has the '-projwin' option, which allows me to select a smaller region without reading the entire dataset.
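For a concrete sketch (the output name crop.tif is a placeholder of mine), the crop-only call would look like the command below; it is printed rather than executed so it can be inspected or piped to sh. One gotcha worth noting: -projwin takes ulx uly lrx lry, i.e. the upper-left corner first, unlike gdalwarp's -te xmin ymin xmax ymax:

```shell
# Assemble the crop-only gdal_translate call. -projwin takes
# ulx uly lrx lry (upper-left corner first), unlike gdalwarp's -te.
ULX=-122.4396; ULY=47.6830; LRX=-122.1892; LRY=47.5189
echo "gdal_translate -projwin $ULX $ULY $LRX $LRY dem.vrt crop.tif"
```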

I see that gdalwarp has the '-te' option, which would give me a similar result, except that it looks like it tries to read the entire dataset into memory prior to cropping it to the extent that I specify (ERROR 1: Integer overflow).

I would be happy using either method as long as the result for a small region is produced reasonably quickly.

Things I've tried

  1. gdal_translate -r max … - fails with "GDAL_RASTERIO_RESAMPLING = max not supported".
  2. gdalwarp -te xmin ymin xmax ymax … - fails with "ERROR 1: Integer overflow".
  3. A combination of gdal_translate to crop the image without downsampling, followed by gdalwarp to downsample. While this works, the two operations together take too long for my use case.
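Spelled out, step 3 looks like the two commands below (crop.tif and out.tif are placeholder names of mine, and the window matches the -te extent in my commands; printed rather than executed so they can be inspected or piped to sh):

```shell
# Step 1: crop at native resolution with gdal_translate (no resampling yet).
echo "gdal_translate -projwin -122.4396 47.6830 -122.1892 47.5189 dem.vrt crop.tif"
# Step 2: downsample the small crop with gdalwarp, where -r max is accepted.
echo "gdalwarp -r max -ts 10 10 -overwrite crop.tif out.tif"
```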

Exact commands tried

gdalwarp -ts 10 10 -te -122.4396 47.5189 -122.1892 47.6830 dem.vrt test_wc.tif -overwrite --config GDAL_CACHEMAX 100 -wm 100 -multi -wo NUM_THREADS=ALL_CPUS

What can I do to use the max resampling method while cropping a large dataset?

gdalinfo output

Driver: VRT/Virtual Raster
Files: dem.vrt
       LongFileList…
Size is 1296003, 421203
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]]
Origin = (-180.000416666666666,61.000416666666666)
Pixel Size = (0.000277777777778,-0.000277777777778)
Corner Coordinates:
Upper Left  (-180.0004167,  61.0004167) (180d 0' 1.50"W, 61d 0' 1.50"N)
Lower Left  (-180.0004167, -56.0004167) (180d 0' 1.50"W, 56d 0' 1.50"S)
Upper Right ( 180.0004167,  61.0004167) (180d 0' 1.50"E, 61d 0' 1.50"N)
Lower Right ( 180.0004167, -56.0004167) (180d 0' 1.50"E, 56d 0' 1.50"S)
Center      (  -0.0000000,   2.5000000) (  0d 0' 0.00"W,  2d30' 0.00"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
  NoData Value=-9999999
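As a sanity check against the pixel size above, the -te window from my commands is tiny at native resolution (coordinates copied from the commands above, arithmetic done with awk):

```shell
# Width/height of the -te window in pixels at the native
# 0.000277777777778-degree (about 1/3600 deg) resolution.
awk 'BEGIN {
  px = 0.000277777777778
  printf "%.0f x %.0f pixels\n",
         (-122.1892 - (-122.4396)) / px,
         (47.6830 - 47.5189) / px
}'
```

So the request covers roughly 901 x 591 source pixels out of 1296003 x 421203, which is why I'd expect a windowed read to be cheap.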
  • Try GDALWarp (https://www.gdal.org/gdalwarp.html): max is offered as a resampling method, and it also supports -cutline, -te xmin ymin xmax ymax, -te_srs srs_def, and -tr xres yres to change the pixel size. Are your -te coordinates in projected coordinates or cell coordinates? -te expects the numbers to be expressed in the same units as the dataset's SRS, but you can use -te_srs to supply a different spatial reference; neither is in cells. – Michael Stimson Oct 10 '18 at 00:27
  • I've done this with a ~1,000,000 x 500,000 VRT using gdalwarp and I never had memory issues. I am fairly confident (though not 100%) that gdalwarp does not read the entire raster into memory; however, if your underlying rasters (that you used to build the VRT) or the VRT itself have row- or column-based blocks rather than tiles, gdalwarp will have to read in much more data than it should. See https://stackoverflow.com/questions/41742162/gdal-readasarray-for-vrt-extremely-slow/45745848#45745848 for how blocksize might be playing a role. – Jon Oct 10 '18 at 03:59
  • Add gdalinfo about your source raster. Some image formats and variants do not support reading just a region of interest. – user30184 Oct 10 '18 at 05:07
  • @user30184 I added my gdalinfo output to the question. I did remove the file list, though, because of the large number of files; the individual files are all GeoTIFF. – Braden Roseborough Oct 10 '18 at 23:04
  • @MichaelStimson I'm using -te with the same units in my source SRS (which also matches my target SRS) – Braden Roseborough Oct 10 '18 at 23:10
  • Can you give an example of the command you have used with GDALWarp, please? – Michael Stimson Oct 10 '18 at 23:40
  • Kind of similar problem in https://gis.stackexchange.com/questions/292632/gdal-warp-with-huge-tif-integer-overflow and referred http://osgeo-org.1560.x6.nabble.com/gdal-dev-gdalwarp-crash-td5036575.html. It seems that for some reason your gdalwarp command tries to allocate more than 2 GB of memory but that should not normally happen. – user30184 Oct 11 '18 at 06:11
  • @MichaelStimson I added the commands that I tried. For the gdalwarp integer overflow, I think it may be something with the target size option. Changing the size from '10 10' to something a little larger, '500 0', removes the integer overflow error, but it takes upwards of 30 minutes to produce a result. – Braden Roseborough Oct 11 '18 at 18:13
  • Are you sure you want max resampling? Note that this does not mean "take the maximum value from any input dataset", as I learned some time ago. Not sure this applies but figured I'd point it out – mikewatt Oct 11 '18 at 18:17
  • @gberard thanks for checking with me on this. I read through the link you posted and I believe I'm using the resampling option correctly. I'm not trying to take the maximum value from multiple datasets, instead I'm reducing the resolution of the output but trying to retain the maximum value within the area covered by that pixel. – Braden Roseborough Oct 11 '18 at 19:14
  • @Jon I looked into what you mentioned. It looks like my vrt BlockSize is 128x128 but when I looked at the underlying rasters, the block size was something like 1201x1. As a result I think you are on the right trail, I'm probably reading entire rows of the underlying data. I'm going to take the time to re-process the source files with -co TILED=YES to see if that reduces my warp time. – Braden Roseborough Oct 11 '18 at 19:17
  • @BradenRoseborough 1201 x 1 isn't too bad, actually, but if it's not too much trouble give re-tiling the inputs a shot! You can write an easy script and let it run overnight if you have a ton, and you could also do it in parallel. I'm guessing that this isn't your problem, though, because you'd need something like 12,000 x 1 before it became an issue (I'd guess). – Jon Oct 11 '18 at 22:24

0 Answers