1

Now I have a shapefile describing the sergeant area of Texas. I opened it in R using read_sf() and get very weird results:

sgt_area <- read_sf("shapefiles/THP_SGTArea_outlines_20200422.shp")
sgt_area$geometry
Geometry set for 213 features 
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -11871800 ymin: 2978934 xmax: -10409240 ymax: 4369693
CRS:           NA
First 5 geometries:
MULTIPOLYGON (((-11103211 3540951, -11102945 35...
MULTIPOLYGON (((-10981244 3470857, -10981233 34...
MULTIPOLYGON (((-10944063 3451077, -10944060 34...
MULTIPOLYGON (((-10999089 3463854, -10996024 34...
MULTIPOLYGON (((-11066709 3387272, -11066686 33...

I want to simply convert the coordinates into a normal form (i.e. with crs = ESPG:4326 or long/lat).

I tried many ways but failed all the time.

  1. sgt_area <- st_transform(sgt_area, crs = 4326)
Error in st_transform.sfc(st_geometry(x), crs, ...) : 
  cannot transform sfc object with missing crs
  1. sgt_area <- st_transform(sgt_area, st_crs(county_sf)) # county_sf is the correct shapefile of counties Texas and did nothing.
Error in st_transform.sfc(st_geometry(x), crs, ...) : 
  cannot transform sfc object with missing crs
  1. I opened it in QGIS and did reproject layer setting crs to NAD83. But the reprojected layer didn't move as expected. I exported the data set and opened it in R.
sgt
Simple feature collection with 213 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -11871800 ymin: 2978934 xmax: -10409240 ymax: 4369693
Geodetic CRS:  NAD83
# A tibble: 213 × 9
   REGION DISTRICT SGT_AREA SGT_AREA_N COUNTY  DISTR…¹ Shape…² Shape…³                  geometry
   <chr>  <chr>    <chr>    <chr>      <chr>   <chr>     <dbl>   <dbl>        <MULTIPOLYGON [°]>
 1 6      6C       6C07     6C07       Bander… San An… 1.03e10 518477. (((-11103211 3540951, -1…
 2 6      6C       6C09     6C09       Comal   San An… 1.99e 9 230062. (((-10981244 3470857, -1…
 3 6      6C       6C03     6C03       Guadal… San An… 2.46e 9 282335. (((-10944063 3451077, -1…
 4 6      6C       6C01     6C01       Bexar,… San An… 6.60e 9 471434. (((-10999089 3463854, -1…
 5 6      6C       6C05     6C05       Medina  San An… 4.57e 9 299504. (((-11066709 3387272, -1…
 6 6      6C       6C02     6C02       Karnes… San An… 5.30e 9 325761. (((-10861961 3351903, -1…
 7 6      6C       6C04     6C04       Atasco… San An… 8.03e 9 409839. (((-10998912 3387155, -1…
 8 6      6C       6C06     6C06       Frio    San An… 3.84e 9 249471. (((-11066709 3387272, -1…
 9 1      1D       1D01     1D01       Red Ri… Mt. Pl… 5.54e 9 530189. (((-10592824 4020348, -1…
10 1      1D       1D05     1D05       Delta,… Mt. Pl… 7.90e 9 541725. (((-10641676 4020961, -1…
# … with 203 more rows, and abbreviated variable names ¹​DISTRICT_N, ²​Shape_STAr, ³​Shape_STLe
# ℹ Use `print(n = ...)` to see more rows

still didn't work.

  1. using the method from here
sgt <- readOGR("shapefiles/THP_SGTArea_outlines_20200422.shp") #from rdgal
> proj4string(sgt)
[1] NA
> proj4string(sgt) <- CRS("+init=epsg:4269") # I wanted to specify the crs epsg:4269
Error in `proj4string<-`(`*tmp*`, value = new("CRS", projargs = "+proj=longlat +datum=NAD83 +no_defs")) : 
  Geographical CRS given to non-conformant data: -11871802.3935   4369693.2565

Have you encountered this kind of problem?

Zeyu Wang
  • 13
  • 3
  • 1
    When you open a shapefile in QGIS - is the location of the data normal? You may need to assign projection before reprojecting. See this QGIS tool: https://docs.qgis.org/3.28/en/docs/user_manual/processing_algs/qgis/vectorgeneral.html#qgisassignprojection – Comrade Che Feb 15 '23 at 04:57
  • Thank you. No, the location was weird and the shape was over 100000 times bigger than the correct one. – Zeyu Wang Feb 15 '23 at 05:10
  • I assigned projection twice (ESPG:4326/4269) but it did nothing, the shape is still in the wrong location. It is really weird. – Zeyu Wang Feb 15 '23 at 05:14
  • 1
    Do you know for sure what CRS the data is supposed to be in? From the coordinate values, my guess is that you are looking at a projected system, not a geographic system like EPSG:4269 or 4326. – ChloeG Feb 15 '23 at 07:18
  • 3
    You assigned the wrong projection. 4236 and 4269 are lat-long coordinates, and these aren't lat-long numbers. You need to assign it with the correct projection for those numbers and then you can transform it. – Spacedman Feb 15 '23 at 07:28
  • If you don't know the exact projection, you can guess it with this QGIS tool: https://docs.qgis.org/3.28/en/docs/user_manual/processing_algs/qgis/vectorgeneral.html#find-projection – Comrade Che Feb 15 '23 at 07:55
  • Thanks for your help, guys. You are all insanely helpful! – Zeyu Wang Feb 15 '23 at 15:49

1 Answers1

5

This is enough to see the problem:

CRS:           NA
First 5 geometries:
MULTIPOLYGON (((-11103211 3540951, -11102945 35...
MULTIPOLYGON (((-10981244 3470857, -10981233 34...

The CRS is "NA" and the numbers are very large. A missing or invalid CRS will often get treated as a lat-long degrees system, and this clearly isn't that from the size of those numbers. The missing value is the reason most of your transform attempts fail, so we must deal with this first.

Did the .shp file come with a .prj file? Because if it did, there's a problem reading it, and we'll need to see it to help. If it didn't, you should try and get it, and then re-reading the file should set the CRS to something other than NA.

If you can't get the .prj file then we have to tell R what the coordinate system is, using st_crs(sgt_area) <- some value. Once you've done that then you can transform to another system, such as one based on latitude and longitude degree values.

So which CRS is it, without a .prj component of the shapefile? This is guesswork, helped by the online "projfinder" system which lets you input coordinates in an unknown system and gives you a map for you to say where you think these points should be and then searches all the coordinate systems it knows for possibilities. I tried this, but I can't find "Sergeant" in Texas, the nearest spelling being a "Sargent" SW of Houston. None of projfinder's coordinate systems are close to this, but if I zoom out I can see that if these numbers are in the standard "Google map" projection (EPSG:3857) then they would be in NW San Antonio. Since this is a common CRS I'll continue with this assumption, but really you need to get the correct CRS from the data supplier in a .prj file component of the shapefile (you should also have .dbf and .shx components).

So the next steps are:

st_crs(sgt_area) <- 3857

At this point the object has a valid coordinate system identifier and plotting it, eg with the mapview package or tmap in web mode should have at least part of it in Texas. If you need it in lat-long WGS84 coordinates then:

sgt_area_wgs = st_transform(sgt_area, 4326)

will do that.

If this is in San Antonio when it should be in Sargent (near Houston) then the 3857 assumption is wrong, but projfinder scans thousands of CRS possibilities and this seems the best it could find to me (but I don't know where the data is really supposed to be). The best solution for you is to find the .prj file that the .shp file should have come with.

Spacedman
  • 63,755
  • 5
  • 81
  • 115
  • THANK YOU!!! You are amazing! I am sorry I could've given out more information and you really did everything correctly! Unspeakable thanks!! – Zeyu Wang Feb 15 '23 at 15:47