
I have read a lot of questions related to this problem, but nothing seems to apply to this case. In Google Search Console, the sitemap for a WordPress website using Yoast is reported as "couldn't fetch". No other information is provided.

I found similar questions everywhere, like Google Search Console Sitemap Can't Fetch.

The typical answer is "just wait". I have been waiting for many months now and nothing has changed. I have also re-submitted many times, again with no effect.

I also tried "URL inspection": "No: 'noindex' detected in 'X-Robots-Tag' http header" is returned, which I understand is correct for a sitemap, and "Page fetch" is reported as successful. So Google can fetch from there, but not from the sitemap tag.

I also tried creating a minimal XML sitemap to debug the issue, but "couldn't fetch" remains in every case. I tried other online services to check the URL, and every one reports the sitemap as valid. Bing has no problem with the same sitemap and indexed the site properly.
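For reference, the minimal test sitemap was roughly a single-entry urlset like this (example.com stands in for the real, redacted domain):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
                <loc>https://example.com/</loc>
        </url>
</urlset>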

Any other ideas that could help me understand the problem? Thanks.

EDIT: This is an example of a GET call, with metadata, for the sitemap index:

*   Trying 82.60.120.226:443...
* Connected to ... (...) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=...
*  start date: Jun 22 08:17:08 2022 GMT
*  expire date: Sep 20 08:17:07 2022 GMT
*  subjectAltName: host "..." matched cert's "..."
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
> GET /sitemap_index.xml HTTP/1.1
> Host: ...
> User-Agent: curl/7.84.0
> Accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.23.0
< Date: Thu, 21 Jul 2022 09:29:33 GMT
< Content-Type: text/xml; charset=UTF-8
< Content-Length: 822
< Connection: keep-alive
< X-Powered-By: PHP/7.4.30
< X-Robots-Tag: noindex, follow
< Vary: Accept-Encoding
< Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< 
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="//.../wp-content/plugins/wordpress-seo/css/main-sitemap.xsl"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
                <loc>https://.../post-sitemap.xml</loc>
                <lastmod>2022-04-10T09:44:18+00:00</lastmod>
        </sitemap>
        <sitemap>
                <loc>https://.../page-sitemap.xml</loc>
                <lastmod>2022-03-06T10:57:39+00:00</lastmod>
        </sitemap>
        <sitemap>
                <loc>https://.../category-sitemap.xml</loc>
                <lastmod>2022-04-10T09:44:18+00:00</lastmod>
        </sitemap>
        <sitemap>
                <loc>https://.../post_tag-sitemap.xml</loc>
                <lastmod>2022-04-10T09:44:18+00:00</lastmod>
        </sitemap>
</sitemapindex>
* Connection #0 to host ... left intact
<!-- XML Sitemap generated by Yoast SEO -->

Metadata for the included sitemaps does not seem to differ much; the content type is always text/xml.
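For completeness, the check on the child sitemaps can be done the same way, fetching only the response headers of each file listed in the index (example.com is again a placeholder for the redacted domain):

# print only the Content-Type header of each child sitemap;
# -D - dumps the response headers, -o /dev/null discards the body
for s in post page category post_tag; do
        curl -s -D - -o /dev/null "https://example.com/${s}-sitemap.xml" \
                | grep -i '^content-type'
done

Each of them reports text/xml, the same as the index.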

Luca Carlon
  • Yoast typically produces multiple XML sitemaps and a sitemap index. It sounds like you are submitting the sitemap index. Have you tried submitting the other individual sitemaps as well? Have you tried listing your sitemaps in robots.txt? – Stephen Ostermiller Jul 19 '22 at 19:50
  • I tried with a minimal sitemap of just one entry with the same result. The sitemap is already in robots.txt. – Luca Carlon Jul 20 '22 at 06:34
  • Also tried to add the sitemaps included in the index independently. Still "couldn't fetch". – Luca Carlon Jul 21 '22 at 05:46
  • Can you check the metadata that is served with the sitemaps? What HTTP headers are served with them? Specifically, I'm interested in the content type header. – Stephen Ostermiller Jul 21 '22 at 07:26
  • Good question. I added some more data to my question. Content type appears to be text/xml. – Luca Carlon Jul 21 '22 at 09:39
  • I don't think there is anything wrong with that. I'm out of ideas. The only other thing that I can tell you is that it probably doesn't matter. Sitemaps have little impact on your site and your SEO. Googlebot will be able to crawl your WordPress site just fine without them. At best they could give you a bit of extra information in Search Console. See The Sitemap Paradox – Stephen Ostermiller Jul 21 '22 at 09:58
  • Actually, what I see is that many pages are simply not indexed, and I have to index them manually. Is it possible there is some conflict with another domain that redirects to this one and is also still in GSC? Thanks for your help anyway though. – Luca Carlon Jul 21 '22 at 10:05
  • There are many possible reasons. See Why aren't search engines indexing my content? Another domain redirect to yours or sitemap problems are not usually reasons that Google wouldn't index your pages. – Stephen Ostermiller Jul 21 '22 at 10:31

1 Answer


I found this to be a common error/bug that happens randomly. There has been a large thread about it on the Google Support forum since 2019.

A possible workaround is to add another slash after the domain:

e.g.: https://store.usbswiper.com//sitemap_index.xml
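Before resubmitting, it may be worth confirming that the double-slash URL actually serves the same document as the normal one. A quick sketch of such a check, assuming a bash shell and with example.com standing in for your domain:

# compare the sitemap served with one slash and with two slashes
diff <(curl -s "https://example.com/sitemap_index.xml") \
     <(curl -s "https://example.com//sitemap_index.xml") \
  && echo "both URLs serve identical content"

If the contents match, submit the double-slash variant in Search Console and wait for the next fetch attempt.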

Mariana