Googlebot to soon crawl over HTTP/2
Quick summary: Starting November 2020, Googlebot will start crawling some sites over HTTP/2.
Ever since mainstream browsers started supporting the next major revision of HTTP, HTTP/2 or h2 for short, web professionals asked us whether Googlebot can crawl over the upgraded, more modern version of the protocol.
Today we’re announcing that, starting mid-November 2020, Googlebot will support crawling over HTTP/2 for select sites.
Table of Contents
- What is HTTP/2
- Why we’re making this change
- How it works
- How to opt out
- Questions that we thought you might ask
  - Why are you upgrading Googlebot now?
  - Do I need to upgrade my server ASAP?
  - How do I test if my site supports h2?
  - How do I upgrade my site to h2?
  - How do I convince Googlebot to talk h2 with my site?
  - Why are you not crawling every h2-enabled site over h2?
  - How do I know if my site is crawled over h2?
  - Does Googlebot support plaintext HTTP/2 (h2c)?
  - Is Googlebot going to use the ALPN extension to decide which protocol version to use for crawling?
  - How will different h2 features help with crawling?
  - Will Googlebot crawl more or faster over h2?
  - Is there any ranking benefit for a site in being crawled over h2?
What is HTTP/2
As we said, it’s the next major version of HTTP, the protocol the internet primarily uses for transferring data. HTTP/2 is much more robust, efficient, and faster than its predecessor, due to its architecture and the features it implements for clients (for example, your browser) and servers. If you want to read more about it, we have a long article on the HTTP/2 topic on developers.google.com.
Why we’re making this change
In general, we expect this change to make crawling more efficient in terms of server resource usage. With h2, Googlebot is able to open a single TCP connection to the server and efficiently transfer multiple files over it in parallel, instead of requiring multiple connections. The fewer connections open, the fewer resources the server and Googlebot have to spend on crawling.
How it works
In the first phase, we’ll crawl a small number of sites over h2, and we’ll ramp up gradually to more sites that may benefit from the initially supported features, like request multiplexing.
Googlebot decides which sites to crawl over h2 based on whether the site supports h2, and whether the site and Googlebot would benefit from crawling over HTTP/2. If your server supports h2 and Googlebot already crawls a lot from your site, you may already be eligible for the connection upgrade, and you don’t have to do anything.
If your server still only talks HTTP/1.1, that’s also fine. There’s no explicit drawback to crawling over this protocol; crawling will remain the same in both quality and quantity.
How to opt out
Our preliminary tests showed no issues or negative impact on indexing, but we understand that, for various reasons, you may want to opt your site out of crawling over HTTP/2. You can do that by instructing the server to respond with a 421 HTTP status code when Googlebot attempts to crawl your site over h2. If that’s not feasible at the moment, you can send a message to the Googlebot team (however, this solution is temporary).
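As a rough illustration of the opt-out rule, the sketch below picks a status code for an incoming request. The function name `status_for_crawl_request`, the `"HTTP/2"` prefix check, and the user-agent substring match are all assumptions made for this example. How your server exposes the protocol version, and whether the rule belongs in application code or in server configuration, depends on your setup.

```python
from http import HTTPStatus


def status_for_crawl_request(protocol: str, user_agent: str) -> int:
    """Return 421 for Googlebot requests made over HTTP/2, otherwise 200.

    Both parameters are hypothetical inputs: `protocol` is whatever string
    your server reports for the request (for example "HTTP/2.0"), and
    `user_agent` is the raw User-Agent header value.
    """
    is_h2 = protocol.upper().startswith("HTTP/2")
    is_googlebot = "googlebot" in user_agent.lower()
    if is_h2 and is_googlebot:
        # 421 (Misdirected Request) is the opt-out signal for h2 crawling.
        return HTTPStatus.MISDIRECTED_REQUEST.value
    return HTTPStatus.OK.value
```

Requests from other clients, and Googlebot requests over HTTP/1.1, are left untouched in this sketch, so only h2 crawling is declined.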
Questions that we thought you might ask
Why are you upgrading Googlebot now?
The software we use to enable Googlebot to crawl over h2 has matured enough that it can be used in production.
Do I need to upgrade my server ASAP?
It’s really up to you. However, we will only switch to crawling over h2 for sites that support it and will clearly benefit from it. If there’s no clear benefit to crawling over h2, Googlebot will continue to crawl over h1 (HTTP/1.1).
How do I test if my site supports h2?
Cloudflare has a blog post with a plethora of different methods to test whether a site supports h2; check it out!
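One more way to check, sketched here with only Python’s standard library, is to open a TLS connection, offer both `h2` and `http/1.1` through ALPN, and see which one the server selects. The function names `negotiated_protocol` and `supports_h2` are made up for this example, and the result only tells you what the server negotiates with this particular client; it says nothing about whether Googlebot will choose to crawl over h2.

```python
import socket
import ssl
from typing import Optional


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Open a TLS connection and return the protocol the server chose via ALPN."""
    context = ssl.create_default_context()
    # Offer both protocols; the server selects one (or none) during the handshake.
    context.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def supports_h2(selected: Optional[str]) -> bool:
    """Interpret the ALPN result: only "h2" means HTTP/2 over TLS."""
    return selected == "h2"
```

Calling `supports_h2(negotiated_protocol("example.com"))`, with your own hostname in place of the placeholder, returns `True` when the server picks h2. A result of `None` from `negotiated_protocol` means the server did not select any ALPN protocol.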
How do I upgrade my site to h2?
This really depends on your server. We recommend talking to your server administrator or hosting provider.
How do I convince Googlebot to talk h2 with my site?
You can’t. If the site supports h2, it is eligible for being crawled over h2, but only if that would be beneficial for the site and Googlebot. If, for example, crawling over h2 would not result in noticeable resource savings, we would simply continue to crawl the site over HTTP/1.1.
Why are you not crawling every h2-enabled site over h2?
In our evaluations we found little to no benefit for certain sites (for example, those with very low QPS, that is, queries per second) when crawling over h2. Therefore we have decided to switch crawling to h2 only when there’s a clear benefit for the site. We’ll continue to evaluate the performance gains and may change our criteria for switching in the future.
How do I know if my site is crawled over h2?
When a site becomes eligible for crawling over h2, the owners of that site registered in Search Console will get a message saying that some of the crawling traffic may be over h2 going forward. You can also check in your server logs (for example, in the access.log file if your site runs on Apache).
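For the server-log route, a small script can tally which protocol versions appear on requests whose user agent mentions Googlebot. The sketch below assumes the log uses the common "combined" layout, where the quoted request line ends with the protocol (for example `HTTP/2.0`). The function name and the sample lines are invented for illustration, and matching on the user-agent string alone does not verify that a request really came from Googlebot.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the quoted request line, e.g. "GET /page HTTP/2.0", and captures the protocol.
REQUEST_LINE = re.compile(r'"[A-Z]+ \S+ (HTTP/[\d.]+)"')


def googlebot_protocol_counts(lines: Iterable[str]) -> Counter:
    """Count protocol versions on log lines whose text mentions Googlebot."""
    counts: Counter = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue  # skip requests from other clients
        match = REQUEST_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines (documentation IP range), for illustration only.
GOOGLEBOT_UA = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
sample = [
    '192.0.2.10 - - [20/Nov/2020:10:00:00 +0000] "GET / HTTP/2.0" 200 5120 "-" ' + GOOGLEBOT_UA,
    '192.0.2.10 - - [20/Nov/2020:10:00:01 +0000] "GET /about HTTP/2.0" 200 2048 "-" ' + GOOGLEBOT_UA,
    '192.0.2.11 - - [20/Nov/2020:10:05:00 +0000] "GET /old HTTP/1.1" 200 1024 "-" ' + GOOGLEBOT_UA,
    '192.0.2.12 - - [20/Nov/2020:10:06:00 +0000] "GET / HTTP/2.0" 200 5120 "-" "Mozilla/5.0 Firefox/82.0"',
]
sample_counts = googlebot_protocol_counts(sample)
```

To run it against a real file, pass an open file object instead of the sample list, for example `googlebot_protocol_counts(open("access.log"))`, adjusting the path to wherever your server writes its log.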
Which h2 features are supported by Googlebot?
Googlebot supports most of the features introduced by h2. Some features like server push, which may be beneficial for rendering, are still being evaluated.
Does Googlebot support plaintext HTTP/2 (h2c)?
No. Your website must use HTTPS and support HTTP/2 in order to be eligible for crawling over HTTP/2. This is equivalent to how modern browsers handle it.
Is Googlebot going to use the ALPN extension to decide which protocol version to use for crawling?
Application-layer protocol negotiation (ALPN) will only be used for sites that are opted in to crawling over h2, and the only accepted protocol for responses will be h2. If the server responds during the TLS handshake with a protocol version other than h2, Googlebot will back off and come back later on HTTP/1.1.
How will different h2 features help with crawling?
Some of the most prominent benefits of h2 include:
- Multiplexing and concurrency: Fewer TCP connections open means fewer resources spent.
- Header compression: Drastically reduced HTTP header sizes will save resources.
- Server push: This feature is not yet enabled; it’s still in the evaluation phase. It may be beneficial for rendering, but we don’t have anything specific to say about it at this point.
If you want to know more about specific h2 features and their relation to crawling, ask us on Twitter.
Will Googlebot crawl more or faster over h2?
The primary benefit of h2 is resource savings, both on the server side and on Googlebot’s side. Whether we crawl using h1 or h2 does not affect how your site is indexed, and hence it does not affect how much we plan to crawl from your site.