Abusing Googlebot Services to Deliver Crypto-Mining Malware

While investigating a recent threat campaign, F5 researchers encountered a strange behavior in which malicious requests originated from legitimate Googlebot servers. This relatively infrequent behavior could have serious consequences in environments where the trust level given to Googlebot influences an organization’s security decisions.

The Trust Paradox

Google’s official support site advises site owners to “make sure Googlebot is not blocked”1 and provides instructions for verifying that Googlebot is real.2 Both imply that trusting Googlebot traffic is all but mandatory if you want your site to show up in Google search results. Many vendors rely on the legitimacy of Googlebot traffic and allowlist requests coming from genuine Googlebot servers. This means that malicious requests coming from Googlebot can bypass some security mechanisms without being inspected for content and potentially deliver malicious payloads. On the other hand, if an organization’s mitigation mechanism automatically denylists IP addresses that deliver malicious requests, that organization could easily be tricked into blocking Googlebot, which may lead to a lower ranking in Google’s search results.
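
Google’s verification instructions boil down to a reverse DNS lookup on the requesting IP address, followed by a forward lookup that must return the same address. Below is a minimal sketch of that check in Python, using only the standard library; the sample IP address is illustrative, taken from Googlebot’s published range.

import socket

def is_real_googlebot(ip_address):
    """Return True if ip_address resolves back to a Google crawler hostname."""
    try:
        # Reverse DNS: a genuine crawler IP maps to *.googlebot.com or *.google.com.
        hostname, _, _ = socket.gethostbyaddr(ip_address)
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward DNS: the hostname must resolve back to the original IP;
        # a reverse (PTR) record alone could simply be spoofed.
        _, _, addresses = socket.gethostbyname_ex(hostname)
        return ip_address in addresses
    except (socket.herror, socket.gaierror):
        return False

print(is_real_googlebot("66.249.66.1"))  # illustrative address from Googlebot's range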

Was Googlebot Hijacked?

After verifying that the requests we received on our threat intelligence system came from real Googlebot servers, we started looking into how an attacker could create such a scenario. There are a couple of ways this could happen. One would be by controlling a Googlebot server, which seemed highly unlikely. Another would be by sending a fake User-Agent from another Google service. But since the requests originated from a Googlebot subdomain and Googlebot’s IP address pool, and not from a different Google service (like Google Sites, for example), this possibility was also ruled out. That left us with the most likely scenario: the service was being abused.

How Does the Googlebot Crawling Service Work?

Essentially, Googlebot follows every new or updated link on your website, then follows the links on those pages, and so on. This allows Google to add pages previously unknown to its search engine database, analyze websites, and later make them available to users of its search engine. Technically, “following links” means sending a GET request to every URL listed in the links on the website. So Googlebot servers generate requests based on links they do not control and, as it seems, do not validate.
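
In other words, the crawl amounts to little more than the loop sketched below. This is a simplified illustration, not Googlebot’s actual implementation; the starting URL is hypothetical, and a real crawler adds scheduling, politeness delays, and robots.txt handling on top of this.

from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(page_url):
    # Fetch the page and extract every link it contains.
    html = urlopen(page_url).read().decode("utf-8", errors="replace")
    parser = LinkExtractor()
    parser.feed(html)
    for link in parser.links:
        target = urljoin(page_url, link)
        # The crawler has no say over what these URLs contain --
        # it simply sends a GET request to each of them.
        urlopen(target)

crawl("http://example.com/")  # hypothetical starting page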

Tricking Googlebot

Given that Googlebot follows links, attackers figured out a simple method to trick Googlebot into sending malicious requests to arbitrary targets. An attacker can add malicious links to a website, each composed of the target’s address and a relevant attack payload. Here is a generic example of such a malicious link:

<a href="http://victim-address.com/exploit-payload">malicious link</a>

When Googlebot crawls through the page with this link, it follows the link and sends a malicious GET request holding exploit-payload to the attacker’s target of choice, in this case, victim-address.com.
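
In the victim’s web server log, such a request looks like any other crawl. A hypothetical reconstruction of what the target might receive is shown below; the path and host come from the example link above, and the User-Agent is Googlebot’s published string.

GET /exploit-payload HTTP/1.1
Host: victim-address.com
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)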


