Merklemap Crawler

Overview

The Merklemap crawler is an automated robot that browses the web to gather publicly available data about website security configurations. Its purpose is to assess the overall adoption of security best practices across a representative sample of websites.

How it Works

The Merklemap crawler uses standard web crawling techniques to visit a large number of public websites across the internet. As it crawls each site, it passively observes and records security-related configurations, such as:

  • Use of HTTPS vs non-secure HTTP
  • Presence of security-related HTTP headers
  • JavaScript library versions referenced in public code
  • Caching and content security policies
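The observations above can be sketched as a small record per visited page. The function and field names below are illustrative assumptions, not Merklemap's actual implementation; the point is that the check is purely passive, reading only the URL scheme and the response headers the server already sent:

```python
# Hypothetical sketch of passive observation: no probing, no extra requests,
# just facts recoverable from a single ordinary response.
from urllib.parse import urlparse

# A few common security-related headers (assumed list, easily extended).
SECURITY_HEADERS = [
    "strict-transport-security",
    "content-security-policy",
    "x-content-type-options",
    "x-frame-options",
]

def observe_site(url: str, response_headers: dict) -> dict:
    """Record security-relevant facts from one response."""
    # Header names are case-insensitive, so normalize before matching.
    headers = {k.lower(): v for k, v in response_headers.items()}
    return {
        "https": urlparse(url).scheme == "https",
        "headers_present": [h for h in SECURITY_HEADERS if h in headers],
    }
```

For example, a response served over HTTPS with a `Strict-Transport-Security` header would yield `{"https": True, "headers_present": ["strict-transport-security"]}`.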

The crawler aggregates and anonymizes data to identify trends and patterns in the implementation of common security settings across websites. This allows it to gauge the general state of web security hygiene.
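Aggregation of this kind might look like the following sketch. It assumes each crawl visit yields a small record such as `{"https": True, "headers_present": ["strict-transport-security"]}`; the output contains only counts and rates, so no individual site is identifiable:

```python
# Hypothetical aggregation step: per-site records go in, only anonymous
# adoption statistics come out. URLs are never stored in the report.
from collections import Counter

def aggregate(observations: list) -> dict:
    """Collapse per-site observations into overall adoption rates."""
    total = len(observations)
    header_counts = Counter()
    https_count = 0
    for obs in observations:
        https_count += obs["https"]              # True counts as 1
        header_counts.update(obs["headers_present"])
    return {
        "sites": total,
        "https_rate": https_count / total,
        "header_adoption": {h: c / total for h, c in header_counts.items()},
    }
```

A report over two observed sites, one with HSTS over HTTPS and one plain-HTTP site, would show an HTTPS rate of 0.5 and 50% HSTS adoption, with neither site named.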

Privacy and Ethics

The Merklemap crawler is designed to gather only publicly available information and does not circumvent any access controls or authentication. It does not attempt to actively probe for vulnerabilities or conduct any unauthorized testing.

All data collected is anonymized and aggregated before use in security analysis reporting. No data on individual websites is retained or published.

Controlling Crawling with Robots.txt

Website owners can control how the Merklemap crawler interacts with their site using the robots.txt file. To exclude the Merklemap crawler from your website, add the following line to your robots.txt:

User-agent: MerklemapBot
Disallow: /

These directives instruct the Merklemap crawler not to crawl any pages on the domain. If no rules are specified for the MerklemapBot user agent, the crawler falls back to the rules specified for Googlebot, or failing that, to the generic * user agent.
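A quick way to verify the rules above is Python's standard-library robots.txt parser. This is a sketch of how any compliant crawler would check a URL before fetching it, using the MerklemapBot user-agent name from this page:

```python
# Check the robots.txt rules from the docs with the stdlib parser.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: MerklemapBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler checks every URL before requesting it.
allowed = rp.can_fetch("MerklemapBot", "https://example.com/page")
print(allowed)  # False: MerklemapBot is excluded from the whole site
```

With this robots.txt, `can_fetch` returns False for MerklemapBot while other user agents, which match no rule group here, remain allowed.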
