Weighs the soul of incoming HTTP requests to stop AI crawlers
## Anubis has the ability to let you import snippets of configuration into the main
## configuration file. This allows you to break up your config into smaller parts
## that get logically assembled into one big file.
##
## Of note, a bot rule can either have inline bot configuration or import a
## bot config snippet. You cannot do both in a single bot rule.
##
## Import paths are either prefixed with (data), which imports from the common/shared
## rules in the data folder in the Anubis source tree, or point to absolute/relative
## paths in your filesystem. If you don't have access to the Anubis source tree, check
## /usr/share/docs/anubis/data or the tarball you extracted Anubis from.

bots:
  # Pathological bots to deny
  - # This correlates to data/bots/deny-pathological.yaml in the source tree
    # https://github.com/TecharoHQ/anubis/blob/main/data/bots/deny-pathological.yaml
    import: (data)/bots/_deny-pathological.yaml
  - import: (data)/bots/aggressive-brazilian-scrapers.yaml

  # Aggressively block AI/LLM related bots/agents by default
  - import: (data)/meta/ai-block-aggressive.yaml

  # Consider replacing the aggressive AI policy with more selective policies:
  # - import: (data)/meta/ai-block-moderate.yaml
  # - import: (data)/meta/ai-block-permissive.yaml

  # Search engine crawlers to allow, defaults to:
  #   - Google (so they don't try to bypass Anubis)
  #   - Apple
  #   - Bing
  #   - DuckDuckGo
  #   - Qwant
  #   - The Internet Archive
  #   - Kagi
  #   - Marginalia
  #   - Mojeek
  - import: (data)/crawlers/_allow-good.yaml
  # Challenge Firefox AI previews
  - import: (data)/clients/x-firefox-ai.yaml

  # Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
  - import: (data)/common/keep-internet-working.yaml

  # # Punish any bot with "bot" or "crawler" in the user-agent string
  # # This is known to have a high false-positive rate, use at your own risk
  # - name: generic-bot-catchall
  #   user_agent_regex: (?i:bot|crawler)
  #   action: CHALLENGE
  #   challenge:
  #     difficulty: 16 # impossible
  #     report_as: 4 # lie to the operator
  #     algorithm: slow # intentionally waste CPU cycles and time

  # Requires a subscription to Thoth to use, see
  # https://anubis.techaro.lol/docs/admin/thoth#geoip-based-filtering
  - name: countries-with-aggressive-scrapers
    action: WEIGH
    geoip:
      countries:
        - BR
        - CN
    weight:
      adjust: 10

  # Requires a subscription to Thoth to use, see
  # https://anubis.techaro.lol/docs/admin/thoth#asn-based-filtering
  - name: aggressive-asns-without-functional-abuse-contact
    action: WEIGH
    asns:
      match:
        - 13335 # Cloudflare
        - 136907 # Huawei Cloud
        - 45102 # Alibaba Cloud
    weight:
      adjust: 10

  ## System load based checks.
  # If the system is under high load, add weight.
  - name: high-load-average
    action: WEIGH
    expression: load_1m >= 10.0 # make sure to end the load comparison in a .0
    weight:
      adjust: 20

  ## If your backend service is running on the same operating system as Anubis,
  ## you can uncomment this rule to make the challenge easier when the system is
  ## under low load.
  ##
  ## If the system is not under high load, remove weight.
  # - name: low-load-average
  #   action: WEIGH
  #   expression: load_15m <= 4.0 # make sure to end the load comparison in a .0
  #   weight:
  #     adjust: -10

  # Generic catchall rule
  - name: generic-browser
    user_agent_regex: >-
      Mozilla|Opera
    action: WEIGH
    weight:
      adjust: 10

dnsbl: false

# #
# impressum:
#   # Displayed at the bottom of every page rendered by Anubis.
#   footer: >-
#     This website is hosted by Zombocom. If you have any complaints or notes
#     about the service, please contact
#     <a href="mailto:contact@domainhere.example">contact@domainhere.example</a>
#     and we will assist you as soon as possible.

#   # The imprint page that will be linked to at the footer of every Anubis page.
#   page:
#     # The HTML <title> of the page
#     title: Imprint and Privacy Policy
#     # The HTML contents of the page. The exact contents of this page can
#     # and will vary by locale. Please consult with a lawyer if you are not
#     # sure what to put here
#     body: >-
#       <p>Last updated: June 2025</p>

#       <h2>Information that is gathered from visitors</h2>

#       <p>In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.</p>

#       <p>Cookies may be used to remember visitor preferences when interacting with the website.</p>

#       <p>Where registration is required, the visitor's email and a username will be stored on the server.</p>

#       <!-- ... -->

# Open Graph passthrough configuration, see here for more information:
# https://anubis.techaro.lol/docs/admin/configuration/open-graph/
openGraph:
  # Enables Open Graph passthrough
  enabled: false
  # Enables the use of the HTTP host in the cache key, this enables
  # caching metadata for multiple http hosts at once.
  considerHost: false
  # How long cached OpenGraph metadata should last in memory
  ttl: 24h
  # # If set, return these opengraph values instead of looking them up with
  # # the target service.
  # #
  # # Correlates to properties in https://ogp.me/
  # override:
  #   # og:title is required, it is the title of the website
  #   "og:title": "Techaro Anubis"
  #   "og:description": >-
  #     Anubis is a Web AI Firewall Utility that helps you fight the bots
  #     away so that you can maintain uptime at work!
  #   "description": >-
  #     Anubis is a Web AI Firewall Utility that helps you fight the bots
  #     away so that you can maintain uptime at work!

# By default, send HTTP 200 back to clients that either get issued a challenge
# or a denial. This seems weird, but this is load-bearing due to the fact that
# the most aggressive scraper bots seem to really, really, want an HTTP 200 and
# will stop sending requests once they get it.
status_codes:
  CHALLENGE: 200
  DENY: 200

# Anubis can store temporary data in one of a few backends. See the storage
# backends section of the docs for more information:
#
# https://anubis.techaro.lol/docs/admin/policies#storage-backends
store:
  backend: memory
  parameters: {}

# The weight thresholds for when to trigger individual challenges. Any
# CHALLENGE will take precedence over this.
#
# A threshold has four configuration options:
#
# - name: the name that is reported down the stack and used for metrics
# - expression: A CEL expression with the request weight in the variable
#   weight
# - action: the Anubis action to apply, similar to in a bot policy
# - challenge: which challenge to send to the user, similar to in a bot policy
#
# See https://anubis.techaro.lol/docs/admin/configuration/thresholds for more
# information.
thresholds:
  # By default Anubis ships with the following thresholds:
  - name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
    expression: weight <= 0 # a feather weighs zero units
    action: ALLOW # Allow the traffic through
  # For clients that had some weight reduced through custom rules, give them a
  # lightweight challenge.
  - name: mild-suspicion
    expression:
      all:
        - weight > 0
        - weight < 10
    action: CHALLENGE
    challenge:
      # https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
      algorithm: metarefresh
      difficulty: 1
      report_as: 1
  # For clients that are browser-like but have either gained points from custom rules or
  # report as a standard browser.
  - name: moderate-suspicion
    expression:
      all:
        - weight >= 10
        - weight < 20
    action: CHALLENGE
    challenge:
      # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
      algorithm: fast
      difficulty: 2 # two leading zeros, very fast for most clients
      report_as: 2
  # For clients that are browser-like and have gained many points from custom rules
  - name: extreme-suspicion
    expression: weight >= 20
    action: CHALLENGE
    challenge:
      # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
      algorithm: fast
      difficulty: 4
      report_as: 4
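
# Worked example (an illustrative sketch, not part of the shipped defaults).
# Assumption: a request starts at weight 0 and every matching WEIGH rule in
# this file adds its `adjust` value. Imported rules are ignored here; they may
# ALLOW, DENY, or CHALLENGE a request before any threshold applies.
#
#   Firefox user agent (contains "Mozilla"), no other rule matches:
#     generic-browser +10                          = 10 -> moderate-suspicion
#   Same client from BR or CN (needs a Thoth subscription):
#     generic-browser +10, countries +10           = 20 -> extreme-suspicion
#   Same client while load_1m >= 10.0:
#     generic-browser +10, high-load-average +20   = 30 -> extreme-suspicion
#   Same client with low-load-average uncommented and load_15m <= 4.0:
#     generic-browser +10, low-load-average -10    =  0 -> minimal-suspicion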