Weighs the soul of incoming HTTP requests to stop AI crawlers

chore: release v1.20.0-pre1

Signed-off-by: Xe Iaso <me@xeiaso.net>

Xe Iaso ecc71694 4948036f

+125 -14
+1 -1
VERSION
@@ -1 +1 @@
-1.19.1
+1.20.0-pre1
+121 -10
docs/docs/CHANGELOG.md
@@ -11,25 +11,136 @@
 
 ## [Unreleased]
 
+## v1.20.0: Thancred Waters
+
+The big ticket items are as follows:
+
+- Implement a no-JS challenge method: [`metarefresh`](./admin/configuration/challenges/metarefresh.mdx) ([#95](https://github.com/TecharoHQ/anubis/issues/95))
+- Implement request "weight", allowing administrators to customize the behaviour of Anubis based on specific criteria
+- Implement GeoIP and ASN based checks via [Thoth](https://anubis.techaro.lol/docs/admin/thoth) ([#206](https://github.com/TecharoHQ/anubis/issues/206))
+- Add [custom weight thresholds](./admin/configuration/thresholds.mdx) via CEL ([#688](https://github.com/TecharoHQ/anubis/pull/688))
 - Move Open Graph configuration [to the policy file](./admin/configuration/open-graph.mdx)
-- Enable support for default Open Graph metadata
+- Enable support for Open Graph metadata to be returned by default instead of doing lookups against the target
+- Add `robots2policy` CLI utility to convert robots.txt files to Anubis challenge policies using CEL expressions ([#409](https://github.com/TecharoHQ/anubis/issues/409))
+- Refactor challenge presentation logic to use a challenge registry
+- Allow challenge implementations to register HTTP routes
+
+A lot of performance improvements have been made:
+
+- Replace internal SHA256 hashing with xxhash for 4-6x performance improvement in policy evaluation and cache operations
+- Optimized the OGTags subsystem with reduced allocations and runtime per request by up to 66%
 - Replace cidranger with bart for IP range checking, improving IP matching performance by 3-20x with zero heap
   allocations
+
+And some cleanups/refactors were added:
+
 - Remove the unused `/test-error` endpoint and update the testing endpoint `/make-challenge` to only be enabled in
   development
 - Add `--xff-strip-private` flag/envvar to toggle skipping X-Forwarded-For private addresses or not
-- Requests can have their weight be adjusted, if a request weighs zero or less than it is allowed through
-- Refactor challenge presentation logic to use a challenge registry
-- Allow challenge implementations to register HTTP routes
-- Implement a no-JS challenge method: [`metarefresh`](./admin/configuration/challenges/metarefresh.mdx) ([#95](https://github.com/TecharoHQ/anubis/issues/95))
 - Bump AI-robots.txt to version 1.37
 - Make progress bar styling more compatible (UXP, etc)
-- Optimized the OGTags subsystem with reduced allocations and runtime per request by up to 66%
 - Add `--strip-base-prefix` flag/envvar to strip the base prefix from request paths when forwarding to target servers
-- Add `robots2policy` CLI utility to convert robots.txt files to Anubis challenge policies using CEL expressions ([#409](https://github.com/TecharoHQ/anubis/issues/409))
-- Implement GeoIP and ASN based checks via [Thoth](https://anubis.techaro.lol/docs/admin/thoth) ([#206](https://github.com/TecharoHQ/anubis/issues/206))
-- Replace internal SHA256 hashing with xxhash for 4-6x performance improvement in policy evaluation and cache operations
-- Add [custom weight thresholds](./admin/configuration/thresholds.mdx) via CEL ([#688](https://github.com/TecharoHQ/anubis/pull/688))
+
+Request weight is one of the biggest ticket features in Anubis. This enables Anubis to be much closer to a Web Application Firewall and when combined with custom thresholds allows administrators to have Anubis take advanced reactions. For more information about request weight, see [the request weight section](./admin/policies.mdx#request-weight) of the policy file documentation.
+
+TL;DR when you have one or more WEIGHT rules like this:
+
+```yaml
+bots:
+  - name: gitea-session-token
+    action: WEIGH
+    expression:
+      all:
+        - '"Cookie" in headers'
+        - headers["Cookie"].contains("i_love_gitea=")
+    # Remove 5 weight points
+    weight:
+      adjust: -5
+```
+
+You can configure custom thresholds like this:
+
+```yaml
+thresholds:
+  - name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
+    expression: weight < 0 # a feather weighs zero units
+    action: ALLOW # Allow the traffic through
+
+  # For clients that had some weight reduced through custom rules, give them a
+  # lightweight challenge.
+  - name: mild-suspicion
+    expression:
+      all:
+        - weight >= 0
+        - weight < 10
+    action: CHALLENGE
+    challenge:
+      # https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
+      algorithm: metarefresh
+      difficulty: 1
+      report_as: 1
+
+  # For clients that are browser-like but have either gained points from custom
+  # rules or report as a standard browser.
+  - name: moderate-suspicion
+    expression:
+      all:
+        - weight >= 10
+        - weight < 20
+    action: CHALLENGE
+    challenge:
+      # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
+      algorithm: fast
+      difficulty: 2 # two leading zeros, very fast for most clients
+      report_as: 2
+
+  # For clients that are browser like and have gained many points from custom
+  # rules
+  - name: extreme-suspicion
+    expression: weight >= 20
+    action: CHALLENGE
+    challenge:
+      # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
+      algorithm: fast
+      difficulty: 4
+      report_as: 4
+```
+
+These thresholds apply when no other `ALLOW`, `DENY`, or `CHALLENGE` rule matches the request. `WEIGHT` rules add and remove request weight as needed:
+
+```yaml
+bots:
+  - name: gitea-session-token
+    action: WEIGH
+    expression:
+      all:
+        - '"Cookie" in headers'
+        - headers["Cookie"].contains("i_love_gitea=")
+    # Remove 5 weight points
+    weight:
+      adjust: -5
+
+  - name: bot-like-user-agent
+    action: WEIGH
+    expression: '"Bot" in userAgent'
+    # Add 5 weight points
+    weight:
+      adjust: 5
+```
+
+Of note: the default "generic browser" rule assigns 10 weight points:
+
+```yaml
+# Generic catchall rule
+- name: generic-browser
+  user_agent_regex: >-
+    Mozilla|Opera
+  action: WEIGH
+  weight:
+    adjust: 10
+```
+
+Adjust this as you see fit.
 
 ## v1.19.1: Jenomis cen Lexentale - Echo 1
 
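The weight arithmetic described in the changelog entry above can be sketched in a few lines of Python. This is an illustrative model only, not Anubis's implementation: it assumes every request starts at weight 0, that the adjustments of all matching `WEIGH` rules are summed, and that the first matching threshold wins. It also ignores `ALLOW`, `DENY`, and `CHALLENGE` rules entirely. The names `WeighRule`, `Threshold`, `total_weight`, and `pick_threshold` are hypothetical; only the rule names, adjustments, and threshold bounds come from the changelog.

```python
import re
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class WeighRule:
    """Hypothetical stand-in for a `WEIGH` bot rule."""
    name: str
    matches: Callable[[Mapping[str, str]], bool]  # predicate over request headers
    adjust: int                                   # points added (or removed if negative)


@dataclass(frozen=True)
class Threshold:
    """Hypothetical stand-in for a `thresholds` entry."""
    name: str
    matches: Callable[[int], bool]  # predicate over the accumulated weight
    action: str


# Rules mirrored from the changelog examples (assumed to all be active at once).
RULES = [
    WeighRule("gitea-session-token",
              lambda h: "i_love_gitea=" in h.get("Cookie", ""), -5),
    WeighRule("bot-like-user-agent",
              lambda h: "Bot" in h.get("User-Agent", ""), 5),
    WeighRule("generic-browser",
              lambda h: re.search(r"Mozilla|Opera", h.get("User-Agent", "")) is not None, 10),
]

# Thresholds mirrored from the changelog example.
THRESHOLDS = [
    Threshold("minimal-suspicion", lambda w: w < 0, "ALLOW"),
    Threshold("mild-suspicion", lambda w: 0 <= w < 10, "CHALLENGE metarefresh, difficulty 1"),
    Threshold("moderate-suspicion", lambda w: 10 <= w < 20, "CHALLENGE fast, difficulty 2"),
    Threshold("extreme-suspicion", lambda w: w >= 20, "CHALLENGE fast, difficulty 4"),
]


def total_weight(headers: Mapping[str, str]) -> int:
    """Sum the adjustments of every rule whose predicate matches the headers."""
    return sum(rule.adjust for rule in RULES if rule.matches(headers))


def pick_threshold(weight: int) -> Threshold:
    """Return the first threshold whose predicate accepts the weight."""
    for threshold in THRESHOLDS:
        if threshold.matches(weight):
            return threshold
    raise ValueError(f"no threshold matches weight {weight}")
```

Under these assumptions, a browser-like client carrying the Gitea session cookie ends at 10 - 5 = 5 points and lands in `mild-suspicion`, while the same client without the cookie stays at 10 points and lands in `moderate-suspicion`.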
+2 -2
package-lock.json
@@ -1,12 +1,12 @@
 {
   "name": "@techaro/anubis",
-  "version": "1.19.1",
+  "version": "1.20.0-pre1",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "@techaro/anubis",
-      "version": "1.19.1",
+      "version": "1.20.0-pre1",
       "license": "ISC",
       "devDependencies": {
         "cssnano": "^7.0.7",
+1 -1
package.json
@@ -1,6 +1,6 @@
 {
   "name": "@techaro/anubis",
-  "version": "1.19.1",
+  "version": "1.20.0-pre1",
   "description": "",
   "main": "index.js",
   "scripts": {