A tool for parsing traffic on the Jetstream and applying a moderation workstream based on regexp-based rules

16: Update readme.md

+81 -21
README.md
- # skywatch-tools

- This is a rewrite of the original skywatch-tools project in TypeScript. The original project was written in Bash. The purpose of this project is to automate moderation by the independent Bluesky labeler skywatch.blue

- ## Installation and Setup

- To install dependencies:

  ```bash
- bun i
  ```

- Modify .env.example with your own values and rename it to .env

  ```bash
  bun run start
  ```

- To run in docker:

  ```bash
  docker build --pull -t skywatch-automod .
  docker run -d -p 4101:4101 skywatch-automod
  ```

- ## Brief overview

- Currently this tooling does one thing. It monitors the Bluesky firehose and analyzes content for phrases which fit Skywatch's criteria for moderation. If the criteria are met, it can automatically label the content with the appropriate label.

- In certain cases, where regexp will create too many false positives, it will flag content as a report against the related account, so that it can be reviewed later.

- For information on how to set up your own checks, please see the [developing_checks.md](./src/developing_checks.md) file.
-
- _TODO_:
-
- - [ ] Remove unused types
- - [ ] Update the types needed to be more specific to the checks rather than bluesky content types
- - [ ] Consider how to write directly to OzonePDS database rather than using the API. May require running the same instance as Ozone to allow for direct database access.
- - [ ] Add compose.yaml for easy deployment
- - [ ] Make the metrics server work (or remove it)
-
- Create a separate program to watch OZONE_PDS firehose labels, and update the lists as needed. This will remove dependency on broken ruby tools created by aegis.
+ # Skywatch Automod
+
+ This project provides tools for automating moderation of the Bluesky social network. It listens to the Bluesky firehose stream, analyzes various types of content against user-defined rules, and performs moderation actions such as applying labels, reporting content, or leaving comments.
+
+ ## Features
+
+ - **Real-time Moderation:** Monitors the Bluesky firehose in real time.
+ - **Content-Aware Analysis:** Analyzes posts, user profiles (display names, descriptions), and handles.
+ - **Flexible Rule Engine:** Uses regular expressions for defining moderation checks.
+ - **Variety of Actions:** Can apply labels, create reports (for posts or accounts), and post comments on accounts.
+ - **Configurable:** Highly configurable through environment variables and a central constants file.
+ - **Allowlisting:** Supports allowlisting for DIDs and text patterns to reduce false positives.
+ - **URL Unshortening:** Automatically resolves shortened URLs in posts before checking them.
+ - **Monitoring:** Exposes a Prometheus metrics endpoint to monitor its activity (untested).
+ - **Resilient:** Persists the firehose cursor to gracefully handle restarts without missing events.
+
+ ## How It Works
+
+ The application connects to the Bluesky firehose and subscribes to a set of collections (e.g., posts, profiles). When a new event is received, it is passed through a series of checks defined in `src/constants.ts`. These checks are categorized by content type:
+
+ - `POST_CHECKS`: For post content and links.
+ - `HANDLE_CHECKS`: For user handles.
+ - `PROFILE_CHECKS`: For user display names and descriptions.
+
+ If the content matches a check's criteria (and is not excluded by an allowlist), a corresponding moderation action is triggered. These actions (labeling, reporting, etc.) are performed using the Bluesky API.
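The check flow described under "How It Works" can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Check` shape, its field names (`label`, `pattern`, `allowPatterns`, `allowDids`), the `matchChecks` helper, and the sample rule are all hypothetical. The project's actual check format is the one defined in `src/constants.ts` and described in `developing_checks.md`.

```typescript
// Hypothetical check shape, for illustration only. The real definitions
// live in src/constants.ts and may use different fields.
interface Check {
  label: string; // label to apply when the check matches
  pattern: RegExp; // rule that flags content
  allowPatterns?: RegExp[]; // text patterns that suppress a match
  allowDids?: string[]; // accounts that this check never flags
}

// Return the labels of every check that matches `text` written by `did`.
// Allowlists are consulted first, so an allowlisted DID or text pattern
// prevents the check from firing at all.
function matchChecks(checks: Check[], did: string, text: string): string[] {
  const labels: string[] = [];
  for (const check of checks) {
    const didAllowed = (check.allowDids ?? []).some((d) => d === did);
    if (didAllowed) continue;
    const textAllowed = (check.allowPatterns ?? []).some((p) => p.test(text));
    if (textAllowed) continue;
    if (check.pattern.test(text)) labels.push(check.label);
  }
  return labels;
}

// A made-up rule list in the spirit of POST_CHECKS.
const examplePostChecks: Check[] = [
  {
    label: "example-spam",
    pattern: /free\s+crypto/i,
    allowPatterns: [/no free crypto/i],
    allowDids: ["did:plc:trusted"],
  },
];

const exampleLabels = matchChecks(
  examplePostChecks,
  "did:plc:someone",
  "Get FREE crypto now",
);
console.log(exampleLabels);
```

The patterns deliberately omit the `g` flag: a global `RegExp` keeps state in `lastIndex` between `test` calls, which would make repeated checks against a stream of events unreliable.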
+
+ ## Getting Started
+
+ ### Prerequisites

+ - Node.js (v20 or higher recommended)
+ - `bun` package manager
+ - A Bluesky account for the bot
+ - A Bluesky labeler account

+ ### 1. Installation

+ Clone the repository and install the dependencies:

  ```bash
+ git clone <repository-url>
+ cd skywatch-automod-public
+ bun install
  ```

+ ### 2. Configuration
+
+ There are two main configuration files you need to set up:
+
+ - **Checks (`src/constants.ts`):**
+   This file defines the rules for your automod. You need to create it by copying the example file:
+
+   ```bash
+   cp src/constants.ts.example src/constants.ts
+   ```
+
+   Then, edit `src/constants.ts` to define your own checks. For detailed instructions on how to create checks, please see [developing_checks.md](./src/developing_checks.md).
+
+ - **Environment Variables (`.env`):**
+   This file contains credentials and other runtime configuration. You will need to create a `.env` file and populate it with your specific values. You can use `.env.example` as a reference if it exists in the repository.
+
+ ### 3. Running the Application
+
+ Once configured, you can start the automod with:

  ```bash
  bun run start
  ```

+ ### 4. Running with Docker
+
+ You can also build and run the application as a Docker container.

  ```bash
+ # Build the container
  docker build --pull -t skywatch-automod .
+
+ # Run the container
  docker run -d -p 4101:4101 skywatch-automod
  ```

+ Make sure your `.env` file is present when building the Docker image, as it will be copied into the container.

+ #### Configuration Variables

+ The following environment variables are used for configuration:

+ | Variable | Description | Default |
+ | ------------------------ | ---------------------------------------------------------------- | ----------------------------------------- |
+ | `DID` | The DID of your moderation service for atproto-proxy headers. | `""` |
+ | `OZONE_URL` | The URL of the Ozone service. | `""` |
+ | `OZONE_PDS` | The PDS (Personal Data Server) used by Ozone. | `""` |
+ | `BSKY_HANDLE` | The handle (username) of the bot's Bluesky account. | `""` |
+ | `BSKY_PASSWORD` | The app password for the bot's Bluesky account. | `""` |
+ | `HOST` | The host on which the server runs. | `127.0.0.1` |
+ | `PORT` | The port for the main application (currently unused). | `4100` |
+ | `METRICS_PORT` | The port for the Prometheus metrics server. | `4101` |
+ | `FIREHOSE_URL` | The WebSocket URL for the Bluesky firehose. | `wss://jetstream.atproto.tools/subscribe` |
+ | `CURSOR_UPDATE_INTERVAL` | How often to save the firehose cursor to disk (in milliseconds). | `60000` |
+ | `LABEL_LIMIT` | (Optional) API call limit for labeling. | `undefined` |
+ | `LABEL_LIMIT_WAIT` | (Optional) Wait time when label limit is hit. | `undefined` |
+ | `LOG_LEVEL` | The logging level. | `info` |
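
As a rough illustration of how the defaults in the table above could be applied, here is a minimal sketch of a configuration loader. The `Config` interface, `loadConfig`, and `intOr` are hypothetical names introduced for this example and are not taken from the project's source. The variable names and default values come from the table. The value format of `LABEL_LIMIT` and `LABEL_LIMIT_WAIT` is not documented here, so the sketch keeps them as raw optional strings.

```typescript
// Hypothetical config shape; the variable names and defaults mirror the
// table above, but the project's real loader may be structured differently.
interface Config {
  did: string;
  ozoneUrl: string;
  ozonePds: string;
  bskyHandle: string;
  bskyPassword: string;
  host: string;
  port: number;
  metricsPort: number;
  firehoseUrl: string;
  cursorUpdateInterval: number; // milliseconds
  labelLimit: string | undefined; // format not documented, kept as raw text
  labelLimitWait: string | undefined; // format not documented, kept as raw text
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Parse a base-10 integer, falling back when the value is missing or invalid.
function intOr(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? "", 10);
  return isNaN(parsed) ? fallback : parsed;
}

// Build a Config from an environment-like object, applying table defaults.
function loadConfig(env: Env): Config {
  return {
    did: env["DID"] ?? "",
    ozoneUrl: env["OZONE_URL"] ?? "",
    ozonePds: env["OZONE_PDS"] ?? "",
    bskyHandle: env["BSKY_HANDLE"] ?? "",
    bskyPassword: env["BSKY_PASSWORD"] ?? "",
    host: env["HOST"] ?? "127.0.0.1",
    port: intOr(env["PORT"], 4100),
    metricsPort: intOr(env["METRICS_PORT"], 4101),
    firehoseUrl: env["FIREHOSE_URL"] ?? "wss://jetstream.atproto.tools/subscribe",
    cursorUpdateInterval: intOr(env["CURSOR_UPDATE_INTERVAL"], 60000),
    labelLimit: env["LABEL_LIMIT"],
    labelLimitWait: env["LABEL_LIMIT_WAIT"],
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}

// With no variables set, every field falls back to its documented default.
const defaultConfig = loadConfig({});
console.log(defaultConfig.metricsPort, defaultConfig.firehoseUrl);
```

In a Bun or Node.js process the function would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the sketch free of runtime-specific globals and makes the defaults easy to test.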