at protocol indexer with flexible filtering, xrpc queries, and a cursor-backed event stream, built on fjall
at-protocol atproto indexer rust fjall

[crawler] always check all signals, whether they are globs or not, since we use describeRepo now

ptr.pet d4aba848 8dd3080a

verified
+6 -7
+1 -1
README.md
@@ -45,7 +45,7 @@
 | `NO_LZ4_COMPRESSION` | `false` | disable lz4 compression for storage. |
 | `ENABLE_FIREHOSE` | `true` | whether to ingest relay subscriptions. |
 | `ENABLE_BACKFILL` | `true` | whether to backfill from PDS instances. |
-| `ENABLE_CRAWLER` | `false` (if Filter), `true` (if Full) | whether to actively query the network for unknown repositories. when in `Filter` mode without wildcard (`*`) signals, the crawler uses `com.atproto.repo.listRecords` to verify if a discovered repository has matching records before queuing it for backfill, this will be a lot faster usually since most repos will get filtered out faster. |
+| `ENABLE_CRAWLER` | `false` (if Filter), `true` (if Full) | whether to actively query the network for unknown repositories. |
 | `DB_WORKER_THREADS` | `4` (`8` if full network) | database worker threads. |
 | `DB_MAX_JOURNALING_SIZE_MB` | `512` (`1024` if full network) | max database journaling size in MB. |
 | `DB_PENDING_MEMTABLE_SIZE_MB` | `64` (`192` if full network) | pending memtable size in MB. |
+4 -1
src/crawler/mod.rs
@@ -285,7 +285,10 @@
             }
         };
 
-        let found_signal = filter.signals.iter().any(|s| out.collections.contains(s));
+        let found_signal = out
+            .collections
+            .iter()
+            .any(|col| filter.matches_signal(col.as_str()));
 
         if !found_signal {
             trace!("no signal-matching collections found");
+1 -5
src/filter.rs
@@ -52,12 +52,8 @@
         self.signals.iter().any(|p| nsid_matches(p, collection))
     }
 
-    fn has_glob_signals(&self) -> bool {
-        self.signals.iter().any(|s| s.ends_with(".*"))
-    }
-
     pub fn check_signals(&self) -> bool {
-        self.mode == FilterMode::Filter && !self.signals.is_empty() && !self.has_glob_signals()
+        self.mode == FilterMode::Filter && !self.signals.is_empty()
     }
 }
 
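To illustrate why the crawler check moved from `contains` to `matches_signal`, here is a minimal, self-contained sketch. The `Filter` and `FilterMode` definitions, the `old_found_signal` / `new_found_signal` helpers, and the sample NSIDs are stand-ins invented for this example. The body of `nsid_matches` is also an assumption (a trailing `.*` is treated as a prefix match, anything else as an exact match), since the diff does not show the real implementation.

```rust
// Hypothetical stand-ins for the types touched by this commit. Only the
// method names and the two changed expressions are taken from the diff.
#[derive(PartialEq)]
enum FilterMode {
    Filter,
    Full,
}

struct Filter {
    mode: FilterMode,
    signals: Vec<String>,
}

// ASSUMED semantics: a pattern ending in ".*" matches any NSID under that
// prefix; every other pattern must match exactly.
fn nsid_matches(pattern: &str, collection: &str) -> bool {
    match pattern.strip_suffix(".*") {
        Some(prefix) => collection
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('.')),
        None => pattern == collection,
    }
}

impl Filter {
    fn matches_signal(&self, collection: &str) -> bool {
        self.signals.iter().any(|p| nsid_matches(p, collection))
    }

    // After this commit, glob signals no longer disable the check.
    fn check_signals(&self) -> bool {
        self.mode == FilterMode::Filter && !self.signals.is_empty()
    }
}

// Old crawler expression: exact string comparison, so a glob signal such as
// "app.bsky.feed.*" never equals a concrete collection name.
fn old_found_signal(filter: &Filter, collections: &[String]) -> bool {
    filter.signals.iter().any(|s| collections.contains(s))
}

// New crawler expression: each collection is tested against every signal
// pattern, so globs and exact signals are handled the same way.
fn new_found_signal(filter: &Filter, collections: &[String]) -> bool {
    collections
        .iter()
        .any(|col| filter.matches_signal(col.as_str()))
}

fn main() {
    let filter = Filter {
        mode: FilterMode::Filter,
        signals: vec!["app.bsky.feed.*".to_string()],
    };
    // Stand-in for the collection list returned for a discovered repo.
    let collections = vec![
        "app.bsky.feed.post".to_string(),
        "app.bsky.actor.profile".to_string(),
    ];

    assert!(filter.check_signals());
    assert!(!old_found_signal(&filter, &collections));
    assert!(new_found_signal(&filter, &collections));

    // In Full mode the signal check is skipped entirely.
    let full = Filter { mode: FilterMode::Full, signals: vec![] };
    assert!(!full.check_signals());
}
```

Under these assumptions, the old expression would have rejected every repository when a glob signal was configured, which is presumably why `check_signals` previously excluded glob signals. With pattern matching in the crawler, that exclusion is no longer needed.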