/*
 * CONTENT WARNING
 *
 * This file contains explicit slurs and hateful language. We're sorry you have to see them.
 *
 * These words exist here for one reason: to ensure our moderation system correctly blocks them.
 * We can't verify the filter catches the n-word without testing against the actual word.
 * Euphemisms wouldn't prove the protection works.
 *
 * If reading this file has caused you distress, please know:
 * - you are valued and welcome in this community
 * - these words do not reflect the views of this project or its contributors
 * - we maintain this code precisely because we believe everyone deserves an experience on the web that is free from this kinda language
*/

use regex::Regex;
use std::sync::OnceLock;

static SLUR_REGEXES: OnceLock<Vec<Regex>> = OnceLock::new();
static EXTRA_BANNED_WORDS: OnceLock<Vec<String>> = OnceLock::new();

fn get_slur_regexes() -> &'static Vec<Regex> {
    SLUR_REGEXES.get_or_init(|| {
        vec![
            Regex::new(r"\b[cĆćĈĉČčĊċÇçḈḉȻȼꞒꞓꟄꞔƇƈɕ][hĤĥȞȟḦḧḢḣḨḩḤḥḪḫH̱ẖĦħⱧⱨꞪɦꞕΗНн][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLl][nŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲŊŋꞐꞑꞤꞥᵰᶇɳȵꬻꬼИиПпNn][kḰḱǨǩĶķḲḳḴḵƘƙⱩⱪᶄꝀꝁꝂꝃꝄꝅꞢꞣ][sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?\b").unwrap(),
            Regex::new(r"\b[cĆćĈĉČčĊċÇçḈḉȻȼꞒꞓꟄꞔƇƈɕ][ÓóÒòŎŏÔôỐốỒồỖỗỔổǑǒÖöȪȫŐőÕõṌṍṎṏȬȭȮȯO͘o͘ȰȱØøǾǿǪǫǬǭŌōṒṓṐṑỎỏȌȍȎȏƠơỚớỜờỠỡỞởỢợỌọỘộO̩o̩Ò̩ò̩Ó̩ó̩ƟɵꝊꝋꝌꝍⱺOo0]{2}[nŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲŊŋꞐꞑꞤꞥᵰᶇɳȵꬻꬼИиПпNn][sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?\b").unwrap(),
            Regex::new(r"\b[fḞḟƑƒꞘꞙᵮᶂ][aÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa@4][gǴǵĞğĜĝǦǧĠġG̃g̃ĢģḠḡǤǥꞠꞡƓɠᶃꬶGg]{1,2}([ÓóÒòŎŏÔôỐốỒồỖỗỔổǑǒÖöȪȫŐőÕõṌṍṎṏȬȭȮȯO͘o͘ȰȱØøǾǿǪǫǬǭŌōṒṓṐṑỎỏȌȍȎȏƠơỚớỜờỠỡỞởỢợỌọỘộO̩o̩Ò̩ò̩Ó̩ó̩ƟɵꝊꝋꝌꝍⱺOo0e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEeiÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLl][tŤťṪṫŢţṬṭȚțṰṱṮṯŦŧȾⱦƬƭƮʈT̈ẗᵵƫȶ]{1,2}([rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ][yÝýỲỳŶŷY̊ẙŸÿỸỹẎẏȲȳỶỷỴỵɎɏƳƴỾỿ]|[rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLl][e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe])?)?[sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?\b").unwrap(),
            Regex::new(r"\b[kḰḱǨǩĶķḲḳḴḵƘƙⱩⱪᶄꝀꝁꝂꝃꝄꝅꞢꞣ][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLlyÝýỲỳŶŷY̊ẙŸÿỸỹẎẏȲȳỶỷỴỵɎɏƳƴỾỿ][kḰḱǨǩĶķḲḳḴḵƘƙⱩⱪᶄꝀꝁꝂꝃꝄꝅꞢꞣ][e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe]([rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ][yÝýỲỳŶŷY̊ẙŸÿỸỹẎẏȲȳỶỷỴỵɎɏƳƴỾỿ]|[rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLl][e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe])?[sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]*\b").unwrap(),
            Regex::new(r"\b[nŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲŊŋꞐꞑꞤꞥᵰᶇɳȵꬻꬼИиПпNn][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLloÓóÒòŎŏÔôỐốỒồỖỗỔổǑǒÖöȪȫŐőÕõṌṍṎṏȬȭȮȯO͘o͘ȰȱØøǾǿǪǫǬǭŌōṒṓṐṑỎỏȌȍȎȏƠơỚớỜờỠỡỞởỢợỌọỘộO̩o̩Ò̩ò̩Ó̩ó̩ƟɵꝊꝋꝌꝍⱺOoІіa4ÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa][gǴǵĞğĜĝǦǧĠġG̃g̃ĢģḠḡǤǥꞠꞡƓɠᶃꬶGgqꝖꝗꝘꝙɋʠ]{2}(l[e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe]t|[e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEeaÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa][rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ]?|n[ÓóÒòŎŏÔôỐốỒồỖỗỔổǑǒÖöȪȫŐőÕõṌṍṎṏȬȭȮȯO͘o͘ȰȱØøǾǿǪǫǬǭŌōṒṓṐṑỎỏȌȍȎȏƠơỚớỜờỠỡỞởỢợỌọỘộO̩o̩Ò̩ò̩Ó̩ó̩ƟɵꝊꝋꝌꝍⱺOo0][gǴǵĞğĜĝǦǧĠġG̃g̃ĢģḠḡǤǥꞠꞡƓɠᶃꬶGgqꝖꝗꝘꝙɋʠ]|[a4ÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa]?)?[sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?\b").unwrap(),
            Regex::new(r"[nŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲŊŋꞐꞑꞤꞥᵰᶇɳȵꬻꬼИиПпNn][iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLloÓóÒòŎŏÔôỐốỒồỖỗỔổǑǒÖöȪȫŐőÕõṌṍṎṏȬȭȮȯO͘o͘ȰȱØøǾǿǪǫǬǭŌōṒṓṐṑỎỏȌȍȎȏƠơỚớỜờỠỡỞởỢợỌọỘộO̩o̩Ò̩ò̩Ó̩ó̩ƟɵꝊꝋꝌꝍⱺOoІіa4ÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa][gǴǵĞğĜĝǦǧĠġG̃g̃ĢģḠḡǤǥꞠꞡƓɠᶃꬶGgqꝖꝗꝘꝙɋʠ]{2}(l[e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe]t|[e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe][rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ])[sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?").unwrap(),
            Regex::new(r"\b[tŤťṪṫŢţṬṭȚțṰṱṮṯŦŧȾⱦƬƭƮʈT̈ẗᵵƫȶ][rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ][aÁáÀàĂăẮắẰằẴẵẲẳÂâẤấẦầẪẫẨẩǍǎÅåǺǻÄäǞǟÃãȦȧǠǡĄąĄ́ą́Ą̃ą̃ĀāĀ̀ā̀ẢảȀȁA̋a̋ȂȃẠạẶặẬậḀḁȺⱥꞺꞻᶏẚAa4]+[nŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲŊŋꞐꞑꞤꞥᵰᶇɳȵꬻꬼИиПпNn]{1,2}([iÍíi̇́Ììi̇̀ĬĭÎîǏǐÏïḮḯĨĩi̇̃ĮįĮ́į̇́Į̃į̇̃ĪīĪ̀ī̀ỈỉȈȉI̋i̋ȊȋỊịꞼꞽḬḭƗɨᶖİiIıIi1lĺľļḷḹl̃ḽḻłŀƚꝉⱡɫɬꞎꬷꬸꬹᶅɭȴLl][e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe]|[yÝýỲỳŶŷY̊ẙŸÿỸỹẎẏȲȳỶỷỴỵɎɏƳƴỾỿ]|[e3ЄєЕеÉéÈèĔĕÊêẾếỀềỄễỂểÊ̄ê̄Ê̌ê̌ĚěËëẼẽĖėĖ́ė́Ė̃ė̃ȨȩḜḝĘęĘ́ę́Ę̃ę̃ĒēḖḗḔḕẺẻȄȅE̋e̋ȆȇẸẹỆệḘḙḚḛɆɇE̩e̩È̩è̩É̩é̩ᶒⱸꬴꬳEe][rŔŕŘřṘṙŖŗȐȑȒȓṚṛṜṝṞṟR̃r̃ɌɍꞦꞧⱤɽᵲᶉꭉ])[sŚśṤṥŜŝŠšṦṧṠṡŞşṢṣṨṩȘșS̩s̩ꞨꞩⱾȿꟅʂᶊᵴ]?\b").unwrap(),
        ]
    })
}

fn get_extra_banned_words() -> &'static Vec<String> {
    EXTRA_BANNED_WORDS.get_or_init(|| {
        std::env::var("PDS_BANNED_WORDS")
            .unwrap_or_default()
            .split(',')
            .map(|s| s.trim().to_lowercase())
            .filter(|s| !s.is_empty())
            .collect()
    })
}

fn strip_trailing_digits(s: &str) -> &str {
    s.trim_end_matches(|c: char| c.is_ascii_digit())
}

fn normalize_leetspeak(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            '4' | '@' => 'a',
            '3' => 'e',
            '1' | '!' | '|' => 'i',
            '0' => 'o',
            '5' | '$' => 's',
            '7' => 't',
            '8' => 'b',
            '9' => 'g',
            _ => c,
        })
        .collect()
}

pub fn has_explicit_slur(text: &str) -> bool {
    has_explicit_slur_with_extra_words(text, get_extra_banned_words())
}

fn has_explicit_slur_with_extra_words(text: &str, extra_words: &[String]) -> bool {
    let text_lower = text.to_lowercase();
    let normalized = text_lower.replace(['.', '-', '_'], "");
    let stripped = strip_trailing_digits(&text_lower);
    let normalized_stripped = strip_trailing_digits(&normalized);

    let regexes = get_slur_regexes();
    if regexes.iter().any(|r| {
        r.is_match(&text_lower)
            || r.is_match(&normalized)
            || r.is_match(stripped)
            || r.is_match(normalized_stripped)
    }) {
        return true;
    }

    if !extra_words.is_empty() {
        let leet_normalized = normalize_leetspeak(&normalized);
        let leet_stripped = normalize_leetspeak(strip_trailing_digits(&leet_normalized));
        if extra_words.iter().any(|w| {
            text_lower.contains(w)
                || normalized.contains(w)
                || stripped.contains(w)
                || normalized_stripped.contains(w)
                || leet_normalized.contains(w)
                || leet_stripped.contains(w)
        }) {
            return true;
        }
    }
    false
}

#[cfg(test)]
mod tests {
    use super::*;
    use base64::Engine;

    fn d(b64: &str) -> String {
        let bytes = base64::engine::general_purpose::STANDARD.decode(b64).unwrap();
        String::from_utf8(bytes).unwrap()
    }

    #[test]
    fn test_chink_pattern() {
        assert!(has_explicit_slur(&d("Y2hpbms=")));
        assert!(has_explicit_slur(&d("Y2hpbmtz")));
        assert!(has_explicit_slur(&d("Q0hJTks=")));
        assert!(has_explicit_slur(&d("Q2hpbmtz")));
    }

    #[test]
    fn test_coon_pattern() {
        assert!(has_explicit_slur(&d("Y29vbg==")));
        assert!(has_explicit_slur(&d("Y29vbnM=")));
        assert!(has_explicit_slur(&d("Q09PTg==")));
    }

    #[test]
    fn test_fag_pattern() {
        assert!(has_explicit_slur(&d("ZmFn")));
        assert!(has_explicit_slur(&d("ZmFncw==")));
        assert!(has_explicit_slur(&d("ZmFnZ290")));
        assert!(has_explicit_slur(&d("ZmFnZ290cw==")));
        assert!(has_explicit_slur(&d("ZmFnZ290cnk=")));
    }

    #[test]
    fn test_kike_pattern() {
        assert!(has_explicit_slur(&d("a2lrZQ==")));
        assert!(has_explicit_slur(&d("a2lrZXM=")));
        assert!(has_explicit_slur(&d("S0lLRQ==")));
        assert!(has_explicit_slur(&d("a2lrZXJ5")));
    }

    #[test]
    fn test_nigger_pattern() {
        assert!(has_explicit_slur(&d("bmlnZ2Vy")));
        assert!(has_explicit_slur(&d("bmlnZ2Vycw==")));
        assert!(has_explicit_slur(&d("TklHR0VS")));
        assert!(has_explicit_slur(&d("bmlnZ2E=")));
        assert!(has_explicit_slur(&d("bmlnZ2Fz")));
    }

    #[test]
    fn test_tranny_pattern() {
        assert!(has_explicit_slur(&d("dHJhbm55")));
        assert!(has_explicit_slur(&d("dHJhbm5pZXM=")));
        assert!(has_explicit_slur(&d("VFJBTk5Z")));
    }

    #[test]
    fn test_normalization_bypass() {
        assert!(has_explicit_slur(&d("bi5pLmcuZy5lLnI=")));
        assert!(has_explicit_slur(&d("bi1pLWctZy1lLXI=")));
        assert!(has_explicit_slur(&d("bl9pX2dfZ19lX3I=")));
        assert!(has_explicit_slur(&d("Zi5hLmc=")));
        assert!(has_explicit_slur(&d("Zi1hLWc=")));
        assert!(has_explicit_slur(&d("Yy5oLmkubi5r")));
        assert!(has_explicit_slur(&d("a19pX2tfZQ==")));
    }

    #[test]
    fn test_trailing_digits_bypass() {
        assert!(has_explicit_slur(&d("ZmFnZ290MTIz")));
        assert!(has_explicit_slur(&d("bmlnZ2VyNjk=")));
        assert!(has_explicit_slur(&d("Y2hpbms0MjA=")));
        assert!(has_explicit_slur(&d("ZmFnMQ==")));
        assert!(has_explicit_slur(&d("a2lrZTIwMjQ=")));
        assert!(has_explicit_slur(&d("bl9pX2dfZ19lX3IxMjM=")));
    }

    #[test]
    fn test_embedded_in_sentence() {
        assert!(has_explicit_slur(&d("eW91IGFyZSBhIGZhZ2dvdA==")));
        assert!(has_explicit_slur(&d("c3R1cGlkIG5pZ2dlcg==")));
        assert!(has_explicit_slur(&d("Z28gYXdheSBjaGluaw==")));
    }

    #[test]
    fn test_safe_words_not_matched() {
        assert!(!has_explicit_slur("hello"));
        assert!(!has_explicit_slur("world"));
        assert!(!has_explicit_slur("bluesky"));
        assert!(!has_explicit_slur("tranquil"));
        assert!(!has_explicit_slur("programmer"));
        assert!(!has_explicit_slur("trigger"));
        assert!(!has_explicit_slur("bigger"));
        assert!(!has_explicit_slur("digger"));
        assert!(!has_explicit_slur("figure"));
        assert!(!has_explicit_slur("configure"));
    }

    #[test]
    fn test_similar_but_safe_words() {
        assert!(!has_explicit_slur("niggardly"));
        assert!(!has_explicit_slur("raccoon"));
    }

    #[test]
    fn test_empty_and_whitespace() {
        assert!(!has_explicit_slur(""));
        assert!(!has_explicit_slur(" "));
        assert!(!has_explicit_slur("\t\n"));
    }

    #[test]
    fn test_case_insensitive() {
        assert!(has_explicit_slur(&d("TklHR0VS")));
        assert!(has_explicit_slur(&d("TmlnZ2Vy")));
        assert!(has_explicit_slur(&d("TmlHZ0Vy")));
        assert!(has_explicit_slur(&d("RkFHR09U")));
        assert!(has_explicit_slur(&d("RmFnZ290")));
    }

    #[test]
    fn test_leetspeak_bypass() {
        assert!(has_explicit_slur(&d("ZjRnZ290")));
        assert!(has_explicit_slur(&d("ZjRnZzB0")));
        assert!(has_explicit_slur(&d("bjFnZ2Vy")));
        assert!(has_explicit_slur(&d("bjFnZzNy")));
        assert!(has_explicit_slur(&d("azFrZQ==")));
        assert!(has_explicit_slur(&d("Y2gxbms=")));
        assert!(has_explicit_slur(&d("dHI0bm55")));
    }

    #[test]
    fn test_normalize_leetspeak() {
        assert_eq!(normalize_leetspeak("h3llo"), "hello");
        assert_eq!(normalize_leetspeak("w0rld"), "world");
        assert_eq!(normalize_leetspeak("t3$t"), "test");
        assert_eq!(normalize_leetspeak("b4dw0rd"), "badword");
        assert_eq!(normalize_leetspeak("l33t5p34k"), "leetspeak");
        assert_eq!(normalize_leetspeak("@ss"), "ass");
        assert_eq!(normalize_leetspeak("sh!t"), "shit");
        assert_eq!(normalize_leetspeak("normal"), "normal");
    }

    #[test]
    fn test_extra_banned_words() {
        let extra = vec!["badword".to_string(), "offensive".to_string()];

        assert!(has_explicit_slur_with_extra_words("badword", &extra));
        assert!(has_explicit_slur_with_extra_words("BADWORD", &extra));
        assert!(has_explicit_slur_with_extra_words("b.a.d.w.o.r.d", &extra));
        assert!(has_explicit_slur_with_extra_words("b-a-d-w-o-r-d", &extra));
        assert!(has_explicit_slur_with_extra_words("b_a_d_w_o_r_d", &extra));
        assert!(has_explicit_slur_with_extra_words("badword123", &extra));
        assert!(has_explicit_slur_with_extra_words("b4dw0rd", &extra));
        assert!(has_explicit_slur_with_extra_words("b4dw0rd789", &extra));
        assert!(has_explicit_slur_with_extra_words("b.4.d.w.0.r.d", &extra));
        assert!(has_explicit_slur_with_extra_words(
            "this contains badword here",
            &extra
        ));
        assert!(has_explicit_slur_with_extra_words("0ff3n$1v3", &extra));

        assert!(!has_explicit_slur_with_extra_words("goodword", &extra));
        assert!(!has_explicit_slur_with_extra_words("hello world", &extra));
    }
}