www/anubis
Proof-of-work check to stop AI bots
Branch: CURRENT
Version: 1.18.0
Package name: anubis-1.18.0
Maintainer: bsiegert

Anubis weighs the soul of your connection using a sha256 proof-of-work
challenge in order to protect upstream resources from scraper bots.
Installing and using this will likely result in your website not being
indexed by some search engines. This is considered a feature of Anubis,
not a bug.
This is a bit of a nuclear response, but AI scraper bots scraping so
aggressively have forced my hand. I hate that I have to do this, but
this is what we get for the modern Internet because bots don't conform
to standards like robots.txt, even when they claim to.
In most cases, you should not need this and can probably get by using
Cloudflare to protect a given origin. However, for circumstances where
you can't or won't use Cloudflare, Anubis is there for you.
Filesize: 770.734 KB
Version history:
- (2025-05-13) Updated to version: anubis-1.18.0
- (2025-05-09) Package added to pkgsrc.se, version anubis-1.16.0 (created)
CVS history:
2025-05-13 19:33:14 by Benny Siegert | Files touched by this commit (3)
Log message:
anubis: update to 1.18.0
v1.18.0: Varis zos Galvus
The big ticket feature in this release is CEL expression matching
support. This allows you to tailor your approach for the individual
services you are protecting.
These can be as simple as:
- name: allow-api-requests
  action: ALLOW
  expression:
    all:
      - '"Accept" in headers'
      - 'headers["Accept"] == "application/json"'
      - 'path.startsWith("/api/")'
Or as complicated as:
- name: allow-git-clients
  action: ALLOW
  expression:
    all:
      - >-
        (
          userAgent.startsWith("git/") ||
          userAgent.contains("libgit") ||
          userAgent.startsWith("go-git") ||
          userAgent.startsWith("JGit/") ||
          userAgent.startsWith("JGit-")
        )
      - '"Git-Protocol" in headers'
      - 'headers["Git-Protocol"] == "version=2"'
The docs have more information. This is a simple, lovable, and complete
implementation of this feature so that administrators can get hacking
ASAP.
Other changes:
- Use CSS variables to deduplicate styles
- Fixed native packages not containing the stdlib and botPolicies.yaml
- Change import syntax to allow multi-level imports
- Changed the startup logging to use JSON formatting as all the other
logs do.
- Added the ability to do expression matching with CEL
- Add a warning for clients that don't store cookies
- Disable Open Graph passthrough by default
- Clarify the license of the mascot images
- Started suppressing 'Context canceled' errors from http in the logs
v1.17.0: Asahi sas Brutus
v1.17.0 is a rather large release. This kind of giant feature release
will not happen again as this has caused significant problems with
testing in various configurations. Automated testing is being worked on
but I have nothing to report yet.
Big-ticket features include but are not limited to:
- Configuration can be in YAML or JSON
- Configuration snippets can be imported from the default library or
anywhere on the filesystem (see the sketch after this list)
- Default rules now flag "Opera" after seeing an attack in the wild
that does that
- Many documentation and build script fixes
- AI-robots.txt rules are added to the default config to stop the
worst offenders that care to identify themselves
- Apache, Nginx, and Traefik have gotten documentation
- Users can match by headers as well as user agents or paths
- Internal refactoring to make Anubis faster and easier to maintain
- "Secondary screening" has been removed to give a more consistent
user experience
- The Internet Archive is allowlisted by default
- X-Forwarded-For header calculation should be a bit better
- Subpath support (run anubis on /git)
- Many implicit things have been documented
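
To make the import and header-matching items above concrete, here is a rough
sketch of a bot policy that uses both. This is illustrative only: the exact
field names (import, headers_regex), the (data) prefix for the bundled snippet
library, and the file paths are assumptions based on the upstream
documentation rather than text from this release, so check the Anubis docs for
the version you are running.

bots:
  # Import a rule snippet from the library shipped with Anubis
  # ((data) is assumed to refer to the bundled defaults).
  - import: (data)/bots/ai-robots-txt.yaml
  # Import a snippet from an arbitrary path on the filesystem.
  - import: /etc/anubis/local-rules.yaml
  # Match on a request header in addition to user agent or path
  # (headers_regex is assumed to map a header name to a regular expression).
  - name: allow-git-protocol-v2
    headers_regex:
      Git-Protocol: "^version=2$"
    action: ALLOW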
2025-05-09 21:16:40 by Benny Siegert | Files touched by this commit (5)
Log message:
New package, www/anubis.
Anubis weighs the soul of your connection using a sha256 proof-of-work
challenge in order to protect upstream resources from scraper bots.
Installing and using this will likely result in your website not being
indexed by some search engines. This is considered a feature of Anubis,
not a bug.
This is a bit of a nuclear response, but AI scraper bots scraping so
aggressively have forced my hand. I hate that I have to do this, but
this is what we get for the modern Internet because bots don't conform
to standards like robots.txt, even when they claim to.
In most cases, you should not need this and can probably get by using
Cloudflare to protect a given origin. However, for circumstances where
you can't or won't use Cloudflare, Anubis is there for you.