* Refactor anubis to split business logic into a lib, and cmd to just be direct usage.
* Post-rebase fixes.
* Update changelog, remove unnecessary one.
* lib: refactor this
This is mostly based on my personal preferences for how Go code should
be laid out. I'm not sold on the package name "lib" (I'd call it anubis
but that would stutter), but people are probably gonna import it as
libanubis so it's likely fine.
Packages have been "flattened" to centralize implementation with area of
concern. This goes against the Java-esque style that many people like,
but I think this helps make things simple.
Most notably: the dnsbl client (which is a hack) is an internal package
until it's made more generic. Then it can be made external.
I also fixed the logic such that `go generate` works and rebased on
main.
* internal/test: run tests iff npx exists and DONT_USE_NETWORK is not set
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: install deps
Signed-off-by: Xe Iaso <me@xeiaso.net>
* .github/workflows: verbose go tests?
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: sleep 2
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: nix this test so CI works
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: warmup per browser?
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: disable for now :(
Signed-off-by: Xe Iaso <me@xeiaso.net>
* lib/anubis: do not apply bot rules if address check fails
Closes #83
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* Cleanup regex
We were going overkill on the escape characters
* Update docs/docs/CHANGELOG.md
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>
---------
Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
The example/default bot policy document had a rule to allow RSS readers
through based on paths that end with ".rss", ".xml", ".atom", or
".json". Frameworks like Rails will treat these specially, meaning that
going to /things/12345-whateverhaha.json could bypass Anubis.
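
A minimal sketch of why the extension-based rule was risky. The regex below is an assumed approximation of the removed rule, not the exact pattern from the policy file, and `bypassesChallenge` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"regexp"
)

// feedPathRule approximates the removed allow rule: any path that ends in a
// feed-like extension. The exact pattern is an assumption for illustration.
var feedPathRule = regexp.MustCompile(`\.(rss|xml|atom|json)$`)

// bypassesChallenge reports whether a request path matches the
// extension-based allow rule and would therefore skip the challenge.
func bypassesChallenge(path string) bool {
	return feedPathRule.MatchString(path)
}

func main() {
	paths := []string{"/blog.rss", "/things/12345-whateverhaha.json", "/things/12345"}
	for _, p := range paths {
		fmt.Printf("%-35s allowed without challenge: %v\n", p, bypassesChallenge(p))
	}
}
```

Any framework that serves the same resource under a `.json` suffix turns the second path into a way around the challenge.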
I checked the history of this rule and it was present in the original
example policy file in Xe/x. This rule is likely a mistake and it has
been removed. I think it was for making my blog still work with RSS
readers.
Thanks to Graham Sutherland for reporting this over email.
Signed-off-by: Xe Iaso <me@xeiaso.net>
hash.Write never returns an error, so removing it from
the results simplifies usage and eliminates dead error handling.
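
A small sketch of the resulting shape, assuming a helper that hashes a string to hex. The name `sha256Hex` is hypothetical and may not match the function in the codebase:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the hex-encoded SHA-256 digest of text. It returns only
// the string: the hash.Hash interface documents that Write never returns an
// error, so there is nothing for callers to handle.
func sha256Hex(text string) string {
	h := sha256.New()
	h.Write([]byte(text)) // error intentionally ignored, see comment above
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	fmt.Println(sha256Hex("hello"))
}
```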
Signed-off-by: Alexander Yastrebov <yastrebov.alex@gmail.com>
* Added the possibility to define rules for remote addresses
* Added change in changelog
* Added check for X-Real-Ip and X-Forwarded-For when checking for remote address filtering
* cmd/anubis: refine IP filtering logic
* Optimize the configuration so that the IP trie is created once at
application start instead of dynamically being created every request.
* Document the changes in the changelog and docs site.
* Allow pure IP range filtering.
* Allow user agent based IP range filtering.
* Allow path based IP range filtering.
* Create --debug-x-real-ip-default flag for testing Anubis locally
without an HTTP load balancer.
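
The "build once at startup" idea can be sketched as below. This is not the trie from the change: it swaps in a plain slice of `net/netip` prefixes scanned linearly, and `prefixSet`, `newPrefixSet`, and `matches` are hypothetical names:

```go
package main

import (
	"fmt"
	"net/netip"
)

// prefixSet is a stand-in for the IP trie: parsed CIDR ranges held in a
// slice. It is built once and only read afterwards.
type prefixSet []netip.Prefix

// newPrefixSet parses every CIDR up front so that bad configuration fails at
// application start instead of on the first matching request.
func newPrefixSet(cidrs []string) (prefixSet, error) {
	set := make(prefixSet, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", c, err)
		}
		set = append(set, p.Masked())
	}
	return set, nil
}

// matches reports whether ip (for example the X-Real-Ip header value) falls
// inside any configured range. Unparseable addresses never match.
func (s prefixSet) matches(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	for _, p := range s {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	// Built once at startup, then shared by every request handler.
	set, err := newPrefixSet([]string{"10.0.0.0/8", "2001:db8::/32"})
	if err != nil {
		panic(err)
	}
	fmt.Println(set.matches("10.1.2.3"), set.matches("192.168.1.1"))
}
```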
---------
Co-authored-by: Xe Iaso <me@xeiaso.net>
Closes #30
Introduces the "challenge" field in bot rule definitions:
```json
{
"name": "generic-bot-catchall",
"user_agent_regex": "(?i:bot|crawler)",
"action": "CHALLENGE",
"challenge": {
"difficulty": 16,
"report_as": 4,
"algorithm": "slow"
}
}
```
This makes Anubis return a challenge page for every user agent with
"bot" or "crawler" in it (case-insensitively) with difficulty 16 using
the old "slow" algorithm but reporting in the client as difficulty 4.
This is useful when you want to make certain clients in particular
suffer.
Additional validation and testing logic has been added to make sure
that users do not define "impossible" challenge settings.
If no algorithm is specified, Anubis defaults to the "fast" algorithm.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat: allow binding to unix domain sockets
this is useful when the user does not want to expose more tcp ports than
needed. also simplifies configuration in some situations, like with nixos
modules as the socket paths can be automatically configured.
docs updated with additional configuration flags.
Signed-off-by: Cassie Cheung <me@soopy.moe>
* feat: graceful shutdown and cleanup on signal
this is needed to clean up left-over unix sockets, else on the next boot
listener panics with `address already in use`.
Co-authored-by: cat <cat@gensokyo.uk>
Signed-off-by: Cassie Cheung <me@soopy.moe>
* feat: support unix socket upstream targets
adds support for proxying unix socket upstreams, essentially allowing
anubis to run without listening on tcp sockets at all*.
*for metrics, neither prometheus nor victoriametrics supports scraping
from unix sockets. if metrics are desired, tcp sockets are still needed.
Co-authored-by: cat <cat@gensokyo.uk>
Signed-off-by: Cassie Cheung <me@soopy.moe>
* docs: add changelog entry
---------
Signed-off-by: Cassie Cheung <me@soopy.moe>
Co-authored-by: cat <cat@gensokyo.uk>
* Explicitly define image sources
Explicitly referring to docker.io will make the build succeed on software such as podman which does not default to docker.io as the standard image source
* Dockerfiles: use the full legal docker.io/library name just in case
Signed-off-by: Xe Iaso <me@xeiaso.net>
* update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* fix: no duplicate work when exceeding that 1xxx number
* run go generate and update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* cmd/anubis: drastically optimize proof of work
Closes #12
Closes #17
This drastically optimizes the proof of work check by removing the
stringify call at every iteration. Additionally, this optimizes the
checks by running them in parallel for as many threads as the browser
has available (according to navigator.hardwareConcurrency).
This also changes the redirect lag to 250 milliseconds instead of 2000
milliseconds in order to be perceptually faster. This is below the
reaction time threshold of many people, so this will make the post-check
success phase perceptually instant.
Testing on an iPhone 7 Plus has shown that this can clear a difficulty 4
check in 3.4 seconds.
This actually optimizes the check so much it may be a logistical concern
for operators.
* cmd/anubis/js: fix happy cachebuster logic
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>