this post was submitted on 12 Feb 2025
12 points (83.3% liked)

Open Source


After dabbling in the world of LLM poisoning, I realised that I simply do not have the skill set (or brain power) to effectively poison LLM web scrapers.

I am trying to work with what I know/understand. I have fail2ban installed on my static web server. Is it possible to get a massive list of IP addresses known to scrape websites and add them to the ban list?
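One hedged way to do this with fail2ban as-is: download a blocklist, then feed each entry to `fail2ban-client set <jail> banip <ip>`, which manually bans an address in an existing jail. The jail name `scrapers`, the file name, and the list contents below are made-up examples; the sketch just validates the list and generates the commands rather than running them:

```python
# Sketch: turn a downloaded blocklist into fail2ban-client ban commands.
# The jail name "scrapers" and the sample list are hypothetical.
import ipaddress

def ban_commands(blocklist_text, jail="scrapers"):
    """Yield one fail2ban-client command per valid entry in the list."""
    for line in blocklist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            # Accept single IPs or CIDR ranges; reject anything malformed
            # rather than feeding garbage to the firewall.
            ipaddress.ip_network(line, strict=False)
        except ValueError:
            continue
        yield f"fail2ban-client set {jail} banip {line}"

cmds = list(ban_commands("1.2.3.4\n# comment\n10.0.0.0/8\nnot-an-ip\n"))
```

Note that a "massive" static list is exactly the approach the reply below argues against; fail2ban shines when it reacts to your own logs.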

[–] lungdart@lemmy.ca 8 points 1 week ago

Fail2ban is not a static security policy.

It's a dynamic firewall: it ties log entries to time-boxed firewall rules.

For instance, you could auto-ban for 1h any source that requests robots.txt on your web server. I've heard AI data scrapers actually use robots.txt to find the valuable paths to target rather than respecting it.
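That idea could be sketched as a fail2ban filter plus jail. This is a minimal example, assuming an nginx-style access log at the usual path; the filter and jail names, the regex, and the log path are all illustrative, not a tested ruleset:

```ini
# /etc/fail2ban/filter.d/robots-bait.conf  (hypothetical filter name)
[Definition]
# Match any request for robots.txt in a common combined-format access log.
failregex = ^<HOST> .* "GET /robots\.txt

# /etc/fail2ban/jail.local  (append)
[robots-bait]
enabled  = true
port     = http,https
filter   = robots-bait
logpath  = /var/log/nginx/access.log
maxretry = 1
bantime  = 1h
```

Be careful with a rule this aggressive: legitimate crawlers you *want* (search engines, archive bots) also fetch robots.txt, so you'd probably want an `ignoreip` list or a decoy path instead of robots.txt itself.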