Will this further fuck up the inaccurate nature of AI results? While I'm rooting against shitty AI usage, the general population is still trusting it and making results worse will, most likely, make people believe even more wrong stuff.
The article says it's not poisoning the AI data, only providing valid facts. The scraper still gets content, just not the content it was aiming for.
Edit:
It is important to us that we don’t generate inaccurate content that contributes to the spread of misinformation on the Internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.
and the data for the LLM is now salted with procedural garbage. it's great!
If you're dumb enough and care little enough about the truth, I'm not really going to try coming at you with rationality and sense. I'm down to do an accelerationism here. fuck it. burn it down.
remember; these companies all run at a loss. if we can hold them off for a while, they'll stop getting so much investment.
The problem I see with poisoning the data is the AIs being trained for law enforcement hallucinating false facts that get used to arrest and convict people.
Law enforcement AI is a terrible idea and it doesn't matter whether you feed it "false facts" or not. There's enough bias in law enforcement that the data is essentially always poisoned.
that's the entire point of laws, though, and it was already being used for that.
giving the laws better law stuff will not improve them. the law is malevolent. you cannot fix it by offering to help.
So we're burning fossil fuels and destroying the planet so bots can try to deceive one another on the Internet in pursuit of our personal data. I feel like dystopian cyberpunk predictions didn't fully understand how fucking stupid we are...
So they rewrote Nepenthes (or Iocaine, Spigot, Django-llm-poison, Quixotic, Konterfai, Caddy-defender, plus inevitably some Rust versions)
Edit: but with ✨AI✨ and apparently only true facts
while allowing legitimate users and verified crawlers to browse normally.
What is a "verified crawler" though? What I worry about is, is it only big companies like Google that are allowed to have them now?
I assume a crawler which adheres to robots.txt
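For anyone wondering what "adheres to robots.txt" looks like in practice, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleAIBot` user agent, and the `is_allowed` helper are all made up for illustration; this is only the check a well-behaved crawler would run before fetching, not how Cloudflare decides who counts as "verified".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for this example.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeBrowser"):
        for path in ("/blog/post", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

Of course this is purely voluntary on the crawler's side, which is the whole reason things like the labyrinth exist.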
I dunno. I don't find any sympathy with any of these fuckers though. this is not a generally useful technology, it is not something the average person ever needs to see, and honestly, just fuck em. Fuck anyone messing with open source to engorge the garbage dispenser.
Any accessibility service will also see the "hidden links", and while a blind person with a screen reader will notice if they wander off into generated pages, it will waste their time too. Especially if they don't know about such a "feature", they'll be very confused.
Also, I don't know about you, but I absolutely have a use for crawling X, Google maps, Reddit, YouTube, and getting information from there without interacting with the service myself.
Be great if these reinforced facts.
Earth is an imperfect oblate spheroid.
Humans landed on the Moon.
Taiwan is an independent nation.
Edit: incorporated better information
I am not happy with how much the internet relies on Cloudflare. However, they have a strong set of products.