this post was submitted on 23 Jan 2025
130 points (100.0% liked)

[–] ebu@awful.systems 13 points 9 months ago (1 children)

f4mi's channel is fantastic btw, fun little deep dives on old hardware and games. highly recommend checking her stuff out

[–] dgerard@awful.systems 8 points 9 months ago

the Smasnug investigation is amazing

[–] pennomi@lemmy.world 9 points 9 months ago (2 children)

This works on some scrapers, but a lot of AI stuff runs its own speech-to-text pass to generate the transcript instead of trusting the provided subtitles.

[–] sailor_sega_saturn@awful.systems 15 points 9 months ago* (last edited 9 months ago)

The video mentions this, as well as other practical limitations (like OOMing the YouTube phone app lol).

Really there are fairly straightforward technical ways around these techniques -- out of bounds or invisible subtitles can be cropped, or individual letters can be formed into paragraphs the same way PDF readers do; but it's still funny that it works at all and involves the word ass.
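To make the "cropping" idea concrete, here is a minimal sketch of a filter a scraper could apply to Advanced SubStation Alpha (`.ass`) dialogue lines. It is an illustration only, not anything shown in the video: the function name `visible_text`, the 1920x1080 default canvas, and the choice to check only `\pos` and `\alpha`/`\1a` override tags are all assumptions, and a real renderer handles many more cases (styles, `\move`, clipping, tiny font sizes).

```python
import re

# Override tags of interest inside an ASS Text field (assumed subset).
POS_RE = re.compile(r"\\pos\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")
ALPHA_RE = re.compile(r"\\(?:alpha|1a)&H([0-9A-Fa-f]{2})&")
TAG_RE = re.compile(r"\{[^}]*\}")


def visible_text(dialogue_line, width=1920, height=1080):
    """Return the plain text of one ASS Dialogue line, or None if it looks hidden.

    Hypothetical helper: treats a line as hidden when it is positioned
    outside the assumed canvas or is marked fully transparent.
    """
    if not dialogue_line.startswith("Dialogue:"):
        return None
    # Text is the 10th field and may itself contain commas, so cap the split.
    fields = dialogue_line[len("Dialogue:"):].split(",", 9)
    if len(fields) < 10:
        return None
    text = fields[9]

    pos = POS_RE.search(text)
    if pos:
        x, y = float(pos.group(1)), float(pos.group(2))
        if not (0 <= x <= width and 0 <= y <= height):
            return None  # positioned off-screen

    alpha = ALPHA_RE.search(text)
    if alpha and int(alpha.group(1), 16) == 0xFF:
        return None  # in ASS, alpha FF means fully transparent

    # Strip the remaining {...} override blocks and return what a viewer would read.
    return TAG_RE.sub("", text).strip()
```

Running every `Dialogue:` line through a filter like this and keeping only the non-`None` results would drop the decoy lines while keeping the real captions, which is why the trick is more of a speed bump than a wall.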

It comes on the coattails of a long history of AI companies not caring at all about security, privacy, data integrity, or being nice people.

[–] michael_palmer@lemmy.sdf.org 6 points 9 months ago

We can insert quick speech fragments into the video, just like at the end of a radio commercial.

[–] MichaelMuse@programming.dev 0 points 2 months ago

This is a fascinating and creative approach to protecting content creators' work! Using Cyrillic characters to create '.аss' subtitle files that confuse AI scrapers is quite clever.

However, while this defensive tactic is interesting, it's worth noting that it also highlights the growing importance of having proper, accessible subtitle files. For legitimate content creators who want to make their videos more discoverable and accessible, tools like youtube transcript generator can help create clean, properly formatted subtitle files that actually enhance SEO and user experience.

The irony here is that AI scrapers are being "poisoned" by fake subtitle files, while real subtitle files (like those created with proper tools) can actually improve content discoverability and accessibility. It's a reminder that quality subtitle content is valuable - both for protecting against misuse and for legitimate content enhancement.

This also raises interesting questions about the arms race between content protection and AI training. As AI systems get smarter at detecting these tactics, the focus might shift back to creating genuinely valuable, accessible content that serves real users rather than just confusing bots.