this post was submitted on 16 Jun 2025
105 points (94.9% liked)

Selfhosted


I've tried coding, and every model I've tried fails unless the task is a really, really basic small function like what you learn as a newbie, whereas something like 4o mini can spit out more sensible stuff that works.

I've tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

So, what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

[–] shnizmuffin@lemmy.inbutts.lol 8 points 1 day ago (2 children)

Hey, you're treating that data with the respect it demands, right? And you definitely collected consent from those chat participants before you Hoover'd up their [re-reads example] extremely Personal Identification Information AND Personal Health Information, right? Because if you didn't, you're in violation of a bunch of laws and the Twitch TOS.

[–] CrayonDevourer@lemmy.world 1 points 19 hours ago* (last edited 19 hours ago) (2 children)

If I say my name is Doo doo head in a public park, and someone happens to overhear it, they can do with that information whatever they want. Same thing. If you wanna spew your personal life on Twitch, there are bots that listen to all of the channels everywhere on Twitch. They aren't violating any laws, or the Twitch TOS. So, *buzzer* WRONG.

Right now, the same thing is being done to you on Lemmy. And Reddit. And Facebook. And everywhere else.

Look at a bot called "FrostyTools" for Twitch. It reads Twitch chat and uses an AI to provide summaries of chat every 30 minutes or so. If that's not violating the TOS, then neither am I. And thousands upon thousands of people use FrostyTools.

I have the consent of the streamer, I have the consent of Twitch (through their developer API), and upon using Twitch, you give the right to them to collect, distribute, and use that data at their whim.

[–] aksdb@lemmy.world 5 points 17 hours ago (2 children)

> So, *buzzer* WRONG.

Quite arrogant after you just constructed a faulty comparison.

> If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing.

That's absolutely not the same thing. Overhearing something that is in the background is fundamentally different from actively recording everything going on in a public space. You film yourself or some performance in a park and someone happens to be in the background? No problem. You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction. The intent of the recording(s) and the reasonable expectations of the people recorded are factored in in many jurisdictions, and being in public doesn't automatically entail consent to being recorded.

See for example https://www.freedomforum.org/recording-in-public/

(And just to clarify: I am not arguing against your explanation of Twitch's TOS, only against the bad comparison you brought.)

[–] kattfisk@lemmy.dbzer0.com 3 points 6 hours ago (1 children)

You're both getting side-tracked by this discussion of recording. The recording is likely legal in most places.

It's the processing of that unstructured data to extract and store personal information that is problematic. At that point you go from simply recording a conversation of which you are a part, to processing and storing people's personal data without their knowledge, consent, or expectation.

[–] aksdb@lemmy.world 1 points 5 hours ago (1 children)

True.

Although in Germany, for example, recording itself can also be an issue. If you have a security camera pointed at a public space (that can include the sidewalk in front of your house), passersby can sue you to take it down and potentially get you fined. Even pretending to constantly record such an area can yield that result.

[–] tfm@europe.pub 1 points 1 hour ago

I'm not a lawyer, but I suppose it would depend on the ToS and whether the user agrees to the recording and processing. But if it allows extraction of the user's real identity, it's probably a GDPR issue.

[–] CrayonDevourer@lemmy.world 0 points 8 hours ago* (last edited 8 hours ago) (1 children)

> You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction.

Literally not. The police use this right now to record your location and time seen using license plates all over the nation - with private corporations providing the service.

> and being in public doesn't automatically entail consent to being recorded.

And yes, it's called a 'reasonable expectation of privacy'. Public venues are not 'private' locations, and thus recording there does not need consent. You can, quite literally, record anyone in public.

Even the link you provided agrees.

[–] tfm@europe.pub 1 points 1 hour ago

In the US maybe, but not in Germany, Austria, and probably most countries in Europe.

[–] catty@lemmy.world 3 points 15 hours ago (1 children)

Doesn't Twitch own all data that is written? Their TOS probably states something like you can't store the data locally yourself.

[–] CrayonDevourer@lemmy.world -1 points 8 hours ago* (last edited 8 hours ago) (2 children)

I'm not storing their data. I'm feeding it to an LLM, which infers things, and I'm storing those inferences.

[–] catty@lemmy.world 2 points 3 hours ago (1 children)

Was this system vibe coded? I get the feeling it was...

[–] CrayonDevourer@lemmy.world 0 points 2 hours ago* (last edited 2 hours ago) (1 children)

There's not actually that much code. It's like 8 lines for an AI 'agent', and maybe another 16 lines for 'tools'. I'm using Streamlink for grabbing the audio stream, and PulseAudio has a 'monitor' device you can use to listen to what's playing on the speakers. Throw it on a very minimal Linux distro on a VM, and that's it.

I don't do 'vibe coding', but that IS where I got the idea from. People who are doing 'vibe coding' nowadays aren't just plugging things into a generic AI. They're spinning up 'agents' and making tools via MCP, and then those agents are tasked with specific things and use the tools to directly write to files, search the internet, read documents, etc.
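
For anyone wondering what "an agent plus a few tools" roughly looks like, here is a minimal sketch in Python. It is not the actual code from the setup described above. Every name in it (`tool`, `save_note`, `word_count`, `fake_model`, `run_agent`, `streamlink_command`) is made up for illustration, the "model" is a stub that always requests the same tool, and the `audio_only` stream name passed to Streamlink is an assumption about what the channel offers. The command is only built, never executed.

```python
import json
from typing import Callable, Dict, List

# Registry mapping tool names to plain Python functions (hypothetical names).
TOOLS: Dict[str, Callable[..., str]] = {}

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

NOTES: List[str] = []

@tool
def save_note(text: str) -> str:
    """Keep a note in memory; a real tool might write to a file instead."""
    NOTES.append(text)
    return f"saved note #{len(NOTES)}"

@tool
def word_count(text: str) -> str:
    """Count whitespace-separated words."""
    return str(len(text.split()))

def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: always asks for save_note as a JSON tool request."""
    return json.dumps({"tool": "save_note", "args": {"text": f"summary of: {prompt}"}})

def run_agent(prompt: str, model: Callable[[str], str] = fake_model) -> str:
    """Ask the model for one tool call, then dispatch it through the registry."""
    request = json.loads(model(prompt))
    name = request.get("tool")
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**request.get("args", {}))

def streamlink_command(channel_url: str, quality: str = "audio_only") -> List[str]:
    """Build (but do not run) a Streamlink command that writes the stream to stdout.
    Assumes the channel exposes a stream named by `quality`."""
    return ["streamlink", "--stdout", channel_url, quality]
```

Calling `run_agent("hello chat")` goes through the stub model and ends up in `save_note`. Swapping `fake_model` for a function that calls a local model is where the real work would be, along with validating whatever the model sends back before dispatching it.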

[–] tfm@europe.pub 1 points 1 hour ago

I'd also consider it vibe coding if you write a script with AI that you don't understand. Basically, if you wouldn't be able to do it on your own, it's vibe coding.

[–] catty@lemmy.world 1 points 3 hours ago

lol. Way to contradict yourself.

[–] carl_dungeon@lemmy.world -2 points 23 hours ago (1 children)
[–] interdimensionalmeme@lemmy.ml 1 points 17 hours ago (2 children)

There is no expectation of privacy in public spaces. Participants in these streams, which are open to all, are not prohibited from repeating what they have heard.

[–] kattfisk@lemmy.dbzer0.com 2 points 6 hours ago (1 children)

Repeating what they heard is very different from automatically processing the chat to harvest personal information about the participants.

Just because some data is publicly available doesn't mean all processing of that data is legal and moral.

[–] interdimensionalmeme@lemmy.ml 0 points 5 hours ago

It is qualitatively equivalent. Any single piece of information could have been copied, so it is safe to assume it has all been copied.

Although I would be on board with supporting an expectation of privacy in public spaces and making private CCTV recording illegal.

[–] carl_dungeon@lemmy.world 1 points 11 hours ago

Right, and what I was saying was that even if it wasn't "public", single-party consent means the person recording can be that single party, so it's still a non-issue.