hedgehog

joined 2 years ago
[–] hedgehog@ttrpg.network 5 points 1 hour ago (1 children)

Your comment wasn’t in a meta discussion; it was on a post where they were venting about people complaining about them having a women’s only space. There was certainly no indication that the regular community rules didn’t apply, nor any invitation for men to comment.

Commenting that it’s hostile for them to have a women’s only space might be ironic, but couldn’t possibly be good faith, in that context. And if the same mod banned you from multiple communities, then either it was out of line and you could appeal it, or it was warranted due to the perceived likelihood of you causing problems in those other communities and the perceived low likelihood of you contributing anything of value to them.

Even now, you’re acting like the mod(s) banned you because of her / their emotions. You don’t see how that’s misogynistic?

It makes logical sense for bad actors to be preemptively banned. Emotions have nothing to do with it.

[–] hedgehog@ttrpg.network 2 points 1 day ago* (last edited 1 day ago)

Right now I have Ollama / Open-WebUI, Kokoro FastAPI, ComfyUI, Wan2GP, and FramePack Studio set up. I recently (as in yesterday) configured an API key middleware with Traefik and placed it in front of Ollama and Comfy, but nothing is using them yet.
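For illustration, a client talking to Ollama through an API key proxy like that could look roughly like the sketch below. The hostname, the `X-API-Key` header name, and the helper names are all assumptions; the real header depends on how the middleware is configured. `/api/tags` is Ollama's endpoint for listing local models.

```python
import json
import urllib.request
from typing import List, Optional

# Hypothetical values: swap in your own proxy URL.
OLLAMA_BASE_URL = "https://ollama.example.internal"
# Assumed header name; it depends entirely on the middleware configuration.
API_KEY_HEADER = "X-API-Key"


def build_request(path: str, api_key: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request to the proxied Ollama API with the API key attached."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(OLLAMA_BASE_URL + path, data=data)
    request.add_header(API_KEY_HEADER, api_key)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def list_models(api_key: str) -> List[str]:
    """Return the names of the models the Ollama instance has available."""
    with urllib.request.urlopen(build_request("/api/tags", api_key)) as response:
        body = json.load(response)
    return [model["name"] for model in body.get("models", [])]
```

Anything that lets you set custom headers (n8n, most coding agents) should be able to do the equivalent.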

I’ll probably try out Devstral with one of the agentic coding frameworks, like Void or Anon Kode. I may also try out one of the FOSS writing studios (like Plot Bunni) and connect my own Ollama instance. I could use NovelCrafter but paying a subscription fee to use my own server for the compute intensive part feels silly to me.

I tried to use Open Notebook (basically a replacement for NotebookLM) with Ollama and Kokoro, with Kokoro FastAPI as my OpenAI endpoint, but it turns out it only supported, and required, text embeddings from OpenAI, so I couldn’t run it fully locally. At some point, if they don’t fix that, I’m planning to either add support myself or set up some routes with Traefik where the ones Open Notebook uses point to the service I want to use.
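The routing idea, sketched in Python rather than actual Traefik config: pretend to be a single OpenAI-style endpoint and send each path to whichever local service handles it. The hostnames are hypothetical, the ports are the usual defaults for Kokoro FastAPI and Ollama as far as I know, and I haven't checked which paths Open Notebook actually calls.

```python
# Hypothetical route table: OpenAI-style path prefix -> local upstream.
ROUTES = [
    ("/v1/audio/speech", "http://kokoro:8880"),      # TTS -> Kokoro FastAPI
    ("/v1/embeddings", "http://ollama:11434"),       # embeddings -> Ollama
    ("/v1/chat/completions", "http://ollama:11434"), # chat -> Ollama
]
DEFAULT_UPSTREAM = "http://ollama:11434"  # assumption: everything else goes to Ollama


def pick_upstream(path: str) -> str:
    """Return the upstream for the longest matching path prefix."""
    matches = [(prefix, upstream) for prefix, upstream in ROUTES
               if path.startswith(prefix)]
    if not matches:
        return DEFAULT_UPSTREAM
    # Prefer the most specific (longest) prefix when several match.
    return max(matches, key=lambda match: len(match[0]))[1]
```

In Traefik itself this would be a handful of routers with path prefix rules, one per service.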

ETA: n8n is one of the services I plan to set up next, and I’ll likely end up integrating both Ollama and Comfy workflows into it.

[–] hedgehog@ttrpg.network 1 points 2 days ago

You got the idea!

[–] hedgehog@ttrpg.network 1 points 2 days ago* (last edited 2 days ago) (2 children)

We’re in c/showerthoughts. “What if my grandma was a bike?” would fit right in

[–] hedgehog@ttrpg.network 3 points 5 days ago

To be clear, I agree that the line you quoted is almost assuredly incorrect. If they changed it to "thousands of deepfake apps powered by open source technology" then I'd still be dubious, simply because it seems weird that there would be thousands of unique apps that all do the same thing, but that would at least be plausible. Most likely they misread something like https://techxplore.com/news/2025-05-downloadable-deepfake-image-generators.html, took "model variant" (which, in this context, generally means a LoRA) to mean "app," and jumped too hard on the "everything is an open source app" bandwagon.

I did some research - browsing https://github.com/topics/deepfakes (which has 153 total repos listed, many of which are focused on deepfake detection), searching DDG, clicking through to related apps from GitHub repos, etc.

In terms of actual open source deepfake apps, let's assume that "app" means, at minimum, a piece of software you can run locally, assuming you have access to arbitrary consumer-targeted hardware - generally at least an Nvidia desktop GPU - and including it regardless of whether you have to write custom code to use it (so long as the code is included), use the CLI, hit an API, use a GUI app, a web browser, or a phone app. Even considering only apps whose primary use case is creating deepfakes by face swapping videos, there are several:

  • Roop
  • Roop Unleashed
  • Rope
  • Rope Live
  • VisoMaster
  • DeepFaceLab
  • DeepFaceLive
  • Reactor UI
  • inswapper
  • REFace
  • Refacer
  • Faceswap
  • deepfakes_faceswap
  • SimSwap

If you included forks of all those repos, then you'd definitely get into the thousands.

If you count video generation applications that can imitate people using, at minimum, Img2Img and 1 Lora OR 2 Loras, then these would be included as well:

  • Wan2GP
  • HunyuanVideoGP
  • FramePack Studio
  • FramePack eichi

And if you count the tools that integrate those, then these probably all count:

  • ComfyUI
  • Invoke AI
  • SwarmUI
  • SDNext
  • Automatic1111 SD WebUI
  • Fooocus
  • SD WebUI Forge
  • MetaStable
  • EasyDiffusion
  • StabilityMatrix
  • MochiDiffusion

> If the potential criminals use easier ready-made (commercial) web-services instead of buying a RTX 5090, learning ComfyUI, dealing with the steep learning curve etc, we’d know we have to primarily fight those apps and services, not necessarily the generative AI tools.

This is the part where, to be able to answer that, someone would need to go and actually test out the deepfake apps and compare their outputs. I know that they get used for deepfakes because I've seen the outputs, but as far as I know, every single major platform - e.g., Kling, Veo, Runway, Sora - has safeguards in place to prevent nudity and sexual content. I'd be very surprised if they were being used en masse for this.

In terms of the SaaS apps used by people seeking to create nonconsensual, sexually explicit deepfakes... my guess is those are actually not really part of the figure that's being referenced in this article. It really seems like they're talking about doing video gen with LoRAs rather than doing face swaps.

[–] hedgehog@ttrpg.network 3 points 6 days ago (2 children)

Without searching for them myself to confirm, it’s plausible, especially if you take it to mean “apps leveraging open source AI technology.”

There are a ton of open source AI repos, many of which provide video related capabilities. The number of true open source AI models is very slim, but “Open weight” AI models are commonly referred to as open source, and from the perspective of building your app, fine tuning the model, or creating Loras for it, open weight is good enough.

Some Loras come with details on the training data set, so even if the base model is only open weights, the Lora can still be open source.

Until recently, Civitai had Loras for famous people, e.g., Emma Watson, and apparently just regular people. There was a post here last week, I think (or maybe to some other community), to 404 Media, about those being taken down thanks to credit card processors drawing a line in the sand at deepfake imagery.

ComfyUI is a self hostable AI platform (and there are also many hosts that offer it) that lets you build a workflow from multiple nodes, each of which generally integrates some open source AI tech that was otherwise released. For example, there are nodes that add the capabilities to perform:

  • image generation with Stable Diffusion, Flux, Hidream, etc
  • TTS with KokoroTTS, Piper, F5 TTS, etc
  • video generation with AnimateDiff, Cog, Wan2.1, Hunyuan, FramePack, FantasyTalking, Float
  • video modification, e.g., LatentSync, which takes a video and lipsyncs it to a provided audio file
  • image manipulation, e.g., controlnet, img2img, inpainting, outpainting, or even specific tasks like “remove the background” or “change the face to this other face”

If you think of a deepfake as just a video of a recognizable person doing a thing, you can create a deepfake by:

  • taking an existing video and swapping the face in each frame
  • using a faceswap approach built specifically for video, e.g., Roop
  • using an image to video workflow, e.g., with Wan: “the person dances.” You can expand the options available with Wan by using Loras.
  • using a text to video workflow, where you use a Lora for that person
  • using an image+audio to video workflow, e.g., with FantasyTalking/Float, creating a lipsync to an audio file you provide
  • using a video+audio to video workflow with LatentSync to make it look like they said something different, particularly using a TTS (like F5 TTS) that does voice cloning to generate the new audio

My suspicion is that most of the AI apps that are available online are just repackaging these open source technologies, but are not open source themselves. There are certainly some open source apps, of course, though the ones I know of are more generic and not deepfake specific (ComfyUI, SwarmUI, Invoke AI, Automatic1111, Forge, Fooocus, n8n, FramePack Studio, FramePack Eichi, Wan2GP, etc.).

This isn’t a licensing issue, as many open source projects are licensed with MIT or Apache licenses, which don’t require you to open source derivative products. Even if they used the GPL, sharing the source wouldn’t be required for a SaaS web app, since the software is never distributed to its users. Only the AGPL would protect against that, and even then, only the changes to the AGPL library would need to be shared; the front end app could still be proprietary.

The other issue could be them not knowing what “app” means. If you think of a Lora as an app, then the sentence might be accurate. I don’t know for sure that there were thousands of Loras for people that published their training data, but I wouldn’t be surprised if that were the case.

[–] hedgehog@ttrpg.network 1 points 1 week ago (2 children)

Have you tried just setting the resolution to 1920x1080 or are you literally trying to run AAA games at 4K on a card that was targeting 1080p when it was released, 4 and a half years ago?

[–] hedgehog@ttrpg.network 1 points 1 week ago* (last edited 1 week ago)

I think the best way to handle this would be to just encode everything and upload all files. If I wanted some amount of history, I'd use some file system with automatic snapshots, like ZFS.

If I wanted to do what you've outlined, I would probably use rclone with filtering for the extension types or something along those lines.
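A rough idea of what the rclone route could look like, wrapped in Python so it can be scheduled or extended. The paths and the `server:music` remote name are made up; `--exclude` and `--dry-run` are regular rclone flags.

```python
import subprocess
from typing import List

# Lossless formats to keep off the server.
LOSSLESS_PATTERNS = ["*.flac", "*.wav"]


def build_rclone_command(source: str, destination: str,
                         dry_run: bool = True) -> List[str]:
    """Build an `rclone sync` command that skips lossless files."""
    command = ["rclone", "sync", source, destination]
    for pattern in LOSSLESS_PATTERNS:
        command += ["--exclude", pattern]
    if dry_run:
        # Only report what would be transferred or deleted.
        command.append("--dry-run")
    return command


def sync_library(source: str, destination: str, dry_run: bool = True) -> None:
    """Run the sync; raises CalledProcessError if rclone exits non-zero."""
    subprocess.run(build_rclone_command(source, destination, dry_run), check=True)
```

Something like `sync_library("/home/me/Music", "server:music")` would do a dry run first. Since `rclone sync` deletes files on the destination that are gone from the source, it's worth checking the dry run output before passing `dry_run=False`.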

If I wanted to do this with Git specifically, though, this is what I would try first:

First, add lossless extensions (*.flac, *.wav) to my repo's .gitignore

Second, schedule a job on my local machine that:

  1. Watches for changes to the local file system (e.g., with inotifywait or fswatch)
  2. For any new lossless file that doesn't already have an accompanying lossy file (identified by being collocated and having the exact same filename, sans extension, with an accepted extension, e.g., .mp3 or .ogg, possibly also confirming that the codec is up to my standards with a call to ffprobe, avprobe, mediainfo, exiftool, or something similar), encodes the file to my preferred lossy format.
  3. Uses git status --porcelain to check whether there have been any changes.
  4. If so, runs git add --all && git commit --message "Automatic commit" && git push
  5. Optionally, automatically craft a better commit message by checking which files have been changed, generating text like Added album: "Satin Panthers - EP" by Hudson Mohawke or Removed album: "Brat" by Charli XCX; Added album "Brat and it's the same but there's three more songs so it's not" by Charli XCX
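Steps 2 through 4 could be sketched like this. It's untested against a real library, the function names are mine, and the choice of .ogg as the target and the bare ffmpeg call (no quality settings) are assumptions. The file watching from step 1 is left to inotifywait or fswatch, which would just trigger `run_once`.

```python
import subprocess
from pathlib import Path
from typing import List

LOSSLESS_EXTENSIONS = {".flac", ".wav"}
LOSSY_EXTENSIONS = {".mp3", ".ogg"}
TARGET_EXTENSION = ".ogg"  # assumption: pick whatever lossy format you prefer


def lossless_files_missing_lossy(library: Path) -> List[Path]:
    """Return lossless files with no lossy sibling sharing the same stem."""
    missing = []
    for path in sorted(library.rglob("*")):
        if ".git" in path.parts or not path.is_file():
            continue
        if path.suffix.lower() not in LOSSLESS_EXTENSIONS:
            continue
        # Note: the sibling check is case-sensitive on most Linux file systems.
        if not any(path.with_suffix(ext).exists() for ext in LOSSY_EXTENSIONS):
            missing.append(path)
    return missing


def encode_command(source: Path) -> List[str]:
    """Build an ffmpeg command; -n refuses to overwrite an existing output."""
    return ["ffmpeg", "-n", "-i", str(source),
            str(source.with_suffix(TARGET_EXTENSION))]


def has_changes(repo: Path) -> bool:
    """True if `git status --porcelain` prints anything."""
    result = subprocess.run(["git", "status", "--porcelain"], cwd=repo,
                            capture_output=True, text=True, check=True)
    return bool(result.stdout.strip())


def run_once(repo: Path) -> None:
    """Encode what's missing, then commit and push if anything changed."""
    for source in lossless_files_missing_lossy(repo):
        subprocess.run(encode_command(source), check=True)
    if has_changes(repo):
        subprocess.run(["git", "add", "--all"], cwd=repo, check=True)
        subprocess.run(["git", "commit", "--message", "Automatic commit"],
                       cwd=repo, check=True)
        subprocess.run(["git", "push"], cwd=repo, check=True)
```

The better commit messages from step 5 would slot in where the hard-coded "Automatic commit" is, built from the same porcelain output.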

Third, schedule a job on my ~~remote machine~~ server that runs git pull at regular intervals.

One issue with this approach is that if you delete a file (as opposed to moving it), the space is not recovered on your local or your server. If space on your server is a concern, you could work around that by running something like the answer here (adjusting the depth to an appropriate amount for your use case):

git fetch --depth=1
git reflog expire --expire-unreachable=now --all
git gc --aggressive --prune=all

Another potential issue is that what I described above involves having an intermediary Git remote to push to and pull from, e.g., one running on a hosted Git forge, like GitHub, Codeberg, etc. Hosting your music there could result in getting copyright complaints or something along those lines, though.

Alternatively, you could use your server as the git server (or check out forgejo if you want a Git forge as well), but then you can't use the above trick to prune file history and save space from deleted files (on the server, at least - you could on your local, I think). If you then check out your working copy in a way such that Git can use hard links, you should at least be able to avoid needing to store two copies on your server.

~~The other thing to check out, if you take this approach, is git lfs.~~ EDIT: Actually, I take that back - you probably don't want to use Git LFS.

[–] hedgehog@ttrpg.network 12 points 1 week ago

It’s the new hyped up version of “no-code” or low-code solutions, but with AI so you have more flexibility to footgun.

[–] hedgehog@ttrpg.network 6 points 1 week ago

Not any lazier. Script kiddies didn’t write the code themselves, either.

[–] hedgehog@ttrpg.network 2 points 2 weeks ago (1 children)

Are you talking about a warning for a self signed cert or for not using HTTPS?

[–] hedgehog@ttrpg.network 3 points 2 weeks ago

It was already known before the whistleblower that:

  1. Siri inputs (all STT at that time, really) were processed off device
  2. Siri had false activations

The “sinister” thing that we learned was that Apple was reviewing those activations to see if they were false, with the stated intent (as confirmed by the whistleblower) of using them to reduce false activations.

There are also black box methods to verify that data isn’t being sent and that particular hardware (like the microphone) isn’t being used, and there are people who look for vulnerabilities as a hobby. If the microphones on the most/second most popular phone brand (iPhone, Samsung) were secretly recording all the time, evidence of that would be easy to find and would be a huge scoop - why haven’t we heard about it yet?

Snowden and Wikileaks dumped a huge amount of info about governments spying, but nothing in there involved always on microphones in our cell phones.

To be fair, an individual phone is a single compromise away from actually listening to you, so it still makes sense to avoid having sensitive conversations within earshot of a wirelessly connected microphone. But generally that’s not the concern most people should have.

Advertising tracking is much more sinister and complicated and harder to wrap your head around than “my phone is listening to me” and as a result makes for a much less glamorous story, but there are dozens, if not hundreds or thousands, of stories out there about how invasive advertising companies’ methods are, about how they know too much, etc. Think about what LLMs do with text. The level of prediction that they can do. That’s what ML algorithms can do with your behavior.

If you’re misattributing what advertisers know about you to the phone listening and reporting back, then you’re not paying attention to what they’re actually doing.

So yes - be vigilant. Just be vigilant about the right thing.

 

cross-posted from: https://lemmy.world/post/19716272

Meta fed its AI on almost everything you’ve posted publicly since 2007
