this post was submitted on 29 May 2025
17 points (90.5% liked)

Linux

I want to have a mirror of my local music collection on my server, and a script that periodically updates the server to, well, mirror my local collection.

But crucially, I want to convert all lossless files to lossy, preferably before uploading them.

That's the one reason why I can't just use git - or so I believe.

I also want locally deleted files to be deleted on the server.

Sometimes I even move files around (I believe in directory structure), and again, git would handle this perfectly if it weren't for the lossless-to-lossy caveat.

It would be perfect if my script could recognize that just like git does, instead of deleting and reuploading the same file to a different location.

My head is spinning round and round, and before I continue messing around with find and scp, it's time to ask the community.

I am writing in bash but if some python module could help with it I'm sure I could find my way around it.

TIA


additional info:

  • Not all files in the local collection are lossless; it's a variety of formats.
  • The purpose of the remote is for listening/streaming with various applications
  • The lossy version is for both reducing upload and download (streaming) bandwidth. On mobile broadband FLAC tends to buffer a lot.
  • The home of the collection (and its origin) is my local machine.
  • The local machine cannot act as a server.
top 8 comments
[–] DecentM@lemmy.ml 11 points 1 week ago (1 children)

Don't know of a solution that does this, but you could solve it with a two-step process. First, rsync the files to the server as-is, then use a background job on the server that converts lossless to lossy every hour or so.
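A rough sketch of what that second step (the server-side background job) could look like, assuming ffmpeg with libopus is available on the server; the extensions, bitrate, and the choice to delete the lossless original afterwards are all placeholders to adjust:

```python
import subprocess
from pathlib import Path

# Extensions to re-encode; everything else is left untouched.
LOSSLESS = {".flac", ".wav", ".aiff"}

def lossy_target(src: Path) -> Path:
    """Map a lossless file to its lossy counterpart (here: Opus)."""
    return src.with_suffix(".opus")

def ffmpeg_cmd(src: Path, dst: Path) -> list[str]:
    """Build the ffmpeg invocation; 192k Opus as suggested downthread."""
    return ["ffmpeg", "-i", str(src), "-c:a", "libopus", "-b:a", "192k", str(dst)]

def convert_tree(root: Path) -> None:
    """Convert every lossless file under root, then drop the original
    so only the lossy copy stays on the server."""
    for src in root.rglob("*"):
        if src.is_file() and src.suffix.lower() in LOSSLESS:
            dst = lossy_target(src)
            if not dst.exists():
                subprocess.run(ffmpeg_cmd(src, dst), check=True)
                src.unlink()
```

Run via a cron entry or systemd timer, e.g. `convert_tree(Path("/srv/music"))` every hour.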

Storage is really cheap these days though, why compress to lossy in the first place?

[–] N0x0n@lemmy.ml 1 points 1 week ago

You can compress to lossy and still not hear the difference between them, while saving a ton of storage. Opus 192k is really good and mostly transparent.

I do agree that storage is cheap; however, if you have to make backups, it gets saturated very fast!

[–] QuazarOmega@lemy.lol 9 points 1 week ago

Why use git exactly? You're never changing the content of the files themselves (excluding the effect of lossy compression) so you also don't need to track those changes, right?
This seems more like a job for rsync.
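For the plain mirroring part, a minimal rsync invocation could be built like this (hypothetical paths; `--delete` propagates local deletions to the server, which is one of the stated requirements, and the trailing slash on the source copies its contents rather than the directory itself):

```python
import subprocess

def rsync_cmd(src: str, dest: str) -> list[str]:
    """Build an rsync mirror command: -a preserves metadata,
    --delete removes remote files that no longer exist locally."""
    return ["rsync", "-a", "--delete", src.rstrip("/") + "/", dest]

# To actually run it (requires ssh access to the server):
# subprocess.run(rsync_cmd("~/Music", "user@server:/srv/music"), check=True)
```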

Aside from that, I don't know how to achieve the full setup you're trying to create, sorry.

[–] eager_eagle@lemmy.world 7 points 1 week ago* (last edited 1 week ago) (2 children)

> I want to convert all lossless files to lossy, preferably before uploading them

so it's not exactly a mirror, right?

here's an idea:

  • A - the source containing lossless files
  • B - the local storage of lossy files
  • C - a remote mirror of B

With that, you can do:

  • A -> B: a systemd service that makes this conversion.
  • B -> C: git or syncthing to mirror and/or version control.

This uses more storage than you probably intended (the lossy files are also stored locally), but it's a true mirror.
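A minimal sketch of the A -> B step (hypothetical paths and helper names; the B -> C step would then be a plain syncthing or rsync run over B):

```python
from pathlib import Path

def needs_update(src: Path, dst: Path) -> bool:
    """Re-encode/copy only when the target is missing or the source is newer."""
    return (not dst.exists()) or src.stat().st_mtime > dst.stat().st_mtime

def plan_a_to_b(a: Path, b: Path,
                lossless=frozenset({".flac", ".wav"})) -> list[tuple[Path, Path]]:
    """Return (source, target) pairs the conversion service should process.
    Lossless files change extension to .opus; everything else is copied as-is."""
    jobs = []
    for src in a.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(a)
        dst = b / (rel.with_suffix(".opus") if src.suffix.lower() in lossless else rel)
        if needs_update(src, dst):
            jobs.append((src, dst))
    return jobs
```

The mtime comparison keeps the periodic run cheap: unchanged files produce no work.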

> so it's not exactly a mirror, right?

Correct!

Your workflow is actually something I thought of, but the duplication of all lossy files would be a bit too much, and replacing them with symlinks would not work with git, afaik.

I don't think git is the right tool for this. It's designed for text files, not binary. Also, there's no need for version control here. Git won't store diffs of binary files, so if a file changes (even the slightest change like an mp3 tag) it will keep a full copy of the old file.
OP wants to sync, so I would use rsync here. It will be way faster and more efficient. If you want to know what rsync did, you can keep a log file of its output.

[–] bastion@feddit.nl 3 points 1 week ago (1 children)

Make a script. I'd use xonsh or python with sh.py.

  • create a dict for the remote-to-local filename map
  • walk your local collection
    • for each file, determine what the correct remote name (including a valid extension) would be, and add the pair to the dict, with remote filenames as keys and local filenames as values
  • build a set like local_munged_names from that dict's keys
  • walk your remote tree, and store the filenames in a set like remote_names
  • names_to_upload = local_munged_names - remote_names
  • for each name in names_to_upload, look up the local filename in the remote-to-local map; then encode it if it needs encoding, and upload
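The steps above could be sketched like this (pure planning logic only; walking the two trees and the actual encode/upload calls are left out, and a deletion set is added since the OP also wants remote deletes):

```python
from pathlib import PurePosixPath

# Extensions that get re-encoded before upload (adjust to taste).
LOSSLESS = {".flac", ".wav"}

def munge(local_name: str) -> str:
    """Map a local filename to the name it should have on the remote."""
    p = PurePosixPath(local_name)
    return str(p.with_suffix(".opus")) if p.suffix.lower() in LOSSLESS else local_name

def plan(local_files: list[str], remote_files: list[str]):
    """Return (uploads, deletions): uploads maps remote name -> local source;
    deletions are remote files with no local counterpart."""
    remote_to_local = {munge(name): name for name in local_files}
    local_munged_names = set(remote_to_local)
    remote_names = set(remote_files)
    uploads = {n: remote_to_local[n] for n in local_munged_names - remote_names}
    deletions = remote_names - local_munged_names
    return uploads, deletions
```

For example, `plan(["a/b.flac", "c.mp3"], ["c.mp3", "old.opus"])` says to upload `a/b.flac` as `a/b.opus` and delete `old.opus` remotely.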
[–] vandsjov 2 points 4 days ago

I like this. I would probably overcomplicate things with an index CSV file (or SQL) that stores checksum values of files to identify renamed or moved files.
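A sketch of that idea: with a path -> checksum index kept on both sides, a moved file shows up as the same checksum under a new path, so it can be handled as a server-side move instead of a delete plus re-upload (the index shape and function names here are made up):

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Checksum of the file contents; identifies a file across renames."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def detect_moves(old_index: dict[str, str], new_index: dict[str, str]):
    """Both indexes map path -> checksum. Return {old_path: new_path} for
    files whose content is unchanged but whose path differs."""
    by_hash_old = {h: p for p, h in old_index.items()}
    moves = {}
    for new_path, h in new_index.items():
        old_path = by_hash_old.get(h)
        if old_path and old_path != new_path and old_path not in new_index:
            moves[old_path] = new_path
    return moves
```

One caveat for this thread's setup: the checksums must be taken on the same representation on both sides (e.g. hash the converted lossy file, not the local FLAC), or the hashes will never match.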