this post was submitted on 17 Oct 2025
1380 points (99.2% liked)

linuxmemes


    Does lemmy have any communities dedicated to archiving/hoarding data?

    kayzeekayzee@lemmy.blahaj.zone 185 points 1 day ago (5 children)

    For Wikipedia you'll want to use Kiwix. A full backup of Wikipedia is only like 100GB, and I think that includes pictures too.

    CrabAndBroom@lemmy.ml 1 points 12 hours ago

    You can also offline the whole of Project Gutenberg with Kiwix, it's about 70GB IIRC.

    clif@lemmy.world 38 points 1 day ago* (last edited 1 day ago) (2 children)

    Last time I updated it was closer to 120GB but if you're not sweating 100 GB then an extra 20 isn't going to bother anyone these days.

    Also, thanks for reminding me that I need to check my dates and update.
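For the "check my dates" part, here's a rough sketch of how you could flag old ZIMs from a script. It assumes the filenames end in `_YYYY-MM.zim` (e.g. `wikipedia_en_all_maxi_2024-01.zim`), which is how Kiwix downloads are usually named; the function names and the 12-month threshold are made up for illustration.

```python
import re
from datetime import date
from pathlib import Path

# Assumption: ZIM filenames end in _YYYY-MM.zim, as Kiwix downloads usually do.
ZIM_DATE = re.compile(r"_(\d{4})-(\d{2})\.zim$")


def zim_age_months(filename, today):
    """Return the age of a ZIM in whole months, or None if no date is found."""
    match = ZIM_DATE.search(filename)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    return (today.year - year) * 12 + (today.month - month)


def stale_zims(directory, today, max_age_months=12):
    """List (name, age) for ZIM files in `directory` older than the threshold."""
    stale = []
    for path in sorted(Path(directory).glob("*.zim")):
        age = zim_age_months(path.name, today)
        if age is not None and age > max_age_months:
            stale.append((path.name, age))
    return stale


if __name__ == "__main__":
    # Scans the current directory; point it at wherever your ZIMs live.
    for name, age in stale_zims(".", date.today()):
        print(f"{name}: {age} months old")
```

Files without a date in the name are skipped rather than guessed at.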

    EDIT: you can also easily configure an SBC like a Raspberry Pi (or any of the clones) that will boot, set the Wi-Fi to access point mode, and serve Kiwix as a website that anyone (on the local AP Wi-Fi network) can connect to and query... And it'll run off a USB battery pack. I have one kicking around the house somewhere.

    Fmstrat@lemmy.world 2 points 19 hours ago* (last edited 19 hours ago) (1 children)

    Do you recommend adding anything else to it?

    For instance, OSM maps?

    I've been thinking about running the Kiwix app + OsmAnd on an old Android phone and auto-updating it once a year.

    clif@lemmy.world 1 points 18 hours ago (2 children)

    That's a good question (and good idea) that I hadn't really thought about beyond a collection of ZIMs. The one I built advertises its own AP SSID that anyone can connect to and then access the ZIMs, which are served via kiwix-serve over HTTP on port 80. That is, I wanted a single, low-power, headless device that multiple people could use simultaneously via Wi-Fi and a browser, rather than a personal device.
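Roughly, the serving side could be launched like this. It's only a sketch: `kiwix-serve` takes a `--port` option followed by one or more ZIM paths, but the `/srv/zims` directory and the helper name are made up for illustration, and the access point setup isn't covered here.

```python
import subprocess
from pathlib import Path


def kiwix_serve_command(zim_paths, port=80):
    """Build the argv for kiwix-serve over the given ZIM files."""
    zims = sorted(str(p) for p in zim_paths)
    if not zims:
        raise ValueError("no ZIM files given")
    return ["kiwix-serve", f"--port={port}", *zims]


if __name__ == "__main__":
    # Hypothetical location of the ZIM collection.
    zim_dir = Path("/srv/zims")
    command = kiwix_serve_command(zim_dir.glob("*.zim"))
    # Binding to port 80 normally needs root or an equivalent capability.
    subprocess.run(command, check=True)
```

If you don't want to run it with elevated privileges, a high port like 8080 works the same way; people just have to type the port in the browser.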

    I hadn't really thought about other helpful services past that. I mean, we've got a (wee) server so why not use it? I like the idea of OSM, and their website is open source, but it has a lot of dependencies:

    openstreetmap-website is a Ruby on Rails application that uses PostgreSQL as its database, and has a large number of dependencies for installation

    A fully-functional openstreetmap-website installation depends on other services, including map tile servers and geocoding services, that are provided by other software. The default installation uses publicly-available services to help with development and testing.

    I wonder how hard it would be to host everything it needs locally/offline... and what that would do to power consumption : )
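To get a feel for the scale of the tile side, here's a back-of-the-envelope sketch. It assumes the usual "slippy map" XYZ scheme, where zoom level z covers the world with 2^z by 2^z tiles; whether you'd pre-render tiles at all, rather than render on demand, is a separate question, and the function names are just illustrative.

```python
import math


def tiles_at_zoom(zoom):
    """Number of tiles covering the whole world at one zoom level."""
    return 4 ** zoom


def tiles_up_to_zoom(max_zoom):
    """Total tiles for every zoom level from 0 through max_zoom."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


def tile_for(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile containing a coordinate in the XYZ scheme."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y


if __name__ == "__main__":
    for zoom in (8, 12, 14):
        print(zoom, tiles_up_to_zoom(zoom))
```

The count quadruples with every zoom level, so a whole-world set through zoom 14 is already about 358 million tiles. Limiting the area or the maximum zoom is probably what makes this practical on a small board.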

    Thanks for the idea - something to look into, for sure.

    Fmstrat@lemmy.world 1 points 13 hours ago (1 children)

    I might beat you to it. I've got Kiwix running in Docker, just did a PR to the kiwix-zim-updater so it can run in Docker on a cron schedule next to the server, and have spun those up with Karakeep (a self-hosted web archive I use for bookmarking).

    Right now I'm adding a ZIM list feature to the updater to list available ZIMs by language, and then I'll move on to OSM.

    clif@lemmy.world 1 points 4 hours ago

    You'll definitely beat me to it : D

    Do me a favor and tag me when you post your how-to?

    techwithjake@sh.itjust.works 1 points 15 hours ago (1 children)

    Saw your comment on mine and finally saw this one.

    I'm gonna take a look at openstreetmap-tile-server and see about running that, since if all has gone to shit, who knows if GPS will work. At least it's almost like a paper map and can be auto-updated as long as we still have internet. Quick Gist someone wrote here.

    clif@lemmy.world 1 points 4 hours ago

    Yeah, I feel the same in that it's assuredly doable, but how hard is it?

    If you're able to dig into it and make some progress, please tag me because I'm interested but don't have much time these days.

    techwithjake@sh.itjust.works 14 points 1 day ago (1 children)

    Just built one of those using DietPi as the OS and an NVMe M.2 drive for the storage. I have many different ZIMs and several services running, and I'm only using about 270GB.

    Works great for offline use. Probably should add an ISO or 2 as well.

    clif@lemmy.world 3 points 17 hours ago* (last edited 17 hours ago) (1 children)

    What other services are you running?

    @fmstrat@lemmy.world asked what else I was running in a sibling comment to yours and I didn't have an answer because I'm not... yet : )

    techwithjake@sh.itjust.works 6 points 15 hours ago

    DietPi makes it dead simple to run most of these things, as their "software suite" is pretty robust and simple to set up.

    For "user facing" applications:

    • Homer Dashboard as the landing page when going to the .local address in a browser
    • Kiwix for the ZIMs
    • Hedgedoc for personal note taking/wiki
    • Lychee photos for a very lightweight photo album maker/viewer for keepsake photos.

    For "admin side" stuff:

    • Portainer to manage the containers/stacks
    • Watchtower to auto-update the containers while they're still network connected
    • Transmission daemonized to download and seed the ZIMs or anything else non-pirate related
    • Use jojo2357's ZIM updater to auto-update ZIMs via cron job while they're still network connected
    • DietPi-Dashboard as an all-in-one dashboard to monitor and control the RPi from a web interface. (Yeah I know I can do everything SSH'ing in but I'm lazy.)
    • File Browser, just in case I want other people to have access to files. Since it's in maintenance mode and I'm not sure I want others to have access, I might strip it out

    I try to use containers from LinuxServer.io whenever possible. Mostly just cause it's what I do on my main server.

    I'm still looking at adding/removing things as I get more time to sit down, but I'm pretty happy with its current state.

    Fmstrat@lemmy.world 2 points 19 hours ago* (last edited 19 hours ago) (1 children)

    120GB not including Wikimedia 😉

    Also, I wish they included OSM maps, not just the wiki.

    bobo1900@startrek.website 1 points 13 hours ago

    You can easily download planet.osm. I think it's a couple of TB uncompressed, though the compressed downloads are much smaller, somewhere around 100GB.

    mistermodal@lemmy.ml 29 points 1 day ago

    Yeah also if you make a Zim wiki or convert a website into Zim then you can run that stuff too. If you use Emacs it's easy to convert some pages to wikitext for Zim too

    Gigasser@lemmy.world 1 points 1 day ago (1 children)

    I wonder if there's any way to edit these files afterwards? They tend to be read-only, right? I must confess, I don't have too much experience with this myself.

    Prathas@lemmy.zip 4 points 1 day ago (1 children)

    It's probably hundreds of thousands of HTML files, no? What is the fear about being able to edit or not?

    Gigasser@lemmy.world 1 points 4 hours ago (1 children)

    I believe Kiwix uses ZIM files.

    Prathas@lemmy.zip 1 points 3 hours ago

    Okay, I'm unfamiliar with both. Well, I still don't understand why read-only state matters; are you concerned about tampering?