I’m in the process of setting up backups for my home server, and I feel like I’m swimming upstream. It makes me think I’m just taking the wrong approach.

I’m on a shoestring budget at the moment, so I won’t really be able to implement a 3-2-1 strategy just yet. I figure the most bang for my buck right now is to set up off-site backups to a cloud provider. I first decided to do a full-system backup in the hopes I could just restore it and immediately be up and running again. I’ve seen a lot of comments saying this is the wrong approach, although I haven’t seen anyone outline exactly why.

I then decided to cherry-pick my backup locations instead. Then I started reading about backing up databases, and it seems you can’t just back up the data directory (or file in the case of SQLite) and call it good. You need to dump them first and back up the dumps.

So, now I’m configuring a docker-db-backup container to back each one of them up, finding database containers and SQLite databases and configuring a backup job for each one. Then, I hope to drop all of those dumps into a single location and back that up to the cloud. This means that, if I need to rebuild, I’ll have to restore the containers’ volumes, restore the backups, bring up new containers, and then restore each container’s backup into the new database. It’s pretty far from my initial hope of being able to restore all the files and start using the newly restored system.

Am I going down the wrong path here, or is this just the best way to do it?

  • Nibodhika@lemmy.world
    3 days ago

    I figure the most bang for my buck right now is to set up off-site backups to a cloud provider.

    Check out Borgbase. It’s very cheap and it’s an actual backup solution, so it offers features you won’t get from Google Drive or whatever you were considering using, e.g. deduplication, recovering data from different points in time, and encryption, so there’s no way for them to access your data.

    I first decided to do a full-system backup in the hopes I could just restore it and immediately be up and running again. I’ve seen a lot of comments saying this is the wrong approach, although I haven’t seen anyone outline exactly why.

    The vast majority of your system is the same as it would be if you install fresh, so you’re wasting backup space in storing data you can easily recover in other ways. You only need to store the changes you made to the system, e.g. which packages are installed (just save the list of packages and then run an install on them; no need to back up the binaries) and which config changes you made. Plus, if you’re using Docker for services (which you really should), the services are very easy to recover too. If you back up the compose file and config folders for those services (and obviously the data itself), you can be back up and running in almost no time. Also, even with a full-system backup you would need to chroot into the restored system to install a bootloader, so it’s not as straightforward as you think (unless your backup is a dd of the disk, which is a bad idea for many other reasons).
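
    As a rough illustration of backing up only the compose file and config folders, here is a minimal Python sketch that packs a list of paths into one tarball using the standard `tarfile` module. The function name and the example paths are hypothetical, not taken from the setup described in the post.

```python
import tarfile
from pathlib import Path


def archive_service_config(paths, archive_path):
    """Pack the given files and folders into one gzip-compressed tarball.

    Paths that do not exist are skipped. Returns the paths that were archived.
    Assumption: the final path components are unique, because each entry is
    stored under its own name rather than its full path.
    """
    archived = []
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for raw in paths:
            path = Path(raw)
            if not path.exists():
                continue
            # Directories are added recursively by tarfile.
            tar.add(str(path), arcname=path.name)
            archived.append(path)
    return archived
```

    A call might look like `archive_service_config(["/srv/app/docker-compose.yml", "/srv/app/config"], "/backups/app-config.tar.gz")`, where every path is a placeholder. The resulting archive is the small, fast-to-restore part; the data itself is backed up separately.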

    I then decided to cherry-pick my backup locations instead. Then I started reading about backing up databases, and it seems you can’t just back up the data directory (or file in the case of SQLite) and call it good. You need to dump them first and back up the dumps.

    Yes and no. You can back up the files directly, but it’s not good practice. The reason is that if the copied file is corrupted you lose all the data, whereas a dump of the database contents is much less likely to be corrupted. In practice, there’s no reason why backing up the files themselves shouldn’t work, as long as the database isn’t being written to while they are copied (in fact, when you launch a Docker container it’s always an entirely new database process pointed at the same data folder).
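
    For SQLite specifically, there is a middle ground between copying the raw file and writing a text dump: the online backup API, which Python exposes as `sqlite3.Connection.backup` (Python 3.7+). It produces a consistent copy even if the source database is in use. A minimal sketch, with a hypothetical function name:

```python
import sqlite3


def snapshot_sqlite(src_path, dest_path):
    """Write a consistent copy of a SQLite database to dest_path.

    Uses SQLite's online backup API instead of a plain file copy, so the
    result does not depend on catching the file between writes.
    """
    src = sqlite3.connect(str(src_path))
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

    The copy written to `dest_path` is itself a normal SQLite file, so restoring is just a matter of putting it back where the application expects it.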

    So, now I’m configuring a docker-db-backup container to back each one of them up, finding database containers and SQLite databases and configuring a backup job for each one. Then, I hope to drop all of those dumps into a single location and back that up to the cloud. This means that, if I need to rebuild, I’ll have to restore the containers’ volumes, restore the backups, bring up new containers, and then restore each container’s backup into the new database. It’s pretty far from my initial hope of being able to restore all the files and start using the newly restored system.

    Am I going down the wrong path here, or is this just the best way to do it?

    That seems like the safest approach. If you’re concerned about it being too much work I recommend you write a script to automate the process, or even better an Ansible playbook.
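
    To give an idea of what such a script could look like for the SQLite part, here is a minimal Python sketch that walks a directory tree, detects SQLite files by their 16-byte header, and writes a SQL dump of each into a single folder. The function names and the flattened file naming scheme are assumptions made for this example; server databases such as PostgreSQL or MariaDB would still need their own dump tools.

```python
import sqlite3
from pathlib import Path

# Every non-empty SQLite 3 database file starts with these 16 bytes.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header."""
    try:
        with open(path, "rb") as fh:
            return fh.read(16) == SQLITE_MAGIC
    except OSError:
        return False


def dump_all_sqlite(search_root, dump_dir):
    """Dump every SQLite database under search_root into dump_dir.

    Returns the list of .sql files written. Dump names are built from the
    path relative to search_root, joined with double underscores.
    """
    search_root = Path(search_root)
    dump_dir = Path(dump_dir)
    dump_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(search_root.rglob("*")):
        if not path.is_file() or not is_sqlite_file(path):
            continue
        rel = path.relative_to(search_root)
        target = dump_dir / ("__".join(rel.parts) + ".sql")
        conn = sqlite3.connect(str(path))
        try:
            with open(target, "w", encoding="utf-8") as out:
                # iterdump() yields the SQL statements that recreate the database.
                for statement in conn.iterdump():
                    out.write(statement + "\n")
        finally:
            conn.close()
        written.append(target)
    return written
```

    Restoring a dump is the reverse: feed the `.sql` file to a fresh database, for example with `Connection.executescript`. The folder of dumps is then the single location to send to the cloud.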

    • RadDevon@lemmy.zipOP
      2 days ago

      Check out Borgbase. It’s very cheap and it’s an actual backup solution, so it offers features you won’t get from Google Drive or whatever you were considering using, e.g. deduplication, recovering data from different points in time, and encryption, so there’s no way for them to access your data.

      I looked at Borgbase, but I think it will be a bit more pricey than Restic + Backblaze B2. Looks like Borgbase is $80/year for 1TB, which would be $72/year on B2 and less if I don’t use all of 1TB.

      The vast majority of your system is the same as it would be if you install fresh, so you’re wasting backup space in storing data you can easily recover in other ways.

      I get this, but it would be faster to restore, right? And the storage these files would take up is small compared to the overall volume of data I’m backing up. For example, I’m backing up 100GB of personal photos and home movies. Backing up the system, even though it’s not strictly necessary, would be something like 5% of this, I think, and I’d lean toward paying another few cents every month for a faster restore.
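
      The "few cents every month" estimate can be checked with a quick calculation. This sketch assumes storage is billed proportionally at the $72/year per TB figure quoted above, and that 1 TB = 1000 GB; the provider's actual billing terms may differ.

```python
GB_PER_TB = 1000  # assumption: decimal units


def monthly_cost(size_gb, price_per_tb_year):
    """Monthly storage cost in dollars for size_gb at a yearly per-TB price."""
    return size_gb / GB_PER_TB * price_per_tb_year / 12


photos_gb = 100
system_gb = photos_gb * 0.05  # the "something like 5%" guess from the post

photos_monthly = monthly_cost(photos_gb, 72)  # about $0.60 per month
system_monthly = monthly_cost(system_gb, 72)  # about $0.03 per month
```

      Under those assumptions, the extra system data would add roughly three cents a month on top of about sixty cents for the photos.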

      Thanks for your thoughts on the database backups. It’s a helpful perspective!

      • Nibodhika@lemmy.world
        2 days ago

        If all you care about is money, it’s even cheaper on Hetzner at 48/year. But I recommended Borgbase because it’s a bit better known and more trustworthy. $8 a year is a very small difference. Sure, the gap will be bigger than that because, like you said, you won’t use the full TB on B2, but I still don’t think it will be that large. However, there are some advantages to using a Borg-based solution:

        • Borg can back up to multiple places at once, so the same setup can back up to the cloud and to a secondary disk
        • Borg is an open source tool, so you can run your own Borg server, which means you can have backups sent to your desktop
        • Again, because Borg is open, you can run a Raspberry Pi with a 1TB USB disk for backups, and that would be cheaper than any other solution
        • Or you could even pair up with a friend, hosting their backups on your server while they do the same for you.

        And the most important part: migrating from one to the other is simple, just a config change. So you can start with Borgbase, and in a year buy a minicomputer to leave at your parents’ house and make all of the config changes needed in seconds, whereas migrating away from B2 will involve a secondary tool. Personally, I think this flexibility is worth way more than those $8/year.

        Also, Borg has deduplication, versioning and encryption. I think B2 has all of that, but I’m not entirely sure: it’s my understanding that they duplicate the entire file when something in it changes, so you might end up paying a lot more for it.


        As for the full-system backup, I still think it’s not worth it. How do you plan on restoring it? You would probably have to boot a live USB and perform the steps from there, which would involve formatting your disks properly, connecting to the remote server to get your data, chrooting into it, and installing a bootloader. It just seems easier to install the OS and run a script, even if the other way could shave off 5 minutes when everything works correctly and you are very fast.

        Also, your system is constantly changing files, which means more opportunities for files to get corrupted (a similar reason why backing up the folder of a database is a worse idea than backing up a dump of it), and some files are infinite, e.g. /dev/zero or /dev/urandom, so you would need to be VERY careful about what to back up.

        At the end of the day I don’t think it’s worth it. How long do you think it takes you to install Linux on a machine? I would guess around 20 minutes. Restoring your 1TB backup will certainly take much longer than that (probably a couple of hours), and once the system is up you can get the critical stuff that doesn’t require the full backup running early. That’s another reason why Borg is a good idea: you can have a small backup of critical stuff to restore in seconds, and another repository for the stuff that takes longer. So Immich might take a while to come back, but authentik and Caddy can be up in seconds. Again, I’m sure B2 can also do this, but probably not as intuitively.
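
        To put a rough number on the restore time, here is a back-of-the-envelope sketch. The link speeds are assumptions chosen for illustration, and the calculation ignores protocol overhead, provider throttling, and decompression time, so real restores will be slower.

```python
def transfer_hours(size_gb, link_mbit_per_s):
    """Hours needed to download size_gb over a link of the given speed.

    Assumes decimal units (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits) and a
    link that stays fully saturated for the whole transfer.
    """
    size_bits = size_gb * 1e9 * 8
    seconds = size_bits / (link_mbit_per_s * 1e6)
    return seconds / 3600


full_restore_gigabit = transfer_hours(1000, 1000)  # 1 TB over 1 Gbit/s
full_restore_100mbit = transfer_hours(1000, 100)   # 1 TB over 100 Mbit/s
```

        On a saturated gigabit link the full 1TB comes to a bit over two hours, and on a 100 Mbit/s link to the better part of a day, which is why keeping a small, separate repository for the critical services makes such a difference.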