I have been lurking in this community for a while now and have really enjoyed the informational and instructional posts, but a topic I don’t see come up very often is scaling and hoarding. Currently, I have a 20TB server which I am rapidly filling, and most posts about expanding recommend simply buying larger drives and slotting them into a single machine. That is definitely the easiest way to expand, but it seems like it would only get you to about 100TB before you can’t reasonably do that anymore. So how do you set up 100TB+ networks with multiple servers?

My main concern is that currently all my services are dockerized on a single machine running Ubuntu, which works extremely well. It is space efficient with hardlinking and I can still seed everything back. From different posts I’ve read, it seems like as people scale they either give up on hardlinks and eat up a lot of their storage with copies, or they eventually delete their seeds and just keep the content. Do the Arr suite and qBittorrent allow dynamically selecting servers based on available space? Or are there other ways to solve these issues with additional tools? How do you guys set up large systems, and what recommendations would you make? Any advice is appreciated, from hardware to software!

Also, huge shout out to Saik0 from this thread: https://lemmy.dbzer0.com/post/24219297 I learned a ton from his post, but it seemed like the tip of the iceberg!

  • dipper_retreat@lemmy.dbzer0.com (OP)

    Thanks for this fantastic write-up, and your other response! I definitely learned a lot just looking up all the terms. Just a couple of questions if you have time.

    For your 16-bay boxes, are you running old OptiPlex or PowerEdge hardware, or something else? I ask because these seem to be available in large supplies from surplus sites, and I’m curious if one is strictly better or easier to work with. Also, I’ve read that you should loosely match TB of storage to GB of RAM. The PowerEdge hardware has tons of DIMM slots but old PCs don’t, so I’m curious if you’ve had to deal with that, since ZFS seems so well optimized.

    For the split categories, i.e. the 2x for TV you mentioned, do you need to run two instances of Sonarr? Or do you just manually change the path when a single box gets full? Otherwise, how do you keep the two instances in sync?

    Lastly, I’ve done quite a bit of reading on OMV and Proxmox, but I don’t actually use them yet. Do you recommend Proxmox with an OMV VM, or just OMV bare metal?

    Thanks for taking the time!

    • tenchiken@lemmy.dbzer0.com (mod)

      For my larger boxes, I only use Supermicro. Most other vendors do weird shit to their backplanes that makes them incompatible, or charge licenses for their IPMI/DRAC/lights-out management. Any reputable reseller of server gear will offer Supermicro.

      The disk-to-RAM ratio is niche, and I’ve almost never run into it outside of large data warehouse or database systems (not what we’re doing here). Most of my machines run nearly idle even while serving several active streams or pushing 3GB/sec data moves, on only 16GB of RAM. I use the CPU being maxed out as a good warning that one of my disks needs checking, since resilvering or a degraded pool in ZFS chews CPU.
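
      A minimal sketch of how that “is a pool resilvering or degraded” check could be automated, relying only on `zpool status -x` (the alerting hook is a placeholder; wire in whatever notification you already use):

      ```python
      #!/usr/bin/env python3
      """Tiny health probe: flag any ZFS pool that isn't reporting healthy."""
      import subprocess
      import sys

      def pools_unhealthy() -> str | None:
          # `zpool status -x` prints "all pools are healthy" when nothing is
          # wrong; otherwise it prints details for degraded/resilvering pools.
          out = subprocess.run(
              ["zpool", "status", "-x"],
              capture_output=True, text=True, check=True,
          ).stdout.strip()
          return None if "all pools are healthy" in out else out

      if __name__ == "__main__":
          problem = pools_unhealthy()
          if problem:
              # Hook your own alerting here (mail, ntfy, etc.); print is a stand-in.
              print(problem, file=sys.stderr)
              sys.exit(1)
          print("pools healthy")
      ```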

      That said, hypervisors eat RAM. Whatever machine you want handling torrents, transcoding, etc., give that box RAM and either a well-supported GPU or a recent Intel Quick Sync chip.

      For organizing across the arrays, I use RAIDed SSDs for the downloads, with the torrent client moving completed torrents to the destination host for seeding.
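
      Assuming qBittorrent as the client, that staging setup can be wired up through its Web API; a rough sketch (host, credentials, and the scratch path are made-up examples):

      ```python
      import json
      import requests

      QBIT = "http://qbit.local:8080"   # hypothetical host/port
      session = requests.Session()

      # Log in (qBittorrent Web API v2 uses a cookie-based session).
      session.post(f"{QBIT}/api/v2/auth/login",
                   data={"username": "admin", "password": "adminadmin"})

      # Incomplete downloads land on the SSD scratch path; finished torrents
      # get moved to their category's save path (a mount on the destination NAS).
      prefs = {
          "temp_path_enabled": True,
          "temp_path": "/mnt/ssd-scratch/incomplete",   # hypothetical path
      }
      session.post(f"{QBIT}/api/v2/app/setPreferences",
                   data={"json": json.dumps(prefs)})
      ```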

      Single instances of Radarr and Sonarr; instead, I update the root folder for “new” content any time I need to point to a new machine. I just have to keep the current new-media destination in sync between the Arr and the torrent client for that category.
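
      A hedged sketch of what keeping that destination in sync could look like scripted against the Radarr v3 API and the qBittorrent Web API (the URLs, API key, category name, and new root path are all hypothetical):

      ```python
      import requests

      # Tell both the Arr and the torrent client about the same "new media" destination.
      RADARR = "http://radarr.local:7878"
      RADARR_KEY = "xxxxxxxx"                 # hypothetical API key
      QBIT = "http://qbit.local:8080"
      NEW_ROOT = "/mnt/nas05/movies"          # the freshly added box/pool

      # 1) Register the new root folder in Radarr so new grabs land there.
      requests.post(
          f"{RADARR}/api/v3/rootfolder",
          headers={"X-Api-Key": RADARR_KEY},
          json={"path": NEW_ROOT},
      ).raise_for_status()

      # 2) Point the matching qBittorrent category at the same destination.
      s = requests.Session()
      s.post(f"{QBIT}/api/v2/auth/login",
             data={"username": "admin", "password": "adminadmin"})
      s.post(f"{QBIT}/api/v2/torrents/editCategory",
             data={"category": "movies", "savePath": NEW_ROOT}).raise_for_status()
      ```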

      The Arr stacks have gotten really good lately with path management; you just need to ensure the mounts available to them are set up correctly.

      In the event I need to move content between two different boxes, I pause the seed and use rsync to duplicate the torrent’s files, then change the path and recheck the torrent. Once that’s good I either nuke and re-import in the Arr, or, lately, I’ve been using a better naming convention on the hosts so I can preserve hardlinks. Beware, this is a pretty complex route unless you are very comfortable with Linux and rsync!
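
      For the rsync step specifically, a small sketch of the copy itself (paths and hostname are made up; note `-H` only preserves hardlinks between files that are part of the same transfer):

      ```python
      import subprocess

      # Hypothetical source/destination; trailing slashes matter to rsync
      # (copy the *contents* of the directory, not the directory itself).
      SRC = "/mnt/nas01/tv/Some.Show/"
      DST = "user@nas02:/mnt/tank/tv/Some.Show/"

      # -a  archive mode (recursion, permissions, timestamps)
      # -H  preserve hard links within the transferred file set
      # --partial keeps half-copied files so an interrupted run can resume
      subprocess.run(
          ["rsync", "-aH", "--partial", "--info=progress2", SRC, DST],
          check=True,
      )
      # After this: change the torrent's save path to the new location in the
      # client, force a recheck, and resume seeding once it verifies at 100%.
      ```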

      I’m using OMV on bare metal personally. My Proxmox box doesn’t even have OMV; it’s a mini PC for transcoding. I see no problem running OMV inside Proxmox, though. My bare-metal boxes are dedicated to NAS duties only.

      For what it’s worth, keep tasks as minimal and simple as you can. Complexity where it’s not needed can be pain later. My NAS machines are largely identical in base config, with only the machine name and storage pool name different.

      If you don’t need a full hypervisor, I’d skip it. Docker has gotten great in its abilities. The easiest Docker box I have is just Ubuntu with Dockge. It keeps its configs in a reliable path, so it’s easy to back up your configs, etc.
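
      As a rough example of the “easy to back up” part, a small sketch that tars up the compose/config directories (the /opt/stacks and /opt/dockge defaults and the backup target are assumptions; adjust to wherever your compose files and app configs actually live):

      ```python
      import tarfile
      import time
      from pathlib import Path

      # Assumed defaults: Dockge keeps compose stacks under /opt/stacks and its
      # own data under /opt/dockge; the backup destination is hypothetical.
      SOURCES = [Path("/opt/stacks"), Path("/opt/dockge")]
      DEST_DIR = Path("/mnt/backups/docker-configs")

      DEST_DIR.mkdir(parents=True, exist_ok=True)
      archive = DEST_DIR / f"docker-configs-{time.strftime('%Y%m%d')}.tar.gz"

      with tarfile.open(archive, "w:gz") as tar:
          for src in SOURCES:
              if src.exists():
                  tar.add(src, arcname=src.name)   # keep a flat, readable layout

      print(f"wrote {archive}")
      ```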