Idea: use 2 drives to store series, with videos on the bigger drive and nfo and jpg files on the smaller one

tostane · September 25, 2018, 2:17pm

I was wondering: if you had a large video library and put the videos on a drive with large clusters and the nfo and jpg files on a small partition with small clusters, would the space gain be enough to be worth it?
I was thinking a script could automatically move the small files after Sonarr downloads them to the main drive.
I believe this would also make SnapRAID more efficient, since the files on each drive would be close to the same size.
I’m trying to do this on the ext4 file system.

Then use a Linux drive-combining tool to make the 2 drives look like one. They would need to maintain identical folders. I have searched around and cannot find much info on this.

drive 1 -> video, drive 2 -> nfo and jpg, software-generated drive 3 -> drive 1 + drive 2
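The auto-move script could look roughly like the sketch below. It is only an illustration of the idea, not something Sonarr provides: the function name, the extension list and the example mount points are all assumptions, and it simply mirrors the folder layout of the video drive onto the small-file drive so a pooling tool could overlay the two.

```python
import shutil
from pathlib import Path
from typing import List

# Assumption: these are the "small file" extensions worth relocating.
SIDECAR_EXTENSIONS = {".nfo", ".jpg"}


def move_sidecars(video_root: Path, sidecar_root: Path) -> List[Path]:
    """Move small metadata files to sidecar_root, mirroring the folder layout."""
    moved = []
    # sorted() materializes the listing first, so moving files does not disturb the walk.
    for src in sorted(video_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in SIDECAR_EXTENSIONS:
            continue
        dest = sidecar_root / src.relative_to(video_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


# Hypothetical usage (mount points are made up for illustration):
# move_sidecars(Path("/mnt/video"), Path("/mnt/metadata"))
```

Videos stay where they are; only files with the listed extensions are moved, and the identical folder tree is created on the second drive as needed.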

sienar · September 27, 2018, 9:04pm

I don’t think there would be enough savings to justify the complexity of splitting the metadata into a path separate from the video files. The jpgs are big enough not to worry about; most are over 32KB and many are well over 64KB. So that leaves the nfo files, 1 per series and 1 per episode. Assume you were using 128K clusters, all your nfo files were 1K, and you had a library with 200 series and 5000 episodes. Just the nfo files might then save about 600 MB out of roughly 7,226,881 MB (aka 6.9 TB) for that particular library. The jpgs would save a lot less, so my best guess for both is around 1 GB out of over 7,000 GB. If you think it’s worth your time to worry about that 1GB when 8 TB hard drives are around $150, go for it.
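The nfo estimate can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the post (128K clusters, 1K nfo files, 200 series, 5000 episodes); it treats the post's MB figure as MiB and counts all the slack in each 128K cluster as the upper bound on what could be saved.

```python
KIB = 1024
MIB = KIB * KIB


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes allocated beyond file_size when space is handed out in whole clusters."""
    clusters = -(-file_size // cluster_size)  # integer ceiling division
    return clusters * cluster_size - file_size


nfo_count = 200 + 5000       # one nfo per series plus one per episode
library_mib = 7_226_881      # library size quoted in the post

wasted_mib = nfo_count * slack_bytes(1 * KIB, 128 * KIB) / MIB
share = wasted_mib / library_mib * 100

print(f"{nfo_count} nfo files waste about {wasted_mib:.0f} MiB ({share:.4f}% of the library)")
# prints: 5200 nfo files waste about 645 MiB (0.0089% of the library)
```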

tostane · October 2, 2018, 3:52pm

It was just an idea. Some people may store metadata for many sources like Kodi, WDTV and pnp media servers. These all create a lot of little thumbs and xml files, and if you allow thumbs for artists, directors, etc., there can be nearly 50 little files per show. I just think there has to be a file system that handles small files well. I think NTFS will hide them if they’re very small, but I’m not sure how ext4 does it. I just tossed this out as an idea to see if anyone thought it was worth looking into.

sienar · October 2, 2018, 9:06pm

I just don’t think it’s an issue worth the extra effort in a world with 10+ TB drives. There’s just not that much slack space wasted in clusters in most file systems, so it doesn’t add up to much savings in the grand scheme of things.

My media is on ZFS with up to 1MB record sizes (basically clusters). But it’s dynamic and will create records as small as 4KB, I believe, so it’s definitely not worth bothering with there.

For NTFS, the default cluster size is 4KB for volumes up to 16TB, 8KB for volumes between 16 and 32TB, and so on.

For EXT4, the default block size is 4KB like NTFS, so I think the biggest worry is running out of inodes when a file system holds tons of files.
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout
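If inode exhaustion is the concern, it is easy to check how close a mounted file system is. The sketch below uses the standard `os.statvfs` call (Unix only); the helper name and the returned dictionary layout are my own, and some file systems report zero total inodes, which is handled as "unknown".

```python
import os


def inode_usage(path: str) -> dict:
    """Report inode counts for the file system that contains path."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the file system
    free = st.f_ffree    # inodes still available
    used = total - free
    # Some file systems report 0 total inodes; avoid dividing by zero.
    percent = used / total * 100 if total else None
    return {"total": total, "used": used, "free": free, "percent_used": percent}


usage = inode_usage(".")
print(usage)
```

This reports the same numbers as `df -i` for the mount that holds the given path.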

So, even if you multiplied my math by 50, but were using the sane defaults I mentioned, then you still would only save maybe 1 GB of disk space.
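Rather than estimating, the overhead can also be measured on an existing library. A rough sketch, assuming Linux (where `st_blocks` is counted in 512-byte units) and the same two extensions discussed in this thread; the function name is made up for illustration.

```python
from pathlib import Path


def measure_overhead(root, extensions=(".nfo", ".jpg")):
    """Return (logical_bytes, allocated_bytes) for matching files under root."""
    logical = 0
    allocated = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            st = path.stat()
            logical += st.st_size
            # On Linux st_blocks counts 512-byte units, whatever the block size is.
            allocated += st.st_blocks * 512
    return logical, allocated


# Hypothetical usage (the path is an assumption):
# logical, allocated = measure_overhead("/mnt/video")
# print(f"slack: {(allocated - logical) / 2**20:.1f} MiB")
```

The difference between the two totals is the space that smaller clusters could recover at most. Freshly written files may show fewer allocated blocks until the file system flushes them.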

tostane · October 11, 2018, 8:37am

I avoided using ZFS. I have been using mhddfs in a single-user environment, as it has poor security and it seems it is not being developed any more. It works well as long as you avoid moving files. I need to find a better file system now that I have nearly 25 drives attached. mhddfs will just stop working every few days and I have to restart it. I know mhddfs can auto move folders if they overlap a drive, and I thought that if someone made a small change to how it works, it could allocate files by the cluster size of the drives. You would then create several drives with varying cluster sizes and let it mirror the directories and auto move files to wherever they waste the least space.
I did write to one developer of these file systems, and he also did not think it would be effective at saving space.
A file system that could auto compress the text files and leave the thumbs and videos uncompressed could be perfect for this, if it exists.

everon · October 13, 2018, 7:42pm

It does, it’s called ZFS. If you enable compression, it will compress any files that are compressible, and leave the rest in their original form.
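To see why compression helps the text files but not the videos, here is a small illustration using Python's `zlib`. It is not how ZFS does it internally (ZFS works per block with its own algorithms); the sample XML and the use of random bytes as a stand-in for already-compressed video are assumptions made for the demo.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more savings)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive XML, similar in spirit to an nfo file.
nfo_like = b"<episodedetails><title>Example</title><plot>Example plot.</plot></episodedetails>\n" * 200
# Random bytes stand in for video, which is already compressed by its codec.
video_like = os.urandom(len(nfo_like))

print(f"text-like ratio:  {compression_ratio(nfo_like):.2f}")
print(f"video-like ratio: {compression_ratio(video_like):.2f}")
```

The text-like data shrinks to a small fraction of its size, while the video-like data does not shrink at all, which is the case where a file system is better off storing the original form.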

tostane · October 17, 2018, 10:39am

ZFS looks nice for someone just getting started. Since I have like 18 drives using mhddfs, it would be hard to switch. I looked into Btrfs; it is a copy-on-write file system that ext4 drives can easily be converted to, and it offers compression and may have raid in the future. I tried to convert a 5 TB drive to it and it took like half a day. Then I changed it back to ext4. I need to know more about how it works before trying to convert all drives to it.
I have decided to get rid of mhddfs and change to mergerfs, as mergerfs is still being developed and will work on my system. I will need to set up a small test system and learn how Btrfs works. Btrfs seems like it would want to join all the drives into a RAID 0, which is what I want to avoid. I’d rather keep each file on 1 drive than get the extra speed from RAID 0.
With videos, if you lose a drive in a RAID and the backups fail, it pretty much destroys the data on all the drives. At least on a non-RAID system I still have the remaining drives. I’m hoping I can get Btrfs compression to work on individual drives and then just use mergerfs to join them, if there is no option in Btrfs to join the drives and keep them separated.
It’s all for my personal library. Sadly, if I go Btrfs it would take a week or two to convert every drive.
I tested out FreeNAS, which has ZFS, and if I were starting all over that looks like a good way to go.

sienar · October 22, 2018, 7:09pm

Everon is definitely right: ZFS is great at compression for compressible files and uses dynamic record sizes (the ZFS equivalent of a cluster), so small files don’t waste space. You can also set record sizes up to 1MB, so large files are stored much more efficiently as well.

As for migrating TO ZFS, depending on how sensitive you are to redundancy eating capacity and also to performance, ZFS can be pretty easy to migrate into piecemeal using mirroring. If you’re going to do any level of RAIDZ (parity modes like RAID5/6), you really don’t want to have to expand that much, as it’s best done in multiples of the same number of disks. But if you start with 2 drives in a mirror, and then simply add disks 2 at a time as you migrate data into it from your old system, it’ll be fine. It’s not optimal doing it that way, as you end up with data unevenly distributed between vdevs, but it’ll work and you’ll gain all but the performance benefits of ZFS in the process. It won’t be too slow; you’ll still get the performance of a 2 drive mirror, which is more than enough to max out most home network connections.

If the data is backup worthy, then ideally you could restore from that backup to a fresh, full-size ZFS array. Or, if you did the piecemeal import and you want to rectify that for optimal performance, build a large enough ZFS array on your backup system, do a ZFS send to the backup system, then rebuild/empty the primary server’s ZFS pool and ZFS send the backup copy back to the primary. This could also be done on a single server.

The point is, you get a ton of flexibility with ZFS which makes it pretty fantastic in my view, for most any kind of file serving duty.

tostane · February 5, 2019, 9:24am

Well, I looked at ZFS and I just really like to maintain directories on each drive. So I went with Btrfs, enabled compression, and then used mergerfs to join the drives. This gave me a fast system and freed up some space. Using the deduplication features I found a lot of small files and freed up more space.
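For anyone curious how many duplicate small files a library holds before trying file system deduplication, a content-hash scan gives a quick answer. This is a generic sketch, not the Btrfs deduplication feature itself; the function names and the 1 MiB size cutoff are assumptions, and it only reports duplicates without changing anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, max_size=1024 * 1024):
    """Group files under root (up to max_size bytes) that have identical content."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size <= max_size:
                groups[(size, file_digest(path))].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Hypothetical usage (the path is an assumption):
# for group in find_duplicates("/mnt/pool"):
#     print([str(p) for p in group])
```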

I tried to put Btrfs on some large drives and had problems, but it worked OK on 8 TB and smaller. Larger drives take too long to read the data on boot and fail in fstab.

I really wish Btrfs had an option to maintain directories on the drives individually and still combine them into one larger drive.

I did this on Debian, and Debian is not really current on Btrfs, but I got it to work for what I needed.

ZFS is just a bit more expensive to set up and maintain. I custom build my drive boxes from ATX towers; they hold 10 drives each, connected over eSATA, and I run 4-port eSATA cards in my PC, so I can connect 40 drives if I need to.