Building a 1pb rig
-
@manfromafar I've built a ton of ZFS servers; one of my primary miners is a 16-drive ZFS chassis. It doesn't have any SMR drives though, so you're saying a ZFS server removes the write penalty for random writes on SMR?
-
@manfromafar Very nice! I need that ASAP.
-
@haitch said in Building a 1pb rig:
@manfromafar I've built a ton of ZFS servers; one of my primary miners is a 16-drive ZFS chassis. It doesn't have any SMR drives though, so you're saying a ZFS server removes the write penalty for random writes on SMR?
@haitch Yes, it alleviates some of the write penalties, since ZFS can cache the writes in RAM and flush them to the disks in sequential order, and it's striping the data across multiple disks into new locations, so the SMR drives aren't shuffling data on the fly nearly as much as they would if you wrote all the data sequentially to a single drive. I was able to plot about 8 TB in a day over GigE to an SMR raidz2 array, then copy to another standalone SMR drive for actual mining.
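In case anyone wants to copy that layout, here is a rough sketch of what such a pool could look like. The pool name "tank", the device names and the property values are placeholders/assumptions on my part, not a copy of my exact setup:
# build a RAIDZ2 vdev out of the SMR drives (sdb..sdg are placeholders)
zpool create tank raidz2 sdb sdc sdd sde sdf sdg
# plot files are huge and written in big sweeps, so a large recordsize is a reasonable starting point
zfs set recordsize=1M tank
# plot data is essentially random nonces, compression buys nothing
zfs set compression=off tank
# no need to update atime on every scoop read
zfs set atime=off tank
Whether recordsize=1M helps the small scoop reads during mining is debatable, so treat those tunables as a starting point rather than gospel.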
-
@manfromafar No issues with the COW performance when you got up to about 90% capacity? I've been told that getting over that threshold can be worse than native SMR performance.
-
@haitch Only if you want to actively use the storage for reads and writes. But since you're going to be mining Burst, you can plot away and only ever read from the pool. The reason ZFS gets painfully slow over 80% and useless at 90% is that ZFS wants to write in long contiguous blocks, so finding those spots on the disk gets harder and harder, and since the files get more and more fragmented over time when rewriting, performance drops as you're jumping all over the disk.
Oh, speaking of 80%, I have to move and sort some things on box 2, it's over 80%. Good thing it just houses read material ^>^
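If you want to keep an eye on that 80% line, something like this shows allocation and fragmentation at a glance; the pool/dataset names are placeholders and the 40T quota is just an arbitrary example:
# capacity % and fragmentation per pool
zpool list -o name,size,allocated,free,capacity,fragmentation
# optional: quota the plot dataset so the pool can't accidentally fill past ~80%
zfs set quota=40T tank/burst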
-
@manfromafar My ZFS miner was built in the early days of Burst, so it's either wplotgenerator or GPU generator buffer-created, so sequential files. Hadn't considered how the RAM/write-log cache might affect plotting performance. Might have to reconsider, and replot, Pennywise's design.
-
@haitch If you've already plotted I wouldn't worry about switching, since ZFS only helps remove the write penalties. It does nothing for the read speeds except that you can read from all disks at once, since you'll blow through the ARC in RAM and never actually use that buffering.
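If you want the plot reads to stop churning the cache entirely, one knob you could turn (just a sketch, not something I'm claiming either of us runs) is to keep only metadata in ARC for the plot dataset, since the plot blocks get read once per round and evicted anyway; "tank/burst" is a placeholder name:
# ARC keeps only metadata for this dataset, plot blocks bypass the data cache
zfs set primarycache=metadata tank/burst
zfs get primarycache tank/burst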
-
#derailingthreads one post at a time. ^>^
-
@manfromafar yeah, the ARC is useless unless the chain decides to use the same scoop twice in a row, which is unlikely, but it was the write caching I hadn't considered - but given the difference in plotting vs. writing speed, I'm not sure how much it'd help. I plot with 36 Xeon threads, so with non-ZFS plotting it gets to 100% plotted with less than 10% written.
-
@haitch said in Building a 1pb rig:
@manfromafar yeah, the ARC is useless
Ahem, no.
On one box I have set
# zfs set secondarycache=metadata /plots
and it has ~120 GiB of metadata in the L2ARC. That is, "where is block x in file y", so after a while no metadata lookup has to hit the spinning disks during mining.
You may also keep that in ARC. But does it have any measurable impact on mining speed? I don't know. It is the only zpool with small files, plotted in 2015, soon to be retired to free slots for 8 TB SMRs. But I just peeked at the diff, which is around 60 PB right now. In 6 months I will know whether that was a wise investment. 8)
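If anyone wants to check how much actually ends up in their L2ARC with that setting, the counters are exposed in the kstats. The path below is the ZFS on Linux one; on FreeBSD the same counters should be under sysctl kstat.zfs.misc.arcstats:
# size of the L2ARC contents plus hit/miss counters
grep -E '^l2_(size|hits|misses)' /proc/spl/kstat/zfs/arcstats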
-
@vaxman said in Building a 1pb rig:
@haitch said in Building a 1pb rig:
@manfromafar yeah, the ARC is useless
Ahem, no.
On one box I have set
# zfs set secondarycache=metadata /plots
and it has ~120 GiB of metadata in the L2ARC. That is, "where is block x in file y", so after a while no metadata lookup has to hit the spinning disks during mining.
You may also keep that in ARC. But does it have any impact on mining speed? I don't know.
What you've done here ensures that ALL non-metadata data is pulled straight from disk. You've just removed the ARC completely from your system for that dataset. Also, unless you have more than 64 GB of RAM in your system, having a large L2ARC is useless, since it would be better to spend the ARC space taken up by L2ARC headers on actual caching of data.
-
@manfromafar I don't understand what you are saying here. Yes, there is an ARC. And I configured the L2ARC for that pool/dataset to only hold metadata.
primarycache is still "all".
Setting primarycache=none would have the effect you describe, and L2 would never be filled (as L2 is fed from ARC).
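To make the two knobs concrete, this is what the setup above looks like when set explicitly (valid values for both properties are all, metadata and none):
# check both caching properties for the dataset
zfs get primarycache,secondarycache /plots
# the setup described above: full ARC, metadata-only L2ARC
zfs set primarycache=all /plots
zfs set secondarycache=metadata /plots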
I just checked the weekly diff average for the last month [PiB]:
4: 25.0788
3: 27.5536
2: 31.4638
1: 34.3933
we're still good on ROI.
-
@vaxman Damn - now I need to think about replotting those drives as a ZFS pool again .......
-
Ah yeah, forgot which one you were configuring. But the result is still the same: you blow the MRU cache in ARC away every 4 minutes when you reread 30 TB of data. But it should still be fine.
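If you want to watch that happen, the ARC counters make it pretty obvious over a mining round (the arcstat helper ships with most ZFS on Linux installs; the raw counters are in the kstats either way):
# sample ARC size and hit/miss rates every 5 seconds while a round runs
arcstat 5
# raw MRU/MFU hit counters
grep -E '^(mru|mfu)_hits' /proc/spl/kstat/zfs/arcstats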
@haitch why replot if you already have everything set up?
-
@manfromafar Because I'm not completely happy with the way it's currently set up - so I was considering a replot anyway; this discussion just made me think about going back to ZFS pools again - which it originally was.
-
@haitch What (besides the hard 4% reservation) made you leave ZFS?
When I first started I had a pile of rusty Seagate ST2000VX drives, and having them on a raidz was so much better than hiding them behind a RAID controller. They leaked bits, but I was able to mine them without errors. I'm so content that I didn't even look at other filesystems in the last 3 years. Having mechanisms for cache control at this level is very helpful. I wouldn't replot, though. Or do you have all disks tied up into a single large object?
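That self-healing is exactly why I'd stick with it: a periodic scrub plus a look at the error counters shows how many bits a flaky drive is leaking ("tank" is a placeholder pool name):
# read every block, verify checksums, repair from parity where possible
zpool scrub tank
# per-device read/write/checksum error counters, plus any files ZFS couldn't repair
zpool status -v tank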
-
@vaxman It was originally created as a ZFS testbed. Each drive was an individual RAID 0 array, 16 drives, with the whole thing connected to a VMware server via different methods (1Gb Eth, 10Gb Eth, InfiniBand, FC), with a VM that plotted/mined it. After my testing was done I wasn't convinced I was getting the best possible performance from it, so I converted it to a Windows box with DAS. No real change in performance, with a loss of flexibility. So I'm considering flipping back.
-
@haitch Pennywise plotting its way to 153 TB, Pennywise 2 hardware ordered, negotiating for 192 TB of storage for it .... both have the capability for external expansion in 320 TB chunks. :) Be afraid, my monsters are coming
Update: Negotiations for 192 TB of storage apparently successful....... for less than $22/TB and free shipping .....
-
Good luck!
