GPU plot generator v4.1.1 (Win/Linux)
-
@vaxman
Yes, there are 12 targets in parallel; they are PMR drives, not SMR.
I used 96 GB of RAM (8 GB per drive, 2 MB per write). This might be a problem, but 2 MB per seek should be enough to get 800 MB/s / 12 drives ≈ 66 MB/s on a PMR drive...
If I understand the code correctly and assume equal GPUs and HDDs, the generated data should be evenly distributed between the drives (CommandGenerate.cpp, line 320). I guess there might be a problem, because I sometimes see single drives idle for a longer time. This is when the GPUs stop calculating new plots. I think this could be caused by Windows and its thread handling, when the right thread does not run fast enough to get to line 404...
I'm already using ZFS, but only on my small FreeNAS system with 4*6 TB ;) I really like ZFS, but I'm not really sure there is any benefit for mining...
-
the generated data should be evenly distributed
I didn't read the code (wouldn't understand much of it, I guess) but are you sure that computing a nonce (4096*64 Bytes) is context-free ?
easily checked:
Make 12 directories on your SSD, plot 12 files of, say 10 GB, in parallel.
Then scale down to 8,6,4,2 parallel plots and observe the plot time.
If it scales linearly, you are good to go parallel as much as you like.
If it gets faster the less you parallelize, the algo is not "context-free" and needs more context than fits into local (GPU) memory.
Optionally adjust file size to always plot the same amount of data.
Finding the sweet spot..
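One way to read the results of that test is to compare aggregate throughput (total GB written divided by wall-clock time) across the runs. The sketch below is only an illustration: the function names, the 10% tolerance, and the timings are made up, not measurements from this thread.

```python
def aggregate_throughput(parallel_plots, gb_per_plot, seconds):
    """Total GB written per second across all plots in one run."""
    return parallel_plots * gb_per_plot / seconds


def scales_linearly(runs, tolerance=0.10):
    """runs: list of (parallel_plots, gb_per_plot, seconds) tuples.

    Returns True when every run's aggregate throughput is within
    `tolerance` (relative) of the best run, i.e. plotting more files
    in parallel does not slow the plotter down.
    """
    rates = [aggregate_throughput(*run) for run in runs]
    best = max(rates)
    return all(rate >= best * (1 - tolerance) for rate in rates)


# Placeholder timings, NOT measurements: replace with your own values.
example_runs = [(12, 10, 1200.0), (8, 10, 800.0), (4, 10, 400.0), (2, 10, 200.0)]
print(scales_linearly(example_runs))
```

If the function returns False because the small runs are faster per GB, that matches the "not context-free" case described above.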
-
@vaxman
I spent a few days trying your suggestions, but it didn't matter how many plots were written to the SSD; it was the same speed (about 200k nonces/min) all the time. It's the same as plotting to RAM, since it's a quite powerful Samsung 960 Pro with write speeds exceeding 2 GB/s.
When the next 24 HDDs arrived I tried switching to Linux (Ubuntu 17.04) and the ext4 filesystem, and it seems like I was right to blame the problems on Windows.
- ext4 does not need to create the plot file before starting to write to it. this saves about 12-15h at the start of the plot
- linux seems to handle those threads much better than windows. with the same settings and the same hardware my nonces/min stay at around 210-220k.
- it does not matter how many plot files are used (as long as there are more than 5, which are required to get enough write speed) the nonce/min stay at above 210k
After testing for some hours i completely wiped the new hdds and started plotting around 10h ago. Currently 20% are done and there are less than 40h remaining while plotting 24*8tb.
-> Switching from windows (120h for 12 * 8tb) to linux (50h for 24 * 8tb) is a very good choice if you want to plot in direct mode or to many drives at the same time :)
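To put those figures side by side, here is a small sketch that converts nonces/min into MB/s and compares the two plotting runs in TB per hour. It assumes the nonce size quoted earlier in the thread (4096*64 bytes) and decimal units (1 MB = 10^6 bytes); the function names are just for illustration.

```python
NONCE_BYTES = 4096 * 64  # one nonce = 262,144 bytes (256 KiB)


def nonces_per_min_to_mb_per_s(nonces_per_min):
    """Convert a nonces/min figure to decimal MB/s of plot data."""
    return nonces_per_min * NONCE_BYTES / 60 / 1e6


def tb_per_hour(drives, tb_per_drive, hours):
    """Average plotting rate over a whole run, in TB per hour."""
    return drives * tb_per_drive / hours


windows = tb_per_hour(12, 8, 120)  # 12 * 8 TB in 120 h
linux = tb_per_hour(24, 8, 50)     # 24 * 8 TB in 50 h

print(f"{nonces_per_min_to_mb_per_s(210_000):.0f} MB/s at 210k nonces/min")
print(f"Windows: {windows:.2f} TB/h, Linux: {linux:.2f} TB/h")
print(f"Speed-up: {linux / windows:.1f}x")
```

By this reckoning the Linux run moves roughly 4.8 times as much data per hour as the Windows run.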
-
I forgot to mention that i'm mining the first 24*8tb (NTFS formatted, some 8tb plot files, some 1tb plot files) at the same time. Reading speeds are around 4gb/s, roundtimes below 15s even while plotting
-
I updated the GPU plot generator (v4.1.1). It now comes with file pre-allocation when launched with admin rights, which greatly speeds things up in terms of IO operations.
I'm searching for Linux and MacOS owners to test it. I'll provide pre-built binaries soon for these OSes to ease things up.
-
thanks for these updates cryo! I had been using an older version which worked well via buffer method (plots 5TB in about 12-13 hours) but was never able to successfully get the direct mode to work, which is of course preferable. So, I just downloaded 4.1.1 and am eager to try direct mode again to see what happens.
One question though... I noticed that after downloading both 4.1.1 and 4.1.0 off your github page, the ZIP only contains the .exe file. Is this correct? So, the exe is the only file that changed and if so, then I should just replace the old exe file with the new one and leave all of the other files (dll, bat, txt) intact. Or, do some of these files also need to be replaced? if so, where do i find a download with everything? Not sure what version I am using as I can't see it anywhere.
Thanks for the help!
-
I changed the build system, so the only files required are the .exe and the [kernel] folder.
If it doesn't launch due to missing DLLs, install the Microsoft Visual C++ 2015 redistributables.
-
@cryo thanks for the info. I just started a small 500GB plot using 4.1.1 direct mode and it launched fine after I installed the C++ package. However, I wanted to sanity check the write speed as it seems very low. I am only getting about 8-12 MB/s in direct mode, compared to 80-120 MB/s using the buffer mode from a previous version. Certainly I expected this to be somewhat slower since it writes plots already optimized, but is 10x slower reasonable? Or maybe I have set something wrong?
I used the same devices.txt setting (0_0_4096_128_8192) as with buffer mode and the same 20000 value in my .bat to use ~5GB of CPU memory. Any tips would be appreciated.
-
hmmm, and now it dropped down to 2-3MB/s, which means it will never finish at that rate. Been stuck there for more than 10 minutes so there must be something wrong.
-
@GabryRox In direct mode there is a long delay as it builds the empty file before filling it in - just wait, it'll get faster. As long as it's not an SMR drive .....
-
thanks @haitch - but this has been going almost 3 hours now, hovering between 2-7 MB/s most of the time, with an ETA for completing a measly 500GB plot in about 28 hours! I had to look up what an SMR drive is, and unfortunately, I probably do fall into the category as most of my drives are 5TB Seagate Expansion or Backup+Hub external HDDs.
Does this mean I am pretty much hosed trying to write in direct mode with these? At this rate, it would take 2 weeks to direct-plot a single 5TB drive lol.
-
@cryo said in GPU plot generator v4.1.1 (Win/Linux):
I updated the GPU plot generator (v4.1.1). It now comes with file pre-allocation when launched with admin rights, which greatly speeds things up in terms of IO operations.
Thanks for your update to the GPU Plot Generator.
Running in Windows 10 / Admin mode (latest Visual Studio version installed), but getting an error message: "[ERROR] bad allocation". The plot size is only about ⅓ of the HDD. The same plot works OK (but slowly) with CPU plotters.
What causes this error message, and how can it be resolved?
Merci.
-
@GabryRox It's about two weeks for an 8TB drive, a 5TB will be about 10 days. You have a couple of options - plot in buffer mode - fast plot, slower mining, or plot to a PMR drive then copy to the SMR drive. If you're plotting a lot of drives - get an SSD, direct plot to that then move the plot to an SMR - rinse and repeat.
-
thanks @haitch - I actually have an empty 500GB Samsung EVO 830 in my newly built PC so I will try writing direct mode to that in maybe 400GB plots, then start copying those over to my SMR drives. that will end up making about 11 plots (not ideal i know) on the 5TB drives but what I've seen with my current 6-7 plotted drives is that my 1 drive with 13 small optimized plots reads 40-50% faster than my other drives with say 1-3 large, non-optimized plots, so it will still be worth it I think. Thanks again for the tip.
-
@GabryRox That will work out very well. Love those Samsungs almost as much as the Intel NVMe's.
-
@haitch Yup! This method may take a bit more baby-sitting but it will be soooo much faster! I am writing 400GB direct mode plots to that SSD in about 1 hour flat, then another 45 minutes or so to copy to the Seagate HDD. Even at 2 hours per file, that's only about 22 hours to almost fill a 5TB HDD vs 10 days the other way. Granted, i can't monitor this during sleep but since I work from home and already sit by this PC 10-12 hours a day, I can get a 5TB drive done easily in 2 days. Thanks again for this tip, really appreciate it!
-
@GabryRox you could write a little script that generates a file, then moves it to the hdd, writes the next file...
i think something like this could work:
gpuPlotGenerator generate direct <file>
move /y <file> <folder_on_hdd>
gpuPlotGenerator generate direct <file2>
move /y <file2> <folder_on_hdd>
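The same plot-then-move loop can be sketched in Python so it runs unattended. Everything here is an assumption for illustration: the address 123456, the nonce counts, the directory names, and the helper names. The output file name follows the pattern from cryo's direct-mode example (the last field becomes the nonce count), which should be checked against what the plotter actually writes. It only prints the plan unless `dry_run=False` is passed.

```python
import shutil
import subprocess
from pathlib import Path


def plan_jobs(address, start_nonce, nonces_per_file, file_count, stagger,
              ssd_dir, hdd_dir, plotter="gpuPlotGenerator"):
    """Return a list of (command, plotted_file, destination) tuples."""
    jobs = []
    for i in range(file_count):
        start = start_nonce + i * nonces_per_file
        # Argument passed to the plotter: <address>_<start>_<count>_<stagger>
        arg = Path(ssd_dir) / f"{address}_{start}_{nonces_per_file}_{stagger}"
        # Assumed name of the optimized file written in direct mode.
        out = Path(ssd_dir) / f"{address}_{start}_{nonces_per_file}_{nonces_per_file}"
        cmd = [plotter, "generate", "direct", str(arg)]
        jobs.append((cmd, out, Path(hdd_dir) / out.name))
    return jobs


def run_jobs(jobs, dry_run=True):
    """Plot each file to the SSD, then move it to the HDD, one at a time."""
    for cmd, src, dst in jobs:
        print(" ".join(cmd))
        print(f"move {src} -> {dst}")
        if not dry_run:
            subprocess.run(cmd, check=True)   # stop if the plotter fails
            shutil.move(str(src), str(dst))   # copies across drives, then deletes


# Placeholder values: 11 files of 1,500,000 nonces (roughly 390 GB each).
jobs = plan_jobs("123456", 0, 1_500_000, 11, 1000, "ssd_plots", "hdd_plots")
run_jobs(jobs, dry_run=True)
```

Keeping the plan separate from the execution makes it easy to check the start nonces for overlaps before committing hours of plotting.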
-
@GabryRox With the buffer strategy the bottleneck part is the computing power, to some extent. However, for the direct strategy it's the IO bandwidth.
There are two solutions to efficiently use the direct strategy:
- Plot to a faster disk (an SSD is the best choice), then copy the resulting files to a slower disk.
- Plot multiple disks at once, up to your computing power (based on your observations you could easily plot to 10 drives at the same time with one single GPU and fill them in the same amount of time).
@haitch The long delay is gone with the new version when you launch it with admin rights (4.1+). Still, it takes time to write each plot.
@BeholdMiNuggets The bad alloc error is a RAM issue. The [staggerSize] in the direct mode is used to determine the amount of RAM used by the process. Example:
# Will generate an optimized plots file named [123456_0_1000000_1000000] (250GB) using 250MB of RAM.
./gpuPlotGenerator generate direct 123456_0_1000000_1000
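The numbers in that example follow from the nonce size (4096*64 bytes = 256 KiB). A minimal sketch of the arithmetic, with hypothetical helper names; note it only sizes the staggerSize buffer, and the process also needs the paired GPU/CPU buffers on top of that:

```python
NONCE_KIB = 256  # one nonce = 4096 * 64 bytes = 256 KiB


def nonces_to_mib(nonces):
    """Size of `nonces` nonces in MiB."""
    return nonces * NONCE_KIB / 1024


def max_stagger_for_budget(ram_mib):
    """Largest staggerSize whose buffer fits in `ram_mib` MiB."""
    return ram_mib * 1024 // NONCE_KIB


print(nonces_to_mib(1000))              # staggerSize 1000 -> 250.0 MiB
print(nonces_to_mib(1_000_000) / 1024)  # 1,000,000 nonces, in GiB
print(max_stagger_for_budget(2048))     # a 2 GiB budget -> 8192 nonces
```

If "bad allocation" shows up, lowering the staggerSize until the buffers fit comfortably in free RAM is the first thing to try.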
-
@cryo said in GPU plot generator v4.1.1 (Win/Linux):
@BeholdMiNuggets The bad alloc error is a RAM issue. The [staggerSize] in the direct mode is used to determine the amount of RAM used by the process. Example:
# Will generate an optimized plots file named [123456_0_1000000_1000000] (250GB) using 250MB of RAM.
./gpuPlotGenerator generate direct 123456_0_1000000_1000
Thanks Cryo (& others). Any chance you could give us another, practical example for optimised (GPU) plots?
I don't see a ReadMe file in the current version of the GPU plotter, so I'm referencing previous editions.
E.g. (just a random example!):
For an Nvidia GTX 1080 Ti GPU (aka Gtx-1090), with 11 GB of GRAM/frame buffer, on a PC with a total of 16 GB of RAM, and 2x 8 TB HDDs (say): what could/should the [devices.txt] file contain?
And what would the process command line be?
Much appreciated, /B.M'Nugs.
-
@BeholdMiNuggets The README.md file is available in the repository. I forgot to include it in the binary releases; I corrected this for v4.1.1.
About your example:
The GPU RAM buffer must be paired with a CPU RAM buffer. Also, another buffer needs to be created for each output file to store the staggerSize reordered plots.
As you want to plot in direct mode, the staggerSize doesn't have so much impact. It just needs to evenly divide the GRAM to free the graphic card in time to generate the nonces in parallel.
So let's say 8GB GRAM and 2x2GB RAM, for a total of 14GB RAM if you count the paired buffers.
8GB = 32768 plots
2GB = 8192 plots
7.9TB = 33046528 plots
The devices.txt file should contain:
<PLATFORM> <DEVICE> 32768 <LOCAL_WORK_SIZE> <HASHES_NUMBER>
With:
- PLATFORM/DEVICE: The platform/device couple of your GTX 1080 Ti, as provided by the listPlatforms and listDevices commands, or by using the setup command.
- LOCAL_WORK_SIZE: A GTX 1080 Ti possesses 3584 computing units. You can try 2048 for this parameter. If it is rejected by the card, divide by two, and so on (1024, 512, 256).
- HASHES_NUMBER: 4096. If your screen blinks or you experience display driver crashes, use a small number, like 4.
The command line will be:
./gpuPlotGenerator generate direct <DRIVE1>:/<ADDRESS>_0_33046528_8192 <DRIVE2>:/<ADDRESS>_33046528_33046528_8192
With:
- ADDRESS: The numerical value of your Burst address.
As discussed previously, depending on your disks, it may be better to plot on SSDs or to more disks at the same time to enhance the overall throughput.
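For anyone adapting this to other sizes, here is a rough sketch of the arithmetic behind the numbers above and of how the per-drive arguments line up. The helper names are made up, it assumes the sizes are binary (1 GB = 1024^3 bytes), and the `<DRIVE>`/`<ADDRESS>` placeholders are left as-is.

```python
NONCE_BYTES = 4096 * 64  # 256 KiB per nonce
GIB = 1024 ** 3


def nonces_in_gib(gib):
    """How many nonces fit in `gib` GiB (integer input)."""
    return gib * GIB // NONCE_BYTES


def nonces_to_tib(nonces):
    """Size of `nonces` nonces in TiB."""
    return nonces * NONCE_BYTES / 1024 ** 4


def plot_args(address, nonces_per_drive, stagger, drives):
    """One <drive>/<address>_<start>_<count>_<stagger> argument per drive,
    with consecutive, non-overlapping start nonces."""
    if nonces_per_drive % stagger:
        raise ValueError("nonce count should be a multiple of staggerSize")
    return [
        f"{drive}/{address}_{i * nonces_per_drive}_{nonces_per_drive}_{stagger}"
        for i, drive in enumerate(drives)
    ]


print(nonces_in_gib(8), nonces_in_gib(2))   # 32768 8192
print(round(nonces_to_tib(33046528), 2))    # size of one drive's plot in TiB
args = plot_args("<ADDRESS>", 33046528, 8192, ["<DRIVE1>:", "<DRIVE2>:"])
print("./gpuPlotGenerator generate direct " + " ".join(args))
```

The second drive starts at nonce 33046528 so that the two files never cover the same nonces, which would waste space when mining.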
