Ideas for a simpler and more efficient proof-of-capacity scheme



  • Hi, I wanted to share a proof-of-capacity scheme that is simpler yet more efficient than the one used by Burstcoin, and which I feel is also more straightforward to understand.

    Say every miner has a unique public key. Now calculate a list of hashes (using any algorithm) of that public key (or a shortened hash of it) concatenated with a simple counter nonce, e.g.:

    Hash(PK||0),
    Hash(PK||1),
    Hash(PK||2),
    Hash(PK||3),
    ...
    

    Classify these nonces, based on the resulting hashes, into up to 65536 separate files (or "buckets") named '0000'..'ffff' (hex for 0..65535). The file '0000' would store all the nonces that resulted in hashes starting with '0000', file '0001' would store all the nonces that generated a hash starting with '0001', etc.

    The resulting set of files would look something like:

    File '0000':
    
    000000f1edfe
    000008dfc5a0
    0000ababc735
    000054e108d1
    0000122c90ed
    ...
    
    File '0001':
    
    0000000063f8
    00000024e595
    00000005b9c1
    00000e64f04b
    00000f87f9ee
    ...
    
    ...
    
    File 'ffff':
    ...
    

    (Note: the buckets contain the nonces, not the hashes themselves. The nonces are added in the order they are generated, since they come from a simple counter; the corresponding hashes are not ordered in any way within the bucket.)
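
    To make the plotting step concrete, here is a minimal sketch in Python. The specific choices (SHA-256 as the hash, 6-byte nonces, bucket files named after the first two hash bytes) are just illustrative assumptions; the scheme itself doesn't mandate any of them:

    import hashlib

    NONCE_BYTES = 6    # 6-byte counter nonce, as in the example above
    PREFIX_BYTES = 2   # first 2 hash bytes select one of 65536 buckets

    def plot(pk: bytes, start: int, count: int) -> None:
        # Hash PK||nonce and append each nonce to the bucket file named
        # after the first two bytes of its hash ('0000'..'ffff').
        for nonce in range(start, start + count):
            nonce_bytes = nonce.to_bytes(NONCE_BYTES, "big")
            digest = hashlib.sha256(pk + nonce_bytes).digest()
            bucket_name = digest[:PREFIX_BYTES].hex()
            with open(bucket_name, "ab") as bucket:   # buckets are append-only
                bucket.write(nonce_bytes)

    (A real plotter would batch writes per bucket rather than reopening files for every nonce, but the structure is the same.)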

    Using, say, 6 bytes for each nonce, as demonstrated, would allow indexing a potential space of up to 256^6 = 281 trillion nonces, which would cover up to 1.7 petabytes of mined data.
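
    A quick check of these numbers in Python:

    nonces = 256 ** 6          # 281,474,976,710,656 (~281 trillion) distinct 6-byte nonces
    print(nonces * 6 / 1e15)   # ~1.69 -> roughly 1.7 petabytes of plot data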

    This scheme lets a disk-based miner perform 256^2 (65536) times fewer hash operations than a pure processor-only miner to locate a winning nonce: select the appropriate bucket, then test all the nonces in it for a match. This still requires millions of hash operations per lookup for a terabyte-scale mined dataset. However, with a simple trick, that advantage can be significantly improved: just store one or two additional bytes of the corresponding hash along with the nonce value. Every entry in the bucket would then look like:

    [bytes 3 and 4 of the hash][nonce]
    

    Example:

    File '0000':
    
    3405, 000000f1edfe
    9f29, 000008dfc5a0
    5be2, 0000ababc735
    1e27, 000054e108d1
    a346, 0000122c90ed
    ...
    
    File '0001':
    
    7ab2, 0000000063f8
    323c, 00000024e595
    1956, 00000005b9c1
    53f3, 00000e64f04b
    839c, 00000f87f9ee
    ...
    
    ...
    
    File 'ffff':
    ...
    ...
    

    This requires 8 bytes per entry, but now provides an even stronger advantage: a processing-only miner would need to calculate 256^4 = ~4 billion times more hashes than a disk-based one, on average (9 orders of magnitude more processing power).
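
    A minimal lookup sketch under the same assumptions as the plotting sketch above (SHA-256, 8-byte entries of the form [hash bytes 3-4][6-byte nonce]); how the per-block challenge is expressed is also an assumption here, taken to be a target hash prefix:

    import hashlib

    ENTRY_BYTES = 8   # [bytes 3 and 4 of the hash][6-byte nonce]

    def lookup(pk: bytes, target: bytes):
        # Pick the one bucket matching the first two target bytes, stream it
        # into memory, and hash only the entries whose stored bytes 3-4 also
        # match the target; on average only a handful of entries survive.
        bucket_name = target[:2].hex()
        with open(bucket_name, "rb") as f:
            data = f.read()                    # ~16 MB per bucket for 1 TB of plots
        for i in range(0, len(data), ENTRY_BYTES):
            entry = data[i:i + ENTRY_BYTES]
            if entry[:2] != target[2:4]:       # cheap 2-byte filter, no hashing needed
                continue
            nonce = entry[2:]
            digest = hashlib.sha256(pk + nonce).digest()
            if digest.startswith(target):      # full re-check only for the survivors
                yield nonce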

    Some notes:

    • Since only 8 bytes are stored per hash (regardless of hash length), this scheme is 4 times more space-efficient than the one currently used by Burstcoin (4 times more 'tickets' can be packed into one unit of storage).
    • For 1 TB of mined data, each bucket would be about 16 MB in size; thus on average only 16 MB would be read from disk and only about 32 hash operations performed per lookup (see the quick check after this list).
    • The entries don't need to be sorted in any special order within their bucket. Testing for a match would simply use linear search (by streaming the data to memory in chunks), and since individual files are not very large, that would be very fast (on the order of several tens to hundreds of milliseconds on a modern hard drive).
    • The hash itself can be of any length. E.g. SHA-3 can be used; its extendable-output variants (SHAKE128/SHAKE256) can even produce digests of several MB.
    • Multiple disks can address different ranges of nonces (every disk would have its own set of buckets).
    • The mining process can occur continuously and incrementally in the background. Buckets are append-only files and entries can be read from them as soon as they are added. This allows the initial mining phase to be relatively long (say, several weeks or more) while still providing immediate availability of any tickets generated, and thus potential reward for the miner.
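
    The quick check referred to in the list above, taking 1 TB as 2^40 bytes (Python):

    entries_total = 2 ** 40 // 8         # 8-byte entries in 1 TB of plot data
    per_bucket = entries_total // 65536  # ~2.1 million entries per bucket
    print(per_bucket * 8 // 2 ** 20)     # 16  -> ~16 MB read per lookup
    print(per_bucket // 65536)           # 32  -> ~32 entries pass the 2-byte filter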

    If you've noticed any errors in my calculations, please let me know and I'll try to correct them. I'm also interested in hearing about other alternative schemes that have been proposed.


  • admin

    I'm not much into these core details, but I forwarded your post to people who may have an opinion on this. Thanks for providing your thoughts.



  • @randomuser I have no idea what you're talking about. However, I'll tell you what: I will learn and find out. It's on my to-do list. Can you provide a source where one can learn about PoC from a beginner's level?

    Keep posting and adding more clarification.

    Have a magnificent day on PURPOSE!



  • The original design for Burstcoin uses a system of 'plots' and 'scoops'. 'Plots' are 256 KB clusters that contain chains of hashes (an individual hash is called a 'scoop').

    Every 'round' (or 'block') of Burstcoin derives both a target value to match (similar to a lottery target) and a random target scoop number. To calculate any particular 'scoop', its whole 'plot' needs to be known.

    I'm not completely sure, but there seems to be a belief that grouping the hashes (or 'tickets', as I call them, in the sense of each block being a 'lottery') this way allows for a greater advantage against a pure processing-based miner.

    This belief may not be well-founded. It is true that this makes calculating any individual scoop up to 4096 times more expensive, but it also decreases the number of valid 'candidates' for a particular block by a factor of 4096.

    I think a more effective metric would be to compare the time it would take for a processing-only miner to generate an equivalent number of candidates, say, comparable to a large, terabyte-scale mined dataset.

    If the block time is, say, 1 minute, and mining to disk took a month, that means a processing-only miner would need about 43,200 times the computing power (there are 60 * 24 * 30 minutes in a month) to test the same number of candidates.
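
    A rough sketch of that comparison in Python (the one-month plotting time and one-minute block time are just the illustrative assumptions from the paragraph above):

    plot_time_minutes = 60 * 24 * 30                 # one month of plotting
    block_time_minutes = 1
    print(plot_time_minutes / block_time_minutes)    # 43200.0 -> equivalent CPU-only advantage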

    I understand this still may not be completely clear. The original description and flow chart weren't very clear either. I'll try to do my best to simplify this further if I can.

