Evolution - Dashdrive Discussion

oaxaca

Each member of the second tier will be required to have a specific amount of storage space in
order to power the DashDrive filesystem. By sharding the storage via the collateral transaction
hash, we can define 1024 different shared storage devices on the network. We use 1024
because, we can identify shards by using the first 10 bits of a unique hash per storage object.
For example, with a 40GB allocation requirement, the network can enjoy about 40960GB of
storage space. When users interact with the network they will transmit information to be stored
on DashDrive via the decentralized API.
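A minimal sketch of the shard lookup described above, assuming the "unique hash per storage object" is a SHA-256 digest (the paper does not name the hash function). The names `shard_id`, `shard_for_object` and `total_capacity_gb` are hypothetical and only illustrate the arithmetic.

```python
import hashlib

SHARD_BITS = 10
SHARD_COUNT = 1 << SHARD_BITS  # 2**10 = 1024 shards


def shard_id(digest: bytes) -> int:
    """Return the shard index encoded in the first 10 bits of a digest."""
    # Read the first two bytes (16 bits) and drop the low 6 bits.
    return int.from_bytes(digest[:2], "big") >> (16 - SHARD_BITS)


def shard_for_object(data: bytes) -> int:
    # Assumption: SHA-256 stands in for the unspecified object hash.
    return shard_id(hashlib.sha256(data).digest())


def total_capacity_gb(per_shard_gb: int) -> int:
    """Unique capacity if every shard holds per_shard_gb of distinct data."""
    return SHARD_COUNT * per_shard_gb
```

With a 40GB allocation per shard, `total_capacity_gb(40)` reproduces the 40960GB figure; redundant copies of a shard do not add to it.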

For redundancy, each shard will be stored multiple times on the network. For example if the
network has 5000 Masternodes, we will store each item ((5000/1024)+seed_count) times.
DashDrive supports a few advanced features such as transactional commits, where users can
require multiple files get written to different destinations on the network. If any write fails, the
entire commit for all files will be reverted.
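As a rough illustration of the redundancy formula, here is the arithmetic in Python. The paper does not define `seed_count`, so it is left as a parameter defaulting to 0, and the function name `copies_per_item` is hypothetical.

```python
def copies_per_item(masternodes: int, shards: int = 1024, seed_count: int = 0) -> float:
    """Evaluate (masternodes / shards) + seed_count from the paper.

    seed_count is undefined in the paper; 0 is a placeholder assumption.
    """
    return masternodes / shards + seed_count
```

For 5000 Masternodes and no seeds this gives a little under five copies per item on average.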

In addition, reading or writing files is only possible when a user has access to a given file, such
as their own profile page. When trying to read files a user does not have access to, they will be
denied access.

Writing files can be done only by having enough quorum signatures, and can be used to do
maintenance or allow users to update information on the network.
 
I get it. It's terrifying and beautiful.

Gone are the days of downloading 3500 copies of the blockchain onto the drive of each node. Now, the hard drives of each node are 1/3500 of the blockchain. Instead of querying your local copy "dash-cli getinfo", you query the high-performance shared computing resource called Dashdrive. This is disk drive striping on steroids.
 
Is DashDrive only intended for blockchain and transactions?

This will be the most expensive secure storage available. It would be best to keep any outside services off of the DashDrive. Bitcoin is getting sidechains polluting their blockchain. Eventually, this DashDrive will be huge and you don't want to be stuck because 3rd party storage was allowed early on.

A single transaction lookup would be slower if you have to figure out where it is stored, then request it and download it. Maybe you don't. If you store your transactions locally, they don't take up much space, you have all of your data to search around if you want. The cool part is that there is no waiting for a blockchain to download to make a transaction. That brings this to merchant friendly territory.
 
It's planned to move the blockchain to distributed mirrored storage using DashDrive? I hadn't come across that bit yet, is it in the Evolution docs? Certainly a step that justifies the "Evolution" tag, getting close to evolution squared at this stage :)
 
I think we have to take real care with this: how redundancy will work.
I don't think we can lose any data on DashDrive.
If, for example, the various copies of one shard can be identified and located (which MNs they are on), this can be a big weakness and open up possibilities for hackers to DDoS them or even cause data loss.
 

As I understand it, the shards are "striped" across various masternodes as well as having redundant sets. The optimum number of copies can be calculated using "uptime" of each node.

Should be pretty slick.

Unless the kill switch is pulled on the internet of course.
 
The evolution paper does not talk about storing the blockchain in DashDrive in a sharded way (as the primary source of truth). If this is really the plan, it would definitely deserve a section of its own. There are major challenges, e.g.: Who validates the transactions in a block? How to agree which is the longest chain? What happens if the longest chain changes?

I think the current sharding system has problems with reliability and consistency and therefore should not be used to store any information that is critical for the system to process transactions or to prevent fraud.

Quoting the Dash Evolution Paper:
"By sharding the storage via the collateral transaction hash, we can define 1024 different shared storage devices on the network. We use 1024 because, we can identify shards by using the first 10 bits of a unique hash per storage object.
...
For redundancy, each shard will be stored multiple times on the network. For example if the network has 5000 Masternodes, we will store each item ((5000/1024)+seed_count) times. "

I don't know what seed_count is supposed to be. Let's do some math: assume the collateral transaction hash is randomly distributed and we have 5000 masternodes. What is the chance that a shard has no masternode assigned to it? For a given shard x, the probability that all 5000 hashes fall outside x is (1023/1024)^5000 = 0.755762539%. This seems low. However, this is just for a single shard x. We have 1024 of them. What is the probability that all of them have at least one masternode assigned? Treating the shards as independent (an approximation), it is (1 - 0.00755762539)^1024 = 0.042288898%. With high probability there is always some shard that has not a single masternode assigned to it! Do not rely on randomness to produce uniform distributions.
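The arithmetic above can be checked with a few lines of Python. This is only a sketch: the closed form treats shards as independent, which is an approximation, and the small simulation assumes collateral hashes behave like uniform random draws. All function names are hypothetical.

```python
import random


def p_shard_empty(masternodes: int, shards: int) -> float:
    """Probability that one given shard receives none of the masternodes."""
    return (1 - 1 / shards) ** masternodes


def p_all_shards_covered_approx(masternodes: int, shards: int) -> float:
    """Approximate probability that every shard has at least one masternode.

    Treats shards as independent, which is not exact but close here.
    """
    return (1 - p_shard_empty(masternodes, shards)) ** shards


def fraction_with_empty_shard(trials: int, masternodes: int = 5000,
                              shards: int = 1024, seed: int = 1) -> float:
    """Monte Carlo estimate of how often at least one shard is left empty."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        used = {rng.randrange(shards) for _ in range(masternodes)}
        if len(used) < shards:
            hits += 1
    return hits / trials
```

Calling `p_shard_empty(5000, 1024)` and `p_all_shards_covered_approx(5000, 1024)` reproduces the two figures quoted above, and the simulation should show an empty shard in nearly every trial.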

I think this could be solved by assigning shards to masternodes in a deterministic way in the Quorum Chain: whenever a new masternode is added to the chain, it can be assigned to the shard that has the lowest number of masternodes assigned to it so far. This will provide a uniform assignment as long as there are no long series of removals without new masternodes being added.
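A minimal sketch of that least-loaded assignment, assuming ties are broken by the lowest shard index (my choice, any deterministic tie-break would do). `assign_least_loaded` is a hypothetical name, and removals are not modelled.

```python
def assign_least_loaded(counts: list[int]) -> int:
    """Assign a new masternode to the shard with the fewest masternodes.

    counts[i] is the number of masternodes currently in shard i.
    min() returns the first minimum, so ties go to the lowest index.
    """
    shard = min(range(len(counts)), key=counts.__getitem__)
    counts[shard] += 1
    return shard


def simulate(masternodes: int = 5000, shards: int = 1024) -> list[int]:
    """Add masternodes one at a time and return the per-shard counts."""
    counts = [0] * shards
    for _ in range(masternodes):
        assign_least_loaded(counts)
    return counts
```

With 5000 masternodes and 1024 shards, every shard ends up with either 4 or 5 masternodes, so no shard is left empty.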

Another problem with the sharding is DoS attacks: an attacker can DoS all masternodes of a specific shard. This will make that shard inaccessible. Wallets stored in this shard can no longer be accessed.

Even worse, an attacker can replace all masternodes of a specific shard with his own while running just a few masternodes: the shard of his own masternodes can be chosen (if sharding is done using the collateral transaction hash, the transaction can be modified until the hash maps to the desired shard; with the Quorum Chain, the masternodes can be added at the right time). According to the paper, Dash Evolution double-spend prevention relies on DashDrive:
"Double spending is not possible on the Dash Network due to the “Commit or Rollback” (COR) feature of DashDrive. After sending a transaction to the network, it will be written to DashDrive and the inputs will be reversed via usage of the filesystem."
As far as I understand, the prevention works by writing to a DashDrive file which is given by the input address. If the write succeeds, it is assumed that there is no double spend. A double spend attacker knows into which shard this write will go. By controlling all masternodes of this shard, the attacker can accept both writes making the network fail to detect the double spend of his input. Therefore, I do not think DashDrive is suitable for Double-Spend detection with its current design.

Even in the absence of malicious attackers, I do not see how the Dash network guarantees that the content of the DashDrive of all masternodes in a shard is consistent: writes to some masternodes can succeed while writes to others fail, masternodes can crash, there can be network partitioning and all the bad things that can happen in a distributed storage system. If the shards are not consistent, the Dash quorum lookups will have different results, which will cause transactions to fail.

Thoughts?

 
I don't think that's the plan. I might be wrong, but I think the sharded storage is for customer information, which is going to be a strain on the system, but not if sharded.

Eventually, I think there will be a very secure way to trim the blockchain, but that's not currently a priority :)

Anyway, your other observations are deeper than I dare go, but I'll PM Evan to look at what you wrote in case you found an issue :) Thanks!!!
 
Do we know the exact Dash Evolution node hosting requirements? Have they been posted anywhere?
 
I don't think that's the plan. I might be wrong, but I think the sharded storage is for customer information.
This was my understanding. Decentralized Mega where you pay for your storage to the masternode network. I don't believe there are any plans to use this storage to host the blockchain.
 
Oh no, not yet. I am pretty sure it won't be much more than what most of us have at the moment. I mean, my hosting comes with 65 GB of storage. And it's inexpensive.

coingun, just one thing, I don't think most people will have to pay for anything, only if they go over a basic limit or something :) Otherwise, yah :)
 

The "Scalability and performance" section of the evolution paper has some numbers on expected storage based on the rate of network transactions. I think the idea is to not have fixed requirements but to require the MNs to scale as the network grows -- which is supposed to happen in lockstep with an increase in Dash price and therefore MN revenue.
 
Hey Guys

Can I please ask that we keep VPS prices in mind when the main developers decide on harddrive space requirements. I am only here at Dash because I wanted to help the Bitcoin XT crowd, and then found out how much a VPS with a 60GB/80GB harddisk would cost. I realised I would run at a massive loss if I ran a node. Not all ISP connections in the world are great, and not all people can run their computers 24h a day.

For me a masternode is never about making a profit. But I love that it pays for itself. That's the beauty.
 
Just wanted to say that Evan confirmed that the Dash Drive is user data and Masternode data, no transactional data :)

Does this also mean Dash will not use the DashDrive for preventing double spends? Can somebody remove this from the paper then? Can you confirm transaction locking will still happen the same way as described in the InstantX paper? eduffield
 
No, where is that written in the paper? No, the DashDrive is simply a sharded storage system. It should start out with small requirements, and will grow over time.

The way transactions will be vetted is the same way InstantX transactions work, only instead of a single Masternode quorum being selected, virtually all the Masternodes will be randomly grouped into quorums of, I think it's decided now, 10 masternodes. They check and lock all the transactions that come their way, and those are the transactions the miners have to put into the blockchain. In this way, instead of traditional mining, where they can process 4-7 transactions per second, we will be able to conservatively process 4 x 350 transactions per second with the infrastructure we currently have, or a conservative 1400 transactions per second. To increase the number, we simply have to lower the collateral to own a masternode; halving it will pretty much double capacity.

So, with Evolution, miners will have no say as to what goes into the blockchain; rather, Masternodes will put a lock on all transactions and send them on to the miners.
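To make the 4 x 350 estimate concrete, here is the arithmetic as a small sketch. It assumes the 350 comes from roughly 3500 masternodes split into quorums of 10, and that each quorum handles the low end of the 4-7 transactions per second range; both readings are my interpretation, and `estimated_tps` is a hypothetical name.

```python
def estimated_tps(masternodes: int, quorum_size: int = 10,
                  tps_per_quorum: int = 4) -> int:
    """Rough throughput if every quorum locks transactions in parallel.

    Assumption: quorums do not overlap and each sustains tps_per_quorum.
    """
    quorums = masternodes // quorum_size
    return quorums * tps_per_quorum
```

Under these assumptions 3500 masternodes give 1400 transactions per second, and doubling the masternode count (for example by halving the collateral) doubles the estimate.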
 

See section 7.4 from the dash evolution paper (quotes below). This made me assume the DashDrive is used instead of MN (Quorum) locks.
The miners would not have to keep the blockchain and verify the inputs are unspent. They'll instead have to verify the transaction locks -- which will use some resources as well.

7.4 Double-Spend Prevention - Commit or Rollback
Double spending is not possible on the Dash Network due to the “Commit or Rollback” (COR) feature of DashDrive. After sending a transaction to the network, it will be written to DashDrive and the inputs will be reversed via usage of the filesystem.


CTransaction()
(
    Input(1) => /dashdrive/inputs/hash1,
    Input(2) => /dashdrive/inputs/hash2,
    Input(3) => /dashdrive/inputs/hash3,
)


If at any point a write to DashDrive fails, the earlier writes will be reverted back to the state before the commit started. This allows two users (or one attacker) can attempt to write a transaction inputs, being that inputs are unique file locators on the network, only one will successfully be able to write the transaction commit. In addition to reserving resources on the network, DashDrive stores the whole transaction history across the shared filesystem, while awaiting archival in the permanent blockchain.
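A minimal single-process sketch of the commit-or-rollback idea quoted above, assuming each input maps to a path that may only be created once. `InMemoryDrive` and `commit_or_rollback` are hypothetical names, and the sketch says nothing about how several masternodes in a shard would agree on the result.

```python
class InMemoryDrive:
    """Toy stand-in for DashDrive: a dict of path -> value."""

    def __init__(self):
        self.files = {}

    def create_exclusive(self, path, value):
        # Fail if the path already exists, like an exclusive file create.
        if path in self.files:
            raise FileExistsError(path)
        self.files[path] = value


def commit_or_rollback(drive, tx_id, input_hashes):
    """Reserve every input for tx_id, or undo all writes and return False."""
    written = []
    try:
        for h in input_hashes:
            path = f"/dashdrive/inputs/{h}"
            drive.create_exclusive(path, tx_id)
            written.append(path)
    except FileExistsError:
        # One input was already reserved: revert the earlier writes.
        for path in written:
            del drive.files[path]
        return False
    return True
```

If a second transaction reuses an input that is already reserved, its commit fails and any inputs it reserved earlier in the same commit are released again.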
 
May I suggest that our developers use the Kinetic Open Storage API for the dashdrive? It will save a lot of maintenance for the MN operators. And, it can be expanded easily. That's something really good. Trust me, spend some time to read it before you think it was just a drive connected to the network.


 