Dear All,
Here is the incident root cause analysis done by @UdjinM6 (many thanks for that!).
Forked Blocks:
Hash: 00000000000047e6e7aebb3fa1a0565a283f05d11d9324c23252424fb6e6518d
Previous Block: 000000000002dc1c76bbac26dede4791e11565f8bed8030652d8823afd1443c8
Height: 523412 <-------------
Transaction Merkle Root: d106445cd5dc1bff22b37fc4fabcea78593a19aa9e93307231ab90355a5faf2a
Time: 1471728542 (2016-08-20 21:29:02)
Difficulty: 20 351.533 (Bits: 1b03385f)
Nonce: 2929999341
Payee: Xoi3AgTmuBGR2beRJsXKrLtsX8WAtZweZP <-------------
Hash: 000000000000e54f036576a10597e0e42cc22a5159ce572f999c33975e121d4d
Previous Block: 000000000002dc1c76bbac26dede4791e11565f8bed8030652d8823afd1443c8
Height: 523412 <-------------
Transaction Merkle Root: 9f8f2556ad9e488596adb21fb3fc161b9b10b744a68920806b72908ea49b7b01
Time: 1471728541 (2016-08-20 21:29:01)
Difficulty: 20 351.533 (Bits: 1b03385f)
Nonce: 1110724282
Payee: XtCicpfofmiPUzDQzAmdy6Sf5zCmr3cXhi <-------------
What happened:
1. We had an inconsistency in the masternode list between pre-0.12.0.58 (57?) and 0.12.0.58 nodes, _probably_ due to a bug in the IPv6 MN signatures themselves, or due to the fixes in #847 and #858 (improved in #860) /facepalm here/, or both. Either way, there must have been some inconsistency to trigger (2).
2. The masternode list inconsistency was the reason why the results of voting for a winning masternode could look like this: { "523412": "XtCicpfofmiPUzDQzAmdy6Sf5zCmr3cXhi:5, Xoi3AgTmuBGR2beRJsXKrLtsX8WAtZweZP:5" }. Note that neither masternode received enough votes (6+) to be declared a clear winner.
3. We have a rule which says that if there are not enough votes to declare a clear winner, the network should accept as valid a block that pays any of the nodes that were voted for (https://github.com/dashpay/dash/blob/master/src/masternode-payments.cpp#L519).
4. However, there probably was a large enough number of nodes that had vote results like { "523412": "XtCicpfofmiPUzDQzAmdy6Sf5zCmr3cXhi:4, Xoi3AgTmuBGR2beRJsXKrLtsX8WAtZweZP:6" } or { "523412": "XtCicpfofmiPUzDQzAmdy6Sf5zCmr3cXhi:6, Xoi3AgTmuBGR2beRJsXKrLtsX8WAtZweZP:4" }, and they thought that there was a clear winner, but it was a different one on each side.
5. So when two large miners solved a block with different winners, every node that had a clear winner received a block it liked, rejected the other block, and later continued with its chosen branch.
6. The miner who solved 00000000000047e6e7aebb3fa1a0565a283f05d11d9324c23252424fb6e6518d was mining on a pre-0.12.0.58 node and, like probably most pre-0.12.0.58 nodes, had a clear winner, so he also rejected the other block and continued mining on his chain.
7. Nodes that had no clear winner probably accepted both blocks they received but later just continued with the longest chain.
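The steps above can be illustrated with a small sketch. This is not the actual Dash source; it only models the rule as described in (3), with the 6-vote threshold taken from (2). The function and constant names are made up for illustration, and the three vote tallies are the examples quoted in (2) and (4).

```python
# Illustrative model of the payee acceptance rule described in steps 2-4.
# Names below (SIGNATURES_REQUIRED, is_block_payee_valid) are hypothetical,
# not identifiers from the Dash codebase.

SIGNATURES_REQUIRED = 6  # "6+" votes needed for a clear winner, per the post


def is_block_payee_valid(votes, block_payee, required=SIGNATURES_REQUIRED):
    """Return True if this node would accept a block paying block_payee.

    votes: dict mapping payee address -> vote count as seen by this node.
    """
    clear_winners = {p for p, n in votes.items() if n >= required}
    if not clear_winners:
        # No clear winner: any payee that was voted for is acceptable.
        return block_payee in votes
    # Clear winner(s): only a block paying one of them is acceptable.
    return block_payee in clear_winners


XT = "XtCicpfofmiPUzDQzAmdy6Sf5zCmr3cXhi"
XO = "Xoi3AgTmuBGR2beRJsXKrLtsX8WAtZweZP"

# Three different views of the votes for height 523412.
views = {
    "tie (5:5)": {XT: 5, XO: 5},
    "Xo wins (4:6)": {XT: 4, XO: 6},
    "Xt wins (6:4)": {XT: 6, XO: 4},
}

for name, votes in views.items():
    accepts = [p for p in (XT, XO) if is_block_payee_valid(votes, p)]
    print(name, "accepts blocks paying:", accepts)
```

Under this model, a node with the 5:5 view accepts either block, while nodes with the 4:6 and 6:4 views each accept only one block and reject the other, which is the split described in (5) to (7).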
Temporary solution:
If the suggestion about masternode list inconsistency in (1) above is right, then having most nodes on 0.12.0.58 should reduce the inconsistency, i.e. it will be much less likely that two different winners each get 6+ votes on different parts of the network.
Current known issues:
We can't get rid of the masternode list inconsistency completely right now due to the way it's implemented.
Thoughts on future solutions:
Since the masternode winner list is actually a part of the consensus, we need to implement a deterministic solution to avoid any inconsistency. This can be either some on-chain solution or an off-chain protocol enhancement, like another round of voting or some kind of vote synchronization among the top 10 masternodes.
Final thoughts:
Having most masternodes and all (or at least all major) miners on 0.12.0.58 should be enough for now. In the long term, a solution must be provided to address the inconsistency issue.