I’m sad to report that it appears very unlikely that the EHF signature will be recovered in this first cycle.
The issue, as we currently understand it, is that the values for requestID and msgHash were switched during implementation. As a result, a masternode that participated in an earlier failed attempt to form an EHF message is unable to participate in subsequent attempts. This is because the LLMQ Signing System requires that the requestID be unique, and a node will not sign two different msgHash values for the same requestID. This rule is how we guarantee that no two conflicting blocks can both receive a chainlock: they share a common requestID, so nodes will not sign for both.
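To make the signing rule concrete, here is a minimal sketch in Python. It is a toy model, not Dash Core code: the `Signer` class, the `h` helper, and the strings being hashed are all hypothetical, and it assumes the swap left the requestID identical across attempts while the msgHash changed.

```python
import hashlib


def h(*parts):
    """Hypothetical helper: hash some strings into a hex digest."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


class Signer:
    """Toy model of one masternode's signing rule (not Dash Core code)."""

    def __init__(self):
        # Maps each requestID this node has signed to the msgHash it signed.
        self.signed = {}

    def try_sign(self, request_id, msg_hash):
        previous = self.signed.get(request_id)
        if previous is not None and previous != msg_hash:
            # Same requestID, different msgHash: refuse to sign.
            return False
        self.signed[request_id] = msg_hash
        return True


# Assumed failure mode: the requestID stays the same between attempts
# while the msgHash differs, so the second attempt is refused.
node = Signer()
fixed_id = h("ehf", "fixed")
first_attempt = node.try_sign(fixed_id, h("ehf", "attempt-1"))
second_attempt = node.try_sign(fixed_id, h("ehf", "attempt-2"))

# Contrast: if the requestID differed per attempt, both attempts could sign.
other = Signer()
unique_first = other.try_sign(h("ehf", "attempt-1"), h("ehf", "payload"))
unique_second = other.try_sign(h("ehf", "attempt-2"), h("ehf", "payload"))

print(first_attempt, second_attempt, unique_first, unique_second)
```

In this model the node signs the first attempt and refuses the second, which mirrors why a masternode that took part in a failed attempt cannot take part in a later one. The same refusal is what protects chainlocks from conflicting blocks.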
At this point, we believe the best option is to release a new version, v21.1.0, that fixes this problem by disregarding the expected architectural requirement that requestIDs must be unique for a single request. We prefer this because the proper fix would require releasing a whole new major version, v22, which would need adoption by all stakeholders, including miners, exchanges, and masternodes, and would significantly extend the timeline to activation.
On testnet and devnets this issue did not manifest itself, for a couple of reasons.
Likely the largest reason is that the threshold on devnets and testnet was generally 60%, as opposed to 85% on mainnet.
Additionally, a few testing iterations used SPORK_24_TEST_EHF to delay signing until we had sufficient nodes upgraded. At the time of this testing, we had not hardened mainnet sporks and the plan for mainnet rollout included keeping Spork 24 disabled until sufficient nodes had upgraded.
Ultimately, while we did test EHF substantially, activating it twice on testnet and many more times on various devnets, we did not adequately test situations where multiple quorum cycles in a row had substantial adoption, but not enough adoption to create a recovered signature. We did, however, test having a set of quorums just below the threshold, seeing that the message was not created, and then upgrading nodes and seeing that the EHF message could then be created.
These investigations have been done primarily by knst and myself over the weekend, in stressful and less than ideal circumstances. We are now working to confirm that our findings are correct, so we can resolve the issue in the best way possible and ensure the fastest possible activation date for this version of Core and Evolution.
Long term, the swap of requestID and msgHash will be corrected for subsequent EHF activations.
I will try to communicate more here as we continue this testing, and then release v21.1.