TEZ-4729: Improve ShuffleHandler#512
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
🎊 +1 overall
This message was automatically generated. |
| // Create a JobTokenSecretManager initialized with the session token's password | ||
| // for shuffle delete requests. The shuffle handler verifies requests using the | ||
| // token password, so the MAC must be initialized with it (as tasks do). | ||
| shuffleJobTokenSecretManager = new JobTokenSecretManager( |
There was a problem hiding this comment.
not sure if we need a separate secret manager, as the sessionToken is provided for jobTokenSecretManager already:
There was a problem hiding this comment.
JobTokenSecretManager has two different key paths:
computeHash(byte[] msg)(instance method) — uses mac.doFinal(msg) with the MAC master key (random at construction)retrieveTokenSecret(String jobId)— returns the per-job key from thecurrentJobTokensmap (populated byaddTokenForJob)
The client-side AsyncHttpConnection.computeEncHash() calls:
encHash = SecureShuffleUtils.hashFromString(msgToEncode, jobTokenSecretMgr);
// → mgr.computeHash(msg) ← uses MAC master key, NOT the per-job token from the map
The server-side verifyRequest calls:
SecretKey tokenSecret = getSecretManager().retrieveTokenSecret(appid); // ← uses the map
SecureShuffleUtils.verifyReply(urlHashStr, enc_str, tokenSecret);
So addTokenForJob populates the map (server path), but the client hash computation uses the MAC field (instance method). They're completely separate code paths. The MAC must be initialized
with the token password for the client-side hash to match what the server verifies.
It was leading to test failure here: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-512/3/testReport/org.apache.tez.auxservices/TestShuffleHandlerJobs/testOrderedWordCount/
Did I decode it wrong?
There was a problem hiding this comment.
that's right, thanks for clarifying, I tend to forget SecretManager internals
so the code comment indeed explains the situation well: client side uses the instance method, so we must initialize the instance MAC with the sessionToken, which is used from the map on the server/shufflehandler side
No description provided.