All of your questions were already answered in the answer to your original question.
In Main Mode, the Pre-Shared-Key (PSK) is verified in Messages 5 and 6. Messages 5 and 6 are protected by the session keys ISAKMP generates, described above.
In Aggressive Mode, none of the messages in the negotiation are encrypted, and the PSK is verified in Messages 2 and 3. Notice that in both cases I said the PSK is verified; I never said the PSK is exchanged. Obviously, if nothing in Aggressive Mode is encrypted and you simply sent the Pre-Shared-Key across the wire unencrypted, there would be a huge, gaping security vulnerability.
Lucky for us, the writers of ISAKMP already thought of that. As a result, they created a special method for verifying that each party has the correct PSK without actually sharing it across the wire. Two items are used to validate to each peer that they both have the same PSK: the Identity Method and the Identity Hash.
VPN peers can choose to identify themselves by various methods; most commonly, peers simply use their source IP address, but they have the option to use an FQDN or hostname. The chosen method, together with the corresponding value for that method, is what makes up the Identity Method. So for example, if I had the IP 5.5.5.5, and I wanted to use my IP address to identify myself, my ID Method would effectively be [IP Address, 5.5.5.5]. (Note: BOTH values make up the entire ID Method.)
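To make the "both values" point concrete, here is a minimal sketch of how an ID Method could be laid out as bytes. It assumes the ID payload body layout and type codes from the IPsec DOI (RFC 2407): a one-byte ID type, a one-byte protocol ID, a two-byte port, then the identification data. The function name `id_payload_body` and the hostname `vpn.example.com` are made up for illustration.

```python
import ipaddress
import struct

# ID type codes as assigned in the IPsec DOI (RFC 2407).
ID_IPV4_ADDR = 1
ID_FQDN = 2


def id_payload_body(id_type: int, value: str) -> bytes:
    """Build an ID payload body: type, protocol ID, port, then the value."""
    if id_type == ID_IPV4_ADDR:
        data = ipaddress.IPv4Address(value).packed  # 4 raw address bytes
    elif id_type == ID_FQDN:
        data = value.encode("ascii")
    else:
        raise ValueError(f"unsupported ID type: {id_type}")
    # Assumption: protocol ID and port are left at 0 for this sketch.
    return struct.pack("!BBH", id_type, 0, 0) + data


by_ip = id_payload_body(ID_IPV4_ADDR, "5.5.5.5")
by_name = id_payload_body(ID_FQDN, "vpn.example.com")
print(by_ip.hex())  # 0100000005050505
```

The method (the type byte) and the value (the address or name) travel together, which is why changing either one changes what gets fed into the Identity Hash.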
The ID Method is then combined (using a PRF) with the Seed value we discussed earlier (SKEYID), and a few other values, to create the Identity Hash. Recall that one of the inputs that went into creating SKEYID in the first place was the Pre-Shared-Key.
The ID Method and ID Hash are then sent across the wire, and the other party attempts to re-create the ID Hash using the same formula. If the receiver is able to re-create the same ID Hash, it proves to the receiver that the sender must have had the correct pre-shared-key.
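A rough sketch of that formula, following the RFC 2409 definitions for pre-shared key authentication: SKEYID = prf(pre-shared-key, Ni_b | Nr_b) and HASH_I = prf(SKEYID, g^xi | g^xr | CKY-I | CKY-R | SAi_b | IDii_b). It assumes HMAC-SHA1 as the PRF (the real one depends on what the peers negotiated), and the function names are made up for illustration.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    # Assumption: HMAC-SHA1 stands in for the negotiated PRF.
    return hmac.new(key, data, hashlib.sha1).digest()


def skeyid_psk(psk: bytes, ni: bytes, nr: bytes) -> bytes:
    # SKEYID = prf(pre-shared-key, Ni_b | Nr_b)
    return prf(psk, ni + nr)


def hash_i(skeyid: bytes, g_xi: bytes, g_xr: bytes, cky_i: bytes,
           cky_r: bytes, sai_b: bytes, idii_b: bytes) -> bytes:
    # HASH_I = prf(SKEYID, g^xi | g^xr | CKY-I | CKY-R | SAi_b | IDii_b)
    return prf(skeyid, g_xi + g_xr + cky_i + cky_r + sai_b + idii_b)


def verify_hash_i(psk: bytes, ni: bytes, nr: bytes, g_xi: bytes,
                  g_xr: bytes, cky_i: bytes, cky_r: bytes, sai_b: bytes,
                  idii_b: bytes, received: bytes) -> bool:
    # The receiver rebuilds the hash from its own copy of the PSK and
    # compares it (in constant time) with the one that arrived.
    seed = skeyid_psk(psk, ni, nr)
    expected = hash_i(seed, g_xi, g_xr, cky_i, cky_r, sai_b, idii_b)
    return hmac.compare_digest(expected, received)


# Placeholder values; in a real exchange these come from Messages 1-4.
public = dict(ni=b"Ni", nr=b"Nr", g_xi=b"g^xi", g_xr=b"g^xr",
              cky_i=b"CKY-I...", cky_r=b"CKY-R...", sai_b=b"SAi_b",
              idii_b=bytes([1, 0, 0, 0, 5, 5, 5, 5]))
sent = hash_i(skeyid_psk(b"secret", public["ni"], public["nr"]),
              public["g_xi"], public["g_xr"], public["cky_i"],
              public["cky_r"], public["sai_b"], public["idii_b"])
print(verify_hash_i(b"secret", received=sent, **public))
print(verify_hash_i(b"wrong", received=sent, **public))
```

The PSK itself never appears in `sent`; only a peer that can rebuild the same SKEYID can reproduce the hash.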
And, from RFC 2409:
When doing a pre-shared key authentication, Main Mode is defined as
follows:
Initiator                          Responder
----------                         -----------
HDR, SA                 -->
                        <--        HDR, SA
HDR, KE, Ni             -->
                        <--        HDR, KE, Nr
HDR*, IDii, HASH_I      -->
                        <--        HDR*, IDir, HASH_R
Message 5 is HDR*, IDii, and HASH_I
Message 6 is HDR*, IDir, and HASH_R
In my description, IDix was the ID Method, and HASH_x was the ID Hash. Was there something in there that didn't make sense?
If Firewall1 has the IP 1.1.1.1, and Firewall2 has the IP 2.2.2.2, and they are both using the Pre-Shared-Key of "Zebra", then Message 5 and Message 6 would include this information:
1.1.1.1                            2.2.2.2
----------                         -----------
HDR*, IDii, HASH_I      ---5-->
                        <--6---    HDR*, IDir, HASH_R
Message 5 would include:
- IDii, which is effectively [IP Address, 1.1.1.1].
- HASH_I, which is a hash of the following values: (SEED Value + IDii + Public Values)
Message 6 would include:
- IDir, which is effectively [IP Address, 2.2.2.2]
- HASH_R, which is a hash of the following values: (SEED Value + IDir + Public Values)
The Public Values above are values that were exchanged in cleartext in Messages 1, 2, 3, and 4, so anyone eavesdropping on the wire would have access to them. The Seed Value is the only input to the hash that is known only to someone who holds the original Pre-Shared-Key, because one of the values that went into creating the Seed Value is the string "Zebra", aka the pre-shared-key. See above, and the original answer, for more details on how it is formed.
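Here is a minimal sketch of that Firewall1/Firewall2 exchange. It assumes HMAC-SHA1 as the PRF and uses made-up placeholder bytes for the nonces, Diffie-Hellman public values, cookies, and SA proposal; the hash input ordering follows RFC 2409. The point is only that an eavesdropper who holds every public value but guesses the wrong key (here "Giraffe") cannot reproduce either hash.

```python
import hashlib
import hmac
import ipaddress


def prf(key: bytes, data: bytes) -> bytes:
    # Assumption: HMAC-SHA1 stands in for the negotiated PRF.
    return hmac.new(key, data, hashlib.sha1).digest()


def id_body(ip: str) -> bytes:
    # ID type 1 (IPv4 address), protocol ID 0, port 0, then the address.
    return bytes([1, 0, 0, 0]) + ipaddress.IPv4Address(ip).packed


# Public Values: visible in cleartext Messages 1-4 (placeholders here).
NI, NR = b"nonce-from-1.1.1.1", b"nonce-from-2.2.2.2"
G_XI, G_XR = b"dh-public-initiator", b"dh-public-responder"
CKY_I, CKY_R = b"cookie-I", b"cookie-R"
SAI_B = b"sa-proposal-bytes"


def hashes(psk: bytes) -> tuple[bytes, bytes]:
    """Return (HASH_I, HASH_R) as computed by someone holding `psk`."""
    seed = prf(psk, NI + NR)  # the Seed Value (SKEYID)
    h_i = prf(seed, G_XI + G_XR + CKY_I + CKY_R + SAI_B + id_body("1.1.1.1"))
    h_r = prf(seed, G_XR + G_XI + CKY_R + CKY_I + SAI_B + id_body("2.2.2.2"))
    return h_i, h_r


fw1_hash_i, fw2_hash_r = hashes(b"Zebra")   # what goes on the wire
guess_i, guess_r = hashes(b"Giraffe")       # eavesdropper's wrong guess

print(hashes(b"Zebra") == (fw1_hash_i, fw2_hash_r))  # peers agree
print(guess_i == fw1_hash_i, guess_r == fw2_hash_r)  # guess fails
```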