How roaming actually works at the frame level - not what the marketing sheet says, but what happens between the client and AP and what it looks like in a PCAP.
— Shankar K., Wi-Fi engineer, Irving TX · 15 years 802.11 protocol analysis
The three amendments work as a system. 802.11k gives the client a target. 802.11v gives the AP a voice. 802.11r removes the re-authentication delay. Without all three, something in the roaming chain is broken - and you will see it in the PCAP.
| Amendment | Role | Who initiates | PCAP filter | Without it |
|---|---|---|---|---|
| 802.11k | Gives client a candidate AP list - no blind scanning | Client requests, AP responds | wlan_mgt.fixed.action_code == 4 | Client scans all channels blind. Roam latency 500ms+ |
| 802.11v | AP nudges sticky client to a better AP | AP initiates BTM Request | wlan_mgt.fixed.action_code == 7 | Sticky clients hold weak signal. No AP-side steering |
| 802.11r | Eliminates EAPOL re-auth on roam | Pre-negotiated via FT protocol | wlan_mgt.rsn.akms.type == 4 (FT-PSK; use 3 for FT-802.1X) | Full re-authentication and 4-way handshake on every roam. 300-800ms delay |
Field note: The most common enterprise roaming failure I diagnose is an 802.11r deployment where clients fall back to full re-auth. The PCAP tells the story immediately - look at the AKM type in the AssocReq RSN IE. If it's type 2 (PSK) instead of type 4 (FT-PSK), the client bypassed FT entirely. Usually caused by CCKM being advertised alongside FT, or client driver not supporting the AP's FT mode.
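The AKM check described above can be sketched in a few lines. This is a minimal illustration, assuming you have already extracted the 4-byte AKM suite selector (OUI plus type) from the RSN IE; the function name `decode_akm` is hypothetical, and the lookup table only covers the AKM types this article discusses.

```python
# AKM suite types under the 00-0F-AC OUI, limited to those discussed here.
AKM_NAMES = {1: "802.1X", 2: "PSK", 3: "FT-802.1X", 4: "FT-PSK"}
FT_AKM_TYPES = {3, 4}
IEEE_OUI = bytes.fromhex("000fac")


def decode_akm(selector: bytes) -> tuple[str, bool]:
    """Return (name, is_ft) for a 4-byte AKM suite selector (OUI + type)."""
    if len(selector) != 4:
        raise ValueError("AKM suite selector must be 4 bytes")
    oui, akm_type = selector[:3], selector[3]
    if oui != IEEE_OUI:
        # Vendor-specific suite: report it, but do not treat it as FT.
        return (f"vendor {oui.hex('-')}:{akm_type}", False)
    name = AKM_NAMES.get(akm_type, f"type {akm_type}")
    return (name, akm_type in FT_AKM_TYPES)


# A client that associated with 00-0F-AC:02 did not negotiate FT.
print(decode_akm(bytes.fromhex("000fac02")))
print(decode_akm(bytes.fromhex("000fac04")))
```

If `is_ft` comes back false on a reassociation to an FT-enabled SSID, that is the fallback-to-full-re-auth case described in the field note.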
Roam latency targets
| Roam type | Typical latency | Voice threshold | What changes |
|---|---|---|---|
| Full re-auth (no k/v/r) | 300–800ms | ❌ Drops call | Baseline |
| 802.11k + scan reduction | 150–300ms | ⚠ Risky | Faster target selection |
| 802.11k + 802.11v | 100–200ms | ⚠ Marginal | Proactive steering |
| 802.11r FT over-the-air | 20–50ms | ✓ Passes | No EAPOL on roam |
| 802.11k + 802.11v + 802.11r | 10–30ms | ✓ Excellent | Full stack |
802.11k - Radio Resource Management (Neighbor Report)
Without 802.11k, a client that decides to roam must scan every channel to find a candidate AP. With 802.11k, the client asks its current AP for a list of nearby APs - their BSSIDs, channels, and signal estimates - and scans only those. Scan time drops from 300–500ms to under 50ms.
How it works - frame sequence
1. Client sends Neighbor Report Request
Action frame: Category=5 (Radio Measurement), Action=4. Client asks "give me a list of APs I can roam to." Triggered when RSSI drops below the client's internal threshold or when an 802.11v BTM Request arrives.
Filter: wlan_mgt.fixed.action_code == 4
2. AP responds with Neighbor Report Response
Action frame: Category=5, Action=5. Contains a list of Neighbor Report IEs - each with BSSID, BSSID Information (capabilities), Operating Class, Channel Number, and PHY Type. A well-configured AP returns 3–6 candidates.
Filter: wlan_mgt.fixed.action_code == 5
3. Client scans targeted channels only
Instead of scanning 36+ channels, the client probes only the channels listed in the Neighbor Report. Time to find a target drops from 300–500ms to under 50ms in most deployments.
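To make the Neighbor Report contents concrete, here is a rough sketch of a parser for the element list in a Neighbor Report Response. It assumes the caller passes only the element bytes that follow the fixed fields of the action frame, and that each Neighbor Report element (element ID 52) starts with the five fixed fields listed above; the function name and dict keys are hypothetical.

```python
import struct

NEIGHBOR_REPORT_EID = 52  # element ID of the Neighbor Report element
MIN_BODY_LEN = 13         # BSSID(6) + BSSID Info(4) + OpClass(1) + Channel(1) + PHY(1)


def parse_neighbor_reports(ies: bytes) -> list[dict]:
    """Walk a TLV element list and return one dict per Neighbor Report element."""
    neighbors = []
    offset = 0
    while offset + 2 <= len(ies):
        eid, length = ies[offset], ies[offset + 1]
        body = ies[offset + 2 : offset + 2 + length]
        offset += 2 + length
        if eid != NEIGHBOR_REPORT_EID or len(body) < MIN_BODY_LEN:
            continue  # skip other elements and truncated reports
        bssid = ":".join(f"{b:02x}" for b in body[0:6])
        bssid_info, op_class, channel, phy_type = struct.unpack_from("<IBBB", body, 6)
        neighbors.append({
            "bssid": bssid,
            "bssid_info": bssid_info,
            "operating_class": op_class,
            "channel": channel,
            "phy_type": phy_type,
        })
    return neighbors


# One hand-built element: BSSID aa:bb:cc:00:11:22 on channel 36.
sample = (bytes([52, 13]) + bytes.fromhex("aabbcc001122")
          + (3).to_bytes(4, "little") + bytes([115, 36, 9]))
reports = parse_neighbor_reports(sample)
print(len(reports), sorted({r["channel"] for r in reports}))
```

An empty result from a response frame is the "empty neighbor list" failure: the AP answered, but gave the client nothing to scan.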
| What you see | What it means | Filter |
|---|---|---|
| Client requesting neighbor list | 802.11k is working | wlan_mgt.fixed.action_code == 4 |
| Action frame cat=5, code=5 with no IEs | AP returned empty neighbor list - 802.11k config broken | wlan_mgt.fixed.action_code == 5 |
| No action frames before roam, many probe requests | 802.11k not enabled - client scanning blind | wlan.fc.type_subtype == 0x04 |
| RRM Capabilities IE (id=70) in AssocReq | Client supports 802.11k measurement requests | wlan_mgt.rrm.caps |
Field note: The most common 802.11k failure is an empty Neighbor Report Response. The AP sends the response but the IE list is empty - the controller's neighbor list is misconfigured or the RF group isn't built correctly. In PCAP: you see action_code=5 from the AP but no Neighbor Report IE content. Client then falls back to full channel scan. Filter action_code == 5, expand the IE list in Wireshark, and count the neighbor entries.
802.11v - BSS Transition Management
802.11k lets the client ask. 802.11v lets the AP tell. The AP sends a BSS Transition Management Request nudging the client to roam to a better AP. The client can accept or reject. This is how enterprise systems solve the sticky client problem - clients that hold a weak signal instead of roaming.
BTM Request - frame sequence
1. AP sends BTM Request
Action frame: Category=10 (WNM), Action=7. Contains: Disassociation Imminent flag (forces client to act), BSS Termination Duration, Session Information URL, and optionally a Candidate List with target BSSIDs ranked by preference.
Filter: wlan_mgt.fixed.action_code == 7
2. Client sends BTM Response
Action frame: Category=10, Action=8. Status code 0 = client accepted and will roam. Status code 1 = client rejected the request without giving a reason. Status 2–8 = client rejected for a specific reason (no suitable candidate, leaving ESS, etc.).
Filter: wlan_mgt.fixed.action_code == 8
3. Client roams (or ignores)
If status=0: client roams within the Disassociation Timer window. If status=1 (reject): AP must decide whether to force disassociate or tolerate the client. Most enterprise systems log BTM Reject and escalate - some force deauth after N rejections.
Filter: wlan.fc.type_subtype == 0x02 (Reassoc after BTM accept)
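The flags in a BTM Request live in a one-byte Request Mode field. The sketch below decodes it, assuming the usual bit layout (bit 0 Preferred Candidate List Included through bit 4 ESS Disassociation Imminent); the function and key names are hypothetical, and the bit positions are worth confirming against your Wireshark dissection.

```python
# Bit position -> flag name in the BTM Request "Request Mode" byte.
REQUEST_MODE_BITS = {
    0: "preferred_candidate_list_included",
    1: "abridged",
    2: "disassociation_imminent",
    3: "bss_termination_included",
    4: "ess_disassociation_imminent",
}


def decode_request_mode(mode: int) -> dict[str, bool]:
    """Expand the Request Mode byte into named boolean flags."""
    return {name: bool(mode & (1 << bit)) for bit, name in REQUEST_MODE_BITS.items()}


# 0x05 = candidate list included + Disassociation Imminent set.
flags = decode_request_mode(0x05)
print(flags["disassociation_imminent"], flags["preferred_candidate_list_included"])
```

A request with `disassociation_imminent` false is only a suggestion; the client is free to stay put.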
BTM status codes - what the client is telling you
| Status code | Meaning | Action |
|---|---|---|
| 0 | Accept - will transition | Normal. Watch for Reassoc to listed candidate. |
| 1 | Reject - unspecified reason | Client refused. Check if Disassociation Imminent was set - AP should force it. |
| 3 | Reject - insufficient available capacity from all candidates | Client judges every candidate too loaded to move to. Rare. |
| 4 | Reject - BSS Termination undesired | Client is actively using the BSS and won't leave. |
| 7 | Reject - no suitable candidate | Candidate list was empty or all candidates are weaker. Fix neighbor list. |
| 8 | Reject - leaving ESS | Client is disconnecting anyway. Informational. |
Field note: Status code 1 (unspecified reject) is the most common BTM failure. The client says no without a reason. In most cases this is an Android or Windows client with aggressive power management that ignores BTM. Verify Disassociation Imminent bit is set in the Request - without it, clients are free to ignore. With it set, the client has a countdown timer and must act or be disassociated.
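To see which clients habitually refuse steering, you can tally BTM Responses per client. This is a small sketch, assuming you have exported (client MAC, status code) pairs from the BTM Response frames; the function name and the default limit of 3 are hypothetical choices, not values from any controller.

```python
from collections import Counter


def clients_over_reject_limit(responses, limit=3):
    """Return client MACs with at least `limit` non-zero BTM Response status codes.

    responses: iterable of (client_mac, status_code) tuples.
    """
    rejects = Counter(mac for mac, status in responses if status != 0)
    return sorted(mac for mac, count in rejects.items() if count >= limit)


observed = [
    ("aa:aa:aa:00:00:01", 1),
    ("aa:aa:aa:00:00:01", 1),
    ("aa:aa:aa:00:00:02", 0),
    ("aa:aa:aa:00:00:01", 1),
]
print(clients_over_reject_limit(observed))
```

Clients on this list are the ones to check for a missing Disassociation Imminent bit or a driver that ignores BTM.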
802.11r - Fast BSS Transition (FT)
Without 802.11r, every roam to a new AP repeats the full authentication (802.1X/EAP where it is in use) followed by the 4-way EAPOL handshake - 300–800ms. With 802.11r, the key hierarchy is pre-distributed so the client can roam with a single Auth+Reassoc exchange. The formal name is Fast BSS Transition; "802.11r" was the amendment, which was absorbed into the main standard in 802.11-2012.
Key hierarchy - PMK-R0 and PMK-R1
The MSK (Master Session Key) from 802.1X or PSK derivation creates the PMK-R0 at the R0KH (typically the controller or first AP). The PMK-R0 holder distributes PMK-R1 to target APs (R1KHs) before the client arrives. When the client roams, the target AP already has the key - no RADIUS round-trip needed.
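The two-level hierarchy can be sketched in code. This is written from my reading of the FT key derivation for the SHA-256 based FT AKMs and has not been checked against published test vectors, so treat it as an illustration of the structure rather than a reference implementation. All function names are hypothetical; `xxkey` stands for the PSK or the relevant half of the MSK.

```python
import hashlib
import hmac


def kdf_sha256(key: bytes, label: bytes, context: bytes, bits: int) -> bytes:
    """HMAC-SHA-256 KDF: iterate a 16-bit counter until `bits` bits are produced."""
    out = b""
    counter = 1
    while len(out) * 8 < bits:
        msg = counter.to_bytes(2, "little") + label + context + bits.to_bytes(2, "little")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return out[: bits // 8]


def derive_pmk_r0(xxkey, ssid, mdid, r0kh_id, s0kh_id):
    """Top of the hierarchy: bound to SSID, Mobility Domain, R0KH and client MAC."""
    context = bytes([len(ssid)]) + ssid + mdid + bytes([len(r0kh_id)]) + r0kh_id + s0kh_id
    key_data = kdf_sha256(xxkey, b"FT-R0", context, 384)
    pmk_r0, salt = key_data[:32], key_data[32:48]
    pmk_r0_name = hashlib.sha256(b"FT-R0N" + salt).digest()[:16]
    return pmk_r0, pmk_r0_name


def derive_pmk_r1(pmk_r0, pmk_r0_name, r1kh_id, s1kh_id):
    """Per-AP key: same PMK-R0, different R1KH-ID gives a different PMK-R1."""
    pmk_r1 = kdf_sha256(pmk_r0, b"FT-R1", r1kh_id + s1kh_id, 256)
    pmk_r1_name = hashlib.sha256(b"FT-R1N" + pmk_r0_name + r1kh_id + s1kh_id).digest()[:16]
    return pmk_r1, pmk_r1_name


client = bytes.fromhex("020000000001")
r0, r0_name = derive_pmk_r0(bytes(32), b"corp", bytes.fromhex("a1b2"), b"r0kh.example", client)
ap1, _ = derive_pmk_r1(r0, r0_name, bytes.fromhex("0a0000000001"), client)
ap2, _ = derive_pmk_r1(r0, r0_name, bytes.fromhex("0a0000000002"), client)
print(len(r0), len(ap1), ap1 != ap2)
```

The point to take away is that each target AP holds its own PMK-R1, all derived from one PMK-R0, which is why an AP outside the distribution path has nothing to validate the client against.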
FT over-the-air vs FT over-the-DS
| Method | Frame path | PCAP identification | When to use |
|---|---|---|---|
| FT over-the-air | Client ↔ Target AP directly via 802.11 | FT Auth Request + FT Auth Response between client and target AP BSSID. No frames to current AP. | Default. Most deployments. Lower latency. |
| FT over-the-DS | Client → Current AP → DS → Target AP | FT Action frames through current AP. Current AP relays to target. Two extra hops visible. | When client cannot hear target AP directly. Tunnelled through wired DS. |
FT frame sequence - over-the-air
1. FT Authentication Request (to target AP)
Auth frame with Algorithm=2 (FT). Contains RSNE with FT AKM and the client's PMKR0Name in the PMKID list, MDE (Mobility Domain IE, id=54), and FTE (Fast BSS Transition IE, id=55) with SNonce.
2. FT Authentication Response (from target AP)
Auth response with Status=0 (success). Contains FTE with the AP's ANonce alongside the client's SNonce, so both sides can derive the PTK. If Status=53 here - "Invalid PMKID" - the target AP has no PMK-R1 for this client.
Filter: wlan_mgt.fixed.status_code == 53 → FT failure
3. FT Reassociation Request
Reassoc to target AP. Contains RSNE with PMKR1Name, MDE, FTE with MIC (Message Integrity Code). No EAPOL handshake follows - this IS the handshake. The PTK is derived from the FT exchange, not EAPOL.
Filter: wlan.fc.type_subtype == 0x02
4. FT Reassociation Response - roam complete
Status=0 = roam complete. The FTE carries the wrapped GTK. No EAPOL M1-M4 follows. Total exchange: 4 frames, 10-30ms. Compare to full re-auth: 4 EAPOL frames + RADIUS round-trip = 300-800ms.
Filter: wlan.fc.type_subtype == 0x03
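The four-frame exchange above can be timed directly from capture timestamps. A minimal sketch, assuming you have already labelled the frames of one roam event; the `kind` strings and the function name are hypothetical labels, not Wireshark field names.

```python
FT_SEQUENCE = ("ft_auth_req", "ft_auth_resp", "reassoc_req", "reassoc_resp")


def ft_roam_summary(frames):
    """Summarise one roam from (timestamp_seconds, kind) tuples in capture order."""
    first_seen = {}
    for ts, kind in frames:
        first_seen.setdefault(kind, ts)  # keep the first occurrence of each kind
    missing = [kind for kind in FT_SEQUENCE if kind not in first_seen]
    if missing:
        return {"complete": False, "missing": missing}
    end = first_seen["reassoc_resp"]
    duration_ms = (end - first_seen["ft_auth_req"]) * 1000
    # EAPOL-Key frames after the Reassoc Response mean FT was not actually used.
    eapol_after = any(kind == "eapol_key" and ts > end for ts, kind in frames)
    return {
        "complete": True,
        "duration_ms": round(duration_ms, 1),
        "eapol_after_reassoc": eapol_after,
    }


roam = [(10.000, "ft_auth_req"), (10.004, "ft_auth_resp"),
        (10.010, "reassoc_req"), (10.018, "reassoc_resp")]
print(ft_roam_summary(roam))
```

A complete sequence with `eapol_after_reassoc` set is worth a closer look: the frames looked like FT, but a handshake still ran afterwards.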
Status code 53 - the most misdiagnosed roaming failure
Status 53 = "Invalid PMKID" in the FT Auth Response. The target AP could not match the key name the client presented to a PMK-R1 - meaning the AP does not have the pre-distributed key and could not fetch it. Root causes: R1KH distribution is broken between APs, APs are in different Mobility Domains, or the client is attempting FT to an AP in a different FT domain. Check wlan_mgt.fixed.status_code == 53 in the FT Auth Response frame.
Field note: I see status 53 frequently when a customer upgrades APs in phases. Half the APs are in the new FT domain, half are legacy. The client successfully FTs between new APs, then hits a legacy AP and gets status 53. The PCAP is the only way to identify which specific AP is rejecting - the controller logs often show "roam failed" without the status code detail.
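Pinning the failure on specific APs is a grouping exercise. The sketch below assumes you have exported (BSSID, status code) pairs from the FT Authentication Responses in the capture; the function name is hypothetical.

```python
from collections import defaultdict

INVALID_PMKID = 53


def ft_reject_ratio(auth_responses):
    """Return {bssid: fraction of FT Auth Responses with status 53}."""
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for bssid, status in auth_responses:
        totals[bssid] += 1
        if status == INVALID_PMKID:
            rejected[bssid] += 1
    return {bssid: rejected[bssid] / totals[bssid] for bssid in totals}


responses = [
    ("00:11:22:00:00:0a", 0),
    ("00:11:22:00:00:0a", 0),
    ("00:11:22:00:00:0b", 53),
    ("00:11:22:00:00:0b", 53),
]
ratios = ft_reject_ratio(responses)
print(sorted(bssid for bssid, r in ratios.items() if r == 1.0))
```

An AP that rejects every FT attempt is a configuration or domain mismatch; an AP that rejects only some points more toward intermittent key distribution.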
Sticky Client - Diagnosis and Fix
A sticky client holds on to a weak AP signal instead of roaming to a stronger one. The client's RSSI drops below -75 dBm - sometimes below -85 dBm - while a stronger AP is clearly available in the same capture. The client either ignores BTM Requests or never receives them.
PCAP diagnostic sequence
1. Identify the client
Filter by client MAC. Look at radiotap.dbm_antsignal over time. If RSSI stays below -75 dBm for more than 5 seconds while the client is actively transmitting, you have a sticky client.
Filter: wlan.addr == <client_MAC>
2. Check for BTM Requests
Did the AP send a BTM Request to this client? If yes - did the client respond? A missing BTM Response means the client dropped or ignored the frame. A BTM Response with status=1 means the client rejected the nudge.
Filter: wlan_mgt.fixed.action_code == 7
3. Check retry rate at low RSSI
Filter retries for this client at the RSSI where it is sticky. High retry rate + low RSSI = client is degrading network performance for everyone else on the AP. Quantify the impact before escalating to the vendor.
Filter: wlan.fc.retry == 1 && wlan.addr == <client_MAC>
4. Check probe behavior
Is the client sending probe requests? If not - it is not even scanning for alternatives. If yes - is it probing on multiple channels or only the current channel? A client probing only channel 6 while stuck on channel 6 is fully stuck.
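The first diagnostic step above (RSSI below -75 dBm for more than 5 seconds) is easy to automate once the signal readings are exported. A minimal sketch, assuming a time-sorted list of (timestamp, RSSI) samples taken from frames the client transmitted; the function name is hypothetical and the defaults simply mirror the thresholds in the text.

```python
def sticky_intervals(samples, threshold_dbm=-75, min_duration_s=5.0):
    """Return (start, end) spans where RSSI stayed below the threshold too long.

    samples: list of (timestamp_seconds, rssi_dbm), sorted by timestamp.
    """
    intervals = []
    start = None
    last = None
    for ts, rssi in samples:
        if rssi < threshold_dbm:
            if start is None:
                start = ts      # a weak-signal run begins
            last = ts
        else:
            if start is not None and last - start > min_duration_s:
                intervals.append((start, last))
            start = None        # signal recovered, reset the run
    if start is not None and last - start > min_duration_s:
        intervals.append((start, last))
    return intervals


# Eleven seconds at -80 dBm: one sticky interval.
print(sticky_intervals([(t, -80) for t in range(11)]))
```

Each returned span is a window worth correlating with BTM Requests and retry counts in steps 2 and 3.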
Client frames drop below minimum rate threshold. Client disassociates.
Force deauth: AP sends deauth when RSSI below threshold. Last resort - causes brief disconnection. In the PCAP: Deauth from AP BSSID to client MAC. Client re-associates to stronger AP.
Field note: iOS clients are the best roaming clients on the market - they respond to BTM, support 802.11k, and roam proactively. Android is inconsistent - some OEMs disable BTM response entirely. Windows is in the middle. If a sticky client is an Android device, check the OEM: Samsung and Pixel behave differently from each other on the same SSID. The PCAP will show you exactly what the device is doing - the vendor's "Wi-Fi optimized" claim means nothing if the BTM Response shows status=1.
PMKID Caching and OKC
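PMKID caching (also called PMK caching or fast reconnect) allows a client to skip the full 802.1X/EAP authentication when returning to a previously visited AP; the 4-way handshake still runs, but from the cached PMK. OKC (Opportunistic Key Caching) extends this across all APs in the same ESS by sharing the PMK from the original authentication. Both are visible in a PCAP: the PMKID appears in the (Re)Association Request and no EAP exchange precedes the EAPOL-Key frames.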
PMKID caching vs OKC vs 802.11r
| Method | Scope | How to identify in PCAP | Re-auth needed? |
|---|---|---|---|
| PMKID caching | Same AP only (return visit) | PMKID in AssocReq RSNE matches previous session. No EAP exchange after AssocResp, only EAPOL-Key M1-M4. | No - PTK derived from cached PMK |
| OKC | Any AP in same ESS | PMKID in Reassoc RSNE. AKM type 1 (802.1X) but no new EAP exchange - M1-M4 run from the shared PMK. Fast roam without 802.11r. | No - controller distributes PMK |
| 802.11r FT | Any AP in same FT Mobility Domain | AKM type 3 or 4. FTE in Auth frames. Auth Algorithm=2. No EAPOL post-Reassoc. | No - PTK from FT exchange |
| Full re-auth | N/A - fallback | AKM type 1 or 2. Full EAP exchange (802.1X) then EAPOL M1-M4 after AssocResp. RADIUS visible on wired capture. | Yes - 300-800ms |
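The comparison can be condensed into a small decision function. This is a rough sketch keyed on whether a full EAP exchange (EAP Identity onward) follows the association, since that is the step key caching avoids; the parameter names are hypothetical flags you would set by hand after reading the capture.

```python
FT_AKM_TYPES = {3, 4}
FT_AUTH_ALGORITHM = 2


def classify_roam(akm_type, auth_alg, pmkid_in_request, eap_exchange_seen,
                  bssid_seen_before=False):
    """Label one roam event from fields read out of the capture."""
    if auth_alg == FT_AUTH_ALGORITHM and akm_type in FT_AKM_TYPES:
        return "802.11r FT"
    if pmkid_in_request and not eap_exchange_seen:
        # Same mechanism on the wire; scope decides the name.
        return "PMKID caching" if bssid_seen_before else "OKC"
    return "full re-auth"


print(classify_roam(akm_type=3, auth_alg=2, pmkid_in_request=True, eap_exchange_seen=False))
print(classify_roam(akm_type=1, auth_alg=0, pmkid_in_request=True, eap_exchange_seen=False))
print(classify_roam(akm_type=1, auth_alg=0, pmkid_in_request=True, eap_exchange_seen=True))
```

The last case, a PMKID offered but a full EAP exchange anyway, is the cache-miss fallback: the AP did not recognise the PMKID.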
How to read PMKID in a PCAP
The PMKID is a 16-byte value in the RSN IE of the AssocReq or Reassoc frame. It is derived from: PMKID = HMAC-SHA1-128(PMK, "PMK Name" || AA || SPA) - where AA is the AP MAC and SPA is the client MAC. If the PMKID in the frame matches what the AP has cached, the client skips the EAP exchange and goes straight to the 4-way handshake. If it does not match - or if PMKID is absent - the AP starts a full authentication followed by a new 4-way handshake.
# Filter AssocReq frames containing PMKID
wlan_mgt.rsn.pmkid
# Then expand: RSN Information → PMKID List → PMKID[0]
# Compare value across visits to the same BSSID - same PMKID = cached PMK reused
# (the PMKID differs between BSSIDs because AA is part of the hash)
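The PMKID formula above maps directly onto the Python standard library. A minimal sketch, assuming the HMAC-SHA1 variant given in the formula; the PMK and MAC values below are made-up test inputs.

```python
import hashlib
import hmac


def compute_pmkid(pmk: bytes, aa: bytes, spa: bytes) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AA || SPA)."""
    return hmac.new(pmk, b"PMK Name" + aa + spa, hashlib.sha1).digest()[:16]


pmk = bytes(32)                              # placeholder PMK, not a real key
client = bytes.fromhex("020000000001")       # SPA: client MAC
ap_a = bytes.fromhex("0a0000000001")         # AA: first AP's BSSID
ap_b = bytes.fromhex("0a0000000002")         # AA: second AP's BSSID

print(compute_pmkid(pmk, ap_a, client).hex())
print(compute_pmkid(pmk, ap_a, client) == compute_pmkid(pmk, ap_b, client))
```

Because AA is an input, the same PMK yields a different PMKID at every BSSID. That is how OKC works: the client computes the PMKID it expects the new AP to hold, without ever having associated there.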
Field note: OKC is the Aruba-pioneered approach and is more widely compatible than 802.11r because it does not require FT AKM negotiation. The client keeps its standard 802.1X AKM (type 1) and carries a PMKID in the Reassoc request. In a PCAP: you see Reassoc with PMKID, no EAP exchange after the Reassoc Response, just EAPOL-Key M1-M4, and the client is back on the network. If you see Reassoc with PMKID followed by an EAP Identity Request and a full EAP exchange - the controller did not have the PMK cached, OKC failed, and the client fell back to full re-auth.