Designing low-latency secure transit validators.
Tap-to-decision under 300 milliseconds. Mutual authentication. Transaction MAC. Audit write. Gate actuation. The same gate makes that decision tens of thousands of times a day, in temperature, vibration, humidity, occasional sabotage, and decade-long unattended deployment. Here is how the architecture actually fits in the budget — and the engineering choices that decide whether your deployment ships or stalls.
The budget
Industry practice puts tap-to-decision — from the moment the rider’s card enters the field to the moment the gate latch decides to open or stay closed — at 300 milliseconds or less. High-throughput metros (Hong Kong, Tokyo, London) target 250 ms; some run at 200 ms. Buses are more forgiving (passengers expect a brief pause at the door); inspectors’ handhelds are the most forgiving (the inspector is patient). For the gate-class case, 300 ms is the budget you have to work inside.
That 300 ms covers everything. RF activation. Anti-collision. SELECT (file or AID). EV2 mutual authentication. Application read. Fare evaluation. Stored-value debit. Transaction MAC. Audit write to NV storage. Receipt buffer. Gate actuation signal. Display update. Sound. The card has to land on the antenna and not be lifted before all of this completes.
Where the time actually goes
From shipped deployments, here is roughly how the budget breaks down on a well-tuned EV2 validator with a SAM:
- ~25 ms — RF activation and ISO 14443-3 anti-collision.
- ~30 ms — SELECT APPLICATION + first APDU exchange.
- ~80 ms — EV2 mutual authentication. Two card round-trips, two SAM round-trips, derivation work in both.
- ~80 ms — fare logic: read profile / season / value, evaluate against fare table, debit if needed.
- ~50 ms — commit: TMAC, audit write, receipt buffer.
- ~35 ms — gate actuation, UI feedback, sound.
Total: about 300 ms. There is no slack. Every layer of the system has to behave or you blow the budget on the worst tap of the day, which is the tap that makes the news.
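One way to keep the breakdown above from drifting is to hold it as data in the test tooling and compare measured stage timings against it. The following is a minimal sketch under that assumption; the stage keys and the `over_budget` helper are hypothetical names, and only the millisecond figures come from the list above.

```python
# Per-stage budgets in milliseconds, copied from the breakdown above.
# The dictionary keys are illustrative names, not part of any standard.
STAGE_BUDGET_MS = {
    "rf_activation": 25,
    "select": 30,
    "mutual_auth": 80,
    "fare_logic": 80,
    "commit": 50,
    "actuation": 35,
}
TOTAL_BUDGET_MS = 300


def over_budget(measured_ms):
    """Return {stage: overrun_ms} for every stage that exceeded its budget."""
    return {
        stage: measured_ms[stage] - limit
        for stage, limit in STAGE_BUDGET_MS.items()
        if measured_ms.get(stage, 0) > limit
    }


def total_ms(measured_ms):
    """Sum the measured stage times for one tap."""
    return sum(measured_ms.values())
```

Feeding it the timings of the slowest tap of a test run, rather than the average, matches the point above: the worst tap is the one that matters.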
The RF stage
The first 25 ms is the RF link, and it is the stage whose problems are most often misdiagnosed. The fix is not in firmware; it is in antenna design and tuning. Mistakes we have seen ship and then need to be redone:
- Antenna tuned with no enclosure, then dropped into the metal-rich validator housing where the resonance shifts.
- Antenna shared with a cellular modem, with PCB layout that bleeds 13.56 MHz harmonics into LTE bands.
- EMC certification done with the validator on a bench; in the field, mounted on a metal turnstile post, the field strength at the tap surface is half what was designed for.
- Anti-collision parameters left at vendor defaults; when more than one card is in the field at once, the validator spends 30+ ms before it picks one.
The fix in all cases is to tune the antenna in the actual deployed housing, on the actual mounting structure, with the actual neighbouring radios powered on. Bench-tuning is a starting point, not a finish line.
The authentication stage
EV2 mutual authentication is two card round-trips and two SAM round-trips, all happening serially because each step depends on the last. ~80 ms is achievable. Where teams blow this budget:
- USB or UART contention with the SAM. The SAM lives on a serial channel. If you are sharing that channel with a logging service, a lifecycle daemon, or a UI compositor, the channel is not always available when the auth flow needs it. Dedicate the SAM channel.
- Opening the SAM session per tap rather than per boot. Some implementations open and close a host secure channel to the SAM on every tap. Don’t. Open it once at boot, hold it for the validator’s service shift, recover gracefully on errors.
- Synchronous logging in the auth path. Writing to a file system between the card-side rounds adds a context-switch and a flush. Buffer logs in RAM; flush on a separate thread.
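The last point, buffering logs in RAM and flushing on a separate thread, can be sketched with the Python standard library's `QueueHandler` and `QueueListener`. This is an illustration of the pattern only, not a claim about how any particular validator firmware does it; the `make_tap_logger` function and the `tap_path` logger name are hypothetical.

```python
import logging
import logging.handlers
import queue


def make_tap_logger(sink_handler):
    """Return (logger, listener).

    The logger only enqueues records in RAM. The listener runs its own
    thread and drains the queue into sink_handler (for example a
    FileHandler), so the tap path never waits on the file system.
    """
    # Unbounded in-RAM buffer for the sketch; a real device would bound it.
    log_queue = queue.Queue()

    logger = logging.getLogger("tap_path")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False  # keep records away from synchronous root handlers
    logger.addHandler(logging.handlers.QueueHandler(log_queue))

    listener = logging.handlers.QueueListener(log_queue, sink_handler)
    listener.start()
    return logger, listener
```

Calling `listener.stop()` at shutdown drains whatever is still queued before the thread exits.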
The fare-logic stage
Fare logic seems like it should be cheap: it’s just a state machine over a few values from the card. It isn’t cheap when the fare-rule engine is interpreted, when it pulls from disk for fare-table lookups, or when it goes through an OS layer that doesn’t have a hard real-time path. We have seen 200 ms on a fresh box drop to 80 ms simply by precomputing the fare table at depot-sync time and pinning it in RAM. That is 120 ms recovered, 40% of the whole tap budget.
The implementation rule we settle on: everything the validator needs to make a fare decision must be in RAM at tap time. Fare tables, time-of-day rules, station-pair tariff matrices, blacklist, transfer windows. Disk reads in the tap path are a bug.
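A minimal sketch of that rule, assuming a simple station-pair tariff and a card blacklist: the tables are built once at depot-sync time, and the tap-time decision touches only in-memory dictionaries and sets. All names here (`FareTables`, `build_tables`, `decide`) and the data shapes are hypothetical, and real fare rules (time-of-day, transfer windows) are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FareTables:
    fares: dict           # (origin, destination) -> fare in minor currency units
    blacklist: frozenset  # blocked card identifiers


def build_tables(fare_rows, blacklisted_ids):
    """Run at depot-sync time, off the tap path; any disk or network I/O belongs here."""
    fares = {(origin, dest): fare for origin, dest, fare in fare_rows}
    return FareTables(fares=fares, blacklist=frozenset(blacklisted_ids))


def decide(tables, card_id, origin, destination, balance):
    """Tap-path decision: dictionary and set lookups only, no I/O.

    Returns (decision, new_balance).
    """
    if card_id in tables.blacklist:
        return ("deny", balance)
    fare = tables.fares.get((origin, destination))
    if fare is None or fare > balance:
        return ("deny", balance)
    return ("allow", balance - fare)
```

Making the table object immutable means a depot sync builds a complete replacement rather than editing the tables a tap may be reading.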
The commit stage
The TMAC + audit write is where teams often discover that NV storage is not free. eMMC has wear-levelling pause behaviour; flushing a single record can cost 50 ms by itself if the controller decides this is a good moment to recycle a block. The fix is a tamper-evident ring buffer in NV that’s designed for append-mostly access:
- Pre-allocated. The validator boots having already provisioned the next 24 hours’ worth of audit records.
- Append-only. Each entry written sequentially with a CMAC over its content + previous-entry hash.
- Background flush. A separate thread moves committed records to durable storage; the tap path waits only for the in-RAM append.
- Tamper-evident. The ring is signed at boot; a power-cut-recovery routine confirms the ring is consistent before the validator returns to service.
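This shape costs ~10 ms in the tap path for the audit write, and it is honest about durability: the tap path waits only for the in-RAM append, so a record is not durable until the background flush has moved it to storage.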
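The append-only, chained part of that design can be sketched as follows. Two loud assumptions: the standard library has no AES-CMAC, so HMAC-SHA256 stands in for the CMAC here, and the class covers only the in-RAM append and the consistency check; the background flush thread and the boot-time signature are left out. The `AuditChain` name and its layout are hypothetical.

```python
import hashlib
import hmac


class AuditChain:
    """Fixed-capacity, append-only audit log with a hash chain.

    HMAC-SHA256 is a stand-in for the CMAC described in the text.
    """

    def __init__(self, key, capacity):
        self._key = key
        self._capacity = capacity      # pre-allocated number of record slots
        self._entries = []             # (payload, prev_hash, tag) tuples
        self._prev_hash = bytes(32)    # all-zero hash before the first entry

    def append(self, payload):
        """Tap-path operation: in-RAM append only, no storage I/O."""
        if len(self._entries) >= self._capacity:
            # Refuse rather than overwrite records that were never delivered.
            raise RuntimeError("audit log full")
        tag = hmac.new(self._key, payload + self._prev_hash, hashlib.sha256).digest()
        self._entries.append((payload, self._prev_hash, tag))
        self._prev_hash = hashlib.sha256(payload + self._prev_hash + tag).digest()

    def verify(self):
        """Recovery check: recompute every tag and every chain link."""
        prev = bytes(32)
        for payload, prev_hash, tag in self._entries:
            expected = hmac.new(self._key, payload + prev, hashlib.sha256).digest()
            if prev_hash != prev or not hmac.compare_digest(tag, expected):
                return False
            prev = hashlib.sha256(payload + prev + tag).digest()
        return True
```

Because each tag covers the previous entry's hash, altering or removing an entry in the middle breaks verification of that entry or of the one after it.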
The gate-actuation stage
~35 ms for gate hardware, UI, sound. Where this goes wrong: the actuation signal goes through a microcontroller that’s also handling the LCD or the LED, on a single CPU thread. Use a dedicated I/O microcontroller for actuation. The display can update at human-perceptible latencies; the gate cannot.
What changes when the network arrives
Most of the world’s gates make their decision offline. When the network is intermittently available, the architecture does not change; the network just opens new asynchronous paths:
- Blacklist updates flow in (cards reported lost in the last hour are now flagged at the gate).
- Transaction batches flow out (the receipts buffered locally are uploaded for settlement).
- Configuration updates flow in (new fare tables for next quarter, soft-loaded before activation).
None of these touch the tap path. If the network goes away mid-shift, the validator notices on the next batch heartbeat and continues to operate. Riders never see the difference. This is what separates a good architecture from a fragile one.
Failure modes worth designing against
SAM unresponsive.
Validator must decide: fail-safe (open) or fail-secure (closed). Operator policy. Whichever you pick, log the failure, alarm the depot, and don’t silently degrade.
Blacklist out of date.
Most operators choose to honour the cached blacklist with an indicator that the validator is “stale”. Newly-blacklisted cards still get through until depot sync; the back office reconciles after the fact.
Power lost mid-tap.
Card may have committed; validator may not have. Anti-tearing counters on the card and the audit ring on the validator together flag the inconsistency on next read. Issuer reconciles in batch.
NV storage full.
Validator should refuse to start a fresh tap rather than overwrite undelivered receipts. Operator notices in the next depot sync.
Reader-CPU panic.
Watchdog. Reboot. Re-open SAM session. Resume. Total downtime < 10 seconds.
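Two of the failure modes above reduce to explicit policy in code. The sketch below assumes the operator's choice for an unresponsive SAM is a configuration value and that "don't silently degrade" means every path records a log entry and a depot alarm. The enum, the function names, and the event strings are all hypothetical.

```python
from enum import Enum


class SamFailurePolicy(Enum):
    FAIL_OPEN = "open"      # gate opens when the SAM cannot be reached
    FAIL_CLOSED = "closed"  # gate stays closed


def on_sam_unresponsive(policy, events):
    """Apply the operator's policy. Always log and alarm, whichever way it goes.

    Returns True if the gate should open.
    """
    events.append("log:sam_unresponsive")
    events.append("alarm:depot")
    return policy is SamFailurePolicy.FAIL_OPEN


def can_start_tap(undelivered_records, capacity):
    """NV storage full: refuse a new tap rather than overwrite undelivered receipts."""
    return undelivered_records < capacity
```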
A practical hardware shape
For an unattended gate validator with the budget above, the platform that most cleanly fits is:
- Application processor — ARM Cortex-A class, Linux. Runs the fare-logic state machine, network stack, UI, depot-sync.
- I/O microcontroller — Cortex-M class, bare-metal or RTOS. Drives the gate actuator, the buzzer, the LED. Dedicated, deterministic.
- RF front-end — CLRC663 or comparable, wired to a tuned antenna in the housing.
- SAM — AV2 / AV3 in a SIM-form FRU with a tamper-evident socket.
- NV storage — industrial-grade eMMC or SD with overprovisioning, with the audit ring pre-allocated.
- Backhaul — cellular primary, Wi-Fi failover, wired option for fixed installations. Backhaul down ≠ validator down.
- Power — battery backup sized for graceful shutdown plus 60 seconds of taps under PoE / mains drop.
What this all means in practice
Tap-to-decision under 300 ms is not impossible. It requires that every layer of the system — antenna, RF front-end, SAM channel, fare-rule engine, NV storage, gate actuation — respects its budget. The architecture that makes this credible is the same SAM-protected closed-loop architecture that makes the deployment auditable in the first place. The budget and the security are not in tension; they are co-designed.
Where most deployments stumble is not the cryptography. It is the engineering discipline below the cryptography — the antenna tuning, the SAM channel ownership, the fare-table residency, the audit ring design, the failure-mode policy. Get those right and the security primitives will fit in the budget; get them wrong and no clever crypto choice will save you.