
Deploying transformer models on $400 of edge hardware

The cheapest path to "computer vision in your venue" is a Jetson Orin Nano and an existing RTSP camera. The hard part is getting a transformer model fast enough to make decisions in real time.

What we learned

Quantize aggressively. INT8 with calibration on real customer footage cuts inference latency by ~3x with negligible accuracy loss.
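
To make the calibration step concrete, here is a minimal sketch of symmetric INT8 quantization in plain Python. It is not the pipeline described above: a real deployment would use the inference runtime's own calibrator. The function names and the max-absolute-value calibration rule are assumptions chosen for illustration.

```python
from typing import Iterable, List


def calibrate_scale(calibration_batches: Iterable[List[float]]) -> float:
    """Derive a symmetric INT8 scale from the largest magnitude seen.

    Assumption: max-abs calibration. Production calibrators often use
    percentile or entropy-based rules instead.
    """
    max_abs = 0.0
    for batch in calibration_batches:
        for value in batch:
            max_abs = max(max_abs, abs(value))
    # Fall back to 1.0 so an all-zero calibration set does not divide by zero.
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values: List[float], scale: float) -> List[int]:
    """Map floats to integers in [-127, 127], clamping out-of-range values."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 values back to approximate floats."""
    return [q * scale for q in quantized]
```

The scale is only as good as the data it was computed from, which is why calibrating on footage that resembles what the camera will actually see matters: values outside the calibrated range are clamped.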

Batch where you can. If the same Orin watches three or four streams, micro-batching frames across cameras into a single forward pass gives roughly another 1.5x.
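
One way to picture micro-batching across cameras is sketched below, assuming each camera feeds its own in-memory queue. `micro_batch_step`, the queue layout, and `run_model` are hypothetical names, and dropping stale frames is a design choice of this sketch rather than something stated above.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def micro_batch_step(
    queues: Dict[str, Deque],
    run_model: Callable[[List], List],
    max_batch: int = 4,
) -> Dict[str, object]:
    """Run one batched inference over the newest frame from each camera."""
    camera_ids: List[str] = []
    frames: List = []
    for camera_id, queue in queues.items():
        if len(frames) == max_batch:
            break
        if not queue:
            continue  # This camera has no new frame; skip it this round.
        newest = queue.pop()
        queue.clear()  # Drop older frames so decisions stay real-time.
        camera_ids.append(camera_id)
        frames.append(newest)
    if not frames:
        return {}
    # One model call for all cameras; outputs come back in input order.
    outputs = run_model(frames)
    return dict(zip(camera_ids, outputs))
```

Calling this in a loop means the model runs once per round regardless of how many cameras had a frame ready, which is where the batching gain would come from.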

Don't underestimate the I/O. Decoding 1080p at 30 fps off RTSP in software eats more CPU than the model itself. Use NVDEC, the Jetson's hardware video decoder, to take that work off the CPU.
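
On Jetson, hardware decode is commonly reached through GStreamer's `nvv4l2decoder` element. The sketch below only builds a pipeline string. It assumes an H.264 stream, and the helper name, default latency, and example URL are illustrative rather than taken from the setup above.

```python
def build_nvdec_pipeline(rtsp_url: str, latency_ms: int = 200) -> str:
    """Build a GStreamer pipeline that decodes an RTSP stream in hardware.

    Assumption: the camera sends H.264. An H.265 camera would need
    rtph265depay and h265parse instead.
    """
    elements = [
        f"rtspsrc location={rtsp_url} latency={latency_ms}",
        "rtph264depay",             # Unwrap H.264 from RTP packets.
        "h264parse",
        "nvv4l2decoder",            # Hardware decode instead of CPU decode.
        "nvvidconv",                # Hardware colour conversion and copy-out.
        "video/x-raw,format=BGRx",
        "videoconvert",
        "video/x-raw,format=BGR",
        # Keep only the latest frame so a slow consumer never falls behind.
        "appsink drop=true max-buffers=1 sync=false",
    ]
    return " ! ".join(elements)


pipeline = build_nvdec_pipeline("rtsp://camera.local/stream")
```

If OpenCV is built with GStreamer support, a string like this can be passed to `cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)`. Whether a given OpenCV build includes that support is worth checking first.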
