SigLIP on Jetson for Real-Time Edge Inference

Gabriel Noya, Sun Mar 01 2026 • ml

An 800M-parameter SigLIP vision encoder needed to run in real time on a Jetson Orin Nano for an on-device security application. Out of the box it was nowhere near fast enough. The final configuration reached 63 ms per frame (15.8 FPS) through three changes: INT8 quantization with NVIDIA ModelOpt, TensorRT engine compilation, and unlocking the Orin's Super power mode via nvpmodel. Along the way I debugged a kernel regression in the NvMap allocator (CVE-2025-33182) that capped pinned memory allocations, and resolved it with a targeted kernel module swap rather than a full downgrade. The project was a good reminder that edge ML is half model optimization, half systems work.
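For readers who have not seen what INT8 quantization does to a tensor, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a toy illustration of the general idea only: it is not ModelOpt's API or the calibration procedure used in this project, and the function names and sample weights are hypothetical.

```python
# Toy symmetric per-tensor INT8 quantization.
# Assumption: illustrative only; not the ModelOpt API, and the
# names and sample values below are made up for this example.

def int8_scale(values):
    """Return a scale that maps the largest magnitude onto 127."""
    max_abs = max(abs(v) for v in values)
    return max_abs / 127.0 if max_abs > 0 else 1.0

def quantize(values, scale):
    """Round each value to the nearest step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    """Map the integer codes back to approximate float values."""
    return [q * scale for q in q_values]

weights = [0.02, -1.27, 0.5, 0.64]   # hypothetical weights
scale = int8_scale(weights)
q = quantize(weights, scale)
restored = dequantize(q, scale)

# Worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The point of the sketch is that each weight is stored as an 8-bit integer plus one shared scale, at the cost of a rounding error of at most half a step. Production tooling typically chooses scales from calibration data and may quantize per channel rather than per tensor.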