Raspberry Pi 5 + AI HAT+: Build a React Native Companion App for Edge AI Devices


reactnative
2026-01-24 12:00:00
10 min read

Build a secure React Native companion app for Pi 5 + AI HAT+ 2: telemetry, control, WebSocket streams, and remote access best practices for 2026.

Get real-time control of on-device generative AI without long release cycles or fragile integrations

Shipping production-grade mobile tools to monitor and control generative models running on a Raspberry Pi 5 with the AI HAT+ 2 is now realistic in 2026. If you’re battling slow iteration cycles, flaky third-party components, and uncertain remote access for edge devices, this guide gives a practical blueprint: an end-to-end React Native companion app that securely monitors, controls, and visualizes on-device AI via WebSocket and modern remote tunneling approaches.

Executive summary (most important first)

What you’ll build: a production-ready React Native companion that connects to a Pi 5 + AI HAT+ 2, receives inference telemetry over a secure WebSocket (wss), sends control commands, and displays real-time visualizations and logs. The app supports secure remote connections (Cloudflare Tunnel / Tailscale / WebRTC signaling), OTA model updates, and CI-driven builds and tests.

Why now (2026): Hardware and runtimes matured in late 2025—AI HAT+ 2 unlocked quantized LLMs at the edge on Pi 5 using optimized backends (ggml/gguf variants, on-device NN acceleration). Security and tunnel tooling (Cloudflare Zero Trust, Tailscale + WireGuard, modern WebRTC data channels) are mainstream for remote device access.

Takeaways:

  • Use secure WebSocket (wss) with JWT + optional mTLS for device authentication.
  • Prefer Cloudflare Tunnel or Tailscale for reliable remote access; fall back to signed WebRTC data channels for P2P.
  • Implement telemetry, command, and model-update channels separately to simplify QoS and retries.
  • Integrate tests and CI: unit, E2E (Detox), and OTA release pipelines (EAS/AppCenter/CodePush).

System architecture (high level)

Design modular services between the Pi and the mobile app. Keep the device-side lightweight and resilient.

  1. Pi 5 + AI HAT+ 2: Local runtime (e.g., ggml/llama.cpp forks adapted for Pi), a small Node.js or Python gateway exposing a secure WebSocket and a REST admin endpoint.
  2. Remote Access: Cloudflare Tunnel or Tailscale for secure HTTP/WSS reachability. Optional WebRTC for P2P low-latency channels.
  3. Companion App: React Native (TypeScript), WebSocket client, state management (Zustand/Redux), and visualization (react-native-svg / react-native-reanimated).
  4. CI/CD: GitHub Actions builds, unit/E2E tests (Jest + Detox), and OTA distribution (Expo EAS or Microsoft App Center / CodePush as needed).

Device-side: minimal, robust server on the Pi

Keep the Pi agent as a small, well-tested process that exposes two things: a control & telemetry WebSocket server and an authenticated REST API for model provisioning and logs.

Why Node.js or Python?

Both have robust WebSocket ecosystems. For architectures where inference is Python-native, colocate a lightweight Python FastAPI process. For pure JS stacks or when using wasm/ggml bindings, Node.js is straightforward.

Example: Node.js WebSocket server (ws)

// pi-agent/server.js
const https = require('https');
const fs = require('fs');
const WebSocket = require('ws');

const server = https.createServer({
  key: fs.readFileSync('/etc/ssl/private/host.key'),
  cert: fs.readFileSync('/etc/ssl/certs/host.crt')
});

const wss = new WebSocket.Server({ server });

wss.on('connection', (ws, req) => {
  // Reject unauthenticated clients; replace with real JWT verification (e.g. jsonwebtoken) or mTLS.
  const token = new URL(req.url, 'https://localhost').searchParams.get('token');
  if (!token) {
    ws.close(4001, 'missing auth token');
    return;
  }

  // Accepts telemetry subscriptions and command messages
  ws.on('message', (msg) => {
    let data;
    try {
      data = JSON.parse(msg.toString());
    } catch {
      ws.send(JSON.stringify({ type: 'error', payload: 'invalid JSON' }));
      return;
    }
    handleMessage(ws, data);
  });
});

// Minimal dispatcher: route messages by type
function handleMessage(ws, data) {
  if (data.type === 'command') {
    // run the command, then acknowledge so the app can resolve pending state
    ws.send(JSON.stringify({ type: 'command_ack', payload: { id: data.payload && data.payload.id, ok: true } }));
  }
  // ...handle telemetry subscriptions, model update triggers, etc.
}

server.listen(8443);

Keep message payloads small (slim JSON), send periodic health pings, and implement reconnect/backoff on the client.

Define clear message types to separate concerns:

  • telemetry: continuous metrics (CPU, GPU/NPU usage, latency, token throughput)
  • inference_output: partial/streaming tokens for generative responses
  • command_ack: confirmations for control commands
  • error: structured errors
Example telemetry message:

{
  "type": "telemetry",
  "payload": {
    "cpu": 32.1,
    "npu": 67.3,
    "latency_ms": 120,
    "tokens_per_sec": 10
  },
  "ts": 1670000000000
}
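
Both sides benefit from a shared TypeScript contract for these messages. The sketch below mirrors the telemetry example above; the inference_output, command_ack, and error field names are assumptions to adapt to your agent.

// mobile/src/shared/protocol.ts - shared message contract (sketch; mirror it on the Pi agent)
export interface TelemetryPayload {
  cpu: number;            // percent
  npu: number;            // percent
  latency_ms: number;
  tokens_per_sec: number;
}

export type PiMessage =
  | { type: 'telemetry'; payload: TelemetryPayload; ts: number }
  | { type: 'inference_output'; payload: { request_id: string; token: string; done: boolean }; ts: number } // assumed fields
  | { type: 'command_ack'; payload: { id: string; ok: boolean }; ts: number }                               // assumed fields
  | { type: 'error'; payload: { code: string; message: string }; ts: number };                              // assumed fields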

Companion app: React Native architecture

Choose an app skeleton that aligns with your release strategy. For many teams in 2026, a TypeScript React Native project with a modular state store (Zustand), background tasks, and OTA support is the sweet spot.

Core libraries and why

  • TypeScript: compile-time safety for protocol messages.
  • React Native (0.72+): recent releases ship improved Hermes performance and JSI bindings.
  • Zustand: lightweight state for telemetry streams; easier to test than large Redux setups (a store sketch follows this list).
  • react-native-svg / victory-native: efficient charts for streaming metrics.
  • Detox: reliable native E2E testing for CI pipelines.
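
A minimal Zustand store for the telemetry stream might look like the sketch below; the sliding-window size and the TelemetryPayload import (from the shared protocol sketch above) are assumptions to adapt to your schema.

// mobile/src/store/telemetry.ts - sketch, assuming the shared TelemetryPayload type above
import { create } from 'zustand';
import type { TelemetryPayload } from '../shared/protocol';

const WINDOW = 120; // keep the last ~120 samples for charts (assumed window size)

type TelemetryState = {
  samples: Array<TelemetryPayload & { ts: number }>;
  lastSeen: number | null;
  push: (payload: TelemetryPayload, ts: number) => void;
};

export const useTelemetry = create<TelemetryState>((set) => ({
  samples: [],
  lastSeen: null,
  push: (payload, ts) =>
    set((state) => ({
      samples: [...state.samples, { ...payload, ts }].slice(-WINDOW),
      lastSeen: ts,
    })),
}));

The socket's message handler can then call useTelemetry.getState().push(msg.payload, msg.ts) for telemetry messages, keeping rendering concerns out of the transport layer.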

WebSocket client (sample TypeScript)

// mobile/src/services/piSocket.ts

export type Message = { type: string; payload?: any; ts?: number };

export class PiSocket {
  ws: WebSocket | null = null;
  token: string;

  constructor(token: string) {
    this.token = token;
  }

  connect(host: string) {
    // Accept an http(s) origin and upgrade it to a secure WebSocket URL
    const url = `${host.replace(/^https?/, 'wss')}/ws?token=${this.token}`;
    this.ws = new WebSocket(url);

    this.ws.onopen = () => console.log('connected');
    this.ws.onmessage = (e) => this.handleMessage(e.data);
    // Naive fixed-delay reconnect; swap in exponential backoff with jitter (see below)
    this.ws.onclose = () => setTimeout(() => this.connect(host), 2000);
    this.ws.onerror = (e) => console.error('ws error', e);
  }

  handleMessage(raw: string) {
    try {
      const msg: Message = JSON.parse(raw);
      // dispatch to store / event handlers
    } catch (err) {
      console.warn('bad msg', err);
    }
  }

  send(cmd: Message) {
    this.ws?.send(JSON.stringify(cmd));
  }
}

Implement exponential backoff with jitter for reconnects. Keep the socket lifecycle in a custom hook or singleton service for easier testing.
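
As a sketch, reconnect delays can be derived like this; the base delay and cap are assumptions to tune for your network:

// mobile/src/services/backoff.ts - exponential backoff with full jitter (sketch)
export function backoffDelay(attempt: number, baseMs = 500, maxMs = 30_000): number {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.random() * exp; // "full jitter": random delay in [0, exp)
}

// Usage inside the socket's onclose handler (reset the attempt counter on a successful open):
// setTimeout(() => this.connect(host), backoffDelay(this.attempt++));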

Secure remote connectivity strategies (practical)

Remote access is the most sensitive part. The wrong choice creates attack surface or brittle NAT traversal. Below are production-proven approaches in 2026.

1) Cloudflare Tunnel

Pros: managed TLS, easy access control, WSS support, minimal infra. Cons: vendor lock-in if not planned for.

Flow: run cloudflared on the Pi to expose local WSS (no inbound firewall changes). Use Cloudflare Access to enforce identity and short-lived JWTs.

2) Tailscale / WireGuard

Pros: private mesh network, low-latency, good for fleets. Cons: requires client to be on tailscale network or use subnet routing.

Use Tailscale SSH and ACLs to limit who can reach the device. For mobile, embed Tailscale’s mobile client or require team machines to be on the same tailnet.

3) WebRTC DataChannel with signaling server

Pros: peer-to-peer, low latency for streaming tokens. Cons: complexity and need for STUN/TURN.

Use a short-lived cloud signaling server to negotiate a direct data channel. Authenticate with a signed challenge JWT to confirm the client identity.
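
A rough client-side sketch, assuming react-native-webrtc and a hypothetical signaling endpoint that relays the SDP offer to the Pi and returns its answer (ICE candidate exchange omitted for brevity):

// mobile/src/services/p2p.ts - sketch; react-native-webrtc and the signaling contract are assumptions
import { RTCPeerConnection } from 'react-native-webrtc';

async function exchangeOffer(signalUrl: string, jwt: string, offer: unknown) {
  // Hypothetical signaling call: POST the offer with the signed JWT, get the Pi's SDP answer back
  const res = await fetch(signalUrl, {
    method: 'POST',
    headers: { Authorization: `Bearer ${jwt}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(offer),
  });
  return res.json();
}

export async function openDataChannel(signalUrl: string, jwt: string) {
  const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });
  const channel = pc.createDataChannel('inference'); // streaming tokens / telemetry ride on this

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const answer = await exchangeOffer(signalUrl, jwt, pc.localDescription);
  await pc.setRemoteDescription(answer); // older library versions may need new RTCSessionDescription(answer)

  return channel;
}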

Authenticated WebSocket example

// Client sends a signed JWT on connect or uses mTLS.
// React Native's WebSocket (and Node's `ws` client) accept custom headers via the third argument.
const ws = new WebSocket('wss://pi.example.dev/ws', [], {
  headers: { Authorization: `Bearer ${jwt}` }
});

UI patterns for monitoring and control

Design the UI for constant streams: separate telemetry, live inference stream, and admin controls. Use lightweight visualizations that prioritize clarity over decoration.

Telemetry dashboard

  • Live charts: CPU/NPU, latency, tokens/sec (sliding window); a selector sketch follows this list.
  • Alerts: thresholds (e.g., NPU > 85%) trigger notifications and recommended actions.
  • Logs: tail inference logs with search/filters.
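
One way to feed those charts is a small selector over the telemetry store from earlier; it maps the sliding window into points for whichever charting library you use. The {x, y} shape and hook name are assumptions.

// mobile/src/features/useTokenSeries.ts - sketch; builds chart points from the telemetry store
import { useTelemetry } from '../store/telemetry';

export function useTokensPerSecSeries(): Array<{ x: number; y: number }> {
  // Select only what the chart needs; memoize or add a shallow-equality helper in
  // production to avoid extra re-renders from the freshly built array.
  return useTelemetry((state) =>
    state.samples.map((s) => ({ x: s.ts, y: s.tokens_per_sec }))
  );
}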

Control panel

  • Model selection & provisioning: select gguf artifact and trigger signed OTA update.
  • Runtime params: temperature, max tokens, sampling method (a command example follows this list).
  • Operational controls: pause/resume inference, restart process, and capture heap dumps.
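
For example, a runtime-parameter change can be sent through the PiSocket client shown earlier. The command name and argument fields here are assumptions; match them to whatever your Pi agent's dispatcher expects.

// mobile/src/features/control.ts - sketch; command name and fields are assumed
import { PiSocket } from '../services/piSocket';

export function setRuntimeParams(socket: PiSocket, temperature: number, maxTokens: number) {
  socket.send({
    type: 'command',
    payload: {
      id: `cmd-${Date.now()}`,       // client-generated id so the command_ack can be matched
      name: 'set_runtime_params',    // assumed command name on the Pi agent
      args: { temperature, max_tokens: maxTokens },
    },
    ts: Date.now(),
  });
}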

Model updates and supply chain security

OTA model updates must be signed and verified on the Pi. Treat model artifacts as first-class software packages.

  1. Sign model artifacts with a devops key (Ed25519/ECDSA).
  2. On the Pi, verify the signature before replacing model files (see the verification sketch after this list).
  3. Keep model version metadata accessible over the REST admin API for the companion app to show provenance.
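
On the Node.js agent, Ed25519 verification can be done with the built-in crypto module. A minimal sketch, with placeholder paths and a PEM public key assumed:

// pi-agent/verifyModel.ts - sketch using Node's crypto; paths and key format are assumptions
import { createPublicKey, verify } from 'crypto';
import { readFileSync } from 'fs';

export function verifyModelArtifact(modelPath: string, sigPath: string, pubKeyPath: string): boolean {
  const publicKey = createPublicKey(readFileSync(pubKeyPath)); // PEM-encoded Ed25519 public key
  const artifact = readFileSync(modelPath);                    // for multi-GB gguf files, verify a detached digest instead
  const signature = readFileSync(sigPath);
  // For Ed25519 the digest argument must be null; returns false on any mismatch
  return verify(null, artifact, publicKey, signature);
}

// Only swap the model in if verification succeeds; otherwise keep the previous version.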

DevOps and CI integration (practical checklist)

Automate builds, tests, and releases. Use infrastructure-as-code for tunnels and secrets.

  1. CI pipeline (GitHub Actions): run TypeScript compile, Jest unit tests, and static analysis.
  2. Build matrix: Android (Hermes) and iOS (Hermes/JSI) artifacts built in CI; produce debug/test flavors for E2E.
  3. E2E tests: Detox on Android/iOS emulator as part of PR checks. Include scenarios: connect to simulated Pi (mock WebSocket), stream telemetry, and execute command flows.
  4. OTA model release pipeline: sign artifact in CI and publish to artifact repo (S3-like). Trigger device agent update via webhook or when device checks in.
  5. Secrets: store signing keys and JWT private keys in a secrets manager. Use short-lived credentials in CI.

Sample GitHub Action job (outline)

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install node
        uses: actions/setup-node@v4
        with: { node-version: '20' }
      - name: Install deps
        run: yarn
      - name: Run unit tests
        run: yarn test
      - name: Build Android
        run: yarn android:ci
      - name: Run Detox
        run: yarn detox:test

Operational considerations and reliability

Plan for flaky networks and device reboots:

  • Implement idempotent commands on the Pi (retries must be safe).
  • Store last-known-state on the Pi and in the app to avoid conflicting commands after reconnects.
  • Use exponential backoff and sequence numbers for command ordering.
  • Emit health heartbeats and use a TTL for telemetry; stale values should be visually indicated (see the staleness check after this list).
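
As one small example, the app can derive a "stale" flag from the lastSeen timestamp kept in the telemetry store; the 10-second TTL is an assumption.

// mobile/src/features/staleness.ts - sketch; TTL value is an assumption
const TELEMETRY_TTL_MS = 10_000;

export function isStale(lastSeenTs: number | null, now: number = Date.now()): boolean {
  return lastSeenTs === null || now - lastSeenTs > TELEMETRY_TTL_MS;
}

// UI: dim the dashboard or show a "stale data" badge whenever isStale(...) is true.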

Performance and UX tips for large inference streams

Streaming tokens from on-device LLMs can be high-volume. Don’t treat tokens as raw logs.

  • Buffer tokens and diff-render to avoid thousands of tiny UI updates. Batch small windows (e.g., 50–200 ms); see the buffering sketch after this list.
  • Offer a low-bandwidth mode: send only meta-telemetry and sampling of tokens when on cellular.
  • Compress telemetry messages (e.g., protobuf with gzip) for constrained networks. For broader latency and edge patterns see the mass-cloud sessions playbook.
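
A small buffering helper along these lines flushes accumulated tokens on a timer instead of per message; the 100 ms flush interval is an assumption within the 50–200 ms range above.

// mobile/src/features/tokenBuffer.ts - sketch: batch streamed tokens before touching the UI
export class TokenBuffer {
  private pending: string[] = [];
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(private onFlush: (chunk: string) => void, private intervalMs = 100) {}

  start() {
    this.timer = setInterval(() => {
      if (this.pending.length === 0) return;
      this.onFlush(this.pending.join('')); // one state update per window, not per token
      this.pending = [];
    }, this.intervalMs);
  }

  push(token: string) {
    this.pending.push(token);
  }

  stop() {
    if (this.timer) clearInterval(this.timer);
    this.timer = null;
  }
}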

Security checklist

  • Mutual auth: JWT + optional client certs for the highest assurance. Designing robust identity and permissions benefits from Zero Trust guidance like Zero Trust for Generative Agents.
  • Least privilege: restrict control commands to operator roles; use Access lists (Cloudflare / Tailscale ACLs).
  • Signed model artifacts; verify digests and signatures on-device.
  • Audit logs: append-only logs for command events and critical model updates (send to remote SIEM when network permits).
  • Pen-testing: schedule regular device and network pentests; test for replay attacks on command channels.

Trends to design for in 2026

Late 2025 and early 2026 saw several trends that should influence your design:

  • Edge quantized LLMs (gguf, int8/4) are standard — plan for model size variants.
  • On-device accelerators and NPU support are improving; design telemetry to include accelerator metrics and power usage.
  • Privacy regulations and device attestation: include hardware-backed attestation when possible. For thinking about privacy-first on-device personalization, see Designing Privacy-First Personalization with On-Device Models.
  • Growing adoption of Zero Trust and identity-first device access (Cloudflare / Tailscale) — avoid building bespoke VPNs.
"In 2026, running generative AI at the edge is less about feasibility and more about operational and security discipline."

Troubleshooting quick guide

  1. No connect from mobile: check Cloudflare/Tailscale status, verify JWT and server cert fingerprints.
  2. High latency: inspect NPU usage, swap to a smaller model or increase token batching. See low-latency streaming techniques at VideoTool's low-latency playbook.
  3. Log flooding: enable sampling mode or downsample telemetry at the device.
  4. OTA failure: ensure signature verification and checksum match; keep fallbacks to previous model version.

Sample roadmap to production (6–8 weeks)

  1. Week 1: Pi agent prototype; WebSocket server + simulated inference data.
  2. Week 2: React Native app skeleton, WebSocket client, basic UI showing telemetry.
  3. Week 3: Secure remote access (Cloudflare Tunnel / Tailscale) and authentication flows.
  4. Week 4: Model update pipeline + signing; on-device verification and fallback logic.
  5. Week 5: E2E tests (Detox), CI integration, and build automation.
  6. Week 6–8: Hardening (mTLS, logging, pentest), UX polish, roll out to pilot devices.

Actionable checklist before you ship

  • Document the message schema and provide TypeScript types for consumers.
  • Automate signing of model artifacts in CI and require verification on device.
  • Implement role-based access for control features in the app.
  • Test network loss cases and command idempotency.
  • Integrate E2E tests into PR workflow; require passing for release branches.

Conclusion & next steps

Edge AI on Raspberry Pi 5 + AI HAT+ 2 is production-ready in 2026, but the differentiator is how well you operationalize security, remote access, and developer experience. Use verified tunnels (Cloudflare/Tailscale), signed model pipelines, and a small robust Pi agent. On the mobile side, prefer TypeScript, persistent WebSockets with JWT/mTLS, and light, testable UIs that surface the right telemetry at a glance.

Get started now: scaffold a TypeScript React Native project, implement the simple WebSocket client above, and prototype with a simulated Pi agent. Move to a managed tunnel for remote access and add model signing in your CI as next steps.

Call to action

Ready to ship a production-ready companion app? Clone our starter kit (includes Pi agent, React Native app, and GitHub Actions workflow) and follow the step-by-step README to deploy a secure pilot in under a week. If you want a tailored architecture review or a CI pipeline for your team, contact our builder services and accelerate your edge AI rollout.


Related Topics

#IoT #EdgeAI #Integration

reactnative

Contributor

Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.
