Camera Image Ingestion — Overview
This document is the single entry point for how images get from a physical sensor into the central ROS 2 host in the PhotogrammetricWAAM system.
There are exactly three hardware pipelines and two roles each camera
can serve. Everything else in this folder (and in
PhotogrammetricWAAM-Edge/ and
ros2_ws/launch/) is one concrete instance of that 3 × 2 matrix.
TL;DR — The 3 × 2 matrix
| Hardware | Edge host | Wire | Edge software | Role (1) HQ stills | Role (2) low-latency MJPG |
|---|---|---|---|---|---|
| IMX708 (Pi Cam 3) | Raspberry Pi (CSI) | WiFi/ETH | simple_picamera2_streamer/app.py — picamera2 → cv2.imencode → HTTP /stream /jpg /set | ✅ (planned) | ✅ (current default, 8 Hz) |
| OV2640 / OV5640 | XIAO ESP32-S3 Sense (DVP) | WiFi/ETH | CameraWebServer_for_esp-arduino_3.0.x.ino — esp_camera → MJPG on :81/stream | ✅ (planned) | ✅ (current default) |
| DSLR (Canon / Nikon / Sony) | Raspberry Pi (USB) | WiFi/ETH | mqtt__gphoto2_delegate.py — gphoto2 capture-and-download | ✅ (only role) | ✗ (not supported) |
ROS 2 client side (the kernel host) is uniform across all three: it runs
image_publisher_node either against an MJPG stream URL (role 2, IMX708 and ESP32-S3)
or against the on-disk JPG/RAW that the gphoto2 delegate writes (role 1, DSLR
and on-demand IMX708/ESP32-S3).
See ros2_ws/launch/image_publisher_client/README.md.
The two roles, in detail
Every photogrammetry-grade camera in this system serves one of two roles at any given moment. The IMX708 and the ESP32-S3-attached OV2640/OV5640 are capable of either role; the DSLR is permanently locked to role (1).
Role (1) — Highest-fidelity still producer
Goal: Best possible JPG (or RAW) of one moment in time, on demand or at a slow cadence. Latency does not matter.
- Maximum sensor resolution (e.g. IMX708 4608×2592, OV5640 2592×1944, DSLR ≥24 MP).
- Highest JPG quality (low quantisation) — or RAW where available.
- Capture is triggered, not free-running. One trigger → one (or one stack of) frames written to durable storage with a session ID.
- Consumed by the photogrammetry / SfM pipeline downstream — not by RViz/Foxglove.
- Control plane: MQTT (recipient-based topics; see mqtt__gphoto2_delegate.spec.md for the canonical request/response shape).
Role (2) — Lowest-latency MJPG streamer
Goal: Real-time monitoring on the ROS 2 graph (visible in
rqt_image_view, Foxglove, RViz). Per-frame fidelity is sacrificed for timeliness.
- Down-rezzed and/or higher JPG compression (e.g. IMX708 at 2304×1296 @ 8 Hz, ESP32-S3 OV5640 typically HD/SVGA).
- Continuous MJPG over plain HTTP (multipart/x-mixed-replace).
- Encoded once at the edge, decoded once on the ROS 2 client by image_publisher_node, republished as sensor_msgs/Image on a per-camera namespace (/cam0/image_raw, /xiao_143/image_raw, …).
- Control plane: HTTP (/set on RPi today; open work for ESP32-S3, see TODOs below).
              role-(1)                      role-(2)
          "snapshot mode"               "viewfinder mode"
     ┌────────────────────────┐    ┌─────────────────────────┐
any  │ highest quality JPG    │    │ smallest-possible JPG   │
cam  │ on demand, slow rate   │    │ fast as possible,       │
     │ → durable storage      │    │ free-running            │
     │ → SfM / photogrammetry │    │ → ROS 2 image topic     │
     │ → RViz/Foxglove via    │    │ → live monitoring       │
     │   image_publisher of   │    │   (rqt_image_view)      │
     │   a *file* path        │    │                         │
     └────────────────────────┘    └─────────────────────────┘
                 ▲                              ▲
                 │ MQTT request/response        │ HTTP GET /stream
                 │ (gphoto2-style topics)       │ (multipart MJPG)
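The role (2) transport above is a continuous multipart/x-mixed-replace body in which each part carries one JPG. As a rough illustration of what "decoded once on the client" involves, the sketch below cuts complete JPGs out of a growing byte buffer by scanning for the JPEG start and end markers. This is not how image_publisher_node does it (that node delegates to OpenCV); the function name is hypothetical, and marker scanning is a simplification that can misfire on JPGs carrying an embedded thumbnail.

```python
from typing import List

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def extract_jpeg_frames(buffer: bytearray) -> List[bytes]:
    """Remove and return every complete JPG currently held in `buffer`.

    Bytes of an incomplete trailing frame stay in `buffer`, so the caller
    can append the next network chunk and call this function again.
    """
    frames: List[bytes] = []
    while True:
        start = buffer.find(SOI)
        if start < 0:
            # No frame start yet: drop multipart headers, but keep the last
            # byte in case an SOI marker is split across two chunks.
            del buffer[:max(0, len(buffer) - 1)]
            break
        end = buffer.find(EOI, start + 2)
        if end < 0:
            # Frame started but not finished: discard anything before it.
            del buffer[:start]
            break
        frames.append(bytes(buffer[start:end + 2]))
        del buffer[:end + 2]
    return frames
```

A caller would typically read fixed-size chunks from the HTTP response, extend the bytearray with each chunk, and hand every returned frame to a JPG decoder.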
The three hardware pipelines, in detail
1. IMX708 on Raspberry Pi (CSI)
[ IMX708 sensor ]──CSI──▶[ Raspberry Pi ]──HTTP MJPG──▶[ ROS 2 host ]
                         libcamera/picamera2           image_publisher_node
                         + cv2.imencode JPG            → /camN/image_raw
                         app.py @ :8000
Edge: ros2_ws/edge/simple_picamera2_streamer/app.py.
A single Python process that owns the camera, runs a capture thread at the
configured FrameDurationLimits, and serves three endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| /stream | GET | multipart/x-mixed-replace MJPG, frame-rate-locked to the capture loop (currently 8 Hz) |
| /jpg | GET | One latest JPG frame (single-shot) |
| /set | GET | Set ExposureTime, AnalogueGain, or LensPosition (puts AF into manual when LensPosition is given) |
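As a small illustration of the /set endpoint, the helper below builds the request URL for the three controls named in the table. It is a sketch only: the function name and the example host are hypothetical, and it assumes the query parameter names are exactly the picamera2 control names, as the ExposureTime example elsewhere in this document suggests.

```python
from urllib.parse import urlencode

# Controls listed for /set in the endpoint table above.
ALLOWED_CONTROLS = ("ExposureTime", "AnalogueGain", "LensPosition")


def build_set_url(host: str, port: int = 8000, **controls) -> str:
    """Return the GET URL that would set the given controls on the streamer."""
    if not controls:
        raise ValueError("at least one control is required")
    unknown = set(controls) - set(ALLOWED_CONTROLS)
    if unknown:
        raise ValueError(f"unsupported controls: {sorted(unknown)}")
    return f"http://{host}:{port}/set?{urlencode(controls)}"


if __name__ == "__main__":
    # "pi-cam.local" is a placeholder hostname, not a host from this project.
    print(build_set_url("pi-cam.local", ExposureTime=10000, LensPosition=2.5))
```

Rejecting unknown names on the client side keeps typos from being silently ignored by the server.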
Role today: running role (2) only — see TODO todo-imx708-fb-roles for the
mode-switch work.
Client: see the two tmuxp launch styles under "ROS 2 client half" below.
2. OV2640 / OV5640 on XIAO ESP32-S3 Sense (DVP)
[ OV2640 / OV5640 ]──DVP──▶[ XIAO ESP32-S3 ]──HTTP MJPG──▶[ ROS 2 host ]
                           esp_camera + httpd             image_publisher_node
                           CameraWebServer_for_*.ino      → /xiao_NNN/image_raw
                           stream @ :81/stream
                           OTA @ :8080/update
                           telemetry → MQTT broker
Custom Arduino-ESP32 (3.0.x) firmware derived from Espressif's CameraWebServer.
Each board is statically configured by a single #define DEVICE_ID 1xx which
also drives:
- Static IP 172.31.1.<DEVICE_ID>
- MQTT topics esp32s3/<DEVICE_ID>/{log,temp,rssi}
- MQTT client id esp32s3-<DEVICE_ID>
The MJPG stream is served on port 81 (the canonical Espressif port —
not the same as the IMX708 streamer's :8000). OTA flashing lives on
:8080/update.
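Because every per-board address derives from DEVICE_ID, a host-side tool can compute them instead of keeping a lookup table. The sketch below mirrors the patterns listed above; the class and property names are hypothetical, and the range check (a valid last IPv4 octet) is an assumption rather than a rule taken from the firmware.

```python
from dataclasses import dataclass

TELEMETRY_KINDS = ("log", "temp", "rssi")


@dataclass(frozen=True)
class XiaoIdentity:
    """Addresses derived from one board's DEVICE_ID."""

    device_id: int

    def __post_init__(self) -> None:
        # Assumption: the id must be usable as the last octet of the IP.
        if not 1 <= self.device_id <= 254:
            raise ValueError(f"device_id out of range: {self.device_id}")

    @property
    def ip(self) -> str:
        return f"172.31.1.{self.device_id}"

    @property
    def stream_url(self) -> str:
        return f"http://{self.ip}:81/stream"

    @property
    def ota_url(self) -> str:
        return f"http://{self.ip}:8080/update"

    @property
    def mqtt_client_id(self) -> str:
        return f"esp32s3-{self.device_id}"

    def mqtt_topic(self, kind: str) -> str:
        if kind not in TELEMETRY_KINDS:
            raise ValueError(f"unknown telemetry kind: {kind}")
        return f"esp32s3/{self.device_id}/{kind}"
```

For example, XiaoIdentity(143).stream_url yields the URL that the ROS 2 client would be pointed at for board 143.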
Currently the firmware is hard-configured for role (1)-leaning settings
(FRAMESIZE_5MP, set_quality(s, 6), set_aec_value(s, 800), awb=OFF,
fb_count=1, CAMERA_GRAB_LATEST) — see TODO todo-esp32s3-fb-roles
for the dynamic role-switching work.
Client: see the two tmuxp launch styles under "ROS 2 client half" below.
3. DSLR on Raspberry Pi (USB)
[ Canon/Nikon/Sony ]──USB──▶[ Raspberry Pi ]──gphoto2 capture──▶[ shared FS ]
                            mqtt__gphoto2_delegate.py           ─sync─▶ ROS 2 host
                            MQTT request/response               image_publisher_node
                                                                → /dslr_NN/image_raw
Edge:
PhotogrammetricWAAM-Blender-UI/02__STILLS/_EDGE_CAMERA_DAEMON/.../mqtt__gphoto2_delegate.py.
A Python service that subscribes to {hostname}/gphoto2 (or ALL/gphoto2),
shells out to gphoto2 --set-config … --capture-image-and-download …, writes
the result into a session-ID'd directory, and publishes a structured
{hostname}/gphoto2/response along with a
photogrammetry/sync/available notification for the file-sync layer.
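To make the delegate's flow more concrete, here is a minimal sketch of two pieces it needs: the recipient-based topic names and the gphoto2 command line for one capture. The function names, the session directory, and the filename pattern are illustrative assumptions; the canonical payload shape lives in mqtt__gphoto2_delegate.spec.md and is not reproduced here.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def request_topics(hostname: str) -> List[str]:
    """Topics a delegate on `hostname` listens to (own name plus broadcast)."""
    return [f"{hostname}/gphoto2", "ALL/gphoto2"]


def response_topic(hostname: str) -> str:
    """Topic on which the delegate reports the outcome of a capture."""
    return f"{hostname}/gphoto2/response"


def gphoto2_argv(session_dir: PurePosixPath, config: Dict[str, str]) -> List[str]:
    """Build the gphoto2 command for one capture into `session_dir`.

    `config` maps gphoto2 config keys to values; which keys exist depends
    on the camera model.
    """
    argv = ["gphoto2"]
    for key, value in config.items():
        argv += ["--set-config", f"{key}={value}"]
    # Assumed naming pattern: timestamp plus the camera's file suffix (%C).
    pattern = str(session_dir / "%Y%m%d-%H%M%S.%C")
    argv += ["--capture-image-and-download", "--filename", pattern]
    return argv
```

Passing the command as an argument list (for instance to subprocess.run) avoids shell quoting problems when config values contain spaces.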
Role today: role (1) only. DSLRs do not stream MJPG in this stack.
(gphoto2 --capture-movie exists but is intentionally out of scope —
the DSLR is the fidelity reference.)
SSH operator view: INBOX/TMUXP_VIEWS/DSLR.tmuxp.yml
opens parallel SSH sessions to the DSLR-hosting Pis (id2-rpi4.local, pi3m50.local).
Batch coordination across many DSLRs + Pi cams is the job of
batch_request_delegate.py —
one MQTT batch request fans out to N services and aggregates N responses
into a single batch response.
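The fan-out and aggregation idea can be sketched as a small state holder that waits for one response per recipient. This is a hypothetical illustration, not the code in batch_request_delegate.py: the class name and the dictionary keys of the aggregated result are assumptions, and timeouts are left out.

```python
from typing import Dict, Iterable, Optional


class BatchAggregator:
    """Collect one response per recipient, then emit a single batch result."""

    def __init__(self, batch_id: str, recipients: Iterable[str]) -> None:
        self.batch_id = batch_id
        self._pending = set(recipients)
        self._responses: Dict[str, dict] = {}

    def add_response(self, recipient: str, response: dict) -> Optional[dict]:
        """Record a response; return the batch result once all have arrived."""
        if recipient not in self._pending:
            # Unknown recipient or duplicate response: ignore it.
            return None
        self._pending.discard(recipient)
        self._responses[recipient] = response
        if self._pending:
            return None
        return {"batch_id": self.batch_id, "responses": dict(self._responses)}
```

A real coordinator would also need a deadline, so that one unreachable camera host does not block the batch response forever.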
Edge ↔ ROS 2 contract — the two halves
Edge half (server)
| Pipeline | Server | Listens on | Output |
|---|---|---|---|
| IMX708 | simple_picamera2_streamer/app.py | TCP :8000 (HTTP) | MJPG /stream, JPG /jpg, control /set |
| ESP32-S3 | CameraWebServer_for_esp-arduino_3.0.x.ino | TCP :81 (httpd) + :8080 OTA | MJPG /stream, MQTT telemetry |
| DSLR | mqtt__gphoto2_delegate.py | MQTT {host}/gphoto2 | JPG/RAW file + MQTT response |
ROS 2 client half
The ROS 2 host runs image_publisher_node (one per camera URL or per file
path), which:
- Decodes the MJPG / JPG into an OpenCV Mat.
- Publishes sensor_msgs/Image on <__ns>/image_raw (and camera_info if provided).
- Republishes at the rate set by publish_rate.
Critical empirical finding (see simple_picamera2_streamer/README.md): publish_rate MUST match the edge capture rate exactly; otherwise OpenCV internally buffers MJPG frames and rqt_image_view shows stale frames from seconds in the past. With the IMX708 streamer at 8 Hz, the client must be launched with publish_rate:=8. (not 7.9, not 10).
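The staleness caused by a slightly slow publish_rate can be estimated with simple arithmetic. The sketch below assumes an idealised unbounded FIFO between the edge and the reader, which is a simplification of whatever OpenCV actually buffers; the function name is hypothetical. It only models a reader that is slower than the source and does not explain the symptoms of a reader that is faster.

```python
def display_lag_seconds(capture_hz: float, publish_hz: float, elapsed_s: float) -> float:
    """Estimate how old the displayed frame is after `elapsed_s` seconds.

    Model: frames arrive at `capture_hz`, are read at `publish_hz`, and
    unread frames queue up without limit. Every queued frame adds one
    capture interval of staleness.
    """
    if capture_hz <= 0 or publish_hz <= 0:
        raise ValueError("rates must be positive")
    surplus_per_second = max(0.0, capture_hz - publish_hz)
    backlog_frames = surplus_per_second * elapsed_s
    return backlog_frames / capture_hz


if __name__ == "__main__":
    # 8 Hz source read at 7.9 Hz: about 6 frames queue up per minute,
    # so the view is roughly 0.75 s behind after 60 s and keeps drifting.
    print(display_lag_seconds(8.0, 7.9, 60.0))
```

Under this model the lag grows linearly with run time, which is consistent with a view that looks fine at launch and is seconds behind later.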
Two tmuxp launch styles exist for this client side, depending on where you're running from:
- Mac/laptop with pixi (no system ROS install): pixi_image_publishers.yml, which prefixes every command with pixi run -e kilted ros2 …
- ROS host / docker container with ros2 on PATH: image_publishers.tmuxp.yml, which runs the same commands without the pixi run prefix.
Plus a parameterised Python launch file at
xiao_sense_esp32s3_eyes.py for the
ESP32-S3 fleet, and
esp32s3_eth.tmuxp.yml
for the wired ESP32-S3 + Lepton thermal mix.
See ros2_ws/launch/image_publisher_client/README.md
for the full namespace map and per-host IP allocation.
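The difference between the two launch styles is only the command prefix, which the following sketch makes explicit. It assumes the node is started with ros2 run from the image_publisher package and that the source is passed as the first positional argument; the function name and the example stream host are hypothetical, and the tmuxp files remain the authoritative commands.

```python
from typing import List


def image_publisher_argv(
    source: str,
    namespace: str,
    publish_rate: float,
    use_pixi: bool = False,
) -> List[str]:
    """Build the command line for one image_publisher_node instance.

    `source` is an MJPG stream URL (role 2) or a file path (role 1).
    """
    argv = [
        "ros2", "run", "image_publisher", "image_publisher_node", source,
        "--ros-args",
        "-r", f"__ns:={namespace}",
        # float() keeps the value a double, matching the "8." style above.
        "-p", f"publish_rate:={float(publish_rate)}",
    ]
    if use_pixi:
        # Laptop variant: run the same command inside the pixi environment.
        argv = ["pixi", "run", "-e", "kilted"] + argv
    return argv


if __name__ == "__main__":
    print(" ".join(image_publisher_argv("http://pi-cam.local:8000/stream", "/cam0", 8)))
```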
Open implementation work (tracked, not yet implemented)
These are intentional gaps — documented here so the architecture page is the source of truth, then mirrored in the project todo list.
todo-esp32s3-fb-roles — Runtime role switching on the ESP32-S3 XIAO
The OV2640/OV5640 firmware is currently hard-pinned to one operating point. Add a runtime mode-switch (over MQTT or HTTP) that reconfigures the camera without a reflash:
| Setting | Role (1) HQ stills | Role (2) low-latency MJPG |
|---|---|---|
| config.fb_count | 1 (max single-frame size in PSRAM) | 2 (pipeline encoder, hide latency) |
| config.frame_size | FRAMESIZE_5MP (2592×1944) | FRAMESIZE_HD or _SVGA |
| set_quality() | low number = high quality (≈ 4–6) | higher number (≈ 12–20) |
| config.grab_mode | CAMERA_GRAB_WHEN_EMPTY | CAMERA_GRAB_LATEST |
| set_exposure_ctrl | manual, locked AEC value | auto |
| set_whitebal | manual, locked WB | auto OK |
Rationale: with PSRAM at a premium, fb_count=1 lets a 5MP JPG actually fit; fb_count=2 hides JPG-encode latency for streaming.
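One way to keep the two operating points in a single place on the host side is a preset table that a future mode-switch request could be generated from. The sketch below is hypothetical throughout: the class, the role keys, and the JSON shape are assumptions, and the quality values (6 and 12) are simply picked from the ranges in the table above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CameraPreset:
    fb_count: int
    frame_size: str
    jpeg_quality: int  # esp_camera convention: lower number = higher quality
    grab_mode: str
    auto_exposure: bool
    auto_white_balance: bool


PRESETS = {
    # Role (1): one large frame buffer, locked exposure and white balance.
    "stills": CameraPreset(1, "FRAMESIZE_5MP", 6, "CAMERA_GRAB_WHEN_EMPTY", False, False),
    # Role (2): two smaller buffers so encoding overlaps with capture.
    "stream": CameraPreset(2, "FRAMESIZE_HD", 12, "CAMERA_GRAB_LATEST", True, True),
}


def preset_payload(role: str) -> str:
    """Serialise the preset for `role` as JSON, e.g. for an MQTT message."""
    if role not in PRESETS:
        raise ValueError(f"unknown role: {role}")
    return json.dumps(asdict(PRESETS[role]), sort_keys=True)
```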
todo-imx708-fb-roles — Mode switching in simple_picamera2_streamer/app.py
Today app.py is built around picam2.create_video_configuration(main={"size": (2304,1296)}, buffer_count=4) — fixed at role (2). Add an
endpoint (e.g. GET /mode?role=stills / …?role=stream) that:
- For role (1): call picam2.switch_mode_and_capture_file(...) against a still configuration at full sensor resolution, optionally RAW+JPEG, then revert.
- For role (2): keep the current free-running 8 Hz video path.
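A minimal sketch of the role (1) branch is shown below. It is not the planned endpoint: the function name and the lock are assumptions, and it only covers the JPG case (RAW capture would need extra handling). The picam2 argument is expected to be an already started picamera2 Picamera2 instance running the video configuration.

```python
import threading


def capture_still(picam2, path: str, camera_lock: threading.Lock) -> str:
    """Capture one still at the default still configuration, then resume.

    `camera_lock` is a hypothetical lock that the streaming capture thread
    would also hold while grabbing frames, so the mode switch cannot
    interleave with a frame grab.
    """
    with camera_lock:
        # picamera2's default still configuration targets full resolution.
        still_config = picam2.create_still_configuration()
        # Switches to the still mode, writes one frame to `path`, and then
        # restores the configuration that was active before the call.
        picam2.switch_mode_and_capture_file(still_config, path)
    return path
```

While the switch is in progress the /stream clients would see a short pause, so this approach trades a brief viewfinder stall for a full-resolution frame.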
This makes one IMX708 host serve both the SfM batch capture and the live viewfinder without contention.
todo-mqtt-bridge — Unify the control plane
Right now control is heterogeneous:
- IMX708 is controlled by HTTP GET /set?ExposureTime=….
- ESP32-S3 has only MQTT telemetry (log/temp/rssi); control is via the Espressif web UI on :81/.
- DSLR is fully MQTT (request/response, recipient-based).
Decide whether the RPi streamer and the ESP32-S3 firmware should adopt the same recipient-based MQTT contract as the gphoto2 delegate. If yes, the batch_request_delegate already aggregates across services and would Just Work.
See also
- simple_picamera2_streamer/README.md: IMX708 pipeline detail
- ros2_ws/launch/image_publisher_client/README.md: ROS 2 client side, namespace map, tmuxp variants
- PhotogrammetricWAAM-Edge/.../CameraWebServer_for_esp-arduino_3.0.x/PROJECT_README.md: ESP32-S3 firmware specifics
- mqtt__gphoto2_delegate.spec.md: DSLR MQTT contract
- batch_request_delegate.spec.md: multi-camera batch coordinator