CameraWebServer_for_esp-arduino_3.0.x — Project-specific notes

This is the firmware half of the ESP32-S3 + OV2640/OV5640 image pipeline, one of three hardware pipelines in the PhotogrammetricWAAM stack. See the parent overview for the bigger picture and how this pipeline relates to the IMX708 (RPi) and DSLR (gphoto2) pipelines.

Upstream README.md is preserved as-is — it is just an Espressif/Seeed compatibility note. This doc captures the project-specific divergence: our DEVICE_ID convention, MQTT telemetry, port layout, and the open role-switching work.


Hardware

  • MCU board: Seeed Studio XIAO ESP32-S3 Sense (PSRAM-equipped — required).
  • Sensor: OV2640 (default board) or OV5640 (5 MP daughter sensor). Both work with the same sketch; only the frame_size upper bound differs.
  • Network: WiFi (most boards) or wired ETH via an external PHY (the "TSO" boards — see esp32s3_eth.tmuxp.yml).
  • PSRAM is required for any frame_size above SVGA — the firmware refuses high-res mode if psramFound() is false.

Per-board identity — the DEVICE_ID convention

Every flashed board is uniquely identified by one preprocessor define near the top of the sketch:

#define DEVICE_ID 143

That single number drives everything else on the network:

| Derived value | Pattern | Example for DEVICE_ID=143 |
| --- | --- | --- |
| Static IP address | 172.31.1.<DEVICE_ID> | 172.31.1.143 |
| MQTT log topic | esp32s3/<DEVICE_ID>/log | esp32s3/143/log |
| MQTT temperature topic | esp32s3/<DEVICE_ID>/temp | esp32s3/143/temp |
| MQTT RSSI topic | esp32s3/<DEVICE_ID>/rssi | esp32s3/143/rssi |
| MQTT client id | esp32s3-<DEVICE_ID> | esp32s3-143 |

To deploy a new board: change just that one number, reflash, done.

A small preprocessor trick concatenates DEVICE_ID into MQTT topic strings at compile time:

#define _STRINGIFY(x) #x
#define _TOSTRING(x) _STRINGIFY(x)
#define DEVICE_ID_STR _TOSTRING(DEVICE_ID)

const char *mqtt_topic = "esp32s3/" DEVICE_ID_STR "/log";
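
As a minimal, host-compilable sketch of why the two-level macro is needed, the block below derives all four strings from the table and shows what a single-level stringify would produce instead. The macro and constant names here are illustrative, not the sketch's own (they avoid the leading underscore plus capital letter, which C++ reserves for the implementation). The static_assert range check and the broker-address exclusion are assumptions based on the addresses listed in this document.

```cpp
#include <cassert>
#include <cstring>

#ifndef DEVICE_ID
#define DEVICE_ID 143  // example value; one number per flashed board
#endif

// Two-level expansion: TO_STRING first expands DEVICE_ID to 143, then
// STRINGIFY_IMPL applies # to produce the string literal "143".
#define STRINGIFY_IMPL(x) #x
#define TO_STRING(x) STRINGIFY_IMPL(x)
#define DEVICE_ID_STR TO_STRING(DEVICE_ID)

// Assumption: the ID doubles as the last IPv4 octet and must not collide
// with the MQTT broker at 172.31.1.252.
static_assert(DEVICE_ID >= 1 && DEVICE_ID <= 254, "DEVICE_ID must be a usable host octet");
static_assert(DEVICE_ID != 252, "172.31.1.252 is the MQTT broker");

// Adjacent string literals are merged at compile time; no runtime formatting.
static const char *const kLogTopic  = "esp32s3/" DEVICE_ID_STR "/log";
static const char *const kTempTopic = "esp32s3/" DEVICE_ID_STR "/temp";
static const char *const kRssiTopic = "esp32s3/" DEVICE_ID_STR "/rssi";
static const char *const kClientId  = "esp32s3-" DEVICE_ID_STR;

// For contrast: applying # in one step stringifies the macro *name*.
static const char *const kSingleLevel = STRINGIFY_IMPL(DEVICE_ID);

// Illustrative helper: true if a derived string embeds the board's ID.
inline bool mentionsDeviceId(const char *s) {
    assert(s != nullptr);
    return std::strstr(s, DEVICE_ID_STR) != nullptr;
}
```

With DEVICE_ID set to 143, kLogTopic is "esp32s3/143/log", while kSingleLevel is the literal text "DEVICE_ID", which is the failure mode the extra macro level avoids.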

Network endpoints

A flashed board exposes two HTTP servers and one MQTT client:

| Endpoint | Port | Server | Purpose |
| --- | --- | --- | --- |
| http://172.31.1.<id>:81/stream | 81 | esp_camera httpd | MJPG stream (the one consumed by image_publisher_node) |
| http://172.31.1.<id>:81/ | 81 | esp_camera httpd | Espressif's stock HTML control UI (sliders for resolution, quality, etc.) |
| http://172.31.1.<id>:8080/update | 8080 | WebServer + ElegantOTA | OTA firmware update |
| mqtt://172.31.1.252:1883 | 1883 | PubSubClient (outbound) | Telemetry publishing (one-way today) |

:81/stream is the canonical Espressif port — different from the IMX708 streamer's :8000/stream. Wire that into all image_publisher_node filenames accordingly.
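
On the host side, the per-board URLs can be generated from the ID rather than typed by hand. The following is a small sketch only: BoardEndpoints and endpointsFor are hypothetical names that exist neither in the firmware nor in image_publisher_node, and the 172.31.1.0/24 prefix is taken from the table above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the three HTTP endpoints one board exposes.
struct BoardEndpoints {
    std::string stream;   // MJPG stream consumed by image_publisher_node
    std::string control;  // stock Espressif control UI
    std::string ota;      // ElegantOTA upload page
};

// Builds the URLs for one board from its DEVICE_ID.
inline BoardEndpoints endpointsFor(int device_id) {
    assert(device_id >= 1 && device_id <= 254);  // ID is the last IPv4 octet
    const std::string host = "http://172.31.1." + std::to_string(device_id);
    return BoardEndpoints{host + ":81/stream", host + ":81/", host + ":8080/update"};
}
```

Calling endpointsFor(143).stream gives the string to hand to image_publisher_node for that board.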

MQTT telemetry topics

The firmware publishes (one-way, no callback) at fixed intervals:

| Topic | Interval | Payload |
| --- | --- | --- |
| esp32s3/<DEVICE_ID>/log | 1 s | A monotonic counter (sanity / liveness check) |
| esp32s3/<DEVICE_ID>/temp | 5 s | Internal core temperature in °C (temperatureRead(), ±5–10 °C) |
| esp32s3/<DEVICE_ID>/rssi | 5 s | WiFi RSSI in dBm (negative; closer to 0 == stronger) |

Reconnect uses a non-blocking 2-second backoff, so a missing broker never stalls the camera/HTTP loop.
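
The fixed publish intervals and the reconnect backoff can both be driven by one non-blocking timer pattern. The sketch below is an assumption about how such logic could look, not a copy of the firmware: IntervalTimer and reconnectIfDue are hypothetical names, and the current time is passed in (it would be millis() on the board) so the logic can be exercised on a host.

```cpp
#include <cassert>
#include <cstdint>

// Fires at most once per period and never blocks. The first firing happens
// one full period after start_ms.
class IntervalTimer {
public:
    explicit IntervalTimer(std::uint32_t period_ms, std::uint32_t start_ms = 0)
        : period_ms_(period_ms), last_ms_(start_ms) {
        assert(period_ms > 0);
    }

    // Unsigned subtraction stays correct across the 32-bit millis() rollover.
    bool due(std::uint32_t now_ms) {
        if (static_cast<std::uint32_t>(now_ms - last_ms_) < period_ms_) return false;
        last_ms_ = now_ms;
        return true;
    }

private:
    std::uint32_t period_ms_;
    std::uint32_t last_ms_;
};

// One loop() pass of the reconnect logic: no waiting between attempts, and
// `connect` is called at most once per backoff period while the link is down.
template <typename ConnectFn>
bool reconnectIfDue(IntervalTimer &backoff, std::uint32_t now_ms, bool connected,
                    ConnectFn connect) {
    if (connected) return true;
    if (!backoff.due(now_ms)) return false;  // too soon; let the camera loop run
    return connect();                        // single attempt, result reported as-is
}
```

On the board this would be used with one IntervalTimer(1000) for the log topic, one IntervalTimer(5000) shared by temp and rssi, and one IntervalTimer(2000) for the backoff. Note that the connect call itself can still take as long as the client's own socket timeout; the pattern only removes the waiting between attempts.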


Current camera operating point (as flashed)

The firmware is presently hard-pinned to a role-(1)-leaning still-quality configuration. This is not yet runtime-switchable — see Open work below.

// From CameraWebServer_for_esp-arduino_3.0.x.ino, setup()
config.frame_size      = FRAMESIZE_5MP;          // 2592 × 1944 (OV5640 limit)
config.jpeg_quality    = 27;                     // initial value; replaced by set_quality(s, 6) below (lower = higher quality)
config.fb_count        = 1;                      // explicit override of the more usual fb_count=2
config.grab_mode       = CAMERA_GRAB_LATEST;
config.fb_location     = CAMERA_FB_IN_PSRAM;
config.xclk_freq_hz    = 20000000;               // 20 MHz XCLK
config.pixel_format    = PIXFORMAT_JPEG;

// Then via sensor_t setters:
s->set_framesize(s, FRAMESIZE_5MP);
s->set_quality(s, 6);                            // 6 = high quality (range 0..63, lower = better)
s->set_wb_mode(s, 3);                            // fixed white balance mode
s->set_exposure_ctrl(s, 0);                      // AE off
s->set_aec_value(s, 800);                        // manual exposure value
s->set_gain_ctrl(s, 0);                          // AGC off
s->set_whitebal(s, 0);                           // AWB off
s->set_awb_gain(s, 0);                           // AWB gain off
s->set_raw_gma(s, 1);                            // gamma correction on

Why this combination is role-(1)-leaning today:

  • FRAMESIZE_5MP at an effective JPEG quality of 6 (set via set_quality(), which replaces the initial config.jpeg_quality) produces ~150–400 KB JPGs at the largest size the sensor can natively output.
  • fb_count=1 means only one frame's worth of PSRAM is reserved — necessary to fit a 5 MP JPG in PSRAM at all on the XIAO.
  • All the …_ctrl(s, 0) / set_aec_value / set_whitebal(0) calls lock exposure / gain / white balance, which is what you want for photogrammetry (frame-to-frame consistency) but not what you want for a viewfinder in changing lighting.

Despite this still-leaning configuration the boards are also currently acting as role (2) MJPG streamers — the ROS 2 host pulls :81/stream and republishes at 16 Hz (WiFi) / 24 Hz (ETH). They get away with it because the JPGs are small enough to push, but the encode latency is higher than it needs to be for real-time monitoring.


Open work — role switching (todo-esp32s3-fb-roles)

The firmware needs runtime support for switching between role (1) and role (2) without reflashing. Target settings, copied from ros2_ws/edge/README.md:

| Setting | Role (1) HQ stills | Role (2) low-latency MJPG |
| --- | --- | --- |
| config.fb_count | 1 (max single-frame size in PSRAM) | 2 (pipeline encoder, hide latency) |
| config.frame_size | FRAMESIZE_5MP (2592×1944) | FRAMESIZE_HD or _SVGA |
| set_quality() | 4–6 (highest quality) | 12–20 (smaller frames) |
| config.grab_mode | CAMERA_GRAB_WHEN_EMPTY | CAMERA_GRAB_LATEST |
| set_exposure_ctrl | manual, locked AEC value | auto |
| set_whitebal | manual, locked WB | auto OK |

Why fb_count matters specifically on this MCU:

  • Role (1): fb_count=1 is essentially mandatory. With the OV5640 at full 5 MP and JPG quality 6, a single framebuffer can already approach the PSRAM ceiling; a second buffer either won't allocate or will force a smaller resolution.
  • Role (2): fb_count=2 lets the JPG encoder run on buffer N while the sensor DMA is filling buffer N+1, hiding encode latency end-to-end. This is the dominant lever for "fastest stream".
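
One way to keep the two target operating points in one place is a small lookup keyed by role. This is a sketch under stated assumptions, not existing firmware code: Role, RoleSettings, settingsFor, and needsReinit are hypothetical names; the role (2) picks of FRAMESIZE_HD and quality 12 are just one choice from the ranges in the table; and frame size and grab mode are held as strings so the block compiles without esp_camera.h.

```cpp
#include <cassert>
#include <string>

enum class Role { Stills, Stream };

// Host-side mirror of the target-settings table.
struct RoleSettings {
    int fb_count;
    std::string frame_size;
    int jpeg_quality;  // 0..63, lower = better
    std::string grab_mode;
    bool auto_exposure;
    bool auto_white_balance;
};

inline RoleSettings settingsFor(Role role) {
    if (role == Role::Stills) {
        return RoleSettings{1, "FRAMESIZE_5MP", 6, "CAMERA_GRAB_WHEN_EMPTY", false, false};
    }
    // Assumed picks from the allowed ranges: HD rather than SVGA, quality 12.
    return RoleSettings{2, "FRAMESIZE_HD", 12, "CAMERA_GRAB_LATEST", true, true};
}

// Assumption: fb_count and grab_mode are camera_config_t fields consumed at
// init time, so changing either implies a deinit/init cycle rather than a
// sensor_t setter call.
inline bool needsReinit(const RoleSettings &from, const RoleSettings &to) {
    assert(from.fb_count >= 1 && to.fb_count >= 1);
    return from.fb_count != to.fb_count || from.grab_mode != to.grab_mode;
}
```

Under that assumption, needsReinit reports true for any switch between the two roles, because fb_count differs, which is why the role change is more than a handful of setter calls.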

Switch trigger candidates (decision pending — see also todo-mqtt-bridge):

  1. New MQTT topic esp32s3/<DEVICE_ID>/role/{request,response} that takes {"role": "stills" | "stream"} and reconfigures the sensor in-place. This is the cleanest fit with the existing mqtt__gphoto2_delegate contract and would make the ESP32-S3 schedulable by the batch_request_delegate.
  2. New HTTP endpoint :81/role?… parallel to Espressif's existing /control. Faster to wire up but doesn't unify the control plane.
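
For option 1, the topic names and the accepted role values could be pinned down as follows. Everything in this block is hypothetical, since the topic pair is only proposed: roleTopic, parseRole, and RequestedRole are illustrative names, and JSON decoding of the payload is assumed to happen elsewhere, so parseRole receives the already-extracted "role" string.

```cpp
#include <cassert>
#include <string>

enum class RequestedRole { Unknown, Stills, Stream };

// Builds esp32s3/<DEVICE_ID>/role/request or .../role/response.
inline std::string roleTopic(int device_id, bool is_response) {
    assert(device_id >= 1 && device_id <= 254);
    return "esp32s3/" + std::to_string(device_id) + "/role/" +
           (is_response ? "response" : "request");
}

// Maps the decoded "role" field to an enum; anything else is rejected
// rather than guessed at.
inline RequestedRole parseRole(const std::string &value) {
    if (value == "stills") return RequestedRole::Stills;
    if (value == "stream") return RequestedRole::Stream;
    return RequestedRole::Unknown;
}
```

Returning Unknown for unrecognized values lets the firmware answer on the response topic with an error instead of silently keeping or changing its configuration.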

Implementation hazard: esp_camera_init cannot be re-run without esp_camera_deinit() first, and switching frame_size at runtime is safer through the sensor_t setters than through a full reinit. Test on a single board before fleet rollout.


OTA flashing

ElegantOTA is mounted on the secondary WebServer at port 8080:

http://172.31.1.<DEVICE_ID>:8080/update

Drag-and-drop the new .bin from PlatformIO/Arduino IDE's build artifacts. The board reboots into the new firmware on completion.


Quick checks at boot

The serial console prints (at 115200 baud):

BEGIN SETUP
===========
...
v0.0
FIRMWARE COMPILED: Apr 8th, 2025
JPEG quality 6
awb: OFF
framebuffer - 2          # (note: subsequently overridden to 1 — see informConnectionURL())
CAMERA_GRAB_LATEST
VFLIP!!!
HMIRROR!!!

Camera Stream: http://172.31.1.143:81/stream
OTA Update:    http://172.31.1.143:8080/update

DEVICE_ID    : 143
MQTT broker  : 172.31.1.252:1883
MQTT client  : esp32s3-143
MQTT topics  : esp32s3/143/log , esp32s3/143/temp , esp32s3/143/rssi
OVERRIDING FB COUNT - 1

Use this as a checklist when bringing up a new board.
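
The checklist can also be applied mechanically to a captured serial log. The sketch below is a hypothetical host-side helper (missingBootMarkers is not part of the project) that looks for a few of the ID-dependent lines shown above using plain substring matching; capturing the log from the serial port is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the expected boot-log fragments that do not appear in `log`.
// An empty result means every checked marker was found.
inline std::vector<std::string> missingBootMarkers(const std::string &log, int device_id) {
    assert(device_id >= 1 && device_id <= 254);
    const std::string id = std::to_string(device_id);
    const std::vector<std::string> expected = {
        "http://172.31.1." + id + ":81/stream",
        "http://172.31.1." + id + ":8080/update",
        "esp32s3/" + id + "/log",
        "OVERRIDING FB COUNT - 1",
    };
    std::vector<std::string> missing;
    for (const std::string &marker : expected) {
        if (log.find(marker) == std::string::npos) missing.push_back(marker);
    }
    return missing;
}
```

A board flashed with the wrong DEVICE_ID shows up as three missing markers (both URLs and the log topic), which is a quick way to catch a forgotten edit of the define before the board joins the network.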


Related