22.5847° N
89.5485° E
// SUNDARBANS DELTA
DEPTH
200M ↓
// TARGET
AUV CONTROL STACK — duburi_ws
MONGLA
— duburi — autonomous
A ROS 2 Humble control, mission, and vision stack for ArduSub vehicles.
One action surface, axis-isolated control, YOLO perception, DVL dead-reckoning,
and full RoboSub task autonomy — field-tested on Duburi 4.2.
ROS 2 Humble · ArduSub 4.x · Pixhawk 2.4.8 · YOLO v26 · Nucleus1000 DVL · BNO085 IMU · ByteTrack
01 — The Platform

Duburi 4.2 — hardware at a glance

Octagonal Marine 5083 aluminum hull, 8× Blue Robotics T200 thrusters in vectored_6dof configuration — same ArduSub frame as BlueROV2 Heavy.

[Diagram · top view: 8× T200 thrusters (T1–T8) in vectored_6dof, channel map Ch5 FWD / Ch6 LAT / Ch4 YAW, forward camera FOV, payloads: torpedo · grabber · dropper]
Component           Spec
Hull                Octagonal Marine 5083 aluminum, in-house
Frame               vectored_6dof (8× T200) — same as BlueROV2 Heavy
Flight controller   Pixhawk 2.4.8 · ArduSub 4.x · EKF3
Companion SBC       Raspberry Pi 4B · BlueOS · MAVLink router
Mission SBC         Nvidia Jetson Orin Nano · all ROS2 nodes
Depth sensor        Bar30 (ArduSub AHRS2 altitude)
IMU                 BNO085 on ESP32-C3 · USB CDC · gyro+accel
DVL                 Nortek Nucleus1000 · 192.168.2.201 · TCP 9000
Cameras             Blue Robotics Low-Light HD USB (fwd + down)
Network             5-port onboard switch · FathomX PoE tether
Payload             Slingshot torpedo · aluminum grabber · solenoid dropper
02 — Architecture

All subsystems in one view

Five ROS2 packages, one action surface (/duburi/move), one MAVLink owner. Every command flows through the same registry — CLI, mission runner, and Python API alike.

[Diagram · architecture. Operator layer: duburi CLI · mission runner · DuburiClient · YASMIN SM (roadmap) → /duburi/move. duburi_manager: ActionServer · auv_manager_node · Duburi facade · command dispatch · VisionState pool · heading_lock owner. duburi_control: motion_yaw (yaw_snap / yaw_glide) · motion_forward (drive_* + arc, Ch5+Ch4) · motion_lateral (drive_lateral_*, Ch6) · motion_depth (hold_depth setpoint) · motion_vision (P-loop, Ch4/5/6 + depth) · heading_lock (20 Hz Ch4 yaw-rate). MAVLink → AUV hardware: Pixhawk 2.4.8 · ArduSub EKF3 · RPi BlueOS UDP 14550 · 8× T200 ESCs · BNO085 · DVL Nucleus1000 · cameras. duburi_vision: camera_node → detector_node → tracker_node (/detections · /tracks). duburi_sensors: YawSource ABC · factory · telemetry.]
03 — Motion Control

Open-loop vs closed-loop movement

Mongla uses three control modes depending on the axis. ArduSub owns the 400 Hz inner loops; Python shapes setpoints at 5–20 Hz.

[Diagram · three control modes.
OPEN LOOP — translation (forward · lateral · arc): RC override on Ch5 (fwd) / Ch6 (lat), PWM 1100–1900 µs at 20 Hz, timed thrust envelope, constant (bang-bang) vs trapezoid (smooth=true).
CLOSED LOOP — yaw & depth: Python streams SET_ATTITUDE_TARGET (yaw) and SET_POSITION_TARGET (depth); ArduSub's 400 Hz PIDs close the loop through the ESCs and water.
HEADING LOCK — hybrid: Python P-loop at 20 Hz reads yaw_source, error = shortest(target − current), Kp·err → Ch4 RC_OVERRIDE at 0.6 %/deg, clamped ±18 %, layered on ArduSub's yaw PID.]
Forward / Lateral
Open-loop, timed

Python sends RC_CHANNELS_OVERRIDE with a thrust percentage for a set duration. With smooth_translate:=true, a trapezoid_ramp shapes the envelope — easing in, cruise, easing out. The ease-out IS the brake. After: 200 ms reverse kick (constant mode) + 1.2 s settle.
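The envelope shaping can be sketched in a few lines. The trapezoid_ramp name comes from the stack; the 25 % ramp fraction and the argument names here are assumptions:

```python
def trapezoid_ramp(t: float, duration: float, peak_pct: float,
                   ramp_frac: float = 0.25) -> float:
    """Thrust percentage at time t for a trapezoidal envelope:
    linear ease-in, constant cruise, linear ease-out (the brake)."""
    if t <= 0 or t >= duration:
        return 0.0
    ramp = duration * ramp_frac
    if t < ramp:                         # easing in
        return peak_pct * t / ramp
    if t > duration - ramp:              # easing out
        return peak_pct * (duration - t) / ramp
    return peak_pct                      # cruise

print(trapezoid_ramp(2.5, 5.0, 60.0))   # → 60.0 (cruise)
print(trapezoid_ramp(0.5, 5.0, 60.0))   # → 24.0 (easing in)
```

Sampling this at 20 Hz and converting each percentage to a PWM value yields the RC_CHANNELS_OVERRIDE stream described above.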

Yaw / Depth
Closed-loop via ArduSub

Python streams setpoints: SET_ATTITUDE_TARGET for yaw at 10–20 Hz, SET_POSITION_TARGET_GLOBAL_INT for depth at 5 Hz. ArduSub's 400 Hz attitude and position PIDs close the actual loops. Python never fights the firmware.
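A minimal illustration of the yaw path: SET_ATTITUDE_TARGET carries the attitude as a quaternion, so the streaming loop reduces to a yaw-to-quaternion conversion plus a periodic send. The pymavlink call in the comment and the type_mask handling are indicative only:

```python
import math

def yaw_to_quaternion(yaw_deg: float):
    """Quaternion [w, x, y, z] for a pure-yaw attitude setpoint,
    in the q[0]=w order MAVLink SET_ATTITUDE_TARGET expects."""
    half = math.radians(yaw_deg) / 2.0
    return [math.cos(half), 0.0, 0.0, math.sin(half)]

# Sketch of the 10-20 Hz streaming loop (pymavlink shown for
# illustration; the type_mask value is an assumption and should
# ignore body rates and thrust, leaving attitude active):
#
# master.mav.set_attitude_target_send(
#     0, master.target_system, master.target_component,
#     type_mask, yaw_to_quaternion(target_yaw_deg),
#     0.0, 0.0, 0.0, 0.0)

print(yaw_to_quaternion(0.0))   # → [1.0, 0.0, 0.0, 0.0]
```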

Heading Lock
Hybrid P-loop @ 20 Hz

A Python daemon thread reads yaw_source, computes proportional error, and streams Ch4 yaw-rate overrides at 20 Hz. Translation commands run on top — they only write Ch5/Ch6, leaving Ch4 to the lock. Source-agnostic: AHRS, BNO085, or DVL heading.
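The lock's arithmetic fits in a few lines. The 0.6 %/deg gain and ±18 % clamp match the numbers quoted in this section; the function names are illustrative:

```python
def shortest_angle_error(target_deg: float, current_deg: float) -> float:
    """Signed heading error in [-180, 180), taking the short way around."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0

def heading_lock_output(target_deg: float, current_deg: float,
                        kp: float = 0.6, limit: float = 18.0) -> float:
    """Ch4 yaw-rate percentage: Kp (%/deg) times error, clamped."""
    err = shortest_angle_error(target_deg, current_deg)
    return max(-limit, min(limit, kp * err))

print(shortest_angle_error(10.0, 350.0))   # → 20.0 (wraps, not -340)
print(heading_lock_output(10.0, 350.0))    # → 12.0
```

At 20 Hz the daemon feeds this output into the Ch4 slot of RC_CHANNELS_OVERRIDE while translation verbs keep writing Ch5/Ch6.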

04 — Perception

Vision pipeline: camera → detection → tracking

YOLO v26 runs on the Jetson GPU at ~30 Hz. ByteTrack assigns stable IDs across frames. VisionState bridges detections to control commands.

[Diagram · vision pipeline: camera_node (USB capture, 1920×1080, 30 Hz, forward + downward) → /image_raw (sensor_msgs/Image) → detector_node (Ultralytics YOLO v26, Jetson GPU, CUDA accelerated; class · bbox · conf) → /detections (vision_msgs/Detection2DArray) → tracker_node (supervision.ByteTrack + per-track Kalman smoother; stable track IDs bridge occlusion, Kalman-only prediction on miss; opt-in via --tracking true) → /tracks (Detection2DArray + tracking_id) → VisionState in auv_manager_node (per-camera subscriber; bbox_error(): cx_err = (cx−0.5)/0.5, cy_err = (cy−0.5)/0.5, bbox_h_frac = h/frame_h; detection age → stale check; errors normalised to [−1, +1]) → motion_vision P-loop → RC override (yaw_pct = kp_yaw × cx_err · lat_pct = kp_lat × cx_err · fwd_pct = kp_fwd × dist_err · depth_Δ = kp_dep × cy_err) at 20 Hz with live-tunable ROS params.]
ByteTrack: Zhang et al., "ByteTrack: Multi-Object Tracking by Associating Every Detection Box" ECCV 2022 · arxiv.org/abs/2110.06864
YOLO: Ultralytics YOLO v26 architecture — real-time object detection · docs.ultralytics.com
Kalman Filter: R.E. Kalman, "A New Approach to Linear Filtering and Prediction Problems" ASME 1960 — used here for per-track bbox smoothing
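The per-track smoother can be illustrated with a one-coordinate constant-velocity Kalman filter. This numpy sketch shows the shape of the idea; the noise values are illustrative, not the stack's tuned parameters:

```python
import numpy as np

class CVKalman1D:
    """Constant-velocity Kalman filter for one bbox coordinate.
    Per track, it smooths jitter and predicts through missed frames."""
    def __init__(self, x0: float, dt: float = 1 / 30):
        self.x = np.array([x0, 0.0])                # [position, velocity]
        self.P = np.eye(2)                          # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # motion model
        self.H = np.array([[1.0, 0.0]])             # observe position only
        self.Q = np.eye(2) * 1e-4                   # process noise
        self.R = np.array([[1e-2]])                 # measurement noise

    def predict(self) -> float:
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return float(self.x[0])

    def update(self, z: float) -> float:
        y = z - self.H @ self.x                     # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return float(self.x[0])

kf = CVKalman1D(0.50)
for z in (0.52, 0.49, 0.51):   # noisy cx measurements
    kf.predict()
    cx = kf.update(z)
# On a missed frame, calling predict() alone bridges the gap
# (the "Kalman-only pred on miss" behaviour above).
```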
05 — Vision → Control

How the AUV sees, aligns & follows

Bounding box geometry drives three independent control channels simultaneously. The camera frame is a normalized coordinate system — errors feed directly into P-gains.

[Diagram · normalised image frame [0,1]×[0,1]: example target bbox at confidence 0.92 with cx_err = +0.42 (yaw RIGHT), cy_err = +0.03 (≈ on target), bbox_h_frac = 0.46 (too small → approach). cx = 0.5 means the target is centred horizontally; bbox_h_frac = target_frac means correct distance.]

Bounding box → thrust channels

cx_err   = (bbox_cx - 0.5) / 0.5          # [-1, +1]
cy_err   = (bbox_cy - 0.5) / 0.5
dist_err = target_bbox_h_frac - bbox_h / frame_h

yaw_pct  = kp_yaw * cx_err                # → Ch4
lat_pct  = kp_lat * cx_err                # → Ch6 (strafe)
fwd_pct  = kp_fwd * dist_err              # → Ch5
depth_Δ  = kp_dep * cy_err                # → depth setpoint

Vision verb modes

Verb                  Active axes      Settling condition
vision_align_yaw      Ch4 only         |cx_err| < deadband for N frames
vision_align_lat      Ch6 only         |cx_err| < deadband
vision_align_depth    depth setpoint   |cy_err| < deadband
vision_hold_distance  Ch5              bbox_h_frac ≈ target ± tol
vision_align_3d       yaw+fwd+depth    all axes settled simultaneously
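The "for N frames" settling rule is a tiny state machine. A hedged sketch, with class name and parameters assumed:

```python
class SettleMonitor:
    """Declares an axis settled once |error| stays under the deadband
    for n_frames consecutive frames."""
    def __init__(self, deadband: float, n_frames: int):
        self.deadband = deadband
        self.n_frames = n_frames
        self.streak = 0        # consecutive in-deadband frames so far

    def update(self, error: float) -> bool:
        self.streak = self.streak + 1 if abs(error) < self.deadband else 0
        return self.streak >= self.n_frames

mon = SettleMonitor(deadband=0.06, n_frames=3)
for err in (0.20, 0.05, 0.04, 0.03):
    settled = mon.update(err)
print(settled)   # → True (three consecutive frames inside the deadband)
```

A multi-axis verb like vision_align_3d would hold one monitor per axis and finish only when every monitor reports settled on the same frame.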
Live tuning during a run: all vision gains are ROS params — change them without restarting:
ros2 param set /duburi_manager vision.kp_yaw 80.0
ros2 param set /duburi_manager vision.deadband 0.06
Tracking mode: With --tracking true, VisionState reads from /tracks instead of /detections. ByteTrack+Kalman bbox is smoother and has stable IDs — better for slow-moving targets, occlusion, and low-confidence frames. Raw detections have lower latency (no tracking buffer).
06 — Sensor Fusion

Yaw, position & state estimation

Three sensor paths, one clean interface. YawSource is an ABC — swap the backend with one ROS param. EKF3 on Pixhawk handles IMU + mag + baro fusion at 400 Hz.

[Diagram · sensor paths: Pixhawk AHRS2 (50 Hz via MAVLink) · BNO085 on ESP32-C3 (USB, no mag) · Nucleus1000 DVL (TCP, AHRS + BottomTrack) · Bar30 depth (ArduSub AHRS2.altitude). YawSource ABC with factory.py dispatch on the yaw_source param: mavlink_ahrs · bno085 · dvl / nucleus_dvl · bno085_dvl (pool default ★). heading_lock consumes yaw at 20 Hz for Ch4 yaw-rate overrides. DVL position: get_position() → (x, y) m, reset_position() before each move. Duburi facade routing: heading → heading_lock · position → DVL closed loop · depth → ArduSub ALT_HOLD · yaw → SET_ATTITUDE_TARGET · fwd/lat → RC_OVERRIDE, all via the Pixhawk wrapper → MAVLink. robot_localization EKF/UKF fusion: Phase 4 roadmap.]
EKF3 (ArduPilot): Mahony et al., "Nonlinear Complementary Filters on the Special Orthogonal Group" IEEE Trans. Automatic Control 2008 · ArduPilot EKF3 docs: ardupilot.org/dev/docs/ekf3.html
Kalman Filter background: Thrun, Burgard, Fox — "Probabilistic Robotics" MIT Press 2005 · BNO085: Bosch Sensortec SH-2 reference: ceva-dsp.com/BNO085-Datasheet
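The swap-the-backend pattern described above reduces to a small ABC plus a factory. Only the YawSource name comes from the stack; the registry contents and backends below are placeholders:

```python
from abc import ABC, abstractmethod

class YawSource(ABC):
    """Backend-agnostic yaw reader; heading_lock only sees this."""
    @abstractmethod
    def read_yaw_deg(self) -> float: ...

class StaticYaw(YawSource):
    """Trivial backend, handy for bench tests."""
    def __init__(self, yaw: float):
        self.yaw = yaw
    def read_yaw_deg(self) -> float:
        return self.yaw

# The real factory would also register mavlink_ahrs, bno085,
# dvl / nucleus_dvl, and bno085_dvl implementations here.
_REGISTRY = {"static": StaticYaw}

def make_yaw_source(name: str, **kwargs) -> YawSource:
    """factory.py-style dispatch keyed by the yaw_source ROS param."""
    return _REGISTRY[name](**kwargs)

src = make_yaw_source("static", yaw=42.0)
print(src.read_yaw_deg())   # → 42.0
```

Because heading_lock depends only on the ABC, switching from AHRS to BNO085 to DVL heading is a one-parameter change, exactly as the prose claims.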
07 — Full Integration

Complete system operation

All subsystems working together: perception → state → control → actuation → sensing, in one closed loop at multiple timescales.

[Diagram · one closed loop across three timescales. Vision layer (20–30 Hz): camera → YOLO26 → ByteTrack → VisionState. Control layer (10–20 Hz): mission/CLI → auv_manager → bbox errors → motion modules → Pixhawk.py. Hardware layer (400 Hz): ArduSub EKF3, attitude and position PIDs, motor mixer, ESCs, PWM 1100–1900 µs, 8× T200 vectored_6dof. Feedback: BNO085 / DVL Nucleus yaw → heading_lock; telemetry AHRS2 50 Hz · RC_CHANNELS 5 Hz · BAT 1 Hz.]
08 — RoboSub 2026

Competition tasks & Mongla's approach

RoboSub is an international student competition for fully autonomous underwater vehicles. Tasks test perception, navigation, manipulation, and mission management — the exact capabilities Mongla is built for.

Competition format (RoboSub 2026): 15-minute autonomous run. No human interaction after the start signal. Points scored by completing tasks in sequence. Gate must be passed first. Source: robonation.gitbook.io/robosub-resources
01
Collecting Data — Gate
Pass through gate · choose side (reef shark / sawfish)
MANDATORY
[Diagram · gate: RED reef-shark side, BLACK sawfish side; ① depth lock ② vision_align_yaw ③ lateral offset ✓ pass]
  • Submerge to gate depth (set_depth -0.8), engage heading lock toward gate bearing
  • Drive forward, camera detects gate frame (YOLO class: gate) — vision_align_yaw centres it horizontally
  • YOLO detects divider colour (RED=reef shark / BLACK=sawfish) — select chosen side
  • vision_align_lat or lateral offset moves AUV to correct side of centre
  • move_forward through gate with heading lock — DVL confirms passage distance
02
Navigate the Channel — Slalom
Weave between red/white PVC buoy pairs
[Diagram · slalom: red (left) and white (right) buoy pairs; heading lock active, lateral offset per buoy pair, exit heading]
  • Path markers (orange 4ft × 6in) on pool floor — downward camera detects and aligns heading
  • Detect red buoy (left) and white buoy (right) of each pair via forward camera
  • Pass between each pair: move_lateral to correct offset, heading lock maintains forward direction
  • DVL measures forward progress between pairs — prevents overshoot
03
Drop a BRUVS — Bin
Drop markers into correct half of bin
[Diagram · bin: SHARK / SAWFISH halves; ① downward cam detects bin ② vision_align_3d centres over target half ③ solenoid dropper fires]
  • Descend over bin using Bar30 depth control, downward camera detects bin and divider line
  • YOLO classifies target half (based on chosen animal: reef shark side)
  • vision_align_3d centres AUV horizontally over correct half using downward camera
  • Solenoid dropper releases marker at correct depth above bin
04
Tagging — Torpedoes
Fire torpedoes through target openings on board
[Diagram · torpedoes: target board with openings; ① acoustic pinger guides approach ② vision align → fire]
  • Acoustic pinger (hydrophone) guides initial approach to torpedo board area
  • Forward camera detects target board and classifies openings (YOLO classes: large/small circles)
  • vision_align_3d --axes yaw,depth aligns torpedo tube with target opening
  • vision_hold_distance holds correct standoff for torpedo trajectory
  • Fire slingshot torpedo — repeat for second opening with repositioning
05
Ocean Cleanup — Octagon
Surface inside octagon · face image · collect trash · place in baskets
MAX POINTS
[Diagram · octagon: acoustic pinger → navigate to octagon → surface inside → face image (yaw) → baskets]
  • Acoustic pinger guides navigation to octagon location
  • AUV surfaces inside octagon frame using depth setpoint = 0
  • Forward camera detects the reference image on octagon wall; vision_align_yaw faces it
  • Arm grabber, manoeuvre with vision_align_3d to collect floating trash objects
  • Place collected objects into correct basket (classified by YOLO from visual markers)
  • Maximum bonus points: collect multiple pieces and sort correctly
RoboSub 2026 Task Descriptions — RoboNation Team Handbook §3.2 · robonation.gitbook.io
Competition Sequence of Events — §3.4 · robonation.gitbook.io
09 — Mission Execution

Full run: from CLI to competition completion

A complete RoboSub run as a Mongla mission script. One YAML-like DSL drives all subsystems in sequence.

[Diagram · run timeline: ARM + bringup_check → launch vision pipeline (cameras + YOLO + ByteTrack) → set_depth(-0.8) (ArduSub ALT_HOLD engages) → lock_heading(0°) (BNO085 yaw source, 20 Hz daemon thread starts) → TASK 1 GATE (vision_acquire → vision_align_yaw → lateral offset → move_forward_dist(4 m) with heading lock maintained) → TASK 2 SLALOM (path markers on downward cam, buoy pairs on forward cam) → TASKS 3–5 BIN / TORPEDO / OCTAGON (vision_align_3d + payload actuation) → DISARM + unlock_heading · mission complete]

Mission DSL — actual code

# missions/robosub_prequal.py
def run(duburi, log):
    duburi.arm()
    duburi.set_depth(-0.8)
    duburi.lock_heading(target=0, timeout=180)

    # ── GATE ──────────────────────────────
    log("approaching gate")
    duburi.vision.find(camera='laptop',
                       target='gate',
                       sweep='yaw_right')
    duburi.vision.yaw(target='gate',
                      duration=10, camera='laptop')
    duburi.move_forward_dist(distance_m=4.0, gain=60)

    # ── BIN ───────────────────────────────
    log("finding bin")
    duburi.set_depth(-1.5)
    duburi.vision.lock(axes='yaw,forward,depth',
                       target='bin',
                       camera='downward',
                       duration=12)
    duburi.drop()  # solenoid dropper

    # ── TORPEDO ───────────────────────────
    duburi.set_depth(-0.8)
    duburi.vision.lock(axes='yaw,depth',
                       target='torpedo_board',
                       distance=0.4, duration=15)
    duburi.fire_torpedo(1)

    duburi.unlock_heading()
    duburi.disarm()
Live-add missions: drop any missions/your_name.py exposing def run(duburi, log), rebuild duburi_planner, and it appears in ros2 run duburi_planner mission --list instantly. No registry edit needed.
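A minimal sketch of how such registry-free discovery can work, using importlib; the real duburi_planner implementation may differ:

```python
import importlib.util
import pathlib

def discover_missions(missions_dir: str) -> dict:
    """Map mission name -> run callable for every missions/*.py that
    exposes run(duburi, log), in the spirit of `mission --list`."""
    found = {}
    for path in sorted(pathlib.Path(missions_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        mod = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(mod)          # import the file by path
        run = getattr(mod, "run", None)
        if callable(run):                     # only files with run(duburi, log)
            found[path.stem] = run
    return found
```

Anything the glob finds with a callable run becomes a selectable mission, so adding a file is enough; no central registry needs editing.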

State during a vision_align_3d

[Diagram · example tick: VisionState (cx_err = −0.3, cy_err = +0.1, dist_err = 0.2) → P gains (yaw = −15 %, depth Δ = −0.05 m, fwd = +14 %) → RC override (Ch4 = 1425, Ch5 = 1570, depth setpoint = −1.55 m) → ArduSub 400 Hz stabilise]
10 — Interactive Simulator

Vision alignment — live control simulator

Drag the target inside the camera frame or use the sliders to see how bounding-box errors map to real RC override values. All maths match the live codebase exactly.

↖ drag the target · or use the sliders on the right
YAW Ch4
0%
FWD Ch5
0%
DEPTH Δ
0m
Ch4 PWM
1500µs
cx_err = (bbox_cx − 0.5) / 0.5
cy_err = (bbox_cy − 0.5) / 0.5
dist_err = target_frac − bbox_h_frac
yaw_pct = Kp_yaw × cx_err (clamped ±18%)
fwd_pct = Kp_fwd × dist_err
dep_Δ   = Kp_dep × cy_err × 0.1
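The percentages above become RC override PWM values in the last step. A sketch assuming the usual ArduSub convention of 1500 µs neutral over a 1100–1900 µs range, so 1 % of command is 4 µs:

```python
def pct_to_pwm(pct: float, neutral: int = 1500, span: int = 400) -> int:
    """Map a signed thrust percentage [-100, +100] to an RC override
    PWM value, clamping out-of-range commands first."""
    pct = max(-100.0, min(100.0, pct))
    return int(round(neutral + span * pct / 100.0))

print(pct_to_pwm(0))     # → 1500 (neutral)
print(pct_to_pwm(-18))   # → 1428 (full heading-lock clamp, one way)
print(pct_to_pwm(60))    # → 1740
```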
Vision align yaw
Centre target horizontally

Only Ch4 is driven. The AUV yaws until |cx_err| < deadband. Heading lock is suspended for the duration; vision becomes the yaw authority.

Vision hold distance
Stand-off via bbox height

Target bbox_h_frac maps to physical distance. Ch5 drives proportionally to dist_err. Approach if bbox too small; back off if too large.

Vision align 3D
All axes simultaneously

Yaw + forward + depth errors computed every 50 ms from the same detection. Each axis has its own deadband and gain — settle condition is all three simultaneously within threshold.

11 — Simulation

Gazebo SITL — BlueROV2 sim target

ArduSub SITL + Gazebo Harmonic gives a faithful vectored_6dof 8-thruster sandbox. Mongla runs identically against sim or real hardware — only the connection profile changes.

[Diagram · sim wiring: ArduSub SITL (sim_vehicle.py -v ArduSub, vectored_6dof, udp:0.0.0.0:14550) ↔ Gazebo Harmonic (bluerov2_gz BlueROV2 model, underwater world, JSON physics plugin) ↔ auv_manager with mode:=sim (auto-detects SITL; all motion modules; yaw_source:=mavlink) over MAVLink UDP 14550. Driven by the duburi CLI (ros2 run duburi_planner: duburi arm · duburi set_depth -0.5 · duburi move_forward) via /duburi/move. mode:=auto probes: UDP 14550 → pool · USB CDC → desk · neither → sim.]

T1 — ArduSub SITL

sim_vehicle.py \
  -L RATBeach \
  -v ArduSub \
  -f vectored_6dof \
  --model=JSON \
  --out=udp:0.0.0.0:14550 \
  --out=udp:127.0.0.1:14551 \
  --console

T2 — Gazebo world

gz sim -v 3 -r \
  bluerov2_underwater.world

# GZ_SIM_RESOURCE_PATH must
# include bluerov2_gz/models
# and bluerov2_gz/worlds

T3 — Manager + drive

ros2 run duburi_manager start

duburi arm
duburi set_depth --target -0.5
duburi move_forward \
  --duration 5 --gain 60
duburi disarm
Sim parity: BlueROV2 Heavy in Gazebo uses the same vectored_6dof 8-thruster ArduSub frame as Duburi 4.2. Mass, hull shape, and payload differ — but all motion verbs, MAVLink messages, and sensor paths work identically. Develop in sim, deploy at pool without code changes.
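The mode:=auto probe order (UDP 14550 → pool, USB CDC → desk, else sim) can be sketched with injectable probes. The UDP bind heuristic and the /dev/ttyACM0 path below are assumptions, not the stack's actual checks:

```python
import os
import socket

def detect_mode(probe_udp=None, probe_serial=None) -> str:
    """Pick a connection profile: pool, desk, or sim. Probes are
    injectable so the policy itself is testable offline."""
    if probe_udp is None:
        def probe_udp():
            # Heuristic: if 0.0.0.0:14550 cannot be bound, something
            # (e.g. a MAVLink router on the tether) already owns it.
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            try:
                s.bind(("0.0.0.0", 14550))
                return False               # port free: no telemetry
            except OSError:
                return True                # port busy: tethered vehicle
            finally:
                s.close()
    if probe_serial is None:
        def probe_serial():
            return os.path.exists("/dev/ttyACM0")   # assumed CDC path
    if probe_udp():
        return "pool"
    if probe_serial():
        return "desk"
    return "sim"

print(detect_mode(probe_udp=lambda: False,
                  probe_serial=lambda: False))   # → sim
```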
12 — Theory & References

Concepts & citations

Mongla is built on well-established robotics, control theory, and computer vision foundations. These are the primary sources for the concepts used.

Control Theory
PID Control

Proportional-Integral-Derivative control — the foundation of depth and heading stabilisation. ArduSub implements cascaded PID (angle → rate) for attitude and altitude.

Åström & Hägglund, "PID Controllers: Theory, Design and Tuning" ISA 1995
ArduPilot attitude control: ardupilot.org
Wikipedia: PID controller
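For reference, the controller shape in its simplest discrete form, with a clamped integrator for basic anti-windup. The gains, the toy plant, and the hold_depth framing are illustrative, not ArduSub's cascaded implementation:

```python
class PID:
    """Discrete PID: u = Kp*e + Ki*integral(e) + Kd*de/dt,
    with the integrator clamped to avoid windup."""
    def __init__(self, kp, ki, kd, dt, i_limit=10.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.i_limit = i_limit
        self.i = 0.0
        self.prev_e = None

    def step(self, setpoint: float, measurement: float) -> float:
        e = setpoint - measurement
        self.i = max(-self.i_limit, min(self.i_limit, self.i + e * self.dt))
        d = 0.0 if self.prev_e is None else (e - self.prev_e) / self.dt
        self.prev_e = e
        return self.kp * e + self.ki * self.i + self.kd * d

# Toy depth hold: u is a thrust percentage and the "plant" maps it
# to a vertical rate (first order, no real hydrodynamics).
pid, depth = PID(kp=2.0, ki=0.5, kd=2.0, dt=0.1), 0.0
for _ in range(600):                 # 60 s simulated
    u = pid.step(-0.8, depth)        # hold_depth-style setpoint
    depth += 0.1 * u * pid.dt        # crude thrust → descent-rate model
# depth has converged close to the -0.8 m setpoint.
```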
State Estimation
EKF3 & Sensor Fusion

Extended Kalman Filter fuses IMU, magnetometer, barometer, and (optionally) DVL to produce the vehicle's pose estimate. ArduSub EKF3 runs at 400 Hz onboard Pixhawk.

Kalman, R.E. "A New Approach to Linear Filtering" ASME 1960
Thrun et al. "Probabilistic Robotics" MIT Press 2005
ArduPilot EKF3: ardupilot.org/dev/docs/ekf3.html
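The core move of any Kalman-style fusion is a variance-weighted blend of the current estimate with a new measurement; in one dimension it fits in a few lines (the numbers are illustrative):

```python
def fuse(est: float, var_est: float, meas: float, var_meas: float):
    """One scalar Kalman update: blend estimate and measurement in
    proportion to their variances; the fused variance always shrinks."""
    k = var_est / (var_est + var_meas)        # Kalman gain
    return est + k * (meas - est), (1 - k) * var_est

yaw, var = 90.0, 4.0                  # prior: gyro-integrated yaw, deg^2
yaw, var = fuse(yaw, var, 94.0, 1.0)  # magnetometer-style correction
print(round(yaw, 1), round(var, 2))   # → 93.2 0.8
```

The full EKF3 does this with a multi-dimensional state, a nonlinear motion model, and per-sensor innovation gating, but the gain-weighted correction is the same idea.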
Computer Vision
YOLO Object Detection

You Only Look Once — single-pass CNN for real-time detection. YOLO v26 on Jetson GPU delivers ~30 Hz detection of gates, buoys, bins, and other RoboSub objects.

Redmon et al., "You Only Look Once" CVPR 2016 · arxiv
Ultralytics YOLO: docs.ultralytics.com
Wikipedia: YOLO
Object Tracking
ByteTrack + Kalman

ByteTrack associates every detection (not just high-confidence ones) to tracks, bridging occlusion gaps. Per-track Kalman smooths bounding box jitter for stable vision control.

Zhang et al., "ByteTrack" ECCV 2022 · arxiv
Filterpy Kalman: filterpy.readthedocs.io
Wikipedia: Kalman filter
Navigation
DVL Dead-Reckoning

Doppler Velocity Log measures velocity relative to seabed via acoustic Doppler shift. Integrating v(t) over time gives position — enabling GPS-denied closed-loop distance moves.

Nortek Nucleus 1000 Technical Manual · nortek.com
Wikipedia: Acoustic Doppler
Dead reckoning: en.wikipedia.org
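The integration step is simple enough to sketch. This version assumes a constant heading per call for brevity, whereas a real implementation would read yaw per sample:

```python
import math

def dead_reckon(samples, yaw_deg: float, dt: float):
    """Integrate body-frame DVL velocities (vx forward, vy starboard)
    into a local-frame (x, y) displacement in metres."""
    x = y = 0.0
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    for vx, vy in samples:
        x += (vx * c - vy * s) * dt   # rotate body → local, then integrate
        y += (vx * s + vy * c) * dt
    return x, y

# 2 s of pure forward motion at 0.5 m/s while heading 90°:
x, y = dead_reckon([(0.5, 0.0)] * 20, yaw_deg=90.0, dt=0.1)
print(round(x, 3), round(y, 3))   # → 0.0 1.0
```

Drift grows with integration time, which is why reset_position() is called before each closed-loop move: each move only has to stay accurate over its own few metres.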
Middleware
ROS 2 & MAVLink

Robot Operating System 2 (Humble) provides pub/sub, actions, and parameters. MAVLink is a lightweight binary protocol used to command ArduSub and receive telemetry.

ROS 2 Humble: docs.ros.org/en/humble
MAVLink 2.0: mavlink.io
ArduSub: ardusub.com
Mongla
MONGLA — duburi_ws
AUV control stack for Duburi 4.2 · ROS2 Humble · ArduSub · YOLO v26 · Nortek Nucleus1000 DVL
EXPLORE · PERCEIVE · AUTONOMOUS
22.5847° N, 89.5485° E
Sundarbans Delta, Bangladesh
Named for the port of Mongla