# Multi-View Detection
Work with synchronized multi-camera datasets. This guide uses the FLIR ADAS v2 dataset (RGB + thermal) and covers importing in image or video mode, browsing synchronized views side by side, and annotating across modalities.
## Prerequisites
- Pixano installed (see Installation)
- A data directory initialized with `pixano init ./my_data`
- The FLIR ADAS v2 dataset downloaded locally (video test sets)
The sample-generation script below uses the video test sets, which contain time-synced RGB/thermal frame pairs documented in `rgb_to_thermal_vid_map.json`. The expected layout:
```
flir_adas_v2/
  rgb_to_thermal_vid_map.json
  video_rgb_test/
    coco.json
    data/
      video-{vid}-frame-{n}-{hash}.jpg
  video_thermal_test/
    coco.json
    data/
      video-{vid}-frame-{n}-{hash}.jpg
```

## Import the dataset
FLIR ADAS v2 supports two import modes: image (individual frame pairs) and video (grouped by video sequence).
### Image mode — static frame pairs
Generate a sample of individual RGB + thermal frame pairs:
```shell
python examples/flir/generate_sample.py ./flir_sample /path/to/flir_adas_v2 \
    --mode image --num-samples 50
```

The resulting folder:
```
flir_sample/
  test/
    rgb/
      000000.jpg
      000001.jpg
      ...
    thermal/
      000000.jpg
      000001.jpg
      ...
    metadata.jsonl
```

Each line in `metadata.jsonl` pairs an RGB image with its thermal counterpart and includes bounding boxes detected in the thermal view:
```json
{
  "status": "validated",
  "views": {
    "rgb": "rgb/000000.jpg",
    "thermal": "thermal/000000.jpg"
  },
  "entities": [
    {
      "category": "person",
      "annotations": {
        "thermal": { "bbox": [0.12, 0.25, 0.15, 0.30] }
      }
    }
  ]
}
```

The dataset info for image mode (`examples/flir/info.py:dataset_info`):
```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import BBox, Entity, Image, Record


class FLIREntity(Entity):
    category: str = ""


dataset_info = DatasetInfo(
    name="FLIR Static Sample",
    description="Sample import for FLIR static image pairs.",
    workspace=WorkspaceType.IMAGE,
    record=Record,
    entity=FLIREntity,
    bbox=BBox,
    views={"rgb": Image, "thermal": Image},
)
```

Note that `views` has two entries, `"rgb"` and `"thermal"`, both of type `Image`. This tells Pixano to display them side by side.
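Before importing, it can help to verify that every line of `metadata.jsonl` references files that actually exist on disk. A standalone sketch under the layout shown above (plain Python, not Pixano's API; the `check_sample` helper is made up for this guide):

```python
import json
from pathlib import Path


def check_sample(sample_dir: str, split: str = "test") -> int:
    """Return the number of view references in metadata.jsonl that match
    no file on disk (0 means the sample looks importable)."""
    root = Path(sample_dir) / split
    missing = 0
    with open(root / "metadata.jsonl") as f:
        for lineno, line in enumerate(f, start=1):
            entry = json.loads(line)
            for view, ref in entry["views"].items():
                # Image mode stores a plain path; video mode stores
                # {"path": "rgb/video_000/*.jpg", "fps": 30} with a glob.
                pattern = ref["path"] if isinstance(ref, dict) else ref
                if not list(root.glob(pattern)):
                    print(f"line {lineno}: no file matches {view} -> {pattern}")
                    missing += 1
    return missing
```

Because `Path.glob` accepts literal paths as well as wildcards, the same check covers both image-mode and video-mode metadata.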
Import:
```shell
pixano data import ./my_data ./flir_sample \
    --info examples/flir/info.py:dataset_info
```

### Video mode — synchronized sequences
Generate a sample grouped by video:
```shell
python examples/flir/generate_sample.py ./flir_video_sample /path/to/flir_adas_v2 \
    --mode video --num-samples 5
```

The resulting folder:
```
flir_video_sample/
  test/
    rgb/
      video_000/*.jpg
      video_001/*.jpg
    thermal/
      video_000/*.jpg
      video_001/*.jpg
    bboxes/
      video_000/*.json
      video_001/*.json
    metadata.jsonl
```

Each metadata entry uses glob patterns for frame sequences and per-frame annotation files:
```json
{
  "status": "validated",
  "views": {
    "rgb": { "path": "rgb/video_000/*.jpg", "fps": 30 },
    "thermal": { "path": "thermal/video_000/*.jpg", "fps": 30 }
  },
  "annotation_files": {
    "bbox": "bboxes/video_000/*.json"
  }
}
```

The dataset info for video mode (`examples/flir/info.py:video_dataset_info`):
```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import (
    BBox,
    Entity,
    EntityDynamicState,
    Record,
    SequenceFrame,
    Tracklet,
)


class FLIREntity(Entity):
    category: str = ""


class FLIREntityDynamicState(EntityDynamicState):
    pass


video_dataset_info = DatasetInfo(
    name="FLIR Dynamic Sample",
    description="Sample import for FLIR dynamic synchronized frame sequences.",
    workspace=WorkspaceType.VIDEO,
    record=Record,
    entity=FLIREntity,
    entity_dynamic_state=FLIREntityDynamicState,
    tracklet=Tracklet,
    bbox=BBox,
    views={"rgb": SequenceFrame, "thermal": SequenceFrame},
)
```

Import:
```shell
pixano data import ./my_data ./flir_video_sample \
    --info examples/flir/info.py:video_dataset_info
```

## Launch the server
```shell
pixano server run ./my_data
```

## Explore in the UI
### Multi-view image mode
When you open an item with multiple views, Pixano tiles the images side by side. You can:
- Pan and zoom each view independently.
- Drag views to rearrange them.
- Double-click a view to bring it to the front.
Annotations are per-view: a bounding box on the thermal image is separate from one on the RGB image, but both can be linked to the same entity.
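That linkage can be pictured with a toy data model (illustrative plain Python, not Pixano's schema classes; `Box` and `DetectedEntity` are invented for this sketch):

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    view: str      # view the box was drawn on ("rgb" or "thermal")
    coords: list   # [x, y, w, h]


@dataclass
class DetectedEntity:
    category: str
    boxes: list = field(default_factory=list)  # one entry per annotated view


# The same person annotated independently on both modalities,
# linked through the single shared entity:
person = DetectedEntity(category="person")
person.boxes.append(Box(view="thermal", coords=[0.12, 0.25, 0.15, 0.30]))
person.boxes.append(Box(view="rgb", coords=[0.10, 0.24, 0.16, 0.31]))
```

Each box stays attached to its own view; selecting the entity in either view highlights both.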
### Multi-view video mode
In video mode, the timeline synchronizes all views. When you play or step through frames, all views update together. The Video Inspector shows tracks across all views — tracklet bars are stacked by view on each track row.
You can annotate objects in both the RGB and thermal views. Create a bounding box in the thermal view, then switch to the RGB view and add another annotation for the same entity. Pixano links them through the shared entity reference.
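Under the hood, keeping views in step is just index arithmetic over the per-view `fps` declared in the metadata. A hedged sketch of that mapping (the actual UI logic is internal to Pixano; `sync_frame` is a name made up here):

```python
def sync_frame(frame_idx: int, fps_src: float, fps_dst: float) -> int:
    """Map a frame index in the source view to the frame shown at the
    same timestamp in the destination view."""
    timestamp = frame_idx / fps_src        # seconds into the sequence
    return round(timestamp * fps_dst)


# With both FLIR views at 30 fps the mapping is the identity:
sync_frame(42, fps_src=30, fps_dst=30)   # -> 42
# A hypothetical 15 fps thermal stream would advance every other step:
sync_frame(42, fps_src=30, fps_dst=15)   # -> 21
```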
## Annotate
All annotation tools work the same as in single-view mode:
- Bounding boxes — click and drag on any view.
- Polygons and keypoints — available per view.
- Smart segmentation — if SAM embeddings are precomputed for a view, the magic wand tool works on that view.
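The box coordinates in the sample metadata (e.g. `[0.12, 0.25, 0.15, 0.30]`) look normalized to the image size. Assuming a normalized `[x, y, w, h]` convention, which is an assumption of this sketch rather than something the format documents, converting to pixel units for a given view is straightforward:

```python
def to_pixels(bbox, width, height):
    """Convert an assumed-normalized [x, y, w, h] box to pixel units."""
    x, y, w, h = bbox
    return [x * width, y * height, w * width, h * height]


# Example with a hypothetical 640x512 thermal frame:
to_pixels([0.12, 0.25, 0.15, 0.30], width=640, height=512)
```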
In video mode, the associate tool lets you merge tracks across frames, just as with single-view video datasets.
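Conceptually, merging two tracks amounts to gathering their per-frame boxes under one entity. A toy sketch of that operation (not Pixano's implementation; `merge_tracks` is invented for illustration):

```python
def merge_tracks(track_a: dict, track_b: dict) -> dict:
    """Merge two tracks given as {frame_index: bbox} dicts.
    On overlapping frames, track_a's box wins."""
    merged = dict(track_b)
    merged.update(track_a)              # track_a takes precedence
    return dict(sorted(merged.items()))  # keep frames in order


a = {0: [0.10, 0.10, 0.2, 0.2], 1: [0.11, 0.10, 0.2, 0.2]}
b = {2: [0.12, 0.10, 0.2, 0.2], 3: [0.13, 0.10, 0.2, 0.2]}
merged = merge_tracks(a, b)  # one track covering frames 0..3
```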
## Next steps
- Entity Linking — link image regions to text spans
- Video Object Tracking — single-view video with SAM
- API Reference — full Python API documentation