Key Concepts
This page explains how Pixano organizes data. Understanding these concepts helps you design datasets, write import scripts, and get the most out of the UI.
Architecture overview
Pixano has three layers:
- Python backend — manages datasets: creation, queries, updates, statistics. You can use it standalone for scripting.
- REST API — a FastAPI server that exposes the backend over HTTP.
- Web UI — a Svelte/TypeScript app for browsing, annotating, and running AI-assisted tools.
Run the full stack with pixano server run, or use the Python backend directly in your own scripts.
How data is organized
A Pixano dataset is a collection of tables stored in a LanceDB database. Each table belongs to one of six schema groups:
| Group | Purpose | Schema classes |
|---|---|---|
| Record | One row per dataset sample. Holds metadata, split, and workflow status. | Record |
| View | Media files attached to a record. | Image, Video, SequenceFrame, Text |
| Entity | An object observable in one or more views (e.g. a car, a person). | Entity |
| Annotation | Labels attached to entities (bounding boxes, masks, keypoints, etc.). | BBox, CompressedRLE, KeyPoints, TextSpan, Message, Tracklet |
| EntityDynamicState | Per-frame state of an entity in video (e.g. visibility, pose). | EntityDynamicState |
| Embedding | Vectors for semantic search. | ViewEmbedding |
All schema classes live in pixano.schemas and are Pydantic models backed by LanceDB. You can subclass them to add custom fields.
Example: an object detection dataset
Consider an image dataset for detecting vehicles:
- Each record represents one image in the dataset (with a split like train or val).
- The record has one view — the image file itself.
- The image contains several entities — physical objects you want to label (a car, a truck, a bicycle).
- Each entity has one or more annotations — e.g. a bounding box with coordinates.
- You can later add model predictions (YOLO, GroundingDINO, etc.) as additional annotations — provenance fields on the annotation let you distinguish human labels from model outputs.
- You compute embeddings (e.g. OpenCLIP) so the dataset becomes searchable — those go in the embedding table.
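A minimal sketch of how these pieces map to schema objects (the field names on Image and BBox used here, such as url, coords, and format, are illustrative assumptions, and the reference fields linking annotations to entities and records are omitted):

```python
from pixano.schemas import BBox, Entity, Image, Record

# One sample: id and split are documented Record fields.
record = Record(id="img_001", split="train")

# The record's single view: the image file (url is an assumed field name).
image = Image(id="view_001", url="vehicles/img_001.jpg")

# A physical object visible in the image.
car = Entity(id="ent_001")

# A bounding box for that entity (coords and format are assumed field names).
box = BBox(id="ann_001", coords=[120.0, 45.0, 210.0, 160.0], format="xywh")
```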
DatasetInfo
Every dataset is defined by a DatasetInfo that declares its name, workspace type, and schema:
```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import BBox, Entity, Image, Record


class Vehicle(Entity):
    category: str = ""


dataset_info = DatasetInfo(
    name="Vehicle Detection",
    workspace=WorkspaceType.IMAGE,
    record=Record,
    entity=Vehicle,
    bbox=BBox,
    views={"image": Image},
)
```

The workspace selects the UI layout. The schema types you pass (record, entity, bbox, views, etc.) determine which tables are created in the database.
Custom fields
Each schema group maps to a base class you can subclass to add domain-specific fields:
```python
from pixano.schemas import Entity


class DetectedObject(Entity):
    category: str = ""
    is_difficult: bool = False
    confidence: float = 0.0
```

Fields must have default values (Pixano uses Pydantic under the hood).
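Since these are plain Pydantic models, instances are constructed and serialized the usual way (assuming the id field is inherited from the Entity base):

```python
obj = DetectedObject(id="ent_042", category="truck", is_difficult=True, confidence=0.87)
print(obj.model_dump())  # standard Pydantic serialization
```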
Record fields
The base Record class provides these fields out of the box:
| Field | Type | Default | Description |
|---|---|---|---|
| id | str | "" | Unique identifier for the record |
| split | str | "default" | Dataset split (e.g. train, val, test) |
| status | str | "new" | Workflow status: new, inProgress, inReview, validated |
| comment | str | "" | Free-text note from the annotator |
| created_at | datetime | now | Creation timestamp |
| updated_at | datetime | now | Last modification timestamp |
The status field is central to annotation campaigns: filter the dataset explorer by status to see which items still need work, and use the dashboard to track overall progress. You can set status in metadata.jsonl during import (e.g. "status": "validated" for ground truth).
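For example, a metadata.jsonl line that imports an item as already-validated ground truth could look like this (id and split mirror the Record fields above; the exact set of accepted keys depends on your import script):

```json
{"id": "img_001", "split": "train", "status": "validated"}
```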
Annotation provenance
Provenance is tracked directly on annotations via three built-in fields: source_type (one of model, human, ground_truth, other), source_name, and source_metadata (a JSON string). This lets you distinguish human annotations from model predictions without a separate table.
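For instance, a model prediction could be stored with its provenance like this (the source_* fields are the documented ones; the geometry fields on BBox are assumed for illustration):

```python
import json

from pixano.schemas import BBox

# A YOLO prediction, distinguishable from human labels by its source fields.
pred = BBox(
    id="ann_123",
    coords=[34.0, 60.0, 118.0, 92.0],  # assumed field name
    format="xywh",                     # assumed field name
    source_type="model",
    source_name="YOLOv8",
    source_metadata=json.dumps({"confidence": 0.91}),
)
```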
Workspace types
The workspace type tells Pixano what kind of data you’re working with and which UI layout to display:
| Workspace | Data | UI layout |
|---|---|---|
| IMAGE | Single image per item | Image viewer with annotation tools |
| VIDEO | Frame sequence per item | Video player with timeline and track inspector |
| IMAGE_VQA | Image + Q&A conversations | Image viewer + conversation panel |
| IMAGE_TEXT_ENTITY_LINKING | Image + text | Image viewer + text panel with span selection |
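Declaring a non-image workspace uses the same DatasetInfo pattern as above; for example, a video dataset might look like this (the "frames" view key and the choice of SequenceFrame are assumptions based on the schema table, not a verified configuration):

```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import BBox, Entity, Record, SequenceFrame

dataset_info = DatasetInfo(
    name="Traffic Tracking",
    workspace=WorkspaceType.VIDEO,    # selects the video player layout
    record=Record,
    entity=Entity,
    bbox=BBox,
    views={"frames": SequenceFrame},  # assumed view key
)
```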
Directory structure
Pixano organizes data around three directories, created by pixano init:
```
my_data/
  library/   # All datasets (each is a LanceDB database)
  media/     # Images, videos, and other files referenced by views
  models/    # Model files for inference (e.g. SAM weights)
```

Each dataset inside library/ has this layout:

```
library/<dataset>/
  info.json             # Name, description, workspace type, schema
  features_values.json  # Value constraints for fields
  stats.json            # Dataset statistics
  preview.png           # Thumbnail for the UI
  db/                   # LanceDB tables (one per schema)
```

Python API
```python
from pathlib import Path

from pixano.datasets import Dataset

# Open a dataset
ds = Dataset(Path("./my_data/library/my_dataset"))

# Read records
records = ds.get_records(limit=20, skip=0)

# Read a specific table
images = ds.get_data("image", limit=5)

# Add annotations
ds.add_data("bboxes", [bbox1, bbox2])

# Semantic search (requires precomputed embeddings)
records, distances, record_ids = ds.semantic_search("car on road", "image_embedding", limit=50)
```

Next steps
Pick a use case to see these concepts in action:
- Object Detection — images + bounding boxes with pre-annotation
- Video Object Tracking — video frames + tracklets + SAM
- Multi-View Detection — synchronized multi-camera views
- Entity Linking — image + text annotation
- Visual Question Answering — image + conversation Q&A