Key Concepts
This page explains how Pixano organizes data. Understanding these concepts helps you design datasets, write import scripts, and get the most out of the UI.
Architecture overview
Pixano has three layers:
- Python backend — manages datasets: creation, queries, updates, statistics. You can use it standalone for scripting.
- REST API — a FastAPI server that exposes the backend over HTTP.
- Web UI — a Svelte/TypeScript app for browsing, annotating, and running AI-assisted tools.
Run the full stack with pixano server run, or use the Python backend directly in your own scripts.
How data is organized
A Pixano dataset is a collection of tables stored in a LanceDB database. Each table belongs to one of six schema groups:
| Group | Purpose | Schema classes |
|---|---|---|
| Record | One row per dataset sample. Holds metadata, split, and workflow status. | Record |
| View | Media files attached to a record. | Image, Video, SequenceFrame, Text |
| Entity | An object observable in one or more views (e.g. a car, a person). | Entity |
| Annotation | Labels attached to entities (bounding boxes, masks, keypoints, etc.). | BBox, CompressedRLE, KeyPoints, TextSpan, Message, Tracklet |
| EntityDynamicState | Per-frame state of an entity in video (e.g. visibility, pose). | EntityDynamicState |
| Embedding | Vectors for semantic search. | ViewEmbedding |
All schema classes live in pixano.schemas and are Pydantic models backed by LanceDB. You can subclass them to add custom fields.
Example: an object detection dataset
Consider an image dataset for detecting vehicles:
- Each record represents one image in the dataset (with a split like train or val).
- The record has one view — the image file itself.
- The image contains several entities — physical objects you want to label (a car, a truck, a bicycle).
- Each entity has one or more annotations — e.g. a bounding box with coordinates.
- You can later add model predictions (YOLO, GroundingDINO, etc.) as additional annotations — provenance fields on the annotation let you distinguish human labels from model outputs.
- You compute embeddings (e.g. OpenCLIP) so the dataset becomes searchable — those go in the embedding table.
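A minimal sketch of how these pieces map to schema objects (the field names on Image and BBox used here, such as url, coords, and format, are illustrative assumptions, and the reference fields linking annotations to entities and records are omitted):

```python
from pixano.schemas import BBox, Entity, Image, Record

# One sample: id and split are documented Record fields.
record = Record(id="img_001", split="train")

# The record's single view: the image file (url is an assumed field name).
image = Image(id="view_001", url="vehicles/img_001.jpg")

# A physical object visible in the image.
car = Entity(id="ent_001")

# A bounding box for that entity (coords and format are assumed field names).
box = BBox(id="ann_001", coords=[120.0, 45.0, 210.0, 160.0], format="xywh")
```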
DatasetInfo
Every dataset is defined by a DatasetInfo that declares its name, workspace type, and schema:
```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import BBox, Entity, Image, Record


class Vehicle(Entity):
    category: str = ""


dataset_info = DatasetInfo(
    name="Vehicle Detection",
    workspace=WorkspaceType.IMAGE,
    record=Record,
    entity=Vehicle,
    bbox=BBox,
    views={"image": Image},
)
```

The workspace selects the UI layout. The schema types you pass (record, entity, bbox, views, etc.) determine which tables are created in the database.
Custom fields
Each schema group maps to a base class you can subclass to add domain-specific fields:
```python
from pixano.schemas import Entity


class DetectedObject(Entity):
    category: str = ""
    is_difficult: bool = False
    confidence: float = 0.0
```

Fields must have default values (Pixano uses Pydantic under the hood).
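Since these are plain Pydantic models, instances are constructed and serialized the usual way (assuming the id field is inherited from the Entity base):

```python
obj = DetectedObject(id="ent_042", category="truck", is_difficult=True, confidence=0.87)
print(obj.model_dump())  # standard Pydantic serialization
```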
Record fields
The base Record class provides these fields out of the box:
| Field | Type | Default | Description |
|---|---|---|---|
| id | str | "" | Unique identifier for the record |
| split | str | "default" | Dataset split (e.g. train, val, test) |
| status | str | "new" | Workflow status: new, inProgress, inReview, validated |
| comment | str | "" | Free-text note from the annotator |
| created_at | datetime | now | Creation timestamp |
| updated_at | datetime | now | Last modification timestamp |
The status field is central to annotation campaigns: filter the dataset explorer by status to see which items still need work, and use the dashboard to track overall progress. You can set status in metadata.jsonl during import (e.g. "status": "validated" for ground truth).
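For example, a metadata.jsonl line that imports an item as already-validated ground truth could look like this (id and split mirror the Record fields above; the exact set of accepted keys depends on your import script):

```json
{"id": "img_001", "split": "train", "status": "validated"}
```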
Annotation provenance
Provenance is tracked directly on annotations via three built-in fields: source_type (one of model, human, ground_truth, other), source_name, and source_metadata (a JSON string). This lets you distinguish human annotations from model predictions without a separate table.
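For instance, a model prediction could be stored with its provenance like this (the source_* fields are the documented ones; the geometry fields on BBox are assumed for illustration):

```python
import json

from pixano.schemas import BBox

# A YOLO prediction, distinguishable from human labels by its source fields.
pred = BBox(
    id="ann_123",
    coords=[34.0, 60.0, 118.0, 92.0],  # assumed field name
    format="xywh",                     # assumed field name
    source_type="model",
    source_name="YOLOv8",
    source_metadata=json.dumps({"confidence": 0.91}),
)
```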
Workspace types
The workspace type tells Pixano what kind of data you’re working with and which UI layout to display:
| Workspace | Data | UI layout |
|---|---|---|
| IMAGE | Single image per item | Image viewer with annotation tools |
| VIDEO | Frame sequence per item | Video player with timeline and track inspector |
| IMAGE_VQA | Image + Q&A conversations | Image viewer + conversation panel |
| IMAGE_TEXT_ENTITY_LINKING | Image + text | Image viewer + text panel with span selection |
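Declaring a non-image workspace uses the same DatasetInfo pattern as above; for example, a video dataset might look like this (the "frames" view key and the choice of SequenceFrame are assumptions based on the schema table, not a verified configuration):

```python
from pixano.datasets import DatasetInfo
from pixano.datasets.workspaces import WorkspaceType
from pixano.schemas import BBox, Entity, Record, SequenceFrame

dataset_info = DatasetInfo(
    name="Traffic Tracking",
    workspace=WorkspaceType.VIDEO,    # selects the video player layout
    record=Record,
    entity=Entity,
    bbox=BBox,
    views={"frames": SequenceFrame},  # assumed view key
)
```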
Directory structure
Pixano organizes data around three directories, created by pixano init:
```
my_data/
  library/   # All datasets (each is a LanceDB database)
  media/     # Images, videos, and other files referenced by views
  models/    # Model files for inference (e.g. SAM weights)
```

Each dataset inside library/ has this layout:

```
library/<dataset>/
  info.json             # Name, description, workspace type, schema
  features_values.json  # Value constraints for fields
  stats.json            # Dataset statistics
  preview.png           # Thumbnail for the UI
  db/                   # LanceDB tables (one per schema)
```

Python API
```python
from pathlib import Path

from pixano.datasets import Dataset

# Open a dataset
ds = Dataset(Path("./my_data/library/my_dataset"))

# Read records
records = ds.get_records(limit=20, skip=0)

# Read a specific table
images = ds.get_data("image", limit=5)

# Add annotations
ds.add_data("bboxes", [bbox1, bbox2])

# Semantic search (requires precomputed embeddings)
records, distances, record_ids = ds.semantic_search("car on road", "image_embedding", limit=50)
```

Next steps
Pick a use case to see these concepts in action:
- Object Detection — images + bounding boxes with pre-annotation
- Video Object Tracking — video frames + tracklets + SAM
- Multi-View Detection — synchronized multi-camera views
- Entity Linking — image + text annotation
- Visual Question Answering — image + conversation Q&A