pixano.inference.types

Pixano's inference types.

This module defines Pixano's own types for inference operations, independent of any specific inference backend.

CompressedRLEData(size, counts) dataclass

Compressed RLE mask data.

Attributes:

Name Type Description
size list[int]

Mask size as [height, width].

counts bytes

Mask RLE encoding as bytes.

from_dict(data) classmethod

Create from dictionary.

Source code in pixano/inference/types.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "CompressedRLEData":
    """Create from dictionary."""
    counts = data["counts"]
    if isinstance(counts, str):
        counts = counts.encode("utf-8")
    return cls(size=data["size"], counts=counts)

to_dict()

Convert to dictionary.

Source code in pixano/inference/types.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary."""
    return {
        "size": self.size,
        "counts": self.counts.decode("utf-8") if isinstance(self.counts, bytes) else self.counts,
    }
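A round trip through the two methods shown above can be sketched as follows. The local dataclass here is a stand-in that reproduces the documented source verbatim, so the example runs without the library installed; the sample `counts` string is illustrative, not a real RLE encoding.

```python
from dataclasses import dataclass
from typing import Any


# Stand-in mirroring the documented CompressedRLEData, for illustration only.
@dataclass
class CompressedRLEData:
    size: list[int]
    counts: bytes

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "CompressedRLEData":
        """Create from dictionary."""
        counts = data["counts"]
        if isinstance(counts, str):
            counts = counts.encode("utf-8")
        return cls(size=data["size"], counts=counts)

    def to_dict(self) -> dict[str, Any]:
        """Convert to dictionary."""
        return {
            "size": self.size,
            "counts": self.counts.decode("utf-8") if isinstance(self.counts, bytes) else self.counts,
        }


# A string "counts" coming from JSON becomes bytes on the instance,
# and serializes back to a string.
mask = CompressedRLEData.from_dict({"size": [480, 640], "counts": "a3Nk0"})
assert isinstance(mask.counts, bytes)
assert mask.to_dict() == {"size": [480, 640], "counts": "a3Nk0"}
```

This asymmetry (bytes in memory, string over the wire) keeps the dataclass JSON-serializable while preserving the bytes representation models typically expect.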

ImageMaskGenerationInput(image, model, image_embedding=None, high_resolution_features=None, reset_predictor=True, points=None, labels=None, boxes=None, num_multimask_outputs=3, multimask_output=True, return_image_embedding=False) dataclass

Input for image mask generation.

Attributes:

Name Type Description
image str

Image as base64 string or URL.

model str

Model name to use.

image_embedding NDArrayData | None

Pre-computed image embedding (optional).

high_resolution_features list[NDArrayData] | None

Pre-computed high-res features (optional).

reset_predictor bool

Whether to reset the predictor state for a new image.

points list[list[list[int]]] | None

Points for mask generation [num_prompts, num_points, 2].

labels list[list[int]] | None

Labels for points [num_prompts, num_points].

boxes list[list[int]] | None

Bounding boxes [num_prompts, 4].

num_multimask_outputs int

Number of masks to generate per prompt.

multimask_output bool

Whether to return multiple masks per prompt.

return_image_embedding bool

Whether to return computed embeddings.
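The prompt fields follow the nested-list shapes given above. A minimal sketch of a consistent payload for two prompts, each with two points; the label semantics (1 = foreground, 0 = background) follow the common SAM convention and are an assumption here, as the document does not spell them out:

```python
# Shapes per the documented conventions:
#   points: [num_prompts, num_points, 2]
#   labels: [num_prompts, num_points]
#   boxes:  [num_prompts, 4]
points = [[[120, 200], [140, 220]], [[400, 310], [395, 305]]]
labels = [[1, 1], [1, 0]]  # assumed SAM convention: 1 foreground, 0 background
boxes = [[100, 180, 160, 240], [380, 290, 420, 330]]

# Consistency checks: one label per point, four coordinates per box.
assert all(len(pts) == len(lbls) for pts, lbls in zip(points, labels))
assert all(len(pt) == 2 for prompt in points for pt in prompt)
assert all(len(box) == 4 for box in boxes)
```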

ImageMaskGenerationOutput(masks, scores, image_embedding=None, high_resolution_features=None) dataclass

Output for image mask generation.

Attributes:

Name Type Description
masks list[list[CompressedRLEData]]

Generated masks [num_prompts, num_masks].

scores NDArrayData

Confidence scores.

image_embedding NDArrayData | None

Computed image embedding (if requested).

high_resolution_features list[NDArrayData] | None

Computed features (if requested).

ImageMaskGenerationResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') dataclass

Complete result of image mask generation.

Attributes:

Name Type Description
data ImageMaskGenerationOutput

The output data.

timestamp datetime

When the inference completed.

processing_time float

Time taken in seconds.

metadata dict[str, Any]

Additional metadata from the model.

id str

Unique identifier for the inference request.

status str

Status of the inference ("SUCCESS", "FAILURE").

ImageZeroShotDetectionInput(image, model, classes, box_threshold=0.5, text_threshold=0.5) dataclass

Input for zero-shot object detection.

Attributes:

Name Type Description
image str

Image as base64 string or URL.

model str

Model name to use.

classes list[str] | str

Class names to detect, as a list or a single string.

box_threshold float

Confidence threshold for boxes.

text_threshold float

Confidence threshold for text matching.
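A sketch of how box_threshold is typically applied: detections whose score falls below it are dropped. The parallel lists mirror the ImageZeroShotDetectionOutput layout below; the sample values are illustrative, and the filtering itself is an assumption about the backend's behavior, not documented library code.

```python
# Parallel lists as in ImageZeroShotDetectionOutput: boxes[i], scores[i],
# and classes[i] describe the same detection.
boxes = [[10, 20, 110, 220], [50, 60, 90, 120], [5, 5, 300, 300]]
scores = [0.92, 0.41, 0.77]
classes = ["cat", "cat", "dog"]

box_threshold = 0.5
kept = [
    (box, score, cls)
    for box, score, cls in zip(boxes, scores, classes)
    if score >= box_threshold
]
# kept holds the detections scoring 0.92 and 0.77; the 0.41 one is dropped.
```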

ImageZeroShotDetectionOutput(boxes, scores, classes) dataclass

Output for zero-shot object detection.

Attributes:

Name Type Description
boxes list[list[int]]

Detected bounding boxes as [x1, y1, x2, y2].

scores list[float]

Confidence scores for each detection.

classes list[str]

Class names for each detection.

ImageZeroShotDetectionResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') dataclass

Complete result of zero-shot object detection.

Attributes:

Name Type Description
data ImageZeroShotDetectionOutput

The output data.

timestamp datetime

When the inference completed.

processing_time float

Time taken in seconds.

metadata dict[str, Any]

Additional metadata from the model.

id str

Unique identifier for the inference request.

status str

Status of the inference ("SUCCESS", "FAILURE").

InferenceTask

Bases: str, Enum

Tasks supported by Pixano inference providers.

ModelConfig(name, task, path=None, config=dict(), processor_config=dict()) dataclass

Configuration for instantiating a model.

Attributes:

Name Type Description
name str

Name of the model.

task str

Task of the model.

path Path | str | None

Path to the model dump.

config dict[str, Any]

Configuration of the model.

processor_config dict[str, Any]

Configuration of the processor.

to_dict()

Convert to dictionary.

Source code in pixano/inference/types.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary."""
    return {
        "name": self.name,
        "task": self.task,
        "path": str(self.path) if self.path else None,
        "config": self.config,
        "processor_config": self.processor_config,
    }

ModelInfo(name, task, model_path=None, model_class=None, provider=None) dataclass

Information about an available model.

Attributes:

Name Type Description
name str

Name of the model.

task str

Task the model can perform.

model_path str | None

Path to the model weights (optional).

model_class str | None

Class name of the model (optional).

provider str | None

Provider backend for the model (optional).

NDArrayData(values, shape) dataclass

N-dimensional array data.

Attributes:

Name Type Description
values list[float]

Flat list of values.

shape list[int]

Shape of the array.

from_dict(data) classmethod

Create from dictionary.

Source code in pixano/inference/types.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "NDArrayData":
    """Create from dictionary."""
    return cls(values=data["values"], shape=data["shape"])

to_dict()

Convert to dictionary.

Source code in pixano/inference/types.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary."""
    return {"values": self.values, "shape": self.shape}
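NDArrayData keeps values flat and records the shape separately; restoring the array layout is left to the consumer. With numpy this is a one-line reshape; the pure-Python 2-D sketch below is illustrative, not library code.

```python
# Flat values plus a declared shape, as stored by NDArrayData.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
shape = [2, 3]

# Rebuild the nested 2-D layout by slicing row-major runs.
rows, cols = shape
assert len(values) == rows * cols
nested = [values[r * cols:(r + 1) * cols] for r in range(rows)]
# nested == [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```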

ProviderCapabilities(tasks, supports_batching=False, supports_streaming=False, max_image_size=None) dataclass

Capabilities of an inference provider.

Attributes:

Name Type Description
tasks list[InferenceTask]

List of supported inference tasks.

supports_batching bool

Whether the provider supports batch processing.

supports_streaming bool

Whether the provider supports streaming responses.

max_image_size int | None

Maximum supported image size (optional).

ServerInfo(app_name, app_version, app_description, num_cpus, num_gpus, num_nodes, gpus_used, gpu_to_model, models, models_to_task) dataclass

Information about the inference server.

Attributes:

Name Type Description
app_name str

Application name.

app_version str

Application version string.

app_description str

Application description.

num_cpus int | None

Number of CPUs available (None if unknown).

num_gpus int

Number of GPUs available.

num_nodes int

Number of nodes in the cluster.

gpus_used list[int]

List of GPU indices currently in use.

gpu_to_model dict[str, str]

Mapping of GPU index to model name.

models list[str]

List of loaded model names.

models_to_task dict[str, str]

Mapping of model names to their tasks.

TextImageConditionalGenerationInput(model, prompt, images=None, max_new_tokens=100, temperature=1.0) dataclass

Input for text-image conditional generation.

Attributes:

Name Type Description
model str

Model name to use.

prompt str | list[dict[str, Any]]

Prompt as string or list of message dicts.

images list[str | Path] | None

Optional list of image paths/base64 strings.

max_new_tokens int

Maximum tokens to generate.

temperature float

Sampling temperature.
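The prompt field accepts either a plain string or a list of message dicts. The chat-style message structure below follows the common role/content convention and is an assumption about what the backend expects; the validation helper is hypothetical, shown only to make both forms concrete.

```python
from typing import Any

# Both documented prompt forms.
plain_prompt = "Describe the image."
chat_prompt = [
    {"role": "user", "content": "Describe the image."},  # assumed message shape
]


def is_valid_prompt(prompt: Any) -> bool:
    """Hypothetical check: accept a string or a list of message dicts."""
    if isinstance(prompt, str):
        return True
    return isinstance(prompt, list) and all(isinstance(m, dict) for m in prompt)


assert is_valid_prompt(plain_prompt)
assert is_valid_prompt(chat_prompt)
```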

TextImageConditionalGenerationOutput(generated_text, usage, generation_config=dict()) dataclass

Output for text-image conditional generation.

Attributes:

Name Type Description
generated_text str

The generated text response.

usage UsageInfo

Token usage information.

generation_config dict[str, Any]

Generation configuration used.

TextImageConditionalGenerationResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') dataclass

Complete result of text-image conditional generation.

Attributes:

Name Type Description
data TextImageConditionalGenerationOutput

The output data.

timestamp datetime

When the inference completed.

processing_time float

Time taken in seconds.

metadata dict[str, Any]

Additional metadata from the model.

id str

Unique identifier for the inference request.

status str

Status of the inference ("SUCCESS", "FAILURE").

UsageInfo(prompt_tokens, completion_tokens, total_tokens) dataclass

Token usage information.

Attributes:

Name Type Description
prompt_tokens int

Number of tokens in the prompt.

completion_tokens int

Number of tokens generated.

total_tokens int

Total tokens used.
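The natural reading is that total_tokens equals prompt_tokens plus completion_tokens; the document does not state this invariant explicitly, so the check below is an assumption, with illustrative counts.

```python
# Illustrative usage payload; the additivity check is an assumed invariant.
usage = {"prompt_tokens": 37, "completion_tokens": 81, "total_tokens": 118}
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```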

VideoMaskGenerationInput(video, model, objects_ids, frame_indexes, points=None, labels=None, boxes=None) dataclass

Input for video mask generation.

Attributes:

Name Type Description
video list[str]

List of frame images as base64 or URLs.

model str

Model name to use.

objects_ids list[int]

IDs for each object to track.

frame_indexes list[int]

Frame indices for prompts.

points list[list[list[int]]] | None

Points for mask generation.

labels list[list[int]] | None

Labels for points.

boxes list[list[int]] | None

Bounding boxes.

VideoMaskGenerationOutput(objects_ids, frame_indexes, masks) dataclass

Output for video mask generation.

Attributes:

Name Type Description
objects_ids list[int]

IDs of tracked objects.

frame_indexes list[int]

Frame indices for each mask.

masks list[CompressedRLEData]

Generated masks for each frame.
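The output uses parallel lists: masks[i] belongs to objects_ids[i] at frame_indexes[i]. Grouping masks per tracked object is a typical consumer step, sketched here with placeholder strings standing in for CompressedRLEData instances; the grouping code is illustrative, not library code.

```python
from collections import defaultdict

# Parallel lists as in VideoMaskGenerationOutput; string placeholders
# stand in for the actual CompressedRLEData masks.
objects_ids = [1, 2, 1, 2]
frame_indexes = [0, 0, 1, 1]
masks = ["rle_a", "rle_b", "rle_c", "rle_d"]

# Collect each object's (frame, mask) pairs.
per_object: dict[int, list] = defaultdict(list)
for obj_id, frame, mask in zip(objects_ids, frame_indexes, masks):
    per_object[obj_id].append((frame, mask))

# per_object[1] == [(0, "rle_a"), (1, "rle_c")]
```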

VideoMaskGenerationResult(data, status, timestamp, processing_time, metadata, id='') dataclass

Complete result of video mask generation.

Attributes:

Name Type Description
data VideoMaskGenerationOutput

The output data.

status str

Status of the inference ("SUCCESS", "FAILURE").

timestamp datetime

When the inference completed.

processing_time float

Time taken in seconds.

metadata dict[str, Any]

Additional metadata from the model.

id str

Unique identifier for the inference request.