pixano.inference.types

Pixano's inference types. This module defines Pixano's own types for inference operations, independent of any specific inference backend.
CompressedRLEData(size, counts) (dataclass)

Compressed run-length-encoded (RLE) mask data.
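The `size`/`counts` pair suggests a COCO-style RLE, where `counts` holds alternating background/foreground run lengths over the mask flattened in column-major order. As a minimal sketch, assuming plain integer run lengths that start with background (the compressed LEB128-style string COCO also uses is not handled here):

```python
# Sketch: decode a COCO-style RLE (size=[h, w], counts=run lengths) into a
# binary mask. Assumes integer run lengths, starting with background (0).
def decode_rle(size, counts):
    h, w = size
    flat = []
    value = 0  # runs alternate background/foreground, starting with background
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    assert len(flat) == h * w, "run lengths must cover the full mask"
    # COCO flattens masks column-major: pixel (row, col) lives at flat[col * h + row]
    return [[flat[col * h + row] for col in range(w)] for row in range(h)]

# A 2x3 mask: 2 background pixels, then 3 foreground, then 1 background.
mask = decode_rle([2, 3], [2, 3, 1])
```

The column-major order is the detail most easily gotten wrong when reading RLE masks back into row-major arrays.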
ImageMaskGenerationInput(image, model, image_embedding=None, high_resolution_features=None, reset_predictor=True, points=None, labels=None, boxes=None, num_multimask_outputs=3, multimask_output=True, return_image_embedding=False) (dataclass)

Input for image mask generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| image | `str` | Image as a base64 string or URL. |
| model | `str` | Model name to use. |
| image_embedding | `NDArrayData \| None` | Pre-computed image embedding (optional). |
| high_resolution_features | `list[NDArrayData] \| None` | Pre-computed high-resolution features (optional). |
| reset_predictor | `bool` | Whether to reset the predictor state for a new image. |
| points | `list[list[list[int]]] \| None` | Points for mask generation, shaped `[num_prompts, num_points, 2]`. |
| labels | `list[list[int]] \| None` | Labels for the points, shaped `[num_prompts, num_points]`. |
| boxes | `list[list[int]] \| None` | Bounding boxes, shaped `[num_prompts, 4]`. |
| num_multimask_outputs | `int` | Number of masks to generate per prompt. |
| multimask_output | `bool` | Whether to return multiple masks per prompt. |
| return_image_embedding | `bool` | Whether to return the computed embedding. |
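The `points`, `labels`, and `boxes` shapes above must stay in sync across prompts. A small validation sketch (the `[x_min, y_min, x_max, y_max]` box convention is an assumption, not stated in this reference):

```python
# Sketch: validate that prompt arrays agree on [num_prompts, num_points],
# mirroring the shapes documented for ImageMaskGenerationInput.
def check_prompt_shapes(points, labels, boxes=None):
    if len(points) != len(labels):
        raise ValueError("points and labels must share the same num_prompts")
    for prompt_points, prompt_labels in zip(points, labels):
        if len(prompt_points) != len(prompt_labels):
            raise ValueError("each prompt needs exactly one label per point")
        if any(len(p) != 2 for p in prompt_points):
            raise ValueError("each point must be an [x, y] pair")
    if boxes is not None and any(len(b) != 4 for b in boxes):
        raise ValueError("each box must have 4 coordinates")
    return len(points)

# One prompt with two foreground clicks (label 1) and one box.
num_prompts = check_prompt_shapes(
    points=[[[10, 20], [30, 40]]],
    labels=[[1, 1]],
    boxes=[[5, 5, 50, 50]],
)
```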
ImageMaskGenerationOutput(masks, scores, image_embedding=None, high_resolution_features=None) (dataclass)

Output for image mask generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| masks | `list[list[CompressedRLEData]]` | Generated masks, shaped `[num_prompts, num_masks]`. |
| scores | `NDArrayData` | Confidence scores. |
| image_embedding | `NDArrayData \| None` | Computed image embedding (if requested). |
| high_resolution_features | `list[NDArrayData] \| None` | Computed high-resolution features (if requested). |
ImageMaskGenerationResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') (dataclass)

Complete result of image mask generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| data | `ImageMaskGenerationOutput` | The output data. |
| timestamp | `datetime` | When the inference completed. |
| processing_time | `float` | Time taken, in seconds. |
| metadata | `dict[str, Any]` | Additional metadata from the model. |
| id | `str` | Unique identifier for the inference request. |
| status | `str` | Status of the inference (`"SUCCESS"` or `"FAILURE"`). |
ImageZeroShotDetectionInput(image, model, classes, box_threshold=0.5, text_threshold=0.5) (dataclass)

Input for zero-shot object detection.

Attributes:

| Name | Type | Description |
|---|---|---|
| image | `str` | Image as a base64 string or URL. |
| model | `str` | Model name to use. |
| classes | `list[str] \| str` | Class name or list of class names to detect. |
| box_threshold | `float` | Confidence threshold for boxes. |
| text_threshold | `float` | Confidence threshold for text matching. |
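`box_threshold` acts as a confidence cutoff on detected boxes. How the provider applies it internally is not specified here, but the same filtering can be sketched client-side over parallel `boxes`/`scores`/`classes` lists:

```python
# Sketch: keep only detections whose score clears the box threshold,
# filtering the three parallel lists together.
def filter_detections(boxes, scores, classes, box_threshold=0.5):
    kept = [i for i, score in enumerate(scores) if score >= box_threshold]
    return ([boxes[i] for i in kept],
            [scores[i] for i in kept],
            [classes[i] for i in kept])

boxes, scores, classes = filter_detections(
    boxes=[[0, 0, 10, 10], [5, 5, 20, 20]],
    scores=[0.9, 0.3],
    classes=["cat", "dog"],
    box_threshold=0.5,
)
```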
ImageZeroShotDetectionOutput(boxes, scores, classes) (dataclass)

Output for zero-shot object detection.
ImageZeroShotDetectionResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') (dataclass)

Complete result of zero-shot object detection.

Attributes:

| Name | Type | Description |
|---|---|---|
| data | `ImageZeroShotDetectionOutput` | The output data. |
| timestamp | `datetime` | When the inference completed. |
| processing_time | `float` | Time taken, in seconds. |
| metadata | `dict[str, Any]` | Additional metadata from the model. |
| id | `str` | Unique identifier for the inference request. |
| status | `str` | Status of the inference (`"SUCCESS"` or `"FAILURE"`). |
ModelConfig(name, task, path=None, config=dict(), processor_config=dict()) (dataclass)

Configuration for instantiating a model.

Attributes:

| Name | Type | Description |
|---|---|---|
| name | `str` | Name of the model. |
| task | `str` | Task of the model. |
| path | `Path \| str \| None` | Path to the model dump. |
| config | `dict[str, Any]` | Configuration of the model. |
| processor_config | `dict[str, Any]` | Configuration of the processor. |

to_dict()

Convert to a dictionary.
ModelInfo(name, task, model_path=None, model_class=None, provider=None) (dataclass)

Information about an available model.

Attributes:

| Name | Type | Description |
|---|---|---|
| name | `str` | Name of the model. |
| task | `str` | Task the model can perform. |
| model_path | `str \| None` | Path to the model weights (optional). |
| model_class | `str \| None` | Class name of the model (optional). |
| provider | `str \| None` | Provider backend for the model (optional). |
NDArrayData(values, shape) (dataclass)

N-dimensional array data (values and shape).
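A `(values, shape)` pair like this usually stores the array flattened, with `shape` describing how to rebuild it; that flat, row-major layout is an assumption here. A small round-trip sketch:

```python
# Sketch: rebuild nested lists from flat row-major values plus a shape,
# the layout NDArrayData presumably uses.
from math import prod

def unflatten(values, shape):
    assert len(values) == prod(shape), "values must fill the shape exactly"
    if len(shape) == 1:
        return list(values)
    step = prod(shape[1:])  # elements per slice along the first axis
    return [unflatten(values[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]

nested = unflatten([1, 2, 3, 4, 5, 6], (2, 3))
```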
ProviderCapabilities(tasks, supports_batching=False, supports_streaming=False, max_image_size=None) (dataclass)

What a provider can do.

Attributes:

| Name | Type | Description |
|---|---|---|
| tasks | `list[InferenceTask]` | List of supported inference tasks. |
| supports_batching | `bool` | Whether the provider supports batch processing. |
| supports_streaming | `bool` | Whether the provider supports streaming responses. |
| max_image_size | `int \| None` | Maximum supported image size (optional). |
ServerInfo(app_name, app_version, app_description, num_cpus, num_gpus, num_nodes, gpus_used, gpu_to_model, models, models_to_task) (dataclass)

Information about the inference server.

Attributes:

| Name | Type | Description |
|---|---|---|
| app_name | `str` | Application name. |
| app_version | `str` | Application version string. |
| app_description | `str` | Application description. |
| num_cpus | `int \| None` | Number of CPUs available (`None` if unknown). |
| num_gpus | `int` | Number of GPUs available. |
| num_nodes | `int` | Number of nodes in the cluster. |
| gpus_used | `list[int]` | List of GPU indices currently in use. |
| gpu_to_model | `dict[str, str]` | Mapping of GPU index to model name. |
| models | `list[str]` | List of loaded model names. |
| models_to_task | `dict[str, str]` | Mapping of model names to their tasks. |
TextImageConditionalGenerationInput(model, prompt, images=None, max_new_tokens=100, temperature=1.0) (dataclass)

Input for text-image conditional generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| model | `str` | Model name to use. |
| prompt | `str \| list[dict[str, Any]]` | Prompt as a string or a list of message dicts. |
| images | `list[str \| Path] \| None` | Optional list of image paths or base64 strings. |
| max_new_tokens | `int` | Maximum number of tokens to generate. |
| temperature | `float` | Sampling temperature. |
TextImageConditionalGenerationOutput(generated_text, usage, generation_config=dict()) (dataclass)

Output for text-image conditional generation.
TextImageConditionalGenerationResult(data, timestamp, processing_time, metadata, id='', status='SUCCESS') (dataclass)

Complete result of text-image conditional generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| data | `TextImageConditionalGenerationOutput` | The output data. |
| timestamp | `datetime` | When the inference completed. |
| processing_time | `float` | Time taken, in seconds. |
| metadata | `dict[str, Any]` | Additional metadata from the model. |
| id | `str` | Unique identifier for the inference request. |
| status | `str` | Status of the inference (`"SUCCESS"` or `"FAILURE"`). |
UsageInfo(prompt_tokens, completion_tokens, total_tokens) (dataclass)

Token usage for a generation request.
VideoMaskGenerationInput(video, model, objects_ids, frame_indexes, points=None, labels=None, boxes=None) (dataclass)

Input for video mask generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| video | `list[str]` | List of frame images as base64 strings or URLs. |
| model | `str` | Model name to use. |
| objects_ids | `list[int]` | IDs for each object to track. |
| frame_indexes | `list[int]` | Frame indices for the prompts. |
| points | `list[list[list[int]]] \| None` | Points for mask generation. |
| labels | `list[list[int]] \| None` | Labels for the points. |
| boxes | `list[list[int]] \| None` | Bounding boxes. |
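`objects_ids` and `frame_indexes` appear to be parallel lists, one entry per prompt, so the same object ID may recur across several frames. A sketch of grouping prompts per object under that assumption:

```python
# Sketch: group prompt frames per object, assuming objects_ids and
# frame_indexes are parallel lists (one entry per prompt).
from collections import defaultdict

def prompts_per_object(objects_ids, frame_indexes):
    if len(objects_ids) != len(frame_indexes):
        raise ValueError("objects_ids and frame_indexes must be parallel")
    grouped = defaultdict(list)
    for obj_id, frame in zip(objects_ids, frame_indexes):
        grouped[obj_id].append(frame)
    return dict(grouped)

# Object 1 is prompted on frames 0 and 10; object 2 on frame 0 only.
grouped = prompts_per_object([1, 1, 2], [0, 10, 0])
```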
VideoMaskGenerationOutput(objects_ids, frame_indexes, masks) (dataclass)

Output for video mask generation.
VideoMaskGenerationResult(data, status, timestamp, processing_time, metadata, id='') (dataclass)

Complete result of video mask generation.

Attributes:

| Name | Type | Description |
|---|---|---|
| data | `VideoMaskGenerationOutput` | The output data. |
| status | `str` | Status of the inference (`"SUCCESS"` or `"FAILURE"`). |
| timestamp | `datetime` | When the inference completed. |
| processing_time | `float` | Time taken, in seconds. |
| metadata | `dict[str, Any]` | Additional metadata from the model. |
| id | `str` | Unique identifier for the inference request. |