Labelbox Python API reference¶
Client¶
- class labelbox.client.Client(api_key=None, endpoint='https://api.labelbox.com/graphql', enable_experimental=False, app_url='https://app.labelbox.com', rest_endpoint='https://api.labelbox.com/api/v1')[source]¶
Bases:
object
A Labelbox client.
Contains info necessary for connecting to a Labelbox server (URL, authentication key). Provides functions for querying and creating top-level data objects (Projects, Datasets).
- __init__(api_key=None, endpoint='https://api.labelbox.com/graphql', enable_experimental=False, app_url='https://app.labelbox.com', rest_endpoint='https://api.labelbox.com/api/v1')[source]¶
Creates and initializes a Labelbox Client.
Logging defaults to level WARNING. To receive more verbose output to the console, update the logging level accordingly.
>>> logging.basicConfig(level=logging.INFO)
>>> client = Client("<APIKEY>")
- Parameters:
api_key (str) – API key. If None, the key is obtained from the “LABELBOX_API_KEY” environment variable.
endpoint (str) – URL of the Labelbox server to connect to.
enable_experimental (bool) – Indicates whether or not to use experimental features.
app_url (str) – Host URL for all links to the web app.
- Raises:
labelbox.exceptions.AuthenticationError – If no api_key is provided as an argument or via the environment variable.
- assign_global_keys_to_data_rows(global_key_to_data_row_inputs: List[Dict[str, str]], timeout_seconds=60) Dict[str, Union[str, List[Any]]] [source]¶
Assigns global keys to data rows.
- Parameters:
global_key_to_data_row_inputs – A list of dicts containing data_row_id and global_key.
- Returns:
Dictionary containing ‘status’, ‘results’ and ‘errors’.
‘Status’ contains the outcome of this job. It can be one of ‘Success’, ‘Partial Success’, or ‘Failure’.
‘Results’ contains the successful global_key assignments, including global_keys that have been sanitized to Labelbox standards.
‘Errors’ contains global_key assignments that failed, along with the reasons for failure.
Examples
>>> global_key_data_row_inputs = [
>>>     {"data_row_id": "cl7asgri20yvo075b4vtfedjb", "global_key": "key1"},
>>>     {"data_row_id": "cl7asgri10yvg075b4pz176ht", "global_key": "key2"},
>>> ]
>>> job_result = client.assign_global_keys_to_data_rows(global_key_data_row_inputs)
>>> print(job_result['status'])
Partial Success
>>> print(job_result['results'])
[{'data_row_id': 'cl7tv9wry00hlka6gai588ozv', 'global_key': 'gk', 'sanitized': False}]
>>> print(job_result['errors'])
[{'data_row_id': 'cl7tpjzw30031ka6g4evqdfoy', 'global_key': 'gk"', 'error': 'Invalid global key'}]
- static build_catalog_query(data_rows: Union[UniqueIds, GlobalKeys])[source]¶
Given a list of data rows, builds a query that can be used to fetch the associated data rows from the catalog.
- Parameters:
data_rows – A list of data rows. Can be either UniqueIds or GlobalKeys.
Returns: A query that can be used to fetch the associated data rows from the catalog.
- clear_global_keys(global_keys: List[str], timeout_seconds=60) Dict[str, Union[str, List[Any]]] [source]¶
Clears global keys for the data rows that correspond to the global keys provided.
- Parameters:
global_keys – A list of global keys.
- Returns:
Dictionary containing ‘status’, ‘results’ and ‘errors’.
‘Status’ contains the outcome of this job. It can be one of ‘Success’, ‘Partial Success’, or ‘Failure’.
‘Results’ contains a list of global keys that were successfully cleared.
‘Errors’ contains a list of global_keys corresponding to data rows that could not be modified, could not be accessed by the user, or were not found.
Examples
>>> job_result = client.clear_global_keys(["key1","key2","notfoundkey"])
>>> print(job_result['status'])
Partial Success
>>> print(job_result['results'])
['key1', 'key2']
>>> print(job_result['errors'])
[{'global_key': 'notfoundkey', 'error': 'Failed to find data row matching provided global key'}]
- create_dataset(iam_integration='DEFAULT', **kwargs) Dataset [source]¶
Creates a Dataset object on the server.
Attribute values are passed as keyword arguments.
- Parameters:
iam_integration (IAMIntegration) – Uses the default integration if not specified. Optionally specify another integration, or set to None to not use delegated access.
**kwargs – Keyword arguments with Dataset attribute values.
- Returns:
A new Dataset object.
- Raises:
InvalidAttributeError – If the Dataset type does not contain any of the attribute names given in kwargs.
Examples
Create a dataset
>>> dataset = client.create_dataset(name="<dataset_name>")
Create a dataset with a description
>>> dataset = client.create_dataset(name="<dataset_name>", description="<dataset_description>")
- create_feature_schema(normalized)[source]¶
- Creates a feature schema from normalized data.
>>> normalized = {'tool': 'polygon', 'name': 'cat', 'color': 'black'}
>>> feature_schema = client.create_feature_schema(normalized)
- Or use the Tool or Classification objects. It is especially useful for complex tools.
>>> normalized = Tool(tool=Tool.Type.BBOX, name="cat", color='black').asdict()
>>> feature_schema = client.create_feature_schema(normalized)
- Subclasses are also supported
>>> normalized = Tool(
>>>     tool=Tool.Type.SEGMENTATION,
>>>     name="cat",
>>>     classifications=[
>>>         Classification(class_type=Classification.Type.TEXT, name="name")
>>>     ]
>>> )
>>> feature_schema = client.create_feature_schema(normalized)
- More details can be found here:
https://github.com/Labelbox/labelbox-python/blob/develop/examples/basics/ontologies.ipynb
- Parameters:
normalized (dict) – A normalized tool or classification payload. See above for details
- Returns:
The created FeatureSchema.
- create_model(name, ontology_id) Model [source]¶
Creates a Model object on the server.
>>> model = client.create_model(<model_name>, <ontology_id>)
- Parameters:
name (string) – Name of the model
ontology_id (string) – ID of the related ontology
- Returns:
A new Model object.
- Raises:
InvalidAttributeError – If the Model type does not contain any of the attribute names given in kwargs.
- create_ontology(name, normalized, media_type=None) Ontology [source]¶
- Creates an ontology from normalized data
>>> normalized = {"tools" : [{'tool': 'polygon', 'name': 'cat', 'color': 'black'}], "classifications" : []} >>> ontology = client.create_ontology("ontology-name", normalized)
- Or use the ontology builder. It is especially useful for complex ontologies
>>> normalized = OntologyBuilder(tools=[Tool(tool=Tool.Type.BBOX, name="cat", color='black')]).asdict()
>>> ontology = client.create_ontology("ontology-name", normalized)
To reuse existing feature schemas, use create_ontology_from_feature_schemas(). More details can be found here: https://github.com/Labelbox/labelbox-python/blob/develop/examples/basics/ontologies.ipynb
- Parameters:
name (str) – Name of the ontology
normalized (dict) – A normalized ontology payload. See above for details.
media_type (MediaType or None) – Media type of a new ontology
- Returns:
The created Ontology
- create_ontology_from_feature_schemas(name, feature_schema_ids, media_type=None) Ontology [source]¶
Creates an ontology from a list of feature schema ids
- Parameters:
name (str) – Name of the ontology
feature_schema_ids (List[str]) – List of feature schema ids corresponding to top level tools and classifications to include in the ontology
media_type (MediaType or None) – Media type of a new ontology
- Returns:
The created Ontology
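Example
A minimal usage sketch; the feature schema ids are placeholders:
>>> feature_schema_ids = ["<feature_schema_id_1>", "<feature_schema_id_2>"]
>>> ontology = client.create_ontology_from_feature_schemas("ontology-name", feature_schema_ids)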
- create_project(**kwargs) Project [source]¶
Creates a Project object on the server.
Attribute values are passed as keyword arguments.
>>> project = client.create_project(
>>>     name="<project_name>",
>>>     description="<project_description>",
>>>     media_type=MediaType.Image,
>>>     queue_mode=QueueMode.Batch
>>> )
- Parameters:
name (str) – A name for the project
description (str) – A short summary for the project
media_type (MediaType) – The type of assets that this project will accept
queue_mode (Optional[QueueMode]) – The queue mode to use
quality_mode (Optional[QualityMode]) – The quality mode to use (e.g. Benchmark, Consensus). Defaults to Benchmark
- Returns:
A new Project object.
- Raises:
InvalidAttributeError – If the Project type does not contain any of the attribute names given in kwargs.
- delete_feature_schema_from_ontology(ontology_id: str, feature_schema_id: str) DeleteFeatureFromOntologyResult [source]¶
Deletes or archives a feature schema from an ontology. If the feature schema is a root level node with associated labels, it will be archived. If the feature schema is a nested node in the ontology and does not have associated labels, it will be deleted. If the feature schema is a nested node in the ontology and has associated labels, it will not be deleted.
- Parameters:
ontology_id (str) – The ID of the ontology.
feature_schema_id (str) – The ID of the feature schema.
- Returns:
The result of the feature schema removal.
- Return type:
DeleteFeatureFromOntologyResult
Example
>>> client.delete_feature_schema_from_ontology(<ontology_id>, <feature_schema_id>)
- delete_unused_feature_schema(feature_schema_id: str) None [source]¶
Deletes a feature schema if it is not used by any ontologies or annotations.
- Parameters:
feature_schema_id (str) – The id of the feature schema to delete
Example
>>> client.delete_unused_feature_schema("cleabc1my012ioqvu5anyaabc")
- delete_unused_ontology(ontology_id: str) None [source]¶
Deletes an ontology if it is not used by any annotations.
- Parameters:
ontology_id (str) – The id of the ontology to delete
Example
>>> client.delete_unused_ontology("cleabc1my012ioqvu5anyaabc")
- execute(query=None, params=None, data=None, files=None, timeout=60.0, experimental=False, error_log_key='message')[source]¶
Sends a request to the server for the execution of the given query.
Checks the response for errors and wraps errors in appropriate labelbox.exceptions.LabelboxError subtypes.
- Parameters:
query (str) – The query to execute.
params (dict) – Query parameters referenced within the query.
data (str) – JSON string containing the query to execute
files (dict) – File arguments for the request
timeout (float) – Max allowed time for query execution, in seconds.
- Returns:
dict, parsed JSON response.
- Raises:
labelbox.exceptions.AuthenticationError – If authentication failed.
labelbox.exceptions.InvalidQueryError – If query is not syntactically or semantically valid (checked server-side).
labelbox.exceptions.ApiLimitError – If the server API limit was exceeded. See “How to import data” in the online documentation to see API limits.
labelbox.exceptions.TimeoutError – If response was not received in timeout seconds.
labelbox.exceptions.NetworkError – If an unknown error occurred, most likely due to connection issues.
labelbox.exceptions.LabelboxError – If an unknown error of any kind occurred.
ValueError – If query and data are both None.
- get_catalog_slice(slice_id) CatalogSlice [source]¶
Fetches a Catalog Slice by ID.
- Parameters:
slice_id (str) – The ID of the Slice
- Returns:
CatalogSlice
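Example
A short sketch; the slice id is a placeholder and the printed attribute is illustrative:
>>> catalog_slice = client.get_catalog_slice("<slice_id>")
>>> print(catalog_slice.name)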
- get_data_row(data_row_id)[source]¶
- Returns:
A single data row for the given data row id.
- Return type:
DataRow
- get_data_row_by_global_key(global_key: str) DataRow [source]¶
Returns: DataRow – a single data row for the given global key
- get_data_row_ids_for_external_ids(external_ids: List[str]) Dict[str, List[str]] [source]¶
Returns a list of data row ids for a list of external ids. There is a max of 1500 items returned at a time.
- Parameters:
external_ids – List of external ids to fetch data row ids for
- Returns:
A dict of external ids as keys and values as a list of data row ids that correspond to that external id.
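Example
A minimal sketch; the external ids and output are placeholders:
>>> result = client.get_data_row_ids_for_external_ids(["file1.jpg", "file2.jpg"])
>>> print(result["file1.jpg"])
['<data_row_id>']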
- get_data_row_ids_for_global_keys(global_keys: List[str], timeout_seconds=60) Dict[str, Union[str, List[Any]]] [source]¶
Gets data row ids for a list of global keys.
Deprecation Notice: This function will soon no longer return ‘Deleted Data Rows’ as part of the ‘results’. Global keys for deleted data rows will soon be placed under the ‘Data Row not found’ portion.
- Parameters:
global_keys – A list of global keys.
- Returns:
Dictionary containing ‘status’, ‘results’ and ‘errors’.
‘Status’ contains the outcome of this job. It can be one of ‘Success’, ‘Partial Success’, or ‘Failure’.
‘Results’ contains a list of the fetched corresponding data row ids, in the input order. For data rows that cannot be fetched due to an error, or that do not exist, an empty string is returned at the position of the respective global_key. More error information can be found in the ‘Errors’ section.
‘Errors’ contains a list of global_keys that could not be fetched, along with the failure reason.
Examples
>>> job_result = client.get_data_row_ids_for_global_keys(["key1","key2"])
>>> print(job_result['status'])
Partial Success
>>> print(job_result['results'])
['cl7tv9wry00hlka6gai588ozv', 'cl7tv9wxg00hpka6gf8sh81bj']
>>> print(job_result['errors'])
[{'global_key': 'asdf', 'error': 'Data Row not found'}]
- get_data_row_metadata_ontology() DataRowMetadataOntology [source]¶
- Returns:
The ontology for Data Row Metadata for an organization
- Return type:
DataRowMetadataOntology
- get_dataset(dataset_id) Dataset [source]¶
Gets a single Dataset with the given ID.
>>> dataset = client.get_dataset("<dataset_id>")
- Parameters:
dataset_id (str) – Unique ID of the Dataset.
- Returns:
The sought Dataset.
- Raises:
labelbox.exceptions.ResourceNotFoundError – If there is no Dataset with the given ID.
- get_datasets(where=None) PaginatedCollection [source]¶
Fetches one or more datasets.
>>> datasets = client.get_datasets(where=(Dataset.name == "<dataset_name>") & (Dataset.description == "<dataset_description>"))
- Parameters:
where (Comparison, LogicalOperation or None) – The where clause for filtering.
- Returns:
PaginatedCollection of all datasets the user has access to or datasets matching the criteria specified.
- get_feature_schema(feature_schema_id)[source]¶
Fetches a feature schema. Only supports top level feature schemas.
- Parameters:
feature_schema_id (str) – The id of the feature schema to query for
- Returns:
FeatureSchema
- get_feature_schemas(name_contains) PaginatedCollection [source]¶
Fetches top level feature schemas with names that match the name_contains string
- Parameters:
name_contains (str) – Search filter for the name of a root feature schema. If present, performs a case-insensitive ‘like’ search for feature schemas. If None, returns all top-level feature schemas.
- Returns:
PaginatedCollection of FeatureSchemas with names that match name_contains
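Example
An illustrative sketch; the search string is a placeholder and the printed attribute is assumed from the DbObject base:
>>> feature_schemas = client.get_feature_schemas("cat")
>>> for feature_schema in feature_schemas:
>>>     print(feature_schema.uid)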
- get_labeling_frontends(where=None) List[LabelingFrontend] [source]¶
Fetches all the labeling frontends.
>>> frontend = client.get_labeling_frontends(where=LabelingFrontend.name == "Editor")
- Parameters:
where (Comparison, LogicalOperation or None) – The where clause for filtering.
- Returns:
An iterable of LabelingFrontends (typically a PaginatedCollection).
- get_model(model_id) Model [source]¶
Gets a single Model with the given ID.
>>> model = client.get_model("<model_id>")
- Parameters:
model_id (str) – Unique ID of the Model.
- Returns:
The sought Model.
- Raises:
labelbox.exceptions.ResourceNotFoundError – If there is no Model with the given ID.
- get_model_run(model_run_id: str) ModelRun [source]¶
Gets a single ModelRun with the given ID.
>>> model_run = client.get_model_run("<model_run_id>")
- Parameters:
model_run_id (str) – Unique ID of the ModelRun.
- Returns:
A ModelRun object.
- get_model_slice(slice_id) ModelSlice [source]¶
Fetches a Model Slice by ID.
- Parameters:
slice_id (str) – The ID of the Slice
- Returns:
ModelSlice
- get_models(where=None) List[Model] [source]¶
Fetches all the models the user has access to.
>>> models = client.get_models(where=(Model.name == "<model_name>"))
- Parameters:
where (Comparison, LogicalOperation or None) – The where clause for filtering.
- Returns:
An iterable of Models (typically a PaginatedCollection).
- get_ontologies(name_contains) PaginatedCollection [source]¶
Fetches all ontologies with names that match the name_contains string.
- Parameters:
name_contains (str) – the string to search ontology names by
- Returns:
PaginatedCollection of Ontologies with names that match name_contains
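Example
A short sketch; the name filter is a placeholder:
>>> ontologies = client.get_ontologies("<ontology_name>")
>>> for ontology in ontologies:
>>>     print(ontology.name)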
- get_ontology(ontology_id) Ontology [source]¶
Fetches an Ontology by id.
- Parameters:
ontology_id (str) – The id of the ontology to query for
- Returns:
Ontology
- get_organization() Organization [source]¶
Gets the Organization DB object of the current user.
>>> organization = client.get_organization()
- get_project(project_id) Project [source]¶
Gets a single Project with the given ID.
>>> project = client.get_project("<project_id>")
- Parameters:
project_id (str) – Unique ID of the Project.
- Returns:
The sought Project.
- Raises:
labelbox.exceptions.ResourceNotFoundError – If there is no Project with the given ID.
- get_projects(where=None) PaginatedCollection [source]¶
Fetches all the projects the user has access to.
>>> projects = client.get_projects(where=(Project.name == "<project_name>") & (Project.description == "<project_description>"))
- Parameters:
where (Comparison, LogicalOperation or None) – The where clause for filtering.
- Returns:
PaginatedCollection of all projects the user has access to or projects matching the criteria specified.
- get_roles() List[Role] [source]¶
- Returns:
Provides information on available roles within an organization. Roles are used for user management.
- Return type:
Roles
- get_unused_feature_schemas(after: Optional[str] = None) List[str] [source]¶
Returns a list of unused feature schema ids.
- Parameters:
after (str) – The cursor to use for pagination
- Returns:
A list of unused feature schema ids
Example
To get the first page of unused feature schema ids (100 at a time)
>>> client.get_unused_feature_schemas()
To get the next page of unused feature schema ids
>>> client.get_unused_feature_schemas("cleabc1my012ioqvu5anyaabc")
- get_unused_ontologies(after: Optional[str] = None) List[str] [source]¶
Returns a list of unused ontology ids.
- Parameters:
after (str) – The cursor to use for pagination
- Returns:
A list of unused ontology ids
Example
To get the first page of unused ontology ids (100 at a time)
>>> client.get_unused_ontologies()
To get the next page of unused ontology ids
>>> client.get_unused_ontologies("cleabc1my012ioqvu5anyaabc")
- insert_feature_schema_into_ontology(feature_schema_id: str, ontology_id: str, position: int) None [source]¶
Inserts a feature schema into an ontology. If the feature schema is already in the ontology, it will be moved to the new position.
- Parameters:
feature_schema_id (str) – The feature schema id to upsert
ontology_id (str) – The id of the ontology to insert the feature schema into
position (int) – The position number of the feature schema in the ontology
Example
>>> client.insert_feature_schema_into_ontology("cleabc1my012ioqvu5anyaabc", "clefdvwl7abcgefgu3lyvcde", 2)
- is_feature_schema_archived(ontology_id: str, feature_schema_id: str) bool [source]¶
Returns true if a feature schema is archived in the specified ontology, returns false otherwise.
- Parameters:
feature_schema_id (str) – The ID of the feature schema
ontology_id (str) – The ID of the ontology
- Returns:
bool
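Example
A minimal sketch; the ids are placeholders and the printed output is illustrative:
>>> archived = client.is_feature_schema_archived("<ontology_id>", "<feature_schema_id>")
>>> print(archived)
False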
- run_foundry_app(model_run_name: str, data_rows: Union[UniqueIds, GlobalKeys], app_id: str) Task [source]¶
Run a foundry app
- Parameters:
model_run_name (str) – Name of a new model run to store app predictions in
data_rows (DataRowIds or GlobalKeys) – Data row identifiers to run predictions on
app_id (str) – Foundry app to run predictions with
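Example
A usage sketch; the ids are placeholders, and the UniqueIds import path is an assumption that may vary by SDK version:
>>> from labelbox.schema.identifiables import UniqueIds  # import path may vary by SDK version
>>> task = client.run_foundry_app(
>>>     model_run_name="<model_run_name>",
>>>     data_rows=UniqueIds(["<data_row_id>"]),
>>>     app_id="<foundry_app_id>")
>>> task.wait_till_done()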
- send_to_annotate_from_catalog(destination_project_id: str, task_queue_id: Optional[str], batch_name: str, data_rows: Union[UniqueIds, GlobalKeys], params: SendToAnnotateFromCatalogParams)[source]¶
Sends data rows from catalog to a specified project for annotation.
- Example usage:
>>> task = client.send_to_annotate_from_catalog(
>>>     destination_project_id=DESTINATION_PROJECT_ID,
>>>     task_queue_id=TASK_QUEUE_ID,
>>>     batch_name="batch_name",
>>>     data_rows=UniqueIds([DATA_ROW_ID]),
>>>     params={
>>>         "source_project_id": SOURCE_PROJECT_ID,
>>>         "override_existing_annotations_rule": ConflictResolutionStrategy.OverrideWithAnnotations
>>>     })
>>> task.wait_till_done()
- Parameters:
destination_project_id – The ID of the project to send the data rows to.
task_queue_id – The ID of the task queue to send the data rows to. If not specified, the data rows will be sent to the Done workflow state.
batch_name – The name of the batch to create. If more than one batch is created, additional batches will be named with a monotonically increasing numerical suffix, starting at “_1”.
data_rows – The data rows to send to the project.
params – Additional parameters to configure the job. See SendToAnnotateFromCatalogParams for more details.
Returns: The created task for this operation.
- unarchive_feature_schema_node(ontology_id: str, root_feature_schema_id: str) None [source]¶
Unarchives a feature schema node in an ontology. Only root level feature schema nodes can be unarchived.
- Parameters:
ontology_id (str) – The ID of the ontology
root_feature_schema_id (str) – The ID of the root level feature schema
- Returns:
None
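Example
A minimal sketch; the ids are placeholders:
>>> client.unarchive_feature_schema_node("<ontology_id>", "<root_feature_schema_id>")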
- update_feature_schema_title(feature_schema_id: str, title: str) FeatureSchema [source]¶
Updates the title of a feature schema.
- Parameters:
feature_schema_id (str) – The id of the feature schema to update
title (str) – The new title of the feature schema
- Returns:
The updated feature schema
Example
>>> client.update_feature_schema_title("cleabc1my012ioqvu5anyaabc", "New Title")
- upsert_feature_schema(feature_schema: Dict) FeatureSchema [source]¶
Upserts a feature schema.
- Parameters:
feature_schema (Dict) – Dict representing the feature schema to upsert
- Returns:
The upserted feature schema
Example
Insert a new feature schema
>>> tool = Tool(name="tool", tool=Tool.Type.BOUNDING_BOX, color="#FF0000")
>>> client.upsert_feature_schema(tool.asdict())
Update an existing feature schema
>>> tool = Tool(feature_schema_id="cleabc1my012ioqvu5anyaabc", name="tool", tool=Tool.Type.BOUNDING_BOX, color="#FF0000")
>>> client.upsert_feature_schema(tool.asdict())
AssetAttachment¶
- class labelbox.schema.asset_attachment.AssetAttachment(client, field_values)[source]¶
Bases:
DbObject
Asset attachment provides extra context about an asset while labeling.
- attachment_type¶
IMAGE, VIDEO, IMAGE_OVERLAY, HTML, RAW_TEXT, TEXT_URL, or PDF_URL. TEXT attachment type is deprecated.
- Type:
str
- attachment_value¶
URL to an external file or a string of text
- Type:
str
- attachment_name¶
The name of the attachment
- Type:
str
Benchmark¶
- class labelbox.schema.benchmark.Benchmark(client, field_values)[source]¶
Bases:
DbObject
Represents a benchmark label.
The Benchmarks tool works by interspersing data to be labeled, for which there is a benchmark label, among the data given to each person labeling. These labeled data are compared against their respective benchmark and an accuracy score between 0 and 100 percent is calculated.
- created_at¶
- Type:
datetime
- last_activity¶
- Type:
datetime
- average_agreement¶
- Type:
float
- completed_count¶
- Type:
int
- created_by¶
ToOne relationship to User
- Type:
Relationship
- reference_label¶
ToOne relationship to Label
- Type:
Relationship
BulkImportRequest¶
- class labelbox.schema.bulk_import_request.BulkImportRequest(client, field_values)[source]¶
Bases:
DbObject
Represents the import job when importing annotations.
- name¶
- Type:
str
- state¶
FAILED, RUNNING, or FINISHED (Refers to the whole import job)
- Type:
Enum
- input_file_url¶
URL to your web-hosted NDJSON file
- Type:
str
- error_file_url¶
NDJSON that contains error messages for failed annotations
- Type:
str
- status_file_url¶
NDJSON that contains status for each annotation
- Type:
str
- created_at¶
UTC timestamp for date BulkImportRequest was created
- Type:
datetime
- project¶
ToOne relationship to Project
- Type:
Relationship
- created_by¶
ToOne relationship to User
- Type:
Relationship
- delete() None [source]¶
Deletes the import job and also any annotations created by this import.
- Returns:
None
- property errors: List[Dict[str, Any]]¶
Errors for each individual annotation uploaded. This is a subset of statuses
- Returns:
List of dicts containing error messages. An empty list means there were no errors. See BulkImportRequest.statuses for more details.
This information will expire after 24 hours.
- property inputs: List[Dict[str, Any]]¶
Inputs for each individual annotation uploaded. This should match the ndjson annotations that you have uploaded.
- Returns:
Uploaded ndjson.
This information will expire after 24 hours.
- property statuses: List[Dict[str, Any]]¶
Status for each individual annotation uploaded.
- Returns:
A status for each annotation if the upload is done running. The fields of each status row are:
uuid – Specifies the annotation for the status row.
dataRow – JSON object containing the Labelbox data row ID for the annotation.
status – Indicates SUCCESS or FAILURE.
errors – An array of error messages included when status is FAILURE. Each error has a name, a message, and an optional additional_info (the key might not exist).
This information will expire after 24 hours.
- wait_until_done(sleep_time_seconds: int = 5) None [source]¶
Blocks import job until certain conditions are met.
Blocks until the BulkImportRequest.state changes either to BulkImportRequestState.FINISHED or BulkImportRequestState.FAILED, periodically refreshing object’s state.
- Parameters:
sleep_time_seconds (int) – Time to block between subsequent API calls, in seconds.
DataRow¶
- class labelbox.schema.data_row.DataRow(*args, **kwargs)[source]¶
Bases:
DbObject, Updateable, BulkDeletable
Internal Labelbox representation of a single piece of data (e.g. image, video, text).
- external_id¶
User-generated file name or identifier
- Type:
str
- global_key¶
User-generated globally unique identifier
- Type:
str
- row_data¶
Paths to local files are uploaded to Labelbox’s server. Otherwise, it’s treated as an external URL.
- Type:
str
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- media_attributes¶
generated media attributes for the data row
- Type:
dict
- metadata_fields¶
metadata associated with the data row
- Type:
list
- metadata¶
Metadata associated with the data row, as a list of DataRowMetadataField. When importing Data Rows with metadata, use metadata_fields instead.
- Type:
list
- dataset¶
ToOne relationship to Dataset
- Type:
Relationship
- created_by¶
ToOne relationship to User
- Type:
Relationship
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- labels¶
ToMany relationship to Label
- Type:
Relationship
- attachments¶
- Type:
Relationship
- static bulk_delete(data_rows) None [source]¶
Deletes all the given DataRows.
- Parameters:
data_rows (list of DataRow) – The DataRows to delete.
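Example
A minimal sketch, assuming you have already selected which data rows to remove; the dataset id is a placeholder:
>>> dataset = client.get_dataset("<dataset_id>")
>>> data_rows = list(dataset.data_rows())
>>> DataRow.bulk_delete(data_rows[:10])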
- create_attachment(attachment_type, attachment_value, attachment_name=None) AssetAttachment [source]¶
- Adds an AssetAttachment to a DataRow.
Labelers can view these attachments while labeling.
>>> datarow.create_attachment("TEXT", "This is a text message")
- Parameters:
attachment_type (str) – Asset attachment type, must be one of: VIDEO, IMAGE, TEXT, IMAGE_OVERLAY (AssetAttachment.AttachmentType)
attachment_value (str) – Asset attachment value.
attachment_name (str) – (Optional) Asset attachment name.
- Returns:
AssetAttachment DB object.
- Raises:
ValueError – If attachment_type is not one of the supported types.
- export(data_rows: Optional[List[Union[str, DataRow]]] = None, global_keys: Optional[List[str]] = None, task_name: Optional[str] = None, params: Optional[CatalogExportParams] = None) ExportTask [source]¶
Creates a data rows export task with the given list and params, and returns the task.
- Parameters:
data_rows (list of DataRow or str) – List of data row objects or data row ids to export
task_name (str) – Name of the remote task
params (CatalogExportParams) – Export params
>>> dataset = client.get_dataset(DATASET_ID)
>>> task = DataRow.export(
>>>     data_rows=[data_row.uid for data_row in dataset.data_rows.list()],
>>>     # or a list of DataRow objects: data_rows = dataset.data_rows.list()
>>>     # or a list of global keys: global_keys=["global_key_1", "global_key_2"]
>>>     # Note that exactly one of the data_rows or global_keys parameters can be passed in at a time,
>>>     # and if data row ids are present, global keys will be ignored
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
- static export_v2(client: Client, data_rows: Optional[List[Union[str, DataRow]]] = None, global_keys: Optional[List[str]] = None, task_name: Optional[str] = None, params: Optional[CatalogExportParams] = None) Task [source]¶
Creates a data rows export task with the given list and params, and returns the task.
- Parameters:
client (Client) – Client to use to make the export request
data_rows (list of DataRow or str) – List of data row objects or data row ids to export
task_name (str) – Name of the remote task
params (CatalogExportParams) – Export params
>>> dataset = client.get_dataset(DATASET_ID)
>>> task = DataRow.export_v2(
>>>     data_rows=[data_row.uid for data_row in dataset.data_rows.list()],
>>>     # or a list of DataRow objects: data_rows = dataset.data_rows.list()
>>>     # or a list of global keys: global_keys=["global_key_1", "global_key_2"]
>>>     # Note that exactly one of the data_rows or global_keys parameters can be passed in at a time,
>>>     # and if data row ids are present, global keys will be ignored
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
- get_winning_label_id(project_id: str) Optional[str] [source]¶
- Retrieves the winning label ID, i.e. the one that was marked as the best for a particular data row, in a project’s workflow.
- Parameters:
project_id (str) – ID of the project containing the data row
- update(**kwargs)[source]¶
Updates this DB object with new values. Values should be passed as key-value arguments with field names as keys:
>>> db_object.update(name="New name", title="A title")
- Kwargs:
Key-value arguments defining which fields should be updated for which values. Keys must be field names in this DB object’s type.
- Raises:
InvalidAttributeError – if there exists a key in kwargs that’s not a field in this object type.
Dataset¶
- class labelbox.schema.dataset.Dataset(client, field_values)[source]¶
Bases:
DbObject, Updateable, Deletable
A Dataset is a collection of DataRows.
- name¶
- Type:
str
- description¶
- Type:
str
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- row_count¶
The number of rows in the dataset. This value is cached; fetch the dataset again to update it.
- Type:
int
- created_by¶
ToOne relationship to User
- Type:
Relationship
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- create_data_row(items=None, **kwargs) DataRow [source]¶
Creates a single DataRow belonging to this dataset.
>>> dataset.create_data_row(row_data="http://my_site.com/photos/img_01.jpg")
- Parameters:
items – Dictionary containing new DataRow data. At a minimum, must contain row_data or DataRow.row_data.
**kwargs – Key-value arguments containing new DataRow data. At a minimum, must contain row_data.
- Raises:
InvalidQueryError – If both dictionary and kwargs are provided as inputs
InvalidQueryError – If DataRow.row_data field value is not provided in kwargs.
InvalidAttributeError – in case the DB object type does not contain any of the field names given in kwargs.
- create_data_rows(items) Task [source]¶
Asynchronously bulk upload data rows
Use this instead of Dataset.create_data_rows_sync for batches that contain more than 1000 data rows.
- Parameters:
items (iterable of (dict or str)) – See the docstring for Dataset._create_descriptor_file for more information
- Returns:
Task representing the data import on the server side. The Task can be used for inspecting task progress and waiting until it’s done.
- Raises:
InvalidQueryError – If the items parameter does not conform to the specification above or if the server did not accept the DataRow creation request (unknown reason).
ResourceNotFoundError – If unable to retrieve the Task for the import process. This could imply that the import failed.
InvalidAttributeError – If there are fields in items not valid for a DataRow.
ValueError – When the upload parameters are invalid
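Example
A minimal sketch; the URLs and global keys are placeholders:
>>> task = dataset.create_data_rows([
>>>     {"row_data": "http://my_site.com/photos/img_01.jpg", "global_key": "key1"},
>>>     {"row_data": "http://my_site.com/photos/img_02.jpg", "global_key": "key2"}])
>>> task.wait_till_done()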
- create_data_rows_sync(items) None [source]¶
Synchronously bulk upload data rows.
Use this instead of Dataset.create_data_rows for smaller batches of data rows that need to be uploaded quickly. Cannot use this for uploads containing more than 1000 data rows. Each data row is also limited to 5 attachments.
- Parameters:
items (iterable of (dict or str)) – See the docstring for Dataset._create_descriptor_file for more information.
- Returns:
None. If the function doesn’t raise an exception then the import was successful.
- Raises:
InvalidQueryError – If the items parameter does not conform to the specification in Dataset._create_descriptor_file or if the server did not accept the DataRow creation request (unknown reason).
InvalidAttributeError – If there are fields in items not valid for a DataRow.
ValueError – When the upload parameters are invalid
- data_row_for_external_id(external_id) DataRow [source]¶
Convenience method for getting a single DataRow belonging to this Dataset that has the given external_id.
- Parameters:
external_id (str) – External ID of the sought DataRow.
- Returns:
A single DataRow with the given ID.
- Raises:
labelbox.exceptions.ResourceNotFoundError – If there is no DataRow in this Dataset with the given external ID, or if there are multiple DataRows for it.
- data_rows(from_cursor: Optional[str] = None, where: Optional[Comparison] = None) PaginatedCollection [source]¶
Custom method to paginate data_rows via cursor.
- Parameters:
from_cursor (str) – Cursor (data row id) to start from. If None, starts from the beginning.
where (dict(str,str)) – Filter to apply to data rows, where the key is a data row column name and the value is the value to filter on, e.g. {'external_id': 'my_external_id'} to get a data row with external_id = 'my_external_id'.
Note
Order of retrieval is newest data row first. Deleted data rows are not retrieved. Failed data rows are not retrieved. Data rows in progress may be retrieved.
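Example
A short iteration sketch; the printed attribute is illustrative:
>>> for data_row in dataset.data_rows():
>>>     print(data_row.uid)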
- data_rows_for_external_id(external_id, limit=10) List[DataRow] [source]¶
Convenience method for getting multiple DataRows belonging to this Dataset that have the given external_id.
- Parameters:
external_id (str) – External ID of the sought DataRow.
limit (int) – The maximum number of data rows to return for the given external_id
- Returns:
A list of DataRows with the given external ID.
- Raises:
labelbox.exceptions.ResourceNotFoundError – If there is no DataRow in this Dataset with the given external ID, or if there are multiple DataRows for it.
- export(task_name: Optional[str] = None, filters: Optional[DatasetExportFilters] = None, params: Optional[CatalogExportParams] = None) ExportTask [source]¶
Creates a dataset export task with the given params and returns the task.
>>> dataset = client.get_dataset(DATASET_ID)
>>> task = dataset.export(
>>>     filters={
>>>         "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...]  # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>     },
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
- export_data_rows(timeout_seconds=120, include_metadata: bool = False) Generator [source]¶
Returns a generator that produces all data rows that are currently attached to this dataset.
Note: For efficiency, the data are cached for 30 minutes. Newly created data rows will not appear until the end of the cache period.
- Parameters:
timeout_seconds (float) – Max waiting time, in seconds.
include_metadata (bool) – True to return related DataRow metadata
- Returns:
Generator that yields DataRow objects belonging to this dataset.
- Raises:
LabelboxError – if the export fails or is unable to download within the specified time.
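Example
A short sketch; the printed attribute is illustrative:
>>> for data_row in dataset.export_data_rows():
>>>     print(data_row.external_id)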
- export_v2(task_name: Optional[str] = None, filters: Optional[DatasetExportFilters] = None, params: Optional[CatalogExportParams] = None) Task [source]¶
Creates a dataset export task with the given params and returns the task.
>>> dataset = client.get_dataset(DATASET_ID)
>>> task = dataset.export_v2(
>>>     filters={
>>>         "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...]  # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>     },
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
Label¶
- class labelbox.schema.label.Label(*args, **kwargs)[source]¶
Bases:
DbObject, Updateable, BulkDeletable
Label represents an assessment on a DataRow. For example, one label could contain 100 bounding boxes (annotations).
- label¶
- Type:
str
- seconds_to_label¶
- Type:
float
- agreement¶
- Type:
float
- benchmark_agreement¶
- Type:
float
- is_benchmark_reference¶
- Type:
bool
- project¶
ToOne relationship to Project
- Type:
Relationship
- data_row¶
ToOne relationship to DataRow
- Type:
Relationship
- reviews¶
ToMany relationship to Review
- Type:
Relationship
- created_by¶
ToOne relationship to User
- Type:
Relationship
- static bulk_delete(labels) None [source]¶
Deletes all the given Labels.
- Parameters:
labels (list of Label) – The Labels to delete.
LabelingFrontend¶
- class labelbox.schema.labeling_frontend.LabelingFrontend(client, field_values)[source]¶
Bases:
DbObject
Label editor.
Represents an HTML / JavaScript UI that is used to generate labels. “Editor” is the default Labeling Frontend that comes in every organization. You can create new labeling frontends for an organization.
- name¶
- Type:
str
- description¶
- Type:
str
- iframe_url_path¶
- Type:
str
- projects¶
ToMany relationship to Project
- Type:
Relationship
LabelingFrontendOptions¶
- class labelbox.schema.labeling_frontend.LabelingFrontendOptions(client, field_values)[source]
Bases:
DbObject
Label interface options.
- customization_options
- Type:
str
- project
ToOne relationship to Project
- Type:
Relationship
- labeling_frontend
ToOne relationship to LabelingFrontend
- Type:
Relationship
- organization
ToOne relationship to Organization
- Type:
Relationship
LabelingParameterOverride¶
- class labelbox.schema.project.LabelingParameterOverride(client, field_values)[source]
Bases:
DbObject
Customizes the order of assets in the label queue.
- priority
A prioritization score.
- Type:
int
- number_of_labels
Number of times an asset should be labeled.
- Type:
int
Ontology¶
- class labelbox.schema.ontology.Ontology(*args, **kwargs)[source]¶
Bases:
DbObject
An ontology specifies which tools and classifications are available to a project. This is read only for now.
- name¶
- Type:
str
- description¶
- Type:
str
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- normalized¶
- Type:
json
- object_schema_count¶
- Type:
int
- classification_schema_count¶
- Type:
int
- projects¶
ToMany relationship to Project
- Type:
Relationship
- created_by¶
ToOne relationship to User
- Type:
Relationship
- class labelbox.schema.ontology.OntologyBuilder(tools: ~typing.List[~labelbox.schema.ontology.Tool] = <factory>, classifications: ~typing.List[~labelbox.schema.ontology.Classification] = <factory>)[source]¶
Bases:
object
A class to help create an ontology for a Project. This should be used for making Project ontologies from scratch. OntologyBuilder can also pull from an already existing Project’s ontology.
There are no required instantiation arguments.
To create an ontology, use the asdict() method after fully building your ontology within this class, and insert it into project.setup() as the “labeling_frontend_options” parameter.
Example
>>> builder = OntologyBuilder()
…
>>> frontend = list(client.get_labeling_frontends())[0]
>>> project.setup(frontend, builder.asdict())
- tools¶
(list)
- Type:
List[labelbox.schema.ontology.Tool]
- classifications¶
(list)
- Type:
List[labelbox.schema.ontology.Classification]
Organization¶
- class labelbox.schema.organization.Organization(*args, **kwargs)[source]¶
Bases:
DbObject
An Organization is a group of Users.
It is associated with data created by Users within that Organization. Typically all Users within an Organization have access to data created by any User in the same Organization.
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- name¶
- Type:
str
- users¶
ToMany relationship to User
- Type:
Relationship
- projects¶
ToMany relationship to Project
- Type:
Relationship
- webhooks¶
ToMany relationship to Webhook
- Type:
Relationship
- create_resource_tag(tag: Dict[str, str]) ResourceTag [source]¶
- Creates a resource tag.
>>> tag = {'text': 'tag-1', 'color': 'ffffff'}
- Parameters:
tag (dict) – A resource tag, e.g. {'text': 'tag-1', 'color': 'ffffff'}
- Returns:
The created resource tag.
- get_default_iam_integration() Optional[IAMIntegration] [source]¶
Returns the default IAM integration for the organization. Will return None if there are no default integrations for the org.
- get_iam_integrations() List[IAMIntegration] [source]¶
Returns all IAM Integrations for an organization
- get_resource_tags() List[ResourceTag] [source]¶
Returns all resource tags for an organization
- invite_limit() InviteLimit [source]¶
Retrieves invite limits for the org. This already accounts for users currently in the org, meaning that used = users + invites and remaining = limit - (users + invites).
- Returns:
InviteLimit
- invite_user(email: str, role: Role, project_roles: Optional[List[ProjectRole]] = None) Invite [source]¶
Invite a new member to the org. This will send the user an email invite
- Parameters:
email (str) – email address of the user to invite
role (Role) – Role to assign to the user
project_roles (Optional[List[ProjectRoles]]) – List of project roles to assign to the User (if they have a project based org role).
- Returns:
Invite for the user
Notes
- Multiple invites can be sent for the same email. This can only be resolved in the UI for now.
Future releases of the SDK will support the ability to query and revoke invites to solve this problem (and/or checking on the backend).
Some server-side responses are unclear (e.g. if a user invites themselves, None is returned, which the SDK raises as a LabelboxError).
Project¶
- class labelbox.schema.project.Project(client, field_values)[source]¶
Bases:
DbObject
,Updateable
,Deletable
A Project is a container that includes a labeling frontend, an ontology, datasets and labels.
- name¶
- Type:
str
- description¶
- Type:
str
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- setup_complete¶
- Type:
datetime
- last_activity_time¶
- Type:
datetime
- queue_mode¶
- Type:
string
- auto_audit_number_of_labels¶
- Type:
int
- auto_audit_percentage¶
- Type:
float
- created_by¶
ToOne relationship to User
- Type:
Relationship
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- labeling_frontend¶
ToOne relationship to LabelingFrontend
- Type:
Relationship
- labeling_frontend_options¶
ToMany relationship to LabelingFrontendOptions
- Type:
Relationship
- labeling_parameter_overrides¶
ToMany relationship to LabelingParameterOverride
- Type:
Relationship
- webhooks¶
ToMany relationship to Webhook
- Type:
Relationship
- benchmarks¶
ToMany relationship to Benchmark
- Type:
Relationship
- ontology¶
ToOne relationship to Ontology
- Type:
Relationship
- batches() PaginatedCollection [source]¶
Fetch all batches that belong to this project
- Returns:
A PaginatedCollection of `Batch`es
- bulk_import_requests() PaginatedCollection [source]¶
Returns bulk import request objects which are used in model-assisted labeling. These are returned with the oldest first, and most recent last.
- create_batch(name: str, data_rows: Optional[List[Union[str, DataRow]]] = None, priority: int = 5, consensus_settings: Optional[Dict[str, float]] = None, global_keys: Optional[List[str]] = None)[source]¶
- Creates a new batch for a project. One of global_keys or data_rows must be provided, but not both. A maximum of 100,000 data rows can be added to a batch.
- Parameters:
name – a name for the batch, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: the created batch
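Example
A minimal sketch; the batch name and global keys are placeholders:
>>> batch = project.create_batch(
>>>     name="first-batch",
>>>     global_keys=["key1", "key2"],
>>>     priority=1)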
- create_batches(name_prefix: str, data_rows: Optional[List[Union[str, DataRow]]] = None, global_keys: Optional[List[str]] = None, priority: int = 5, consensus_settings: Optional[Dict[str, float]] = None) CreateBatchesTask [source]¶
Creates batches for a project from a list of data rows. One of global_keys or data_rows must be provided, but not both. When more than 100k data rows are specified and thus multiple batches are needed, the specific batch that each data row will be placed in is undefined.
Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.
- Parameters:
name_prefix – a prefix for the batch names, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: a task for the created batches
- create_batches_from_dataset(name_prefix: str, dataset_id: str, priority: int = 5, consensus_settings: Optional[Dict[str, float]] = None) CreateBatchesTask [source]¶
Creates batches for a project from a dataset, selecting only the data rows that are not already added to the project. When the dataset contains more than 100k data rows and multiple batches are needed, the specific batch that each data row will be placed in is undefined. Note that data rows may not be immediately available for a project after being added to a dataset; use the _wait_until_data_rows_are_processed method to ensure that data rows are available before creating batches.
Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.
- Parameters:
name_prefix – a prefix for the batch names, must be unique within a project
dataset_id – the id of the dataset to create batches from
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: a task for the created batches
- enable_model_assisted_labeling(toggle: bool = True) bool [source]¶
Turns model assisted labeling either on or off based on input
- Parameters:
toggle (bool) – True or False boolean
- Returns:
True if toggled on or False if toggled off
- export(task_name: Optional[str] = None, filters: Optional[ProjectExportFilters] = None, params: Optional[ProjectExportParams] = None) ExportTask [source]¶
Creates a project export task with the given params and returns the task.
>>> task = project.export(
>>>     filters={
>>>         "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...],  # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>         "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...]
>>>     },
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
- export_issues(status=None) str [source]¶
Calls the server-side Issues exporting that returns the URL to that payload.
- Parameters:
status (string) – valid values: Open, Resolved
- Returns:
URL of the data file with this Project’s issues.
- export_labels(download=False, timeout_seconds=1800, **kwargs) Optional[Union[str, List[Dict[Any, Any]]]] [source]¶
Calls the server-side Label exporting that generates a JSON payload, and returns the URL to that payload.
Will only generate a new URL at a max frequency of 30 min.
- Parameters:
download (bool) – Returns the url if False
timeout_seconds (float) – Max waiting time, in seconds.
start (str) – Earliest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
end (str) – Latest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_start (str) – Will include all labels that have had any updates to data rows, issues, comments, metadata, or reviews since this timestamp. formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_end (str) – Will include all labels that do not have any updates to data rows, issues, comments, metadata, or reviews after this timestamp. formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
- Returns:
URL of the data file with this Project’s labels. If the server didn’t generate during the timeout_seconds period, None is returned.
- export_queued_data_rows(timeout_seconds=120, include_metadata: bool = False) List[Dict[str, str]] [source]¶
Returns all data rows that are currently enqueued for this project.
- Parameters:
timeout_seconds (float) – Max waiting time, in seconds.
include_metadata (bool) – True to return related DataRow metadata
- Returns:
Data row fields for all data rows in the queue, as JSON.
- Raises:
LabelboxError – if the export fails or is unable to download within the specified time.
- export_v2(task_name: Optional[str] = None, filters: Optional[ProjectExportFilters] = None, params: Optional[ProjectExportParams] = None) Task [source]¶
Creates a project export task with the given params and returns the task.
For more information visit: https://docs.labelbox.com/docs/exports-v2#export-from-a-project-python-sdk
>>> task = project.export_v2(
>>>     filters={
>>>         "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...],  # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>         "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...]
>>>     },
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
- extend_reservations(queue_type) int [source]¶
Extends all the current reservations for the current user on the given queue type.
- Parameters:
queue_type (str) – Either “LabelingQueue” or “ReviewQueue”
- Returns:
int, the number of reservations that were extended.
- get_queue_mode() QueueMode [source]¶
Provides the queue mode used for this project.
Deprecation notice: This method is deprecated and will be removed in a future version. To obtain the queue mode of a project, simply refer to the queue_mode attribute of a Project.
For more information, visit https://docs.labelbox.com/reference/migrating-to-workflows#upcoming-changes
Returns: the QueueMode for this project
- get_resource_tags() List[ResourceTag] [source]¶
Returns tags for a project
- label_generator(timeout_seconds=600, **kwargs)[source]¶
Download text and image annotations, or video annotations.
For a mixture of text/image and video, use project.export_labels()
- Returns:
LabelGenerator for accessing labels
- labeler_performance() PaginatedCollection [source]¶
Returns the labeler performances for this Project.
- Returns:
A PaginatedCollection of LabelerPerformance objects.
- labels(datasets=None, order_by=None) PaginatedCollection [source]¶
Custom relationship expansion method to support limited filtering.
- Parameters:
datasets (iterable of Dataset) – Optional collection of Datasets whose Labels are sought. If not provided, all Labels in this Project are returned.
order_by (None or (Field, Field.Order)) – Ordering clause.
- members() PaginatedCollection [source]¶
Fetch all current members for this project
- Returns:
A PaginatedCollection of `ProjectMember`s
- move_data_rows_to_task_queue(data_row_ids: Union[UniqueIds, GlobalKeys], task_queue_id: str)[source]¶
- move_data_rows_to_task_queue(data_row_ids: List[str], task_queue_id: str)
Moves data rows to the specified task queue.
- Parameters:
data_row_ids – A list of data row ids to be moved. This can be a list of strings or a DataRowIdentifiers object. DataRowIdentifiers objects are lists of ids or global keys; a DataRowIdentifiers object can be a UniqueIds or GlobalKeys instance.
task_queue_id – the task queue id to be moved to, or None to specify the “Done” queue
- Returns:
None if successful, or a raised error on failure
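Example
A usage sketch; picking the first task queue is purely illustrative, and the UniqueIds import path is an assumption that may vary by SDK version:
>>> from labelbox.schema.identifiables import UniqueIds  # import path may vary by SDK version
>>> task_queues = project.task_queues()
>>> target_queue = task_queues[0]  # select the desired queue here
>>> project.move_data_rows_to_task_queue(UniqueIds(["<data_row_id>"]), target_queue.uid)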
- review_metrics(net_score) int [source]¶
Returns this Project’s review metrics.
- Parameters:
net_score (None or Review.NetScore) – Indicates desired metric.
- Returns:
int, aggregation count of reviews for given net_score.
- set_labeling_parameter_overrides(data: List[Tuple[Union[DataRow, UniqueId, GlobalKey], int]]) bool [source]¶
Adds labeling parameter overrides to this project.
- See information on priority here:
https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system
>>> project.set_labeling_parameter_overrides([
>>>     (data_row_id1, 2), (data_row_id2, 1)])
or
>>> project.set_labeling_parameter_overrides([
>>>     (data_row_gk1, 2), (data_row_gk2, 1)])
- Parameters:
data (iterable) –
An iterable of tuples. Each tuple must contain either (DataRow, DataRowPriority<int>) or (DataRowIdentifier, priority<int>) for the new override. DataRowIdentifier is an object representing a data row id or a global key. A DataRowIdentifier object can be a UniqueIds or GlobalKeys class. NOTE – passing a whole DataRow is deprecated. Please use a DataRowIdentifier instead.
- Priority:
- Data will be labeled in priority order.
A lower number priority is labeled first.
All signed 32-bit integers are accepted, from -2147483648 to 2147483647.
- Priority is not the queue position.
The position is determined by the relative priority.
- E.g. [(data_row_1, 5), (data_row_2, 2), (data_row_3, 10)]
will be assigned in the following order: [data_row_2, data_row_1, data_row_3]
- The priority only affects items in the queue.
Assigning a priority will not automatically add the item back into the queue.
- Returns:
bool, indicates if the operation was a success.
- setup(labeling_frontend, labeling_frontend_options) None [source]¶
Finalizes the Project setup.
- Parameters:
labeling_frontend (LabelingFrontend) – Which UI to use to label the data.
labeling_frontend_options (dict or str) – Labeling frontend options, a.k.a. project ontology. If given a dict it will be converted to str using json.dumps.
- setup_editor(ontology) None [source]¶
Sets up the project using the Pictor editor.
- Parameters:
ontology (Ontology) – The ontology to attach to the project
- task_queues() List[TaskQueue] [source]¶
Fetch all task queues that belong to this project
- Returns:
A List of `TaskQueue`s
- update(**kwargs)[source]¶
Updates this project with the specified attributes
- Parameters:
kwargs – a dictionary containing attributes to be upserted
Note that the queue_mode cannot be changed after a project has been created.
Additionally, the quality setting cannot be changed after a project has been created. The quality mode for a project is inferred through the following attributes:
- Benchmark:
auto_audit_number_of_labels = 1 and auto_audit_percentage = 1.0
- Consensus:
auto_audit_number_of_labels > 1 or auto_audit_percentage <= 1.0
Attempting to switch between benchmark and consensus modes is an invalid operation and will result in an error.
- update_data_row_labeling_priority(data_rows: Union[UniqueIds, GlobalKeys], priority: int) bool [source]¶
- update_data_row_labeling_priority(data_rows: List[str], priority: int) bool
Updates labeling parameter overrides for this project in bulk. This method allows up to 1 million data rows to be updated at once.
- See information on priority here:
https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system
- Parameters:
data_rows – a list of data row ids to update priorities for. This can be a list of strings or a DataRowIdentifiers object (a list of ids or global keys); a DataRowIdentifiers object can be a UniqueIds or GlobalKeys instance.
priority (int) – Priority for the new override. See above for more information.
- Returns:
bool, indicates if the operation was a success.
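A short sketch (placeholder ids; UniqueIds is assumed importable from labelbox.schema.identifiables) that raises the priority of two data rows in one call:
>>> from labelbox.schema.identifiables import UniqueIds
>>> success = project.update_data_row_labeling_priority(
>>>     UniqueIds(["cl7asgri20yvo075b4vtfedjb", "cl7asgri10yvg075b4pz176ht"]),
>>>     priority=1)
>>> assert success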
- update_project_resource_tags(resource_tag_ids: List[str]) List[ResourceTag] [source]¶
Creates project resource tags
- Parameters:
resource_tag_ids –
- Returns:
a list of the ResourceTags that were created.
- upload_annotations(name: str, annotations: Union[str, Path, Iterable[Dict]], validate: bool = False) BulkImportRequest [source]¶
Uploads annotations to a new Editor project.
- Parameters:
name (str) – name of the BulkImportRequest job
annotations (str or Path or Iterable) – url that is publicly accessible by Labelbox containing an ndjson file OR local path to an ndjson file OR iterable of annotation rows
validate (bool) – Whether or not to validate the payload before uploading.
- Returns:
BulkImportRequest
- upsert_instructions(instructions_file: str) None [source]¶
Uploads instructions to the UI. Running more than once will replace the instructions
- Parameters:
instructions_file (str) – Path to a local file. Must be a PDF or HTML file.
- Raises:
ValueError –
Raised if the project is not set up, or if the instructions file does not have a ".pdf" or ".html" extension.
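For instance (assuming instructions.pdf exists locally and the project is already set up):
>>> project.upsert_instructions("instructions.pdf")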
Review¶
- class labelbox.schema.review.Review(client, field_values)[source]¶
Bases:
DbObject
,Deletable
,Updateable
Reviewing labeled data is a collaborative quality assurance technique.
A Review object indicates the quality of the assigned Label. The aggregated review numbers can be obtained on a Project object.
- created_at¶
- Type:
datetime
- updated_at¶
- Type:
datetime
- score¶
- Type:
float
- created_by¶
ToOne relationship to User
- Type:
Relationship
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- project¶
ToOne relationship to Project
- Type:
Relationship
- label¶
ToOne relationship to Label
- Type:
Relationship
Task¶
- class labelbox.schema.task.Task(client, field_values)[source]¶
Bases:
DbObject
Represents a server-side process that might take a longer time to process. Allows the Task state to be updated and checked on the client side.
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- name¶
- Type:
str
- status¶
- Type:
str
- completion_percentage¶
- Type:
float
- created_by¶
ToOne relationship to User
- Type:
Relationship
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- property created_data_rows: Optional[Dict[str, Any]]¶
Fetch data rows which were successfully created for an import task.
- property errors: Optional[Dict[str, Any]]¶
Fetch the error associated with an import task.
- property failed_data_rows: Optional[Dict[str, Any]]¶
Fetch data rows which failed to be created for an import task.
- property result: Union[List[Dict[str, Any]], Dict[str, Any]]¶
Fetch the result for an import task.
- wait_till_done(timeout_seconds: float = 300.0, check_frequency: float = 2.0) None [source]¶
Waits until the task is completed. Periodically queries the server to update the task attributes.
- Parameters:
timeout_seconds (float) – Maximum time this method can block, in seconds. Defaults to five minutes.
check_frequency (float) – Frequency of queries to the server to update the task attributes, in seconds. Defaults to two seconds. Minimum value is two seconds.
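A typical polling sketch (the status strings shown are illustrative; check the Task.status values in your SDK version):
>>> task.wait_till_done(timeout_seconds=600)
>>> if task.status == "FAILED":
>>>     print(task.errors)
>>> else:
>>>     print(task.result)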
Task Queue¶
User¶
- class labelbox.schema.user.User(client, field_values)[source]¶
Bases:
DbObject
A User is a registered Labelbox user (for example you) associated with data they create or import and an Organization they belong to.
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- email¶
- Type:
str
- name¶
- Type:
str
- nickname¶
- Type:
str
- intercom_hash¶
- Type:
str
- picture¶
- Type:
str
- is_viewer¶
- Type:
bool
- is_external_viewer¶
- Type:
bool
- organization¶
ToOne relationship to Organization
- Type:
Relationship
- created_tasks¶
ToMany relationship to Task
- Type:
Relationship
- projects¶
ToMany relationship to Project
- Type:
Relationship
- remove_from_project(project: Project) None [source]¶
Removes a User from a project. Only used for project-based users; a project-based user's org role is "NONE".
- Parameters:
project (Project) – Project to remove user from
- update_org_role(role: Role) None [source]¶
Updates the User's organization role.
See client.get_roles() to get all valid roles. If a user is converted from project-level permissions to org-level permissions and then converted back, their permissions will remain for each individual project.
- Parameters:
role (Role) – The role that you want to set for this user.
Webhook¶
- class labelbox.schema.webhook.Webhook(client, field_values)[source]¶
Bases:
DbObject
,Updateable
Represents a server-side rule for sending notifications to a web-server whenever one of several predefined actions happens within a context of a Project or an Organization.
- updated_at¶
- Type:
datetime
- created_at¶
- Type:
datetime
- url¶
- Type:
str
- topics¶
LABEL_CREATED, LABEL_UPDATED, LABEL_DELETED
- Type:
str
- status¶
ACTIVE, INACTIVE, REVOKED
- Type:
str
- static create(client, topics, url, secret, project) Webhook [source]¶
Creates a Webhook.
- Parameters:
client (Client) – The Labelbox client used to connect to the server.
topics (list of str) – A list of topics this Webhook should get notifications for. Must be one of Webhook.Topic
url (str) – The URL to which notifications should be sent by the Labelbox server.
secret (str) – A secret key used for signing notifications.
project (Project or None) – The project for which notifications should be sent. If None notifications are sent for all events in your organization.
- Returns:
A newly created Webhook.
- Raises:
ValueError – If the topic is not one of Topic or status is not one of Status
- Information on configuring your server (the endpoint that url points to, and where the secret is used for verification) can be found in the Labelbox webhooks documentation.
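A hedged example of creating a webhook scoped to one project (the url and secret are placeholders):
>>> from labelbox.schema.webhook import Webhook
>>> webhook = Webhook.create(
>>>     client,
>>>     topics=[Webhook.Topic.LABEL_CREATED],
>>>     url="https://example.com/labelbox-hook",
>>>     secret="my-signing-secret",
>>>     project=project)
>>> print(webhook.status)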
Exceptions¶
- exception labelbox.exceptions.ApiLimitError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when the user performs too many requests in a short period of time.
- exception labelbox.exceptions.AuthenticationError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when an API key fails authentication.
- exception labelbox.exceptions.AuthorizationError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when a user is unauthorized to perform the given request.
- exception labelbox.exceptions.ConfidenceNotSupportedException[source]¶
Bases:
Exception
Raised when confidence is specified for unsupported annotation type
- exception labelbox.exceptions.CustomMetricsNotSupportedException[source]¶
Bases:
Exception
Raised when custom_metrics is specified for unsupported annotation type
- exception labelbox.exceptions.InternalServerError(message, cause=None)[source]¶
Bases:
LabelboxError
Nondescript Prisma or 502-related errors.
Meant to be retryable.
TODO: these errors need better messages from platform
- exception labelbox.exceptions.InvalidAttributeError(db_object_type, field)[source]¶
Bases:
LabelboxError
Raised when a field (name or Field instance) is not valid or found for a specific DB object type.
- exception labelbox.exceptions.InvalidQueryError(message, cause=None)[source]¶
Bases:
LabelboxError
Indicates a malformed or unsupported query (either by GraphQL in general or by Labelbox specifically). This can be the result of either client-side or server-side query validation.
- exception labelbox.exceptions.LabelboxError(message, cause=None)[source]¶
Bases:
Exception
Base class for exceptions.
- exception labelbox.exceptions.MALValidationError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when user input is invalid for MAL imports.
- exception labelbox.exceptions.MalformedQueryException[source]¶
Bases:
Exception
Raised when the user submits a malformed query.
- exception labelbox.exceptions.NetworkError(cause)[source]¶
Bases:
LabelboxError
Raised when an HTTPError occurs.
- exception labelbox.exceptions.OperationNotAllowedException[source]¶
Bases:
Exception
Raised when user does not have permissions to a resource or has exceeded usage limit
- exception labelbox.exceptions.ProcessingWaitTimeout[source]¶
Bases:
Exception
Raised when waiting for the data rows to be processed takes longer than allowed
- exception labelbox.exceptions.ResourceConflict(message, cause=None)[source]¶
Bases:
LabelboxError
Exception raised when a given resource conflicts with another.
- exception labelbox.exceptions.ResourceCreationError(message, cause=None)[source]¶
Bases:
LabelboxError
Indicates that a resource could not be created on the server side due to a validation or transaction error
- exception labelbox.exceptions.ResourceNotFoundError(db_object_type, params)[source]¶
Bases:
LabelboxError
Exception raised when a given resource is not found.
- exception labelbox.exceptions.TimeoutError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when a request times out.
- exception labelbox.exceptions.UuidError(message, cause=None)[source]¶
Bases:
LabelboxError
Raised when there are repeated UUIDs in a bulk import request.
- exception labelbox.exceptions.ValidationFailedError(message, cause=None)[source]¶
Bases:
LabelboxError
Exception raised when a GraphQL query fails validation (query cost, etc.), e.g. a query that is too expensive or too deeply nested.
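Since most of these exceptions derive from LabelboxError, callers can catch specific cases first and fall back to the base class; a minimal sketch:
>>> from labelbox.exceptions import LabelboxError, ResourceNotFoundError
>>> try:
>>>     project = client.get_project("does-not-exist")
>>> except ResourceNotFoundError:
>>>     print("no such project")
>>> except LabelboxError as e:
>>>     print("request failed:", e)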
Pagination¶
- class labelbox.pagination.PaginatedCollection(client: Client, query: str, params: Dict[str, Union[str, int]], dereferencing: Union[List[str], Dict[str, Any]], obj_class: Union[Type[DbObject], Callable[[Any, Any], Any]], cursor_path: Optional[List[str]] = None, experimental: bool = False)[source]¶
Bases:
object
An iterable collection of database objects (Projects, Labels, etc…).
Implements automatic (transparent to the user) paginated fetching during iteration. Intended for use by library internals and not by the end user. For a list of attributes see __init__(…) documentation. The params of __init__ map exactly to object attributes.
- __init__(client: Client, query: str, params: Dict[str, Union[str, int]], dereferencing: Union[List[str], Dict[str, Any]], obj_class: Union[Type[DbObject], Callable[[Any, Any], Any]], cursor_path: Optional[List[str]] = None, experimental: bool = False)[source]¶
Creates a PaginatedCollection.
- Parameters:
client (labelbox.Client) – the client used for fetching data from DB.
query (str) – Base query used for pagination. It must contain two ‘%d’ placeholders, the first for pagination ‘skip’ clause and the second for the ‘first’ clause.
params (dict) – Query parameters.
dereferencing (iterable) – An iterable of str defining the keypath that needs to be dereferenced in the query result in order to reach the paginated objects of interest.
obj_class (type) – The class of object to be instantiated with each dict containing db values.
cursor_path – If not None, this is used to find the cursor
experimental – Used to call experimental endpoints
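Because fetching is transparent, user code simply iterates; e.g. (client.get_projects is shown as a representative query that returns a PaginatedCollection):
>>> for project in client.get_projects():
>>>     print(project.name)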
Enums¶
- class labelbox.schema.enums.AnnotationImportState(value)[source]¶
Bases:
Enum
State of the import job when importing annotations (RUNNING, FAILED, or FINISHED).
State
Description
RUNNING
Indicates that the import job is not done yet.
FAILED
Indicates the import job failed. Check AnnotationImport.errors for more information
FINISHED
Indicates the import job is no longer running. Check AnnotationImport.statuses for more information
- class labelbox.schema.enums.BulkImportRequestState(value)[source]¶
Bases:
Enum
State of the import job when importing annotations (RUNNING, FAILED, or FINISHED).
If you are not using MEA, continue using BulkImportRequest. AnnotationImports are in beta and will change soon.
State
Description
RUNNING
Indicates that the import job is not done yet.
FAILED
Indicates the import job failed. Check BulkImportRequest.errors for more information
FINISHED
Indicates the import job is no longer running. Check BulkImportRequest.statuses for more information
- class labelbox.schema.enums.CollectionJobStatus(value)[source]¶
Bases:
Enum
Status of an asynchronous job over a collection.
State
Description
SUCCESS
Indicates job has successfully processed entire collection of data
PARTIAL SUCCESS
Indicates some data in the collection has succeeded and other data have failed
FAILURE
Indicates job has failed to process entire collection of data
ModelRun¶
- class labelbox.schema.model_run.ModelRun(client, field_values)[source]¶
Bases:
DbObject
- add_predictions(name: str, predictions: Union[str, Path, Iterable[Dict], Iterable[Label]]) MEAPredictionImport [source]¶
Uploads predictions to this Model Run.
- Parameters:
name (str) – name of the AnnotationImport job
predictions (str or Path or Iterable) – url that is publicly accessible by Labelbox containing an ndjson file OR local path to an ndjson file OR iterable of annotation rows
- Returns:
AnnotationImport
- delete_model_run_data_rows(data_row_ids: List[str])[source]¶
Deletes data rows from Model Runs.
- Parameters:
data_row_ids (list) – List of data row ids to delete from the Model Run.
- Returns:
Query execution success.
- export(task_name: Optional[str] = None, params: Optional[ModelRunExportParams] = None) ExportTask [source]¶
Creates a model run export task with the given params and returns the task.
>>> export_task = export("my_export_task", params={"media_attributes": True})
- export_labels(download: bool = False, timeout_seconds: int = 600) Optional[Union[str, List[Dict[Any, Any]]]] [source]¶
Experimental. To use, make sure client has enable_experimental=True.
Fetches Labels from the ModelRun
- Parameters:
download (bool) – Returns the url if False
- Returns:
URL of the data file with this ModelRun's labels. If download=True, this instead returns the contents in NDJSON format. If the server didn't finish generating the file within the timeout_seconds period, None is returned.
- export_v2(task_name: Optional[str] = None, params: Optional[ModelRunExportParams] = None) Task [source]¶
Creates a model run export task with the given params and returns the task.
>>> export_task = export_v2("my_export_task", params={"media_attributes": True})
- get_config() Dict[str, Any] [source]¶
Gets the Model Run's training metadata.
- Returns:
training metadata as a dictionary
- reset_config() Dict[str, Any] [source]¶
Resets the Model Run's training metadata config.
- Returns:
Model Run id and reset training metadata
- send_to_annotate_from_model(destination_project_id: str, task_queue_id: Optional[str], batch_name: str, data_rows: Union[UniqueIds, GlobalKeys], params: SendToAnnotateFromModelParams) Task [source]¶
Sends data rows from a model run to a project for annotation.
- Example Usage:
>>> task = model_run.send_to_annotate_from_model(
>>>     destination_project_id=DESTINATION_PROJECT_ID,
>>>     batch_name="batch",
>>>     data_rows=UniqueIds([DATA_ROW_ID]),
>>>     task_queue_id=TASK_QUEUE_ID,
>>>     params={})
>>> task.wait_till_done()
- Parameters:
destination_project_id – The ID of the project to send the data rows to.
task_queue_id – The ID of the task queue to send the data rows to. If not specified, the data rows will be sent to the Done workflow state.
batch_name – The name of the batch to create. If more than one batch is created, additional batches will be named with a monotonically increasing numerical suffix, starting at “_1”.
data_rows – The data rows to send to the project.
params – Additional parameters for this operation. See SendToAnnotateFromModelParams for details.
Returns: The created task for this operation.
- update_config(config: Dict[str, Any]) Dict[str, Any] [source]¶
Updates the Model Run's training metadata config.
- Parameters:
config (dict) – A dictionary of keys and values
- Returns:
Model Run id and updated training metadata
- upsert_data_rows(data_row_ids=None, global_keys=None, timeout_seconds=3600)[source]¶
Adds data rows to a Model Run without any associated labels.
- Parameters:
data_row_ids (list) – data row ids to add to the model run
global_keys (list) – global keys for data rows to add to the model run
timeout_seconds (float) – Max waiting time, in seconds.
- Returns:
ID of newly generated async task
- upsert_labels(label_ids: Optional[List[str]] = None, project_id: Optional[str] = None, timeout_seconds=3600)[source]¶
Adds data rows and labels to a Model Run
- Parameters:
label_ids (list) – label ids to insert
project_id (string) – project uuid; all project labels will be uploaded. Either label_ids OR project_id is required, but not both.
timeout_seconds (float) – Max waiting time, in seconds.
- Returns:
ID of newly generated async task
- upsert_predictions_and_send_to_project(name: str, predictions: Union[str, Path, Iterable[Dict]], project_id: str, priority: Optional[int] = 5) MEAPredictionImport [source]¶
- Provides a convenient way to execute the following steps in a single function call:
Upload predictions to a Model
Create a batch from data rows that had predictions associated with them
Attach the batch to a project
Add those same predictions to the project as MAL annotations
Note that partial successes are possible. If it is important that all stages are successful, then check the status of each individual task with task.errors, e.g.
>>> mea_import_job, batch, mal_import_job = model_run.upsert_predictions_and_send_to_project(
>>>     name, predictions, project_id)
>>> # handle the MEA import job (check for job failure or partial failures)
>>> print(mea_import_job.status, mea_import_job.errors)
>>> if batch is None:
>>>     pass  # handle batch creation failure
>>> if mal_import_job is None:
>>>     pass  # handle mal_import_job creation failure
>>> else:
>>>     # handle the MAL import job (check for job failure or partial failures)
>>>     print(mal_import_job.status, mal_import_job.errors)
- Parameters:
name (str) – name of the AnnotationImport job as well as the name of the batch import
predictions (Iterable) – iterable of annotation rows
project_id (str) – id of the project to import into
priority (int) – priority of the job
- Returns:
Tuple[MEAPredictionImport, Batch, MEAToMALPredictionImport]. If any of these steps fail, the corresponding return value will be None.
Model¶
- class labelbox.schema.model.Model(client, field_values)[source]¶
Bases:
DbObject
A model represents a program that has been trained and can make predictions on new data.
- name¶
- Type:
str
- model_runs¶
ToMany relationship to ModelRun
- Type:
Relationship
DataRowMetadata¶
- class labelbox.schema.data_row_metadata.DataRowMetadataKind(value)[source]¶
Bases:
Enum
An enumeration.
- class labelbox.schema.data_row_metadata.DataRowMetadataOntology(client)[source]¶
Bases:
object
Ontology for data row metadata
Metadata provides additional context for data rows. Metadata is broken into two classes: reserved and custom. Reserved fields are defined by Labelbox and used for creating specific experiences in the platform.
>>> mdo = client.get_data_row_metadata_ontology()
- bulk_delete(deletes: List[DeleteDataRowMetadata]) List[DataRowMetadataBatchResponse] [source]¶
Delete metadata from a data row by specifying the fields you want to remove
>>> delete = DeleteDataRowMetadata(
>>>     data_row_id=UniqueId("datarow-id"),
>>>     fields=[
>>>         "schema-id-1",
>>>         "schema-id-2",
>>>         ...
>>>     ]
>>> )
>>> mdo.bulk_delete([delete])
>>> delete = DeleteDataRowMetadata(
>>>     data_row_id=GlobalKey("global-key"),
>>>     fields=[
>>>         "schema-id-1",
>>>         "schema-id-2",
>>>         ...
>>>     ]
>>> )
>>> mdo.bulk_delete([delete])
>>> delete = DeleteDataRowMetadata(
>>>     data_row_id="global-key",
>>>     fields=[
>>>         "schema-id-1",
>>>         "schema-id-2",
>>>         ...
>>>     ]
>>> )
>>> mdo.bulk_delete([delete])
- Parameters:
deletes – Data row and schema ids to delete. For the data row id, we support UniqueId, str, and GlobalKey; if you pass a str, we will assume it is a UniqueId. Do not pass a mix of data row ids and global keys in the same list.
- Returns:
list of unsuccessful deletions. An empty list means all data rows were successfully deleted.
- bulk_export(data_row_ids: List[str]) List[DataRowMetadata] [source]¶
- bulk_export(data_row_ids: Union[UniqueIds, GlobalKeys]) List[DataRowMetadata]
Exports metadata for a list of data rows
>>> mdo.bulk_export([data_row.uid for data_row in data_rows])
- Parameters:
data_row_ids – List of data rows to fetch metadata for. This can be a list of strings or a DataRowIdentifiers object (a list of ids or global keys); a DataRowIdentifiers object can be a UniqueIds or GlobalKeys instance.
- Returns:
A list of DataRowMetadata. There will be one DataRowMetadata for each data_row_id passed in. This is true even if the data row does not have any metadata; data rows without metadata will have empty fields.
- bulk_upsert(metadata: List[DataRowMetadata]) List[DataRowMetadataBatchResponse] [source]¶
Upsert metadata to a list of data rows
You may specify data row by either data_row_id or global_key
>>> metadata = DataRowMetadata(
>>>     data_row_id="datarow-id",  # alternatively, set global_key="global-key"
>>>     fields=[
>>>         DataRowMetadataField(schema_id="schema-id", value="my-message"),
>>>         ...
>>>     ]
>>> )
>>> mdo.bulk_upsert([metadata])
- Parameters:
metadata – List of DataRow Metadata to upsert
- Returns:
list of unsuccessful upserts. An empty list means the upload was successful.
- create_schema(name: str, kind: DataRowMetadataKind, options: Optional[List[str]] = None) DataRowMetadataSchema [source]¶
Create metadata schema
>>> mdo.create_schema(name, kind, options)
- Parameters:
name (str) – Name of metadata schema
kind (DataRowMetadataKind) – Kind of metadata schema as DataRowMetadataKind
options (List[str]) – List of Enum options
- Returns:
Created metadata schema as DataRowMetadataSchema
- Raises:
KeyError – When provided name is not a valid custom metadata
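An illustrative sketch creating an Enum schema (the schema name and options are placeholders; lowercase enum member names are assumed):
>>> from labelbox.schema.data_row_metadata import DataRowMetadataKind
>>> schema = mdo.create_schema("priority", DataRowMetadataKind.enum,
>>>     options=["high", "medium", "low"])
>>> print(schema.uid)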
- delete_schema(name: str) bool [source]¶
Delete metadata schema
>>> mdo.delete_schema(name)
- Parameters:
name – Name of metadata schema to delete
- Returns:
True if deletion is successful, False if unsuccessful
- Raises:
KeyError – When provided name is not a valid custom metadata
- get_by_name(name: str) Union[DataRowMetadataSchema, Dict[str, DataRowMetadataSchema]] [source]¶
Get metadata by name
>>> mdo.get_by_name(name)
- Parameters:
name (str) – Name of metadata schema
- Returns:
Metadata schema as DataRowMetadataSchema or dict, in case of Enum metadata
- Raises:
KeyError – When the provided name is present in neither the reserved nor the custom metadata list
- parse_metadata(unparsed: List[Dict[str, List[Union[str, Dict]]]]) List[DataRowMetadata] [source]¶
Parse metadata responses
>>> mdo.parse_metadata([metadata])
- Parameters:
unparsed – An unparsed metadata export
- Returns:
List of DataRowMetadata
- Return type:
metadata
- parse_metadata_fields(unparsed: List[Dict[str, Dict]]) List[DataRowMetadataField] [source]¶
Parse metadata fields as list of DataRowMetadataField
>>> mdo.parse_metadata_fields([metadata_fields])
- Parameters:
unparsed – An unparsed list of metadata represented as a dict containing ‘schemaId’ and ‘value’
- Returns:
List of DataRowMetadataField
- Return type:
metadata
- parse_upsert_metadata(metadata_fields) List[Dict[str, Any]] [source]¶
- Converts either DataRowMetadataField or a dictionary representation of DataRowMetadataField into a validated, flattened dictionary of metadata fields used to create data row metadata. Used internally in Dataset.create_data_rows().
- Parameters:
metadata_fields – List of DataRowMetadataField or a dictionary representation of DataRowMetadataField
- Returns:
List of dictionaries representing a flattened view of metadata fields
- refresh_ontology()[source]¶
Update the DataRowMetadataOntology instance with the latest metadata ontology schemas
- update_enum_option(name: str, option: str, new_option: str) DataRowMetadataSchema [source]¶
Update Enum metadata schema option
>>> mdo.update_enum_option(name, option, new_option)
- Parameters:
name (str) – Name of metadata schema to update
option (str) – Name of Enum option to update
new_option (str) – New name of Enum option
- Returns:
Updated metadata schema as DataRowMetadataSchema
- Raises:
KeyError – When provided name is not a valid custom metadata
- update_schema(name: str, new_name: str) DataRowMetadataSchema [source]¶
Update metadata schema
>>> mdo.update_schema(name, new_name)
- Parameters:
name (str) – Current name of metadata schema
new_name (str) – New name of metadata schema
- Returns:
Updated metadata schema as DataRowMetadataSchema
- Raises:
KeyError – When provided name is not a valid custom metadata
AnnotationImport¶
- class labelbox.schema.annotation_import.AnnotationImport(client, field_values)[source]¶
Bases:
DbObject
- property errors: List[Dict[str, Any]]¶
Errors for each individual annotation uploaded. This is a subset of statuses.
- Returns:
List of dicts containing error messages. Empty list means there were no errors See AnnotationImport.statuses for more details.
This information will expire after 24 hours.
- property inputs: List[Dict[str, Any]]¶
Inputs for each individual annotation uploaded. This should match the ndjson annotations that you have uploaded.
- Returns:
Uploaded ndjson.
This information will expire after 24 hours.
- property statuses: List[Dict[str, Any]]¶
Status for each individual annotation uploaded.
- Returns:
A status for each annotation if the upload is done running. See the table below for more details
Field
Description
uuid
Specifies the annotation for the status row.
dataRow
JSON object containing the Labelbox data row ID for the annotation.
status
Indicates SUCCESS or FAILURE.
errors
An array of error messages included when status is FAILURE. Each error has a name, message and optional (key might not exist) additional_info.
This information will expire after 24 hours.
- wait_until_done(sleep_time_seconds: int = 10, show_progress: bool = False) None [source]¶
Blocks import job until certain conditions are met. Blocks until the AnnotationImport.state changes either to AnnotationImportState.FINISHED or AnnotationImportState.FAILED, periodically refreshing object's state.
- Parameters:
sleep_time_seconds (int) – a time to block between subsequent API calls
show_progress (bool) – should show progress bar
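Putting these pieces together, a common post-upload check (the "FAILURE" status string is taken from the table above):
>>> upload_job.wait_until_done(show_progress=True)
>>> failed = [s for s in upload_job.statuses if s["status"] == "FAILURE"]
>>> print(len(failed), "annotations failed")
>>> print(upload_job.errors)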
- class labelbox.schema.annotation_import.LabelImport(client, field_values)[source]¶
Bases:
AnnotationImport
- classmethod create_from_file(client: Client, project_id: str, name: str, path: str) LabelImport [source]¶
Create a label import job from a file of annotations
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
path – Path to ndjson file containing annotations
- Returns:
LabelImport
- classmethod create_from_objects(client: labelbox.Client, project_id: str, name: str, labels: Union[List[Dict[str, Any]], List[Label]]) LabelImport [source]¶
Create a label import job from an in-memory list of labels
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
labels – List of labels
- Returns:
LabelImport
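A minimal sketch (labels is assumed to be an already-built list of ndjson-style dicts or Label objects):
>>> from labelbox.schema.annotation_import import LabelImport
>>> upload_job = LabelImport.create_from_objects(
>>>     client=client,
>>>     project_id=project.uid,
>>>     name="label-import-1",
>>>     labels=labels)
>>> upload_job.wait_until_done()
>>> print(upload_job.statuses)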
- classmethod create_from_url(client: Client, project_id: str, name: str, url: str) LabelImport [source]¶
Create a label import job from a url. The url must point to a file containing label annotations.
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
url – Url pointing to file to upload
- Returns:
LabelImport
- classmethod from_name(client: Client, project_id: str, name: str, as_json: bool = False) LabelImport [source]¶
Retrieves a label import job.
- Parameters:
client – Labelbox Client for executing queries
project_id – ID used for querying import jobs
name – Name of the import job.
- Returns:
LabelImport
- property parent_id: str¶
Identifier for this import. Used to refresh the status
- class labelbox.schema.annotation_import.MALPredictionImport(client, field_values)[source]¶
Bases:
AnnotationImport
- classmethod create_from_file(client: Client, project_id: str, name: str, path: str) MALPredictionImport [source]¶
Create an MAL prediction import job from a file of annotations
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
path – Path to ndjson file containing annotations
- Returns:
MALPredictionImport
- classmethod create_from_objects(client: labelbox.Client, project_id: str, name: str, predictions: Union[List[Dict[str, Any]], List[Label]]) MALPredictionImport [source]¶
Create an MAL prediction import job from an in-memory list of predictions
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
predictions – List of prediction annotations
- Returns:
MALPredictionImport
- classmethod create_from_url(client: Client, project_id: str, name: str, url: str) MALPredictionImport [source]¶
Create an MAL prediction import job from a url. The url must point to a file containing prediction annotations.
- Parameters:
client – Labelbox Client for executing queries
project_id – Project to import labels into
name – Name of the import job. Can be used to reference the task later
url – Url pointing to file to upload
- Returns:
MALPredictionImport
- classmethod from_name(client: Client, project_id: str, name: str, as_json: bool = False) MALPredictionImport [source]¶
Retrieves an MAL import job.
- Parameters:
client – Labelbox Client for executing queries
project_id – ID used for querying import jobs
name – Name of the import job.
- Returns:
MALPredictionImport
- property parent_id: str¶
Identifier for this import. Used to refresh the status
- class labelbox.schema.annotation_import.MEAPredictionImport(client, field_values)[source]¶
Bases:
AnnotationImport
- classmethod create_from_file(client: Client, model_run_id: str, name: str, path: str) MEAPredictionImport [source]¶
Create an MEA prediction import job from a file of annotations
- Parameters:
client – Labelbox Client for executing queries
model_run_id – Model run to import labels into
name – Name of the import job. Can be used to reference the task later
path – Path to ndjson file containing annotations
- Returns:
MEAPredictionImport
- classmethod create_from_objects(client: labelbox.Client, model_run_id: str, name, predictions: Union[List[Dict[str, Any]], List[Label]]) MEAPredictionImport [source]¶
Create an MEA prediction import job from an in-memory list of predictions
- Parameters:
client – Labelbox Client for executing queries
model_run_id – Model run to import labels into
name – Name of the import job. Can be used to reference the task later
predictions – List of prediction annotations
- Returns:
MEAPredictionImport
- classmethod create_from_url(client: Client, model_run_id: str, name: str, url: str) MEAPredictionImport [source]¶
Create an MEA prediction import job from a url. The url must point to a file containing prediction annotations.
- Parameters:
client – Labelbox Client for executing queries
model_run_id – Model run to import labels into
name – Name of the import job. Can be used to reference the task later
url – Url pointing to file to upload
- Returns:
MEAPredictionImport
- classmethod from_name(client: Client, model_run_id: str, name: str, as_json: bool = False) MEAPredictionImport [source]¶
Retrieves an MEA import job.
- Parameters:
client – Labelbox Client for executing queries
model_run_id – ID used for querying import jobs
name – Name of the import job.
- Returns:
MEAPredictionImport
- property parent_id: str¶
Identifier for this import. Used to refresh the status
- class labelbox.schema.annotation_import.MEAToMALPredictionImport(client, field_values)[source]¶
Bases:
AnnotationImport
- classmethod create_for_model_run_data_rows(client: Client, model_run_id: str, data_row_ids: List[str], project_id: str, name: str) MEAToMALPredictionImport [source]¶
Create an MEA to MAL prediction import job from a list of data row ids of a specific model run
- Parameters:
client – Labelbox Client for executing queries
data_row_ids – A list of data row ids
model_run_id – model run id
project_id – id of the project to import into
name – Name of the import job
- Returns:
MEAToMALPredictionImport
- classmethod from_name(client: Client, project_id: str, name: str, as_json: bool = False) MEAToMALPredictionImport [source]¶
Retrieves an MEA to MAL import job.
- Parameters:
client – Labelbox Client for executing queries
project_id – ID used for querying import jobs
name – Name of the import job.
- Returns:
MEAToMALPredictionImport
- property parent_id: str¶
Identifier for this import. Used to refresh the status
Batch¶
- class labelbox.schema.batch.Batch(client, project_id, *args, failed_data_row_ids=[], **kwargs)[source]¶
Bases:
DbObject
A Batch is a group of data rows submitted to a project for labeling
- name¶
- Type:
str
- created_at¶
- Type:
datetime
- updated_at¶
- Type:
datetime
- deleted¶
- Type:
bool
- created_by¶
ToOne relationship to User
- Type:
Relationship
- delete() None [source]¶
Deletes the given batch.
Note: Batch deletion for batches that have labels is forbidden.
- delete_labels(set_labels_as_template=False) None [source]¶
Deletes labels that were created for data rows in the batch.
- Parameters:
set_labels_as_template (bool) – When set to true, the deleted labels will be kept as templates.
- export_data_rows(timeout_seconds=120, include_metadata: bool = False) Generator [source]¶
Returns a generator that produces all data rows that are currently in this batch.
Note: For efficiency, the data are cached for 30 minutes. Newly created data rows will not appear until the end of the cache period.
- Parameters:
timeout_seconds (float) – Max waiting time, in seconds.
include_metadata (bool) – True to return related DataRow metadata
- Returns:
Generator that yields DataRow objects belonging to this batch.
- Raises:
LabelboxError – if the export fails or is unable to download within the specified time.
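For example, streaming the batch's data rows (this blocks until the export is ready, up to timeout_seconds):
>>> for data_row in batch.export_data_rows(include_metadata=True):
>>>     print(data_row.uid)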
- project() Project [source]¶
Returns Project which this Batch belongs to
- Raises:
LabelboxError – if the project is not found
ResourceTag¶
Slice¶
- class labelbox.schema.slice.CatalogSlice(client, field_values)[source]¶
Bases:
Slice
Represents a Slice used for filtering data rows in Catalog.
- export(task_name: Optional[str] = None, params: Optional[CatalogExportParams] = None) ExportTask [source]¶
Creates a slice export task with the given params and returns the task.
>>> slice = client.get_catalog_slice("SLICE_ID")
>>> task = slice.export(
>>>     params={"performance_details": False, "label_details": True}
>>> )
>>> task.wait_till_done()
>>> task.result
- export_v2(task_name: Optional[str] = None, params: Optional[CatalogExportParams] = None) Task [source]¶
Creates a slice export task with the given params and returns the task.
>>> slice = client.get_catalog_slice("SLICE_ID")
>>> task = slice.export_v2(
>>>     params={"performance_details": False, "label_details": True}
>>> )
>>> task.wait_till_done()
>>> task.result
- get_data_row_identifiers() PaginatedCollection [source]¶
Fetches all data row ids and global keys (where defined) that match this Slice
- Returns:
A PaginatedCollection of Slice.DataRowIdAndGlobalKey
- get_data_row_ids() PaginatedCollection [source]¶
Fetches all data row ids that match this Slice
- Returns:
A PaginatedCollection of data row ids
- class labelbox.schema.slice.ModelSlice(client, field_values)[source]¶
Bases:
Slice
Represents a Slice used for filtering data rows in Model.
- get_data_row_identifiers(model_run_id: str) PaginatedCollection [source]¶
Fetches all data row ids and global keys (where defined) that match this Slice
- Parameters:
model_run_id (str) – required; uid or cuid of model run
- Returns:
A PaginatedCollection of Slice.DataRowIdAndGlobalKey
- get_data_row_ids(model_run_id: str) PaginatedCollection [source]¶
Fetches all data row ids that match this Slice
- Parameters:
model_run_id (str) – required; uid or cuid of model run
- Returns:
A PaginatedCollection of data row ids
- class labelbox.schema.slice.Slice(client, field_values)[source]¶
Bases:
DbObject
A Slice is a saved set of filters (saved query). This is an abstract class and should not be instantiated.
- name¶
- Type:
str
- description¶
- Type:
str
- created_at¶
- Type:
datetime
- updated_at¶
- Type:
datetime
- filter¶
- Type:
json
QualityMode¶
ExportTask¶
- class labelbox.schema.export_task.ExportTask(task: Task)[source]¶
Bases:
object
An adapter class for working with task objects, providing extended functionality and convenient access to task-related information.
This class wraps a Task object, allowing you to interact with tasks of this type. It offers methods to retrieve task results, errors, and metadata, as well as properties for accessing task details such as UID, status, and creation time.
- property completion_percentage¶
Returns the completion percentage of the task.
- property created_at¶
Returns the time the task was created.
- property created_by¶
Returns the user who created the task.
- property deleted¶
Returns whether the task is deleted.
- get_stream(converter: JsonConverter = JsonConverter(), stream_type: StreamType = StreamType.RESULT) Stream[JsonConverterOutput] [source]¶
- get_stream(converter: FileConverter, stream_type: StreamType = StreamType.RESULT) Stream[FileConverterOutput]
Returns the result of the task.
- get_total_file_size(stream_type: StreamType) Optional[int] [source]¶
Returns the total file size for a specific task.
- get_total_lines(stream_type: StreamType) Optional[int] [source]¶
Returns the total number of lines for a specific task.
- property metadata¶
Returns the metadata of the task.
- property name¶
Returns the name of the task.
- property organization¶
Returns the organization of the task.
- property result¶
Returns the result of the task.
- property status¶
Returns the status of the task.
- property type¶
Returns the type of the task.
- property uid¶
Returns the uid of the task.
- property updated_at¶
Returns the last time the task was updated.
- class labelbox.schema.export_task.FileConverter(file_path: str)[source]¶
Bases:
Converter[FileConverterOutput]
Converts data to a file.
- convert(input_args: ConverterInputArgs) Iterator[FileConverterOutput] [source]¶
Converts the data. Returns an iterator that yields the converted data.
- Parameters:
current_offset – The global offset indicating the position of the data within the exported files. It represents a cumulative offset in characters across multiple files.
raw_data – The raw data to convert.
- Yields:
Iterator[OutputT] – The converted data.
- class labelbox.schema.export_task.FileConverterOutput(file_path: Path, total_size: int, total_lines: int, current_offset: int, current_line: int, bytes_written: int)[source]¶
Bases:
object
Output with statistics about the written file.
- class labelbox.schema.export_task.JsonConverter[source]¶
Bases:
Converter[JsonConverterOutput]
Converts JSON data.
- convert(input_args: ConverterInputArgs) Iterator[JsonConverterOutput] [source]¶
Converts the data. Returns an iterator that yields the converted data.
- Parameters:
current_offset – The global offset indicating the position of the data within the exported files. It represents a cumulative offset in characters across multiple files.
raw_data – The raw data to convert.
- Yields:
Iterator[OutputT] – The converted data.
- class labelbox.schema.export_task.JsonConverterOutput(current_offset: int, current_line: int, json_str: str)[source]¶
Bases:
object
Output with the JSON string.
- class labelbox.schema.export_task.Stream(ctx: _TaskContext, reader: _Reader, converter: Converter)[source]¶
Bases:
Generic[OutputT]
Streams data from a Reader.
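Combining these classes, a hedged streaming sketch (export_task is assumed to come from one of the export methods above; json_str is the field documented on JsonConverterOutput):
>>> from labelbox.schema.export_task import JsonConverter, StreamType
>>> export_task.wait_till_done()
>>> stream = export_task.get_stream(
>>>     converter=JsonConverter(), stream_type=StreamType.RESULT)
>>> stream.start(stream_handler=lambda output: print(output.json_str))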
Identifiables¶
Identifiable¶
- class labelbox.schema.identifiable.GlobalKey(key: str)[source]¶
Bases:
Identifiable
Represents a user-generated id.
- class labelbox.schema.identifiable.Identifiable(key: str, id_type: str)[source]¶
Bases:
ABC
Base class for any object representing a unique identifier.
- class labelbox.schema.identifiable.UniqueId(key: str)[source]¶
Bases:
Identifiable
Represents a unique, internally generated id.
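Both wrap a plain string key so that APIs can distinguish internal ids from user-defined global keys; a short illustration (the ids are placeholders, and the key attribute is assumed):
>>> from labelbox.schema.identifiable import GlobalKey, UniqueId
>>> uid = UniqueId("cl7asgri20yvo075b4vtfedjb")  # internally generated id
>>> gk = GlobalKey("image-001.jpg")              # user-defined key
>>> print(uid.key, gk.key)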
ConflictResolutionStrategy¶
FoundryClient¶
App¶
FoundryModel¶
SendToAnnotateParams¶
- class labelbox.schema.send_to_annotate_params.SendToAnnotateFromCatalogParams[source]¶
Bases:
TypedDict
Extra parameters for sending data rows to a project through catalog. At least one of source_model_run_id or source_project_id must be provided.
- Parameters:
source_model_run_id – Optional[str] - The model run to use for predictions. Defaults to None.
predictions_ontology_mapping – Optional[Dict[str, str]] - A mapping of feature schema ids to feature schema ids. Defaults to an empty dictionary.
source_project_id – Optional[str] - The project to use for predictions. Defaults to None.
annotations_ontology_mapping – Optional[Dict[str, str]] - A mapping of feature schema ids to feature schema ids. Defaults to an empty dictionary.
exclude_data_rows_in_project – Optional[bool] - Exclude data rows that are already in the project. Defaults to False.
override_existing_annotations_rule – Optional[ConflictResolutionStrategy] - The strategy defining how to handle conflicts in classifications between the data rows that already exist in the project and incoming predictions from the source model run or annotations from the source project. Defaults to ConflictResolutionStrategy.KEEP_EXISTING.
batch_priority – Optional[int] - The priority of the batch. Defaults to 5.
- class labelbox.schema.send_to_annotate_params.SendToAnnotateFromModelParams[source]¶
Bases:
TypedDict
Extra parameters for sending data rows to a project through a model run.
- Parameters:
predictions_ontology_mapping – Dict[str, str] - A mapping of feature schema ids to feature schema ids. Defaults to an empty dictionary.
exclude_data_rows_in_project – Optional[bool] - Exclude data rows that are already in the project. Defaults to False.
override_existing_annotations_rule – Optional[ConflictResolutionStrategy] - The strategy defining how to handle conflicts in classifications between the data rows that already exist in the project and incoming predictions from the source model run. Defaults to ConflictResolutionStrategy.KEEP_EXISTING.
batch_priority – Optional[int] - The priority of the batch. Defaults to 5.
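A hedged sketch assembling these params for send_to_annotate_from_model (the feature schema ids and upper-case names are placeholders; UniqueIds is assumed importable from labelbox.schema.identifiables):
>>> from labelbox.schema.identifiables import UniqueIds
>>> params = {
>>>     "predictions_ontology_mapping": {"<source_fs_id>": "<target_fs_id>"},
>>>     "exclude_data_rows_in_project": True,
>>>     "batch_priority": 1,
>>> }
>>> task = model_run.send_to_annotate_from_model(
>>>     destination_project_id=DESTINATION_PROJECT_ID,
>>>     task_queue_id=None,  # None sends data rows to the Done workflow state
>>>     batch_name="from-model-batch",
>>>     data_rows=UniqueIds([DATA_ROW_ID]),
>>>     params=params)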