Project

class labelbox.schema.project.Project(client, field_values)[source]

Bases: DbObject, Updateable, Deletable

A Project is a container that includes a labeling frontend, an ontology, datasets and labels.

name

Type:: str

description

Type:: str

updated_at

Type:: datetime

created_at

Type:: datetime

setup_complete

Type:: datetime

last_activity_time

Type:: datetime

queue_mode

Type:: string

auto_audit_number_of_labels

Type:: int

auto_audit_percentage

Type:: float

created_by

ToOne relationship to User

Type:: Relationship

organization

ToOne relationship to Organization

Type:: Relationship

labeling_frontend

ToOne relationship to LabelingFrontend

Type:: Relationship

labeling_frontend_options

ToMany relationship to LabelingFrontendOptions

Type:: Relationship

labeling_parameter_overrides

ToMany relationship to LabelingParameterOverride

Type:: Relationship

webhooks

ToMany relationship to Webhook

Type:: Relationship

benchmarks

ToMany relationship to Benchmark

Type:: Relationship

ontology

ToOne relationship to Ontology

Type:: Relationship

task_queues[source]

ToMany relationship to TaskQueue

Type:: Relationship

batches() → PaginatedCollection[source]

Fetch all batches that belong to this project

Returns:: A PaginatedCollection of `Batch`es

bulk_import_requests() → PaginatedCollection[source]: Returns bulk import request objects which are used in model-assisted labeling. These are returned with the oldest first, and most recent last.

create_batch(name: str, data_rows: List[str | DataRow] | None = None, priority: int = 5, consensus_settings: Dict[str, float] | None = None, global_keys: List[str] | None = None)[source]

Creates a new batch for a project. One of global_keys or data_rows must be provided, but not both. A maximum of 100,000 data rows can be added to a batch.

Parameters:

name – a name for the batch, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {'number_of_labels': 3, 'coverage_percentage': 0.1}

Returns: the created batch
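The input rules above (exactly one of data_rows or global_keys, at most 100,000 data rows) can be checked before calling create_batch. The following is a minimal sketch, not the SDK's own validation; check_batch_inputs and MAX_BATCH_SIZE are hypothetical names introduced for illustration.

```python
from typing import List, Optional

# Limit taken from the create_batch description; the constant name is hypothetical.
MAX_BATCH_SIZE = 100_000


def check_batch_inputs(
    data_rows: Optional[List[str]] = None,
    global_keys: Optional[List[str]] = None,
) -> int:
    """Hypothetical helper: validate the inputs and return the number of data rows."""
    # Exactly one of the two sources must be provided, never both and never neither.
    if (data_rows is None) == (global_keys is None):
        raise ValueError("Provide exactly one of data_rows or global_keys")
    items = data_rows if data_rows is not None else global_keys
    if len(items) > MAX_BATCH_SIZE:
        raise ValueError(f"A batch can hold at most {MAX_BATCH_SIZE} data rows")
    return len(items)
```

Running a check like this locally can surface an invalid argument combination before any request is sent.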

create_batches(name_prefix: str, data_rows: List[str | DataRow] | None = None, global_keys: List[str] | None = None, priority: int = 5, consensus_settings: Dict[str, float] | None = None) → CreateBatchesTask[source]

Creates batches for a project from a list of data rows. One of global_keys or data_rows must be provided, but not both. When more than 100k data rows are specified and thus multiple batches are needed, the specific batch that each data row will be placed in is undefined.

Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.

Parameters:

name_prefix – a prefix for the batch names, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {'number_of_labels': 3, 'coverage_percentage': 0.1}

Returns: a task for the created batches
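The naming scheme described above can be previewed locally. This sketch assumes that batches are filled up to 100,000 data rows each, so the number of batches is the row count divided by 100,000 and rounded up; that rounding rule and the helper name expected_batch_names are assumptions, not part of the SDK.

```python
import math
from typing import List

# Per-batch limit from the description above; the constant name is hypothetical.
MAX_BATCH_SIZE = 100_000


def expected_batch_names(name_prefix: str, num_data_rows: int) -> List[str]:
    """Hypothetical helper: list the batch names the prefix/suffix scheme would produce."""
    # Assumption: every batch except possibly the last one is filled to the limit.
    num_batches = math.ceil(num_data_rows / MAX_BATCH_SIZE)
    # The suffix is a zero-padded 4-digit counter starting at 0000.
    return [f"{name_prefix}{index:04d}" for index in range(num_batches)]


print(expected_batch_names("batch", 250_000))
```

Comparing these names against existing batch names in the project is one way to anticipate the "batch with the same name already exists" error.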

create_batches_from_dataset(name_prefix: str, dataset_id: str, priority: int = 5, consensus_settings: Dict[str, float] | None = None) → CreateBatchesTask[source]

Creates batches for a project from a dataset, selecting only the data rows that are not already added to the project. When the dataset contains more than 100k data rows and multiple batches are needed, the specific batch that each data row will be placed in is undefined. Note that data rows may not be immediately available for a project after being added to a dataset; use the _wait_until_data_rows_are_processed method to ensure that data rows are available before creating batches.

Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.

Parameters:

name_prefix – a prefix for the batch names, must be unique within a project
dataset_id – the id of the dataset to create batches from
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {'number_of_labels': 3, 'coverage_percentage': 0.1}

Returns: a task for the created batches

enable_model_assisted_labeling(toggle: bool = True) → bool[source]

Turns model assisted labeling either on or off based on input

Parameters:: toggle (bool) – True or False boolean
Returns:: True if toggled on or False if toggled off

export(task_name: str | None = None, filters: ProjectExportFilters | None = None, params: ProjectExportParams | None = None) → ExportTask[source]

Creates a project export task with the given params and returns the task.

>>>     task = project.export(
>>>         filters={
>>>             "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>             "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>             "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...], # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>             "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...]
>>>         },
>>>         params={
>>>             "performance_details": False,
>>>             "label_details": True
>>>         })
>>>     task.wait_till_done()
>>>     task.result
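The filters argument in the example above is a plain dictionary whose date ranges are pairs of "YYYY-MM-DD hh:mm:ss" strings. The sketch below builds such a dictionary from datetime objects; build_export_filters is a hypothetical helper, and only the "last_activity_at" and "batch_ids" keys shown in the example are used.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

# Timestamp layout used by the example above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_export_filters(
    activity_start: datetime,
    activity_end: datetime,
    batch_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a filters dict shaped like the example above."""
    if activity_start > activity_end:
        raise ValueError("activity_start must not be later than activity_end")
    filters: Dict[str, Any] = {
        "last_activity_at": [
            activity_start.strftime(TIMESTAMP_FORMAT),
            activity_end.strftime(TIMESTAMP_FORMAT),
        ]
    }
    # Only include batch_ids when the caller actually wants to restrict by batch.
    if batch_ids:
        filters["batch_ids"] = list(batch_ids)
    return filters
```

The resulting dictionary would then be passed as filters=... to the export call.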

export_issues(status=None) → str[source]

Calls the server-side Issues exporting that returns the URL to that payload.

Parameters:: status (string) – valid values: Open, Resolved
Returns:: URL of the data file with this Project’s issues.

export_labels(download=False, timeout_seconds=1800, **kwargs) → str | List[Dict[Any, Any]] | None[source]

Calls the server-side Label exporting that generates a JSON payload, and returns the URL to that payload.

Will only generate a new URL at a max frequency of 30 min.

Parameters:

download (bool) – Returns the url if False
timeout_seconds (float) – Max waiting time, in seconds.
start (str) – Earliest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
end (str) – Latest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_start (str) – Will include all labels that have had any updates to data rows, issues, comments, metadata, or reviews since this timestamp. Formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_end (str) – Will include all labels that do not have any updates to data rows, issues, comments, metadata, or reviews after this timestamp. Formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”

Returns:

URL of the data file with this Project’s labels. If the server does not finish generating the file within timeout_seconds, None is returned.
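The start, end, last_activity_start and last_activity_end arguments accept either a date or a date with a time. A minimal sketch for checking a string against those two layouts before passing it on follows; parse_export_date is a hypothetical helper and not an SDK function.

```python
from datetime import datetime

# The two layouts named in the parameter descriptions above.
_ACCEPTED_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_export_date(value: str) -> datetime:
    """Hypothetical helper: parse a date string in one of the two accepted layouts."""
    for fmt in _ACCEPTED_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError(
        f"{value!r} is not formatted as YYYY-MM-DD or YYYY-MM-DD hh:mm:ss"
    )
```

Parsing both bounds this way also makes it easy to confirm that the start is not later than the end.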

export_queued_data_rows(timeout_seconds=120, include_metadata: bool = False) → List[Dict[str, str]][source]

Returns all data rows that are currently enqueued for this project.

Parameters:

timeout_seconds (float) – Max waiting time, in seconds.
include_metadata (bool) – True to return related DataRow metadata

Returns:

Data row fields for all data rows in the queue as json

Raises:

LabelboxError – if the export fails or is unable to download within the specified time.

export_v2(task_name: str | None = None, filters: ProjectExportFilters | None = None, params: ProjectExportParams | None = None) → Task | ExportTask[source]

Creates a project export task with the given params and returns the task.

For more information visit: https://docs.labelbox.com/docs/exports-v2#export-from-a-project-python-sdk

>>>     task = project.export_v2(
>>>         filters={
>>>             "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>             "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>             "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...], # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>             "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...]
>>>         },
>>>         params={
>>>             "performance_details": False,
>>>             "label_details": True
>>>         })
>>>     task.wait_till_done()
>>>     task.result

extend_reservations(queue_type) → int[source]

Extends all the current reservations for the current user on the given queue type.

Parameters:: queue_type (str) – Either “LabelingQueue” or “ReviewQueue”

Returns:: int, the number of reservations that were extended.

get_label_count() → int[source]: Returns: the total number of labels in this project.

get_queue_mode() → QueueMode[source]

Provides the queue mode used for this project.

Deprecation notice: This method is deprecated and will be removed in a future version. To obtain the queue mode of a project, simply refer to the queue_mode attribute of a Project.

For more information, visit https://docs.labelbox.com/reference/migrating-to-workflows#upcoming-changes

Returns: the QueueMode for this project

get_resource_tags() → List[ResourceTag][source]: Returns tags for a project

label_generator(timeout_seconds=600, **kwargs)[source]

Download text and image annotations, or video annotations.

For a mixture of text/image and video, use project.export_labels()

Returns:: LabelGenerator for accessing labels

labeler_performance() → PaginatedCollection[source]

Returns the labeler performances for this Project.

Returns:: A PaginatedCollection of LabelerPerformance objects.

labels(datasets=None, order_by=None) → PaginatedCollection[source]

Custom relationship expansion method to support limited filtering.

Parameters:

datasets (iterable of Dataset) – Optional collection of Datasets whose Labels are sought. If not provided, all Labels in this Project are returned.
order_by (None or (Field, Field.Order)) – Ordering clause.

members() → PaginatedCollection[source]

Fetch all current members for this project

Returns:: A PaginatedCollection of `ProjectMember`s

move_data_rows_to_task_queue(data_row_ids: UniqueIds | GlobalKeys, task_queue_id: str)[source]

move_data_rows_to_task_queue(data_row_ids: List[str], task_queue_id: str)

Moves data rows to the specified task queue.

Parameters:

data_row_ids – a list of data row ids to be moved. This can be a list of strings or a DataRowIdentifiers object. DataRowIdentifier objects are lists of ids or global keys. A DataIdentifier object can be a UniqueIds or GlobalKeys class.
task_queue_id – the task queue id to be moved to, or None to specify the “Done” queue

Returns:

None if successful, or a raised error on failure

review_metrics(net_score) → int[source]

Returns this Project’s review metrics.

Parameters:: net_score (None or Review.NetScore) – Indicates desired metric.
Returns:: int, aggregation count of reviews for given net_score.

set_labeling_parameter_overrides(data: List[Tuple[DataRow | UniqueId | GlobalKey, int]]) → bool[source]

Adds labeling parameter overrides to this project.

See information on priority here:

https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system

>>> project.set_labeling_parameter_overrides([
>>>     (data_row_id1, 2), (data_row_id2, 1)])
or
>>> project.set_labeling_parameter_overrides([
>>>     (data_row_gk1, 2), (data_row_gk2, 1)])

Parameters:

data (iterable) –

An iterable of tuples. Each tuple must contain either (DataRow, DataRowPriority<int>) or (DataRowIdentifier, priority<int>) for the new override. DataRowIdentifier is an object representing a data row id or a global key. A DataIdentifier object can be a UniqueIds or GlobalKeys class. NOTE - passing a whole DataRow is deprecated. Please use a DataRowIdentifier instead.

Priority:

Data will be labeled in priority order.
- A lower number priority is labeled first.
- All signed 32-bit integers are accepted, from -2147483648 to 2147483647.
Priority is not the queue position.
- The position is determined by the relative priority.
- E.g. [(data_row_1, 5), (data_row_2, 2), (data_row_3, 10)]
  will be assigned in the following order: [data_row_2, data_row_1, data_row_3]
The priority only affects items in the queue.
- Assigning a priority will not automatically add the item back into the queue.

Returns:

bool, indicates if the operation was a success.
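The priority rules above (lower numbers first, any signed 32-bit integer accepted) can be illustrated with a small local sketch. labeling_order is a hypothetical helper, not part of the SDK, and how the server breaks ties between equal priorities is not stated in the source; this sketch simply keeps input order for ties.

```python
from typing import List, Tuple

# Signed 32-bit range stated in the description above.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def labeling_order(overrides: List[Tuple[str, int]]) -> List[str]:
    """Hypothetical helper: order data rows by priority, lowest number first."""
    for data_row, priority in overrides:
        if not INT32_MIN <= priority <= INT32_MAX:
            raise ValueError(f"Priority for {data_row} is outside the 32-bit range")
    # sorted() is stable, so rows with equal priority keep their input order
    # (an assumption of this sketch, not documented server behaviour).
    return [data_row for data_row, _ in sorted(overrides, key=lambda pair: pair[1])]


print(labeling_order([("data_row_1", 5), ("data_row_2", 2), ("data_row_3", 10)]))
```

The output position depends only on the relative priorities, which is why priority 5 lands second here rather than fifth.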

setup(labeling_frontend, labeling_frontend_options) → None[source]

Finalizes the Project setup.

Parameters:

labeling_frontend (LabelingFrontend) – Which UI to use to label the data.
labeling_frontend_options (dict or str) – Labeling frontend options, a.k.a. project ontology. If given a dict it will be converted to str using json.dumps.

setup_editor(ontology) → None[source]

Sets up the project using the Pictor editor.

Parameters:: ontology (Ontology) – The ontology to attach to the project

task_queues() → List[TaskQueue][source]

Fetch all task queues that belong to this project

Returns:: A List of `TaskQueue`s

update(**kwargs)[source]

Updates this project with the specified attributes

Parameters:: kwargs – a dictionary containing attributes to be upserted

Note that the queue_mode cannot be changed after a project has been created.

Additionally, the quality setting cannot be changed after a project has been created. The quality mode for a project is inferred through the following attributes:

Benchmark:: auto_audit_number_of_labels = 1 and auto_audit_percentage = 1.0
Consensus:: auto_audit_number_of_labels > 1 or auto_audit_percentage <= 1.0

Attempting to switch between benchmark and consensus modes is an invalid operation and will result in an error.
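The inference rule above can be written out as a small function. This is a sketch under one stated assumption: because the two conditions overlap as written, the benchmark condition is checked first. infer_quality_mode is a hypothetical name, and the returned strings are labels chosen for illustration only.

```python
from typing import Optional


def infer_quality_mode(
    auto_audit_number_of_labels: int,
    auto_audit_percentage: float,
) -> Optional[str]:
    """Hypothetical helper: apply the benchmark/consensus rule described above."""
    # Assumption: benchmark is tested first because the consensus condition
    # as written would also match the benchmark values.
    if auto_audit_number_of_labels == 1 and auto_audit_percentage == 1.0:
        return "benchmark"
    if auto_audit_number_of_labels > 1 or auto_audit_percentage <= 1.0:
        return "consensus"
    return None  # neither condition from the description applies
```

Comparing the inferred mode before and after a planned update is one way to spot a change that would switch modes and therefore fail.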

update_data_row_labeling_priority(data_rows: UniqueIds | GlobalKeys, priority: int) → bool[source]

update_data_row_labeling_priority(data_rows: List[str], priority: int) → bool

Updates labeling parameter overrides for this project in bulk. This method allows up to 1 million data rows to be updated at once.

See information on priority here:: https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system

Parameters:

data_rows – a list of data row ids to update priorities for. This can be a list of strings or a DataRowIdentifiers object. DataRowIdentifier objects are lists of ids or global keys. A DataIdentifier object can be a UniqueIds or GlobalKeys class.
priority (int) – Priority for the new override. See above for more information.

Returns:

bool, indicates if the operation was a success.

update_project_resource_tags(resource_tag_ids: List[str]) → List[ResourceTag][source]

Creates project resource tags

Parameters:: resource_tag_ids –
Returns:: a list of ResourceTag ids that were created.

upload_annotations(name: str, annotations: str | Path | Iterable[Dict], validate: bool = False) → BulkImportRequest[source]

Uploads annotations to a new Editor project.

Parameters:

name (str) – name of the BulkImportRequest job
annotations (str or Path or Iterable) – url that is publicly accessible by Labelbox containing an ndjson file OR local path to an ndjson file OR iterable of annotation rows
validate (bool) – Whether or not to validate the payload before uploading.

Returns:

BulkImportRequest

upsert_instructions(instructions_file: str) → None[source]

Uploads instructions to the UI. Running more than once will replace the instructions

Parameters:

instructions_file (str) – Path to a local file. Must be a pdf or html file.

Raises:

ValueError –

- project must be set up
- instructions file must have a “.pdf” or “.html” extension
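The extension requirement can be checked locally before calling upsert_instructions. The sketch below is not the SDK's own check; check_instructions_file is a hypothetical helper, and the comparison here is case-sensitive, which may or may not match the SDK's behaviour.

```python
from pathlib import Path

# Extensions named in the Raises section above.
ALLOWED_SUFFIXES = {".pdf", ".html"}


def check_instructions_file(instructions_file: str) -> str:
    """Hypothetical helper: return the file extension, or raise if it is not allowed."""
    suffix = Path(instructions_file).suffix
    # Case-sensitive comparison; this is an assumption of the sketch.
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError('instructions file must have a ".pdf" or ".html" extension')
    return suffix
```

This only covers the extension rule; whether the project has been set up still has to be confirmed separately.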

class labelbox.schema.project.ProjectMember(client, field_values)[source]: Bases: DbObject