Project
- class labelbox.schema.project.Project(client, field_values)[source]
Bases: DbObject, Updateable, Deletable
A Project is a container that includes a labeling frontend, an ontology, datasets and labels.
- name
- Type:
str
- description
- Type:
str
- updated_at
- Type:
datetime
- created_at
- Type:
datetime
- setup_complete
- Type:
datetime
- last_activity_time
- Type:
datetime
- queue_mode
- Type:
string
- auto_audit_number_of_labels
- Type:
int
- auto_audit_percentage
- Type:
float
- created_by
ToOne relationship to User
- Type:
Relationship
- organization
ToOne relationship to Organization
- Type:
Relationship
- labeling_frontend
ToOne relationship to LabelingFrontend
- Type:
Relationship
- labeling_frontend_options
ToMany relationship to LabelingFrontendOptions
- Type:
Relationship
- labeling_parameter_overrides
ToMany relationship to LabelingParameterOverride
- Type:
Relationship
- webhooks
ToMany relationship to Webhook
- Type:
Relationship
- benchmarks
ToMany relationship to Benchmark
- Type:
Relationship
- ontology
ToOne relationship to Ontology
- Type:
Relationship
- batches() PaginatedCollection [source]
Fetch all batches that belong to this project
- Returns:
A PaginatedCollection of `Batch`es
- bulk_import_requests() PaginatedCollection [source]
Returns bulk import request objects which are used in model-assisted labeling. These are returned with the oldest first, and most recent last.
- create_batch(name: str, data_rows: List[str | DataRow] | None = None, priority: int = 5, consensus_settings: Dict[str, float] | None = None, global_keys: List[str] | None = None)[source]
Creates a new batch for a project. One of global_keys or data_rows must be provided, but not both. A maximum of 100,000 data rows can be added to a batch.
- Parameters:
name – a name for the batch, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: the created batch
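The argument rules described above (exactly one of data_rows or global_keys, and at most 100,000 data rows per batch) can be checked on the client before calling create_batch. The following is a minimal sketch under those stated rules; `validate_batch_args` and `MAX_BATCH_SIZE` are hypothetical names that are not part of the SDK, and the SDK's own validation may differ.

```python
from typing import List, Optional

# Assumption taken from the description above: at most 100,000 data rows per batch.
MAX_BATCH_SIZE = 100_000


def validate_batch_args(data_rows: Optional[List[str]] = None,
                        global_keys: Optional[List[str]] = None) -> int:
    """Hypothetical helper (not an SDK function).

    Checks the documented create_batch constraints and returns the number
    of data rows that would be added to the batch.
    """
    # Both missing or both present violates "one of ..., but not both".
    if (data_rows is None) == (global_keys is None):
        raise ValueError("Provide exactly one of data_rows or global_keys.")
    selected = data_rows if data_rows is not None else global_keys
    if len(selected) > MAX_BATCH_SIZE:
        raise ValueError(
            f"A batch can hold at most {MAX_BATCH_SIZE} data rows, got {len(selected)}."
        )
    return len(selected)
```

A caller could run this check first and only then pass the same arguments to create_batch.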
- create_batches(name_prefix: str, data_rows: List[str | DataRow] | None = None, global_keys: List[str] | None = None, priority: int = 5, consensus_settings: Dict[str, float] | None = None) CreateBatchesTask [source]
Creates batches for a project from a list of data rows. One of global_keys or data_rows must be provided, but not both. When more than 100k data rows are specified and thus multiple batches are needed, the specific batch that each data row will be placed in is undefined.
Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.
- Parameters:
name_prefix – a prefix for the batch names, must be unique within a project
data_rows – Either a list of DataRows or Data Row ids.
global_keys – global keys for data rows to add to the batch.
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: a task for the created batches
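The naming scheme described above (name prefix plus a 4-digit suffix starting at 0000) can be illustrated with a short sketch. `plan_batches` is a hypothetical helper, not an SDK function. It splits ids into contiguous chunks purely for illustration; as noted above, the batch that a given data row actually lands in is undefined.

```python
from typing import Dict, List

# Assumption taken from the text above: one batch holds at most 100k data rows.
BATCH_LIMIT = 100_000


def plan_batches(name_prefix: str, ids: List[str],
                 limit: int = BATCH_LIMIT) -> Dict[str, List[str]]:
    """Hypothetical helper: map generated batch names to chunks of ids."""
    plan: Dict[str, List[str]] = {}
    for index, start in enumerate(range(0, len(ids), limit)):
        # The suffix is a zero-padded 4-digit counter: 0000, 0001, 0002, ...
        plan[f"{name_prefix}{index:04d}"] = ids[start:start + limit]
    return plan


# Small limit so the example stays readable.
names = list(plan_batches("batch", ["a", "b", "c", "d", "e"], limit=2))
print(names)
```

With five ids and a limit of two, three names are generated, matching the "batch0000", "batch0001", "batch0002" example in the description.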
- create_batches_from_dataset(name_prefix: str, dataset_id: str, priority: int = 5, consensus_settings: Dict[str, float] | None = None) CreateBatchesTask [source]
Creates batches for a project from a dataset, selecting only the data rows that are not already added to the project. When the dataset contains more than 100k data rows and multiple batches are needed, the specific batch that each data row will be placed in is undefined. Note that data rows may not be immediately available for a project after being added to a dataset; use the _wait_until_data_rows_are_processed method to ensure that data rows are available before creating batches.
Batches will be created with the specified name prefix and a unique suffix. The suffix will be a 4-digit number starting at 0000. For example, if the name prefix is “batch” and 3 batches are created, the names will be “batch0000”, “batch0001”, and “batch0002”. This method will throw an error if a batch with the same name already exists.
- Parameters:
name_prefix – a prefix for the batch names, must be unique within a project
dataset_id – the id of the dataset to create batches from
priority – An optional priority for the Data Rows in the Batch. 1 highest -> 5 lowest
consensus_settings – An optional dictionary with consensus settings: {‘number_of_labels’: 3, ‘coverage_percentage’: 0.1}
Returns: a task for the created batches
- enable_model_assisted_labeling(toggle: bool = True) bool [source]
Turns model assisted labeling either on or off based on input
- Parameters:
toggle (bool) – True or False boolean
- Returns:
True if toggled on or False if toggled off
- export(task_name: str | None = None, filters: ProjectExportFilters | None = None, params: ProjectExportParams | None = None) ExportTask [source]
Creates a project export task with the given params and returns the task.
>>> task = project.export(
>>>     filters={
>>>         "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
>>>         "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...],  # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...]
>>>         "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...]
>>>     },
>>>     params={
>>>         "performance_details": False,
>>>         "label_details": True
>>>     })
>>> task.wait_till_done()
>>> task.result
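The date-range filters in the example above are pairs of timestamp strings. As a minimal sketch, such a filters dictionary could be assembled from datetime objects as follows. `build_export_filters` is a hypothetical helper, and the only filter keys it uses are the ones that appear in the example; nothing here is confirmed beyond that example.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

# Matches the "2000-01-01 00:00:00" style used in the example above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_export_filters(activity_start: datetime,
                         activity_end: datetime,
                         batch_ids: Optional[List[str]] = None) -> Dict[str, Any]:
    """Hypothetical helper: build a filters dict shaped like the example."""
    if activity_start > activity_end:
        raise ValueError("activity_start must not be later than activity_end.")
    filters: Dict[str, Any] = {
        "last_activity_at": [
            activity_start.strftime(TIMESTAMP_FORMAT),
            activity_end.strftime(TIMESTAMP_FORMAT),
        ]
    }
    # Only include batch_ids when the caller actually supplied some.
    if batch_ids:
        filters["batch_ids"] = list(batch_ids)
    return filters


filters = build_export_filters(datetime(2000, 1, 1), datetime(2050, 1, 1))
print(filters)
```

The resulting dictionary would then be passed as the filters argument of the export call.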
- export_issues(status=None) str [source]
Calls the server-side Issues exporting that returns the URL to that payload.
- Parameters:
status (string) – valid values: Open, Resolved
- Returns:
URL of the data file with this Project’s issues.
- export_labels(download=False, timeout_seconds=1800, **kwargs) str | List[Dict[Any, Any]] | None [source]
Calls the server-side Label exporting that generates a JSON payload, and returns the URL to that payload.
A new URL is generated at most once every 30 minutes.
- Parameters:
download (bool) – Returns the url if False
timeout_seconds (float) – Max waiting time, in seconds.
start (str) – Earliest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
end (str) – Latest date for labels, formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_start (str) – Will include all labels that have had any updates to data rows, issues, comments, metadata, or reviews since this timestamp. Formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
last_activity_end (str) – Will include all labels that do not have any updates to data rows, issues, comments, metadata, or reviews after this timestamp. Formatted “YYYY-MM-DD” or “YYYY-MM-DD hh:mm:ss”
- Returns:
URL of the data file with this Project’s labels. If the server didn’t generate during the timeout_seconds period, None is returned.
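The start, end, last_activity_start and last_activity_end parameters above accept two date formats. A minimal sketch of checking a string against those formats before calling export_labels is shown below. `is_valid_export_date` is a hypothetical helper, and the check relies on `datetime.strptime`, which is somewhat lenient (for example, it also accepts months and days without zero padding), so the server may be stricter.

```python
from datetime import datetime

# The two documented formats: "YYYY-MM-DD" and "YYYY-MM-DD hh:mm:ss".
ACCEPTED_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def is_valid_export_date(value: str) -> bool:
    """Hypothetical helper: True if value parses with one of the formats."""
    for fmt in ACCEPTED_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            # Try the next accepted format.
            continue
    return False
```

For instance, "2023-01-31" and "2023-01-31 12:30:00" pass this check, while "01/31/2023" and the impossible date "2023-02-30" do not.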
- export_queued_data_rows(timeout_seconds=120, include_metadata: bool = False) List[Dict[str, str]] [source]
Returns all data rows that are currently enqueued for this project.
- Parameters:
timeout_seconds (float) – Max waiting time, in seconds.
include_metadata (bool) – True to return related DataRow metadata
- Returns:
Data row fields for all data rows in the queue as json
- Raises:
LabelboxError – if the export fails or is unable to download within the specified time.
- export_v2(task_name: str | None = None, filters: ProjectExportFilters | None = None, params: ProjectExportParams | None = None) Task | ExportTask [source]
Creates a project export task with the given params and returns the task.
For more information visit: https://docs.labelbox.com/docs/exports-v2#export-from-a-project-python-sdk
>>> task = project.export_v2( >>> filters={ >>> "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"], >>> "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"], >>> "data_row_ids": [DATA_ROW_ID_1, DATA_ROW_ID_2, ...] # or global_keys: [DATA_ROW_GLOBAL_KEY_1, DATA_ROW_GLOBAL_KEY_2, ...] >>> "batch_ids": [BATCH_ID_1, BATCH_ID_2, ...] >>> }, >>> params={ >>> "performance_details": False, >>> "label_details": True >>> }) >>> task.wait_till_done() >>> task.result
- extend_reservations(queue_type) int [source]
Extends all the current reservations for the current user on the given queue type.
- Parameters:
queue_type (str) – Either “LabelingQueue” or “ReviewQueue”
- Returns:
int, the number of reservations that were extended.
- get_queue_mode() QueueMode [source]
Provides the queue mode used for this project.
Deprecation notice: This method is deprecated and will be removed in a future version. To obtain the queue mode of a project, simply refer to the queue_mode attribute of a Project.
For more information, visit https://docs.labelbox.com/reference/migrating-to-workflows#upcoming-changes
Returns: the QueueMode for this project
- get_resource_tags() List[ResourceTag] [source]
Returns tags for a project
- label_generator(timeout_seconds=600, **kwargs)[source]
Download text and image annotations, or video annotations.
For a mixture of text/image and video, use project.export_labels()
- Returns:
LabelGenerator for accessing labels
- labeler_performance() PaginatedCollection [source]
Returns the labeler performances for this Project.
- Returns:
A PaginatedCollection of LabelerPerformance objects.
- labels(datasets=None, order_by=None) PaginatedCollection [source]
Custom relationship expansion method to support limited filtering.
- Parameters:
datasets (iterable of Dataset) – Optional collection of Datasets whose Labels are sought. If not provided, all Labels in this Project are returned.
order_by (None or (Field, Field.Order)) – Ordering clause.
- members() PaginatedCollection [source]
Fetch all current members for this project
- Returns:
A PaginatedCollection of `ProjectMember`s
- move_data_rows_to_task_queue(data_row_ids: UniqueIds | GlobalKeys, task_queue_id: str)[source]
- move_data_rows_to_task_queue(data_row_ids: List[str], task_queue_id: str)
Moves data rows to the specified task queue.
- Parameters:
data_row_ids – a list of data row ids to be moved. This can be a list of strings or a DataRowIdentifiers object. A DataRowIdentifiers object is a list of ids or global keys, and can be an instance of the UniqueIds or GlobalKeys class.
task_queue_id – the task queue id to be moved to, or None to specify the “Done” queue
- Returns:
None if successful, or a raised error on failure
- review_metrics(net_score) int [source]
Returns this Project’s review metrics.
- Parameters:
net_score (None or Review.NetScore) – Indicates desired metric.
- Returns:
int, aggregation count of reviews for given net_score.
- set_labeling_parameter_overrides(data: List[Tuple[DataRow | UniqueId | GlobalKey, int]]) bool [source]
Adds labeling parameter overrides to this project.
- See information on priority here:
https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system
>>> project.set_labeling_parameter_overrides([
>>>     (data_row_id1, 2), (data_row_id2, 1)])
or
>>> project.set_labeling_parameter_overrides([
>>>     (data_row_gk1, 2), (data_row_gk2, 1)])
- Parameters:
data (iterable) –
An iterable of tuples. Each tuple must contain either (DataRow, DataRowPriority<int>) or (DataRowIdentifier, priority<int>) for the new override. DataRowIdentifier is an object representing a data row id or a global key. A DataRowIdentifier object can be a UniqueIds or GlobalKeys class. NOTE - passing a whole DataRow is deprecated. Please use a DataRowIdentifier instead.
- Priority:
- Data will be labeled in priority order.
A lower number priority is labeled first.
All signed 32-bit integers are accepted, from -2147483648 to 2147483647.
- Priority is not the queue position.
The position is determined by the relative priority.
- E.g. [(data_row_1, 5), (data_row_2, 2), (data_row_3, 10)]
will be assigned in the following order: [data_row_2, data_row_1, data_row_3]
- The priority only affects items in the queue.
Assigning a priority will not automatically add the item back into the queue.
- Returns:
bool, indicates if the operation was a success.
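The priority rules above (a lower number is labeled first, and any signed 32-bit integer is accepted) can be sketched in a few lines. `labeling_order` is a hypothetical helper, not an SDK function; it only models the relative ordering and ignores anything else the queue may take into account.

```python
from typing import List, Tuple

# Signed 32-bit range stated above: -2147483648 to 2147483647.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def labeling_order(overrides: List[Tuple[str, int]]) -> List[str]:
    """Hypothetical helper: order data rows by priority, lowest number first."""
    for data_row, priority in overrides:
        if not INT32_MIN <= priority <= INT32_MAX:
            raise ValueError(f"Priority for {data_row} is outside the 32-bit range.")
    # sorted() is stable, so equal priorities keep their input order.
    return [data_row for data_row, _ in sorted(overrides, key=lambda pair: pair[1])]


order = labeling_order([("data_row_1", 5), ("data_row_2", 2), ("data_row_3", 10)])
print(order)  # ['data_row_2', 'data_row_1', 'data_row_3']
```

The order depends only on how the priorities compare with each other, which is why priority is not the same as queue position.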
- setup(labeling_frontend, labeling_frontend_options) None [source]
Finalizes the Project setup.
- Parameters:
labeling_frontend (LabelingFrontend) – Which UI to use to label the data.
labeling_frontend_options (dict or str) – Labeling frontend options, a.k.a. project ontology. If given a dict it will be converted to str using json.dumps.
- setup_editor(ontology) None [source]
Sets up the project using the Pictor editor.
- Parameters:
ontology (Ontology) – The ontology to attach to the project
- task_queues() List[TaskQueue] [source]
Fetch all task queues that belong to this project
- Returns:
A List of `TaskQueue`s
- update(**kwargs)[source]
Updates this project with the specified attributes
- Parameters:
kwargs – a dictionary containing attributes to be upserted
Note that the queue_mode cannot be changed after a project has been created.
Additionally, the quality setting cannot be changed after a project has been created. The quality mode for a project is inferred through the following attributes:
- Benchmark:
auto_audit_number_of_labels = 1 and auto_audit_percentage = 1.0
- Consensus:
auto_audit_number_of_labels > 1 or auto_audit_percentage <= 1.0
Attempting to switch between benchmark and consensus modes is an invalid operation and will result in an error.
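The inference rules above can be written as a small function. This is a sketch only: `infer_quality_mode` is a hypothetical name, and because the two conditions as stated overlap (the benchmark values also satisfy `auto_audit_percentage <= 1.0`), the sketch assumes the benchmark condition is checked first.

```python
def infer_quality_mode(auto_audit_number_of_labels: int,
                       auto_audit_percentage: float) -> str:
    """Hypothetical helper: infer the quality mode from the two attributes."""
    # Assumption: benchmark takes precedence, since its values also
    # satisfy the consensus condition as written above.
    if auto_audit_number_of_labels == 1 and auto_audit_percentage == 1.0:
        return "benchmark"
    if auto_audit_number_of_labels > 1 or auto_audit_percentage <= 1.0:
        return "consensus"
    raise ValueError("Attribute values match neither documented quality mode.")


print(infer_quality_mode(1, 1.0))  # benchmark
print(infer_quality_mode(3, 0.5))  # consensus
```

Comparing the inferred mode before and after a planned update is one way to detect an attempted switch between modes before the server rejects it.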
- update_data_row_labeling_priority(data_rows: UniqueIds | GlobalKeys, priority: int) bool [source]
- update_data_row_labeling_priority(data_rows: List[str], priority: int) bool
Updates labeling parameter overrides to this project in bulk. This method allows up to 1 million data rows to be updated at once.
- See information on priority here:
https://docs.labelbox.com/en/configure-editor/queue-system#reservation-system
- Parameters:
data_rows – a list of data row ids to update priorities for. This can be a list of strings or a DataRowIdentifiers object. A DataRowIdentifiers object is a list of ids or global keys, and can be an instance of the UniqueIds or GlobalKeys class.
priority (int) – Priority for the new override. See above for more information.
- Returns:
bool, indicates if the operation was a success.
- update_project_resource_tags(resource_tag_ids: List[str]) List[ResourceTag] [source]
Creates project resource tags
- Parameters:
resource_tag_ids –
- Returns:
a list of the ResourceTag ids that were created.
- upload_annotations(name: str, annotations: str | Path | Iterable[Dict], validate: bool = False) BulkImportRequest [source]
Uploads annotations to a new Editor project.
- Parameters:
name (str) – name of the BulkImportRequest job
annotations (str or Path or Iterable) – url that is publicly accessible by Labelbox containing an ndjson file OR local path to an ndjson file OR iterable of annotation rows
validate (bool) – Whether or not to validate the payload before uploading.
- Returns:
BulkImportRequest
- upsert_instructions(instructions_file: str) None [source]
Uploads instructions to the UI. Running more than once will replace the instructions
- Parameters:
instructions_file (str) – Path to a local file. Must be a pdf or html file.
- Raises:
ValueError –
- project must be set up
- instructions file must have a “.pdf” or “.html” extension
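The extension requirement above can be checked locally before calling upsert_instructions. The sketch below uses a hypothetical helper, `check_instructions_file`, and assumes a case-sensitive comparison against the two documented extensions; the SDK's own check may behave differently.

```python
from pathlib import Path

# The two extensions named in the documentation above.
ALLOWED_SUFFIXES = {".pdf", ".html"}


def check_instructions_file(instructions_file: str) -> str:
    """Hypothetical helper: return the file's suffix or raise ValueError."""
    suffix = Path(instructions_file).suffix
    # Assumption: the comparison is case-sensitive.
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(
            f'instructions file must have a ".pdf" or ".html" extension, got "{suffix}"'
        )
    return suffix
```

This only inspects the file name; it does not verify that the file exists or that the project has been set up.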