Embedding

class labelbox.schema.embedding.Embedding(client: AdvClient, *, id: str, name: str, custom: bool, dims: int)

Bases: BaseModel

An Embedding is used to power similarity search in Catalog.

This model supports the representation of both Precomputed embeddings that Labelbox provides, and Custom embeddings which can be imported directly into Labelbox.

id

The ID of the embedding

Type: str

name

The name of the embedding

Type: str

dims

The number of dimensions of the vector space in which words, phrases, or other entities are embedded

Type: int

custom

True if the embedding is a Custom embedding, False if it is a Precomputed embedding

Type: bool

delete()[source]: Delete a custom embedding. If the embedding does not exist or cannot be deleted, an AdvLibException is raised.

get_imported_vector_count() → int

Return the number of vectors actually imported into Labelbox. This is an accurate count of the vectors written to the vector search system.

Returns: The number of imported vectors.
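Because imported vectors are written asynchronously, it can be useful to poll this count after an import. The following is a minimal sketch, not part of the SDK: wait_for_vectors, its timeout values, and the injectable sleep argument are hypothetical, and the only SDK call it relies on is get_imported_vector_count().

```python
import time


def wait_for_vectors(embedding, expected, timeout_s=300.0, poll_s=5.0, sleep=time.sleep):
    """Poll embedding.get_imported_vector_count() until it reaches `expected`.

    Hypothetical helper. Returns the last observed count, which may still be
    below `expected` if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout_s
    count = embedding.get_imported_vector_count()
    while count < expected and time.monotonic() < deadline:
        sleep(poll_s)  # back off between polls
        count = embedding.get_imported_vector_count()
    return count
```

The caller decides what to do when the returned count is lower than expected, for example retrying the import or waiting longer.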

import_vectors_from_file(path: str, callback: Callable[[Dict[str, Any]], None] | None = None)

Import vectors into a given embedding from an NDJSON file. An NDJSON file consists of newline-delimited JSON: each line of the file is valid JSON, but the file as a whole is not. The format of the file looks like:

{"id": DATAROW ID1, "vector": [ array of floats ]}

{"id": DATAROW ID2, "vector": [ array of floats ]}

{"id": DATAROW ID3, "vector": [ array of floats ]}

The vectors are added to the system asynchronously, and it may take up to a couple of minutes before they are usable via similarity search. Note that you also need to upload at least 1000 vectors for similarity search to be activated.
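A file in the NDJSON format described above can be produced with the standard library alone. This is a minimal sketch under stated assumptions: write_vectors_ndjson is a hypothetical helper, and it assumes the vectors are already available as a mapping from data row ID to a sequence of numbers.

```python
import json


def write_vectors_ndjson(path, vectors_by_data_row_id):
    """Write one JSON object per line: {"id": <data row ID>, "vector": [floats]}.

    Hypothetical helper. Returns the number of lines written.
    """
    with open(path, "w", encoding="utf-8") as f:
        for data_row_id, vector in vectors_by_data_row_id.items():
            record = {"id": data_row_id, "vector": [float(x) for x in vector]}
            # json.dumps emits a single line, so each record stays on its own line.
            f.write(json.dumps(record) + "\n")
    return len(vectors_by_data_row_id)
```

The resulting path can then be passed as the path argument of import_vectors_from_file. Each vector is expected to have as many entries as the embedding's dims value.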

Parameters:

path – The path to the NDJSON file.
callback – A callback function used to get the status of each batch of lines uploaded.
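A callback matching the Callable[[Dict[str, Any]], None] signature might look like the sketch below. The keys of the status dictionary are not described here, so this example makes no assumption about them and stores each dictionary unchanged. make_batch_recorder and on_batch are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple


def make_batch_recorder() -> Tuple[Callable[[Dict[str, Any]], None], List[Dict[str, Any]]]:
    """Return a callback plus the list it appends each batch status to."""
    statuses: List[Dict[str, Any]] = []

    def on_batch(status: Dict[str, Any]) -> None:
        # The structure of `status` is not assumed; keep it as received.
        statuses.append(status)
        print(f"batch {len(statuses)}: {status}")

    return on_batch, statuses
```

With on_batch, statuses = make_batch_recorder(), the callback would be passed as embedding.import_vectors_from_file(path, callback=on_batch), after which statuses holds one entry per reported batch.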