Configure Your Project Settings

There are 3 steps to configuring your project:

  1. Choose a use case
  2. Choose a pipeline
  3. Specify advanced settings

Choose a Use Case

Based on the type of data you uploaded, you will be given a selection of use cases to choose from.

We support the following use cases:

Object Detection: This project type supports geometry-based labels, with the option to specify custom attributes or metadata for each label. Geometry annotations include bounding boxes, polygons, lines, points, and cuboids, each annotated with a unique ID.

Semantic Segmentation: Use 2D semantic segmentation for pixel-wise annotation, which is useful for classifying both the things of the world (e.g., discrete objects) and amorphous regions (e.g., sky and ground). We support both partial and full semantic segmentation, as well as instance masks for panoptic segmentation.

Object & Event Detection: Use this for geometry-based object tracking and event tagging in videos. Use bounding boxes, polygons, lines, points, and cuboids to track objects and people from frame to frame with unique and consistent object IDs. Create named events tied to specific frames and optionally add verbal or written descriptions for each event.

Named Entity Recognition: Use this to tag entities found in text using specific labels. For example, you could set up a task to label any mention of time or duration in a set of text. This project type also supports named semantic relationships between entities.
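
As a rough illustration of what tagged entities and a named relationship between them might look like, here is a minimal sketch. The class names (`Entity`, `Relation`), the label names (`TIME`, `DURATION`), the relation name (`has_duration`), and the character-offset representation are all assumptions made for this example; they are not the platform's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A labeled span of text (hypothetical structure, for illustration only)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


@dataclass(frozen=True)
class Relation:
    """A named semantic relationship between two entities (hypothetical)."""
    name: str
    source: Entity
    target: Entity


def make_entity(text: str, phrase: str, label: str) -> Entity:
    """Build an entity from the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Entity(start=start, end=start + len(phrase), label=label)


def span_text(text: str, entity: Entity) -> str:
    """Return the text covered by an entity."""
    return text[entity.start:entity.end]


text = "The meeting starts at 3 pm and lasts two hours."
time = make_entity(text, "3 pm", "TIME")
duration = make_entity(text, "two hours", "DURATION")
relation = Relation(name="has_duration", source=time, target=duration)

print(span_text(text, time), "->", span_text(text, duration))
```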

Entity Extraction: Extract key fields from your documents. The output includes a bounding box for each field along with a text transcription. We also support named semantic relationships between fields, for example relating an invoice amount label to the amount listed.

Content Classification: Classify a data asset by assessing which category a piece of text, image, audio, video, or other content falls into.

Content Collection: Gather information from a data asset. This could include classification, free-form text gathering, ranking, and more.

Text Generation: Ask annotators to write a text response based on some task inputs that you provide. For example, write a summary based on the data assets provided.

Audio Transcription: Transcribe audio and video assets. Select a chunk of audio or video to get an auto-transcribed selection that you can edit. Specify different speakers for conversational dialogues. You can also add NER labeling on top of the transcribed text.

LiDAR Annotation: Use LiDAR annotation to annotate your point cloud data with 3D cuboids in Scale’s platform. While autonomous driving sequences are the primary use case we support, Scale’s platform can also ingest 3D point cloud reconstructions to support a wide range of 3D annotation projects, such as sidewalk robotics, drone delivery, AR/VR, and more.

📘

LiDAR annotation is available to Pro and Enterprise customers only.

If you’re interested in using Studio’s LiDAR annotation capabilities, please contact the team at [email protected].

For details about each use case, see the use cases documentation.

Choose Your Pipeline

For some use cases, you can specify the kind of pipeline you want to use.

Standard Pipeline: This pipeline uses one attempt per task, followed by a configurable number of reviews of that attempt. The project default is 0 reviews, but you can change the number of reviews if desired.

Consensus Pipeline: Choose this pipeline if you want to maximize quality or if you have a subjective task that may have multiple right answers. A consensus pipeline sends each task to 3 different attempters, and the majority answer is delivered to you as the “final” response.
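
The majority rule can be sketched in a few lines. This is only an illustration of the idea, not the platform's implementation: the function name `majority_answer` is hypothetical, and returning `None` when no answer has a strict majority is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_answer(responses: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the response given by a strict majority of attempters, else None."""
    if not responses:
        return None
    answer, count = Counter(responses).most_common(1)[0]
    # A strict majority means more than half of all responses agree.
    # Returning None otherwise is an assumption made for this sketch.
    return answer if count > len(responses) / 2 else None


# Three attempters, two of whom agree.
print(majority_answer(["cat", "cat", "dog"]))   # prints: cat
# Three attempters who all disagree: no majority.
print(majority_answer(["cat", "dog", "bird"]))  # prints: None
```

For subjective tasks, a result with no majority is a useful signal in its own right, since it suggests the instructions may need to be clarified.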

Specify Advanced Settings

| Pipeline | Setting Name | Description |
| --- | --- | --- |
| Shared by all pipelines | Number of reviews | Specify the number of reviews needed |
| Shared by all pipelines | Auto redo rejected tasks | Automatically redo rejected audits with a 50% surcharge for the redos |
| Shared by all pipelines | Common errors updates | Automatically convert rejected audits into common errors in instruction |
| Shared by all pipelines | Assign rejections | Automatically assigns rejected tasks back to the previous labeler. |
| Consensus Pipeline | Number of consensus attempts | Specify the number of initial labeler responses to collect. A consensus result will be calculated based on these responses. |
| Entity Extraction Pipeline | OCR Provider | Specify which default OCR provider to use. |
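
To make the settings above concrete, here is a minimal sketch that models them as a data structure and works through the redo surcharge. Everything in it is hypothetical: the class and field names are invented for illustration and do not correspond to a real API, the `False` defaults for the toggles are assumptions, and the reading of the surcharge (a redo costs the base task price plus 50%) is one plausible interpretation of the description. The defaults of 0 reviews and 3 consensus attempts come from the pipeline descriptions.

```python
from dataclasses import dataclass
from typing import Optional

REDO_SURCHARGE = 0.5  # the 50% surcharge named for auto redo


@dataclass
class ProjectSettings:
    """Hypothetical model of the advanced settings; not a real API."""
    number_of_reviews: int = 0              # project default per the text
    auto_redo_rejected_tasks: bool = False  # assumed default
    common_errors_updates: bool = False     # assumed default
    assign_rejections: bool = False         # assumed default
    consensus_attempts: int = 3             # consensus pipeline only
    ocr_provider: Optional[str] = None      # entity extraction pipeline only

    def __post_init__(self) -> None:
        if self.number_of_reviews < 0:
            raise ValueError("number_of_reviews must be 0 or greater")
        if self.consensus_attempts < 1:
            raise ValueError("consensus_attempts must be at least 1")


def redo_price(base_price: float, settings: ProjectSettings) -> float:
    """Price of one automatic redo, assuming base price plus the surcharge."""
    if not settings.auto_redo_rejected_tasks:
        return 0.0  # no automatic redo, so no redo charge
    return base_price * (1 + REDO_SURCHARGE)


settings = ProjectSettings(number_of_reviews=1, auto_redo_rejected_tasks=True)
print(redo_price(2.0, settings))  # prints: 3.0
```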

You can adjust these settings at any time from your batches page. Once you have created your first project, click "Settings" on the batches page.
