Cuboids

Given a cuboid entry in params.geometries, Scale will annotate your image or video with perspective cuboids and return the vertices of the cuboids. If camera intrinsics and extrinsics are provided as well, Scale will return scale-invariant 3D coordinates with respect to the camera, i.e. assuming the camera is at the origin. See https://scale.com/blog/3d-cuboids-annotations for a detailed explanation of how we can augment 2D cuboid responses.
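
To illustrate why 3D coordinates relative to the camera can be scale-invariant: under a standard pinhole model, multiplying a camera-frame point by any positive factor leaves its pixel projection unchanged. The sketch below assumes that model (camera at the origin, z along the optical axis) and is not Scale's implementation. `project_point` is a hypothetical helper, the 3D point is made up for illustration, and the intrinsics are the values used in the example request in this section.

```python
def project_point(point, fx, fy, cx, cy, skew=0.0):
    """Project a camera-frame 3D point (x, y, z) to pixel coordinates (u, v).

    Assumes a standard pinhole model with the camera at the origin and
    z along the optical axis. Hypothetical helper, not part of Scale's API.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = (fx * x + skew * y) / z + cx
    v = (fy * y) / z + cy
    return u, v


# Intrinsics taken from the example request payload in this section.
intrinsics = dict(fx=986.778503418, fy=984.4254150391,
                  cx=961.078918457, cy=586.9694824219)

point = (0.5, -0.2, 4.0)                # illustrative point, arbitrary units
scaled = tuple(3.0 * c for c in point)  # same direction, three times farther

pixel = project_point(point, **intrinsics)
pixel_scaled = project_point(scaled, **intrinsics)
# Both calls yield the same pixel (up to floating-point rounding), so a single
# image cannot fix the absolute scale of the reconstructed cuboid.
```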

Request Parameters

Parameter | Type | Default | Description
objects_to_annotate | array | [] | A list of strings or LabelDescription objects.
min_height | integer | 0 | The minimum height, in pixels, of the cuboids you'd like to be made.
min_width | integer | 0 | The minimum width, in pixels, of the cuboids you'd like to be made.
camera_intrinsics | object | null | An object that defines camera intrinsics, in format {fx: number, fy: number, cx: number, cy: number, scalefactor: number, skew: number} (skew defaults to 0, scalefactor defaults to 1). scalefactor corrects the focal lengths and offsets when the image sent has different dimensions from the original photo (if the attachment is half the size of the original, set scalefactor to 2). Use in conjunction with camera_rotation_quaternion and camera_height to get perspective-corrected cuboids and 3D points.
camera_rotation_quaternion | object | null | An object that defines the rotation of the camera in relation to the world, expressed as a quaternion in format {w: number, x: number, y: number, z: number}. Note that the z-axis of the camera frame represents the camera's optical axis. Use in conjunction with camera_intrinsics and camera_height to get perspective-corrected cuboids and 3D points.
camera_height | number | null | The height of the camera above the ground, in meters. Use in conjunction with camera_rotation_quaternion and camera_intrinsics to get perspective-corrected cuboids and 3D points.
{
  ...
  "geometries": {
    "cuboid": {
      "objects_to_annotate": [
        "car"
      ],
      "min_height": 10,
      "min_width": 10,
      "camera_intrinsics": {
        "fx": 986.778503418,
        "fy": 984.4254150391,
        "cx": 961.078918457,
        "cy": 586.9694824219,
        "skew": 0,
        "scale_factor": 1
      },
      "camera_rotation_quaternion": {
        "w": 0.0197866653,
        "x": 0.0181939654,
        "y": 0.6981190587,
        "z": -0.715476937
      },
      "camera_height": -0.2993970777
    }
  },
  ...
}
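
Before sending a camera_rotation_quaternion, it can help to confirm that it describes the rotation you expect. The sketch below converts the quaternion from the example above into a 3x3 rotation matrix using the standard formula for unit quaternions. It assumes the quaternion rotates camera-frame vectors into the world frame; that convention is an assumption, not something this page states, and `quaternion_to_rotation_matrix` is a hypothetical helper rather than part of Scale's API.

```python
import math


def quaternion_to_rotation_matrix(w, x, y, z):
    """Convert a quaternion to a 3x3 rotation matrix (list of rows).

    The quaternion is normalized first, so small rounding errors in the
    input do not produce a non-orthonormal matrix.
    """
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("zero quaternion does not describe a rotation")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# Quaternion copied from the example request above.
q = {"w": 0.0197866653, "x": 0.0181939654, "y": 0.6981190587, "z": -0.715476937}
R = quaternion_to_rotation_matrix(**q)

# Third column of R: the camera's optical (z) axis in world coordinates,
# under the assumed camera-to-world convention.
optical_axis_world = [row[2] for row in R]
```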

Response Fields

Key | Type | Description
uuid | string | A computer-generated unique identifier for this annotation. In video annotation tasks, this can be used to track the same object across frames.
type | string | The geometry type: cuboid.
label | string | The label of this annotation, chosen from the objects_to_annotate array for its geometry. In video annotation tasks, any annotation objects with the same uuid will have the same label across all frames.
attributes | object | See the Annotation Attributes section for more details about the attributes response field.
vertices | array of Vertex objects | A list of Vertex objects defining all visible vertices of the cuboid. See the Vertex section for more details.
edges | array of Edge objects | A list of Edge objects defining the edges of the cuboid. See the Edge section for more details.
points_2d | array of {x, y} coordinate objects | If camera_rotation_quaternion, camera_intrinsics, and camera_height were provided, contains the projected 2D coordinates of all 8 vertices of the cuboid after perspective correction. See the diagram below for the order in which the points are returned.
points_3d | array of {x, y, z} coordinate objects | If camera_rotation_quaternion, camera_intrinsics, and camera_height were provided, contains the 3D coordinates (arbitrarily scaled, relative to the camera location) of all 8 vertices of the cuboid after perspective correction. See the diagram below for the order in which the points are returned.

Points on the cuboid are returned in this order for both points_2d and points_3d:

       3-------2
      /|      /|
     / |     / |
    0-------1  |
    |  7----|--6
    | /     | /
    4-------5
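
The vertex order in the diagram determines which pairs of indices form the 12 edges of the cuboid. The following sketch reads those pairs off the diagram and turns a points_2d array into drawable line segments. The names `CUBOID_EDGES` and `cuboid_segments` are hypothetical and not part of the API response.

```python
# Index pairs for the 12 cuboid edges, read off the diagram above.
NEAR_FACE = [(0, 1), (1, 5), (5, 4), (4, 0)]    # face 0-1-5-4
FAR_FACE = [(3, 2), (2, 6), (6, 7), (7, 3)]     # face 3-2-6-7
CONNECTING = [(0, 3), (1, 2), (5, 6), (4, 7)]   # edges joining the two faces
CUBOID_EDGES = NEAR_FACE + FAR_FACE + CONNECTING


def cuboid_segments(points_2d):
    """Return ((x1, y1), (x2, y2)) segments for the 12 edges of one cuboid.

    `points_2d` is a list of 8 {"x": ..., "y": ...} objects in the order
    shown in the diagram.
    """
    if len(points_2d) != 8:
        raise ValueError("expected exactly 8 points")
    return [
        ((points_2d[a]["x"], points_2d[a]["y"]),
         (points_2d[b]["x"], points_2d[b]["y"]))
        for a, b in CUBOID_EDGES
    ]
```

Each segment can then be passed to whatever drawing routine you use to overlay the cuboid on the image.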

Definition: Vertex

Key | Type | Description
x | number | The distance, in pixels, between the vertex and the left border of the image.
y | number | The distance, in pixels, between the vertex and the top border of the image.
type | string | Always vertex.
description | string | An enum describing the position of the vertex, which is one of: face-topleft, face-bottomleft, face-topright, face-bottomright, side-topcorner, side-bottomcorner.

Definition: Edge

Key | Type | Description
x1 | number | The distance, in pixels, between the first vertex of the edge and the left border of the image.
y1 | number | The distance, in pixels, between the first vertex of the edge and the top border of the image.
x2 | number | The distance, in pixels, between the second vertex of the edge and the left border of the image.
y2 | number | The distance, in pixels, between the second vertex of the edge and the top border of the image.
type | string | Always edge.
description | string | An enum describing the position of the edge, which is one of: face-top, face-bottom, face-left, face-right, side-top, side-bottom.
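
The Vertex and Edge definitions above map naturally onto small typed records. The sketch below is one possible way to parse and validate them on the client side. The class names, the `from_dict` constructors, and the `length` helper are hypothetical and not part of any Scale SDK; only the field names and enum values come from the definitions above.

```python
import math
from dataclasses import dataclass

VERTEX_DESCRIPTIONS = {
    "face-topleft", "face-bottomleft", "face-topright",
    "face-bottomright", "side-topcorner", "side-bottomcorner",
}
EDGE_DESCRIPTIONS = {
    "face-top", "face-bottom", "face-left",
    "face-right", "side-top", "side-bottom",
}


@dataclass(frozen=True)
class Vertex:
    x: float
    y: float
    description: str

    @classmethod
    def from_dict(cls, raw):
        # Reject objects that do not match the documented Vertex shape.
        if raw.get("type") != "vertex":
            raise ValueError("not a vertex object")
        if raw["description"] not in VERTEX_DESCRIPTIONS:
            raise ValueError(f"unknown vertex description: {raw['description']}")
        return cls(x=raw["x"], y=raw["y"], description=raw["description"])


@dataclass(frozen=True)
class Edge:
    x1: float
    y1: float
    x2: float
    y2: float
    description: str

    @classmethod
    def from_dict(cls, raw):
        # Reject objects that do not match the documented Edge shape.
        if raw.get("type") != "edge":
            raise ValueError("not an edge object")
        if raw["description"] not in EDGE_DESCRIPTIONS:
            raise ValueError(f"unknown edge description: {raw['description']}")
        return cls(raw["x1"], raw["y1"], raw["x2"], raw["y2"], raw["description"])

    def length(self):
        """Edge length in pixels."""
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)
```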
{
  ...,
  "response": {
    "annotations": [
      {
        "label": "car",
        "vertices": [
          {
            "description": "face-topleft",
            "y": 270,
            "x": 293,
            "type": "vertex"
          },
          {
            "description": "face-bottomleft",
            "y": 437,
            "x": 293,
            "type": "vertex"
          },
          {
            "description": "face-topright",
            "y": 270,
            "x": 471,
            "type": "vertex"
          },
          {
            "description": "face-bottomright",
            "y": 437,
            "x": 471,
            "type": "vertex"
          },
          {
            "description": "side-topcorner",
            "y": 286,
            "x": 607,
            "type": "vertex"
          },
          {
            "description": "side-bottomcorner",
            "y": 373,
            "x": 607,
            "type": "vertex"
          }
        ],
        "edges": [
          {
            "description": "face-top",
            "x1": 293,
            "y1": 270,
            "x2": 471,
            "y2": 270,
            "type": "edge"
          },
          {
            "description": "face-right",
            "x1": 471,
            "y1": 270,
            "x2": 471,
            "y2": 437,
            "type": "edge"
          },
          {
            "description": "face-bottom",
            "x1": 471,
            "y1": 437,
            "x2": 293,
            "y2": 437,
            "type": "edge"
          },
          {
            "description": "face-left",
            "x1": 293,
            "y1": 437,
            "x2": 293,
            "y2": 270,
            "type": "edge"
          },
          {
            "description": "side-top",
            "x1": 471,
            "y1": 270,
            "x2": 607,
            "y2": 286,
            "type": "edge"
          },
          {
            "description": "side-bottom",
            "x1": 471,
            "y1": 437,
            "x2": 607,
            "y2": 373,
            "type": "edge"
          }
        ],
        "points_2d": [
          {
            "y": 270,
            "x": 293
          },
          {
            "y": 437,
            "x": 293
          },
          {
            "y": 270,
            "x": 471
          },
          {
            "y": 437,
            "x": 471
          },
          {
            "y": 286,
            "x": 607
          },
          {
            "y": 373,
            "x": 607
          },
          {
            "y": 373,
            "x": 607
          },
          {
            "y": 373,
            "x": 607
          }
        ],
        "points_3d": [
          {
            "z": 0,
            "y": 270,
            "x": 293
          },
          {
            "z": 0,
            "y": 437,
            "x": 293
          },
          {
            "z": 0,
            "y": 270,
            "x": 471
          },
          {
            "z": 0,
            "y": 437,
            "x": 471
          },
          {
            "z": 0,
            "y": 286,
            "x": 607
          },
          {
            "z": 0,
            "y": 373,
            "x": 607
          },
          {
            "z": 0,
            "y": 373,
            "x": 607
          },
          {
            "z": 0,
            "y": 373,
            "x": 607
          }
        ]
      }
    ]
  },
  ...
}
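
In the example response, every edge endpoint coincides with one of the listed vertices. A client could use that observation as a sanity check on incoming annotations, as sketched below. This is an inference from the example, not a documented guarantee, and `dangling_edge_endpoints` is a hypothetical helper. The comparison is exact, so it assumes edge and vertex coordinates are reported with identical values.

```python
def dangling_edge_endpoints(annotation):
    """Return (edge description, endpoint) pairs that match no listed vertex."""
    vertex_positions = {(v["x"], v["y"]) for v in annotation["vertices"]}
    dangling = []
    for edge in annotation["edges"]:
        for endpoint in ((edge["x1"], edge["y1"]), (edge["x2"], edge["y2"])):
            if endpoint not in vertex_positions:
                dangling.append((edge["description"], endpoint))
    return dangling


# Two vertices and one edge copied from the example response above.
sample = {
    "vertices": [
        {"description": "face-topleft", "x": 293, "y": 270, "type": "vertex"},
        {"description": "face-topright", "x": 471, "y": 270, "type": "vertex"},
    ],
    "edges": [
        {"description": "face-top", "x1": 293, "y1": 270,
         "x2": 471, "y2": 270, "type": "edge"},
    ],
}

problems = dangling_edge_endpoints(sample)  # no dangling endpoints here
```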