Custom OCR lets customers attach a link to their own OCR response, produced by Microsoft Computer Vision (MCV) or Amazon Textract, when creating a task. This allows customers to use their own OCR engine instead of the default one provided by Scale.
API OCR Attachment
Customers can create tasks via the API and include two metadata fields: metadata.customer_ocr_url, which points to the hosted URL of their OCR response, and metadata.customer_ocr_type, which must be either microsoft_computer_vision or amazon_textract.
Example Textract:
File: https://static.scale.com/uploads/studio-ocr-textract-test
Task payload (note that the attachment in this example does not correspond to the OCR file):
{
  "callback_url": "[email protected]",
  "attachments": [
    "scaledata://625a3de80cd22c001d036d14/531e861b-c322-427e-a0e6-d7d93b448038"
  ],
  "project": "Doc Project 2",
  "instruction": "please label the things",
  "metadata": {
    "customer_ocr_url": "https://static.scale.com/uploads/studio-ocr-textract-test",
    "customer_ocr_type": "amazon_textract"
  }
}
Example Microsoft Computer Vision (MCV):
File: https://static.scale.com/uploads/studio-ocr-mcv-test
Task payload (note that the attachment in this example does not correspond to the OCR file):
{
  "callback_url": "[email protected]",
  "attachments": [
    "scaledata://625a3de80cd22c001d036d14/531e861b-c322-427e-a0e6-d7d93b448038"
  ],
  "project": "Doc Project 2",
  "instruction": "please label the things",
  "metadata": {
    "customer_ocr_url": "https://static.scale.com/uploads/studio-ocr-mcv-test",
    "customer_ocr_type": "microsoft_computer_vision"
  }
}
Custom OCR gives customers more flexibility and control over the OCR process by letting them use their own OCR engine on Scale's platform. To use it, include customer_ocr_url and customer_ocr_type in the metadata object of the task payload when creating tasks via the API.
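As a rough illustration, the payloads shown in the examples could be assembled with a small helper like the one below. This is only a sketch: the function name build_custom_ocr_payload, the SUPPORTED_OCR_TYPES constant, and the http/https check on the OCR URL are assumptions made for this example and are not part of Scale's API. Sending the payload is left out because this article does not state the task creation endpoint.

```python
import json

# The two OCR types named in this article.
SUPPORTED_OCR_TYPES = {"microsoft_computer_vision", "amazon_textract"}


def build_custom_ocr_payload(attachment, project, instruction,
                             ocr_url, ocr_type, callback_url=None):
    """Assemble a task payload with custom OCR metadata (hypothetical helper)."""
    if ocr_type not in SUPPORTED_OCR_TYPES:
        raise ValueError(
            f"customer_ocr_type must be one of {sorted(SUPPORTED_OCR_TYPES)}, "
            f"got {ocr_type!r}"
        )
    # Assumption: a "hosted URL" means an http(s) URL.
    if not ocr_url.startswith(("http://", "https://")):
        raise ValueError("customer_ocr_url must be an http(s) URL")

    payload = {
        "attachments": [attachment],
        "project": project,
        "instruction": instruction,
        "metadata": {
            "customer_ocr_url": ocr_url,
            "customer_ocr_type": ocr_type,
        },
    }
    # callback_url is only included when the caller supplies one.
    if callback_url is not None:
        payload["callback_url"] = callback_url
    return payload


if __name__ == "__main__":
    payload = build_custom_ocr_payload(
        attachment="scaledata://625a3de80cd22c001d036d14/531e861b-c322-427e-a0e6-d7d93b448038",
        project="Doc Project 2",
        instruction="please label the things",
        ocr_url="https://static.scale.com/uploads/studio-ocr-textract-test",
        ocr_type="amazon_textract",
    )
    # Print the JSON body that would be sent when creating the task.
    print(json.dumps(payload, indent=2))
```

Validating customer_ocr_type locally is a design choice of this sketch: it surfaces a misspelled type before any request is made, rather than relying on how the platform handles an unknown value.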