Uploading blobs
Blobs must be uploaded to the PDS before a record can be created referencing that blob. Note that the server does not know the intended Lexicon when receiving an upload, so can only apply generic blob limits and restrictions at initial upload time, and then enforce Lexicon-defined limits later when the record is created.
Clients use the com.atproto.repo.uploadBlob endpoint on their PDS, which returns verified metadata in the form of a Lexicon blob object. Clients should set both the HTTP Content-Type and Content-Length headers on the upload request. SDKs can handle this automatically:
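const image = 'data:image/png;base64,...'
// convertDataURIToUint8Array is an app-provided helper that decodes the data URI into raw bytes
const { data } = await agent.uploadBlob(convertDataURIToUint8Array(image), {
  encoding: 'image/png', // sent as the HTTP Content-Type header
})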
The returned data.blob object can then be referenced from a record, e.g. as an embed.
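As an illustration of referencing an uploaded blob as an embed, the sketch below builds a post record using the app.bsky.feed.post and app.bsky.embed.images Lexicons. The BlobObject interface and the buildImagePost helper are hypothetical names introduced for this example; the interface mirrors the JSON form of a Lexicon blob object, and the CID used in the usage lines is a placeholder.

```typescript
// JSON form of a Lexicon blob object, as returned by uploadBlob.
// (Hypothetical structural type for this sketch, not an SDK export.)
interface BlobObject {
  $type: 'blob'
  ref: { $link: string } // CID of the blob
  mimeType: string
  size: number
}

// Hypothetical helper: builds a post record that embeds one uploaded image.
function buildImagePost(
  text: string,
  image: BlobObject,
  alt: string,
  now: Date = new Date(),
) {
  return {
    $type: 'app.bsky.feed.post',
    text,
    createdAt: now.toISOString(),
    embed: {
      $type: 'app.bsky.embed.images',
      images: [{ alt, image }],
    },
  }
}

// Usage with a placeholder blob object (the CID is not a real one).
const examplePost = buildImagePost(
  'Hello with a picture',
  { $type: 'blob', ref: { $link: 'placeholder-cid' }, mimeType: 'image/png', size: 1234 },
  'A test image',
)
```

The blob only becomes publicly accessible once a record like this has actually been created in the repository.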
Chunked transfer encoding may also be permitted for uploads. Servers may sniff the blob mimetype to validate against the declared Content-Type header, and either return a modified mimetype in the response, or reject the upload. See "Security Considerations". If the actual blob upload size differs from the Content-Length header, the server should reject the upload.
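The server-side checks described above could look roughly like the following. This is a minimal sketch, not how any particular PDS implements it: checkUpload, sniffMimeType, and the UploadCheck type are hypothetical names, only three image signatures are recognized, and the sketch chooses to return the sniffed mimetype rather than reject on a mismatch.

```typescript
// Hypothetical result type for this sketch.
type UploadCheck =
  | { ok: true; mimeType: string }
  | { ok: false; reason: string }

// Sniff a few well-known magic numbers; returns undefined when unrecognized.
function sniffMimeType(bytes: Uint8Array): string | undefined {
  const startsWith = (sig: number[]): boolean =>
    bytes.length >= sig.length && sig.every((b, i) => bytes[i] === b)
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png'
  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg'
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'image/gif'
  return undefined
}

function checkUpload(
  bytes: Uint8Array,
  declaredType: string,
  declaredLength?: number, // undefined when chunked transfer encoding is used
): UploadCheck {
  // Reject when the actual size differs from the Content-Length header.
  if (declaredLength !== undefined && declaredLength !== bytes.length) {
    return { ok: false, reason: 'upload size does not match Content-Length' }
  }
  // Policy assumed here: prefer the sniffed type, fall back to the declared one.
  const sniffed = sniffMimeType(bytes)
  return { ok: true, mimeType: sniffed ?? declaredType }
}
```

A stricter server could instead return a failure when the sniffed and declared types disagree; both behaviors are permitted by the text above.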
Garbage collecting
After a successful upload, blobs are placed in temporary storage. They are not accessible for download or distribution while in this state. Servers should "garbage collect" (delete) un-referenced temporary blobs after an appropriate time span (see implementation guidelines). Blobs which are in temporary storage should not be included in the listBlobs output.
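A garbage-collection pass over temporary storage might be sketched as below. The TempBlob shape and the selectExpiredTempBlobs function are hypothetical, and the six-hour default is an assumption standing in for "several hours"; the one-hour minimum follows the lower bound given in the implementation guidelines.

```typescript
// Hypothetical bookkeeping entry for a blob sitting in temporary storage.
interface TempBlob {
  cid: string
  uploadedAt: number // milliseconds since the Unix epoch
}

const ONE_HOUR_MS = 60 * 60 * 1000

// Returns the CIDs of temporary (never-referenced) blobs whose grace period has elapsed.
function selectExpiredTempBlobs(
  temp: TempBlob[],
  now: number,
  graceMs: number = 6 * ONE_HOUR_MS, // assumed default; tune per deployment
): string[] {
  if (graceMs < ONE_HOUR_MS) {
    throw new RangeError('grace period must be at least one hour')
  }
  return temp.filter((b) => now - b.uploadedAt >= graceMs).map((b) => b.cid)
}
```

Blobs that get referenced by a record leave temporary storage, so they never appear in the input to this pass.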
Referencing blobs
The uploaded blob can now be referenced from records by including the returned blob metadata in a record. When processing record creation, the server extracts the set of all referenced blobs, and checks that they are either already referenced, or are in temporary storage. Once the record creation succeeds, the server makes the blob publicly accessible.
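The extraction-and-check step can be sketched as a recursive walk over the record's JSON, collecting every object with $type "blob" and reading its CID from ref.$link. The names collectBlobCids and missingBlobs are hypothetical, and the two sets stand in for whatever storage index a real server keeps.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Collect the CIDs of all blob objects found anywhere inside a record.
function collectBlobCids(value: Json, out: Set<string> = new Set()): Set<string> {
  if (Array.isArray(value)) {
    for (const item of value) collectBlobCids(item, out)
  } else if (value !== null && typeof value === 'object') {
    const ref = value['ref']
    if (
      value['$type'] === 'blob' &&
      ref !== null &&
      typeof ref === 'object' &&
      !Array.isArray(ref) &&
      typeof ref['$link'] === 'string'
    ) {
      out.add(ref['$link'])
    } else {
      for (const child of Object.values(value)) collectBlobCids(child, out)
    }
  }
  return out
}

// Returns CIDs that are neither already referenced nor in temporary storage.
// A non-empty result means the record creation should be rejected.
function missingBlobs(
  record: Json,
  referenced: Set<string>,
  temporary: Set<string>,
): string[] {
  return Array.from(collectBlobCids(record)).filter(
    (cid) => !referenced.has(cid) && !temporary.has(cid),
  )
}
```

After a successful creation, the server would move any blobs found in temporary storage to permanent, publicly accessible storage.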
The same blob can be referenced by multiple records in the same repository. Re-uploading a blob which has already been stored and referenced results in no change to the existing blobs or records.
Creation of new individual records which reference a blob which does not exist should be rejected at the time of creation (or update). However, it is possible for servers to host repository records which reference blobs which are not available locally. This can happen, for example, during a bulk repository import or account migration, after data loss, or after content deletion/removal for policy reasons.
Deleting blobs
When a record referencing blobs is deleted, the server checks if any other current records from the same repository reference the blob. If not, the blob is deleted along with the record.
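The reference check on deletion might be sketched as follows, with an in-memory map from record URI to the blob CIDs that record references. The deleteRecord function and the map layout are hypothetical; a real server would query its own index instead of scanning every record.

```typescript
// Removes a record and returns the CIDs of blobs that no remaining record
// in the same repository references (these should be deleted as well).
function deleteRecord(
  refsByRecord: Map<string, Set<string>>,
  uri: string,
): string[] {
  const removed = refsByRecord.get(uri)
  if (removed === undefined) return []
  refsByRecord.delete(uri)

  // Gather every CID still referenced by the surviving records.
  const stillUsed = new Set<string>()
  refsByRecord.forEach((cids) => cids.forEach((cid) => stillUsed.add(cid)))

  return Array.from(removed).filter((cid) => !stillUsed.has(cid))
}
```

A blob shared by two records therefore survives the deletion of the first and is only removed with the last record that references it.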
When an account is deleted, all the hosted blobs are deleted within some reasonable time frame. When an account is deactivated, taken down, or suspended, blobs should not be publicly accessible.
Servers may decide to make individual blobs inaccessible, separately from any account takedown or other account lifecycle events.
Web accessibility
Original blobs can be fetched from the PDS using the com.atproto.sync.getBlob endpoint. The server should return appropriate Content-Type and Content-Length HTTP headers. It is not a recommended or required pattern to serve media directly from the PDS to end-user browsers, and servers do not need to support or facilitate this use case. See "Security Considerations" for more.
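For reference, com.atproto.sync.getBlob is an XRPC query that takes the account DID and the blob CID as query parameters. The sketch below only builds the request URL; buildGetBlobUrl is a hypothetical helper, and the host, DID, and CID in the usage lines are placeholders.

```typescript
// Hypothetical helper: builds the getBlob URL for a given PDS, account, and blob.
function buildGetBlobUrl(pdsOrigin: string, did: string, cid: string): string {
  const url = new URL('/xrpc/com.atproto.sync.getBlob', pdsOrigin)
  url.searchParams.set('did', did)
  url.searchParams.set('cid', cid)
  return url.toString()
}

// Usage with placeholder values.
const blobUrl = buildGetBlobUrl('https://pds.example.com', 'did:plc:example', 'placeholder-cid')
```

Applications that display media to end users would typically fetch from this URL in a backend or CDN layer rather than link browsers to it directly.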
Servers may have their own generic limits and policies for blobs, separate from any Lexicon-defined constraints. They might implement account-wide quotas on data storage; maximum blob sizes; content policies; etc. Any of these restrictions might be enforced at the initial upload. Server operators should be aware that limits and other restrictions may impact functionality with existing and future applications. To maximize interoperability, operators are recommended to prefer limits on overall account resource consumption (e.g., "total blob size" quota, not "per blob" size limits).
Some applications may have a long delay between blob upload and reference from a record. To maximize interoperability, server implementations and operators are recommended to allow several hours of grace time before "garbage collecting", with one hour as a firm lower bound.
Related resources
- Images and Video
- Blob security
- Video handling
- Blob Specs
- Streamplace provides an implementation and related Lexicon for video streaming on Atproto.