Timon Harz

December 12, 2024

Sora System Card

Discover how OpenAI's Sora revolutionizes content creation by transforming text into lifelike, dynamic videos. From marketing to education, explore its vast potential across industries.

Introduction

Overview of Sora

Sora is OpenAI's video generation model, designed to turn text, image, and video inputs into a new video output. Users can create videos up to 20 seconds long at resolutions up to 1080p and in a variety of formats, generate fresh content from text, or remix and enhance their own assets. The Featured and Recent feeds showcase community creations, providing inspiration for new ideas. Sora builds upon the advancements from DALL·E and GPT models, offering powerful tools for creative expression and storytelling.

As a diffusion model, Sora starts with a base video resembling static noise and gradually refines it by removing the noise over multiple steps. With the ability to process many frames at once, the model ensures subjects remain consistent even if they temporarily go out of view. Like GPT models, Sora uses a transformer architecture to achieve superior scaling performance.
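
As a rough intuition for this iterative denoising, the toy sketch below starts from pure noise and repeatedly applies a denoising step over a fixed number of iterations. The `denoise_step` callable, tensor shape, and step count are illustrative placeholders, not Sora's actual sampler or architecture.

```python
import numpy as np

def sample_video_latent(denoise_step, shape=(20, 32, 32, 4), num_steps=50, seed=0):
    """Toy reverse-diffusion loop: begin with static noise and progressively
    remove it. `denoise_step(x, t)` stands in for a trained, prompt-conditioned
    denoiser; the shape is (frames, height, width, channels) in a latent space."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)            # base "video" of pure noise
    for t in reversed(range(num_steps)):      # many small denoising steps
        x = denoise_step(x, t)
    return x

# Placeholder denoiser that merely shrinks the sample toward zero; a real model
# would predict and subtract the noise component at each step.
latent = sample_video_latent(lambda x, t: 0.95 * x)
print(latent.shape)  # (20, 32, 32, 4)
```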

Sora also leverages the recaptioning technique from DALL·E 3, which creates detailed captions for the visual training data. This allows the model to faithfully follow text instructions in the generated video.

In addition to generating videos from text, Sora can animate existing still images, adding lifelike movement and attention to detail. It can also extend or complete existing videos by filling in missing frames. Sora serves as a foundational technology for future models that can simulate and understand the real world, a step toward achieving AGI.

While Sora’s capabilities are groundbreaking, they may also introduce risks, such as the potential misuse of likenesses or the creation of misleading or harmful content. To ensure the safe deployment of Sora, we’ve built on the safety lessons from DALL·E's integration into ChatGPT and its API, along with safety measures for other OpenAI products. This system card outlines the mitigation strategies, external red teaming efforts, evaluations, and ongoing research aimed at refining these safeguards.

Model Data

As outlined in our technical report from February 2024, Sora draws inspiration from large language models (LLMs), which acquire generalist capabilities by training on vast, internet-scale datasets. The success of the LLM paradigm is partly due to the use of tokens that unify diverse text modalities, such as code, math, and various natural languages. With Sora, we explored how generative models for visual data could adopt similar benefits. While LLMs rely on text tokens, Sora uses visual patches. Previous research has shown that patches are an effective representation for visual data models, and we found them to be highly scalable and efficient for training generative models on diverse video and image types. To create these patches, we compress videos into a lower-dimensional latent space and then decompose this representation into spacetime patches.

Sora was trained on a range of datasets, including publicly available data, proprietary data accessed through partnerships, and custom datasets developed in-house. These datasets include:

  • Publicly available data, primarily sourced from industry-standard machine learning datasets and web crawls.

  • Proprietary data from partnerships, such as collaborations with Shutterstock and Pond5 for AI-generated images and other dataset creation efforts.

  • Human data, including feedback from AI trainers, red teamers, and employees.
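
To make the spacetime-patch representation described above concrete, the following toy sketch splits a compressed latent video into non-overlapping, flattened spacetime patches. The patch sizes and latent shape are illustrative placeholders, not Sora's actual configuration.

```python
import numpy as np

def to_spacetime_patches(latent, pt=2, ph=4, pw=4):
    """Split a latent video of shape (T, H, W, C) into non-overlapping
    spacetime patches, each flattened into a token-like vector."""
    T, H, W, C = latent.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)        # group the patch dimensions together
    return x.reshape(-1, pt * ph * pw * C)      # (num_patches, patch_dim)

latent = np.random.randn(8, 32, 32, 4)          # stand-in for a compressed video
patches = to_spacetime_patches(latent)
print(patches.shape)                            # (256, 128)
```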

Pretraining Filtering and Data Preprocessing

In addition to post-training mitigations, pretraining filtering provides an extra layer of protection by ensuring harmful or unwanted data is excluded from our datasets. All datasets undergo this filtering process before training, removing explicit, violent, or sensitive content (such as hate symbols). This filtering builds on the methods used to refine the datasets for other models, including DALL·E 2 and DALL·E 3.


Risk Identification and Deployment Preparation

We conducted a thorough process to understand both the potential for misuse and real-world creative applications, which informed Sora’s design and safety measures. After Sora’s announcement in February 2024, we collaborated with hundreds of visual artists, designers, and filmmakers from over 60 countries to gather feedback on how the model could be improved to best serve creative professionals. Additionally, we developed internal evaluations and worked with external red-teamers to identify risks and refine our safety and mitigation strategies.

The safety framework for Sora builds on insights gained from other models like DALL·E and ChatGPT, as well as custom safeguards tailored to the video product. Given the power of the tool, we are taking an iterative approach to safety, especially in areas where context is crucial or where we anticipate new risks associated with video content. Examples of this approach include restricting access to users 18 and older, limiting likeness and face uploads, and, at launch, applying more stringent moderation thresholds to prompts and uploads involving minors. We aim to continuously learn from user interactions with Sora to balance safety while maximizing creative potential.

External Red Teaming

OpenAI partnered with external red teamers from nine countries to rigorously test Sora, identify weaknesses in its safety measures, and assess the risks associated with its new features. These red teamers had access to various iterations of the Sora product, beginning in September and continuing through December 2024. Over 15,000 generations were tested during this process, building on early 2024 testing with a version of Sora lacking production-level mitigations.

The red teamers focused on discovering novel risks associated with Sora’s capabilities and tested safety measures as they evolved. Their evaluations covered a broad range of problematic content—such as sexual, violent, and illegal material, as well as mis/disinformation—and explored adversarial tactics used to bypass safety systems. They also tested how Sora’s features could be exploited to weaken moderation safeguards. Furthermore, red teamers provided valuable feedback on biases and overall performance.

We tested Sora's text-to-video generation using both standard and adversarial prompts, examining a wide range of content types. The media upload function was tested with a variety of images and videos, including those of public figures, to assess the model’s ability to generate violative content. We also explored the use of modification tools—such as storyboards, recuts, remixes, and blends—to evaluate their potential for generating prohibited content.

Red Team Observations

Red teamers identified significant observations regarding both specific types of prohibited content and general adversarial tactics. For instance, they found that text prompts involving medical situations or science fiction/fantasy settings weakened safeguards against generating erotic and sexual content until further mitigations were implemented. Through adversarial tactics, such as using suggestive prompts or metaphors, they were able to bypass elements of the safety stack. Over numerous attempts, they identified prompt patterns and word choices that triggered safeguards and tested alternative phrasing to circumvent refusals. Red teamers then used the most concerning outputs as seed media for developing violative content that could not be created through single prompts. Jailbreak techniques occasionally proved effective in degrading safety policies, which helped us refine these protections.

Red teamers also tested media uploads and Sora’s tools (storyboards, recut, remix, and blend) using both publicly available and AI-generated media. This testing uncovered gaps in input and output filtering, prompting us to strengthen protections for media uploads, particularly for images of people. The testing also revealed the need for more robust classifier filtering to prevent non-violative media from being modified into prohibited content such as erotic material, violence, or deepfakes.

The feedback and data gathered by red teamers led to the development of additional safety mitigations and improvements to existing evaluations, as outlined in the Specific Risk Areas and Mitigations sections. This process enabled further tuning of our prompt filtering, blocklists, and classifier thresholds to ensure the model adhered to safety standards.

Learnings from Early Artist Access

Over the past nine months, we gathered feedback from more than 300 users in over 60 countries, spanning more than 500,000 model requests. This data played a crucial role in refining model behavior and ensuring adherence to safety protocols. For example, feedback from artists highlighted how visible watermarks constrained their workflows, which influenced our decision to let paying users download videos without the visible watermark while still embedding C2PA data.

This early access program also emphasized that for Sora to serve as an expanded tool for storytelling and creative expression, it needed to provide more flexibility in sensitive areas compared to a general-purpose tool like ChatGPT. We expect that Sora will be used by artists, independent filmmakers, studios, and entertainment industry organizations as an integral part of their creative processes. By identifying both positive use cases and potential risks, we were able to pinpoint areas where more restrictive product-level mitigations were necessary to prevent misuse and harm.

Evaluations

We created internal evaluations focusing on key areas, such as nudity, deceptive election content, self-harm, and violence. These evaluations helped refine mitigations and informed our moderation thresholds. The evaluation framework combines input prompts given to the video generation model with classifiers applied to both transformed prompts and the final generated videos.

The input prompts for these evaluations were sourced from three primary channels: data collected during the early alpha phase (as detailed in Section 3.2), adversarial examples from red-team testers (as referenced in Section 3.1), and synthetic data generated using GPT-4. The alpha phase data provided insights into real-world usage scenarios, red teamers contributed examples of adversarial and edge-case content, and synthetic data allowed us to expand evaluation sets in areas like unintended racy content, where naturally occurring examples are limited.
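
As a rough illustration of how such an evaluation can be scored end to end, the sketch below runs a set of known-violating prompts through hypothetical input and output moderation stages and computes the same N, I, O, and accuracy quantities defined later under "Evaluation Explanation". Every callable is a placeholder for systems this document only describes at a high level.

```python
def evaluate(prompts, moderate_prompt, generate_video, moderate_video):
    """Score a set of known-violating prompts against a two-stage moderation stack."""
    n = len(prompts)
    passed_input = passed_output = 0
    for prompt in prompts:
        if moderate_prompt(prompt):            # caught at the input stage
            continue
        passed_input += 1
        video = generate_video(prompt)
        if moderate_video(video):              # caught at the output stage
            continue
        passed_output += 1
    return {
        "N": n,                                # total violating samples
        "I": passed_input,                     # samples that passed input checks
        "O": passed_output,                    # samples that passed output checks
        "accuracy_at_input": (n - passed_input) / n,
        "accuracy_at_output": (n - passed_output) / n,
    }
```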

Preparedness

Our preparedness framework evaluates whether frontier model capabilities introduce significant risks in four key categories: persuasion, cybersecurity, CBRN (chemical, biological, radiological, and nuclear), and model autonomy. We have no evidence that Sora poses significant risks related to cybersecurity, CBRN, or model autonomy. These risks are typically associated with models that interact with computer systems, scientific knowledge, or autonomous decision-making, which are beyond Sora’s scope as a video-generation tool.

However, Sora’s video generation capabilities could present potential risks in persuasion, such as impersonation, misinformation, or social engineering. To mitigate these risks, we have developed a range of safeguards, including measures to prevent the generation of likenesses of well-known public figures. Additionally, recognizing that context—whether a video is real or AI-generated—can be critical in determining its persuasive power, we have focused on building a comprehensive provenance approach. This includes metadata, watermarks, and fingerprinting to ensure the authenticity and traceability of generated content.

Sora Mitigation Stack

In addition to the specific risks and mitigations outlined below, several design and policy choices in Sora’s training and product development help broadly reduce the risk of harmful or unwanted outputs. These can be categorized into system and model-level technical mitigations, as well as product policies and user education.

System and Model Mitigations

Here are the primary safety mitigations we have in place before a user receives their requested output:

Text and Image Moderation via Multi-Modal Moderation Classifier

Our multi-modal moderation classifier, which powers the external Moderation API, is applied to identify text, image, or video prompts that may violate our usage policies. This system checks both input and output content for potential violations, and any violative prompt detected results in a refusal. Learn more in the public documentation for our multi-modal Moderation API.
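
For reference, this is roughly how the publicly available multi-modal Moderation API can be called with the OpenAI Python SDK; the example inputs are placeholders, and Sora's internal integration is not described in detail here.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Screen a text prompt and an uploaded image together with the multi-modal moderation model.
response = client.moderations.create(
    model="omni-moderation-latest",
    input=[
        {"type": "text", "text": "Example video prompt to screen"},
        {"type": "image_url", "image_url": {"url": "https://example.com/upload.png"}},
    ],
)

result = response.results[0]
if result.flagged:
    print("Prompt refused by moderation")
```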

Custom LLM Filtering

A key advantage of video generation technology is the ability to conduct asynchronous moderation checks without affecting the user experience. Since video generation takes a few seconds to process, this time window is used to run precise moderation checks. We have customized our own GPT to enhance moderation for specific topics, such as detecting third-party content and identifying deceptive material.

These multimodal filters apply to both image/video uploads and text prompts and outputs, enabling us to detect violating combinations across media types.
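
A minimal sketch of that asynchronous pattern is shown below, assuming hypothetical `generate_video` and `run_llm_filters` coroutines; neither is a real API named in this document.

```python
import asyncio

async def generate_with_checks(prompt, generate_video, run_llm_filters):
    """Run custom LLM moderation checks concurrently with rendering, so the
    checks add no extra latency on top of the seconds generation already takes."""
    video, verdict = await asyncio.gather(
        generate_video(prompt),      # placeholder: the video generation call
        run_llm_filters(prompt),     # placeholder: topic-specific GPT-based checks
    )
    return video if verdict.get("allowed", False) else None
```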

Image Output Classifiers

To address harmful content directly in outputs, Sora uses output classifiers, including filters specialized for NSFW content, minors, violence, and misuse of likenesses. If any of these classifiers are triggered, Sora may block videos before they are shown to the user.

Blocklists

We maintain extensive textual blocklists across various categories, drawing from our previous work with DALL·E 2 and DALL·E 3, proactive risk assessments, and feedback from early users.

Product Policies

Beyond the model and system safeguards designed to prevent the generation of violative content, we also have product-level policies in place to further mitigate the risk of misuse. Currently, we only offer Sora to users who are 18 or older and apply moderation filters to the content displayed in the Explore and Featured feeds.

We also clearly communicate our policy guidelines through in-product messages and publicly available educational materials on:

  • The unauthorized use of another person’s likeness and the prohibition of depicting real minors;

  • The creation of illegal content or content that violates intellectual property rights;

  • The generation of explicit and harmful content, including non-consensual intimate imagery, content used to bully, harass, or defame, and content promoting violence, hatred, or suffering; and

  • The creation and distribution of content intended to defraud, scam, or mislead others.

Some forms of misuse are addressed through our model and system mitigations, while others depend more on context. For example, a scene of a protest may be used for legitimate creative purposes, but the same scene, presented as a real current event, could be used as disinformation if paired with misleading claims.

Sora is designed to empower users to express a wide range of creative ideas and viewpoints. While it's not practical or advisable to prevent every form of contextually problematic content, we strive to balance freedom of expression with safety and responsibility.

Reporting and Enforcement

We provide users with the ability to report Sora videos they believe may violate our guidelines. To ensure effective monitoring, we combine automation with human review to track usage patterns. Enforcement mechanisms are in place to remove violative videos and penalize users accordingly. When violations occur, we notify the users and offer them the opportunity to provide feedback on the situation. We aim to track the effectiveness of these mitigations and refine them over time.

Specific Risk Areas and Mitigations

In addition to general safety measures, early testing and evaluation highlighted several key areas requiring focused attention.

Child Safety

OpenAI is fully committed to child safety and prioritizes the prevention, detection, and reporting of Child Sexual Abuse Material (CSAM) across all products, including Sora. Our efforts include responsibly sourcing datasets to protect against CSAM, collaborating with the National Center for Missing & Exploited Children (NCMEC) to prevent child sexual abuse, conducting red-teaming according to Thorn’s recommendations, and implementing rigorous CSAM scanning for both first-party and third-party users (API and Enterprise). Our safety stack includes mitigations developed for other OpenAI products such as ChatGPT and DALL·E, as well as additional measures specifically designed for Sora.

Input Classifiers

To ensure child safety, we apply multiple layers of input mitigations for text, image, and video content (a simplified sketch of this layering follows the list):

  • For all image and video uploads, we integrate with Safer, developed by Thorn, to detect matches with known CSAM. Confirmed matches are rejected and reported to NCMEC. We also use Thorn’s CSAM classifier to detect potentially new, unhashed CSAM content.

  • A multi-modal moderation classifier is employed to detect and moderate sexual content involving minors across text, images, and video inputs.

  • For Sora, we’ve created a specific classifier that analyzes both text and images to determine whether a person under 18 is depicted or if the accompanying caption references a minor. Any image-to-video requests featuring minors are rejected. For text-to-video generation, stricter moderation thresholds are applied for content related to sexual, violent, or self-harm topics when minors are involved.
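
Below is a simplified sketch of how these layers could be sequenced. Every callable in `checks` is a hypothetical stand-in for the systems described above (Safer hash matching, Thorn's CSAM classifier, the multi-modal moderation classifier, and the under-18 classifier), and the return values are illustrative.

```python
def check_child_safety_inputs(text, image, checks):
    """Order-of-operations sketch for the layered input mitigations.
    `checks` maps names to hypothetical callables: 'safer_hash', 'csam',
    'moderation', and 'under18'."""
    if image is not None:
        if checks["safer_hash"](image):       # match against known CSAM: reject and report
            return "reject_and_report"
        if checks["csam"](image):             # potentially new, unhashed CSAM
            return "reject"
        if checks["under18"](text, image):    # image-to-video requests featuring minors
            return "reject"
    if checks["moderation"](text, image).get("sexual/minors"):
        return "reject"
    if checks["under18"](text, None):         # text references a minor
        return "strict_thresholds"            # tighten sexual/violent/self-harm thresholds
    return "allow"
```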

Evaluation of Under-18 Classifier

We evaluate the performance of our under-18 classifier by testing it on a dataset of nearly 5,000 images across various categories, including child/adult and realistic/fictitious depictions. Our policy is to reject realistic images of children, while allowing fictitious representations such as animated or cartoon-style images, provided they are non-sexual. We take a cautious approach to content involving minors and will continue to refine this approach based on ongoing product use, ensuring a balance between creative expression and safety.

While our classifiers are highly accurate, occasional false positives may occur, flagging adult or non-realistic images of children. We also recognize that age-prediction models can exhibit racial biases, underestimating the ages of individuals from certain racial groups. We are committed to improving the classifier's accuracy, reducing false positives, and addressing potential biases over the coming months.

Note on Precision and Recall

  • Precision is calculated as the percentage of "is_child" classifications that are accurate representations of realistic children.

  • Recall is calculated as the percentage of realistic child images that are accurately identified as "is_child" by the classifier.
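
These are the standard precision and recall formulas; the quick sketch below computes them from confusion-matrix counts. The numbers shown are made-up placeholders, not results from the actual under-18 evaluation.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts: realistic-child images correctly flagged, non-child images
# incorrectly flagged, and realistic-child images missed by the classifier.
p, r = precision_recall(true_positives=940, false_positives=30, false_negatives=60)
print(f"precision={p:.1%}, recall={r:.1%}")
```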

Output Classifiers

When our under-18 classifier detects references to minors in the text input, we enforce stricter moderation thresholds for sexual, violent, or self-harm content in the output. We use the following two output classifiers to help ensure child safety:

  • Multi-modal Moderation Classifier: This classifier scans video outputs for unsafe content and rejects requests that may be especially sensitive.

  • DALL·E Image Classifier: We leverage the same classifier used in DALL·E to scan for child safety violations in image outputs.

Our output classifiers analyze two frames per second and block any video determined to be unsafe.
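
As an illustration of frame-rate-based sampling, the sketch below reads a video with OpenCV, classifies roughly two frames per second, and blocks the video if any sampled frame is flagged. The `classify_frame` callable is a hypothetical placeholder for the output classifiers.

```python
import cv2

def screen_video(video_path, classify_frame, samples_per_second=2):
    """Sample ~2 frames per second and block the video if any frame is unsafe."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30             # fall back if FPS is unknown
    step = max(int(round(fps / samples_per_second)), 1)
    index, blocked = 0, False
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0 and classify_frame(frame):
            blocked = True                            # one unsafe frame blocks the whole video
            break
        index += 1
    cap.release()
    return blocked
```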

In addition to these automated classifiers, we incorporate human review as an extra layer of protection against potential child safety violations.

Product Policy

Our policies strictly prohibit the use of Sora for generating sexual content involving minors. Violations of child safety policies can result in the removal of content and the banning of users.

Nudity & Suggestive Content

A key concern with AI video generation is the creation of NSFW (Not Safe for Work) or NCII (Non-Consensual Intimate Imagery) content. Similar to DALL·E's approach, Sora uses a multi-layered moderation strategy to block explicit content. This includes prompt transformations, image output classifiers, and blocklists that restrict suggestive content and keep outputs age-appropriate.

Our classifiers apply stricter thresholds for image uploads than for text-based prompts. Additionally, videos in the Explore section are filtered with heightened thresholds to ensure they are suitable for a broad audience.
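
Conceptually, the stricter thresholds work like the sketch below: the same classifier score is compared against a different cutoff depending on where the content comes from or will be shown. The scores and cutoffs are hypothetical; the real values are not published.

```python
# Hypothetical cutoffs: lower means stricter (a lower score already blocks).
THRESHOLDS = {
    "text_prompt": 0.80,
    "image_upload": 0.50,    # stricter for uploads containing people
    "explore_feed": 0.30,    # strictest for content surfaced to a broad audience
}

def is_blocked(suggestive_score: float, context: str) -> bool:
    return suggestive_score >= THRESHOLDS[context]

print(is_blocked(0.6, "text_prompt"))    # False
print(is_blocked(0.6, "image_upload"))   # True
```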

Below are the results of our evaluations for nudity and suggestive content, assessing the effectiveness of our multi-layered mitigation approach across inputs and outputs. Based on these findings, we have refined our thresholds and applied stricter moderation for images with people.

Evaluation Explanation:

  • N = Total number of violating samples (~200 per category)

  • I = Total number of violating samples that pass input moderation checks

  • O = Total number of violating samples that pass output moderation checks

  • Accuracy at Input = (N - I) / N

  • Accuracy at Output (E2E) = (N - O) / N

Product Policy

Sora strictly prohibits the generation of explicit sexual content, including non-consensual intimate imagery. Violating these policies can lead to content removal and user penalties.

Deceptive Content: Likeness Misuse and Harmful Deepfakes

Sora’s moderation system for likeness-based prompts is designed to flag potentially harmful deepfakes. Videos involving recognizable individuals undergo close review. The Likeness Misuse filter identifies prompts that modify or depict individuals in potentially harmful or misleading ways. Additionally, Sora’s general prompt transformations help prevent the generation of unwanted likenesses of private individuals based on prompts containing their names.

Deceptive Content

Sora’s input and output classifiers are designed to prevent the generation of deceptive content related to elections, particularly content that depicts fraudulent, unethical, or illegal activities. Our evaluation metrics include classifiers that flag style or filtering techniques that could lead to misleading videos, thereby reducing the potential for real-world misuse in the context of elections.

Below are the evaluation results for our Deceptive Election Content filter, which helps identify instances where there is an intent to create prohibited content across various inputs (e.g., text and video). Our system also scans one frame per second of output video to detect potential violations.

Evaluation details:

  • N = ~500 (based on synthetic data prompts)

Investments in Provenance

Since many risks associated with Sora, such as harmful deepfake content, are context-dependent, we have prioritized the development of our provenance tools. While there is no single solution for provenance, we are committed to improving the provenance ecosystem to provide better context and transparency for content created with Sora.

Our provenance safety tools include the following for general availability:

  • C2PA Metadata: All assets will include verifiable origin metadata, following industry standards.

  • Visible Watermarks: Animated Sora watermarks will be included by default to make it clear to viewers that the content is AI-generated.

  • Internal Reverse Video Search Tool: This tool helps the OpenAI Intelligence & Investigation team confidently assess whether content was created using Sora.
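
One building block such a tool could use is perceptual fingerprinting of sampled frames; the toy sketch below hashes frames with the `imagehash` library and compares them against an index of hashes from previously generated videos. This is not OpenAI's actual tool, and `known_hashes`, the sampling interval, and the distance cutoff are hypothetical.

```python
import cv2
import imagehash
from PIL import Image

def frame_fingerprints(video_path, every_n_frames=30):
    """Perceptually hash every Nth frame of a video."""
    cap = cv2.VideoCapture(video_path)
    hashes, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_n_frames == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            hashes.append(imagehash.phash(Image.fromarray(rgb)))
        index += 1
    cap.release()
    return hashes

def likely_match(query_hashes, known_hashes, max_distance=8):
    # Hamming distance between perceptual hashes; small distances suggest a match.
    return any(q - k <= max_distance for q in query_hashes for k in known_hashes)
```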

Product Policy

Sora’s policies prohibit its use to defraud, scam, or mislead others, including the creation and distribution of disinformation. The use of another person’s likeness without permission is also strictly prohibited. Violations of these policies can lead to content removal and user penalties.

Artist Styles

When a user incorporates the name of a living artist into a prompt, the model may generate video content that resembles the artist’s style. While there is a longstanding tradition of creative works being inspired by other artists, we understand that some creators may have concerns about this. In response, we have taken a conservative approach in the current version of Sora, as we continue to learn how the tool is used by the creative community.

To address these concerns, we have implemented prompt re-writes that trigger when a user attempts to generate content in the style of a living artist.

As with other products, the Sora Editor uses a language model to rewrite submitted text, facilitating better prompts and ensuring compliance with our guidelines. This includes removing the names of public figures, grounding descriptions of people with specific attributes, and generically describing branded objects. Additionally, we maintain a range of blocklists, informed by previous work on DALL·E 2 and DALL·E 3, proactive risk discovery, and feedback from red-teamers and early users.
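
A hedged sketch of that kind of LLM-based rewrite, using the OpenAI chat completions API, is shown below; the system instructions and model choice are illustrative and not Sora's actual editor pipeline.

```python
from openai import OpenAI

client = OpenAI()

REWRITE_INSTRUCTIONS = (
    "Rewrite the user's video prompt. Remove names of real public figures and "
    "living artists, replace them with generic descriptions of appearance and "
    "style, and describe branded objects generically."
)

def rewrite_prompt(user_prompt: str) -> str:
    """Return a policy-compliant rewrite of the submitted prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": REWRITE_INSTRUCTIONS},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```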

Future Work

OpenAI follows an iterative deployment strategy to ensure the responsible and effective roll-out of its products. This approach combines phased rollouts, ongoing testing, continuous monitoring, and user feedback to refine and improve performance and safety measures over time. Below are the planned steps for Sora’s iterative deployment:

Likeness Pilot

The ability to generate videos using uploaded photos or videos of real people as the “seed” presents potential misuse risks. We are taking a cautious, incremental approach to understand early patterns of use. Feedback from artists indicates that this is a valuable creative tool, but given its potential for abuse, we will not make it available to all users initially. Instead, it will be limited to a subset of users with active, in-depth monitoring to assess its value to the Sora community and adjust safety measures accordingly. During this pilot, uploads containing images of minors will not be permitted.

Provenance and Transparency Initiatives

Future iterations of Sora will enhance traceability through ongoing research into reverse embedding search tools and the continued implementation of transparency measures like C2PA. We are exploring potential partnerships with NGOs and research organizations to improve the provenance ecosystem and to test our internal reverse video search tool for Sora.

Expanding Representation in Outputs

We are committed to reducing output biases through prompt refinements, feedback loops, and identifying effective mitigations—while avoiding overcorrections that can be equally harmful. We recognize challenges like body image bias and demographic representation and will continue refining our approach to ensure balanced and inclusive outputs.

Continued Safety, Policy, and Ethical Alignment

OpenAI will continue to evaluate Sora to improve its adherence to OpenAI’s policies and safety standards. Planned improvements will focus on areas such as likeness safety and deceptive content, guided by evolving best practices and user feedback.

Acknowledgements

We extend our thanks to all of OpenAI's internal teams, including Comms, Comms Design, Global Affairs, Integrity, Intel & Investigations, Legal, Product Policy, Safety Systems, and User Ops, whose support was crucial in developing and implementing Sora’s safety mitigations and contributing to this system card.

We are also grateful to our group of Alpha artists and expert red teamers who provided feedback, tested our models in early development stages, and contributed to our risk assessments and evaluations. Participation in the testing process does not imply endorsement of OpenAI's deployment plans or policies.

Red Teaming Individuals (alphabetical)

Alexandra García Pérez, Arjun Singh Puri, Caroline Friedman Levy, Dani Madrid-Morales, Emily Lynell Edwards, Grant Brailsford, Herman Wasserman, Javier García Arredondo, Kate Turetsky, Kelly Bare, Matt Groh, Maximilian Müller, Naomi Hart, Nathan Heath, Patrick Caughey, Per Wikman Svahn, Rafael González-Vázquez, Sara Kingsley, Shelby Grossman, Vincent Nestler

Red Teaming Organizations

ScaleAI

