API Reference
Moderations
Classify text content for safety violations
The Moderations API helps detect potentially harmful content in text.
Create Moderation
POST /moderations
Classifies a piece of text to determine whether it violates OpenAI’s usage policies.
Request Body
The input text to classify
The moderation model to use (e.g., “text-moderation-latest”, “text-moderation-stable”)
Response
Returns a moderation object with classification results.
Example
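The following is a minimal sketch of how a request to this endpoint might be built and sent using only the Python standard library. Several details are assumptions not stated on this page: the base URL is a hypothetical placeholder, the JSON field names `input` and `model` follow the OpenAI-style convention, and the API key is assumed to be sent as a Bearer token. The helper names `build_moderation_request` and `create_moderation` are illustrative only.

```python
import json
import urllib.request

# Hypothetical placeholder: this page does not state the API's base URL.
BASE_URL = "https://api.example.com/v1"


def build_moderation_request(api_key, text, model="text-moderation-latest",
                             base_url=BASE_URL):
    """Build (but do not send) a POST /moderations request."""
    # Assumed field names ("input", "model"); not confirmed by this page.
    payload = {"input": text, "model": model}
    return urllib.request.Request(
        url=f"{base_url}/moderations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed Bearer scheme; the page only says keys start with 'ek-'.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def create_moderation(api_key, text, **kwargs):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_moderation_request(api_key, text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload and headers easy to inspect before any network call is made.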
Response Format
The response contains the following fields:
- id: Unique identifier for the moderation request
- model: The model used for moderation
- results: Array of result objects with the following properties:
  - flagged: Whether the content was flagged
  - categories: Object with boolean values for each category
  - category_scores: Object with confidence scores for each category
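As a rough illustration of the response shape described above, the sketch below walks a moderation object and lists which categories were flagged in each result. The sample response is invented for this example (the id and every value are made up, and only two categories are shown), and the helper names `flagged_categories` and `summarize` are hypothetical.

```python
# Illustrative response only: the id and all values are invented for this sketch.
sample_response = {
    "id": "example-id",
    "model": "text-moderation-latest",
    "results": [
        {
            "flagged": True,
            "categories": {"hate": False, "violence": True},
            "category_scores": {"hate": 0.01, "violence": 0.92},
        }
    ],
}


def flagged_categories(result):
    """Return the names of categories whose boolean value is true, sorted."""
    return sorted(name for name, hit in result["categories"].items() if hit)


def summarize(response):
    """Return one (flagged, flagged category names) pair per result object."""
    return [(r["flagged"], flagged_categories(r)) for r in response["results"]]


print(summarize(sample_response))
```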
Categories
The moderation model checks for the following categories:
- hate: Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste
- hate/threatening: Hateful content that also includes violence or serious harm towards the targeted group
- harassment: Content that expresses, incites, or promotes harassing language towards any target
- harassment/threatening: Harassment content that also includes violence or serious harm towards any target
- self-harm: Content that promotes, encourages, or depicts acts of self-harm
- self-harm/intent: Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm
- self-harm/instructions: Content that encourages performing acts of self-harm, or that gives instructions or advice on how to commit such acts
- sexual: Content meant to arouse sexual excitement
- sexual/minors: Sexual content that includes an individual who is under 18 years old
- violence: Content that depicts death, violence, or physical injury
- violence/graphic: Content that depicts death, violence, or physical injury in graphic detail
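Because the category names above use a `parent/subcategory` pattern, a client may want to roll scores up to the top-level category or apply its own cutoff. The sketch below shows one way this could be done. It assumes `category_scores` values are floats between 0 and 1, and the 0.5 threshold is an arbitrary choice for illustration, not a value recommended by the API. All function names are hypothetical.

```python
# Category names as listed on this page.
CATEGORIES = [
    "hate", "hate/threatening",
    "harassment", "harassment/threatening",
    "self-harm", "self-harm/intent", "self-harm/instructions",
    "sexual", "sexual/minors",
    "violence", "violence/graphic",
]


def parent_category(name):
    """Map e.g. 'hate/threatening' to its top-level category 'hate'."""
    return name.split("/", 1)[0]


def max_score_by_parent(category_scores):
    """Keep the highest score seen for each top-level category."""
    grouped = {}
    for name, score in category_scores.items():
        parent = parent_category(name)
        grouped[parent] = max(score, grouped.get(parent, 0.0))
    return grouped


def over_threshold(category_scores, threshold=0.5):
    """Return category names scoring at or above an application-chosen cutoff."""
    return sorted(n for n, s in category_scores.items() if s >= threshold)
```

A stricter application might lower the threshold for specific categories instead of relying only on the `flagged` field.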
Authorizations
Authenticate with your API key (keys start with 'ek-')
Body
application/json
Response
200 - application/json
Success. The response is of type object.