Create a dataset
Create a new dataset with JSON examples. Empty datasets are not allowed.
Payload Requirements
- The dataset name must be unique within the given space.
- Each item in examples[] may contain any user-defined fields.
- Do not include system-managed fields on input: id, created_at, updated_at. Requests that contain these fields in any example will be rejected.
- Each example must contain at least one property (i.e., {} is invalid).
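The per-example rules above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of any Arize SDK: the function name validate_create_payload and the constant SYSTEM_FIELDS are hypothetical, and the check covers only the rules stated above. Name uniqueness within a space can only be verified by the server.

```python
from typing import Any

# System-managed fields listed in the payload requirements.
SYSTEM_FIELDS = {"id", "created_at", "updated_at"}


def validate_create_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means no problem was found.

    Hypothetical helper for illustration only. It mirrors the documented
    rules and does not call any API.
    """
    problems: list[str] = []
    examples = payload.get("examples")

    # Empty datasets are not allowed.
    if not isinstance(examples, list) or not examples:
        problems.append("examples must be a non-empty list")
        return problems

    for i, example in enumerate(examples):
        # Each example must be an object with at least one property.
        if not isinstance(example, dict) or not example:
            problems.append(f"examples[{i}] must be an object with at least one property")
            continue
        # System-managed fields must not appear on input.
        forbidden = sorted(SYSTEM_FIELDS.intersection(example))
        if forbidden:
            problems.append(
                f"examples[{i}] contains system-managed fields: {', '.join(forbidden)}"
            )
    return problems
```

Calling validate_create_payload on a payload whose examples list is empty, or whose examples include {} or an id field, returns one message per violation, so every problem can be reported at once instead of stopping at the first.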
Valid example (create)
{
  "name": "my-dataset",
  "space_id": "spc_123",
  "examples": [
    {
      "question": "What is 2+2?",
      "answer": "4",
      "topic": "math"
    },
    {
      "question": "What is the capital of Spain?",
      "answer": "Madrid",
      "topic": "geography"
    }
  ]
}
Invalid example (‘id’ not allowed on create)
{
  "name": "my-dataset",
  "space_id": "spc_123",
  "examples": [
    {
      "id": "ex_1",
      "input": "Hello"
    }
  ]
}
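Examples copied from an existing dataset often still carry these system-managed fields. One way to prepare them for a create request is to drop those fields first. This is a rough sketch under that assumption; strip_system_fields is a hypothetical helper, not an Arize function.

```python
from typing import Any

# System-managed fields that must not be sent on create.
SYSTEM_FIELDS = ("id", "created_at", "updated_at")


def strip_system_fields(examples: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the examples with system-managed fields removed.

    Hypothetical helper for illustration; the input list is not modified.
    """
    return [
        {key: value for key, value in example.items() if key not in SYSTEM_FIELDS}
        for example in examples
    ]


cleaned = strip_system_fields([{"id": "ex_1", "input": "Hello"}])
```

An example that contained only system-managed fields becomes {} after stripping, which is still invalid on create, so such entries need to be removed or filled in before sending.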
Authorizations
Most Arize AI endpoints require authentication. For those endpoints that require authentication, include your API key in the request header using the format
Body
Body containing dataset creation parameters
Response
A dataset object
A dataset is a structured collection of examples used to test and evaluate LLM applications. Datasets let you test models consistently across real-world scenarios and edge cases, quickly identify regressions, and track measurable improvements.
Unique identifier for the dataset
Name of the dataset
Unique identifier for the space this dataset belongs to
Timestamp for when the dataset was created
Timestamp for the last update of the dataset
List of versions associated with this dataset