text-embedding-3-small. The same shape works for any HTTP embeddings endpoint; swap the client and model name to switch providers.
Code
- Python
- TypeScript
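A minimal Python sketch of such an evaluator, under a few assumptions: the function name `evaluate`, its `output` / `reference` parameters, and the injectable `embed` argument are illustrative and not a required platform signature. The only library call used is `client.embeddings.create` from the `openai` package, which reads `OPENAI_API_KEY` from the environment.

```python
import math
from typing import Callable, List, Sequence

MODEL = "text-embedding-3-small"


def embed_with_openai(texts: Sequence[str]) -> List[List[float]]:
    """Embed texts in one request. Imported lazily so the module loads without the package."""
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    response = client.embeddings.create(model=MODEL, input=list(texts))
    return [item.embedding for item in response.data]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def evaluate(
    output: str,
    reference: str,
    embed: Callable[[Sequence[str]], List[List[float]]] = embed_with_openai,
) -> float:
    """Hypothetical entry point: score `output` against `reference` in [-1.0, 1.0]."""
    out_vec, ref_vec = embed([output, reference])
    return cosine_similarity(out_vec, ref_vec)
```

Passing a different `embed` callable is one way to switch providers, or to test the scoring logic without network access.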
Input mapping
| Parameter | Bind to |
|---|---|
| output | The model output to score, usually output. |
| reference | The ground-truth string, usually reference. |
Output configuration
Continuous score in the range -1.0 to 1.0 (cosine similarity). Optimization direction: maximize.
In practice, OpenAI’s text-embedding-3 models produce non-negative similarities on natural-language pairs, so configuring a 0.0 to 1.0 range with a minimum passing threshold (e.g. 0.7 for “close enough”) is also reasonable.
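If the narrower range is used, the raw cosine value can be clamped before it is reported. The sketch below is illustrative only: the helper names and the 0.7 default are assumptions taken from the example threshold above, not platform settings.

```python
def to_unit_range(similarity: float) -> float:
    """Clamp a cosine similarity into [0.0, 1.0]; negative values become 0.0."""
    return min(1.0, max(0.0, similarity))


def close_enough(similarity: float, threshold: float = 0.7) -> bool:
    """True when the clamped score meets the (assumed) minimum threshold."""
    return to_unit_range(similarity) >= threshold
```

Clamping discards the rare negative similarity rather than rescaling, so scores for ordinary text pairs stay identical to the raw cosine value.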
Runtime requirements
| Setting | Value |
|---|---|
| Sandbox | A hosted backend that matches your language. Python: E2B, Daytona — Python, Vercel Sandbox — Python, or Modal. TypeScript: Daytona — TypeScript or Vercel Sandbox — TypeScript (the local Deno sandbox is started with --no-npm and cannot install the openai package). |
| Dependencies | Python: openai. TypeScript: openai (npm). Add it under Dependencies when creating the sandbox configuration. |
| Internet access | Required — toggle Allow Internet Access on for the configuration. The sandbox must reach api.openai.com. |
| Environment variables | OPENAI_API_KEY — preferably set as a secret reference to a key in Settings → Secrets, not a literal value. |
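A missing key otherwise surfaces only as an authentication error on the first request, so an evaluator may want to check the environment up front. This is a sketch, assuming the secret is exposed to the sandbox as the `OPENAI_API_KEY` environment variable; `require_openai_key` is a hypothetical helper, not part of any SDK.

```python
import os
from typing import Mapping, Optional


def require_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, or raise with a readable message if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; add it to the sandbox configuration "
            "as an environment variable."
        )
    return key
```

Calling `require_openai_key()` at the top of the evaluator fails fast with a configuration message instead of a provider error.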
Related
- Pairwise Evaluator — apply embedding distance to two candidate outputs and pick a winner.
- scikit-learn TF-IDF — a cheaper, offline alternative when embeddings are overkill.

