Embeddings

An embedding is a vector that represents a piece of text. The distance between two embedding vectors indicates how similar the texts are: smaller distances mean more similar texts, while larger distances mean less similar texts.

note

The Cortex Embeddings feature is fully compatible with OpenAI's Embeddings API endpoints.

Usage

CLI


# Without a model_id flag
cortex embeddings "Hello World"
# With a model_id flag
cortex embeddings [options] [model_id] "Hello World"
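
For instance, assuming an embedding model such as nomic-embed-text-v1.5.f16 (the model used in the API example below) has already been pulled, the model_id can be passed explicitly:

cortex embeddings nomic-embed-text-v1.5.f16 "Hello World"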

API

To generate an embedding, send a text string and the embedding model name (e.g., 'nomic-embed-text-v1.5.f16') to the Embeddings API endpoint. The Cortex-cpp server will return a list of floating-point numbers, which can be stored in a vector database for later use.


curl http://127.0.0.1:39281/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "nomic-embed-text-v1.5.f16",
    "stream": false
  }'
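
If the request succeeds, the server responds with an OpenAI-style embeddings payload, in line with the compatibility note above. The sketch below shows the general shape only; the field values are illustrative and the embedding vector is truncated:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, ...]
    }
  ],
  "model": "nomic-embed-text-v1.5.f16",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}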

Capabilities

Batch Embeddings

Cortex's Embeddings feature, powered by the llamacpp engine, offers an OpenAI-compatible endpoint that can process multiple input prompts in a single request, producing batch embeddings.
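
For example, following the OpenAI convention for this endpoint, the input field can be an array of strings so that several prompts are embedded in one request. This is a sketch that reuses the host, port, and model name from the single-input example above:

curl http://127.0.0.1:39281/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["First text string", "Second text string"],
    "model": "nomic-embed-text-v1.5.f16",
    "stream": false
  }'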

Pre-configured Models

We provide a selection of pre-configured models designed to integrate seamlessly with embedding features. These optimized models include:

  • Mistral Instruct 7B Q4
  • Llama 3 8B Q4
  • Aya 23 8B Q4

info

For a complete list of models, please visit the Cortex Hub.

note

Learn more about Embeddings capabilities: