Embeddings

warning

🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.

An embedding is a vector of floating-point numbers that represents a piece of text. The distance between two vectors indicates how similar the texts are: the closer the vectors, the more similar the texts; the farther apart, the less similar.
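To make the distance-means-similarity idea concrete, here is a minimal sketch of cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors are toy values chosen for illustration; real embedding models emit hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 means identical direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (invented values, for illustration only)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Texts about similar topics score higher than unrelated ones
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
```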

note

The Cortex Embeddings feature is fully compatible with the OpenAI embeddings endpoint.

Usage​

CLI​


# Without flags
cortex embeddings "Hello World"
# With options and a model_id flag
cortex embeddings [options] [model_id] "Hello World"

API​

To generate an embedding, send a text string and the embedding model name (e.g., nomic-embed-text-v1.5.f16) to the Embeddings API endpoint. The Cortex-cpp server returns a list of floating-point numbers, which can be stored in a vector database for later use.


curl http://127.0.0.1:3928/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "nomic-embed-text-v1.5.f16",
    "stream": false
  }'
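Because the endpoint follows the OpenAI embeddings format, the response can be parsed the same way. The sketch below parses a hand-written sample response; the vector is truncated to four dimensions for illustration, and the exact fields your server returns may vary.

```python
import json

# Hand-written sample response in the OpenAI embeddings format
# (illustrative values only; a real vector has many more dimensions)
raw = '''{
  "data": [
    {"embedding": [0.0023, -0.0091, 0.0154, -0.0042],
     "index": 0,
     "object": "embedding"}
  ],
  "model": "nomic-embed-text-v1.5.f16",
  "object": "list",
  "usage": {"prompt_tokens": 5, "total_tokens": 5}
}'''

response = json.loads(raw)
# The embedding vector lives at data[0].embedding
vector = response["data"][0]["embedding"]
print(len(vector), response["model"])
```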

Capabilities​

Batch Embeddings​

Cortex's Embeddings feature, powered by the llamacpp engine, offers an OpenAI-compatible endpoint. It supports processing multiple input prompts simultaneously for batch embeddings.
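For batch embeddings, OpenAI-compatible endpoints accept a JSON array (rather than a single string) in the "input" field, so several prompts can be embedded in one request. A minimal sketch of building such a payload, assuming the same request shape as the curl example above:

```python
import json

# Several prompts embedded in a single request: "input" becomes a JSON array
texts = ["First document", "Second document", "Third document"]

payload = json.dumps({
    "input": texts,
    "model": "nomic-embed-text-v1.5.f16",
    "stream": False,
})
print(payload)
```

The resulting string can be sent as the request body in place of the single-string payload shown earlier; the response's "data" list then contains one embedding object per input prompt.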

Pre-configured Models​

We provide a selection of pre-configured models designed to integrate seamlessly with embedding features. These optimized models include:

  • Mistral Instruct 7B Q4
  • Llama 3 8B Q4
  • Aya 23 8B Q4
info

For a complete list of models, please visit the Cortex Hub.

note

Learn more about Embeddings capabilities: