Embeddings
An embedding is a vector representation of a piece of text. The distance between two vectors indicates their similarity: the closer the vectors, the more similar the texts they represent.
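To illustrate, similarity between two embeddings is commonly measured with cosine similarity. Below is a minimal sketch in plain Python; the vectors are toy values for demonstration, not real model outputs:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|).
    # 1.0 means the vectors point the same way (very similar texts),
    # 0.0 means they are orthogonal (unrelated texts).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors: v1 and v2 point the same way, v3 is orthogonal to v1.
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]
v3 = [3.0, 0.0, -1.0]
```

Real embeddings have hundreds of dimensions, but the comparison works the same way.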
The Cortex Embeddings feature is fully compatible with OpenAI's Embeddings API endpoints.
Usage
CLI
```sh
# Without Flag
cortex embeddings "Hello World"

# With model_id Flag
cortex embeddings [options] [model_id] "Hello World"
```
API
To generate an embedding, send a text string and the embedding model name (e.g., `nomic-embed-text-v1.5.f16`) to the Embeddings API endpoint. The Cortex-cpp server returns a list of floating-point numbers, which can be stored in a vector database for later use.
Request Example:

```sh
curl http://127.0.0.1:39281/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "nomic-embed-text-v1.5.f16",
    "stream": false
  }'
```
Endpoint Response:

```json
{
  "data": [
    {
      "embedding": [
        0.065036498010158539,
        0.036638252437114716,
        -0.15189965069293976,
        ... (omitted for spacing)
        -0.021707100793719292,
        -0.010746118612587452,
        0.0078709172084927559
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "_",
  "object": "list",
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
```
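The same request can be issued from Python. This is a minimal sketch using only the standard library; it assumes a Cortex server is running locally on port 39281, as in the curl example, and the `build_payload`/`get_embedding` helpers are illustrative names, not part of any SDK:

```python
import json
import urllib.request

ENDPOINT = "http://127.0.0.1:39281/v1/embeddings"

def build_payload(text, model="nomic-embed-text-v1.5.f16"):
    # Mirrors the curl example: input text, model name, no streaming.
    return {"input": text, "model": model, "stream": False}

def get_embedding(text):
    # POST the JSON payload and return the first embedding vector
    # from the OpenAI-style response ("data" -> [{"embedding": [...]}]).
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["data"][0]["embedding"]

if __name__ == "__main__":
    vector = get_embedding("Your text string goes here")
    print(len(vector))  # dimensionality of the embedding
```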
Capabilities
Batch Embeddings
Cortex's Embeddings feature, powered by the llamacpp engine, offers an OpenAI-compatible endpoint. It can process multiple input prompts in a single request for batch embeddings.
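Because the endpoint is OpenAI-compatible, batching is done by passing a list of strings as `input` instead of a single string; each item then comes back as its own entry in `data`, tagged with an `index`. A sketch of building such a batch payload (the `build_batch_payload` helper and default model name are illustrative):

```python
import json

def build_batch_payload(texts, model="nomic-embed-text-v1.5.f16"):
    # OpenAI-style batch request: "input" is a list of strings,
    # so all prompts are embedded in one round trip.
    return {"input": list(texts), "model": model, "stream": False}

payload = build_batch_payload(["Hello World", "Goodbye World"])
print(json.dumps(payload, indent=2))
```

The payload is posted to `/v1/embeddings` exactly like a single-string request.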
Pre-configured Models
We provide a selection of pre-configured models designed to integrate seamlessly with embedding features. These optimized models include:
- Mistral Instruct 7B Q4
- Llama 3 8B Q4
- Aya 23 8B Q4
For a complete list of models, please visit the Cortex Hub.