
Quickstart

warning

🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.

Installation

To install Cortex, download and run the installer for your operating system.

Start Cortex.cpp Processes and API Server

This command starts the Cortex.cpp API server at localhost:3928.


cortex
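
Once the server is running, you can sanity-check that it is reachable before sending any requests. This is a minimal sketch, assuming only that the server accepts TCP connections on port 3928:

import socket

# Sanity check: confirm the Cortex.cpp API server is accepting
# connections on localhost:3928 before sending any requests.
with socket.create_connection(("localhost", 3928), timeout=5):
    print("Cortex.cpp API server is reachable on localhost:3928")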

Run a Model

This command downloads the model in the default GGUF format from the Cortex Hub and starts it.


cortex run mistral

info

All model files are stored in the ~/cortex/models folder.
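
If you want to inspect what has been downloaded, you can list that folder. A minimal Python sketch, assuming the default location noted above:

from pathlib import Path

# Default model storage location noted above.
models_dir = Path.home() / "cortex" / "models"
for entry in sorted(models_dir.iterdir()):
    print(entry.name)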

Using the Model

CLI


# CLI
cortex chat mistral

API


curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ],
    "stream": true,
    "max_tokens": 128,
    "frequency_penalty": 1,
    "presence_penalty": 1,
    "temperature": 1,
    "top_p": 1
  }'
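
Because the request above sets "stream": true, the reply arrives incrementally. Here is a minimal Python sketch for consuming such a stream, assuming the endpoint follows the OpenAI-compatible server-sent-events format (lines of the form "data: {...}", terminated by "data: [DONE]"):

import json
import requests

# Mirrors the curl example above.
payload = {
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
    "max_tokens": 128,
}

with requests.post(
    "http://localhost:3928/v1/chat/completions", json=payload, stream=True
) as resp:
    for line in resp.iter_lines():
        # Each SSE line carries one JSON chunk prefixed with "data: ".
        if not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        # Each chunk holds a partial message delta (OpenAI-style).
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)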

Cortex.js


// 'cortex' is assumed to be an initialized Cortex.js client instance.
const resp = await cortex.chat.completions.create({
  model: "mistral",
  messages: [
    { role: "system", content: "You are a chatbot." },
    { role: "user", content: "What is the capital of the United States?" },
  ],
});
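// Assuming an OpenAI-compatible response shape, print the reply text.
console.log(resp.choices[0].message.content);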

Cortex.py


# 'client' is assumed to be an initialized Cortex Python client instance.
completion = client.chat.completions.create(
    model="mistral",
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        },
    ],
)
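# Assuming an OpenAI-compatible response shape, print the reply text.
print(completion.choices[0].message.content)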

Stop a Model

This command stops the running model.


cortex models stop

Show the System State

This command displays the currently running models and the system hardware status.


cortex ps

Run Different Model Variants


# Run a model from a Hugging Face repo
cortex run TheBloke/Mistral-7B-Instruct-v0.2-GGUF
# Run Mistral in ONNX format
cortex run mistral:onnx
# Run Mistral in TensorRT-LLM format
cortex run mistral:tensorrt-llm

info

Cortex.cpp is still in early development, so if you have any questions, please reach out to us.