Quickstart
warning
🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
Installation
To install Cortex, download the installer for your operating system from the following options:
- Stable Version
Start Cortex.cpp Processes and API Server
This command starts the Cortex.cpp API server at `localhost:3928`:
```bash
cortex
```
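If you want to confirm the server is up before continuing, you can query its OpenAI-compatible model listing endpoint. This is a minimal sketch that assumes the `/v1/models` route is available in your build:
```bash
# Expect an HTTP 200 response with a JSON list of models
# (empty until you download one).
curl http://localhost:3928/v1/models
```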
Run a Model
This command downloads the default model in GGUF format from the Cortex Hub and starts it:
```bash
cortex run mistral
```
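If you would rather download a model without starting it right away, the CLI also exposes a pull command. This is a sketch; check `cortex --help` to confirm the exact subcommand in your version:
```bash
# Download the model files only; start the model later with `cortex run mistral`.
cortex pull mistral
```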
info
All model files are stored in the `~/cortex/models` folder.
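To inspect which models have been downloaded, you can list the models Cortex knows about (assuming your CLI version provides the `models list` subcommand):
```bash
# Print the locally available models and their statuses.
cortex models list
```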
Using the Model
CLI
```bash
# CLI
cortex chat mistral
```
API
```bash
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "stream": true,
    "max_tokens": 1,
    "stop": [null],
    "frequency_penalty": 1,
    "presence_penalty": 1,
    "temperature": 1,
    "top_p": 1
  }'
```
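With `"stream": true`, the server replies with a sequence of chunks rather than one JSON object. If you prefer a single complete response, the sketch below flips the flag, assuming the endpoint follows the usual OpenAI-style semantics for `stream`:
```bash
# Same request, but returns one complete JSON body instead of streamed chunks.
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "stream": false
  }'
```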
Cortex.js
```js
// Assumes `cortex` is an already-configured Cortex.js client instance.
const resp = await cortex.chat.completions.create({
  model: "mistral",
  messages: [
    { role: "system", content: "You are a chatbot." },
    { role: "user", content: "What is the capital of the United States?" },
  ],
});
```
Cortex.py
```py
# Assumes `client` is an already-configured Cortex.py client instance.
completion = client.chat.completions.create(
    model="mistral",
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        },
    ],
)
```
Stop a Model
This command stops the running model:
```bash
cortex models stop
```
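Depending on your CLI version, `stop` may also accept a model id so you can target a specific running model. This is an assumption, so check `cortex models --help`:
```bash
# Hypothetical invocation that stops the mistral model specifically.
cortex models stop mistral
```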
Show the System State
This command displays the running model and the system hardware status:
```bash
cortex ps
```
Run Different Model Variants
```bash
# Run a Hugging Face model using its repo ID
cortex run TheBloke/Mistral-7B-Instruct-v0.2-GGUF

# Run Mistral in the ONNX format
cortex run mistral:onnx

# Run Mistral in the TensorRT-LLM format
cortex run mistral:tensorrt-llm
```