Claude Code is an AI coding agent by Anthropic that reads code, edits files, and runs commands in the terminal.
Installation
curl -fsSL https://claude.ai/install.sh | bash
Open CC Switch and add a RuoLi configuration:
| Field | Value |
|---|
| API URL | https://ruoli.dev |
| API Key | Your Key |
Select Claude Code and click Apply.
The API URL for Claude Code is https://ruoli.dev, without /v1.
Switch Models
Beyond native Claude models, RuoLi routes to third-party models like gpt-5.5, GLM, Kimi, and DeepSeek. Open CC Switch → Edit Provider → Model Mapping, point each Claude alias at the target model, and save — no JSON editing required.
| Field | Purpose |
|---|
| Main Model | The default model Claude Code calls |
| Thinking Model | Used when extended / reasoning mode is on |
| Haiku Default Model | Target model for the haiku alias |
| Sonnet Default Model | Target model for the sonnet alias |
| Opus Default Model | Target model for the opus alias |
Leave these blank if your provider is natively Claude — only fill them in when you want to route aliases to a different model.
Claude Code injects a dynamic attribution header (containing session ID and more) into every request by default — this completely destroys upstream prompt caching, causing cache hit rates to collapse, costs to spike, and responses to slow down.Always disable it, whether you’re using native Claude models or third-party models like GLM / Kimi / DeepSeek.
In CC Switch → Edit Provider → Config JSON, add one line to env:
"CLAUDE_CODE_ATTRIBUTION_HEADER": "0"
The same applies to manual ~/.claude/settings.json — see the example below.
Manual Configuration
Edit ~/.claude/settings.json:
{
"env": {
"ANTHROPIC_BASE_URL": "https://ruoli.dev",
"ANTHROPIC_AUTH_TOKEN": "sk-YOUR-KEY",
"CLAUDE_CODE_ATTRIBUTION_HEADER": "0"
}
}
Or via environment variables:
export ANTHROPIC_BASE_URL=https://ruoli.dev
export ANTHROPIC_AUTH_TOKEN=sk-YOUR-KEY
export CLAUDE_CODE_ATTRIBUTION_HEADER=0
Configure Context Compression
Claude Code auto-compresses context at about 83% of the context window by default. If you’re using a 1M context window, you can adjust when compression triggers via CLAUDE_AUTOCOMPACT_PCT_OVERRIDE.
For example, to trigger compression at 180k tokens (180k / 1000k = 18%):
Add to the env block in ~/.claude/settings.json:
{
"env": {
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "18"
}
}
Or set a temporary environment variable:
export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=18
| Value | Trigger point (1M window) | Use case |
|---|
18 | ~180k tokens | Keep context clean frequently |
50 | ~500k tokens | Balance performance and context |
83 | ~830k tokens (default) | Maximize context utilization |
You can only set the threshold below 83%. Values above 83% are silently ignored.