Learn how to monitor your credit consumption, understand how tokens are counted, and optimize your requests to get the most out of your balance.
You can programmatically query your current credit balance at any time using the following endpoint:
```http
GET https://llmapi.resayil.io/api/billing/subscription
Authorization: Bearer YOUR_API_KEY
```
A successful request returns a JSON response containing your subscription details and remaining credit balance:
```json
{
  "tier": "free",
  "status": "active",
  "expires_at": null,
  "credits": 842.50
}
```
The credits field represents your currently available balance. Credits are deducted with each request based on token count and the model used.
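The balance check above can be sketched in Python using only the standard library. The endpoint and response shape are taken from this page; the helper names and the use of `urllib` are illustrative, not a required client:

```python
import json
import urllib.request

BASE_URL = "https://llmapi.resayil.io"

def parse_subscription(payload: dict) -> float:
    """Extract the remaining credit balance from a subscription response."""
    return float(payload["credits"])

def get_credit_balance(api_key: str) -> float:
    """Call GET /api/billing/subscription and return the remaining credits."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/billing/subscription",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_subscription(json.load(resp))
```

You might call `get_credit_balance` at application startup and alert when the returned value drops below a threshold of your choosing.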
The system measures your consumption in tokens. One token is approximately four characters of English text, or roughly three-quarters of a word. Every request counts two types of tokens:

- **Prompt tokens**: the tokens in the input you send to the model.
- **Completion tokens**: the tokens in the output the model generates.

These values appear in the usage field of every response:
```json
{
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 118,
    "total_tokens": 160
  }
}
```
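A small helper for reading these counts out of a response might look like this; the function name is illustrative, and the input is a parsed response of the shape shown above:

```python
def read_usage(response: dict) -> tuple[int, int, int]:
    """Return (prompt, completion, total) token counts from a response."""
    usage = response["usage"]
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    total = usage["total_tokens"]
    # Sanity check: the reported total should equal the sum of its parts.
    assert total == prompt + completion
    return prompt, completion, total
```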
Tokens are not deducted at a flat rate across all models. The system applies a per-model multiplier that reflects that model's operational cost:
| Model Type | Multiplier Range | Examples |
|---|---|---|
| Standard models | 0.5× – 1.5× | Mistral, Llama 3, Neural Chat |
| Premium models | 2× – 3.5× | GPT-4o, Claude 3.5, Gemini Pro |
Example: A request with 200 total tokens using a standard model at 1× multiplier deducts 200 credits. The same request through a premium model at 3× deducts 600 credits.
Note: You can find the exact multiplier for each model on the Available Models page.
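The arithmetic in the example above is straightforward; as a sketch (the function name is illustrative, and the multiplier must be looked up per model on the Available Models page):

```python
def credits_deducted(total_tokens: int, multiplier: float) -> float:
    """Credits charged for a request: total tokens times the model multiplier."""
    return total_tokens * multiplier

# 200 tokens on a standard model at 1x costs 200 credits;
# the same request on a premium model at 3x costs 600.
```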
LLM Resayil supports both full-response (non-streaming) and real-time streaming (stream: true) modes. Token counting is identical in both modes; the only difference is how you receive the data:

- **Non-streaming**: the complete response arrives in a single JSON payload, including the usage field.
- **Streaming**: the response arrives as incremental chunks, terminated by a [DONE] signal.
Credit deduction occurs after the full response is complete in both cases. The total token count is not affected by the delivery method.
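A consumer of the streaming mode might assemble chunks until the [DONE] signal arrives. The sketch below assumes an SSE-style stream of `data: <json>` lines, each chunk carrying a `content` field; only the [DONE] terminator is documented on this page, so treat the rest of the wire format as an assumption:

```python
import json

def collect_stream(lines) -> str:
    """Assemble streamed text chunks, stopping at the [DONE] signal.

    Assumes SSE-style `data: <json>` lines with a "content" field per
    chunk -- the exact wire format here is illustrative.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        parts.append(json.loads(payload).get("content", ""))
    return "".join(parts)
```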
The system maintains a last_used_at timestamp for each API key, updated with per-minute granularity to minimise database write pressure. You can see this value on the API Keys page in your dashboard.
This indicator is useful for spotting unusual activity or verifying that your application is using the correct key. If you see a key that has not been used in a long time, consider revoking it and generating a fresh one.
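Spotting a long-idle key can be automated. The sketch below assumes `last_used_at` is available as an ISO-8601 string (or `None` for a never-used key); the 90-day threshold is an arbitrary example, not a platform rule:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_used_at, max_idle_days: int = 90) -> bool:
    """Flag a key as stale if it was never used or has been idle too long.

    `last_used_at` is an ISO-8601 timestamp string, or None.
    The 90-day default is an illustrative threshold, not a platform rule.
    """
    if last_used_at is None:
        return True
    last_used = datetime.fromisoformat(last_used_at)
    return datetime.now(timezone.utc) - last_used > timedelta(days=max_idle_days)
```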
Follow these practices to reduce your per-request cost without sacrificing quality:

- Keep prompts concise: every prompt token counts toward the deduction.
- Prefer standard models (0.5× – 1.5× multiplier) when they meet your quality bar.
- Monitor the usage field of each response to catch unexpectedly long completions.
- Check your credit balance regularly and revoke API keys you no longer use.

Tip: Benchmark different models against your tasks and compare response quality to cost. In many cases a faster, cheaper standard model does the job just as well.