Chat Completions

Generate model responses for a conversation.

Endpoint

POST https://api.alltoken.ai/v1/chat/completions

Request body

JSON
{
  "model": "deepseek-chat",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
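The request body above can be sent with the Python standard library alone. A minimal sketch; the Bearer-token `Authorization` header is an assumption based on common API conventions and is not specified in this reference:

```python
import json
import urllib.request

def build_payload(user_text, system_text="You are a helpful assistant."):
    """Assemble the request body shown above."""
    return {
        "model": "deepseek-chat",
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "stream": False,
        "temperature": 0.7,
        "max_tokens": 1024,
    }

def chat(user_text, api_key):
    """POST the payload to the endpoint and return the parsed JSON response.
    The Bearer-token header is an assumption, not taken from this reference."""
    req = urllib.request.Request(
        "https://api.alltoken.ai/v1/chat/completions",
        data=json.dumps(build_payload(user_text)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```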

Parameters

  • model (required) — model ID (e.g. "deepseek-chat")
  • messages (required) — array of message objects with role and content
  • stream — true for SSE streaming, false for a complete response (default: false)
  • temperature — sampling temperature, 0-2 (default: 1)
  • top_p — nucleus sampling, 0-1 (default: 1)
  • max_tokens — maximum tokens to generate
  • frequency_penalty — penalize repeated tokens, -2 to 2 (default: 0)
  • presence_penalty — penalize topic repetition, -2 to 2 (default: 0)
  • tools — array of tool/function definitions for function calling
  • response_format — {"type": "json_object"} for guaranteed JSON output
  • web_search — true to enable web search (model-dependent)
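With stream set to true, the response arrives as SSE "data:" lines. A sketch of reassembling the assistant's text from such a stream; the chunk shape (choices[0].delta.content and a terminating [DONE] sentinel) is an assumption modeled on common chat-completions streaming APIs, not confirmed by this reference:

```python
import json

def collect_stream(lines):
    """Concatenate assistant text from SSE 'data:' lines.
    Assumes each chunk carries choices[0].delta and the stream
    ends with a 'data: [DONE]' sentinel (common convention)."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)
```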

Message roles

  • system — sets the model's behavior and context
  • user — the human's message
  • assistant — the model's previous response (for multi-turn)
  • tool — result of a function call (with tool_call_id)
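The roles above compose into multi-turn conversations: the model is stateless, so each request must resend the full history, with the previous reply included as an assistant message. A small sketch:

```python
def extend_conversation(messages, assistant_reply, next_user_text):
    """Append the model's previous reply as an 'assistant' turn,
    then the next 'user' turn, so the full history is resent."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_text},
    ]

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
history = extend_conversation(
    history,
    "Hello! How can I help you today?",
    "What can you do?",
)
```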

Response

JSON
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 10,
    "total_tokens": 22
  }
}
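The fields most callers need can be pulled out of this response as follows, shown here against the sample payload above:

```python
import json

# The sample response from the reference above.
sample_response = '''
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 10,
    "total_tokens": 22
  }
}
'''

resp = json.loads(sample_response)
reply = resp["choices"][0]["message"]["content"]   # assistant text
finish = resp["choices"][0]["finish_reason"]       # "stop", or e.g. a length cutoff
used = resp["usage"]["total_tokens"]               # for cost/budget tracking
```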