API Reference Documentation
This document describes the OpenAI-compatible AI service interfaces, including model management, chat completion, and text completion functionality.
Overview
Our AI API implements OpenAI-compatible interfaces that support:
Model listing and querying
Chat completion (streaming and non-streaming)
Text completion (streaming and non-streaming)
Administrator model management
Basic Information
Base URL: https://apis.gradient.network/api/v1
Authentication: Access Key
Content Type: application/json
API Version: v1
Authentication
Access Key Authentication
Pass your access key in the Authorization header of every request:
Authorization: Bearer your-access-key-here
API Endpoints
1. Model Management APIs
1.1 List All Models
Endpoint: GET /ai/models
Description: Returns a list of all available AI models (no authentication required)
Request Parameters: None
Response Example:
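An illustrative response in the OpenAI-compatible list format; the model entry shown is a placeholder, and the exact fields depend on the deployment:

```json
{
  "object": "list",
  "data": [
    {
      "id": "llama-3.1-8b-instruct",
      "object": "model",
      "created": 1715367049,
      "owned_by": "gradient"
    }
  ]
}
```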
Error Codes:
200: Success
500: Internal Server Error
2. Chat Completion API
2.1 Chat Completion
Endpoint: POST /ai/chat/completions
Description: Creates a chat completion for the given conversation, with support for both streaming and non-streaming responses
Authentication: Access Key
Request Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | The ID of the model to use |
| messages | array | Yes | Array of conversation messages |
| stream | boolean | No | Whether to stream the response; default false |
| max_tokens | integer | No | Maximum number of tokens to generate |
| temperature | number | No | Sampling temperature, 0-2; default 1 |
| top_p | number | No | Nucleus sampling parameter, 0-1; default 1 |
| n | integer | No | Number of responses to generate; default 1 |
| stop | string/array | No | Tokens at which to stop generation |
| presence_penalty | number | No | Presence penalty, -2.0 to 2.0; default 0 |
| frequency_penalty | number | No | Frequency penalty, -2.0 to 2.0; default 0 |
| logit_bias | object | No | Modify the sampling probability of specified tokens |
| user | string | No | User identifier |
Request Example:
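An illustrative request body (the model ID is a placeholder; list available IDs via GET /ai/models):

```json
{
  "model": "llama-3.1-8b-instruct",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "stream": false
}
```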
Non-streaming Response Example:
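A representative non-streaming response in the OpenAI-compatible format; IDs, timestamps, and token counts are illustrative:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1715367049,
  "model": "llama-3.1-8b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 9,
    "total_tokens": 27
  }
}
```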
Streaming Response Example:
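With stream set to true, the response is delivered as server-sent events: each data: line carries a chat.completion.chunk object and the stream ends with data: [DONE]. The chunks below are illustrative, assuming the standard OpenAI streaming format:

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1715367049,"model":"llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1715367049,"model":"llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}

data: [DONE]
```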
Error Codes:
200: Success
400: Bad Request
401: Unauthorized
402: Billing Check Failed
404: Model Not Found
429: Rate Limit Exceeded
500: Internal Server Error
3. Text Completion API
3.1 Text Completion
Endpoint: POST /ai/completions
Description: Creates a text completion for the given prompt, with support for both streaming and non-streaming responses
Authentication: Access Key
Request Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | The ID of the model to use |
| prompt | string/array | Yes | Prompt text |
| suffix | string | No | Suffix to append after the inserted text |
| max_tokens | integer | No | Maximum number of tokens to generate |
| temperature | number | No | Sampling temperature, 0-2; default 1 |
| top_p | number | No | Nucleus sampling parameter, 0-1; default 1 |
| n | integer | No | Number of responses to generate; default 1 |
| stream | boolean | No | Whether to stream the response; default false |
| logprobs | integer | No | Return log probabilities for the most likely tokens |
| echo | boolean | No | Whether to echo the prompt; default false |
| stop | string/array | No | Tokens at which to stop generation |
| presence_penalty | number | No | Presence penalty, -2.0 to 2.0; default 0 |
| frequency_penalty | number | No | Frequency penalty, -2.0 to 2.0; default 0 |
| best_of | integer | No | Generate this many candidate completions server-side and return the best; default 1 |
| logit_bias | object | No | Modify the sampling probability of specified tokens |
| user | string | No | User identifier |
Request Example:
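An illustrative request body (the model ID is a placeholder):

```json
{
  "model": "llama-3.1-8b-instruct",
  "prompt": "Say this is a test",
  "max_tokens": 64,
  "temperature": 0.7,
  "stream": false
}
```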
Response Example:
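A representative response in the OpenAI-compatible format; IDs and token counts are illustrative:

```json
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1715367049,
  "model": "llama-3.1-8b-instruct",
  "choices": [
    {
      "index": 0,
      "text": "\n\nThis is indeed a test.",
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
```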
Error Codes:
200: Success
400: Bad Request
401: Unauthorized
402: Billing Check Failed
404: Model Not Found
429: Rate Limit Exceeded
500: Internal Server Error
Error Code Details
Common Error Codes
| HTTP Status | Description | Solution |
| --- | --- | --- |
| 400 | Bad Request | Check request parameter format and required fields |
| 401 | Unauthorized | Provide a valid Access Key or JWT Token |
| 402 | Billing Check Failed | Check account balance and billing status |
| 403 | Forbidden | Confirm user permissions and roles |
| 404 | Resource Not Found | Check that the resource ID is correct |
| 429 | Rate Limit Exceeded | Reduce request frequency or contact an administrator |
| 500 | Internal Server Error | Contact technical support |
AI-Specific Error Codes
| Error Code | Description | Solution |
| --- | --- | --- |
| model_not_found | Specified model does not exist | Check the model ID or use /ai/models to list available models |
| model_not_supported | Model does not support the requested functionality | Check model capabilities or use another model |
| context_length_exceeded | Input exceeds the model's context length limit | Reduce input length or use a model with a longer context |
| invalid_parameters | Parameter values are invalid | Check parameter ranges and formats |
| billing_check_failed | Billing check failed | Check account balance and billing configuration |
Usage Examples
Python Examples
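A minimal non-streaming chat request using the requests library; this is a sketch, and the model ID is a placeholder (list available IDs via GET /ai/models):

```python
import requests

BASE_URL = "https://apis.gradient.network/api/v1"
ACCESS_KEY = "your-access-key-here"  # replace with your access key

headers = {
    "Authorization": f"Bearer {ACCESS_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "model": "llama-3.1-8b-instruct",  # placeholder model ID
    "messages": [{"role": "user", "content": "Hello!"}],
}

resp = requests.post(
    f"{BASE_URL}/ai/chat/completions", headers=headers, json=payload, timeout=60
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```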
JavaScript Examples
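The same request with the Fetch API (Node.js 18+ or a browser); the model ID is again a placeholder:

```javascript
const BASE_URL = "https://apis.gradient.network/api/v1";
const ACCESS_KEY = "your-access-key-here"; // replace with your access key

const resp = await fetch(`${BASE_URL}/ai/chat/completions`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${ACCESS_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "llama-3.1-8b-instruct", // placeholder model ID
    messages: [{ role: "user", content: "Hello!" }],
  }),
});
const data = await resp.json();
console.log(data.choices[0].message.content);
```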
cURL Examples
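And the equivalent from the command line:

```bash
curl https://apis.gradient.network/api/v1/ai/chat/completions \
  -H "Authorization: Bearer your-access-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```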
Rate Limits and Quotas
Rate Limits
Free Users: 60 requests per minute
Paid Users: Based on plan, typically 1000-10000 requests per minute
Token Limits
Input Tokens: Based on model context length limits
Output Tokens: Based on model capabilities and billing limits
Concurrency Limits
Free Users: Maximum 3 concurrent requests
Paid Users: Based on plan, typically 10-100 concurrent requests
Best Practices
1. Error Handling
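Always check the HTTP status before consuming the response body, and surface the status codes documented above as actionable errors. A minimal sketch (the helper name and error-message format are illustrative):

```python
import requests

def create_chat_completion(payload: dict) -> dict:
    """POST a chat completion and raise a descriptive error on failure."""
    resp = requests.post(
        "https://apis.gradient.network/api/v1/ai/chat/completions",
        headers={"Authorization": "Bearer your-access-key-here"},
        json=payload,
        timeout=60,
    )
    if resp.status_code != 200:
        # 400/401/402/404/429/500 are documented above; include the body for debugging
        raise RuntimeError(f"API error {resp.status_code}: {resp.text}")
    return resp.json()
```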
2. Streaming Processing
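Streaming responses should be consumed line by line as server-sent events rather than buffered whole. A sketch assuming the standard OpenAI chunk format shown earlier (the model ID is a placeholder):

```python
import json
import requests

url = "https://apis.gradient.network/api/v1/ai/chat/completions"
headers = {"Authorization": "Bearer your-access-key-here"}
payload = {
    "model": "llama-3.1-8b-instruct",  # placeholder model ID
    "messages": [{"role": "user", "content": "Write a haiku."}],
    "stream": True,
}

with requests.post(url, headers=headers, json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE frames look like "data: {...}"; skip blank keep-alive lines
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content", "")
        print(delta, end="", flush=True)
print()
```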
3. Retry Mechanism
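Retry only transient failures (429 and 5xx), backing off exponentially so you stay within the rate limits above. A sketch with hypothetical helper and parameter names:

```python
import time
import requests

RETRYABLE = {429, 500, 502, 503, 504}

def post_with_retries(url, headers, payload, max_retries=5):
    """POST with exponential backoff on rate limits and server errors."""
    for attempt in range(max_retries):
        try:
            resp = requests.post(url, headers=headers, json=payload, timeout=60)
        except requests.ConnectionError:
            resp = None  # treat network failures as retryable
        if resp is not None and resp.status_code not in RETRYABLE:
            resp.raise_for_status()  # non-retryable client errors fail fast
            return resp.json()
        time.sleep(min(2 ** attempt, 30))  # 1s, 2s, 4s, ... capped at 30s
    raise RuntimeError(f"Request failed after {max_retries} attempts")
```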
Support
If you encounter issues, please:
Review the error code descriptions in this document
Check request parameters and authentication information
Contact the technical support team
Check the system status page