curl http://localhost:18003/health

curl -X POST "http://localhost:18003/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is artificial intelligence?"}
    ],
    "max_tokens": 256,
    "temperature": 0.7
}'

curl -X POST "http://localhost:18003/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Give me an introduction to Python"}
    ],
    "max_tokens": 300,
    "temperature": 0.7,
    "stream": true
}'

curl -X POST "http://localhost:18003/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "The future development trends of artificial intelligence are",
    "max_tokens": 200,
    "temperature": 0.8
}'

curl -X POST "http://localhost:18003/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Write a poem about spring",
    "max_tokens": 150,
    "temperature": 0.9,
    "stream": true
}'

curl -X POST "http://localhost:18003/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant"},
      {"role": "user", "content": "Hello"},
      {"role": "assistant", "content": "Hello! How can I help you?"},
      {"role": "user", "content": "Give me an introduction to machine learning"}
    ],
    "max_tokens": 300
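}'

messages: list of messages (chat endpoint)
prompt: input text (completion endpoint)
max_tokens: maximum number of tokens to generate (default 512)
temperature: sampling temperature (0.0-2.0, default 0.7)
top_p: nucleus sampling threshold (0.0-1.0, default 0.9)
stream: whether to stream the output (true/false, default false)
stop: stop sequence(s) (string or array)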
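
None of the requests above sets top_p or stop. The following is a minimal sketch of one way to include them, assuming the chat endpoint accepts both fields exactly as they appear in the parameter list. The helper names build_chat_body and send_chat are hypothetical and not part of the service, and the stop values "###" and "END" are placeholders chosen only to show the array form.

```shell
# Hypothetical helper: print a chat request body that sets top_p and stop.
# Assumption: the message ($1) contains no double quotes or backslashes,
# since this sketch does not JSON-escape its input.
build_chat_body() {
  printf '{"messages":[{"role":"user","content":"%s"}],"max_tokens":256,"temperature":0.7,"top_p":0.8,"stop":["###","END"]}' "$1"
}

# Hypothetical helper: POST the body to the chat endpoint used in the
# examples above. "-d @-" tells curl to read the request body from stdin.
send_chat() {
  build_chat_body "$1" | curl -s -X POST "http://localhost:18003/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d @-
}
```

With the service running, send_chat "Hello" would send the request. build_chat_body "Hello" on its own only prints the JSON body, which is handy for checking the payload before sending it.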