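# GasFlux Web API Documentation

## Overview

The GasFlux Web API is a Flask-based RESTful API for uploading data files, running gas flux analysis, and downloading the results. Processing is asynchronous: an upload returns a task ID immediately, large datasets are processed in the background, and clients poll the task to follow its status.

## Quick Start

### Basics

- **Base URL**: `http://localhost:5000`
- **Authentication**: none required
- **Data format**: JSON
- **File size limit**: 100 MB
- **Supported file types**:
  - Data files: `.xlsx`, `.xls`
  - Configuration files: `.yaml`, `.yml`

### Complete Workflow Example
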
```python
import requests
import time

BASE_URL = 'http://localhost:5000'

def process_gasflux_data():
    # 1. Check API health. Responses are wrapped as {"code", "message", "data"}.
    health = requests.get(f'{BASE_URL}/health').json()
    print(f"API Status: {health['data']['status']}")

    # 2. Upload the data file
    with open('data.xlsx', 'rb') as f:
        upload_response = requests.post(f'{BASE_URL}/upload', files={'file': f})
    upload_response.raise_for_status()
    task_id = upload_response.json()['data']['job_id']
    print(f"Task created: {task_id}")

    # 3. Poll the task status
    while True:
        status = requests.get(f'{BASE_URL}/task/{task_id}').json()['data']
        print(f"Status: {status['status']} - {status['message']}")

        if status['status'] == 'completed':
            # 4. Download the result files
            for result_file in status['results']:
                download_url = f"{BASE_URL}{result_file['download_url']}"
                download_response = requests.get(download_url)
                with open(result_file['name'], 'wb') as out:
                    out.write(download_response.content)
                print(f"Downloaded: {result_file['name']}")
            break
        elif status['status'] == 'failed':
            print(f"Task failed: {status.get('error', 'Unknown error')}")
            break

        time.sleep(3)  # check the status every 3 seconds
```

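## API Endpoints

### 🔍 Monitoring and Health Checks

#### 1. Get Health Status
**Endpoint**: `GET /health`

**Description**: Returns the API's health status, system information, and performance metrics. The health check evaluates several key indicators; when any of them falls outside its normal range, the service status is reported as `degraded`.

**Health check logic**:
- **Storage check**: verifies that the upload and output folders are writable
- **Load check**: raises a warning when more than 20 tasks are active
- **Error rate check**: marks the service unhealthy when the HTTP error rate exceeds 10%
- **Overall assessment**: any failed check affects the overall health status
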
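The checks listed above can be sketched as a single function. This is only an illustration under stated assumptions: the function name `evaluate_health`, its parameters, and the wording of the issue strings are hypothetical and not taken from the GasFlux source; only the thresholds (20 active tasks, 10% error rate) and the 200/503 mapping come from this document.

```python
def evaluate_health(uploads_writable, outputs_writable, active_tasks,
                    error_rate_percent, max_active_tasks=20,
                    max_error_rate_percent=10.0):
    """Combine the individual checks into one status (hypothetical helper)."""
    issues = []
    if not uploads_writable:
        issues.append("upload folder is not writable")
    if not outputs_writable:
        issues.append("output folder is not writable")
    if active_tasks > max_active_tasks:
        issues.append(f"too many active tasks ({active_tasks})")
    if error_rate_percent > max_error_rate_percent:
        issues.append(f"error rate too high ({error_rate_percent}%)")

    # Any failed check degrades the service; /health then answers 503.
    healthy = not issues
    return {
        "status": "healthy" if healthy else "degraded",
        "http_code": 200 if healthy else 503,
        "issues": issues,
    }
```

Collecting every failed check, rather than stopping at the first one, lets the response list all problems at once, which is how the `issues` array in the 503 example behaves.
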
**Example response** - healthy (200):
```json
{
  "code": 200,
  "message": "健康检查完成",
  "data": {
    "status": "healthy",
    "version": "1.0.0",
    "timestamp": 1705257600.123,
    "uptime": "2h 30m 15s",
    "storage": {
      "uploads_writable": true,
      "outputs_writable": true
    },
    "tasks": {
      "active_count": 2,
      "total_tracked": 15,
      "total_processed": 13,
      "success_rate_percent": 92.31
    },
    "performance": {
      "requests_per_second": 0.08,
      "avg_response_time_ms": 234.56,
      "error_rate_percent": 1.5
    }
  }
}
```

**Example response** - unhealthy (503):
```json
{
  "code": 503,
  "message": "服务不可用",
  "data": {
    "status": "degraded",
    "version": "1.0.0",
    "timestamp": 1705257600.123,
    "uptime": "1h 30m 45s",
    "storage": {
      "uploads_writable": true,
      "outputs_writable": true
    },
    "tasks": {
      "active_count": 0,
      "total_tracked": 5,
      "total_processed": 3,
      "success_rate_percent": 60.0
    },
    "performance": {
      "requests_per_second": 0.12,
      "avg_response_time_ms": 145.67,
      "error_rate_percent": 50.0
    },
    "issues": [
      "错误率过高 (50.0%)",
      "活跃任务数量过多 (25)"
    ]
  }
}
```

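**Status codes**:
- `200`: API is healthy (`status: "healthy"`)
- `503`: API is unhealthy (`status: "degraded"`); the response message reads "service unavailable"

**Health states**:
- **healthy**: all checks pass and the service is running normally
- **degraded**: one or more checks failed; the service can still run but needs attention. Possible causes:
  - Error rate > 10%: the HTTP request error rate is too high
  - Active tasks > 20: the system load is too high
  - Storage not writable: file system permission problem

**Fields**:
- `storage.uploads_writable`: whether the upload folder is writable
- `storage.outputs_writable`: whether the output folder is writable
- `tasks.active_count`: number of currently active tasks
- `performance.error_rate_percent`: HTTP request error rate, in percent
- `issues`: list of specific problems, present when the status is `degraded`
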
---
|
||
|
||
#### 2. Get System Statistics
**Endpoint**: `GET /stats`

**Description**: Returns detailed API statistics and monitoring data: request counts, task states, performance metrics, and system resource usage.

**Example response** (200):
```json
|
||
{
|
||
"code": 200,
|
||
"message": "统计信息获取成功",
|
||
"data": {
|
||
"summary": {
|
||
"uptime_seconds": 3600.5,
|
||
"uptime_formatted": "1h 0m 0s",
|
||
"requests_total": 150,
|
||
"requests_per_second": 0.04,
|
||
"error_rate_percent": 2.0,
|
||
"active_tasks": 1
|
||
},
|
||
"requests": {
|
||
"by_method": {
|
||
"GET": 120,
|
||
"POST": 30
|
||
},
|
||
"by_status": {
|
||
"200": 145,
|
||
"400": 3,
|
||
"500": 2
|
||
},
|
||
"top_endpoints": {
|
||
"/task/abc123": 45,
|
||
"/health": 30,
|
||
"/": 25
|
||
}
|
||
},
|
||
"tasks": {
|
||
"total_created": 25,
|
||
"total_completed": 20,
|
||
"total_failed": 2,
|
||
"success_rate_percent": 90.91,
|
||
"by_status": {
|
||
"pending": 1,
|
||
"processing": 1,
|
||
"completed": 20,
|
||
"failed": 2
|
||
}
|
||
},
|
||
"performance": {
|
||
"avg_response_time_ms": 245.67,
|
||
"max_response_time_ms": 1250.34,
|
||
"min_response_time_ms": 12.45
|
||
},
|
||
"system": {
|
||
"memory_usage_percent": 45.2,
|
||
"memory_used_gb": 7.3,
|
||
"memory_total_gb": 16.0,
|
||
"disk_usage_percent": 23.1,
|
||
"disk_used_gb": 46.8,
|
||
"disk_total_gb": 203.2
|
||
},
|
||
"recent_tasks": [
|
||
{
|
||
"task_id": "abc123-def456",
|
||
"status": "completed",
|
||
"age_seconds": 45.2,
|
||
"message": "处理完成成功"
|
||
}
|
||
]
|
||
}
|
||
}
|
||
```
|
||
|
||
**Fields**:
- `summary.uptime_seconds`: API uptime in seconds
- `summary.uptime_formatted`: formatted uptime (for example "1h 0m 0s")
- `summary.requests_total`: total number of requests
- `summary.requests_per_second`: average requests per second
- `summary.error_rate_percent`: request error rate, in percent
- `summary.active_tasks`: number of active tasks (status `pending` or `processing`)
- `requests.by_method`: request counts grouped by HTTP method
- `requests.by_status`: request counts grouped by HTTP status code
- `requests.top_endpoints`: the 10 most requested endpoints
- `tasks.total_created`: total number of tasks created
- `tasks.total_completed`: number of completed tasks
- `tasks.total_failed`: number of failed tasks
- `tasks.success_rate_percent`: task success rate, in percent
- `tasks.by_status`: task counts grouped by status
- `performance.avg_response_time_ms`: average response time (ms)
- `performance.max_response_time_ms`: maximum response time (ms)
- `performance.min_response_time_ms`: minimum response time (ms)
- `system.memory_usage_percent`: memory usage, in percent
- `system.memory_used_gb`: memory in use (GB)
- `system.memory_total_gb`: total memory (GB)
- `system.disk_usage_percent`: disk usage, in percent (for the disk holding the output directory)
- `system.disk_used_gb`: disk space in use (GB)
- `system.disk_total_gb`: total disk space (GB)
- `recent_tasks[]`: status information for the 20 most recent tasks

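Several of the derived fields can be reproduced from the raw counters in the example response. The sketch below is an assumption about how such values might be computed, not the server's actual code; the function names are hypothetical. It does reproduce the example figures: 20 completed and 2 failed tasks give 90.91, and 150 requests over 3600.5 seconds give 0.04.

```python
def success_rate_percent(completed, failed):
    """Share of finished tasks that completed successfully, in percent."""
    finished = completed + failed
    if finished == 0:
        return 0.0
    return round(completed / finished * 100, 2)


def requests_per_second(requests_total, uptime_seconds):
    """Average request rate since startup."""
    if uptime_seconds <= 0:
        return 0.0
    return round(requests_total / uptime_seconds, 2)


def format_uptime(uptime_seconds):
    """Render seconds as 'Xh Ym Zs', dropping the fractional part."""
    total = int(uptime_seconds)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}m {seconds}s"
```
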
---
|
||
|
||
#### 3. Reset Statistics
**Endpoint**: `POST /stats/reset`

**Description**: Resets all API statistics (administrator function).

**Example response** (200):
```json
|
||
{
|
||
"code": 200,
|
||
"message": "统计信息重置成功",
|
||
"data": {
|
||
"timestamp": 1705257600.123
|
||
}
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
#### 4. Get Configuration
**Endpoint**: `GET /config`

**Description**: Returns the current application configuration and the supported environment variables.

**Example response** (200):
```json
|
||
{
|
||
"code": 200,
|
||
"message": "配置信息获取成功",
|
||
"data": {
|
||
"configuration": {
|
||
"host": "0.0.0.0",
|
||
"port": 5000,
|
||
"debug": false,
|
||
"base_dir": "/app",
|
||
"upload_folder": "/app/web_api_data/uploads",
|
||
"output_folder": "/app/web_api_data/outputs",
|
||
"max_content_length": 104857600,
|
||
"log_level": "INFO",
|
||
"log_file": "logs/gasflux_api.log",
|
||
"cors_origins": ["*"],
|
||
"task_cleanup_interval": 3600,
|
||
"max_task_age": 86400,
|
||
"threads": 8,
|
||
"connection_limit": 100,
|
||
"channel_timeout": 300
|
||
},
|
||
"environment_variables": {
|
||
"supported": [
|
||
"GASFLUX_HOST", "GASFLUX_PORT", "GASFLUX_DEBUG",
|
||
"GASFLUX_UPLOAD_FOLDER", "GASFLUX_OUTPUT_FOLDER",
|
||
"GASFLUX_MAX_CONTENT_LENGTH", "GASFLUX_LOG_LEVEL",
|
||
"GASFLUX_LOG_FILE", "GASFLUX_CORS_ORIGINS",
|
||
"GASFLUX_TASK_CLEANUP_INTERVAL", "GASFLUX_MAX_TASK_AGE",
|
||
"GASFLUX_THREADS", "GASFLUX_CONNECTION_LIMIT",
|
||
"GASFLUX_CHANNEL_TIMEOUT"
|
||
],
|
||
"current_values": {
|
||
"GASFLUX_HOST": "0.0.0.0",
|
||
"GASFLUX_PORT": "5000",
|
||
"GASFLUX_DEBUG": "false"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
### 📤 File Upload and Management

#### 5. Upload and Process a File
**Endpoint**: `POST /upload`

**Description**: Uploads a data file and starts an asynchronous processing task.

**Request parameters** (multipart/form-data):
- `file` (required): data file (`.xlsx` or `.xls`)
- `config` (optional): configuration file (`.yaml` or `.yml`)

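Checking the file type and size on the client before uploading avoids a round trip that would end in a 400 or 413. The following is a minimal sketch under assumptions: the helper name `check_upload` and its return format are hypothetical, and the 100 MB limit is applied per file here although the server limit (`max_content_length`) applies to the whole request.

```python
from pathlib import Path

DATA_EXTENSIONS = {".xlsx", ".xls"}
CONFIG_EXTENSIONS = {".yaml", ".yml"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # 104857600 bytes


def check_upload(filename, size_bytes, kind="data"):
    """Return a list of problems; an empty list means the file looks acceptable."""
    allowed = DATA_EXTENSIONS if kind == "data" else CONFIG_EXTENSIONS
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in allowed:
        problems.append(f"unsupported {kind} file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

In practice `size_bytes` would come from `os.path.getsize(path)`; the server remains the authority, so a passing check does not guarantee acceptance.
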
**Request example** (cURL):
```bash
curl -X POST \
  -F "file=@data.xlsx" \
  -F "config=@config.yaml" \
  http://localhost:5000/upload
```

**Request example** (Python):
```python
import requests

# 'config' is optional; leave it out of the dict to upload only the data file
with open('data.xlsx', 'rb') as data_f, open('config.yaml', 'rb') as config_f:
    files = {'file': data_f, 'config': config_f}
    response = requests.post('http://localhost:5000/upload', files=files)

result = response.json()
print(f"Task ID: {result['data']['job_id']}")
```

**Success response** (202):
```json
{
  "code": 202,
  "message": "任务已接受并加入处理队列",
  "data": {
    "status": "accepted",
    "job_id": "abc123-def456-ghi789",
    "task_status_url": "/task/abc123-def456-ghi789"
  }
}
```

**Error response examples**:

- Unsupported file type (400):
```json
{
  "code": 400,
  "message": "无效的数据文件类型。只允许 .xlsx 和 .xls 格式。",
  "data": {}
}
```

- File too large (413):
```json
{
  "code": 413,
  "message": "文件过大。最大尺寸为 100MB。",
  "data": {}
}
```

- Invalid configuration file type (400):
```json
{
  "code": 400,
  "message": "无效的配置文件类型。只允许 .yaml 和 .yml 格式。",
  "data": {}
}
```

### 📊 Task Management and Monitoring

#### 6. Query Task Status
**Endpoint**: `GET /task/{task_id}`

**Description**: Returns the current status and progress of an asynchronous processing task.

**Path parameters**:
- `task_id`: task ID (UUID format)

**Example response** - processing (200):
```json
|
||
{
|
||
"code": 200,
|
||
"message": "任务查询成功",
|
||
"data": {
|
||
"task_id": "abc123-def456-ghi789",
|
||
"status": "processing",
|
||
"message": "GasFlux 分析完成,正在生成报告...",
|
||
"updated_at": 1705257600.123,
|
||
"progress": {
|
||
"stage": "report_generation",
|
||
"completed_steps": 4,
|
||
"total_steps": 5,
|
||
"estimated_time_remaining": 45
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
**Example response** - completed (200):
|
||
```json
|
||
{
|
||
"code": 200,
|
||
"message": "任务查询成功",
|
||
"data": {
|
||
"task_id": "abc123-def456-ghi789",
|
||
"status": "completed",
|
||
"message": "处理完成成功",
|
||
"updated_at": 1705257600.123,
|
||
"processing_time_seconds": 125.67,
|
||
"results": [
|
||
{
|
||
"name": "08_34_01_5m.processed_ch4_report.html",
|
||
"rel_path": "abc123-def456-ghi789/08_34_01_5m.processed_ch4_report.html",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed_ch4_report.html",
|
||
"size": 245760,
|
||
"type": "report"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_data.csv",
|
||
"rel_path": "abc123-def456-ghi789/08_34_01_5m.processed_data.csv",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed_data.csv",
|
||
"size": 153600,
|
||
"type": "data"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_config.yaml",
|
||
"rel_path": "abc123-def456-ghi789/08_34_01_5m.processed_config.yaml",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed_config.yaml",
|
||
"size": 2048,
|
||
"type": "config"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_output_vars.json",
|
||
"rel_path": "abc123-def456-ghi789/08_34_01_5m.processed_output_vars.json",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed_output_vars.json",
|
||
"size": 4096,
|
||
"type": "metadata"
|
||
}
|
||
]
|
||
}
|
||
}
|
||
```
|
||
|
||
**Example response** - failed (200):
```json
{
  "code": 200,
  "message": "任务查询成功",
  "data": {
    "task_id": "abc123-def456-ghi789",
    "status": "failed",
    "message": "处理失败",
    "updated_at": 1705257600.123,
    "processing_time_seconds": 23.45,
    "error": "处理失败: Invalid data format in column 'temperature'",
    "error_details": {
      "stage": "data_validation",
      "error_code": "INVALID_DATA_FORMAT",
      "traceback": "..."
    }
  }
}
```

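**Example response** - task not found (404):
```json
{
  "code": 404,
  "message": "任务未找到",
  "data": {}
}
```

**Task states**:
- `pending`: the task is queued and waiting to be processed
- `processing`: the task is being processed (the response includes progress information)
- `completed`: processing finished (the response includes the list of result files)
- `failed`: processing failed (the response includes error information)
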
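Because `completed` and `failed` are the only terminal states, a polling loop can be written once and reused. The sketch below is illustrative: `wait_for_task` and its parameters are hypothetical names, the caller supplies a `fetch_status` callable that returns the task's `data` object, and the 30 minute default timeout is an assumption based on the upper processing-time estimate in this document.

```python
import time

TERMINAL_STATES = {"completed", "failed"}


def wait_for_task(fetch_status, interval=3.0, timeout=1800.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal state or the timeout expires.

    fetch_status: callable returning a dict with at least a "status" key.
    sleep / clock: injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_status()
        if task["status"] in TERMINAL_STATES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task still {task['status']} after {timeout}s")
        sleep(interval)
```

With `requests`, `fetch_status` could be `lambda: requests.get(url).json()['data']`; the caller then inspects `status` to decide whether to download results or report the error.
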
---
|
||
|
||
#### 7. Update Task Status
**Endpoint**: `PUT /task/{task_id}`

**Description**: Updates a task's status, message, or priority.

**Path parameters**:
- `task_id`: task ID (UUID format)

**Request parameters** (JSON):
- `status` (optional): new task status
  - `pending`: requeue the task for processing
  - `processing`: mark as processing
  - `completed`: mark as completed
  - `failed`: mark as failed
- `message` (optional): status message or error description
- `priority` (optional): task priority (normal/high/low)

**Request example** (cURL):
```bash
# Mark the task as completed
curl -X PUT http://localhost:5000/task/abc123-def456-ghi789 \
  -H "Content-Type: application/json" \
  -d '{"status": "completed", "message": "手动标记为完成"}'

# Update the task message
curl -X PUT http://localhost:5000/task/abc123-def456-ghi789 \
  -H "Content-Type: application/json" \
  -d '{"message": "更新的状态消息"}'

# Set high priority
curl -X PUT http://localhost:5000/task/abc123-def456-ghi789 \
  -H "Content-Type: application/json" \
  -d '{"priority": "high"}'
```

**Request example** (Python):
```python
import requests

# Mark the task as failed
response = requests.put(
    'http://localhost:5000/task/abc123-def456-ghi789',
    json={
        'status': 'failed',
        'message': 'Processing failed due to invalid input data'
    }
)

# Update the task priority
response = requests.put(
    'http://localhost:5000/task/abc123-def456-ghi789',
    json={'priority': 'high'}
)
```

**Success response** (200):
```json
|
||
{
|
||
"code": 200,
|
||
"message": "任务更新成功",
|
||
"data": {
|
||
"task_id": "abc123-def456-ghi789",
|
||
"status": "updated",
|
||
"task_info": {
|
||
"status": "completed",
|
||
"message": "手动标记为完成",
|
||
"updated_at": 1705257600.123,
|
||
"priority": "normal"
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
**Error response examples**:

- Task not found (404):
```json
{
  "code": 404,
  "message": "任务未找到",
  "data": {}
}
```

- Invalid status (400):
```json
{
  "code": 400,
  "message": "无效状态。必须是以下之一: pending, processing, completed, failed",
  "data": {}
}
```

- Invalid request (400):
```json
{
  "code": 400,
  "message": "请求体必须是 JSON 格式",
  "data": {}
}
```

---
|
||
|
||
#### 8. Delete a Task
**Endpoint**: `DELETE /task/{task_id}`

**Description**: Deletes a task together with all of its files and data.

**Path parameters**:
- `task_id`: task ID (UUID format)

**Request example** (cURL):
```bash
# Delete the given task
curl -X DELETE http://localhost:5000/task/abc123-def456-ghi789
```

**Request example** (Python):
```python
import requests

# Delete the task
response = requests.delete('http://localhost:5000/task/abc123-def456-ghi789')

if response.status_code == 200:
    result = response.json()['data']  # unwrap the response envelope
    print(f"Task {result['task_id']} deleted")
    print(f"Folders deleted: {result['details']['folders_deleted']}")
    print(f"Size freed: {result['details']['total_size_deleted']} bytes")
else:
    print(f"Failed to delete task: {response.json()}")
```

**Success response** (200):
|
||
```json
|
||
{
|
||
"code": 200,
|
||
"message": "任务及相关文件删除成功",
|
||
"data": {
|
||
"task_id": "abc123-def456-ghi789",
|
||
"status": "deleted",
|
||
"details": {
|
||
"folders_deleted": 1,
|
||
"total_size_deleted": 307200,
|
||
"task_status": "completed"
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
**Error response examples**:

- Task not found (404):
```json
{
  "code": 404,
  "message": "任务未找到",
  "data": {}
}
```

- Task is being processed (409):
```json
{
  "code": 409,
  "message": "无法删除当前正在处理或等待处理的任务",
  "data": {
    "task_status": "processing"
  }
}
```

- Failed to delete files (500):
```json
{
  "code": 500,
  "message": "删除任务文件失败: Permission denied",
  "data": {}
}
```

**Notes**:
- Only completed or failed tasks can be deleted
- Tasks that are processing or waiting to be processed cannot be deleted
- Deletion removes both the task record and all related files
- Deletion is irreversible; use it with care

---
|
||
|
||
### 📋 Report Management and Queries

#### 9. List Generated Reports (Paginated)
**Endpoint**: `GET /reports`

**Description**: Returns a paginated list of all generated processing reports, with sorting and filtering.

**Query parameters**:
- `page` (optional): page number (default: 1)
- `per_page` (optional): reports per page (default: 20, maximum: 100)
- `sort_by` (optional): sort field (default: created_at)
  - `created_at`: sort by creation time
  - `task_id`: sort by task ID
  - `file_size`: sort by total file size
  - `processing_time`: sort by processing time
- `sort_order` (optional): sort order (default: desc)
  - `asc`: ascending
  - `desc`: descending
- `status` (optional): filter by task status
  - `completed`: only completed tasks
  - `failed`: only failed tasks
  - not specified: all tasks

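The pagination fields in the response follow directly from the total report count and the two paging parameters. The sketch below is an assumption about that arithmetic, with a hypothetical function name; it reproduces the documented example, where 45 reports at 20 per page give 3 pages.

```python
import math


def paginate(total_reports, page=1, per_page=20):
    """Compute the pagination block for a report listing (hypothetical helper)."""
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    if page < 1:
        raise ValueError("page must be at least 1")
    total_pages = math.ceil(total_reports / per_page)
    return {
        "page": page,
        "per_page": per_page,
        "total_reports": total_reports,
        "total_pages": total_pages,
        "has_next": page < total_pages,
        "has_prev": page > 1,
    }
```

A client can loop while `has_next` is true, incrementing `page`, to walk through every report.
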
**Request example** (cURL):
```bash
# First page, 20 reports per page, newest first
curl "http://localhost:5000/reports?page=1&per_page=20&sort_by=created_at&sort_order=desc"

# Second page, completed tasks only
curl "http://localhost:5000/reports?page=2&status=completed"

# Sort by processing time, ascending
curl "http://localhost:5000/reports?sort_by=processing_time&sort_order=asc"
```

**Request example** (Python):
```python
import requests

# Basic query
response = requests.get('http://localhost:5000/reports')
reports = response.json()

# Paginated query
params = {
    'page': 1,
    'per_page': 10,
    'sort_by': 'created_at',
    'sort_order': 'desc',
    'status': 'completed'
}
response = requests.get('http://localhost:5000/reports', params=params)
data = response.json()['data']  # unwrap the response envelope

print(f"Total reports: {data['pagination']['total_reports']}")
print(f"Page: {data['pagination']['page']}/{data['pagination']['total_pages']}")

for report in data['reports']:
    print(f"Task: {report['task_id']}")
    print(f"Status: {report['status']}")
    print(f"Created at: {report['created_at']}")
    print(f"File count: {report['file_count']}")
    if report['main_report']:
        print(f"Main report: {report['main_report']['download_url']}")
```

**Success response** (200):
|
||
```json
|
||
{
|
||
"code": 200,
|
||
"message": "报告列表获取成功",
|
||
"data": {
|
||
"reports": [
|
||
{
|
||
"task_id": "abc123-def456-ghi789",
|
||
"report_name": "08_34_01_5m",
|
||
"status": "completed",
|
||
"created_at": 1705257600.123,
|
||
"file_count": 4,
|
||
"total_size": 307200,
|
||
"processing_time_seconds": 125.67,
|
||
"main_report": {
|
||
"name": "08_34_01_5m.processed_ch4_report.html",
|
||
"size": 245760,
|
||
"type": "report",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run/08_34_01_5m.processed_ch4_report.html"
|
||
},
|
||
"all_files": [
|
||
{
|
||
"name": "08_34_01_5m.processed_ch4_report.html",
|
||
"size": 245760,
|
||
"type": "report",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run/08_34_01_5m.processed_ch4_report.html"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_data.csv",
|
||
"size": 153600,
|
||
"type": "data",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run/08_34_01_5m.processed_data.csv"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_config.yaml",
|
||
"size": 2048,
|
||
"type": "config",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run/08_34_01_5m.processed_config.yaml"
|
||
},
|
||
{
|
||
"name": "08_34_01_5m.processed_output_vars.json",
|
||
"size": 4096,
|
||
"type": "metadata",
|
||
"download_url": "/download/abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run/08_34_01_5m.processed_output_vars.json"
|
||
}
|
||
],
|
||
"run_directory": "abc123-def456-ghi789/08_34_01_5m.processed/2026-01-14_10-33-29-961698_processing_run"
|
||
}
|
||
],
|
||
"pagination": {
|
||
"page": 1,
|
||
"per_page": 20,
|
||
"total_reports": 45,
|
||
"total_pages": 3,
|
||
"has_next": true,
|
||
"has_prev": false
|
||
},
|
||
"filters": {
|
||
"sort_by": "created_at",
|
||
"sort_order": "desc",
|
||
"status": null
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
**错误响应示例**:
|
||
|
||
- 参数无效 (400):
|
||
```json
|
||
{
|
||
"code": 400,
|
||
"message": "Invalid parameter: per_page must be between 1 and 100",
|
||
"data": {}
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
### 📁 File Download

#### 10. Download Processing Results
**Endpoint**: `GET /download/{filename}`

**Description**: Downloads a processed result file.

**Path parameters**:
- `filename`: relative path of the file (including the task ID)

**Request example** (cURL):
```bash
# Download the HTML report
curl -O http://localhost:5000/download/abc123-def456-ghi789/report.html

# Download the CSV data
curl -O http://localhost:5000/download/abc123-def456-ghi789/data.csv

# Download with Python (run from the shell through a heredoc)
python3 - <<'EOF'
import requests

response = requests.get('http://localhost:5000/download/abc123-def456-ghi789/report.html')
with open('report.html', 'wb') as f:
    f.write(response.content)
EOF
```

**Status codes**:
- `200`: file downloaded successfully
- `403`: access denied (path traversal protection)
- `404`: file does not exist
- `400`: path is not a file

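The 403 case exists because `filename` is a relative path supplied by the client. One common way to implement such a guard is to resolve the requested path and confirm it still lies inside the output folder. The sketch below is a generic illustration, not the GasFlux implementation; `resolve_download_path` and the example root directory are hypothetical.

```python
from pathlib import Path


def resolve_download_path(output_root, relative_path):
    """Return the absolute path for a download, or raise if it escapes the root."""
    root = Path(output_root).resolve()
    candidate = (root / relative_path).resolve()
    try:
        # relative_to raises ValueError when candidate is outside root
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError("requested path is outside the output folder")
    return candidate
```

A server would map `PermissionError` to 403, then check that the path exists (404) and is a regular file (400) before sending it.
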
---
|
||
|
||
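### 🌐 Web Interface

#### 11. Web Management Interface
**Endpoint**: `GET /`

**Description**: Serves a user-friendly web interface for uploading files, monitoring tasks, and downloading results.

**Response**: an HTML page containing:
- a file upload form
- a task status monitoring panel
- download links for result files
- system status information
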
---
|
||
|
||
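## 🔄 Processing Flow in Detail

### Complete Processing Flow

1. **File upload**
   - The client validates the file type and size
   - The client uploads the data file and, optionally, a configuration file
   - The server runs security checks and stores the files

2. **Task queue management**
   - The server assigns a unique UUID to the upload task
   - The task enters the processing queue and is scheduled according to system load

3. **Asynchronous data processing**
   - **Preprocessing**: format conversion, data validation, unit conversion
   - **Configuration merge**: default configuration + user configuration
   - **GasFlux core analysis**:
     - background correction
     - gas flux calculation
     - spatial interpolation (kriging)
     - statistical analysis and visualization
   - **Result generation**: HTML report, CSV data, configuration backup

4. **Status monitoring**
   - The client polls the status using the task ID
   - Progress tracking and an estimated time remaining are available

5. **Result retrieval and cleanup**
   - Download links are provided once processing completes
   - Expired tasks and files are cleaned up periodically
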
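The configuration merge in step 3 (default configuration plus user configuration) is commonly done as a recursive dictionary merge in which user values win. The sketch below shows that general technique only; how GasFlux actually merges configurations is not described in this document, and the function name and the configuration keys in the usage example are hypothetical.

```python
def merge_config(defaults, overrides):
    """Return a new dict: defaults updated recursively with user overrides."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical keys, for illustration only
defaults = {"gas": "ch4", "interpolation": {"method": "kriging", "grid": 10}}
user = {"interpolation": {"grid": 5}}
print(merge_config(defaults, user))
```

Merging section by section means a user file only needs to list the values it changes; everything else keeps its default.
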
### Estimated Processing Time

- Small files (< 10 MB): 1-3 minutes
- Medium files (10-50 MB): 3-10 minutes
- Large files (> 50 MB): 10-30 minutes
- Actual times depend on computational complexity, data quality, and system load

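## ⚠️ Error Handling

### HTTP Status Codes

| Status code | Meaning | Suggested handling |
|--------|------|----------|
| 200 | Request succeeded | Process the response data normally |
| 202 | Request accepted (asynchronous processing) | Record the task ID and start polling its status |
| 400 | Invalid request parameters | Check the request format and parameters |
| 403 | Access denied | Check the file path and permissions |
| 404 | Resource not found | Verify the task ID or file path |
| 409 | Conflict (task is pending or processing) | Wait for the task to finish before deleting it |
| 413 | File too large | Compress the file or contact the administrator |
| 500 | Internal server error | Check the system status and retry the request |
| 503 | Service degraded (returned by `/health`) | Inspect the `issues` list in the response |
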
### Common Error Responses

**File upload errors**:
```json
{
  "code": 400,
  "message": "无效的数据文件类型。只允许 .xlsx 和 .xls 格式。",
  "data": {}
}
```

```json
{
  "code": 413,
  "message": "文件过大。最大尺寸为 100MB。",
  "data": {}
}
```

**Task query error**:
```json
{
  "code": 404,
  "message": "任务未找到",
  "data": {
    "task_id": "invalid-task-id"
  }
}
```

**File download error**:
```json
{
  "code": 404,
  "message": "文件未找到",
  "data": {
    "filename": "task-id/file.html"
  }
}
```

## 📊 Performance Monitoring and Logging

### Logging

All API requests are logged to the `gasflux_api.log` file. Entries look like this:

```
2026-01-14 10:30:15,123 - INFO - [task:abc123] POST /upload - 202 - 2.34s
2026-01-14 10:30:17,456 - INFO - [task:abc123] Processing started: data.xlsx (15.2MB)
2026-01-14 10:32:22,789 - INFO - [task:abc123] Processing completed: 4 files generated
2026-01-14 10:32:25,012 - INFO - [task:abc123] GET /download/abc123/report.html - 200 - 0.45s
```

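Request entries in this format are regular enough to parse with a single expression, for example to compute latency figures offline. The sketch below assumes the layout shown in the sample lines above and nothing more; the pattern and the function name `parse_request_line` are hypothetical, and lines that are not request entries (such as "Processing started") simply return `None`.

```python
import re

REQUEST_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>[A-Z]+) - \[task:(?P<task>[^\]]+)\] "
    r"(?P<method>GET|POST|PUT|DELETE) (?P<path>\S+) - "
    r"(?P<status>\d{3}) - (?P<seconds>\d+(?:\.\d+)?)s$"
)


def parse_request_line(line):
    """Parse one request log entry into a dict, or return None if it does not match."""
    match = REQUEST_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["seconds"] = float(record["seconds"])
    return record
```
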
### Performance Metrics

Available from the `/stats` endpoint:
- request response time statistics
- task success rate
- system resource usage
- error rate and most requested endpoints

## 💻 Programming Examples

### Complete Python Example

#### Basic Usage
```python
import os
import time
from contextlib import ExitStack
from pathlib import Path

import requests


class GasFluxClient:
    def __init__(self, base_url="http://localhost:5000"):
        self.base_url = base_url.rstrip('/')

    def check_health(self):
        """Check API health; returns the `data` payload of the response."""
        response = requests.get(f"{self.base_url}/health")
        return response.json()['data']

    def upload_and_process(self, data_file, config_file=None, output_dir="./results"):
        """Upload a file, wait for processing, and download the results."""
        print(f"Uploading {data_file}...")
        # ExitStack closes every opened file once the upload has finished
        with ExitStack() as stack:
            files = {'file': stack.enter_context(open(data_file, 'rb'))}
            if config_file and os.path.exists(config_file):
                files['config'] = stack.enter_context(open(config_file, 'rb'))
            response = requests.post(f"{self.base_url}/upload", files=files)
        result = response.json()

        if response.status_code != 202:
            # Errors are reported in the envelope's "message" field
            raise Exception(f"Upload failed: {result.get('message', 'Unknown error')}")

        task_id = result['data']['job_id']
        print(f"Task created: {task_id}")

        # Poll the task status
        while True:
            status_response = requests.get(f"{self.base_url}/task/{task_id}")
            status = status_response.json()['data']

            print(f"Status: {status['status']} - {status['message']}")

            if status['status'] == 'completed':
                # Download the result files
                os.makedirs(output_dir, exist_ok=True)
                for result_file in status['results']:
                    download_url = f"{self.base_url}{result_file['download_url']}"
                    output_path = Path(output_dir) / result_file['name']

                    print(f"Downloading {result_file['name']}...")
                    download_response = requests.get(download_url)
                    with open(output_path, 'wb') as f:
                        f.write(download_response.content)

                print(f"Processing completed! Results saved to {output_dir}")
                return status

            elif status['status'] == 'failed':
                error_msg = status.get('error', 'Unknown error')
                raise Exception(f"Task failed: {error_msg}")

            time.sleep(3)  # check the status every 3 seconds


# Usage example
client = GasFluxClient()
try:
    # Check the API status
    health = client.check_health()
    print(f"API Status: {health['status']}")

    # Process the data
    result = client.upload_and_process(
        data_file="data.xlsx",
        config_file="config.yaml",
        output_dir="./gasflux_results"
    )
    print("Processing completed successfully!")

except Exception as e:
    print(f"Error: {e}")
```

#### Asynchronous Version (using asyncio)
```python
import asyncio
from contextlib import ExitStack
from pathlib import Path

import aiofiles
import aiohttp


class AsyncGasFluxClient:
    def __init__(self, base_url="http://localhost:5000"):
        self.base_url = base_url.rstrip('/')

    async def upload_and_process(self, data_file, config_file=None, output_dir="./results"):
        async with aiohttp.ClientSession() as session:
            # Build the multipart form; ExitStack closes the files after the upload
            with ExitStack() as stack:
                data = aiohttp.FormData()
                data.add_field('file', stack.enter_context(open(data_file, 'rb')),
                               filename=Path(data_file).name)
                if config_file and Path(config_file).exists():
                    data.add_field('config', stack.enter_context(open(config_file, 'rb')),
                                   filename=Path(config_file).name)

                # Upload the files
                print(f"Uploading {data_file}...")
                async with session.post(f"{self.base_url}/upload", data=data) as response:
                    result = await response.json()
                    if response.status != 202:
                        # Errors are reported in the envelope's "message" field
                        raise Exception(f"Upload failed: {result.get('message', 'Unknown error')}")

            task_id = result['data']['job_id']
            print(f"Task created: {task_id}")

            # Poll the task status
            while True:
                async with session.get(f"{self.base_url}/task/{task_id}") as response:
                    status = (await response.json())['data']

                print(f"Status: {status['status']} - {status['message']}")

                if status['status'] == 'completed':
                    # Download the result files
                    Path(output_dir).mkdir(parents=True, exist_ok=True)
                    for result_file in status['results']:
                        download_url = f"{self.base_url}{result_file['download_url']}"
                        output_path = Path(output_dir) / result_file['name']

                        print(f"Downloading {result_file['name']}...")
                        async with session.get(download_url) as response:
                            async with aiofiles.open(output_path, 'wb') as f:
                                await f.write(await response.read())

                    print(f"Processing completed! Results saved to {output_dir}")
                    return status

                elif status['status'] == 'failed':
                    error_msg = status.get('error', 'Unknown error')
                    raise Exception(f"Task failed: {error_msg}")

                await asyncio.sleep(3)


# Using the asynchronous client
async def main():
    client = AsyncGasFluxClient()
    try:
        result = await client.upload_and_process(
            data_file="large_dataset.xlsx",
            config_file="config.yaml",
            output_dir="./async_results"
        )
        print("Async processing completed!")
    except Exception as e:
        print(f"Error: {e}")


# Run the asynchronous example
# asyncio.run(main())
```

### JavaScript/Node.js Example

#### Complete Implementation
```javascript
|
||
const axios = require('axios');
|
||
const FormData = require('form-data');
|
||
const fs = require('fs').promises;
|
||
const path = require('path');
|
||
|
||
class GasFluxAPI {
|
||
constructor(baseURL = 'http://localhost:5000') {
|
||
this.baseURL = baseURL.replace(/\/$/, '');
|
||
this.client = axios.create({
|
||
baseURL: this.baseURL,
|
||
timeout: 30000
|
||
});
|
||
}
|
||
|
||
async checkHealth() {
|
||
try {
|
||
const response = await this.client.get('/health');
|
||
return response.data.data; // unwrap the {code, message, data} envelope
|
||
} catch (error) {
|
||
throw new Error(`Health check failed: ${error.message}`);
|
||
}
|
||
}
|
||
|
||
async uploadFile(dataFilePath, configFilePath = null) {
|
||
const formData = new FormData();
|
||
|
||
// 添加数据文件
|
||
if (!await fs.access(dataFilePath).then(() => true).catch(() => false)) {
|
||
throw new Error(`Data file not found: ${dataFilePath}`);
|
||
}
|
||
formData.append('file', require('fs').createReadStream(dataFilePath), { // fs.promises has no stream API
|
||
filename: path.basename(dataFilePath)
|
||
});
|
||
|
||
// 添加配置文件(如果提供)
|
||
if (configFilePath) {
|
||
if (!await fs.access(configFilePath).then(() => true).catch(() => false)) {
|
||
console.warn(`Config file not found: ${configFilePath}, skipping...`);
|
||
} else {
|
||
formData.append('config', require('fs').createReadStream(configFilePath), { // fs.promises has no stream API
|
||
filename: path.basename(configFilePath)
|
||
});
|
||
}
|
||
}
|
||
|
||
try {
|
||
const response = await this.client.post('/upload', formData, {
|
||
headers: formData.getHeaders(),
|
||
maxContentLength: Infinity,
|
||
maxBodyLength: Infinity
|
||
});
|
||
|
||
return response.data.data; // unwrap the {code, message, data} envelope
|
||
} catch (error) {
|
||
if (error.response) {
|
||
throw new Error(`Upload failed: ${error.response.data.message}`);
|
||
}
|
||
throw error;
|
||
}
|
||
}
|
||
|
||
async getTaskStatus(taskId) {
|
||
try {
|
||
const response = await this.client.get(`/task/${taskId}`);
|
||
return response.data.data; // unwrap the {code, message, data} envelope
|
||
} catch (error) {
|
||
if (error.response && error.response.status === 404) {
|
||
throw new Error(`任务未找到: ${taskId}`);
|
||
}
|
||
throw error;
|
||
}
|
||
}
|
||
|
||
async downloadFile(downloadUrl, outputPath) {
|
||
try {
|
||
const response = await this.client.get(downloadUrl, {
|
||
responseType: 'stream'
|
||
});
|
||
|
||
const writer = require('fs').createWriteStream(outputPath); // fs.promises has no stream API
|
||
response.data.pipe(writer);
|
||
|
||
return new Promise((resolve, reject) => {
|
||
writer.on('finish', resolve);
|
||
writer.on('error', reject);
|
||
});
|
||
} catch (error) {
|
||
throw new Error(`Download failed: ${error.message}`);
|
||
}
|
||
}
|
||
|
||
    async processData(dataFilePath, configFilePath = null, outputDir = './results', pollInterval = 3000) {
        try {
            // 1. Check API health
            console.log('Checking API health...');
            const health = await this.checkHealth();
            console.log(`API Status: ${health.status}`);

            // 2. Upload the data file (and the optional config file)
            console.log(`Uploading ${dataFilePath}...`);
            const uploadResult = await this.uploadFile(dataFilePath, configFilePath);
            const taskId = uploadResult.job_id;
            console.log(`Task created: ${taskId}`);

            // 3. Poll the task status until it completes or fails
            console.log('Monitoring processing status...');
            while (true) {
                const status = await this.getTaskStatus(taskId);

                console.log(`[${new Date().toISOString()}] Status: ${status.status} - ${status.message}`);

                if (status.status === 'completed') {
                    console.log('Processing completed successfully!');

                    // 4. Download the result files
                    console.log('Downloading result files...');
                    // Promise-based mkdir; assumes `fs` is Node's core `fs` module
                    await fs.promises.mkdir(outputDir, { recursive: true });

                    for (const resultFile of status.results) {
                        const downloadUrl = `${this.baseURL}${resultFile.download_url}`;
                        const outputPath = path.join(outputDir, resultFile.name);

                        console.log(`Downloading ${resultFile.name}...`);
                        await this.downloadFile(downloadUrl, outputPath);
                        console.log(`Saved to ${outputPath}`);
                    }

                    return {
                        taskId,
                        status: status.status,
                        results: status.results,
                        outputDir
                    };

                } else if (status.status === 'failed') {
                    const errorMsg = status.error || 'Unknown error';
                    throw new Error(`Processing failed: ${errorMsg}`);
                }

                // Wait before polling again
                await new Promise(resolve => setTimeout(resolve, pollInterval));
            }

        } catch (error) {
            console.error(`Error in processData: ${error.message}`);
            throw error;
        }
    }
}

// Usage example
async function main() {
    const api = new GasFluxAPI();

    try {
        const result = await api.processData(
            'data.xlsx',
            'config.yaml',        // optional
            './gasflux_results',
            5000                  // poll the status every 5 seconds
        );

        console.log('All done!', result);

    } catch (error) {
        console.error('Processing failed:', error.message);
        process.exit(1);
    }
}

// Run main() only when this file is executed directly
if (require.main === module) {
    main();
}

module.exports = GasFluxAPI;
```

### cURL Command Examples

#### Basic Upload and Monitoring

```bash
#!/bin/bash

# API base URL
API_URL="http://localhost:5000"

# Check API health
echo "Checking API health..."
curl -s "${API_URL}/health" | jq '.'

# Upload the data file and the optional config file
echo "Uploading data file..."
UPLOAD_RESPONSE=$(curl -s -X POST \
    -F "file=@data.xlsx" \
    -F "config=@config.yaml" \
    "${API_URL}/upload")

echo "Upload response: $UPLOAD_RESPONSE"

# Extract the task ID
TASK_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.job_id')

if [ "$TASK_ID" = "null" ] || [ -z "$TASK_ID" ]; then
    echo "Upload failed!"
    exit 1
fi

echo "Task ID: $TASK_ID"

# Monitor the task status
echo "Monitoring task status..."
while true; do
    STATUS_RESPONSE=$(curl -s "${API_URL}/task/${TASK_ID}")
    STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status')
    MESSAGE=$(echo "$STATUS_RESPONSE" | jq -r '.message')

    echo "[$(date '+%Y-%m-%d %H:%M:%S')] Status: $STATUS - $MESSAGE"

    if [ "$STATUS" = "completed" ]; then
        echo "Processing completed!"

        # Download the result files
        echo "Downloading result files..."
        mkdir -p results

        echo "$STATUS_RESPONSE" | jq -r '.results[].download_url' | while read -r download_url; do
            filename=$(basename "$download_url")
            echo "Downloading $filename..."
            curl -s -o "results/$filename" "${API_URL}${download_url}"
        done

        echo "All files downloaded to ./results/"
        break

    elif [ "$STATUS" = "failed" ]; then
        ERROR=$(echo "$STATUS_RESPONSE" | jq -r '.error // "Unknown error"')
        echo "Task failed: $ERROR"
        exit 1
    fi

    sleep 3
done
```

## 🔧 Troubleshooting Guide

### Common Problems and Solutions

#### 1. Connection Problems

**Problem**: Cannot connect to the API server

```bash
curl: (7) Failed to connect to localhost port 5000: Connection refused
```

**Solutions**:
- Check that the server is running: `ps aux | grep gasflux`
- Verify the port configuration: check the `GASFLUX_PORT` environment variable
- Check the firewall settings: `sudo ufw status` or `sudo firewall-cmd --list-all`

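When curl reports a refused connection, it can help to separate "nothing is listening on the port" from "the API itself is misbehaving". The following is a minimal sketch of such a probe using only the Python standard library. The function name `is_port_open` is hypothetical, and `localhost:5000` is simply the default address used throughout this document.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 5000):
        print("Port 5000 is accepting connections")
    else:
        print("Nothing is listening on port 5000; check that the server is running")
```

If the probe succeeds but API requests still fail, the problem is more likely in the application or a proxy than in the network path.
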
#### 2. File Upload Failures

**Problem**: The upload is rejected

```json
{"error": "Invalid data file type. Only .xlsx and .xls files are allowed."}
```

**Solutions**:
- Check the file extension (it must be .xlsx or .xls)
- Verify that the file is not empty
- Make sure the file is no larger than 100MB

**Problem**: The file is too large

```json
{"error": "File too large. Maximum size is 100MB."}
```

**Solutions**:
- Compress the data file
- Split it into several smaller files
- Ask the administrator to raise the file size limit

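Both rejections above can be caught on the client before any bytes are sent. The sketch below checks the extension and size against the limits stated in this document. The helper name `check_upload_candidate` is hypothetical, and treating a file of exactly 100MB as acceptable is an assumption, since the server's exact boundary is not documented here.

```python
import os

ALLOWED_DATA_EXTENSIONS = {".xlsx", ".xls"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # 100MB limit stated in this document


def check_upload_candidate(path: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_DATA_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes == 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:  # assumption: exactly 100MB is still accepted
        problems.append("file exceeds 100MB")
    return problems
```

A caller would typically pass `os.path.getsize(path)` as `size_bytes` and skip the upload when the returned list is not empty.
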
#### 3. Task Processing Problems

**Problem**: A task stays in the pending state for a long time

```json
{"status": "pending", "message": "Task queued for processing"}
```

**Solutions**:
- Check the system load: `GET /stats`
- Review the server's resource usage
- Wait for the queue to drain, or contact the administrator

**Problem**: Processing fails

```json
{
    "status": "failed",
    "error": "Processing failed: Invalid data format in column 'temperature'"
}
```

**Solutions**:
- Check the input data format and column names
- Verify the data ranges (temperature, pressure, and so on)
- Review the detailed error message and the logs

#### 4. File Download Problems

**Problem**: The download fails

```json
{"error": "File not found or access denied"}
```

**Solutions**:
- Confirm that the task has finished (status: "completed")
- Check the format of the download URL
- Verify that the file path exists

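#### 5. Server Performance Problems

**Problem**: Slow responses or timeouts

**Solutions**:
- Check system resources: `GET /stats`
- Check the number of concurrent tasks
- Monitor memory and CPU usage
- Consider adding server resources or tuning the configuration
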
### Debugging Tips

#### Enable Verbose Logging

```bash
# Enable DEBUG mode through environment variables
export GASFLUX_LOG_LEVEL=DEBUG
export GASFLUX_DEBUG=true

# Restart the server
python server_waitress.py
```

#### View Logs in Real Time

```bash
# Follow the log file
tail -f logs/gasflux_api.log

# Filter the log for a specific task
tail -f logs/gasflux_api.log | grep "task:abc123"
```

#### Diagnose with the Health Check

```bash
# Basic health check
curl -s http://localhost:5000/health | jq '.'

# Detailed statistics
curl -s http://localhost:5000/stats | jq '.'

# Configuration
curl -s http://localhost:5000/config | jq '.'
```

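## 🛡️ Security Considerations

### Data Protection

- **File type validation**: only the listed file types are accepted (.xlsx, .xls, .yaml, .yml)
- **Path traversal protection**: prevents access to sensitive files through paths such as `../`
- **File size limit**: guards against denial-of-service attacks
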
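To illustrate what path traversal protection usually involves, here is a minimal sketch that resolves a requested file name against a base directory and rejects anything that ends up outside it. This is not taken from the GasFlux source; the function name `safe_join` and the containment check are assumptions about one common way to implement such a guard.

```python
import os
from typing import Optional


def safe_join(base_dir: str, requested: str) -> Optional[str]:
    """Resolve `requested` under `base_dir`; return None if it escapes the base."""
    base = os.path.realpath(base_dir)
    # realpath collapses "..", follows symlinks, and keeps absolute inputs absolute.
    target = os.path.realpath(os.path.join(base, requested))
    if os.path.commonpath([base, target]) != base:
        return None  # the resolved path is outside the base directory
    return target
```

A download handler built on this idea would return an error whenever `safe_join` yields `None`, instead of opening the file.
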
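### Access Control

- **No built-in authentication**: intended for internal networks or otherwise controlled environments
- **IP allowlisting**: can be implemented with a reverse proxy
- **HTTPS recommended**: use HTTPS in production
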
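### Data Cleanup

- **Automatic cleanup**: expired tasks and their files are deleted automatically
- **Configuration**: the cleanup interval and expiry time can be adjusted through environment variables
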
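The expiry rule can be pictured as a simple age comparison. The sketch below selects the tasks that a cleanup pass would remove, assuming each task records a creation timestamp in seconds. The function name `find_expired_tasks` and the task dictionary layout are hypothetical; the default age mirrors the `GASFLUX_MAX_TASK_AGE` value listed in the appendix.

```python
MAX_TASK_AGE = 86400  # seconds; mirrors the documented GASFLUX_MAX_TASK_AGE default


def find_expired_tasks(tasks: dict, now: float, max_age: float = MAX_TASK_AGE) -> list:
    """Return the sorted IDs of tasks created more than `max_age` seconds before `now`.

    `tasks` maps a task ID to its creation time (seconds since the epoch).
    """
    return sorted(
        task_id
        for task_id, created_at in tasks.items()
        if now - created_at > max_age
    )
```

A periodic job could call this with `time.time()` as `now` once per cleanup interval and delete the files belonging to each returned ID.
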
## 📈 Best Practices

### Client Implementation

#### 1. Error Handling

```python
import time

import requests


def safe_api_call(url, max_retries=3, backoff_factor=2):
    """GET `url` and return the parsed JSON, retrying with exponential backoff."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException:
            if attempt == max_retries - 1:
                raise  # out of retries: re-raise the last error
            wait_time = backoff_factor ** attempt  # 1s, 2s, 4s, ...
            print(f"Request failed, retrying in {wait_time}s...")
            time.sleep(wait_time)
```

#### 2. Efficient Status Polling

```python
import time

import requests


def get_task_status(task_id, base_url="http://localhost:5000"):
    """Helper assumed by this example: fetch the task status from GET /task/{task_id}."""
    response = requests.get(f"{base_url}/task/{task_id}", timeout=30)
    response.raise_for_status()
    return response.json()


def monitor_task_efficiently(task_id, max_wait_time=3600):
    start_time = time.time()
    check_interval = 2  # initial polling interval in seconds

    while time.time() - start_time < max_wait_time:
        status = get_task_status(task_id)

        if status['status'] in ['completed', 'failed']:
            return status

        # Adjust the polling interval to the current processing stage
        if 'progress' in status:
            stage = status['progress'].get('stage', '')
            if 'data_processing' in stage:
                check_interval = 5  # poll less often during data processing
            elif 'report_generation' in stage:
                check_interval = 10  # and even less often during report generation

        time.sleep(check_interval)

    raise TimeoutError(f"Task monitoring timed out after {max_wait_time}s")
```

#### 3. Handling Large Files

```python
import os

import requests


def upload_large_file(file_path, chunk_size=1024*1024):  # 1MB chunks
    # Note: chunk_size is unused here; the upload below sends the file in one request
    file_size = os.path.getsize(file_path)

    # For large files, consider compression or a chunked upload
    if file_size > 50 * 1024 * 1024:  # 50MB
        print(f"Large file detected ({file_size / 1024 / 1024:.1f}MB)")
        print("Consider compressing the data or splitting into smaller files")

    # Standard upload
    with open(file_path, 'rb') as f:
        files = {'file': f}
        response = requests.post('http://localhost:5000/upload', files=files)
    return response.json()
```

### Server Deployment

#### Production Configuration

```bash
# Environment variables
export GASFLUX_HOST=0.0.0.0
export GASFLUX_PORT=5000
export GASFLUX_LOG_LEVEL=INFO
export GASFLUX_MAX_CONTENT_LENGTH=104857600  # 100MB

# Run the server under a process manager
# Example systemd service (writing to /etc/systemd/system requires root).
# Note: variables exported in this shell are not inherited by the service;
# add Environment= lines under [Service] if the service needs them.
sudo tee /etc/systemd/system/gasflux.service > /dev/null << EOF
[Unit]
Description=GasFlux Web API
After=network.target

[Service]
User=gasflux
Group=gasflux
WorkingDirectory=/opt/gasflux
ExecStart=/opt/gasflux/venv/bin/python server_waitress.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd, then enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable gasflux
sudo systemctl start gasflux
```

#### Monitoring and Alerting

```bash
#!/bin/bash
# Health check script
API_URL="http://localhost:5000"

# Check the health endpoint
if ! curl -f -s "${API_URL}/health" > /dev/null; then
    echo "API is unhealthy, sending alert..."
    # Send an alert email, Slack notification, etc.
fi

# Check the queue length (falls back to 0 if the field is missing)
STATS=$(curl -s "${API_URL}/stats")
ACTIVE_TASKS=$(echo "$STATS" | jq -r '.summary.active_tasks // 0')

if [ "$ACTIVE_TASKS" -gt 10 ]; then
    echo "High task queue detected: $ACTIVE_TASKS active tasks"
    # Send an alert
fi
```

## 📚 Appendix

### Supported Environment Variables

| Variable | Default | Description |
|--------|--------|------|
| `GASFLUX_HOST` | `0.0.0.0` | Address the server listens on |
| `GASFLUX_PORT` | `5000` | Port the server listens on |
| `GASFLUX_DEBUG` | `false` | Debug mode switch |
| `GASFLUX_UPLOAD_FOLDER` | `web_api_data/uploads` | Directory for uploaded files |
| `GASFLUX_OUTPUT_FOLDER` | `web_api_data/outputs` | Directory for output files |
| `GASFLUX_MAX_CONTENT_LENGTH` | `104857600` | Maximum file size (bytes) |
| `GASFLUX_LOG_LEVEL` | `INFO` | Log level |
| `GASFLUX_LOG_FILE` | `logs/gasflux_api.log` | Log file path |
| `GASFLUX_CORS_ORIGINS` | `["*"]` | Allowed CORS origins |
| `GASFLUX_TASK_CLEANUP_INTERVAL` | `3600` | Task cleanup interval (seconds) |
| `GASFLUX_MAX_TASK_AGE` | `86400` | Maximum task age (seconds) |
| `GASFLUX_THREADS` | `8` | Number of Waitress threads |
| `GASFLUX_CONNECTION_LIMIT` | `100` | Maximum number of connections |
| `GASFLUX_CHANNEL_TIMEOUT` | `300` | Channel timeout (seconds) |

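As an illustration of how a deployment script might read these settings, the sketch below loads a few of the variables with their documented defaults and converts them to Python types. It is not the server's own loader: the function name `load_settings`, the subset of keys, and the rule for parsing booleans are all assumptions.

```python
import os

# Defaults copied from the environment variable table in this appendix.
DEFAULTS = {
    "GASFLUX_HOST": "0.0.0.0",
    "GASFLUX_PORT": "5000",
    "GASFLUX_DEBUG": "false",
    "GASFLUX_MAX_CONTENT_LENGTH": "104857600",
    "GASFLUX_LOG_LEVEL": "INFO",
    "GASFLUX_THREADS": "8",
}
INT_KEYS = {"GASFLUX_PORT", "GASFLUX_MAX_CONTENT_LENGTH", "GASFLUX_THREADS"}


def load_settings(environ=None) -> dict:
    """Read the settings from `environ` (default: os.environ), applying defaults."""
    env = os.environ if environ is None else environ
    settings = {}
    for key, default in DEFAULTS.items():
        value = env.get(key, default)
        if key in INT_KEYS:
            settings[key] = int(value)
        elif key == "GASFLUX_DEBUG":
            # Assumption: "1", "true", and "yes" (any case) enable debug mode.
            settings[key] = value.strip().lower() in {"1", "true", "yes"}
        else:
            settings[key] = value
    return settings
```

Passing a plain dictionary as `environ` makes the function easy to exercise without touching the real process environment.
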
### API Response Time Benchmarks

- `/health`: < 100ms
- `/stats`: < 500ms
- `/config`: < 200ms
- `/upload`: < 2s (file handling time)
- `/task/{id}`: < 300ms
- `/download/{file}`: depends on file size (typically < 5s)

### File Format Specification

#### Data File Requirements

- **Format**: Excel (.xlsx or .xls)
- **Required columns**: latitude, longitude, height_ato, windspeed, winddir, temperature, pressure
- **Optional columns**: gas concentrations such as ch4, co2, c2h6
- **Data type**: numeric (float/int)
- **Missing values**: NaN or empty cells

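A quick client-side check of the required columns can save a failed processing run. The sketch below compares a list of column names against the required set listed above. The helper name `missing_required_columns` is hypothetical, and exact, case-sensitive name matching is an assumption about how the server compares column names.

```python
REQUIRED_COLUMNS = [
    "latitude", "longitude", "height_ato",
    "windspeed", "winddir", "temperature", "pressure",
]


def missing_required_columns(columns) -> list:
    """Return the required column names that are absent from `columns`."""
    present = {str(name).strip() for name in columns}
    return [name for name in REQUIRED_COLUMNS if name not in present]
```

With pandas installed, the column names could come from `pandas.read_excel("data.xlsx").columns`; any iterable of names works.
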
#### Configuration File Format

```yaml
output_dir: ~/gasflux_reports

required_cols:
  latitude: [-90, 90]
  longitude: [-180, 180]
  height_ato: [-200, 500]
  windspeed: [0, 50]
  winddir: [0, 360]
  temperature: [-50, 60]
  pressure: [900, 1100]

gases:
  ch4: [1.5, 500]
  co2: [300, 5000]
  c2h6: [-0.5, 10]

strategies:
  background: "algorithmic"
  sensor: "insitu"
  spatial: "curtain"
  interpolation: "kriging"
```

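The two-element lists in the sample configuration read naturally as minimum and maximum values for each column. Under that assumption (the bounds being inclusive is also an assumption), the sketch below flags the values in one data row that fall outside their range. The function name `out_of_range` is hypothetical, and the ranges are copied from `required_cols` in the sample.

```python
# Ranges copied from `required_cols` in the sample configuration.
RANGES = {
    "latitude": (-90, 90),
    "longitude": (-180, 180),
    "height_ato": (-200, 500),
    "windspeed": (0, 50),
    "winddir": (0, 360),
    "temperature": (-50, 60),
    "pressure": (900, 1100),
}


def out_of_range(row: dict, ranges: dict = RANGES) -> dict:
    """Return {column: value} for every value outside its inclusive [low, high] range."""
    bad = {}
    for column, (low, high) in ranges.items():
        value = row.get(column)
        # Skip missing values: None, or NaN (the only value not equal to itself).
        if value is None or value != value:
            continue
        if not (low <= value <= high):
            bad[column] = value
    return bad


# Example row with illustrative values only.
sample_row = {"latitude": 95, "longitude": 4.3, "pressure": 1013}
print(out_of_range(sample_row))
```
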
---

*Last updated: January 14, 2026*

*GasFlux Web API version: 1.0.0*
*Documentation maintained by: API development team*