chore: remove runtime artifacts, personal config, and legacy scaffolding from the index; tighten .gitignore

2026-06-08 12:12:11 +08:00
parent e3debbcb15
commit 1cbd38a8e0
51 changed files with 38 additions and 9333 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -155,3 +155,41 @@ tmp/
 *.bak
 *.backup
 *~
 # ============================================================
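 # File types that should not be under version control
 # ============================================================
 # Qwen Code user settings (personal environment, differs on every clone)
 .qwen/settings.json
 .qwen/settings.json.orig
 # Skill files auto-generated by Qwen Code (regenerated every session)
 .qwen/skills/
 # Files generated by the GUI at runtime
 src/gui/scaler_params.pkl
 src/gui/crash_dump.txt
 # Temporary/debug scripts (repository root)
 降采样光谱.py
 1.py
 tset.py
 # Reports and documents (local working artifacts)
 封装问题分析报告.md
 软件说明.md
 软件说明2.md
 # Generated files in the data subdirectory (everything except .gitkeep)
 data/sub/waterindex*.csv
 data/sub/waterindex*.xlsx
 data/sub/png/watermask.png
 # Icon files (only the vector/SVG sources are needed; drop the raster copies from the icon archive)
 data/icons-1/*.ico
 data/icons/*.png
 data/icons/word/*.png
 # Legacy scaffolding (leftover experimental code)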
 new/
--- a/.qwen/settings.json
+++ b/.qwen/settings.json
@@ -1,25 +0,0 @@
 {
  "permissions": {
    "allow": [
      "Bash(\"c:\\users\\duxin\\appdata\\local\\programs\\python\\python311\\python.exe\" *)",
      "Bash(get-childitem *)",
      "Bash(select-object *)",
      "Bash(python *)",
      "Bash(where *)",
      "Bash(conda *)",
      "Bash(dir *)",
      "Bash(cmd *)",
      "Bash(del *)",
      "Bash(powershell *)",
      "Bash(git *)",
      "Bash(type *)",
      "Bash(.\\venv\\scripts\\python.exe *)",
      "Bash(\"d:\\111\\office\\zhlduijie\\1.wq\\wq_gui\\venv\\scripts\\python.exe\" *)",
      "Bash(c:\\users\\duxin\\appdata\\local\\programs\\python\\python311\\python.exe *)",
      "Bash(venv\\scripts\\python.exe *)",
      "Bash(findstr *)",
      "Bash(select-string *)"
    ]
  },
  "$version": 4
 }
--- a/.qwen/settings.json.orig
+++ b/.qwen/settings.json.orig
@@ -1,24 +0,0 @@
 {
  "permissions": {
    "allow": [
      "Bash(\"c:\\users\\duxin\\appdata\\local\\programs\\python\\python311\\python.exe\" *)",
      "Bash(get-childitem *)",
      "Bash(select-object *)",
      "Bash(python *)",
      "Bash(where *)",
      "Bash(conda *)",
      "Bash(dir *)",
      "Bash(cmd *)",
      "Bash(del *)",
      "Bash(powershell *)",
      "Bash(git *)",
      "Bash(type *)",
      "Bash(.\\venv\\scripts\\python.exe *)",
      "Bash(\"d:\\111\\office\\zhlduijie\\1.wq\\wq_gui\\venv\\scripts\\python.exe\" *)",
      "Bash(c:\\users\\duxin\\appdata\\local\\programs\\python\\python311\\python.exe *)",
      "Bash(venv\\scripts\\python.exe *)",
      "Bash(findstr *)"
    ]
  },
  "$version": 4
 }
--- a/.qwen/skills/code_replacement_state_audit/SKILL.md
+++ b/.qwen/skills/code_replacement_state_audit/SKILL.md
@@ -1,141 +0,0 @@
 ---
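 name: Current-state audit for code replacement requests
 description: When the user asks to replace or add code, audit the real on-disk state first, then confirm with ask_user_question, so that a newer version already on disk is not overwritten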
 source: auto-skill
 extracted_at: '2026-06-03T05:36:58.746Z'
 ---
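 # Current-state audit for code replacement requests
 ## When to use
 The user gives a "replace this code" or "add code based on version X" instruction but **provides nothing to compare against the current on-disk state**. Typical triggers:
 - The user pastes a block of code and says "write/replace this for me"
 - The user points at a document, an old version, or an old chat and says "do it like this"
 - An earlier `state_snapshot`, `memory`, or `git log` description may no longer match what is on disk
 ## Core principle
 **Never assume that the code the user pasted is the latest version.** The code on disk may already be a more complete version (iterated on by the user or another agent). Overwriting it means losing features.
 The cost of a blind overwrite is not always an explicit bug; it can also be the loss of design decisions the user already approved (for example duck-type detection, the ctx abstraction, the signal protocol, the second confirmation dialog, or error localization).
 ## The 5 standard steps
 ### 1. Check whether the file exists
 Use `glob` or `list_directory` to see whether the target file already exists:
 - Missing → create it
 - Present → continue with the step 2 audit
 ### 2. Grep key symbols and read the key sections
 - Pick 3-5 key symbols from the pasted code (function names, class names, key constants, imports)
 - Grep the on-disk file for the same symbols
 - `read_file` the key sections (take the line numbers straight from the grep results)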
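
The symbol audit in step 2 can be sketched as a small script. This is a minimal illustration, not part of the skill: the helper name `audit_symbols` and the heuristic of treating `def`/`class` names as the key symbols are assumptions made for this example.

```python
import re
from pathlib import Path


def audit_symbols(pasted_code: str, disk_path: str, limit: int = 5) -> dict:
    """Map each key symbol of the pasted code to the on-disk line numbers where it occurs."""
    # Assumed heuristic: def/class names in the pasted code are the key symbols.
    symbols = re.findall(r"^\s*(?:def|class)\s+(\w+)", pasted_code, flags=re.MULTILINE)[:limit]
    # utf-8-sig tolerates a leading BOM in the on-disk file.
    lines = Path(disk_path).read_text(encoding="utf-8-sig").splitlines()
    hits = {}
    for sym in symbols:
        pattern = re.compile(rf"\b{re.escape(sym)}\b")
        hits[sym] = [no for no, line in enumerate(lines, start=1) if pattern.search(line)]
    return hits
```

A symbol with an empty hit list exists only in the pasted code; a symbol with hits tells you which lines to read before building the comparison table.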
 ### 3. Build a diff comparison table
 List:
 ```
 | Target file | Pasted version | Current on-disk version | Lost if overwritten |
 ```
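 **The key column** is "Lost if overwritten": it lets the user judge the cost. Be specific down to "feature module / design decision / defensive layer / entry protocol"; do not write empty phrases such as "code differences".
 ### 4. Let the user decide via ask_user_question
 Three standard options (wording may vary, but you **must present the current state plus a three-way choice**):
 - **A. Keep the current state** (recommended when the on-disk version is newer): go straight to the Smoke Test
 - **B. Force-overwrite with the old version**: state what will be lost and suggest a backup (git stash, or copy to `_old.py`)
 - **C. Hybrid: take only a specific increment**: see step 5
 **Do not list the specific candidate increments in the first ask.** Let the user choose between A, B, and C first; only if they choose C do you move on to step 5.
 ### 5. If the user chooses C, identify the "real increment"
 Compare 1.0 with 2.0 and identify what is truly unique to 1.0 (absent from 2.0):
 - ❌ Exclude parts where 1.0 is simpler than 2.0 (2.0 is a superset, has factory layering, or adds a CLI)
 - ❌ Exclude parts where 1.0 as a whole is superseded by the 2.0 factory layering (_make_objective vs _build_model + _get_search_space)
 - ✅ Focus on functional layers unique to 1.0 (even if 2.0 does not "obviously" need them)
 For each candidate increment, ask once more which one to adopt and let the user pick concretely (multiSelect=false; choosing one segment at a time is safest).
 ## Applying the change
 When adopting a 1.0 increment into 2.0:
 - **Make minimal, surgical edits**: touch only the files and sections that need to change
 - **Preserve the 2.0 design decisions** (duck-type detection, ctx abstraction, signal protocol, second confirmation dialog, error localization)
 - **Insert new top-level imports at a single point with `replace_all=False`**, so the order of the other imports is not disturbed
 - **Replace a renamed variable along the whole chain** (for example `self.config` → `clean_config`): it must carry through ctx construction, the v2 call, and the v1 fallback, to avoid two diverging sources
 - **Single-step mode does not necessarily need cleaning** (it does not go through the full panel config, so the cleaner is irrelevant there)
 - **Precautionary code such as the cleaner must log** (`self.log_message.emit(f"[清洗器] 已删除 N 个未知 key")`) so its effect is visible at runtime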
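
The parameter cleaner mentioned above can be illustrated with `inspect.signature`. This is a hedged sketch of the general technique (filter unknown keys, exempt callables that accept `**kwargs`, report what was dropped); `clean_kwargs` and `step_demo` are hypothetical names, and the real 53-line cleaner in `worker_thread.py` is not reproduced here.

```python
import inspect


def clean_kwargs(func, config: dict):
    """Return (kept, dropped): the keys func accepts, and the names of the keys it would reject."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any key, so nothing has to be filtered out.
    has_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if has_kwargs:
        return dict(config), []
    kept = {k: v for k, v in config.items() if k in params}
    dropped = sorted(k for k in config if k not in params)
    return kept, dropped


def step_demo(img_path, method="subtract_nir"):  # hypothetical step function
    return f"{method}:{img_path}"


kept, dropped = clean_kwargs(step_demo, {"img_path": "a.tif", "legacy_flag": True})
# Logging the dropped keys keeps the cleaner's effect visible at runtime.
print(f"[cleaner] removed {len(dropped)} unknown key(s): {dropped}")
```

Returning the dropped names, rather than only a count, makes the log line actionable when a stale key shows up.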
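 ## The three verification checks
 Always run these after applying the change:
 1. **AST syntax check**: `ast.parse(open(p, encoding='utf-8-sig').read())` on the 5 core files
   - `utf-8-sig` is required: water_quality_gui.py in WQ_GUI starts with a BOM on line 1, so parsing text read as plain `utf-8` fails
 2. **Key-symbol grep**: confirm that every key symbol of the new code (imports, key function calls) is found and that the hit counts match expectations
 3. **Top-level import test**: with a mocked PyQt5 and `sys.path.insert(0, 'src/gui/core')`, verify that the module loads as a whole
   - The PyQt5 mock template is under "Reference code" below
   - Calling Python on Windows: use the full path of the conda env's `python.exe`; do not rely on PATH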
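
The first check can be wrapped in a small helper that runs over several files. A minimal sketch, assuming only the standard library; `check_syntax` is a name invented for this example.

```python
import ast
from pathlib import Path


def check_syntax(paths):
    """Return {path: None or error message} after parsing each file with a BOM-tolerant decoder."""
    results = {}
    for path in paths:
        try:
            # utf-8-sig strips a leading BOM if present and otherwise behaves like utf-8.
            source = Path(path).read_text(encoding="utf-8-sig")
            ast.parse(source, filename=str(path))
            results[str(path)] = None
        except SyntaxError as exc:
            results[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return results
```

A result of `None` for every path corresponds to "all files pass the AST check".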
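 ## Anti-patterns (do not do this)
 - ❌ "Write the pasted code in verbatim": the overwrite trap of the simplified 1.0 version
 - ❌ "Trust the state_snapshot description": a state snapshot may be inaccurate (it records intent; the disk is the fact)
 - ❌ "Infer the current state from git log": git log does not reflect uncommitted working-tree changes
 - ❌ "Infer the current state from memory": memory may be 22 days old (confirmed stale)
 - ❌ "Skip the audit because disk and pasted code look the same": a one-line difference may be exactly the lost "bulletproof layer"
 ## Reference code
 ### PyQt5 mock template (top-level import test for worker_thread.py)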
 ```python
 import os, sys
 os.environ['GDAL_FILENAME_IS_UTF8'] = 'YES'
 os.environ['SHAPE_ENCODING'] = 'UTF-8'
 sys.path.insert(0, 'src/gui/core')
 import types
 pyqt5 = types.ModuleType("PyQt5")
 qtc = types.ModuleType("PyQt5.QtCore")
 class _QThread:
    def __init__(self, *a, **kw): pass
 class _Signal:
    def __init__(self, *a, **kw): pass
 qtc.QThread = _QThread
 qtc.pyqtSignal = _Signal
 qtc.Qt = type("Qt", (), {"QueuedConnection": 1, "UserRole": 0})()
 sys.modules["PyQt5"] = pyqt5
 sys.modules["PyQt5.QtCore"] = qtc
 import worker_thread
 # Side effect: check_pipeline_dependencies() prints dependency-check logs (safe to ignore)
 ```
 ### Running the conda env python on Windows
 ```bat
 cmd /c "D:\xxx\anconda\envs\XXX\python.exe D:\path\to\script.py"
 ```
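 A PowerShell one-liner `python -c "..."` breaks easily with Chinese paths and nested double/single quotes; **writing a temporary .py file and invoking it via `cmd /c`** is the most reliable approach.
 ## Case source (2026-06-03, WQ_GUI route B MVP)
 - The user pasted the simplified 1.0 version: a 300-line automl_trainer, a simplified worker_thread.run(), and a simplified on_run_all_clicked
 - The 2.0 version on disk: a 545-line automl_trainer (_build_model + _get_search_space factories, argparse CLI), duck-type detection of v2 plus the PipelineContext abstraction, the full second confirmation dialog, failed-step localization via _focus_step, and the retained [DEPRECATED] stop
 - The only real increment in 1.0 was the **"bulletproof parameter cleaner"** (14-entry method_map, unknown keys filtered via inspect.signature, has_kwargs exemption, log of the number of unknown keys)
 - Applied: a 53-line cleaner inserted into worker_thread.py:run() right after set_callback, with 6 occurrences of self.config replaced by clean_config
 - Verified: all 5 files pass the AST check, 7 key symbols found, import succeeds under the PyQt5 mock
 - Net line count: 407 → 457 (+50 lines)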
--- a/.qwen/skills/facade_kwargs_defense/SKILL.md
+++ b/.qwen/skills/facade_kwargs_defense/SKILL.md
@@ -1,309 +0,0 @@
 ---
 name: Defensive kwargs catch-all for PipelineRunner Facades
 description: The 14 stepX_... Facade methods in WQ_GUI must end their parameter list with **kwargs; combined with the PipelineRunner dispatch model this eliminates "unexpected keyword argument" TypeErrors
 source: auto-skill
 extracted_at: '2026-06-04T00:54:50.036Z'
 ---
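 # Defensive kwargs catch-all for PipelineRunner Facades
 ## When to use
 In WQ_GUI, **each of the 14 `stepX_...` Facade methods called by `PipelineRunner`** (in `src/core/water_quality_inversion_pipeline_GUI.py`) **must** end its parameter list with `**kwargs`. Triggers:
 - The user reports `TypeError: stepX_xxx() got an unexpected keyword argument 'yyy'`
 - The `requires` list of a PIPELINE_STEPS entry changes
 - A step method is added or renamed
 - The `_invoke` injection logic of PipelineRunner is refactored
 ## Core principle
 **A Facade's parameter list = its explicitly declared parameters + `, **kwargs`**. `**kwargs` must be **the very last entry in the parameter list** (a hard Python syntax requirement).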
 ```python
 # ✅ Correct
 def step3_remove_glint(self, img_path: str,
                       method: str = "subtract_nir",
                       # ... 30+ business parameters ...
                       skip_dependency_check: bool = False,
                       **kwargs) -> str:
    ...
 # ❌ Wrong: **kwargs cannot appear in the middle or at the front (kept as a comment because it is a SyntaxError)
 # def step3_remove_glint(self, img_path, **kwargs, skip_dependency_check): ...
 ```
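 ## Why this defensive layer is needed
 `PipelineRunner._invoke` (`src/core/pipeline/runner.py`) injects two kinds of arguments into the method:
 | Layer | Source | How the parameter key is chosen |
 |---|---|---|
 | **L2** | ctx fields (per the `spec.requires` list) | the `parameter_map` entry if present, otherwise `_default_param_name(ctx_key)`, which now returns the ctx key unchanged (the old `_path` suffix stripping has been retired) |
 | **L3** | `ctx.user_config[step_id]` (the panel's whole config dict, one for each of the 14 panels) | dict keys are injected as-is |
 **A real L2 TypeError** (observed on 2026-06-04):
 - `PIPELINE_STEPS.step3.requires = ["img_path", "water_mask_path", "glint_mask_path"]`
 - The Runner injects `kwargs["glint_mask_path"] = ctx.glint_mask_path`
 - The parameter list of `step3_remove_glint` has **no** `glint_mask_path` (the glint mask is used inside the sub-call `GlintRemovalStep.run`; the Facade itself does not accept it)
 - → **TypeError: step3_remove_glint() got an unexpected keyword argument 'glint_mask_path'**
 **How L3 triggers a TypeError**:
 - The user_config dicts of the 14 panels still contain old field names or fields leaked from another step
 - Every `k` in `user_config[step_id][k]` gets injected
 `**kwargs` solves both problems at once.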
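The effect is easy to reproduce in isolation. The two functions below are hypothetical stand-ins for a Facade method without and with the catch-all; only standard Python behavior is assumed.

```python
def step_strict(img_path, method="subtract_nir"):
    return img_path


def step_tolerant(img_path, method="subtract_nir", **kwargs):
    # Keys that match no declared parameter land in kwargs and are ignored.
    return img_path


# What a runner-style injection might pass: one expected key, one surplus key.
injected = {"img_path": "scene.dat", "glint_mask_path": "glint.dat"}

try:
    step_strict(**injected)
    strict_outcome = "ok"
except TypeError as exc:
    strict_outcome = f"TypeError: {exc}"

tolerant_outcome = step_tolerant(**injected)
```

`strict_outcome` records the "unexpected keyword argument" failure, while `step_tolerant` runs normally with the same injected dict.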
 ## The 14 Facade methods (all carry `**kwargs` as of 2026-06-04)
 | step | method | Closing part of the parameter list |
 |---|---|---|
 | 1 | `step1_generate_water_mask` | `output_path: Optional[str] = None, **kwargs) -> str:` |
 | 2 | `step2_find_glint_area` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 3 | `step3_remove_glint` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 4 | `step4_process_csv` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 5 | `step5_extract_training_spectra` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 5.5 | `step5_5_calculate_water_quality_indices` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 6 | `step6_train_models` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 6.5 | `step6_5_non_empirical_modeling` | `skip_dependency_check: bool = False, **kwargs) -> Dict[str, str]:` |
 | 6.75 | `step6_75_custom_regression` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 7 | `step7_generate_sampling_points` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
 | 8 | `step8_predict_water_quality` | `skip_dependency_check: bool = False, **kwargs) -> Dict[str, str]:` |
 | 8.5 | `step8_5_predict_with_non_empirical_models` | `skip_dependency_check: bool = False, **kwargs) -> Dict[str, str]:` |
 | 8.75 | `step8_75_predict_with_custom_regression` | `skip_dependency_check: bool = False, **kwargs) -> Dict[str, str]:` |
 | 9 | `step9_generate_distribution_map` | `skip_dependency_check: bool = False, **kwargs) -> str:` |
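 ## Standard procedure
 ### 1. Edit (minimal, surgical)
 The last parameter of most methods is `skip_dependency_check: bool = False` (step1 ends with `output_path: Optional[str] = None` instead). Change that line to: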
 ```python
                                       skip_dependency_check: bool = False, **kwargs) -> str:
 ```
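 **Indentation must match the original line** (13, 35, 47, or 48 spaces depending on the method). The `old_string` passed to the `edit` tool **must include the first docstring line** (`"""步骤X: ..."""`) as a unique identifier.
 ### 2. Verify
 Write a temporary check script (run it from the project root, then delete it):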
 ```python
 import ast, re
 target = r'D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\core\water_quality_inversion_pipeline_GUI.py'
 src = open(target, encoding='utf-8-sig').read()
 ast.parse(src)  # AST syntax check
 expected = [
    'step1_generate_water_mask', 'step2_find_glint_area', 'step3_remove_glint',
    'step4_process_csv', 'step5_extract_training_spectra',
    'step5_5_calculate_water_quality_indices', 'step6_train_models',
    'step7_generate_sampling_points', 'step8_predict_water_quality',
    'step9_generate_distribution_map', 'step6_5_non_empirical_modeling',
    'step6_75_custom_regression', 'step8_5_predict_with_non_empirical_models',
    'step8_75_predict_with_custom_regression',
 ]
 pat = re.compile(r'def\s+(\w+)\s*\((?:[^()]|\([^()]*\))*\*\*kwargs\)\s*->\s*[^:]+:', re.DOTALL)
 found = set(pat.findall(src))
 print('Missing:', [m for m in expected if m not in found])
 print('Extra  :', [m for m in found if m not in expected])
 ```
 Expected output: both lists are empty.
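As an alternative to the regular expression, the same check can be done on the syntax tree, which does not depend on how the signature is wrapped across lines. This is a sketch using only the standard `ast` module; `methods_missing_kwargs` is a name made up for the example.

```python
import ast


def methods_missing_kwargs(source: str, expected: list) -> list:
    """Return the expected function names that are absent or lack a **kwargs parameter."""
    with_kwargs = set()
    for node in ast.walk(ast.parse(source)):
        # node.args.kwarg holds the **kwargs parameter, or None if there is none.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.args.kwarg is not None:
            with_kwargs.add(node.name)
    return [name for name in expected if name not in with_kwargs]
```

An empty return value plays the role of the empty `Missing` list above.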
 ### 3. Run on Windows
 ```bat
 cd /d D:\111\office\ZHLduijie\1.WQ\WQ_GUI
 py _check.py
 del _check.py
 ```
 > `utf-8-sig` is required: the header of `water_quality_inversion_pipeline_GUI.py` may contain a BOM (the `code_replacement_state_audit` skill gives the same hint).
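 ## ⚠️ What this defensive layer does not solve
 **The `**kwargs` catch-all does not fix mismatched parameter names.** If the Runner injects `kwargs["training_csv_path"]` while the method parameter is `csv_path`:
 - ✅ **No TypeError is raised** (`training_csv_path` is absorbed by `**kwargs`)
 - ❌ **But `csv_path` is still None** (the `if csv_path is not None: ... else: ...` in the method body takes the fallback branch and may read the `self.training_csv_path` sentinel)
 **Methods with known parameter-name mismatches** (fixed on 2026-06-04, commit `64aa5b8`):
 | step | ctx key injected by the Runner | Actual method parameter | Fix that was applied |
 |---|---|---|---|
 | step6_5 | `training_csv_path` | `csv_path` | `parameter_map={"training_csv_path": "csv_path"}` |
 | step6_75 | (switched to `indices_path`) | `csv_path` | ⚠️ see the "step6_75 routing fix" section below |
 | step8_5 | `models_dir` | `non_empirical_models_dir` | `parameter_map={"models_dir": "non_empirical_models_dir"}` |
 | step8_75 | `models_dir` | `custom_regression_dir` | `parameter_map={"models_dir": "custom_regression_dir"}` |
 > **parameter_map** is an existing `StepSpec` field (runner.py:33) that renames a ctx field to the method's parameter name. **Prefer parameter_map over changing requires**: the ctx field keeps its clear meaning (a declarative description of the upstream dependency), while the parameter name stays a private convention of the method.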
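
The difference between "swallowed" and "renamed" can be shown with plain dicts. The sketch below is an illustration only: `build_kwargs` imitates a rename-on-inject step and `step6_5_demo` stands in for a Facade method; neither is the project's actual code.

```python
def step6_5_demo(csv_path=None, **kwargs):  # hypothetical stand-in for a Facade method
    return csv_path


def build_kwargs(requires, ctx, parameter_map=None):
    """Rename each required ctx key to the method's parameter name, defaulting to the key itself."""
    parameter_map = parameter_map or {}
    return {parameter_map.get(key, key): ctx.get(key) for key in requires}


ctx = {"training_csv_path": "train.csv"}

# Without a mapping the value is absorbed by **kwargs and csv_path stays None.
unmapped = step6_5_demo(**build_kwargs(["training_csv_path"], ctx))

# With a mapping the same value reaches the csv_path parameter.
mapped = step6_5_demo(
    **build_kwargs(["training_csv_path"], ctx, {"training_csv_path": "csv_path"})
)
```

Neither call raises, which is why the unmapped case is easy to miss: the only visible difference is the value the method receives.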
 ### step6_75 routing fix (special case, 2026-06-04, commit `64aa5b8`)
 `step6_75_custom_regression` **is not a simple ctx field-name mismatch**: the fallback chain in the method body reveals that **the real data source is `indices_path`**:
 ```python
 # step6_75 parameter: csv_path
 # Fallback in the method body:
 #   if csv_path is not None: input_csv = csv_path
 #   elif self.indices_path is not None: input_csv = self.indices_path   # ★ the real source
 ```
 Adding a `parameter_map` with `training_csv_path → csv_path` appears to work, but **it actually feeds the wrong data** (training_csv is the step5 output, not the indices CSV that step6_75 wants).
 **The correct fix changes requires and parameter_map together**:
 ```python
 StepSpec(
    step_id="step6_75", method_name="step6_75_custom_regression",
    requires=["indices_path"],                     # ★ switched from training_csv_path to indices_path
    produces=["models_dir"],
    parameter_map={"indices_path": "csv_path"},    # ★ key updated to match
    description="自定义回归分析",
 ),
 ```
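 **Backed by the `skip_when_missing` guard**: if the user did not run step5_5 (`ctx.indices_path` is None), the runner skips step6_75 entirely instead of silently running on mismatched data.
 **Telling "re-route" apart from "pure rename"**:
 - Check the fallback chain in the method body: if it falls back to `self.indices_path`, `self.deglint_img_path`, or another ctx field name → **requires must change**
 - If only the key name differs and the method body uses the parameter directly → change parameter_map only
 ## L2 injection-order conflict: several requires fields resolve to the same parameter name
 ### Scenario
 `StepSpec.requires` contains **several ctx fields** that, after `_default_param_name` / `parameter_map` resolution, **end up on the same method parameter name**. L2 injection is **order-sensitive**: a later injection **silently overwrites** the value assigned by an earlier one.
 ### Real case (2026-06-04 step5 fix)
 Requirement: step5 really needs the step4 product `processed_csv_path`, but the **raw `csv_path` field is kept** as the `user_config` override entry.
 **❌ Wrong parameter_map** (a hidden bug in the user's original proposal):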
 ```python
 StepSpec(
    step_id="step5", method_name="step5_extract_training_spectra",
    requires=["deglint_img_path", "processed_csv_path", "csv_path", ...],  # raw csv_path is listed too
    parameter_map={"processed_csv_path": "csv_path"},  # ★ only one key is mapped
    ...
 )
 ```
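 L2 injection order (`runner.py:184-186`):
 1. `deglint_img_path` → `kwargs[deglint_img_path] = ctx.deglint_img_path`
 2. `processed_csv_path` → `kwargs[csv_path] = ctx.processed_csv_path`  ← main path takes effect
 3. `csv_path` (no mapping → default) → `kwargs[csv_path] = ctx.csv_path`  ← **the later injection overwrites the main path, usually with None**
 **Symptom**: the step5 parameter `csv_path` receives the raw `ctx.csv_path` (usually None), so the method body falls back to `self.processed_csv_path`; that fallback may also be None (step4 has not run), and step5 then runs on nothing and fails **silently**.
 **✅ Fix: map both keys in parameter_map and send the backup key to a placeholder name that lands in `**kwargs`**: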
 ```python
 parameter_map={
    "processed_csv_path": "csv_path",       # main path (injected into the method parameter)
    "csv_path": "_raw_csv_ignored",         # placeholder (lands in the **kwargs at the end of step5's parameter list)
 },
 ```
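 With both keys mapped, the injections become:
 - `processed_csv_path` → `kwargs[csv_path] = ctx.processed_csv_path`  ← main path
 - `csv_path` → `kwargs[_raw_csv_ignored] = ctx.csv_path`  ← lands in `**kwargs` (swallowed)
 The step5 parameter `csv_path` ends up with the value of `ctx.processed_csv_path` ✓.
 ### Verification template (behavioral simulation)
 Write a temporary `_verify_l2_inject.py` that replicates the L2 injection loop of `runner.py:184-186`. **Do not rely on static AST checks alone**: the key order of parameter_map and the field order of requires only take effect at runtime: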
 ```python
 import sys
 sys.path.insert(0, r'D:\111\office\ZHLduijie\1.WQ\WQ_GUI')
 from src.core.pipeline.context import PipelineContext
 from src.core.pipeline.runner import PIPELINE_STEPS
 spec5 = next(s for s in PIPELINE_STEPS if s.step_id == 'step5')
 # 复刻 L2 注入（与 runner.py:184-186 完全一致）
 def l2_inject(spec, ctx):
    kwargs = {}
    for ctx_key in spec.requires:
        param_name = spec.parameter_map.get(ctx_key, ctx_key)  # ★ 必须原样复刻
        kwargs[param_name] = ctx.get(ctx_key)
    return kwargs
 # 关键断言
 ctx = PipelineContext(processed_csv_path='/csv/processed.csv', csv_path='/csv/raw.csv')
 kw = l2_inject(spec5, ctx)
 assert kw.get('csv_path') == '/csv/processed.csv', \
    f"形参 csv_path 应等于 processed_csv_path, 实际 {kw.get('csv_path')!r}"
 assert '_raw_csv_ignored' in kw, "占位名应被注入到 kwargs"
 print(f'OK: csv_path 形参 = {kw["csv_path"]!r} (processed, 主路径正确)')
 print(f'OK: _raw_csv_ignored 占位 = {kw["_raw_csv_ignored"]!r} (raw, 落 **kwargs 被吞)')
 ```
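 Delete it after running: `py _verify_l2_inject.py & del _verify_l2_inject.py` (Windows one-liner).
 ### When to watch out for this conflict
 Checklist when modifying a StepSpec (**read this before writing parameter_map**):
 - [ ] **Does requires contain more than one ctx field that resolves to the same method parameter?** Typical collisions:
  - A field whose parameter_map target equals another field's own name (for example `processed_csv_path` mapped to `csv_path` while `csv_path` is also in requires)
  - Similar field names such as `boundary_path` and `boundary_shp_path`: `_default_param_name` no longer strips suffixes and returns the ctx key unchanged, so these do not collide by default; they collide only through an explicit parameter_map
  - A field name colliding with the target of a parameter_map rename
 - [ ] **Do not rely on the position of the "main path" field in requires**: ordering does not resolve the conflict, because whichever field is injected later always overwrites the earlier one
 - [ ] **Map the "backup path" field** to a placeholder name (`_xxx_ignored` / `_xxx_kwargs_only`) so that it lands in `**kwargs`
 - [ ] **Confirm the method's parameter list ends with `**kwargs`** (the core requirement of the `facade_kwargs_defense` skill, already applied to 14/14)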
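
The first checklist item can be automated. The sketch below assumes the resolution rule shown in the verification template (`parameter_map.get(ctx_key, ctx_key)`); `find_param_collisions` is a hypothetical helper, not something that exists in `runner.py`.

```python
from collections import defaultdict


def find_param_collisions(requires, parameter_map):
    """Return {param_name: [ctx keys]} for every parameter fed by more than one ctx key."""
    sources = defaultdict(list)
    for ctx_key in requires:
        # Same resolution rule as the L2 loop: mapped name if present, else the key itself.
        sources[parameter_map.get(ctx_key, ctx_key)].append(ctx_key)
    return {param: keys for param, keys in sources.items() if len(keys) > 1}


requires = ["deglint_img_path", "processed_csv_path", "csv_path"]

# Only one key mapped: both CSV fields resolve to csv_path.
buggy = find_param_collisions(requires, {"processed_csv_path": "csv_path"})

# Backup key sent to a placeholder: no parameter has two sources.
fixed = find_param_collisions(
    requires, {"processed_csv_path": "csv_path", "csv_path": "_raw_csv_ignored"}
)
```

Running such a check over every StepSpec before a release would flag a many-to-one mapping without having to execute the pipeline.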
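 ### Anti-patterns (do not do this)
 - ❌ "Just leave `csv_path` out of requires": this **loses the user_config override entry** (in case the user wants the raw CSV instead of the processed one)
 - ❌ "Change the L2 injection loop so parameter_map fields are injected last": this **changes the runner's general semantics** and affects the injection order of every step
 - ❌ "Add `if param_name in kwargs: continue` to the L2 injection": an implicit "first one wins" rule that confuses newcomers reading the code
 - ❌ "Weight fields by their position in requires": this pushes data semantics (which field has priority) into list order, whereas the runner should stay declarative
 ### Difference from a "pure rename"
 | Aspect | Pure rename (existing skill cases) | Many-to-one conflict (this section) |
 |---|---|---|
 | Typical case | step6_5/6_75/8_5/8_75: one requires field renamed to a parameter | step5: two requires fields collide on one parameter |
 | parameter_map | one key → parameter name | two keys: main key → parameter name, backup key → placeholder name |
 | requires | one field | two fields (main + backup) |
 | Source of conflict | none (single key) | present (order-sensitive + name collision) |
 | Fix | add parameter_map only | map both keys in parameter_map + placeholder name |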
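 ## Relationship to the other defensive layers
 ```
 PipelineRunner.run() main loop
  ├─ L1 runner.py:152  skip_when_missing   ─── all ctx.<required> are None → skip step
  ├─ L2 runner.py:182  ctx field injection ─── not declared in the parameter list → TypeError ⚠️ → caught by **kwargs
  ├─ L3 runner.py:188  user_config merge   ─── empty string / None in user_config → skipped (guard added in the previous round) ✅
  └─ L4 runner.py:211  except handler      ─── business code raises → ctx.status="error" + raise
 ```
 `**kwargs` is the **passive safety net for L2**: it swallows surplus keys rather than raising a TypeError. **The active fix is parameter_map** (mapping the ctx field name to the correct parameter name). The two work together:
 - **Conservative phase (early refactoring)**: catch everything with `**kwargs` first so the TypeErrors disappear
 - **Stable phase**: add parameter_map so the method receives the correct data
 ## Anti-patterns (do not do this)
 - ❌ "Skip `**kwargs` and rely on type hints and IDE checks": the Runner injects at runtime, which the IDE cannot see
 - ❌ "Put `**kwargs` in the middle of the parameter list": a Python syntax error
 - ❌ "Remove redundant ctx fields from the requires list": this makes `skip_when_missing` misjudge (it assumes the step does not need that ctx field); rename with `parameter_map` instead of deleting from requires
 - ❌ "Add `if 'glint_mask_path' in kwargs: kwargs.pop('glint_mask_path')` to the bodies of the 14 Facade methods": tedious, needed in every method, and far less clean than a single `**kwargs`
 ## Case source
 - 2026-06-04, second step of the WQ_GUI PipelineRunner migration
 - Trigger: `step3_remove_glint() got an unexpected keyword argument 'glint_mask_path'`
 - Root cause: `PIPELINE_STEPS.step3.requires` lists `glint_mask_path`, which is used inside `GlintRemovalStep`; the Facade itself does not accept that parameter
 - Applied: `, **kwargs` added to all 14 Facades, 0 TypeErrors
 - Verified: temporary `_check.py` hits 14/14 and AST parsing passes
 - Follow-up: all 4 parameter_map entries applied (commit `64aa5b8`), including routing step6_75 to indices_path; the L3 non-empty filter was added at `runner._invoke:188` at the same time
 - 2026-06-04 step5 strict-dependency fix: found the L2 injection-order conflict (several requires fields resolving to the same parameter name) and introduced the pattern of mapping both keys in parameter_map, with a placeholder name that lands in `**kwargs`. The step5 parameter `csv_path` now really receives `processed_csv_path` (the step4 product); the raw `csv_path` stays as the user_config override entry, is mapped to the placeholder `_raw_csv_ignored`, and is swallowed by `**kwargs`. The skip_when_missing block also gained `_notify` calls, so **nothing is skipped silently** (all 15 _notify calls carry the concrete list of missing fields as evidence).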
--- a/.qwen/skills/wq_gui_data_flow/SKILL.md
+++ b/.qwen/skills/wq_gui_data_flow/SKILL.md
@@ -1,206 +0,0 @@
 ---
 name: WQ_GUI data flow architecture
 description: How data is passed between steps via the ProjectSession event bus in WQ_GUI (fully refactored version)
 source: auto-skill
 extracted_at: '2026-05-28T09:07:34.967Z'
 ---
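 # WQ_GUI data flow architecture
 ## Key conclusion
 The whole system is a pipeline **driven by file paths**; all data lives on the local disk. After the refactor, the `ProjectSession` event bus fully decouples the Panels from one another.
 ---
 ## 1. Old architecture (already removed from the code)
 The main window managed path filling between steps through the `self.step_outputs` dict, the `step_dependencies` configuration, and the `auto_populate_*` family of methods, which made it tightly coupled: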
 ```python
 # Deprecated and removed
 self.step_outputs = {}
 self._init_step_dependencies()
 self.update_step_outputs(step_name, work_path)
 self.auto_populate_dependent_steps(completed_step)
 self.auto_populate_step_inputs(step_id)
 self.find_step_output(work_path, step_id, output_type)
 self.add_auto_fill_buttons_to_panels()
 self.scan_work_directory_for_files(work_path)
 ```
 ---
 ## 2. New architecture: the ProjectSession event bus
 ### Core Session API (`src/core/project_session.py`)
 ```python
 class ProjectSession(QObject):
    path_updated = pyqtSignal(str, str, str)   # step, out_type, path
    step_outputs_ready = pyqtSignal(str, str)  # step, out_type
    def update_output(self, step, out_type, path):
        """Broadcast one output path after a Panel finishes."""
    def update_outputs(self, step, outputs):
        """Broadcast several output paths at once; outputs is {out_type: path, ...}."""
    def get_output(self, step, out_type):
        """Let a Panel query an upstream path on demand (used for auto-fill)."""
    def get_step_outputs(self, step):
        """Return the dict of all outputs of this step."""
    def scan_work_directory(self):
        """Called at the end of the main window's on_step_completed; scans and broadcasts all known paths."""
 ```
 ### Panel refactoring template
 ```python
 class StepXPanel(QWidget):
    def __init__(self, session=None, parent=None):
        super().__init__(parent)
        self.session = session
        self.work_dir = None
        self.init_ui()
        self._bind_session_signals()
    def _bind_session_signals(self):
        if not self.session:
            return
        self.session.path_updated.connect(
            self._on_session_path_updated, Qt.QueuedConnection
        )
    @pyqtSlot(str, str, str)
    def _on_session_path_updated(self, step_name, output_type, path):
        print(f"[StepX Debug] 收到广播: step={step_name}, type={output_type}, path={path}")
        if step_name == 'step1':
            if output_type == 'reference_img':
                if not self.img_file.get_path().strip():
                    self.img_file.set_path(path)
                    print(f"[StepX] 自动填充参考影像: {path}")
            elif output_type == 'water_mask':
                if not self.water_mask_file.get_path().strip():
                    self.water_mask_file.set_path(path)
                    print(f"[StepX] 自动填充水域掩膜: {path}")
        # ...
    def on_step_finished(self, success, message):
        """由主窗口 on_step_completed 通过 getattr 动态调用"""
        if not success:
            return
        if self.session:
            outputs = {}
            path = self.output_widget.get_path().strip()
            if path:
                outputs['output_type'] = path
            if outputs:
                self.session.update_outputs('stepX', outputs)
 ```
 ### Two changes in the main window
 ```python
 # 1. Inject the session in __init__ (the same for every Panel)
 self.step1_panel = Step1Panel(session=self.session)
 self.step2_panel = Step2Panel(session=self.session)
 self.step3_panel = Step3Panel(session=self.session)
 self.step4_panel = Step4Panel(session=self.session)
 self.step5_panel = Step5Panel(session=self.session)
 self.step5_5_panel = Step5_5Panel(session=self.session)
 self.step6_panel = Step6Panel(session=self.session)
 self.step6_5_panel = Step6_5Panel(session=self.session)
 self.step6_75_panel = Step6_75Panel(session=self.session)
 self.step7_panel = Step7Panel(session=self.session)
 self.step8_panel = Step8Panel(session=self.session)
 self.step8_5_panel = Step8_5Panel(session=self.session)
 self.step8_75_panel = Step8_75Panel(session=self.session)
 self.step9_panel = Step9Panel(session=self.session)
 # 2. on_step_completed (generic dynamic lookup, no dict to maintain)
 def on_step_completed(self, step_name, success, message):
    if not success:
        return
    if hasattr(self, 'session') and self.session:
        self.session.scan_work_directory()
    panel = getattr(self, f"{step_name}_panel", None)
    if panel and hasattr(panel, 'on_step_finished'):
        panel.on_step_finished(success, message)
 ```
 ---
 ## 3. End-to-end event flow
 ### step1 → step2 / step3 path (via the rasterized Shapefile product)
 | Scenario | Broadcast water_mask path |
 |------|----------------------|
 | NDWI mode | `output_file`, the path chosen by the user |
 | Shapefile mode | `{work_dir}/1_water_mask/water_mask_from_shp.dat` (preferred)<br>falls back to `mask_file.get_path()` if the file does not exist |
 ```
 step1 finished
  → step1_panel.on_step_finished()
       → session.update_outputs('step1', {
             'reference_img': img_path,
             'water_mask': mask_path    # 可能是 .dat 或 .shp（见上表）
         })
            → step2_panel._on_session_path_updated()
            → step3_panel._on_session_path_updated()
 ```
 ### step3 → step5 / step7; step5 → downstream training
 ```
 step3.deglint_image ──┬─→ step5.deglint_image (fills img_file)
                      └─→ step7.deglint_image (fills img_file)
 step5.training_spectra ──┬─→ step5_5.index_features
                        ├─→ step6.models_dir ──→ step8.predictions
                        ├─→ step6_5.models_dir ──→ step8_5.predictions
                        └─→ step6_75.models_dir ──→ step8_75.predictions
 step7.sampling_points ──┬─→ step8
                        ├─→ step8_5
                        └─→ step8_75
 step8/8_5/8_75.predictions ──→ step9.distribution_map
 ```
 ### Listen/publish table for every Panel (complete)
 | Panel | Listens to | Publishes |
 |-------|------|------|
 | step1 | — | `reference_img`, `water_mask` |
 | step2 | `step1.reference_img`, `step1.water_mask` | `glint_mask` |
 | step3 | `step1.reference_img`, `step1.water_mask`, `step2.glint_mask` | `deglint_image` |
 | step4 | — | `processed_data` |
 | step5 | `step3.deglint_image`, `step4.processed_data`, `step2.glint_mask` | `training_spectra` |
 | step5_5 | `step5.training_spectra` | `index_features` |
 | step6 | `step5.training_spectra` | `models_dir` |
 | step6_5 | `step5.training_spectra` | `models_dir` |
 | step6_75 | `step5.training_spectra` | `models_dir` |
 | step7 | `step3.deglint_image`, `step1.water_mask`, `step2.glint_mask` | `sampling_points` |
 | step8 | `step7.sampling_points`, `step6.models_dir` | `predictions` |
 | step8_5 | `step7.sampling_points`, `step6_5.models_dir` | `predictions` |
 | step8_75 | `step7.sampling_points`, `step6_75.models_dir` | `predictions` |
 | step9 | `step8.predictions`, `step8_5.predictions`, `step8_75.predictions` | `distribution_map` |
 ---
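 ## 4. Key constraints
 - The `__init__` parameter is `session=None` (backward compatible; the main window may still omit it)
 - `init_ui / get_config / set_config / update_from_config` are kept intact in every Panel
 - All cross-boundary `self.window().stepX_panel` accesses are removed
 - Use `self.session.get_output()` instead of reading another panel's widgets directly
 - Listeners connect with `Qt.QueuedConnection` for cross-thread safety
 - Auto-fill only when the field is empty (`not widget.get_path().strip()`)
 - In `update_from_config`, take paths from the Session first, then broadcast through the Session
 - The main window's `on_step_completed` uses `getattr(self, f"{step_name}_panel", None)` for generic dynamic lookup, so no hard-coded dict has to be maintained
 - In `step1` Shapefile mode, the `.shp` input file **must not** be broadcast directly; the product path `{work_dir}/1_water_mask/water_mask_from_shp.dat` must be built instead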
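
The publish/listen contract and the "fill only when empty" rule can be tried out without Qt. The sketch below is a plain-Python analogue written for illustration: `MiniSession` and `MiniPanel` are invented names, callbacks are invoked synchronously, and nothing here reproduces the signal/slot or queued-connection behavior of the real `ProjectSession`.

```python
class MiniSession:
    """Plain-Python stand-in for the event bus: stores outputs and notifies listeners."""

    def __init__(self):
        self._outputs = {}
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def update_output(self, step, out_type, path):
        self._outputs.setdefault(step, {})[out_type] = path
        for callback in self._listeners:
            callback(step, out_type, path)

    def get_output(self, step, out_type):
        return self._outputs.get(step, {}).get(out_type)


class MiniPanel:
    """Fills its image field from step1 broadcasts, but only while the field is empty."""

    def __init__(self, session):
        self.img_path = ""
        session.subscribe(self._on_path_updated)

    def _on_path_updated(self, step, out_type, path):
        if step == "step1" and out_type == "reference_img" and not self.img_path.strip():
            self.img_path = path


session = MiniSession()
panel = MiniPanel(session)
session.update_output("step1", "reference_img", "a.dat")
session.update_output("step1", "reference_img", "b.dat")  # ignored by the panel: field already filled
```

The session always keeps the latest path, while the panel keeps whatever was filled in first, which mirrors the rule that a value already in the field is never overwritten by a broadcast.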
--- a/.qwen/skills/wq_gui_external_model_panel/SKILL.md
+++ b/.qwen/skills/wq_gui_external_model_panel/SKILL.md
@@ -1,294 +0,0 @@
 ---
 name: WQ_GUI PyQt5 panel pattern for importing external models
 description: Standard pattern for a "built-in / imported" dual mode in prediction panels such as Step8, using QRadioButton + FileSelectWidget + defensive joblib.load
 source: auto-skill
 extracted_at: '2026-06-08T01:38:14.481Z'
 ---
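 # WQ_GUI PyQt5 panel pattern for importing external models
 ## When to use
 Panels such as Step8 (machine-learning prediction), Step8_5, and Step8_75 need to support both:
 1. **Built-in mode**: use the model directory produced by the `step6` training flow
 2. **Import mode**: the user manually picks a local pre-trained `.joblib` file, which is loaded directly
 ---
 ## 1. Template (can be copied into `__init__` + `init_ui`)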
 ```python
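 from PyQt5.QtWidgets import QGroupBox, QRadioButton, QVBoxLayout, QWidget
 # FileSelectWidget is the project's own widget; import it from its module in your tree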
 class StepXPanel(QWidget):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.current_model = None   # ★ 外部模型实例缓存
        self.init_ui()
    def init_ui(self):
        layout = QVBoxLayout()
        # -------- 模型来源选择（单选按钮组） --------
        source_group = QGroupBox("模型来源")
        source_layout = QVBoxLayout()
        self.use_trained_model = QRadioButton("使用当前训练流程的模型")
        self.use_external_model = QRadioButton("导入本地预训练模型 (.joblib)")
        self.use_trained_model.setChecked(True)
        source_layout.addWidget(self.use_trained_model)
        source_layout.addWidget(self.use_external_model)
        self.use_trained_model.toggled.connect(self._on_model_source_changed)
        self.use_external_model.toggled.connect(self._on_model_source_changed)
        source_group.setLayout(source_layout)
        layout.addWidget(source_group)
        # -------- 外部模型文件选择（条件显示） --------
        self.external_model_widget = FileSelectWidget(
            "预训练模型:",
            "Joblib Files (*.joblib);;All Files (*.*)"
        )
        # FileSelectWidget 的 browse_btn 默认连着 open file 行为，
        # 需要先断开默认连接，再接自定义槽
        self.external_model_widget.browse_btn.clicked.disconnect()
        self.external_model_widget.browse_btn.clicked.connect(self._browse_external_model)
        self.external_model_widget.setVisible(False)
        layout.addWidget(self.external_model_widget)
        # ... 其余原有 UI ...
 ```
 ---
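 ## 2. Slot templates
 ### `_on_model_source_changed`
 The `toggled` signal fires on **both** radio buttons (clicking A toggles A on and B off), so `if not checked: return` short-circuits the call coming from the button that was just unchecked.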
 ```python
 def _on_model_source_changed(self, checked: bool):
    """单选按钮切换：控制外部模型文件选择控件的显示/隐藏"""
    if not checked:
        return
    is_external = self.use_external_model.isChecked()
    self.external_model_widget.setVisible(is_external)
    # 切回"使用当前模型"时清空缓存，释放内存并避免误用旧模型
    if not is_external:
        self.current_model = None
 ```
 ### `_browse_external_model`
 - Use `QFileDialog.getOpenFileName`, not `getExistingDirectory`
 - Defensively handle two formats: `{"model": pipeline, ...}` (the Step6 output format) and a bare `Pipeline` object
 - On failure show a friendly `QMessageBox.warning`; on success confirm with `QMessageBox.information`
 ```python
 from PyQt5.QtWidgets import QFileDialog, QMessageBox
 from pathlib import Path
 def _browse_external_model(self):
    """浏览并加载外部 .joblib 预训练模型文件"""
    default = self._get_default_work_dir()
    path, _ = QFileDialog.getOpenFileName(
        self,
        "选择预训练模型 (.joblib)",
        default,
        "Joblib Files (*.joblib);;All Files (*.*)",
    )
    if not path:
        return
    try:
        import joblib
        loaded = joblib.load(path)
        # 兼容两种格式：dict{"model": obj} 或裸 Pipeline
        if isinstance(loaded, dict) and "model" in loaded:
            self.current_model = loaded["model"]
        elif hasattr(loaded, "predict"):
            self.current_model = loaded
        else:
            QMessageBox.warning(
                self,
                "模型格式错误",
                f"无法识别的模型格式，文件内容类型为：{type(loaded).__name__}",
            )
            return
        self.external_model_widget.set_path(path)
        QMessageBox.information(
            self,
            "模型加载成功",
            f"已加载模型：{Path(path).name}\n类型：{type(self.current_model).__name__}",
        )
    except Exception as e:
        self.current_model = None
        QMessageBox.warning(
            self,
            "模型加载失败",
            f"加载模型时发生错误：\n{type(e).__name__}: {e}",
        )
 ```
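The format detection in the slot above can be pulled into a pure function so it can be tested without a GUI. A minimal sketch, assuming only the two formats described in this section; `extract_model` and `DummyModel` are hypothetical names.

```python
def extract_model(loaded):
    """Return the predictor inside a loaded artifact, or None if the format is not recognised."""
    # Format 1: a dict wrapping the estimator under the "model" key.
    if isinstance(loaded, dict) and "model" in loaded:
        return loaded["model"]
    # Format 2: a bare estimator, recognised by its predict attribute.
    if hasattr(loaded, "predict"):
        return loaded
    return None


class DummyModel:  # hypothetical stand-in for a fitted estimator
    def predict(self, X):
        return [0 for _ in X]
```

The slot would then only decide which message box to show, depending on whether the function returned `None`.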
 ---
 ## 3. `run_step` modification template
 Insert the external-model branch before the existing directory-loading logic so that it takes priority:
 ```python
 def run_step(self):
    """独立运行步骤X"""
    # ... 公共输入校验 ...
    # ★ 外部模型优先分支
    if self.use_external_model.isChecked():
        if self.current_model is None:
            QMessageBox.warning(
                self,
                "模型未加载",
                "请先点击「浏览...」按钮加载预训练模型文件！",
            )
            return
        external_model_path = self.external_model_widget.get_path() or ""
        main_window = self.window()
        if hasattr(main_window, 'run_single_step'):
            config = {
                'stepX': self.get_config(),
                '_external_model': self.current_model,       # ★ 直接传对象
                '_external_model_path': external_model_path,  # 供日志/回溯用
            }
            main_window.run_single_step('stepX', config)
        return
    # 默认流程：使用模型目录（原有逻辑不变）
    models_dir = self.models_dir_file.get_path()
    if not models_dir:
        QMessageBox.warning(self, "输入错误", "请选择模型目录！")
        return
    # ... 原有 run_step 剩余代码 ...
 ```
 ---
 ## 4. Full three-layer backend integration (applied on 2026-06-08)
 The data flow spans three layers, each with one branching point:
 ```
 GUI step8_panel
  ↓ config = {'_external_model': obj, '_external_model_path': path, 'step8': {...}}
       ↓
 worker_thread.run_single_step()         [branch 1: pass through top-level keys]
  ↓ step_config = config['step8'] + {'_external_model': obj, '_external_model_path': path}
       ↓
 prediction_step.predict_water_quality() [第2处分流：接收 + 透传]
  ↓ _external_model=obj, _external_model_path=path
       ↓
 WaterQualityInference(artifacts_dir, external_model=obj, external_model_path=path)
  ↓
 inference_batch.batch_inference_multi_models()  [第3处分流：effective_model 短路]
  ↓ external_model=obj
       ↓
 inference_batch.inference_pipeline()
  → self.external_model is not None → self.loaded_model_data = self.external_model（跳过磁盘加载）
 ```
 ### 4a. worker_thread.py: run_single_step pass-through
 Insert after `step_config = dict(config.get(step_name, {}))` and before "skip_dependency_check":
 ```python
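 # Pass through the external pretrained model supplied at the top level by the panel
 # (GUI step8_panel passes it via config['_external_model']).
 # Override only when non-empty (per the feedback_never_overwrite_with_empty principle)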
 for key in ('_external_model', '_external_model_path'):
    val = config.get(key)
    if val is not None and val != "":
        step_config[key] = val
 ```
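The pass-through rule above can be exercised outside the GUI. The following is a minimal sketch, not the project's actual `run_single_step`: the helper name `build_step_config` and the sample `threshold` key are hypothetical, and the emptiness check is written with `isinstance` so that `!=` is never called on an arbitrary model object (an assumption about what is safest, since some objects, such as NumPy arrays, overload comparison).

```python
from typing import Any, Dict

EXTERNAL_KEYS = ("_external_model", "_external_model_path")


def build_step_config(config: Dict[str, Any], step_name: str) -> Dict[str, Any]:
    """Copy the step's own settings, then overlay non-empty top-level external-model keys."""
    step_config = dict(config.get(step_name, {}))
    for key in EXTERNAL_KEYS:
        val = config.get(key)
        # Skip None and empty strings so an existing value is never overwritten with "nothing".
        if val is None or (isinstance(val, str) and val == ""):
            continue
        step_config[key] = val
    return step_config


model = object()  # stand-in for an object returned by joblib.load
cfg = {
    "step8": {"threshold": 0.5},   # hypothetical step setting
    "_external_model": model,
    "_external_model_path": "",    # empty, so it must not be copied
}
merged = build_step_config(cfg, "step8")
```

Here `merged` holds the same model object by reference, the empty path is dropped, and the nested `cfg["step8"]` dictionary is left unmodified because the helper works on a shallow copy.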
 ### 4b. prediction_step.py: predict_water_quality signature + pass-through
 Append two parameters to the end of the parameter list:
 ```python
 _external_model=None,
 _external_model_path=None,
 ```
 Pass them through at construction:
 ```python
 inferencer = WaterQualityInference(
    models_dir,
    external_model=_external_model,
    external_model_path=_external_model_path,
 )
 all_results = inferencer.batch_inference_multi_models(
    models_root_dir=models_dir,
    ...
    external_model=_external_model,
    external_model_path=_external_model_path,
 )
 ```
 ### 4c. inference_batch.py: three changes
 **① Store in `__init__`**:
 ```python
 def __init__(self, artifacts_dir: str = "models/artifacts",
             external_model=None, external_model_path=None):
    ...
    self.external_model = external_model
    self.external_model_path = external_model_path
 ```
 **② Short-circuit + injection in `batch_inference_multi_models`**:
 ```python
 # Priority: external pretrained model > load from disk
 if external_model is not None:
    effective_model = external_model
    print(f"\n使用外部预训练模型: type={type(external_model).__name__}")
 else:
    effective_model = None
 # Inject inside the subdirectory loop:
 if effective_model is not None:
    model_inferencer = WaterQualityInference(
        str(subdir),
        external_model=effective_model,
        external_model_path=external_model_path,
    )
 else:
    model_inferencer = WaterQualityInference(str(subdir))
 ```
 **③ Model-loading short-circuit in `inference_pipeline`** (before the `load_best_model` call):
 ```python
 if self.external_model is not None:
    self.loaded_model_data = self.external_model
    print(f"  使用外部预训练模型: type={type(self.external_model).__name__}")
 elif model_file_path:
    self.load_specific_model(model_file_path)
 else:
    self.load_best_model(metric=metric)
 ```
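 **Key constraints**:
 - `joblib.load` runs inside the panel's slot function (within the GUI process), and the object is passed through by reference via config; it **never crosses a process boundary**, so pickle serialization is not a concern
 - The `batch_inference_multi_models` parameters `external_model` and `external_model_path` **share their names with the instance attributes** (`self.external_model`); both are passed so that every `WaterQualityInference` instance created per subdirectory holds its own reference
 - The original logic that loads from the `models_dir` directory is fully preserved; it is short-circuited only when `external_model is not None`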
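As a rough illustration of the constraints above, the sketch below models only the loading branch. `InferenceSketch` and `batch_sketch` are hypothetical stand-ins, not the real `WaterQualityInference` or `batch_inference_multi_models`, and the "disk load" is simulated with a counter rather than real file access.

```python
class InferenceSketch:
    """Hypothetical stand-in for WaterQualityInference; only model loading is modelled."""

    def __init__(self, artifacts_dir, external_model=None, external_model_path=None):
        self.artifacts_dir = artifacts_dir
        self.external_model = external_model
        self.external_model_path = external_model_path
        self.loaded_model_data = None
        self.disk_loads = 0  # counts simulated disk reads

    def _load_from_disk(self):
        # Placeholder for load_specific_model / load_best_model.
        self.disk_loads += 1
        self.loaded_model_data = {"source": self.artifacts_dir}

    def inference_pipeline(self):
        # The external model wins; disk loading is the fallback.
        if self.external_model is not None:
            self.loaded_model_data = self.external_model
        else:
            self._load_from_disk()
        return self.loaded_model_data


def batch_sketch(subdirs, external_model=None, external_model_path=None):
    """Create one inferencer per subdirectory, injecting the shared external model if given."""
    inferencers = []
    for subdir in subdirs:
        inf = InferenceSketch(
            subdir,
            external_model=external_model,
            external_model_path=external_model_path,
        )
        inf.inference_pipeline()
        inferencers.append(inf)
    return inferencers


shared = {"estimator": "stub"}  # placeholder for an object returned by joblib.load
with_external = batch_sketch(["models/a", "models/b"], external_model=shared)
without_external = batch_sketch(["models/a"])
```

Every instance in `with_external` references the same in-memory object and performs no simulated disk read, while `without_external` falls back to the directory path. In this sketch, passing `external_model=None` to the constructor is equivalent to omitting it, so a single constructor call covers both cases.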
 ---
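 ## 5. Known constraints
 - `FileSelectWidget.browse_btn.clicked` is connected again on every `init_ui` call, so connections accumulate; the fix is to call `.disconnect()` before connecting (as the template shows).
 - The `QRadioButton.toggled` signal fires on both buttons, so the handler **must** short-circuit with `if not checked: return`; otherwise the state gets scrambled when switching.
 - `self.current_model` is cleared when the panel switches to "使用当前模型" (use current model), so that a previously imported model is not still used after the user switches back to built-in mode.
 - Current project venv path: `D:\111\office\ZHLduijie\1.WQ\WQ_GUI\venv`; when importing `joblib`, make sure the same venv is in use.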
--- a/.qwen/skills/wq_gui_frontend_scaffold/SKILL.md
+++ b/.qwen/skills/wq_gui_frontend_scaffold/SKILL.md
@ -1,229 +0,0 @@
 ---
 name: WQ_GUI frontend Vue3 + Element Plus scaffold
 description: Minimal runnable Vite + Vue 3 + TS + Element Plus scaffold for the WQ_GUI project's frontend/ directory, plus the wiring pattern between useTaskPoller and the Element Plus UI
 source: auto-skill
 extracted_at: '2026-06-02T08:17:33.116Z'
 ---
 # WQ_GUI frontend scaffold (Vue 3 + Element Plus)
 ## When to use
 Build a **minimal, integration-ready** browser console for the WQ_GUI FastAPI backend (`127.0.0.1:8000`).
 The backend already exposes:
 - `POST /api/modeling/train` → `{ task_id, status, kind }`
 - `POST /api/modeling/predict` → `{ task_id, status, kind }`
 - `GET  /api/tasks/{task_id}` → `TaskRecord` (includes PENDING/PROCESSING/SUCCESS/FAILED plus model metrics / output paths)
 - `GET  /api/algorithms` → algorithm list
 The frontend already has (`frontend/src/`):
 - `api/request.ts`: axios singleton + a response interceptor that unwraps automatically; baseURL comes from `VITE_API_BASE_URL`, defaulting to `http://127.0.0.1:8000`
 - `api/tasks.ts`: all submit / query functions + complete `TaskRecord` / `TaskStatus` / `TaskKind` types
 - `composables/useTaskPoller.ts`: full polling composable supporting three usage modes (static / reactive taskId / manual)
 ## 1. Scaffold files to add in one pass
 Initially `frontend/` **contains only `src/api` and `src/composables`** and lacks the entire Vite skeleton. Lay down the 10 files below:
 ```
 frontend/
 ├── .env.development       # VITE_API_BASE_URL=http://127.0.0.1:8000
 ├── .gitignore             # node_modules / dist / .vite
 ├── env.d.ts               # vite/client + ImportMeta + *.vue shim
 ├── index.html             # mounts #app
 ├── package.json
 ├── tsconfig.json          # strict mode + @ → src + bundler resolution
 ├── tsconfig.node.json     # used by vite.config.ts
 ├── vite.config.ts         # @ alias + 0.0.0.0:5173
 └── src/
    ├── main.ts
    └── App.vue
 ```
 ### Pinned versions (integration-tested 2026-06)
 ```json
 {
  "dependencies": {
    "vue": "^3.4.27",
    "element-plus": "^2.7.5",
    "@element-plus/icons-vue": "^2.3.1",
    "axios": "^1.7.2"
  },
  "devDependencies": {
    "@types/node": "^20.12.12",
    "@vitejs/plugin-vue": "^5.0.4",
    "typescript": "^5.4.5",
    "vite": "^5.2.11",
    "vue-tsc": "^2.0.19"
  }
 }
 ```
 **`@types/node` is required**: `vite.config.ts` uses `import { fileURLToPath, URL } from 'node:url'`, and without it the type check in `npm run build` fails.
 ### Key `tsconfig.json` fields
 - `"moduleResolution": "bundler"`
 - `"allowImportingTsExtensions": true` (paired with `vue-tsc --noEmit`)
 - `"paths": { "@/*": ["src/*"] }` + `"baseUrl": "."`
 - `"include": ["src/**/*.vue"]` (so that `vue-tsc` processes SFCs)
 - `"references": [{ "path": "./tsconfig.node.json" }]`
 ### Key `vite.config.ts` fields
 ```ts
 resolve: {
  alias: { '@': fileURLToPath(new URL('./src', import.meta.url)) },
 },
 server: { host: '0.0.0.0', port: 5173 },
 ```
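 `0.0.0.0` makes it easy to debug from real devices on the LAN; on a port conflict, `strictPort: false` lets Vite automatically try the next port.
 ---
 ## 2. main.ts template (full Element Plus registration)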
 ```ts
 import { createApp } from 'vue'
 import ElementPlus from 'element-plus'
 import 'element-plus/dist/index.css'
 import * as ElementPlusIconsVue from '@element-plus/icons-vue'
 import App from './App.vue'
 const app = createApp(App)
 app.use(ElementPlus)
 // Register all icons globally (<el-icon><Cpu /></el-icon>)
 for (const [name, component] of Object.entries(ElementPlusIconsVue)) {
  app.component(name, component)
 }
 app.mount('#app')
 ```
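 During integration, **full registration is the least effort**; if the bundle grows too large later, switch to on-demand imports with `unplugin-vue-components`.
 ---
 ## 3. useTaskPoller wiring pattern (two instances)
 Training and inference are **two independent pipelines**, each with its own `useTaskPoller` instance. The core pattern: wrap `task_id` in a `ref<string | null>(null)`; the composable's internal `watch` **calls start() automatically**, so no manual call is needed: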
 ```ts
 import { ref, watch, computed } from 'vue'
 import { submitTrain, submitPredict, type TaskRecord } from './api/tasks'
 import { useTaskPoller } from './composables/useTaskPoller'
 // ---- Training ----
 const trainTaskId = ref<string | null>(null)
 const trainPoller = useTaskPoller(trainTaskId)   // pass the ref in; it is watched automatically
 async function onStartTrain() {
  const { task_id } = await submitTrain({ ... })
  trainTaskId.value = task_id   // assigning triggers the watch, which calls start()
 }
 // ---- Inference ----
 const predictTaskId = ref<string | null>(null)
 const predictPoller = useTaskPoller(predictTaskId)
 const modelId = ref('')
 // Once training succeeds, model_id is filled into the inference input automatically
 watch(
  () => trainPoller.result.value?.model_id,
  (newId) => { if (newId) modelId.value = newId },
 )
 async function onStartPredict() {
  const { task_id } = await submitPredict({ model_id: modelId.value, ... })
  predictTaskId.value = task_id
 }
 ```
 **Key points**:
 - `trainPoller.result.value` is the complete `TaskRecord`, available only after SUCCESS; `record.value` is the latest record at any moment (including intermediate states). To show both in the template, use `trainPoller.record.value ?? trainPoller.result.value`.
 - `poller.isPolling.value` / `poller.status.value` / `poller.error.value` / `poller.taskId.value` are all `Ref`s, so the template must use `.value` (they are nested refs, and **Vue templates do not unwrap them automatically**).
 ---
 ## 4. el-progress status mapping
 `PollerStatus = 'idle' | 'PENDING' | 'PROCESSING' | 'SUCCESS' | 'FAILED'`
 The `status` prop of `el-progress` accepts `'' | 'success' | 'warning' | 'exception'`.
 ```ts
 function progressOf(status: string): number {
  switch (status) {
    case 'idle':
    case 'PENDING':   return 10
    case 'PROCESSING':return 60
    case 'SUCCESS':
    case 'FAILED':    return 100
    default:          return 0
  }
 }
 function progressStatusOf(s: string): '' | 'success' | 'exception' {
  if (s === 'SUCCESS') return 'success'
  if (s === 'FAILED')  return 'exception'
  return ''
 }
 ```
 In the template, `v-if="poller.isPolling.value || poller.status.value === 'SUCCESS' || poller.status.value === 'FAILED'"` controls visibility.
 ---
 ## 5. CSS: dark console style (slate gradient + glassmorphism cards)
 ```css
 .app-root {
  min-height: 100vh;
  background: linear-gradient(180deg, #0f172a 0%, #1e293b 100%);
  color: #e2e8f0;
 }
 .panel {
  background: rgba(30, 41, 59, 0.7) !important;
  border: 1px solid rgba(148, 163, 184, 0.18) !important;
 }
 .app-main {
  display: grid;
  grid-template-columns: 1fr 1fr;   /* training left / inference right */
  gap: 20px;
 }
@media (max-width: 960px) { .app-main { grid-template-columns: 1fr; } }
 ```
 On the dark background, Element Plus's `el-form-item__label` / `el-descriptions__label` default to dark text and must be overridden to a light color with `:deep()`.
 ---
 ## 6. Start and verify
 ```bat
 cd /d D:\111\office\ZHLduijie\1.WQ\WQ_GUI\frontend
 npm install
 npm run dev
 ```
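 Open `http://127.0.0.1:5173/`; the expected integration path:
 1. Left panel, 「开始训练」 (Start training) → `task_id` is returned immediately + a yellow `轮询中` (polling) status + progress bar at 60%
 2. Backend reports SUCCESS → the progress bar turns green, and a `model_id` tag plus R²/RMSE/MAE appear below
 3. `model_id` is filled in automatically on the right → 「开始推断」 (Start inference) → the result is shown via `output_zarr_path`
 4. FAILED at any step → the progress bar turns red and the backend `error` field is shown
 ---
 ## 7. Known caveats
 - **The first `npm install` downloads about 150 MB**, so expect a wait.
 - `useTaskPoller` already cleans up automatically in `onUnmounted`; **do not add a manual `clearInterval`**.
 - The comments in `request.ts` state that FastAPI uses `allow_origins=["*"]` during development, so **no Vite proxy is needed**; if the backend tightens CORS later, add `server.proxy['/api']` in `vite.config.ts`.
 - The backend accepts `number | string` for `feature_start`; el-input's v-model yields a string, which **can be passed to the API as-is**, and the backend resolves the type itself.
 - When binding `v-model` to `ref<number | string>(4)`, the type annotation is required; otherwise TS infers `Ref<number>` and an error is reported when the input loses focus.
 - After full registration, `@element-plus/icons-vue` icons are used as `<el-icon><Cpu /></el-icon>`; the current App.vue does not use them, but they are kept for future extension.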
--- a/1.py
+++ b/1.py
@ -1,4 +0,0 @@
 new_wavelengths = [np.mean(wavelengths[i:i+3]) for i in range(0, len(wavelengths), 3)]
 print(new_wavelengths)
--- a/data/sub/png/watermask.png
+++ b/data/sub/png/watermask.png
--- a/data/sub/waterindex.csv
+++ b/data/sub/waterindex.csv
@ -1,46 +0,0 @@
 Formula_Name,Category,Formula,Reference
 BGA_Am09KBBI,Phycocyanin (BGA_PC),(w686 - w658) / (w686 + w658),"Amin, R.; Zhou, J.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S.; Novel optical techniques for detecting and classifying toxic dinoflagellate Karenia brevis blooms using satellite imagery, Optics Express, 2009, 17, 11, 1-13."
 BGA_Be162B643sub629,Phycocyanin (BGA_PC),w644 - w629,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be162B700sub601,Phycocyanin (BGA_PC),w700 - w601,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be162BsubPhy,Phycocyanin (BGA_PC),w715 - w615,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 540."
 BGA_Be16FLHBlueRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w458 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be16FLHGreenRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w558 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be16FLHVioletRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w444 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be16MPI,Phycocyanin (BGA_PC),(w615 - w601) - (w644 - w601),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be16NDPhyI,Phycocyanin (BGA_PC),(w700 - w622) / (w700 + w622),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 540."
 BGA_Be16NDPhyI644over615,Phycocyanin (BGA_PC),(w644 - w615) / (w644 + w615),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 541."
 BGA_Be16NDPhyI644over629,Phycocyanin (BGA_PC),(w644 - w629) / (w644 + w629),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 542."
 BGA_Be16Phy2BDA644over629,Phycocyanin (BGA_PC),w644 / w629,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 545."
 BGA_Da052BDA,Phycocyanin (BGA_PC),w714 / w672,"Wynne, T. T., Stumpf, R. P., Tomlinson, M. C., Warner, R. A., Tester, P. A., Dyble, J.; Relating spectral shape to cyanobacterial blooms in the Laurentian Great Lakes. Int. J. Remote Sens., 2008, 29, 3665-3672."
 BGA_Go04MCI,Phycocyanin (BGA_PC),w709 - w681 - (w753 - w681),"Gower, J.F.R.; Brown,L.; Borstad, G.A.; Observation of chlorophyll fluorescence in west coast waters of Canada using the MODIS satellite sensor. Can. J. Remote Sens., 2004, 30 (1), 17<31><37>?5."
 BGA_HU103BDA,Phycocyanin (BGA_PC),(((1 / w615) - (1 / w600)) - w725),"Hunter, P.D.; Tyler, A.N.; Willby, N.J.; Gilvear, D.J.; The spatial dynamics of vertical migration by Microcystis aeruginosa in a eutrophic shallow lake: A case study using high spatial resolution time-series airborne remote sensing. Limn. Oceanogr. 2008, 53, 2391-2406"
 BGA_Ku15PhyCI,Phycocyanin (BGA_PC),(-1 * (W681 - W665 - (W709 - W665))),"Kudela, R.M., Palacios, S.L., Austerberry, D.C., Accorsi, E.K., Guild, L.S.; Application of hyperspectral remote sensing to cyanobacterial blooms in inland waters, Torres-Perez, J., 2015, Remote Sens. Environ., 2015, 167, 1-10."
 BGA_Ku15SLH,Phycocyanin (BGA_PC),(w715 - w658) + (w715 - w658),"Kudela, R.M., Palacios, S.L., Austerberry, D.C., Accorsi, E.K., Guild, L.S.; Application of hyperspectral remote sensing to cyanobacterial blooms in inland waters, Torres-Perez, J., 2015, Remote Sens. Environ., 2015, 167, 1-11."
 BGA_MI092BDA,Phycocyanin (BGA_PC),w700 / w600,"Mishra, S.; Mishra, D.R.; Schluchter, W. M., A novel algorithm for predicting PC concentrations in cyanobacteria: A proximal hyperspectral remote sensing approach. Remote Sens., 2009, 1, 758<35><38>?75."
 BGA_MM092BDA,Phycocyanin (BGA_PC),w724 / w600,"Mishra, S.; Mishra, D.R.; Schluchter, W. M., A novel algorithm for predicting PC concentrations in cyanobacteria: A proximal hyperspectral remote sensing approach. Remote Sens., 2009, 1, 758<35><38>?76."
 BGA_MM12NDCIalt,Phycocyanin (BGA_PC),(w700 - w658) / (w700 + w658),"Mishra, S.; Mishra, D.R.; A novel remote sensing algorithm to quantify phycocyanin in cyanobacterial algal blooms, Env. Res. Lett., 2014, 9 (11), DOI:10.1088/1748-9326/9/11/114003"
 BGA_MM143BDAopt,Phycocyanin (BGA_PC),((1 / w629) - (1 / w659)) * w724,"Mishra, S.; Mishra, D.R.; A novel remote sensing algorithm to quantify phycocyanin in cyanobacterial algal blooms, Env. Res. Lett., 2014, 9 (11), DOI:10.1088/1748-9326/9/11/114004"
 BGA_SI052BDA,Phycocyanin (BGA_PC),w709 / w620,"Simis, S. G. H.; Peters, S.W. M.; Gons, H. J.; Remote sensing of the cyanobacteria pigment phycocyanin in turbid inland water. Limn. Oceanogr., 2005, 50, 237<33><37>?45"
 BGA_SM122BDA,Phycocyanin (BGA_PC),w709 / w600,"Mishra, S. Remote sensing of cyanobacteria in turbid productive waters, PhD Dissertation. Mississippi State University, USA. 2012."
 BGA_SY002BDA,Phycocyanin (BGA_PC),w650 / w625,"Schalles, J.; Yacobi, Y. Remote detection and seasonal patterns of phycocyanin, carotenoid and chlorophyll-a pigments in eutrophic waters. Archiv fur Hydrobiologie, Special Issues Advances in Limnology, 2000, 55,153<35><33>?68"
 BGA_Wy08CI,Phycocyanin (BGA_PC),(-1 * (W686 - W672 - (W715 - W672))),"Wynne, T. T., Stumpf, R. P., Tomlinson, M. C., Warner, R. A., Tester, P. A., Dyble, J.; Relating spectral shape to cyanobacterial blooms in the Laurentian Great Lakes. Int. J. Remote Sens., 2008, 29, 3665-3672."
 Chl_Al10SABI,chlorophyll_a,(w857 - w644) / (w458 + w529),"Alawadi, F. Detection of surface algal blooms using the newly developed algorithm surface algal bloom index (SABI). Proc. SPIE 2010, 7825."
 Chl_Am092Bsub,chlorophyll_a,w681 - w665,"Amin, R.; Zhou, J.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S. Novel optical techniques for detecting and classifying toxic dinoflagellate Karenia brevis blooms using satellite imagery. Opt. Express 2009, 17, 9126<32><36>?144."
 Chl_Be16FLHblue,chlorophyll_a,w529 - (w644 + (w458 - w644)),"Beck, R.A. and 22 others; Comparison of satellite reflectance algorithms for estimating chlorophyll-a in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations, Remote Sens. Environ., 2016, 178, 15-30."
 Chl_Be16FLHviolet,chlorophyll_a,w529 - (w644 + (w429 - w644)),"Beck, R.A. and 22 others; Comparison of satellite reflectance algorithms for estimating chlorophyll-a in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations, Remote Sens. Environ., 2016, 178, 15-30."
 Chl_Be16NDTIblue,chlorophyll_a,(w658 - w458) / (w658 + w458),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 543."
 Chl_Be16NDTIviolet,chlorophyll_a,(w658 - w444) / (w658 + w444),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 544."
 Chl_De933BDA,chlorophyll_a,w600 - w648 - w625,"Dekker, A.; Detection of the optical water quality parameters for eutrophic waters by high resolution remote sensing, Ph.D. thesis, 1993, Free University, Amsterdam."
 Chl_Gi033BDA,chlorophyll_a,((1 / w672) - (1 / w715)) * w757,"Gitelson, A.A.; U. Gritz, and M. N. Merzlyak.; Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Phys. 2003, 160, 271-282."
 Chl_Kn07KIVU,chlorophyll_a,(w458 - w644) / w529,"Kneubuhler, M.; Frank T.; Kellenberger, T.W; Pasche N.; Schmid M.; Mapping chlorophyll-a in Lake Kivu with remote sensing methods. 2007, Proceedings of the Envisat Symposium 2007, Montreux, Switzerland 23<32><33>?7 April 2007 (ESA SP-636, July 2007)."
 Chl_MM12NDCI,chlorophyll_a,(w715 - w686) / (w715 + w686),"Mishra, S.; and Mishra, D.R. Normalized difference chlorophyll index: A novel model for remote estimation of chlorophyll-a concentration in turbid productive waters, Remote Sens. Environ., 2012, 117, 394-406"
 Chl_Zh10FLH,chlorophyll_a,w686 - (w715 + (w672 - w751)),"Zhao, D.Z.; Xing, X.G.; Liu, Y.G.; Yang, J.H.; Wang, L. The relation of chlorophyll-a concentration with the reflectance peak near 700 nm in algae-dominated waters and sensitivity of fluorescence algorithms for detecting algal bloom. Int. J. Remote Sens. 2010, 31, 39-48"
 Turb_Be16GreenPlusRedBothOverViolet,Turbidity,(w558 + w658) / w444,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538"
 Turb_Be16RedOverViolet,Turbidity,w658 / w444,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539"
 Turb_Bow06RedOverGreen,Turbidity,w658 / w558,"Bowers, D. G., and C. E. Binding. 2006. 闁炽儲缈籬e Optical Properties of Mineral Suspended Particles: A Review and Synthesis.<2E><>?Estuarine Coastal and Shelf Science 67 (1<><31>?): 219<31><39>?30. doi:10.1016/j.ecss.2005.11.010"
 Turb_Chip09NIROverGreen,Turbidity,w857 / w558,"Chipman, J. W.; Olmanson, L.G.; Gitelson, A.A.; Remote sensing methods for lake management: A guide for resource managers and decision-makers. 2009."
 Turb_Dox02NIRoverRed,Turbidity,w857 / w658,"Doxaran, D., Froidefond, J.-M.; Castaing, P. ; A reflectance band ratio used to estimate suspended matter concentrations in sediment-dominated coastal waters, Remote Sens., 2002, 23, 5079-5085"
 Turb_Frohn09GreenPlusRedBothOverBlue,Turbidity,(w558 + w658) / w458,"Frohn, R. C., & Autrey, B. C. (2009). Water quality assessment in the Ohio River using new indices for turbidity and chlorophyll-a with Landsat-7 Imagery. Draft Internal Report, US Environmental Protection Agency."
 Turb_Harr92NIR,Turbidity,w857,"Schiebe F.R., Harrington J.A., Ritchie J.C. Remote-Sensing of Suspended Sediments闁炽儲鏁刪e Lake Chicot, Arkansas Project. Int. J. Remote Sens. 1992;13:1487<38><37>?509"
 Turb_Lath91RedOverBlue,Turbidity,w658 / w458,"Lathrop, R. G., Jr., T. M. Lillesand, and B. S. Yandell, 1991. Testing the utility of simple multi-date Thematic Mapper calibration algorithms for monitoring turbid inland waters. International Journal of Remote Sensing"
 Turb_Moore80Red,Turbidity,w658,"Moore, G.K., Satellite remote sensing of water turbidity, Hydrological Sciences, 1980, 25, 4, 407-422"
--- a/data/sub/waterindex.xlsx
+++ b/data/sub/waterindex.xlsx
--- a/data/sub/waterindex1125.csv
+++ b/data/sub/waterindex1125.csv
@ -1,46 +0,0 @@
 Formula_Name,Category,Formula,Reference
 BGA_Am09KBBI,Phycocyanin (BGA_PC),(w686 - w658) / (w686 + w658),"Amin, R.; Zhou, J.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S.; Novel optical techniques for detecting and classifying toxic dinoflagellate Karenia brevis blooms using satellite imagery, Optics Express, 2009, 17, 11, 1-13."
 BGA_Be162B643sub629,Phycocyanin (BGA_PC),w644 - w629,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be162B700sub601,Phycocyanin (BGA_PC),w700 - w601,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be162BsubPhy,Phycocyanin (BGA_PC),w715 - w615,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 540."
 BGA_Be16FLHBlueRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w458 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be16FLHGreenRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w558 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be16FLHVioletRedNIR,Phycocyanin (BGA_PC),w658 - (w857 + (w444 - w857)),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538."
 BGA_Be16MPI,Phycocyanin (BGA_PC),(w615 - w601) - (w644 - w601),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539."
 BGA_Be16NDPhyI,Phycocyanin (BGA_PC),(w700 - w622) / (w700 + w622),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 540."
 BGA_Be16NDPhyI644over615,Phycocyanin (BGA_PC),(w644 - w615) / (w644 + w615),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 541."
 BGA_Be16NDPhyI644over629,Phycocyanin (BGA_PC),(w644 - w629) / (w644 + w629),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 542."
 BGA_Be16Phy2BDA644over629,Phycocyanin (BGA_PC),w644 / w629,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 545."
 BGA_Da052BDA,Phycocyanin (BGA_PC),w714 / w672,"Wynne, T. T., Stumpf, R. P., Tomlinson, M. C., Warner, R. A., Tester, P. A., Dyble, J.; Relating spectral shape to cyanobacterial blooms in the Laurentian Great Lakes. Int. J. Remote Sens., 2008, 29, 3665-3672."
 BGA_Go04MCI,Phycocyanin (BGA_PC),w709 - w681 - (w753 - w681),"Gower, J.F.R.; Brown, L.; Borstad, G.A.; Observation of chlorophyll fluorescence in west coast waters of Canada using the MODIS satellite sensor. Can. J. Remote Sens., 2004, 30 (1), 17-25."
 BGA_HU103BDA,Phycocyanin (BGA_PC),(((1 / w615) - (1 / w600)) - w725),"Hunter, P.D.; Tyler, A.N.; Willby, N.J.; Gilvear, D.J.; The spatial dynamics of vertical migration by Microcystis aeruginosa in a eutrophic shallow lake: A case study using high spatial resolution time-series airborne remote sensing. Limn. Oceanogr. 2008, 53, 2391-2406"
 BGA_Ku15PhyCI,Phycocyanin (BGA_PC),-1 * (W681 - W665 - (W709 - W665)),"Kudela, R.M., Palacios, S.L., Austerberry, D.C., Accorsi, E.K., Guild, L.S.; Application of hyperspectral remote sensing to cyanobacterial blooms in inland waters, Torres-Perez, J., 2015, Remote Sens. Environ., 2015, 167, 1-10."
 BGA_Ku15SLH,Phycocyanin (BGA_PC),(w715 - w658) + (w715 - w658),"Kudela, R.M., Palacios, S.L., Austerberry, D.C., Accorsi, E.K., Guild, L.S.; Application of hyperspectral remote sensing to cyanobacterial blooms in inland waters, Torres-Perez, J., 2015, Remote Sens. Environ., 2015, 167, 1-11."
 BGA_MI092BDA,Phycocyanin (BGA_PC),w700 / w600,"Mishra, S.; Mishra, D.R.; Schluchter, W. M., A novel algorithm for predicting PC concentrations in cyanobacteria: A proximal hyperspectral remote sensing approach. Remote Sens., 2009, 1, 758-775."
 BGA_MM092BDA,Phycocyanin (BGA_PC),w724 / w600,"Mishra, S.; Mishra, D.R.; Schluchter, W. M., A novel algorithm for predicting PC concentrations in cyanobacteria: A proximal hyperspectral remote sensing approach. Remote Sens., 2009, 1, 758-775."
 BGA_MM12NDCIalt,Phycocyanin (BGA_PC),(w700 - w658) / (w700 + w658),"Mishra, S.; Mishra, D.R.; A novel remote sensing algorithm to quantify phycocyanin in cyanobacterial algal blooms, Env. Res. Lett., 2014, 9 (11), DOI:10.1088/1748-9326/9/11/114003"
 BGA_MM143BDAopt,Phycocyanin (BGA_PC),((1 / w629) - (1 / w659)) * w724,"Mishra, S.; Mishra, D.R.; A novel remote sensing algorithm to quantify phycocyanin in cyanobacterial algal blooms, Env. Res. Lett., 2014, 9 (11), DOI:10.1088/1748-9326/9/11/114004"
 BGA_SI052BDA,Phycocyanin (BGA_PC),w709 / w620,"Simis, S. G. H.; Peters, S.W. M.; Gons, H. J.; Remote sensing of the cyanobacteria pigment phycocyanin in turbid inland water. Limn. Oceanogr., 2005, 50, 237-245"
 BGA_SM122BDA,Phycocyanin (BGA_PC),w709 / w600,"Mishra, S. Remote sensing of cyanobacteria in turbid productive waters, PhD Dissertation. Mississippi State University, USA. 2012."
 BGA_SY002BDA,Phycocyanin (BGA_PC),w650 / w625,"Schalles, J.; Yacobi, Y. Remote detection and seasonal patterns of phycocyanin, carotenoid and chlorophyll-a pigments in eutrophic waters. Archiv fur Hydrobiologie, Special Issues Advances in Limnology, 2000, 55, 153-168"
 BGA_Wy08CI,Phycocyanin (BGA_PC),-1 * (W686 - W672 - (W715 - W672)),"Wynne, T. T., Stumpf, R. P., Tomlinson, M. C., Warner, R. A., Tester, P. A., Dyble, J.; Relating spectral shape to cyanobacterial blooms in the Laurentian Great Lakes. Int. J. Remote Sens., 2008, 29, 3665-3672."
 Chl_Al10SABI,chlorophyll_a,(w857 - w644) / (w458 + w529),"Alawadi, F. Detection of surface algal blooms using the newly developed algorithm surface algal bloom index (SABI). Proc. SPIE 2010, 7825."
 Chl_Am092Bsub,chlorophyll_a,w681 - w665,"Amin, R.; Zhou, J.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S. Novel optical techniques for detecting and classifying toxic dinoflagellate Karenia brevis blooms using satellite imagery. Opt. Express 2009, 17, 9126-9144."
 Chl_Be16FLHblue,chlorophyll_a,w529 - (w644 + (w458 - w644)),"Beck, R.A. and 22 others; Comparison of satellite reflectance algorithms for estimating chlorophyll-a in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations, Remote Sens. Environ., 2016, 178, 15-30."
 Chl_Be16FLHviolet,chlorophyll_a,w529 - (w644 + (w429 - w644)),"Beck, R.A. and 22 others; Comparison of satellite reflectance algorithms for estimating chlorophyll-a in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations, Remote Sens. Environ., 2016, 178, 15-30."
 Chl_Be16NDTIblue,chlorophyll_a,(w658 - w458) / (w658 + w458),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 543."
 Chl_Be16NDTIviolet,chlorophyll_a,(w658 - w444) / (w658 + w444),"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 544."
 Chl_De933BDA,chlorophyll_a,w600 - w648 - w625,"Dekker, A.; Detection of the optical water quality parameters for eutrophic waters by high resolution remote sensing, Ph.D. thesis, 1993, Free University, Amsterdam."
 Chl_Gi033BDA,chlorophyll_a,((1 / w672) - (1 / w715)) * w757,"Gitelson, A.A.; U. Gritz, and M. N. Merzlyak.; Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Phys. 2003, 160, 271-282."
 Chl_Kn07KIVU,chlorophyll_a,(w458 - w644) / w529,"Kneubuhler, M.; Frank T.; Kellenberger, T.W; Pasche N.; Schmid M.; Mapping chlorophyll-a in Lake Kivu with remote sensing methods. 2007, Proceedings of the Envisat Symposium 2007, Montreux, Switzerland 23-27 April 2007 (ESA SP-636, July 2007)."
 Chl_MM12NDCI,chlorophyll_a,(w715 - w686) / (w715 + w686),"Mishra, S.; and Mishra, D.R. Normalized difference chlorophyll index: A novel model for remote estimation of chlorophyll-a concentration in turbid productive waters, Remote Sens. Environ., 2012, 117, 394-406"
 Chl_Zh10FLH,chlorophyll_a,w686 - (w715 + (w672 - w751)),"Zhao, D.Z.; Xing, X.G.; Liu, Y.G.; Yang, J.H.; Wang, L. The relation of chlorophyll-a concentration with the reflectance peak near 700 nm in algae-dominated waters and sensitivity of fluorescence algorithms for detecting algal bloom. Int. J. Remote Sens. 2010, 31, 39-48"
 Turb_Be16GreenPlusRedBothOverViolet,Turbidity,(w558 + w658) / w444,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 538"
 Turb_Be16RedOverViolet,Turbidity,w658 / w444,"Beck, R.; Xu, M.; Zhan, S.; Liu, H.; Johansen, R.A.; Tong, S.; Yang, B.; Shu, S.; Wu, Q.; Wang, S.; Berling, K.; Murray, A.; Emery, E.; Reif, M.; Harwood, J.; Young, J.; Martin, M.; Stillings, G.; Stumpf, R.; Su, H.; Ye, Z.; Huang, Y. Comparison of Satellite Reflectance Algorithms for Estimating Phycocyanin Values and Cyanobacterial Total Biovolume in a Temperate Reservoir Using Coincident Hyperspectral Aircraft Imagery and Dense Coincident Surface Observations. Remote Sens. 2017, 9, 539"
 Turb_Bow06RedOverGreen,Turbidity,w658 / w558,"Bowers, D. G., and C. E. Binding. 2006. “The Optical Properties of Mineral Suspended Particles: A Review and Synthesis.” Estuarine Coastal and Shelf Science 67 (1-2): 219-230. doi:10.1016/j.ecss.2005.11.010"
 Turb_Chip09NIROverGreen,Turbidity,w857 / w558,"Chipman, J. W.; Olmanson, L.G.; Gitelson, A.A.; Remote sensing methods for lake management: A guide for resource managers and decision-makers. 2009."
 Turb_Dox02NIRoverRed,Turbidity,w857 / w658,"Doxaran, D., Froidefond, J.-M.; Castaing, P. ; A reflectance band ratio used to estimate suspended matter concentrations in sediment-dominated coastal waters, Remote Sens., 2002, 23, 5079-5085"
 Turb_Frohn09GreenPlusRedBothOverBlue,Turbidity,(w558 + w658) / w458,"Frohn, R. C., & Autrey, B. C. (2009). Water quality assessment in the Ohio River using new indices for turbidity and chlorophyll-a with Landsat-7 Imagery. Draft Internal Report, US Environmental Protection Agency."
 Turb_Harr92NIR,Turbidity,w857,"Schiebe F.R., Harrington J.A., Ritchie J.C. Remote-Sensing of Suspended Sediments - the Lake Chicot, Arkansas Project. Int. J. Remote Sens. 1992;13:1487-1509"
 Turb_Lath91RedOverBlue,Turbidity,w658 / w458,"Lathrop, R. G., Jr., T. M. Lillesand, and B. S. Yandell, 1991. Testing the utility of simple multi-date Thematic Mapper calibration algorithms for monitoring turbid inland waters. International Journal of Remote Sensing"
 Turb_Moore80Red,Turbidity,w658,"Moore, G.K., Satellite remote sensing of water turbidity, Hydrological Sciences, 1980, 25, 4, 407-422"
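
Note on the formula column of the table above: each `wNNN` token appears to name the reflectance of the band at roughly NNN nm, and the table mixes lowercase `w` with uppercase `W`. The sketch below is one hypothetical way to evaluate such a formula for a single spectrum. The helper name `evaluate_index` and the nearest-band matching rule are assumptions for illustration, not code from this repository.

```python
import re

# Matches band tokens such as w715 or W686 (the table mixes both cases).
_BAND_TOKEN = re.compile(r"[wW](\d+)")


def evaluate_index(formula, wavelengths, reflectance):
    """Evaluate one index formula for a single spectrum.

    Hypothetical helper: each wNNN token is replaced by the reflectance of
    the band whose centre wavelength is closest to NNN nm (an assumption).
    """
    bound = {}

    def _bind(match):
        target_nm = float(match.group(1))
        # Nearest-band lookup by absolute wavelength difference.
        idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
        name = f"b{idx}"
        bound[name] = float(reflectance[idx])
        return name

    expr = _BAND_TOKEN.sub(_bind, formula)
    # eval with builtins disabled; only acceptable for a trusted formula table.
    return eval(expr, {"__builtins__": {}}, bound)


# Made-up three-band spectrum, used only to exercise the helper.
wavelengths = [672.0, 686.0, 715.0]
reflectance = [0.030, 0.020, 0.040]
ndci = evaluate_index("(w715 - w686) / (w715 + w686)", wavelengths, reflectance)
print(f"NDCI = {ndci:.4f}")
```

Binding tokens to generated names rather than pasting numbers into the string keeps negative or exponent-formatted reflectances from changing how the expression parses.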
--- a/new/app/api/_smoke_test_train.py
+++ b/new/app/api/_smoke_test_train.py
@ -1,201 +0,0 @@
 """
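 Smoke test for _run_train_sync: runs the real training pipeline end to end on synthetic data.
 No FastAPI app is started and no xarray / dask inference runs; only training, persistence, and reload are checked.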
 """
 import sys
 import tempfile
 from pathlib import Path
 import numpy as np
 import pandas as pd
 # Bypass main.py so that only the modeling module is imported from the app package.
 # This file lives at new/app/api/_smoke_test_train.py.
 # The app package is at new/app/__init__.py, so new/ must be on sys.path.
 sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 from app.api.modeling import (
    _get_model_pipeline,
    _load_train_df,
    _resolve_feature_start,
    _run_train_sync,
    _MODEL_CLASS_REGISTRY,
 )
 def make_synthetic_csv(n_samples: int = 200, n_features: int = 8, noise: float = 0.1, seed: int = 42) -> Path:
    """Write a CSV laid out as [lat, lon, target, lat2, lon2, feat_0, feat_1, ...]."""
    rng = np.random.default_rng(seed)
    lat = rng.uniform(20, 25, n_samples)
    lon = rng.uniform(110, 115, n_samples)
    target = rng.uniform(0, 50, n_samples)
    lat2 = rng.uniform(0, 1, n_samples)  # metadata column
    lon2 = rng.uniform(0, 1, n_samples)  # metadata column
    feats = rng.normal(0, 1, (n_samples, n_features))
    # Tie the first 3 features to the target so that RF can reach at least R² > 0.5
    feats[:, 0] += target / 10
    feats[:, 1] += target / 20
    feats[:, 2] -= target / 15
    df = pd.DataFrame({
        "lat": lat,
        "lon": lon,
        "Chl-a": target,
        "lat2": lat2,
        "lon2": lon2,
        **{f"feat_{i}": feats[:, i] for i in range(n_features)},
    })
    tmp = Path(tempfile.mkdtemp()) / "train.csv"
    df.to_csv(tmp, index=False)
    return tmp
 def test_load_train_df():
    print("== test_load_train_df ==")
    p = make_synthetic_csv(n_samples=50)
    df = _load_train_df(str(p))
    assert df.shape == (50, 5 + 8), f"shape={df.shape}"
    print(f"  shape={df.shape}, columns[:6]={list(df.columns[:6])}")
    print("  PASS")
 def test_resolve_feature_start_int_and_str():
    print("== test_resolve_feature_start (int + str) ==")
    p = make_synthetic_csv()
    df = _load_train_df(str(p))
    idx_int = _resolve_feature_start(df, 5)
    idx_str = _resolve_feature_start(df, "feat_0")
    assert idx_int == 5 == idx_str, f"int={idx_int}, str={idx_str}"
    print(f"  int(5) -> {idx_int}, str('feat_0') -> {idx_str}")
    print("  PASS")
 def test_resolve_feature_start_str_miss():
    print("== test_resolve_feature_start (unknown str -> raises) ==")
    p = make_synthetic_csv()
    df = _load_train_df(str(p))
    try:
        _resolve_feature_start(df, "not_exist")
    except ValueError as e:
        print(f"  raised ValueError as expected: {e}")
        print("  PASS")
    else:
        # Fail loudly instead of printing FAIL and carrying on
        raise AssertionError("expected ValueError for an unknown column name")
 def test_get_model_pipeline_all_types():
    print("== test_get_model_pipeline (5 model types) ==")
    for mt in ["RF", "SVR", "LinearRegression", "KNN", "PLS"]:
        p = _get_model_pipeline(mt, {})
        assert len(p.steps) == 2
        assert p.steps[0][0] == "scaler"
        assert p.steps[1][0] == "model"
    print(f"  all passed: {list(_MODEL_CLASS_REGISTRY)}")
    print("  PASS")
 def test_get_model_pipeline_bad_type():
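    print("== test_get_model_pipeline (bad model_type) ==")
    try:
        _get_model_pipeline("XGBoost", {})
    except ValueError as e:
        print(f"  raised ValueError as expected: {e}")
        print("  PASS")
    else:
        # Fail loudly instead of printing FAIL and carrying on
        raise AssertionError("expected ValueError for an unsupported model_type")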
 def test_run_train_sync_rf_end_to_end():
    print("== test_run_train_sync (RF end to end) ==")
    p = make_synthetic_csv(n_samples=200)
    out_dir = Path(tempfile.mkdtemp())
    out_path = out_dir / "model.joblib"
    import time
    t0 = time.time()
    metadata = _run_train_sync(
        model_type="RF",
        target="Chl-a",
        train_data_path=str(p),
        feature_start=5,
        params={"n_estimators": 30, "max_depth": 6, "random_state": 42, "n_jobs": 1},
        output_model_path=out_path,
    )
    dt = time.time() - t0
    assert out_path.exists(), f"joblib file was not written: {out_path}"
    print(f"  joblib written: {out_path} ({out_path.stat().st_size} bytes)")
    print(f"  metadata.test_r2={metadata['test_r2']:.4f} test_rmse={metadata['test_rmse']:.4f} test_mae={metadata['test_mae']:.4f}")
    print(f"  metadata.n_features={metadata['n_features']} n_samples={metadata['n_samples']} train_size={metadata['train_size']} test_size={metadata['test_size']}")
    print(f"  elapsed {dt:.2f}s")
    # Round trip: reload the joblib file and check that the payload is usable
    import joblib
    saved = joblib.load(out_path)
    assert "model" in saved and "metadata" in saved, f"joblib payload is missing 'model' or 'metadata': {saved.keys()}"
    assert hasattr(saved["model"], "predict")
    assert saved["metadata"]["test_r2"] == metadata["test_r2"]
    print("  joblib loaded OK with both 'model' and 'metadata' keys")
    print("  PASS")
 def test_run_train_sync_linearregression_fast():
    print("== test_run_train_sync (LinearRegression fast path) ==")
    p = make_synthetic_csv(n_samples=150)
    out_path = Path(tempfile.mkdtemp()) / "lr.joblib"
    metadata = _run_train_sync(
        model_type="LinearRegression",
        target="Chl-a",
        train_data_path=str(p),
        feature_start=5,
        params={},
        output_model_path=out_path,
    )
    print(f"  test_r2={metadata['test_r2']:.4f} (LR should capture the linear signal; R² must exceed 0.3)")
    assert metadata["test_r2"] > 0.3, f"LR test_r2={metadata['test_r2']} is too low; the synthetic data may be wrong"
    print("  PASS")
 def test_run_train_sync_bad_csv():
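    print("== test_run_train_sync (CSV does not exist) ==")
    try:
        _run_train_sync("RF", "Chl-a", "/no/such/path.csv", 5, {}, Path("/tmp/x.joblib"))
    except (FileNotFoundError, ValueError) as e:
        print(f"  raised {type(e).__name__} as expected: {e}")
        print("  PASS")
    else:
        # Fail loudly instead of printing FAIL and carrying on
        raise AssertionError("expected an exception for a missing CSV")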
 def test_run_train_sync_bad_target():
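    print("== test_run_train_sync (target column does not exist) ==")
    p = make_synthetic_csv()
    try:
        _run_train_sync("RF", "NopeTarget", str(p), 5, {}, Path("/tmp/x.joblib"))
    except ValueError as e:
        print(f"  raised ValueError as expected: {e}")
        print("  PASS")
    else:
        # Fail loudly instead of printing FAIL and carrying on
        raise AssertionError("expected ValueError for an unknown target column")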
 def test_run_train_sync_str_feature_start():
    print("== test_run_train_sync (feature_start given as a column name) ==")
    p = make_synthetic_csv()
    out_path = Path(tempfile.mkdtemp()) / "str_fs.joblib"
    metadata = _run_train_sync("RF", "Chl-a", str(p), "feat_0", {"n_estimators": 10}, out_path)
    assert metadata["feature_start"] == "feat_0"
    assert metadata["n_features"] == 8
    assert metadata["feature_columns"][0] == "feat_0"
    print(f"  column name 'feat_0' resolved correctly, n_features={metadata['n_features']}")
    print("  PASS")
 if __name__ == "__main__":
    test_load_train_df()
    test_resolve_feature_start_int_and_str()
    test_resolve_feature_start_str_miss()
    test_get_model_pipeline_all_types()
    test_get_model_pipeline_bad_type()
    test_run_train_sync_rf_end_to_end()
    test_run_train_sync_linearregression_fast()
    test_run_train_sync_bad_csv()
    test_run_train_sync_bad_target()
    test_run_train_sync_str_feature_start()
    print("\n>>> ALL SMOKE TESTS PASSED")
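
Note on the negative tests in the script above: a "should raise" check is only useful if it fails loudly when nothing is raised. The sketch below shows one way to package that pattern with the standard library alone. The name `expect_raises` is hypothetical and does not exist in this repository.

```python
from contextlib import contextmanager


@contextmanager
def expect_raises(*exc_types):
    """Pass only if the with-body raises one of exc_types (hypothetical helper)."""
    try:
        yield
    except exc_types as exc:
        # Expected failure: report it and swallow the exception.
        print(f"  raised {type(exc).__name__} as expected: {exc}")
    else:
        # Nothing was raised, so the check itself must fail.
        names = ", ".join(t.__name__ for t in exc_types)
        raise AssertionError(f"expected one of: {names}")


# Usage: the body raises ValueError, so the block completes normally.
with expect_raises(ValueError):
    int("not a number")
```

An exception of any other type still propagates unchanged, so an unexpected error is not mistaken for a pass.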
--- a/new/app/api/endpoints.py
+++ b/new/app/api/endpoints.py
@ -1,222 +0,0 @@
 """
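 API routes
 ==========
 Collects the business endpoints on a single APIRouter, which main.py mounts via include_router.
 Endpoints currently provided:
    GET  /api/algorithms              list all registered deglint algorithms (for the frontend dropdown)
    POST /api/process/deglint         submit a deglint task; returns a task_id immediately
    GET  /api/tasks/{task_id}         query the status and result of a task
 Dispatch chain:
    POST /api/process/deglint
        └─ BackgroundTasks.add_task(execute_glint_removal_task, ...)
            └─ get_remover(method) looks up the algorithm class in the registry
                └─ remover.process(input_zarr, output_zarr, **params)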
 """
 import traceback
 import uuid
 from datetime import datetime
 from typing import Any, Dict
 from fastapi import APIRouter, BackgroundTasks, HTTPException
 from pydantic import BaseModel, Field
 # Concurrency-safe task state store (replaces the old MOCK_TASK_DB)
 from app.core.task_store import get_task, set_task, update_task
 # Algorithm registry API
 from app.core.algorithms import get_remover, list_removers
 # ---------------------------------------------------------------------------
 # Router instance
 # ---------------------------------------------------------------------------
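 # The prefix is not set here; main.py supplies it at mount time, which makes a later split by version easy
 # (for example, reusing the same router object while /api/v1 and /api/v2 coexist).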
 # ---------------------------------------------------------------------------
 router = APIRouter(tags=["deglint"])
 # ---------------------------------------------------------------------------
 # Request / response models
 # ---------------------------------------------------------------------------
 class DeglintRequest(BaseModel):
    """Request body for POST /api/process/deglint."""
    method: str = Field(
        ...,
        description="Deglint method name; must be a registered algorithm, e.g. 'kutser' / 'goodman'",
        examples=["kutser"],
    )
    params: Dict[str, Any] = Field(
        default_factory=dict,
        description=(
            "Hyperparameters passed to the algorithm's process(), e.g. "
            "Kutser:  {'band_lower': 773, 'band_oxy': 845, 'band_upper': 893}; "
            "Goodman: {'band_ref': 750, 'band_diff': 640, 'A': 0.0, 'B': 0.0}"
        ),
        examples=[{"band_lower": 773, "band_oxy": 845, "band_upper": 893}],
    )
 class TaskAcceptedResponse(BaseModel):
    """Response returned immediately after a task is accepted."""
    task_id: str
    status: str  # always PENDING
 class AlgorithmListResponse(BaseModel):
    """Response for GET /api/algorithms."""
    algorithms: list  # names of the registered algorithms
    count: int  # number of algorithms
 # ---------------------------------------------------------------------------
 # Background task executor (the real dispatch chain)
 # ---------------------------------------------------------------------------
 # Note: this is an async def.
 # FastAPI / Starlette BackgroundTasks accepts async functions
 # and awaits them after the response has been sent, so the main request path is unaffected.
 # ---------------------------------------------------------------------------
 async def execute_glint_removal_task(
    task_id: str,
    method: str,
    params: Dict[str, Any],
 ) -> None:
    """
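    Background executor: looks up the algorithm class by method name in the registry, instantiates it, and runs process().
    State machine:
        PENDING -> PROCESSING -> SUCCESS
                          └──> FAILED (with error / traceback)
    """
    # 0. Sanity check: the task record must already exist (written by the POST handler)
    record = await get_task(task_id)
    if record is None:
        print(f"[{task_id}] task not found, skipping")
        return
    # 1. Advance the status to PROCESSING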
    await update_task(
        task_id,
        status="PROCESSING",
        updated_at=datetime.now().isoformat(),
    )
    print(f"[{task_id}] started method={method} params={params}")
    # 2. Temporary hard-coded IO paths (to be supplied by the data management layer)
    #    TODO: replace with the zarr paths returned by the real data management service
    input_zarr_path = "./data/temp_in.zarr"
    output_zarr_path = f"./data/{task_id}_out.zarr"
    try:
        # 3. Look up the algorithm class by method name and instantiate it
        #    get_remover raises KeyError for an unknown name; the except below catches it
        algorithm_cls = get_remover(method)
        remover = algorithm_cls()
        # 4. Run the algorithm (await is required because BaseGlintRemover.process is async)
        await remover.process(input_zarr_path, output_zarr_path, **params)
        # 5. Success: write back the result path and status
        await update_task(
            task_id,
            status="SUCCESS",
            output_zarr_path=output_zarr_path,
            error=None,
            updated_at=datetime.now().isoformat(),
        )
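        print(f"[{task_id}] finished -> SUCCESS, output={output_zarr_path}")
    except Exception as exc:  # noqa: BLE001  top-level catch-all so a background task never fails silently
        # 6. Failure: record the error message and stack trace so the frontend can diagnose it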
        await update_task(
            task_id,
            status="FAILED",
            output_zarr_path=None,
            error=f"{type(exc).__name__}: {exc}",
            traceback=traceback.format_exc(),
            updated_at=datetime.now().isoformat(),
        )
        print(f"[{task_id}] failed -> {type(exc).__name__}: {exc}")
 # ---------------------------------------------------------------------------
 # GET /algorithms
 # ---------------------------------------------------------------------------
 # Returns the names of all registered algorithms so the frontend can render its dropdown / selector dynamically.
 # ---------------------------------------------------------------------------
@router.get("/algorithms", response_model=AlgorithmListResponse)
 async def list_registered_algorithms() -> Dict[str, Any]:
    """List the registered deglint algorithms."""
    names = list(list_removers().keys())
    return {"algorithms": names, "count": len(names)}
 # ---------------------------------------------------------------------------
 # POST /process/deglint
 # ---------------------------------------------------------------------------
 # Submits a deglint task. FastAPI sends the response only after the handler returns,
 # so the slow work is handed to BackgroundTasks and the endpoint returns the task_id right away.
 # ---------------------------------------------------------------------------
@router.post("/process/deglint", response_model=TaskAcceptedResponse)
 async def submit_deglint(
    payload: DeglintRequest,
    background_tasks: BackgroundTasks,
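 ) -> Dict[str, Any]:
    """Submit a deglint task and return its task_id immediately."""
    # 1. Generate a unique task ID (a UUID4 collision is negligibly unlikely)
    task_id = str(uuid.uuid4())
    # 2. Register a PENDING record in the task store (concurrency-safe)
    #    Note: output_zarr_path / error / traceback are filled in during execution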
    await set_task(
        task_id,
        {
            "task_id": task_id,
            "method": payload.method,
            "params": payload.params,
            "status": "PENDING",
            "output_zarr_path": None,
            "error": None,
            "traceback": None,
            "created_at": datetime.now().isoformat(),
            "updated_at": datetime.now().isoformat(),
        },
    )
    # 3. Hand the real executor to the background
    background_tasks.add_task(
        execute_glint_removal_task,
        task_id,
        payload.method,
        payload.params,
    )
    # 4. Return the task_id and the PENDING status immediately
    return {"task_id": task_id, "status": "PENDING"}
 # ---------------------------------------------------------------------------
 # GET /tasks/{task_id}
 # ---------------------------------------------------------------------------
 # The frontend polls this endpoint for task status. PENDING / PROCESSING mean the task is still running,
 # SUCCESS means it finished (with output_zarr_path), and FAILED means it failed (with error / traceback).
 # ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
 async def get_task_status(task_id: str) -> Dict[str, Any]:
    """Return the current status and result of a task."""
    record = await get_task(task_id)
    if record is None:
        # An unknown task_id usually means the client mistyped the ID or the record has been purged
        raise HTTPException(status_code=404, detail=f"task_id not found: {task_id}")
    # Return the dict as-is; FastAPI serializes it to JSON
    return record
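
Note on the task store used by the endpoints above: `app.core.task_store` is not part of this excerpt, so its real implementation is unknown. The sketch below is a minimal, hypothetical stand-in that shows how `get_task` / `set_task` / `update_task` could be made safe for concurrent coroutines with one `asyncio.Lock`, and how a record moves through PENDING -> PROCESSING -> SUCCESS. The class name `InMemoryTaskStore` and the method layout are assumptions.

```python
import asyncio
from typing import Any, Dict, Optional


class InMemoryTaskStore:
    """Hypothetical in-memory task store guarded by a single asyncio.Lock."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Dict[str, Any]] = {}
        self._lock = asyncio.Lock()

    async def set_task(self, task_id: str, record: Dict[str, Any]) -> None:
        async with self._lock:
            # Store a copy so later caller-side mutation cannot leak in.
            self._tasks[task_id] = dict(record)

    async def get_task(self, task_id: str) -> Optional[Dict[str, Any]]:
        async with self._lock:
            record = self._tasks.get(task_id)
            # Return a copy so readers never see a half-applied update.
            return dict(record) if record is not None else None

    async def update_task(self, task_id: str, **fields: Any) -> None:
        async with self._lock:
            if task_id not in self._tasks:
                raise KeyError(task_id)
            self._tasks[task_id].update(fields)


async def demo() -> Optional[Dict[str, Any]]:
    # The store is created inside the running loop on purpose.
    store = InMemoryTaskStore()
    await store.set_task("t1", {"task_id": "t1", "status": "PENDING"})
    await store.update_task("t1", status="PROCESSING")
    await store.update_task("t1", status="SUCCESS", output_zarr_path="./data/t1_out.zarr")
    return await store.get_task("t1")


result = asyncio.run(demo())
print(result)
```

A store like this lives in one process only, so records vanish on restart and are not shared between worker processes.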
--- a/new/app/api/modeling.py
+++ b/new/app/api/modeling.py
@ -1,847 +0,0 @@
 """
 app/api/modeling.py
 ===================
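 API routes for machine learning and water-quality retrieval.
 Endpoints (final paths after mounting):
    POST /api/modeling/train      submit a training task; returns a task_id immediately
    GET  /api/modeling/models     list trained models (to be read from joblib files on disk in the future)
    POST /api/modeling/predict    submit an inference task; returns a task_id immediately
 Design notes
 ------------
 - Training and inference both run as async background tasks and reuse the concurrency-safe task state in app.core.task_store.
 - Model metadata is held in the module-level _MODEL_REGISTRY (in-memory storage for development);
   switching to joblib files on disk later only requires replacing the body of list_trained_models().
 - /predict runs real sklearn + xarray + dask streaming inference:
    * joblib.load reads the model (falls back to a dummy RandomForestRegressor when the file is missing)
    * xr.open_zarr opens the image lazily; NaN is filled with 0
    * xr.apply_ufunc(dask="parallelized") calls model.predict chunk by chunk over (y, x)
    * to_zarr(mode="w", compute=True) streams the output, so peak memory ≈ 1 chunk
 - /train runs a real sklearn + pandas training pipeline:
    * pd.read_csv reads the structured training table (layout [lat, lon, target_*, feature_*])
    * rows with a missing target are dropped; features are sliced from feature_start (index or column name)
    * sklearn Pipeline: StandardScaler -> {RF/SVR/LinearRegression/KNN/PLS}
    * train_test_split(80/20), then test_r2/rmse/mae are computed
    * joblib.dump({model, metadata}) writes ./data/models/{model_id}.joblib
    * test metrics are written back to TASK_STORE and the model is registered in _MODEL_REGISTRY
   Note: the legacy SPXY / KS splits are left for a future extension; the split is currently fixed to random (test_size=0.2, random_state=42).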
 """
 import asyncio
 import shutil
 import traceback
 import uuid
 from datetime import datetime
 from pathlib import Path
 from typing import Any, Dict, List, Optional, Union
 import joblib
 import numpy as np
 import pandas as pd
 import xarray as xr
 from fastapi import APIRouter, BackgroundTasks, HTTPException, UploadFile, File
 from pydantic import BaseModel, Field
 from sklearn.cross_decomposition import PLSRegression
 from sklearn.ensemble import RandomForestRegressor
 from sklearn.linear_model import LinearRegression
 from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
 from sklearn.model_selection import train_test_split
 from sklearn.neighbors import KNeighborsRegressor
 from sklearn.pipeline import Pipeline
 from sklearn.preprocessing import StandardScaler
 from sklearn.svm import SVR
 # Reuse the concurrency-safe task state store (the same TASK_STORE that deglint uses;
 # the "kind" field of each task record distinguishes train / predict / deglint)
 from app.core.task_store import get_task, set_task, update_task
 # ---------------------------------------------------------------------------
 # Router instance
 # ---------------------------------------------------------------------------
 # prefix="/modeling" lets this file use short paths such as /train, /models, /predict;
 # main.py adds the /api part when it mounts the router.
 # ---------------------------------------------------------------------------
 router = APIRouter(prefix="/modeling", tags=["modeling"])
 # ---------------------------------------------------------------------------
 # Data models
 # ---------------------------------------------------------------------------
 class TrainRequest(BaseModel):
    """Request body for POST /api/modeling/train."""
    model_type: str = Field(
        ...,
        description="Model type; one of 'RF' (random forest) / 'SVR' (support vector regression) / 'LinearRegression' / 'KNN' / 'PLS'",
        examples=["RF", "SVR"],
    )
    target: str = Field(
        ...,
        description="Water-quality parameter to retrieve, e.g. 'Chl-a' (chlorophyll-a) / 'TSS' (total suspended solids) / 'CDOM' (colored dissolved organic matter)",
        examples=["Chl-a", "TSS", "CDOM"],
    )
    train_data_path: str = Field(
        ...,
        description="Path to the training CSV table (contains the target column and the feature columns)",
        examples=["./data/train.csv"],
    )
    feature_start: Union[int, str] = Field(
        default=4,
        description=(
            "Position of the first feature column. The table layout is assumed to be "
            "[lat, lon, target_1, target_2, ..., feature_1, feature_2, ...]. "
            "Accepts an int column index (e.g. 4) or a str column name (e.g. '374.285', the first wavelength). "
            "Defaults to 4: the first 4 columns are metadata/targets and everything after them is a feature."
        ),
        examples=[4, "374.285"],
    )
    params: Dict[str, Any] = Field(
        default_factory=dict,
        description="Model hyperparameters, e.g. {'n_estimators': 100, 'max_depth': 20} for RF",
        examples=[{"n_estimators": 100, "max_depth": 20}],
    )
 class PredictRequest(BaseModel):
    """Request body for POST /api/modeling/predict."""
    model_id: str = Field(
        ...,
        description="ID of a trained model (returned by /api/modeling/train or listed by /api/modeling/models)",
    )
    input_zarr_path: str = Field(
        ...,
        description="zarr path of the image to run inference on",
        examples=["./data/scene.zarr"],
    )
    output_zarr_path: Optional[str] = Field(
        default=None,
        description=(
            "Output zarr path; when omitted the backend generates one "
            "(e.g. ./data/{model_id}_{input_stem}_pred.zarr)"
        ),
    )
 class TaskAcceptedResponse(BaseModel):
    """Response returned immediately after a training / inference task is accepted."""
    task_id: str
    status: str  # always PENDING
    kind: str    # "train" / "predict", so the frontend can tell the task type
 class ModelInfo(BaseModel):
    """Metadata of a single model (an element of GET /api/modeling/models)."""
    model_id: str
    model_type: str
    target: str
    params: Dict[str, Any]
    path: str           # joblib file path
    created_at: str
    train_task_id: str  # ID of the training task that produced this model
 class ModelListResponse(BaseModel):
    """Response for GET /api/modeling/models."""
    models: List[ModelInfo]
    count: int
 # ---------------------------------------------------------------------------
 # Module-level model registry (in memory during development; to be replaced by a disk scan)
 # ---------------------------------------------------------------------------
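 # Maps model_id -> ModelInfo dict.
 # Reads dominate writes, so a plain dict is enough (single-key dict operations are atomic under the CPython GIL).
 # Each model is written exactly once, when its training finishes, so writers do not contend.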
 # ---------------------------------------------------------------------------
 _MODEL_REGISTRY: Dict[str, Dict[str, Any]] = {}
 def _register_model(record: Dict[str, Any]) -> None:
    """Register a trained model in the in-memory registry."""
    _MODEL_REGISTRY[record["model_id"]] = record
 # ---------------------------------------------------------------------------
 # Module-level helpers for the training pipeline
 # ---------------------------------------------------------------------------
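 # Design notes (same as the inference pipeline):
 #   1) Module-level functions: if a dask / joblib backend pickles work to subprocesses, nested closures can lose state.
 #   2) Synchronous: execute_train_task dispatches via asyncio.to_thread, and everything inside blocks synchronously.
 #   3) Failures raise: execute_train_task catches the exception and turns it into FAILED + traceback.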
 # ---------------------------------------------------------------------------
 # model_type (string key, case-sensitive) -> sklearn estimator class
 # Same idea as OpenClaw model_configs, but only the class is kept here (parameters are passed through via params)
 _MODEL_CLASS_REGISTRY: Dict[str, type] = {
    "RF": RandomForestRegressor,
    "SVR": SVR,
    "LinearRegression": LinearRegression,
    "KNN": KNeighborsRegressor,
    "PLS": PLSRegression,
 }
 def _get_model_pipeline(model_type: str, params: Optional[Dict[str, Any]]) -> Pipeline:
    """
    Model factory: picks the sklearn class for model_type and builds a Pipeline of StandardScaler + estimator.
    Unlike OpenClaw, the scaler is the first Pipeline step,
    so inference is just pipeline.predict(X) and the scaler parameters match training exactly.
    """
    model_cls = _MODEL_CLASS_REGISTRY.get(model_type)
    if model_cls is None:
        raise ValueError(
            f"unsupported model_type='{model_type}', "
            f"choose from: {sorted(_MODEL_CLASS_REGISTRY.keys())}"
        )
    estimator = model_cls(**(params or {}))
    return Pipeline([("scaler", StandardScaler()), ("model", estimator)])
 def _load_train_df(csv_path: str) -> pd.DataFrame:
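    """
    Read the training CSV, normalising empty strings / whitespace / NULL markers to NaN.
    Follows the reading strategy of OpenClaw modeling_batch.load_data_batch:
    an explicit na_values list plus a regex pass (catches cells that hold only whitespace, e.g. "  ").
    """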
    try:
        df = pd.read_csv(
            csv_path,
            na_values=["", " ", "NaN", "nan", "NULL", "null"],
        )
    except FileNotFoundError as exc:
        raise FileNotFoundError(f"训练数据文件不存在: {csv_path}") from exc
    except pd.errors.EmptyDataError as exc:
        raise ValueError(f"训练数据文件为空: {csv_path}") from exc
    # Second pass: cells that are still whitespace-only
    df = df.replace(r"^\s*$", np.nan, regex=True)
    return df
 def _resolve_feature_start(
    df: pd.DataFrame,
    feature_start: Union[int, str],
 ) -> int:
    """
    Resolve feature_start (int index or str column name) to an int column index.
    Matches OpenClaw modeling_batch.load_data_batch / load_data_single:
    a str goes through columns.get_loc; an int is returned unchanged.
    """
    if isinstance(feature_start, str):
        if feature_start not in df.columns:
            raise ValueError(
                f"feature_start='{feature_start}' 不在 CSV 列中: {list(df.columns)}"
            )
        return int(df.columns.get_loc(feature_start))
    return int(feature_start)
 def _run_train_sync(
    model_type: str,
    target: str,
    train_data_path: str,
    feature_start: Union[int, str],
    params: Optional[Dict[str, Any]],
    output_model_path: Path,
 ) -> Dict[str, Any]:
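    """
    Full synchronous training flow (called by execute_train_task in a worker thread):
        pd.read_csv -> drop rows with NaN target -> slice features -> train_test_split(80/20)
        -> Pipeline(StandardScaler + model).fit -> evaluate r2/rmse/mae on the test and training sets
        -> joblib.dump({model, metadata}, output_model_path)
    Returns:
        The metadata dict, including test_r2 / test_rmse / test_mae / n_features;
        the caller writes it back to TASK_STORE and _MODEL_REGISTRY.
    Note: the legacy SPXY / KS splits are left for a future extension (to be selected via params.split_method);
    for now the split is fixed to random with test_size=0.2 and random_state=42.
    """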
    df = _load_train_df(train_data_path)
    if target not in df.columns:
        raise ValueError(
            f"target='{target}' 不在 CSV 列中, 可选: {list(df.columns)}"
        )
    # 1) Cleaning: drop only the rows whose target is NaN (same as OpenClaw load_data_single)
    df = df[df[target].notna()].copy()
    if df.empty:
        raise ValueError("target 剔除 NaN 后无样本, 终止训练")
    # 2) Feature slicing
    feature_start_idx = _resolve_feature_start(df, feature_start)
    feature_columns = list(df.columns[feature_start_idx:])
    X = df.iloc[:, feature_start_idx:].astype(np.float64)
    y = df[target].astype(np.float64).values
    # 3) Split (fixed random split; spxy/ks may be added later)
    X_train, X_test, y_train, y_test = train_test_split(
        X.values,
        y,
        test_size=0.2,
        random_state=42,
    )
    # 4) Build the Pipeline and fit
    pipeline = _get_model_pipeline(model_type, params)
    pipeline.fit(X_train, y_train)
    # 5) Evaluate on the test and training sets
    y_pred = pipeline.predict(X_test)
    test_r2 = float(r2_score(y_test, y_pred))
    test_rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
    test_mae = float(mean_absolute_error(y_test, y_pred))
    y_train_pred = pipeline.predict(X_train)
    train_r2 = float(r2_score(y_train, y_train_pred))
    train_rmse = float(np.sqrt(mean_squared_error(y_train, y_train_pred)))
    train_mae = float(mean_absolute_error(y_train, y_train_pred))
    metadata: Dict[str, Any] = {
        "model_type": model_type,
        "target": target,
        "feature_start": feature_start,
        "feature_columns": feature_columns,
        "n_features": int(X.shape[1]),
        "n_samples": int(X.shape[0]),
        "train_size": int(X_train.shape[0]),
        "test_size": int(X_test.shape[0]),
        "params": dict(params or {}),
        "test_r2": test_r2,
        "test_rmse": test_rmse,
        "test_mae": test_mae,
        "train_r2": train_r2,
        "train_rmse": train_rmse,
        "train_mae": train_mae,
        "split_method": "random",
        "trained_at": datetime.now().isoformat(),
    }
    # 6) Persist (create the directory if it does not exist)
    output_model_path = Path(output_model_path)
    output_model_path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(
        {"model": pipeline, "metadata": metadata},
        output_model_path,
    )
    return metadata
 # ---------------------------------------------------------------------------
 # Module-level helpers for the inference pipeline
 # ---------------------------------------------------------------------------
 # Design notes:
 #   1) Under Dask scheduling, functions must be picklable by worker processes.
 #      So _predict_block / _load_model / _make_dummy_model / _run_predict_sync
 #      are all module-level functions (not nested), which avoids closure pitfalls.
 #   2) _predict_block predicts a whole batch with model.predict(spectra_2d).
 #      Predicting the full O(n_pixels * n_bands) image in one call would run out of memory on large scenes,
 #      so the caller uses xr.apply_ufunc(dask="parallelized") to split the array into blocks
 #      that enter this function one at a time; each call holds about one (y_chunk, x_chunk, band) block.
 # ---------------------------------------------------------------------------
 def _make_dummy_model(n_features: int) -> RandomForestRegressor:
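    """
    Build a dummy random-forest regressor.
    Used for:
        1) connectivity tests when the real joblib file does not exist
        2) placeholder inference before the training skeleton is wired to real data
    """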
    rng = np.random.default_rng(42)
    X = rng.random((200, n_features))
    y = rng.random(200)
    model = RandomForestRegressor(
        n_estimators=10, max_depth=5, random_state=0, n_jobs=1
    )
    model.fit(X, y)
    return model
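 def _load_model(path: str, n_features: int) -> Any:
    """
    Load a trained sklearn model; fall back to a dummy model on failure.
    Priority:
        1) path exists and joblib.load succeeds -> return the real model
        2) otherwise -> fall back to a dummy random forest (n_features is required)
    _run_train_sync saves {"model": pipeline, "metadata": ...}, so a dict
    payload is unwrapped to its "model" entry before it is returned.
    """
    p = Path(path)
    if p.is_file() and p.stat().st_size > 0:
        try:
            print(f"[model] 从磁盘加载: {path}")
            payload = joblib.load(path)
            # Training artifacts are dicts; other files may hold a bare estimator.
            if isinstance(payload, dict) and "model" in payload:
                return payload["model"]
            return payload
        except Exception as exc:  # noqa: BLE001
            print(f"[model] joblib.load 失败 ({type(exc).__name__}: {exc}), 降级 Dummy")
    else:
        print(f"[model] 真实 joblib 不存在 ({path}), 使用 Dummy RandomForest")
    return _make_dummy_model(n_features)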
 def _predict_block(spectra_3d: np.ndarray, model: Any) -> np.ndarray:
    """
    Inference function for a single dask chunk (xr.apply_ufunc schedules the calls).
    Parameters
    ----------
    spectra_3d : np.ndarray
        Shape (y_chunk, x_chunk, n_bands).
        The shape follows from input_core_dims=[["band"]]:
        xarray moves the band dimension to the last axis, then calls this function per (y, x) chunk.
    model : fitted sklearn estimator
        Accepts (n_samples, n_features) input and returns (n_samples,) predictions.
    Returns
    -------
    np.ndarray
        Scalar prediction map of shape (y_chunk, x_chunk), dtype float32.
    """
    yc, xc, nb = spectra_3d.shape
    # Flatten to 2D: one spectrum (row) per pixel
    flat = spectra_3d.reshape(yc * xc, nb)
    # sklearn-style batch prediction
    pred = model.predict(flat)
    # Restore the 2D spatial map; float32 halves memory relative to float64
    return pred.reshape(yc, xc).astype(np.float32, copy=False)
 def _run_predict_sync(
    model: Any,
    model_id: str,
    input_zarr_path: str,
    output_zarr_path: str,
 ) -> None:
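    """
    Synchronous inference flow (called via asyncio.to_thread).
    Steps:
        1) xr.open_zarr opens lazily (dask arrays; nothing is read into memory up front)
        2) take the water mask (all-water if the dataset has none)
        3) clean NaN -> 0 (most sklearn estimators reject NaN input)
        4) xr.apply_ufunc calls _predict_block chunk by chunk along (y, x)
        5) set non-water pixels to NaN (zarr stores float NaN)
        6) to_zarr triggers the full computation and writes the result chunk by chunk
    """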
    # 1. Open the input lazily (key point: Dask does not read everything into memory)
    ds = xr.open_zarr(input_zarr_path, chunks="auto")
    if "reflectance" not in ds.data_vars:
        raise KeyError(
            f"输入 zarr 缺少 'reflectance' 变量; 实际: {list(ds.data_vars)}"
        )
    reflectance = ds["reflectance"]   # dims: (y, x, band)
    n_bands = reflectance.sizes["band"]
    # 2. Water mask (same convention as the glint-removal algorithms)
    if "water_mask" in ds.data_vars or "water_mask" in ds.coords:
        water_mask = ds["water_mask"].astype(bool)
    else:
        water_mask = xr.ones_like(reflectance.isel(band=0), dtype=bool)
    # 3. NaN cleaning: fill with 0 (most sklearn estimators reject NaN input)
    refl_clean = reflectance.fillna(0.0)
    # 4. Core step: apply model.predict along (y, x) with apply_ufunc.
    #    dask="parallelized" makes each (y_chunk, x_chunk, band) chunk
    #    call _predict_block independently, so only a few chunks are in memory at any time.
    prediction: xr.DataArray = xr.apply_ufunc(
        _predict_block,
        refl_clean,
        kwargs={"model": model},
        input_core_dims=[["band"]],
        output_core_dims=[[]],
        dask="parallelized",
        output_dtypes=[np.float32],
        dask_gufunc_kwargs={"allow_rechunk": True},
        vectorize=False,
    )
    # 5. Set non-water pixels to NaN (zarr stores float NaN; convenient for later visualisation / mask analysis)
    prediction = prediction.where(water_mask, np.nan)
    # 6. Wrap in a Dataset and write it out chunk by chunk
    out = xr.Dataset(
        {"prediction": prediction},
        attrs={
            "model_id": model_id,
            "input_zarr_path": input_zarr_path,
            "n_bands": n_bands,
            "created_at": datetime.now().isoformat(),
        },
    )
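    # Keep the y/x coordinates
    out = out.assign_coords(y=ds["y"], x=ds["x"])
    # to_zarr with compute=True evaluates the whole dask graph.
    # Chunks are scheduled block by block on the thread pool, so peak memory is roughly a few chunks, not the whole image.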
    out.to_zarr(output_zarr_path, mode="w", compute=True)
 # ---------------------------------------------------------------------------
 # Background task executors
 # ---------------------------------------------------------------------------
 async def execute_train_task(
    task_id: str,
    model_type: str,
    target: str,
    train_data_path: str,
    feature_start: Union[int, str],
    params: Dict[str, Any],
 ) -> None:
    """
    Background executor for training tasks (wired to the real sklearn training flow).
    Steps:
        1) get_task checks that the task exists
        2) update_task(PROCESSING)
        3) generate model_id / model_path
        4) asyncio.to_thread dispatches _run_train_sync to the default thread pool
        5) success -> _register_model + update_task(SUCCESS, with test_r2/rmse/mae)
        6) failure -> update_task(FAILED, with error + traceback)
    """
    record = await get_task(task_id)
    if record is None:
        print(f"[{task_id}] 训练任务不存在, 跳过")
        return
    await update_task(
        task_id,
        status="PROCESSING",
        updated_at=datetime.now().isoformat(),
    )
    print(
        f"[{task_id}] 开始训练 model_type={model_type} target={target} "
        f"train_data_path={train_data_path} feature_start={feature_start}"
    )
    # model_id uses the first 12 hex characters of a uuid4 (8 collide more easily; 12 is still readable)
    model_id = f"model_{uuid.uuid4().hex[:12]}"
    model_path = Path(f"./data/models/{model_id}.joblib")
    try:
        # Run the synchronous sklearn / pandas training in the default thread pool so the event loop is not blocked
        metadata = await asyncio.to_thread(
            _run_train_sync,
            model_type,
            target,
            train_data_path,
            feature_start,
            params,
            model_path,
        )
        # Register in the in-memory registry (so /predict can look up model_id)
        _register_model(
            {
                "model_id": model_id,
                "model_type": model_type,
                "target": target,
                "params": dict(params or {}),
                "path": str(model_path),
                "feature_start": feature_start,
                "n_features": metadata["n_features"],
                "test_r2": metadata["test_r2"],
                "test_rmse": metadata["test_rmse"],
                "test_mae": metadata["test_mae"],
                "created_at": datetime.now().isoformat(),
                "train_task_id": task_id,
            }
        )
        # Write the training metrics back to the task record so the polling frontend can read them directly
        await update_task(
            task_id,
            status="SUCCESS",
            model_id=model_id,
            model_path=str(model_path),
            test_r2=metadata["test_r2"],
            test_rmse=metadata["test_rmse"],
            test_mae=metadata["test_mae"],
            n_features=metadata["n_features"],
            n_samples=metadata["n_samples"],
            error=None,
            traceback=None,
            updated_at=datetime.now().isoformat(),
        )
        print(
            f"[{task_id}] 训练完成 -> model_id={model_id} "
            f"test_r2={metadata['test_r2']:.4f} test_rmse={metadata['test_rmse']:.4f}"
        )
    except Exception as exc:  # noqa: BLE001
        # On failure model_path may have no artifact; set it to None explicitly so the frontend can tell
        await update_task(
            task_id,
            status="FAILED",
            model_id=None,
            model_path=None,
            error=f"{type(exc).__name__}: {exc}",
            traceback=traceback.format_exc(),
            updated_at=datetime.now().isoformat(),
        )
        print(f"[{task_id}] 训练失败 -> {type(exc).__name__}: {exc}")
 async def execute_predict_task(
    task_id: str,
    model_id: str,
    input_zarr_path: str,
    output_zarr_path: Optional[str],
 ) -> None:
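    """
    Background executor for inference tasks (real implementation).
    OOM safeguards:
        - xr.open_zarr(..., chunks="auto") opens lazily; the image is never read into memory in one piece
        - xr.apply_ufunc(..., dask="parallelized") splits the image into chunks
        - each chunk is reshaped to 2D, passed to model.predict, then reshaped back to (y, x)
        - peak memory at any time is about one (y_chunk, x_chunk, band) chunk per worker thread
        - to_zarr(compute=True) triggers the computation and writes each chunk as it finishes
    """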
    record = await get_task(task_id)
    if record is None:
        print(f"[{task_id}] 推断任务不存在, 跳过")
        return
    # 1. Check that model_id is registered (avoids a vague error later in the background task)
    model_meta = _MODEL_REGISTRY.get(model_id)
    if model_meta is None:
        await update_task(
            task_id,
            status="FAILED",
            error=f"model_id 不存在: {model_id}",
            updated_at=datetime.now().isoformat(),
        )
        print(f"[{task_id}] 推断失败 -> model_id 不存在: {model_id}")
        return
    # 2. Generate output_zarr_path automatically (if not provided)
    if output_zarr_path is None:
        stem = input_zarr_path.rstrip("/\\").split("/")[-1].split("\\")[-1]
        stem = stem.replace(".zarr", "")
        output_zarr_path = f"./data/{model_id}_{stem}_pred.zarr"
    await update_task(
        task_id,
        status="PROCESSING",
        updated_at=datetime.now().isoformat(),
    )
    print(f"[{task_id}] 开始推断 model_id={model_id} input={input_zarr_path}")
    try:
        # 3. Probe the band count (used to size the dummy model)
        #    Only zarr metadata is read here (the shape in .zarray), not the actual data
        ds_probe = xr.open_zarr(input_zarr_path, chunks="auto")
        if "reflectance" not in ds_probe.data_vars:
            raise KeyError(
                f"输入 zarr 缺少 'reflectance' 变量; 实际: {list(ds_probe.data_vars)}"
            )
        n_bands = ds_probe["reflectance"].sizes["band"]
        ds_probe.close()
        # 4. Load the model (real file first, dummy as fallback)
        model = _load_model(model_meta["path"], n_features=n_bands)
        # 5. Run the synchronous flow in the thread pool so the event loop is not blocked
        await asyncio.to_thread(
            _run_predict_sync,
            model,
            model_id,
            input_zarr_path,
            output_zarr_path,
        )
        await update_task(
            task_id,
            status="SUCCESS",
            output_zarr_path=output_zarr_path,
            model_id=model_id,
            error=None,
            updated_at=datetime.now().isoformat(),
        )
        print(f"[{task_id}] 推断完成 -> output={output_zarr_path}")
    except Exception as exc:  # noqa: BLE001
        tb_text = traceback.format_exc()
        await update_task(
            task_id,
            status="FAILED",
            output_zarr_path=None,
            error=f"{type(exc).__name__}: {exc}",
            traceback=tb_text,
            updated_at=datetime.now().isoformat(),
        )
        print(f"[{task_id}] 推断失败 -> {type(exc).__name__}: {exc}")
        print(tb_text)
 # ---------------------------------------------------------------------------
 # POST /api/modeling/train
 # ---------------------------------------------------------------------------
@router.post("/train", response_model=TaskAcceptedResponse)
 async def submit_train(
    payload: TrainRequest,
    background_tasks: BackgroundTasks,
 ) -> Dict[str, Any]:
    """Submit a model training task and return its task_id immediately."""
    task_id = str(uuid.uuid4())
    await set_task(
        task_id,
        {
            "task_id": task_id,
            "kind": "train",
            "model_type": payload.model_type,
            "target": payload.target,
            "train_data_path": payload.train_data_path,
            "feature_start": payload.feature_start,
            "params": payload.params,
            "status": "PENDING",
            "model_id": None,
            "model_path": None,
            "test_r2": None,
            "test_rmse": None,
            "test_mae": None,
            "n_features": None,
            "n_samples": None,
            "error": None,
            "traceback": None,
            "created_at": datetime.now().isoformat(),
            "updated_at": datetime.now().isoformat(),
        },
    )
    background_tasks.add_task(
        execute_train_task,
        task_id,
        payload.model_type,
        payload.target,
        payload.train_data_path,
        payload.feature_start,
        payload.params,
    )
    return {"task_id": task_id, "status": "PENDING", "kind": "train"}
 # ---------------------------------------------------------------------------
 # GET /api/modeling/models
 # ---------------------------------------------------------------------------
@router.get("/models", response_model=ModelListResponse)
 async def list_trained_models() -> Dict[str, Any]:
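    """
    List the trained models.
    Future implementation: scan ./data/models/*.joblib for metadata;
    for now this reads the in-memory _MODEL_REGISTRY directly.
    """
    models = list(_MODEL_REGISTRY.values())
    # Sort by created_at descending so the most recently trained model comes first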
    models.sort(key=lambda m: m.get("created_at", ""), reverse=True)
    return {"models": models, "count": len(models)}
 # ---------------------------------------------------------------------------
 # POST /api/modeling/predict
 # ---------------------------------------------------------------------------
@router.post("/predict", response_model=TaskAcceptedResponse)
 async def submit_predict(
    payload: PredictRequest,
    background_tasks: BackgroundTasks,
 ) -> Dict[str, Any]:
    """Submit a model inference task and return its task_id immediately."""
    task_id = str(uuid.uuid4())
    await set_task(
        task_id,
        {
            "task_id": task_id,
            "kind": "predict",
            "model_id": payload.model_id,
            "input_zarr_path": payload.input_zarr_path,
            "output_zarr_path": payload.output_zarr_path,
            "status": "PENDING",
            "error": None,
            "traceback": None,
            "created_at": datetime.now().isoformat(),
            "updated_at": datetime.now().isoformat(),
        },
    )
    background_tasks.add_task(
        execute_predict_task,
        task_id,
        payload.model_id,
        payload.input_zarr_path,
        payload.output_zarr_path,
    )
    return {"task_id": task_id, "status": "PENDING", "kind": "predict"}
 # ---------------------------------------------------------------------------
 # models_router: independent of modeling_router, with path prefix /models
 # Resulting full paths: GET  /api/models, POST /api/models/upload
 # ---------------------------------------------------------------------------
 models_router = APIRouter(prefix="/models", tags=["models"])
 # ---------------------------------------------------------------------------
 # GET /api/models
 # ---------------------------------------------------------------------------
@models_router.get("")
 async def list_models() -> Dict[str, Any]:
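    """
    Scan the ./data/models/ directory and return the names of all .joblib files (without the suffix).
    Error handling: a missing directory is created automatically and an empty list is returned.
    """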
    models_dir = Path("./data/models")
    models_dir.mkdir(parents=True, exist_ok=True)
    model_names = [
        p.stem for p in models_dir.iterdir() if p.suffix == ".joblib"
    ]
    return {"models": model_names}
 # ---------------------------------------------------------------------------
 # POST /api/models/upload
 # ---------------------------------------------------------------------------
@models_router.post("/upload")
 async def upload_model(
    file: UploadFile = File(...),
 ) -> Dict[str, Any]:
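    """
    Receive an uploaded .joblib model file and save it under ./data/models/.
    - The suffix must be .joblib
    - The directory is created if it does not exist
    - Returns the status and the file name without its suffix
    """
    if not file.filename or not file.filename.lower().endswith(".joblib"):
        raise HTTPException(
            status_code=400,
            detail="仅支持 .joblib 格式的文件",
        )
    models_dir = Path("./data/models")
    models_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a crafted filename such as
    # "../../x.joblib" cannot write outside models_dir.
    dest_path = models_dir / Path(file.filename).name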
    with dest_path.open("wb") as buffer:
        shutil.copyfileobj(file.file, buffer)
    return {
        "status": "success",
        "model_id": dest_path.stem,
    }
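The upload handler above derives the destination from a client-supplied filename. A minimal sketch of reducing such a name to its final component before joining it to the models directory, assuming a hypothetical helper `safe_model_filename` that is not part of the removed code:

```python
from pathlib import PurePosixPath, PureWindowsPath


def safe_model_filename(filename: str) -> str:
    """Hypothetical helper: return a bare '<name>.joblib' or raise ValueError."""
    # Strip directory parts under both POSIX and Windows separator rules,
    # because the client's operating system is unknown.
    name = PureWindowsPath(PurePosixPath(filename).name).name
    lowered = name.lower()
    if not lowered.endswith(".joblib") or lowered == ".joblib":
        raise ValueError(f"unsupported model filename: {filename!r}")
    return name


print(safe_model_filename("rf_chla.joblib"))
print(safe_model_filename("../../etc/rf.joblib"))
print(safe_model_filename("C:\\temp\\rf.joblib"))
```

Each call returns only the last path component, so the joined path stays inside the target directory. Names without the `.joblib` suffix raise `ValueError`, which an endpoint could map to an HTTP 400.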
--- a/new/app/core/algorithms/__init__.py
+++ b/new/app/core/algorithms/__init__.py
@ -1,40 +0,0 @@
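 """
 Glint-removal algorithm package
 ===============================
 Glint-removal algorithms are organised with a registry plus the Strategy pattern.
 Every concrete algorithm should inherit from BaseGlintRemover and use the
 @register_glint_remover decorator to bind an algorithm name to its implementation class.
 Calling conventions
 -------------------
 1. Every algorithm submodule must be imported explicitly in this __init__,
    so that its decorator runs and the registry is populated.
 2. Upper layers (endpoints, workers) may only use:
        from app.core.algorithms import get_remover
    to obtain an algorithm class; they should not import concrete classes directly,
    which keeps the scheduling layer decoupled from the algorithms.
 """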
 from app.core.algorithms.base import BaseGlintRemover
 from app.core.algorithms.registry import (
    get_remover,
    list_removers,
    register_glint_remover,
    unregister_glint_remover,
 )
 # ---- Algorithm submodule imports ----
 # When adding an algorithm, add one import line here so its decorator runs.
 from app.core.algorithms import goodman  # Goodman
 from app.core.algorithms import kutser   # Kutser
 # from app.core.algorithms import hedley   # Hedley
 # from app.core.algorithms import sugar    # SUGAR
 __all__ = [
    "BaseGlintRemover",
    "register_glint_remover",
    "get_remover",
    "list_removers",
    "unregister_glint_remover",
 ]
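The registry module itself is not shown in this diff, so the following is only a sketch of how a decorator-based registry of this kind could work. The function names mirror the imports above, but the bodies, the `_REGISTRY` dict, the duplicate-name check, and `DemoRemover` are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for app.core.algorithms.registry.
_REGISTRY: Dict[str, type] = {}


def register_glint_remover(name: str) -> Callable[[type], type]:
    """Class decorator that records the class under `name` when the module is imported."""
    def decorator(cls: type) -> type:
        if name in _REGISTRY:
            raise ValueError(f"duplicate glint remover name: {name!r}")
        _REGISTRY[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


def get_remover(name: str) -> type:
    """Look up an algorithm class by name."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(
            f"unknown glint remover {name!r}; available: {sorted(_REGISTRY)}"
        ) from None


def list_removers() -> List[str]:
    return sorted(_REGISTRY)


@register_glint_remover("demo")
class DemoRemover:
    name = "demo"


print(list_removers())
print(get_remover("demo") is DemoRemover)
```

Because registration happens as a side effect of the decorator, a submodule that is never imported never appears in the registry, which is why the package imports each algorithm module explicitly.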
--- a/new/app/core/algorithms/base.py
+++ b/new/app/core/algorithms/base.py
@ -1,85 +0,0 @@
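 """
 Abstract base class for glint-removal algorithms
 ================================================
 Design goal (Strategy pattern)
 ------------------------------
 This module defines the standard interface that every glint-removal algorithm must follow.
 Future algorithms such as Kutser, Goodman, Hedley and SUGAR will inherit from this base class
 and implement a common process() method.
 Input/output specification
 --------------------------
 Inputs and outputs of every algorithm are **Zarr store paths** (strings),
 not in-memory numpy ndarrays. The main benefits are:
    1. **Storage is decoupled from in-memory computation**:
       an algorithm only cares which zarr to read and which zarr to write;
       whether the data originally came from GeoTIFF / HDF5 / NetCDF / an in-memory array,
       the IO layer is responsible for normalising it to zarr.
    2. **Out-of-core computation is supported**:
       images often exceed available memory, and zarr chunks allow block-wise reads,
       so implementations can stream the computation with dask / xarray.
    3. **Cacheable and reusable**:
       once intermediate products are on disk, downstream algorithms (atmospheric correction, radiometric calibration) can consume them directly,
       avoiding repeated IO.
    4. **Easy to parallelise and distribute**:
       the task scheduler only hands two paths to a worker and need not know the data details.
 Conventions
 -----------
 - Subclasses implement process(), covering the full read -> compute -> write flow.
 - process() returns True on success and False on failure.
 - On failure, raising an exception is preferred over returning False, so the upper-layer BackgroundTasks can catch it and record it in the error field.
 """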
 from abc import ABC, abstractmethod
 from typing import Any
 class BaseGlintRemover(ABC):
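    """
    Abstract base class for glint-removal algorithms.
    Every concrete algorithm (Kutser / Goodman / Hedley / SUGAR ...) must inherit from this class and implement process().
    Subclasses may accept their own hyperparameters in __init__ (reference band, thresholds, etc.);
    the actual input and output data are given by the two zarr path arguments of process().
    """
    # Algorithm name identifier that subclasses may override; the scheduling layer looks algorithms up by this method name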
    name: str = "base"
    @abstractmethod
    async def process(
        self,
        input_zarr_path: str,
        output_zarr_path: str,
        **kwargs: Any,
    ) -> bool:
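        """
        Run glint removal.
        Parameters
        ----------
        input_zarr_path : str
            Zarr store path of the input hyperspectral image.
            The IO layer has already normalised the data (bands, coordinate system and spatial dimensions are aligned).
        output_zarr_path : str
            Zarr store path for the result (the glint-corrected image).
            The subclass creates this zarr store and writes the result itself.
        **kwargs : Any
            Optional algorithm hyperparameters, for example:
            - reference_band: index of the near-infrared reference band
            - chunk_size: chunk size for the computation
            - other algorithm-specific parameters
        Returns
        -------
        bool
            True on success, False on failure.
            On error, raising directly is preferred so the caller can record it in the task status.
        """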
        raise NotImplementedError
    def __repr__(self) -> str:  # pragma: no cover - debugging aid
        return f"<{self.__class__.__name__} name={self.name!r}>"
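A minimal sketch of a subclass that follows this contract: an async `process()` hands blocking work to `asyncio.to_thread` and lets errors propagate as exceptions. `EchoRemover` is hypothetical and only records its arguments instead of touching zarr, and the base class is a trimmed copy so that the sketch runs on its own.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class BaseGlintRemover(ABC):
    """Trimmed copy of the interface above so the sketch is self-contained."""
    name: str = "base"

    @abstractmethod
    async def process(self, input_zarr_path: str, output_zarr_path: str, **kwargs: Any) -> bool:
        raise NotImplementedError


class EchoRemover(BaseGlintRemover):
    """Hypothetical subclass: records the paths instead of reading or writing zarr."""
    name = "echo"

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    async def process(self, input_zarr_path: str, output_zarr_path: str, **kwargs: Any) -> bool:
        # Blocking work runs in a worker thread so the event loop stays responsive.
        return await asyncio.to_thread(self._process_sync, input_zarr_path, output_zarr_path)

    def _process_sync(self, input_zarr_path: str, output_zarr_path: str) -> bool:
        if not input_zarr_path:
            # Raise instead of returning False so the caller can record the error.
            raise ValueError("input_zarr_path must not be empty")
        self.calls.append((input_zarr_path, output_zarr_path))
        return True


remover = EchoRemover()
ok = asyncio.run(remover.process("in.zarr", "out.zarr"))
print(ok, remover.calls)
```

Instantiating `BaseGlintRemover` directly raises `TypeError`, because `process()` is abstract; only subclasses that implement it can be created.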
--- a/new/app/core/algorithms/goodman.py
+++ b/new/app/core/algorithms/goodman.py
@ -1,123 +0,0 @@
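 """
 app/core/algorithms/goodman.py
 ===============================
 Streaming xarray + dask implementation of the Goodman et al. 2008 glint-removal algorithm.
 Formula
 -------
    R_corrected = R_raw - R_750 + A + B * (R_640 - R_750)
 where:
    R_raw    -- raw reflectance (y, x, band)
    R_750    -- reflectance at λ = 750 nm (near-infrared reference band, away from water-vapour absorption)
    R_640    -- reflectance at λ = 640 nm (visible difference band)
    A, B     -- empirical regression parameters (can be supplied by the caller; both default to 0)
 Post-processing
 ---------------
 - Negative values are clamped to 0
 - Applied only inside the water mask (water_mask); pixels outside it are set to 0
 Dimension conventions
 ---------------------
    reflectance: (y, x, band); the band coordinate is normally wavelength (nm)
    water_mask : (y, x), boolean, True = water
 """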
 import asyncio
 from typing import Any
 import xarray as xr
 from app.core.algorithms.base import BaseGlintRemover
 from app.core.algorithms.registry import register_glint_remover
 # ---------------------------------------------------------------------------
 # Default parameters
 # ---------------------------------------------------------------------------
 # Symbols match the original Goodman 2008 paper so users can cross-check.
 # A and B are usually obtained by regressing (R_corr - R_raw) on (R_640 - R_750) over clean deep water;
 # without prior knowledge, A=0 and B=0 reduce the formula to R_corrected = clip(R_raw - R_750, 0).
 # ---------------------------------------------------------------------------
 DEFAULT_BAND_REF: float = 750.0   # λ = 750 nm, near-infrared reference band
 DEFAULT_BAND_DIFF: float = 640.0  # λ = 640 nm, visible difference band
 DEFAULT_A: float = 0.0            # constant offset term in the formula
 DEFAULT_B: float = 0.0            # slope term in the formula
@register_glint_remover("goodman")
 class GoodmanGlintRemover(BaseGlintRemover):
    """Goodman et al. 2008 glint-removal algorithm."""
    name = "goodman"
    async def process(
        self,
        input_zarr_path: str,
        output_zarr_path: str,
        **kwargs: Any,
    ) -> bool:
        # 1. Parse hyperparameters (with defaults, so callers override only what they need)
        band_ref: float = kwargs.get("band_ref", DEFAULT_BAND_REF)
        band_diff: float = kwargs.get("band_diff", DEFAULT_BAND_DIFF)
        A: float = kwargs.get("A", DEFAULT_A)
        B: float = kwargs.get("B", DEFAULT_B)
        # 2. Run the synchronous xarray/dask computation in a worker thread
        #    so FastAPI's event loop is not blocked
        return await asyncio.to_thread(
            self._process_sync,
            input_zarr_path,
            output_zarr_path,
            band_ref,
            band_diff,
            A,
            B,
        )
    @staticmethod
    def _process_sync(
        input_zarr_path: str,
        output_zarr_path: str,
        band_ref: float,
        band_diff: float,
        A: float,
        B: float,
    ) -> bool:
        # 1. Open by zarr path (dask-backed; nothing is materialised in memory)
        #    chunks="auto" lets dask choose the chunking from the size of each axis
        ds = xr.open_zarr(input_zarr_path, chunks="auto")
        reflectance = ds["reflectance"]  # (y, x, band)
        # 2. Extract the two key bands with sel + method='nearest'
        #    Each has shape (y, x) and broadcasts against (y, x, band) in the arithmetic below
        R_750 = reflectance.sel(band=band_ref, method="nearest")
        R_640 = reflectance.sel(band=band_diff, method="nearest")
        # 3. Goodman formula: xarray broadcasts along the band dimension
        #    R_corr = R_raw - R_750 + A + B * (R_640 - R_750)
        result = reflectance - R_750 + A + B * (R_640 - R_750)
        # 4. Clamp negatives to 0 (clip(min=0) is preferred over a where() call:
        #    it needs no intermediate boolean array and uses dask's vectorised clip path)
        result = result.clip(min=0)
        # 5. Apply only inside water (pixels outside the mask are forced to 0)
        #    Prefer the water_mask variable stored in the zarr; if it is missing, treat the whole image as water
        if "water_mask" in ds:
            water_mask = ds["water_mask"].astype(bool)
            result = result.where(water_mask, 0)
        # 6. Build the output Dataset, keeping metadata (band coordinates, attributes, etc.)
        out = xr.Dataset({"reflectance": result})
        if ds.attrs:
            out.attrs = dict(ds.attrs)
        if reflectance.attrs:
            out["reflectance"].attrs = dict(reflectance.attrs)
        # 7. Streamed, out-of-core write: the large array is never materialised at once;
        #    dask computes and writes chunk by chunk, so peak memory stays near the size of a few chunks
        out.to_zarr(output_zarr_path, mode="w", compute=True)
        return True
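The correction above can be checked by hand on a tiny cube. The following NumPy sketch applies the same formula, clamp, and mask to an in-memory (y, x, band) array; `goodman_correct` and the sample values are made up for illustration, and the nearest-band lookup with `argmin` stands in for `DataArray.sel(method="nearest")`.

```python
import numpy as np


def goodman_correct(refl, wavelengths, water_mask,
                    band_ref=750.0, band_diff=640.0, A=0.0, B=0.0):
    """NumPy sketch of the correction on a small in-memory (y, x, band) cube."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    # Nearest-band lookup by absolute wavelength difference.
    i_ref = int(np.abs(wavelengths - band_ref).argmin())
    i_diff = int(np.abs(wavelengths - band_diff).argmin())
    r_ref = refl[..., i_ref][..., np.newaxis]    # (y, x, 1), broadcasts over band
    r_diff = refl[..., i_diff][..., np.newaxis]  # (y, x, 1)
    corrected = refl - r_ref + A + B * (r_diff - r_ref)
    corrected = np.clip(corrected, 0.0, None)    # clamp negatives to 0
    # Pixels outside the water mask are set to 0, as in the class above.
    return np.where(water_mask[..., np.newaxis], corrected, 0.0)


wavelengths = [550.0, 640.0, 750.0]
refl = np.array([[[0.10, 0.08, 0.03],
                  [0.20, 0.15, 0.25]]])          # shape (1, 2, 3)
mask = np.array([[True, True]])
out = goodman_correct(refl, wavelengths, mask)
print(out)
```

With A = B = 0 the first pixel becomes approximately [0.07, 0.05, 0.0] (each band minus the 750 nm value), while the second pixel, whose 750 nm reflectance exceeds the other bands, is clamped to zeros.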
--- a/new/app/core/algorithms/kutser.py
+++ b/new/app/core/algorithms/kutser.py
@ -1,211 +0,0 @@
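 """
 Kutser glint-removal algorithm (xarray + dask rewrite)
 ======================================================
 Pain points of the old version
 ------------------------------
 The original Kutser implementation (after Kutser et al., 2013) was typically written like this:
    R_corr = np.zeros_like(R_raw)
    for b in range(n_bands):
        for y in range(H):
            for x in range(W):
                if water_mask[y, x]:
                    R_corr[y, x, b] = (
                        R_raw[y, x, b] - G_list[b] * D_norm[y, x]
                    )
    with rasterio.open(..., 'w') as dst:
        dst.write(R_corr)
 Problems:
  1. A triple Python loop that does one floating-point operation per iteration, so interpreter overhead dominates;
  2. The whole R_raw image is read into memory at once, so large images run out of memory;
  3. Writing with rasterio requires a contiguous numpy array, which raises memory use further.
 This file rewrites it with xarray + dask:
  - DataArray dimension broadcasting turns the triple loop into a one-line expression;
  - dask chunks keep the data on disk and stream the computation;
  - to_zarr writes as it computes, fully decoupling the output format from the algorithm layer.
 """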
 import asyncio
 from typing import Any
 import xarray as xr
 from app.core.algorithms.base import BaseGlintRemover
 from app.core.algorithms.registry import register_glint_remover
 # ---------------------------------------------------------------------------
 # Algorithm implementation
 # ---------------------------------------------------------------------------
@register_glint_remover("kutser")
 class KutserGlintRemover(BaseGlintRemover):
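    """
    Kutser glint removal by near-infrared subtraction.
    Formulas (exactly equivalent to the old version)
    ------------------------------------------------
    1) Absorption depth D (per pixel):
            D = (R(λ_lower) + R(λ_upper)) / 2 - R(λ_oxy)
    2) Global normalisation factor D_max:
            D_max = max(D) over water
       Normalisation:
            D_norm = D / D_max
    3) Per-band reflectance range over water:
            G_list[b] = max(R[:, :, b] over water) - min(R[:, :, b] over water)
    4) Correction formula (per pixel, per band):
            R_corr(λ_b) = R_raw(λ_b) - G_list[b] * D_norm
    """
    # Reference bands (nm) used in the Kutser 2013 paper:
    #   λ_lower = 773, λ_oxy = 845, λ_upper = 893
    # They can be overridden through kwargs to suit different sensors such as MERIS / OLCI / Landsat.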
    DEFAULT_BAND_LOWER: float = 773.0
    DEFAULT_BAND_OXY: float = 845.0
    DEFAULT_BAND_UPPER: float = 893.0
    # --------------------------------------------------------------
    # Public async entry point
    # --------------------------------------------------------------
    # xarray / dask operators are synchronous and blocking. Inside an async function,
    # asyncio.to_thread runs the synchronous body in the default thread pool
    # so that FastAPI's event loop is not blocked.
    # --------------------------------------------------------------
    async def process(
        self,
        input_zarr_path: str,
        output_zarr_path: str,
        **kwargs: Any,
    ) -> bool:
        return await asyncio.to_thread(
            self._process_sync,
            input_zarr_path,
            output_zarr_path,
            kwargs,
        )
    # --------------------------------------------------------------
    # Synchronous core implementation
    # --------------------------------------------------------------
    def _process_sync(
        self,
        input_zarr_path: str,
        output_zarr_path: str,
        kwargs: dict,
    ) -> bool:
        # ============================================================
        # Step 0: open the zarr and build the dask computation graph
        # ============================================================
        # chunks="auto": dask picks chunk sizes from the zarr storage chunks and its configured chunk-size target,
        # so the data is never materialised in RAM all at once.
        # ============================================================
        ds = xr.open_zarr(input_zarr_path, chunks="auto")
        reflectance: xr.DataArray = ds["reflectance"]  # dimension convention: (y, x, band)
        # Dimension-order convention (could also be adapted automatically from ds.dims):
        assert "y" in reflectance.dims and "x" in reflectance.dims and "band" in reflectance.dims, (
            f"reflectance 必须包含 y/x/band 三个维度，实际为: {reflectance.dims}"
        )
        # ============================================================
        # Step 1: take the 2D (y, x) slices of the 3 reference bands
        # ============================================================
        # Assumes the coordinate of the band dimension is wavelength (nm).
        # sel(..., method="nearest") matches the closest available band.
        # ============================================================
        wl_lower = float(kwargs.get("band_lower", self.DEFAULT_BAND_LOWER))
        wl_oxy = float(kwargs.get("band_oxy", self.DEFAULT_BAND_OXY))
        wl_upper = float(kwargs.get("band_upper", self.DEFAULT_BAND_UPPER))
        R_lower = reflectance.sel(band=wl_lower, method="nearest")  # (y, x)
        R_upper = reflectance.sel(band=wl_upper, method="nearest")  # (y, x)
        R_oxy = reflectance.sel(band=wl_oxy, method="nearest")  # (y, x)
        # ============================================================
        # Step 2: water mask
        # ============================================================
        # Prefer the water_mask variable stored in the zarr;
        # if it is absent, assume the whole image is water (development fallback).
        # ============================================================
        if "water_mask" in ds:
            water_mask = ds["water_mask"].astype(bool)
        else:
            water_mask = xr.ones_like(
                reflectance.isel(band=0), dtype=bool
            )
        # ============================================================
        # Step 3: absorption depth D (per pixel, shape (y, x))
        # ============================================================
        # Old version: D[y, x] = (R_lower[y, x] + R_upper[y, x]) / 2 - R_oxy[y, x]
        # New version: a one-line expression; dask builds the lazy computation graph.
        # ============================================================
        D = (R_lower + R_upper) / 2.0 - R_oxy  # (y, x), same dtype as reflectance
        # ============================================================
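        # Step 4: global normalization factor D_max (scalar, 0-dim DataArray)
        # ============================================================
        # Key point: .where(water_mask) first sets non-water pixels to NaN,
        # then .max() aggregates over (x, y) and reduces to 0 dimensions.
        # dask still has not computed anything here; to_zarr triggers the work.
        # ============================================================
        D_max = D.where(water_mask).max()  # scalar
        # Fallback: if the water region is empty, D_max is NaN; substitute a tiny value.
        # This only covers NaN: a D_max of exactly 0 would still divide by zero below.
        D_max = D_max.fillna(1e-6)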
        # ============================================================
        # Step 5: normalized D_norm (shape (y, x))
        # ============================================================
        D_norm = D / D_max  # (y, x) array divided by a scalar → broadcast automatically
        # ============================================================
        # Step 6: per-band range over water pixels, G_list (shape (band,))
        # ============================================================
        # The old version ran a min/max aggregation inside its triple loop.
        # xarray version: reduce y and x together and keep only the band dimension.
        # ============================================================
        R_water = reflectance.where(water_mask)  # (y, x, band), NaN outside water
        G_min = R_water.min(dim=["x", "y"])  # (band,)
        G_max = R_water.max(dim=["x", "y"])  # (band,)
        G_list = (G_max - G_min).fillna(0.0)  # (band,); all-NaN bands fall back to 0
        # ============================================================
        # Step 7: correction formula (the key line; demonstrates xarray broadcasting)
        # ============================================================
        # The old version needed:
        #     for b in bands:
        #         for y in range(H):
        #             for x in range(W):
        #                 R_corr[y,x,b] = R_raw[y,x,b] - G_list[b] * D_norm[y,x]
        #
        # xarray aligns dimensions by name, not by position:
        #     R_raw : (y, x, band)
        #     G_list: (band,)            → missing y, x are expanded automatically
        #     D_norm: (y, x)             → missing band is expanded automatically
        #     product: band, y, x        → the subtraction aligns it with R_raw
        # One expression carries the semantics of "triple for loop + scalar indexing".
        # ============================================================
        corrected = reflectance - G_list * D_norm  # (y, x, band)
        # ============================================================
        # Step 8: apply the water mask (non-water pixels become NaN)
        # ============================================================
        result = corrected.where(water_mask)
        # ============================================================
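        # Step 9: persist to zarr
        # ============================================================
        # mode="w": overwrite (an existing target is deleted and recreated).
        # compute=True: block until the whole image is computed and written to disk.
        # Because the data stays in dask chunks and is written out in a streaming
        # fashion, peak memory scales with the chunk size (times the number of
        # chunks processed concurrently), not with the size of the full image.
        # ============================================================
        out = xr.Dataset({"reflectance": result})
        # Carry over the source dataset's global attributes (CRS, ...);
        # coordinates such as wavelength already travel with `result`.
        out.attrs = dict(ds.attrs)
        out["reflectance"].attrs = dict(reflectance.attrs)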
        out.to_zarr(output_zarr_path, mode="w", compute=True)
        return True
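The equivalence claimed in step 7 (one broadcast expression versus the triple loop) can be checked on a tiny array. The sketch below is an illustration only: it uses plain NumPy instead of xarray, so alignment is positional and needs an explicit `[..., None]`, and the array sizes and random values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, B = 3, 4, 5                 # hypothetical image height, width, band count
r_raw = rng.random((H, W, B))     # stands in for reflectance, (y, x, band)
g_list = rng.random(B)            # stands in for G_list, (band,)
d_norm = rng.random((H, W))       # stands in for D_norm, (y, x)

# Loop form, as in the old implementation quoted in the comments above.
looped = np.empty_like(r_raw)
for b in range(B):
    for y in range(H):
        for x in range(W):
            looped[y, x, b] = r_raw[y, x, b] - g_list[b] * d_norm[y, x]

# Broadcast form: (H, W, 1) * (B,) -> (H, W, B), then subtract elementwise.
broadcast = r_raw - g_list * d_norm[..., None]

assert np.allclose(looped, broadcast)
```

With xarray the `[..., None]` is unnecessary because dimensions are matched by name, which is what lets the original line read `reflectance - G_list * D_norm`.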
--- a/new/app/core/algorithms/registry.py
+++ b/new/app/core/algorithms/registry.py
@ -1,135 +0,0 @@
 """
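 Algorithm registry (Registry / Factory)
 =======================================
 Uses a decorator to bind an "algorithm name string" to an "algorithm implementation class".
 The upper dispatch layer (FastAPI endpoints, BackgroundTasks workers) only needs the
 method string sent by the frontend to dispatch to the matching implementation,
 without writing a long if/elif chain.
 Usage example
 -------------
    from app.core.algorithms import BaseGlintRemover
    from app.core.algorithms.registry import (
        register_glint_remover,
        get_remover,
        list_removers,
    )
    @register_glint_remover("kutser")
    class KutserGlintRemover(BaseGlintRemover):
        async def process(self, input_zarr_path, output_zarr_path, **kwargs):
            ...
    # Dispatch
    Cls = get_remover(method_from_request)
    remover = Cls()
    await remover.process(input_zarr_path, output_zarr_path, **kwargs)
 Design notes
 ------------
 - Registration happens at class-definition time, so the registry is complete
  only after every algorithm module has been imported. Importing the algorithm
  submodules in `app/core/algorithms/__init__.py` forces registration.
 - Registering the same name twice raises an error instead of silently overwriting.
 - name is also written back to the class's `name` attribute so an algorithm can identify itself.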
 """
 from typing import Dict, Type
 from app.core.algorithms.base import BaseGlintRemover
 # Global registry: name (str) -> implementation class (type); classes are stored uninstantiated
 _REGISTRY: Dict[str, Type[BaseGlintRemover]] = {}
 def register_glint_remover(name: str):
    """
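    Class-decorator factory: registers the decorated algorithm class under the given name.
    Parameters
    ----------
    name : str
        Algorithm identifier; lowercase with underscores is recommended, e.g. "kutser", "goodman".
    Raises
    ------
    ValueError
        - name is not a non-empty string
        - name is already taken by another class
    TypeError
        - the decorated object is not a subclass of BaseGlintRemover
    """
    # ---- Defensive check: name must be a valid string ----
    if not isinstance(name, str) or not name.strip():
        raise ValueError(
            f"register_glint_remover requires a non-empty string name, got: {name!r}"
        )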
    def decorator(cls: Type[BaseGlintRemover]) -> Type[BaseGlintRemover]:
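        # ---- Defensive check: the decorated object must be a BaseGlintRemover subclass ----
        if not isinstance(cls, type) or not issubclass(cls, BaseGlintRemover):
            raise TypeError(
                f"@register_glint_remover can only decorate subclasses of BaseGlintRemover, "
                f"got: {cls!r}"
            )
        # ---- Defensive check: no silent overwrite ----
        if name in _REGISTRY:
            raise ValueError(
                f"Algorithm name {name!r} is already taken by {_REGISTRY[name].__name__}; "
                f"use a different name or call unregister_glint_remover() on the old implementation first."
            )
        # Write name back to the class attribute for use by the algorithm itself and by logging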
        cls.name = name
        _REGISTRY[name] = cls
        return cls
    return decorator
 def get_remover(name: str) -> Type[BaseGlintRemover]:
    """
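    Look up the implementation class (uninstantiated) by algorithm name.
    The caller constructs an instance with `Cls(...)` and then calls process().
    Raises
    ------
    KeyError
        Raised when name is not in the registry; the message lists the registered names to aid debugging.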
    """
    try:
        return _REGISTRY[name]
    except KeyError as exc:
        known = ", ".join(sorted(_REGISTRY)) or "<empty>"
        raise KeyError(
            f"Unknown algorithm name: {name!r}. Registered algorithms: {known}"
        ) from exc
 def list_removers() -> Dict[str, Type[BaseGlintRemover]]:
    """
    Return a shallow copy of the current registry.
    Useful for:
        - debug logging
        - exposing a GET /api/algorithms endpoint to the frontend
        - unit-test assertions
    """
    return dict(_REGISTRY)
 def unregister_glint_remover(name: str) -> None:
    """
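    Unregister the given algorithm. Mainly intended for:
        - unit tests
        - hot-reload / plugin-unload scenarios
    Production code normally does not need to call this.
    """
    if name not in _REGISTRY:
        raise KeyError(f"Unknown algorithm name: {name!r}")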
    del _REGISTRY[name]
--- a/new/app/core/task_store.py
+++ b/new/app/core/task_store.py
@ -1,91 +0,0 @@
 """
 app/core/task_store.py
 ======================
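 In-memory task state store that is concurrency-safe within a single event loop;
 replaces MOCK_TASK_DB from the early mock pipeline.
 Design goals
 ------------
 1. Provide event-loop-level mutual exclusion (asyncio.Lock) within a single process,
   so that state cannot become inconsistent when an await interleaves between update and set/get.
 2. Expose an async API (set_task / update_task / get_task)
   so callers express the critical section explicitly in async code.
 3. Keep a synchronous has_task() for lightweight existence checks.
 4. Production should switch to Redis / SQLite / PostgreSQL,
   but the interface shape stays the same so callers can migrate without changes.
 Usage conventions
 -----------------
 - Write the initial PENDING record:                       await set_task(task_id, record)
 - Update fields incrementally (PROCESSING/SUCCESS/FAILED): await update_task(task_id, **fields)
 - Read a task record:                                      await get_task(task_id)   # may return None
 - Synchronous existence check:                             has_task(task_id)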
 """
 import asyncio
 from typing import Any, Dict, Optional
 # ---------------------------------------------------------------------------
 # Global store and lock
 # ---------------------------------------------------------------------------
 # TASK_STORE: task_id -> task record
 # Task record field convention (kept in sync with endpoints.py):
 #     task_id, method, params, status,
 #     output_zarr_path, error, traceback,
 #     created_at, updated_at
 # ---------------------------------------------------------------------------
 TASK_STORE: Dict[str, Dict[str, Any]] = {}
 # Event-loop-level mutex within a single process.
 # Note: since Python 3.10, asyncio.Lock does not capture an event loop at construction;
 # it binds to the running loop lazily on first use, so creating it at module top level is safe.
 _lock: asyncio.Lock = asyncio.Lock()
 # ---------------------------------------------------------------------------
 # Async API
 # ---------------------------------------------------------------------------
 async def set_task(task_id: str, record: Dict[str, Any]) -> None:
    """
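    Initialize a task record or overwrite it as a whole.
    Usage: the POST endpoint calls this right after receiving a submission, writing the initial PENDING record.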
    """
    async with _lock:
        TASK_STORE[task_id] = record
 async def update_task(task_id: str, **fields: Any) -> None:
    """
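    Update a task record incrementally, field by field.
    Usage: the background executor calls this on state transitions such as PROCESSING / SUCCESS / FAILED.
    If task_id does not exist, setdefault creates an empty dict before the update (defensive fallback).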
    """
    async with _lock:
        record = TASK_STORE.setdefault(task_id, {})
        record.update(fields)
 async def get_task(task_id: str) -> Optional[Dict[str, Any]]:
    """
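    Read a task record; returns None if it does not exist.
    The returned dict is the stored object, not a copy, so callers should treat it as read-only.
    Usage: GET /api/tasks/{task_id} queries through this function.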
    """
    async with _lock:
        return TASK_STORE.get(task_id)
 # ---------------------------------------------------------------------------
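 # Sync API (lightweight)
 # ---------------------------------------------------------------------------
 def has_task(task_id: str) -> bool:
    """
    Check synchronously whether task_id exists.
    Intended for lightweight cases that do not need the lock (e.g. a pre-check before logging).
    It is also safe to call from async code: the dict membership test contains no await point,
    so no other coroutine can interleave with it.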
    """
    return task_id in TASK_STORE
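The locking pattern used by `update_task` can be tried out standalone. This is a sketch under assumptions, not the deleted module: `store`, `update`, and `main` are hypothetical names, the lock is created inside the running loop for portability, and `asyncio.sleep(0)` stands in for any await that might appear inside the critical section.

```python
import asyncio
from typing import Any, Dict

store: Dict[str, Dict[str, Any]] = {}


async def update(lock: asyncio.Lock, task_id: str, **fields: Any) -> None:
    """Merge `fields` into the record for `task_id` while holding the lock."""
    async with lock:
        record = store.setdefault(task_id, {})
        # Yield to the event loop inside the critical section; without the
        # lock, another coroutine could run between the read and the write.
        await asyncio.sleep(0)
        record.update(fields)


async def main() -> Dict[str, Any]:
    lock = asyncio.Lock()
    # Two concurrent updates to the same task record.
    await asyncio.gather(
        update(lock, "t1", status="PROCESSING"),
        update(lock, "t1", progress=50),
    )
    return store["t1"]


final = asyncio.run(main())
```

Both fields end up in the record. The lock only starts to matter once a critical section contains an await, which is the situation the module's design goals describe.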
--- a/new/app/main.py
+++ b/new/app/main.py
@ -1,64 +0,0 @@
 """
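 WQ_GUI FastAPI backend entry point
 ==================================
 Application startup and global middleware configuration:
    - CORS: all origins are allowed during development, to ease integration with a local frontend (Vite / Webpack dev server)
    - Routing: the business endpoints in app/api/endpoints.py are mounted via include_router
 Business endpoints:
    POST /api/process/deglint  submit a deglinting task; returns a task_id immediately
    GET  /api/tasks/{task_id}  query the status and result of a given task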
 """
 from typing import Dict
 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
 from app.api.endpoints import router as deglint_router
 from app.api.modeling import router as modeling_router
 from app.api.modeling import models_router
 # ---------------------------------------------------------------------------
 # FastAPI application instance
 # ---------------------------------------------------------------------------
 app = FastAPI(
    title="WQ_GUI Backend",
    description="Hyperspectral image deglinting API",
    version="0.2.0",
 )
 # ---------------------------------------------------------------------------
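 # CORS middleware
 # ---------------------------------------------------------------------------
 # Development: all origins, methods, and headers are allowed so a local frontend on any port can connect.
 # Production must restrict allow_origins to the real frontend domain to avoid security risks.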
 # ---------------------------------------------------------------------------
 app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
 )
 # ---------------------------------------------------------------------------
 # Route registration
 # ---------------------------------------------------------------------------
 # Everything is mounted under the /api prefix to ease future versioning (e.g. /api/v1, /api/v2).
 # ---------------------------------------------------------------------------
 app.include_router(deglint_router, prefix="/api")
 app.include_router(modeling_router, prefix="/api")
 app.include_router(models_router, prefix="/api")
 # ---------------------------------------------------------------------------
 # Health check on the root path (handy for local debugging; not required by the business logic)
 # ---------------------------------------------------------------------------
 @app.get("/")
 async def root() -> Dict[str, str]:
    return {"service": "WQ_GUI Backend", "status": "ok"}
--- a/new/frontend/.gitignore
+++ b/new/frontend/.gitignore
@ -1,24 +0,0 @@
 # Logs
 logs
 *.log
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
 pnpm-debug.log*
 lerna-debug.log*
 node_modules
 dist
 dist-ssr
 *.local
 # Editor directories and files
 .vscode/*
 !.vscode/extensions.json
 .idea
 .DS_Store
 *.suo
 *.ntvs*
 *.njsproj
 *.sln
 *.sw?
--- a/new/frontend/README.md
+++ b/new/frontend/README.md
@ -1,5 +0,0 @@
 # Vue 3 + TypeScript + Vite
 This template should help get you started developing with Vue 3 and TypeScript in Vite. The template uses Vue 3 `<script setup>` SFCs, check out the [script setup docs](https://v3.vuejs.org/api/sfc-script-setup.html#sfc-script-setup) to learn more.
 Learn more about the recommended Project Setup and IDE Support in the [Vue Docs TypeScript Guide](https://vuejs.org/guide/typescript/overview.html#project-setup).
--- a/new/frontend/index.html
+++ b/new/frontend/index.html
@ -1,13 +0,0 @@
 <!doctype html>
 <html lang="en">
  <head>
    <meta charset="UTF-8" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>frontend</title>
  </head>
  <body>
    <div id="app"></div>
    <script type="module" src="/src/main.ts"></script>
  </body>
 </html>
--- a/new/frontend/package-lock.json
+++ b/new/frontend/package-lock.json
--- a/new/frontend/package.json
+++ b/new/frontend/package.json
@ -1,27 +0,0 @@
 {
  "name": "frontend",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vue-tsc -b && vite build",
    "preview": "vite preview"
  },
  "dependencies": {
    "axios": "^1.16.1",
    "echarts": "^6.1.0",
    "element-plus": "^2.14.1",
    "pinia": "^3.0.4",
    "vue": "^3.5.34",
    "vue-router": "^5.1.0"
  },
  "devDependencies": {
    "@types/node": "^24.12.3",
    "@vitejs/plugin-vue": "^6.0.6",
    "@vue/tsconfig": "^0.9.1",
    "typescript": "~6.0.2",
    "vite": "^8.0.12",
    "vue-tsc": "^3.2.8"
  }
 }
--- a/new/frontend/public/favicon.svg
+++ b/new/frontend/public/favicon.svg
--- a/new/frontend/public/icons.svg
+++ b/new/frontend/public/icons.svg
@ -1,24 +0,0 @@
 <svg xmlns="http://www.w3.org/2000/svg">
  <symbol id="bluesky-icon" viewBox="0 0 16 17">
    <g clip-path="url(#bluesky-clip)"><path fill="#08060d" d="M7.75 7.735c-.693-1.348-2.58-3.86-4.334-5.097-1.68-1.187-2.32-.981-2.74-.79C.188 2.065.1 2.812.1 3.251s.241 3.602.398 4.13c.52 1.744 2.367 2.333 4.07 2.145-2.495.37-4.71 1.278-1.805 4.512 3.196 3.309 4.38-.71 4.987-2.746.608 2.036 1.307 5.91 4.93 2.746 2.72-2.746.747-4.143-1.747-4.512 1.702.189 3.55-.4 4.07-2.145.156-.528.397-3.691.397-4.13s-.088-1.186-.575-1.406c-.42-.19-1.06-.395-2.741.79-1.755 1.24-3.64 3.752-4.334 5.099"/></g>
    <defs><clipPath id="bluesky-clip"><path fill="#fff" d="M.1.85h15.3v15.3H.1z"/></clipPath></defs>
  </symbol>
  <symbol id="discord-icon" viewBox="0 0 20 19">
    <path fill="#08060d" d="M16.224 3.768a14.5 14.5 0 0 0-3.67-1.153c-.158.286-.343.67-.47.976a13.5 13.5 0 0 0-4.067 0c-.128-.306-.317-.69-.476-.976A14.4 14.4 0 0 0 3.868 3.77C1.546 7.28.916 10.703 1.231 14.077a14.7 14.7 0 0 0 4.5 2.306q.545-.748.965-1.587a9.5 9.5 0 0 1-1.518-.74q.191-.14.372-.293c2.927 1.369 6.107 1.369 8.999 0q.183.152.372.294-.723.437-1.52.74.418.838.963 1.588a14.6 14.6 0 0 0 4.504-2.308c.37-3.911-.63-7.302-2.644-10.309m-9.13 8.234c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.894 0 1.614.82 1.599 1.82.001 1-.705 1.82-1.6 1.82m5.91 0c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.893 0 1.614.82 1.599 1.82 0 1-.706 1.82-1.6 1.82"/>
  </symbol>
  <symbol id="documentation-icon" viewBox="0 0 21 20">
    <path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="m15.5 13.333 1.533 1.322c.645.555.967.833.967 1.178s-.322.623-.967 1.179L15.5 18.333m-3.333-5-1.534 1.322c-.644.555-.966.833-.966 1.178s.322.623.966 1.179l1.534 1.321"/>
    <path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M17.167 10.836v-4.32c0-1.41 0-2.117-.224-2.68-.359-.906-1.118-1.621-2.08-1.96-.599-.21-1.349-.21-2.848-.21-2.623 0-3.935 0-4.983.369-1.684.591-3.013 1.842-3.641 3.428C3 6.449 3 7.684 3 10.154v2.122c0 2.558 0 3.838.706 4.726q.306.383.713.671c.76.536 1.79.64 3.581.66"/>
    <path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M3 10a2.78 2.78 0 0 1 2.778-2.778c.555 0 1.209.097 1.748-.047.48-.129.854-.503.982-.982.145-.54.048-1.194.048-1.749a2.78 2.78 0 0 1 2.777-2.777"/>
  </symbol>
  <symbol id="github-icon" viewBox="0 0 19 19">
    <path fill="#08060d" fill-rule="evenodd" d="M9.356 1.85C5.05 1.85 1.57 5.356 1.57 9.694a7.84 7.84 0 0 0 5.324 7.44c.387.079.528-.168.528-.376 0-.182-.013-.805-.013-1.454-2.165.467-2.616-.935-2.616-.935-.349-.91-.864-1.143-.864-1.143-.71-.48.051-.48.051-.48.787.051 1.2.805 1.2.805.695 1.194 1.817.857 2.268.649.064-.507.27-.857.49-1.052-1.728-.182-3.545-.857-3.545-3.87 0-.857.31-1.558.8-2.104-.078-.195-.349-1 .077-2.078 0 0 .657-.208 2.14.805a7.5 7.5 0 0 1 1.946-.26c.657 0 1.328.092 1.946.26 1.483-1.013 2.14-.805 2.14-.805.426 1.078.155 1.883.078 2.078.502.546.799 1.247.799 2.104 0 3.013-1.818 3.675-3.558 3.87.284.247.528.714.528 1.454 0 1.052-.012 1.896-.012 2.156 0 .208.142.455.528.377a7.84 7.84 0 0 0 5.324-7.441c.013-4.338-3.48-7.844-7.773-7.844" clip-rule="evenodd"/>
  </symbol>
  <symbol id="social-icon" viewBox="0 0 20 20">
    <path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M12.5 6.667a4.167 4.167 0 1 0-8.334 0 4.167 4.167 0 0 0 8.334 0"/>
    <path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M2.5 16.667a5.833 5.833 0 0 1 8.75-5.053m3.837.474.513 1.035c.07.144.257.282.414.309l.93.155c.596.1.736.536.307.965l-.723.73a.64.64 0 0 0-.152.531l.207.903c.164.715-.213.991-.84.618l-.872-.52a.63.63 0 0 0-.577 0l-.872.52c-.624.373-1.003.094-.84-.618l.207-.903a.64.64 0 0 0-.152-.532l-.723-.729c-.426-.43-.289-.864.306-.964l.93-.156a.64.64 0 0 0 .412-.31l.513-1.034c.28-.562.735-.562 1.012 0"/>
  </symbol>
  <symbol id="x-icon" viewBox="0 0 19 19">
    <path fill="#08060d" fill-rule="evenodd" d="M1.893 1.98c.052.072 1.245 1.769 2.653 3.77l2.892 4.114c.183.261.333.48.333.486s-.068.089-.152.183l-.522.593-.765.867-3.597 4.087c-.375.426-.734.834-.798.905a1 1 0 0 0-.118.148c0 .01.236.017.664.017h.663l.729-.83c.4-.457.796-.906.879-.999a692 692 0 0 0 1.794-2.038c.034-.037.301-.34.594-.675l.551-.624.345-.392a7 7 0 0 1 .34-.374c.006 0 .93 1.306 2.052 2.903l2.084 2.965.045.063h2.275c1.87 0 2.273-.003 2.266-.021-.008-.02-1.098-1.572-3.894-5.547-2.013-2.862-2.28-3.246-2.273-3.266.008-.019.282-.332 2.085-2.38l2-2.274 1.567-1.782c.022-.028-.016-.03-.65-.03h-.674l-.3.342a871 871 0 0 1-1.782 2.025c-.067.075-.405.458-.75.852a100 100 0 0 1-.803.91c-.148.172-.299.344-.99 1.127-.304.343-.32.358-.345.327-.015-.019-.904-1.282-1.976-2.808L6.365 1.85H1.8zm1.782.91 8.078 11.294c.772 1.08 1.413 1.973 1.425 1.984.016.017.241.02 1.05.017l1.03-.004-2.694-3.766L7.796 5.75 5.722 2.852l-1.039-.004-1.039-.004z" clip-rule="evenodd"/>
  </symbol>
 </svg>
--- a/new/frontend/src/App.vue
+++ b/new/frontend/src/App.vue
@ -1,225 +0,0 @@
 <template>
  <div class="dashboard-container">
    <h1 class="title">Hyperspectral Water Quality Retrieval Console</h1>
    <el-row :gutter="20">
      <el-col :span="12">
        <el-card class="box-card" shadow="hover">
          <template #header>
            <div class="card-header">
              <span class="header-title">🚀 Model Training (Train)</span>
            </div>
          </template>
          <el-form label-position="top">
            <el-form-item label="Algorithm (Model Type)">
              <el-select v-model="trainForm.model_type" placeholder="Select an algorithm" class="w-full">
                <el-option label="Random Forest (RF)" value="RF" />
                <el-option label="Support Vector Regression (SVR)" value="SVR" />
                <el-option label="Linear Regression (LinearRegression)" value="LinearRegression" />
                <el-option label="K-Nearest Neighbors (KNN)" value="KNN" />
                <el-option label="Partial Least Squares (PLS)" value="PLS" />
              </el-select>
            </el-form-item>
            <el-form-item label="Target Parameter (Target)">
              <el-input v-model="trainForm.target" placeholder="e.g. Chl-a" />
            </el-form-item>
            <el-form-item label="Training data path (absolute CSV path)">
              <el-input v-model="trainForm.train_data_path" placeholder="e.g. D:\111\data.csv" />
            </el-form-item>
            <el-form-item label="Feature start column (e.g. 4, or a column name)">
              <el-input v-model="trainForm.feature_start" placeholder="Enter a number or a column name" />
            </el-form-item>
            <el-button type="primary" @click="handleTrain" :loading="trainPoller?.isPolling?.value" class="w-full">
              Start training
            </el-button>
          </el-form>
          <div v-if="trainTaskId" class="status-board">
            <p><strong>Task ID:</strong> <el-tag size="small" type="info">{{ trainTaskId }}</el-tag></p>
            <p><strong>Current status:</strong>
              <el-tag :type="getStatusType(trainPoller?.status?.value || 'PENDING')" style="margin-left:10px">
                {{ trainPoller?.status?.value || 'PENDING' }}
              </el-tag>
            </p>
            <el-progress
              v-if="trainPoller?.isPolling?.value || trainPoller?.status?.value === 'SUCCESS'"
              :percentage="trainPoller?.status?.value === 'SUCCESS' ? 100 : 60"
              :status="trainPoller?.status?.value === 'SUCCESS' ? 'success' : (trainPoller?.status?.value === 'FAILED' ? 'exception' : '')"
              :indeterminate="trainPoller?.isPolling?.value"
            />
            <div v-if="trainPoller?.error?.value" class="error-msg">
              <el-alert :title="trainPoller.error.value" type="error" :closable="false" show-icon />
            </div>
            <div v-if="trainPoller?.result?.value?.model_id" class="result-msg">
              <el-descriptions border :column="1" size="small" title="Training metrics">
                <el-descriptions-item label="Model ID">{{ trainPoller.result.value.model_id }}</el-descriptions-item>
                <el-descriptions-item label="Test R²">{{ Number(trainPoller.result.value.test_r2).toFixed(4) }}</el-descriptions-item>
                <el-descriptions-item label="Test RMSE">{{ Number(trainPoller.result.value.test_rmse).toFixed(4) }}</el-descriptions-item>
              </el-descriptions>
            </div>
          </div>
        </el-card>
      </el-col>
      <el-col :span="12">
        <el-card class="box-card" shadow="hover">
          <template #header>
            <div class="card-header">
              <span class="header-title">🎯 Model Inference (Predict)</span>
            </div>
          </template>
          <el-form label-position="top">
            <el-form-item label="Trained Model ID (Model ID)">
              <el-input v-model="predictForm.model_id" placeholder="Filled in automatically with the ID trained on the left" />
            </el-form-item>
            <el-form-item label="Input image path (absolute Zarr path)">
              <el-input v-model="predictForm.input_zarr_path" placeholder="e.g. D:\111\image.zarr" />
            </el-form-item>
            <el-button type="success" @click="handlePredict" :loading="predictPoller?.isPolling?.value" class="w-full">
              Start full-image retrieval inference
            </el-button>
          </el-form>
          <div v-if="predictTaskId" class="status-board">
            <p><strong>Task ID:</strong> <el-tag size="small" type="info">{{ predictTaskId }}</el-tag></p>
            <p><strong>Current status:</strong>
              <el-tag :type="getStatusType(predictPoller?.status?.value || 'PENDING')" style="margin-left:10px">
                {{ predictPoller?.status?.value || 'PENDING' }}
              </el-tag>
            </p>
            <el-progress
              v-if="predictPoller?.isPolling?.value || predictPoller?.status?.value === 'SUCCESS'"
              :percentage="predictPoller?.status?.value === 'SUCCESS' ? 100 : 50"
              :status="predictPoller?.status?.value === 'SUCCESS' ? 'success' : (predictPoller?.status?.value === 'FAILED' ? 'exception' : '')"
              :indeterminate="predictPoller?.isPolling?.value"
            />
            <div v-if="predictPoller?.error?.value" class="error-msg">
              <el-alert :title="predictPoller.error.value" type="error" :closable="false" show-icon />
            </div>
            <div v-if="predictPoller?.result?.value?.output_zarr_path" class="result-msg">
              <el-alert :title="'Inference succeeded. Result written to: ' + predictPoller.result.value.output_zarr_path" type="success" :closable="false" show-icon />
            </div>
          </div>
        </el-card>
      </el-col>
    </el-row>
  </div>
 </template>
 <script setup lang="ts">
 import { ref, watch, reactive } from 'vue'
 import { submitTrain, submitPredict } from './api/tasks'
 import { useTaskPoller } from './composables/useTaskPoller'
 // Training form state
 const trainForm = reactive({
  model_type: 'RF',
  target: 'Chl-a',
  train_data_path: '',
  feature_start: '4'
 })
 const trainTaskId = ref<string | null>(null)
 const trainPoller = useTaskPoller(trainTaskId)
 // Inference form state
 const predictForm = reactive({
  model_id: '',
  input_zarr_path: ''
 })
 const predictTaskId = ref<string | null>(null)
 const predictPoller = useTaskPoller(predictTaskId)
 // Auto-fill the inference form with the model ID once training finishes
 watch(() => trainPoller?.result?.value?.model_id, (newId) => {
  if (newId) predictForm.model_id = newId as string
 })
 // Submit training
 const handleTrain = async () => {
  try {
    const res = await submitTrain({
      model_type: trainForm.model_type,
      target: trainForm.target,
      train_data_path: trainForm.train_data_path,
      feature_start: trainForm.feature_start,
      params: {}
    })
    trainTaskId.value = res.task_id
  } catch (e: any) {
    console.error('Training request failed', e)
    alert('Submission failed. Check that the backend is running on port 9090, or press F12 and look for CORS errors in the console')
  }
 }
 // Submit inference
 const handlePredict = async () => {
  try {
    const res = await submitPredict({
      model_id: predictForm.model_id,
      input_zarr_path: predictForm.input_zarr_path
    })
    predictTaskId.value = res.task_id
  } catch (e: any) {
    console.error('Inference request failed', e)
  }
 }
 // Style helper
 const getStatusType = (status: string) => {
  if (status === 'SUCCESS') return 'success'
  if (status === 'FAILED') return 'danger'
  if (status === 'PROCESSING') return 'warning'
  return 'info'
 }
 </script>
 <style>
 /* Remove the global default margins */
 body {
  margin: 0;
  padding: 0;
 }
 </style>
 <style scoped>
 .dashboard-container {
  padding: 40px;
  min-height: 100vh;
  background-color: #1e1e2d; /* dark, tech-style background */
 }
 .title {
  text-align: center;
  margin-bottom: 40px;
  color: #ffffff;
  font-weight: 300;
  letter-spacing: 2px;
 }
 .header-title {
  font-weight: bold;
  font-size: 16px;
 }
 .box-card {
  margin-bottom: 20px;
  background-color: rgba(255, 255, 255, 0.95);
 }
 .w-full {
  width: 100%;
 }
 .status-board {
  margin-top: 25px;
  padding: 20px;
  background: #f8f9fa;
  border-radius: 8px;
  border: 1px solid #e4e7ed;
 }
 .error-msg, .result-msg {
  margin-top: 20px;
 }
 </style>
--- a/new/frontend/src/api/request.ts
+++ b/new/frontend/src/api/request.ts
@ -1,15 +0,0 @@
 import axios from 'axios'
 const request = axios.create({
  // Note: points directly at the backend on port 9090, as just reconfigured
  baseURL: 'http://127.0.0.1:9090',
  timeout: 60000
 })
 // Interceptor: unwrap the response and return its data directly
 request.interceptors.response.use(
  response => response.data,
  error => Promise.reject(error)
 )
 export default request
--- a/new/frontend/src/api/tasks.ts
+++ b/new/frontend/src/api/tasks.ts
@ -1,13 +0,0 @@
 import request from './request'
 export const submitTrain = (data: any) => {
  return request.post<any, any>('/api/modeling/train', data)
 }
 export const submitPredict = (data: any) => {
  return request.post<any, any>('/api/modeling/predict', data)
 }
 export const getTaskStatus = (task_id: string) => {
  return request.get<any, any>(`/api/tasks/${task_id}`)
 }
--- a/new/frontend/src/assets/hero.png
+++ b/new/frontend/src/assets/hero.png
--- a/new/frontend/src/assets/vite.svg
+++ b/new/frontend/src/assets/vite.svg
--- a/new/frontend/src/assets/vue.svg
+++ b/new/frontend/src/assets/vue.svg
@ -1 +0,0 @@
 <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="37.07" height="36" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 198"><path fill="#41B883" d="M204.8 0H256L128 220.8L0 0h97.92L128 51.2L157.44 0h47.36Z"></path><path fill="#41B883" d="m0 0l128 220.8L256 0h-51.2L128 132.48L50.56 0H0Z"></path><path fill="#35495E" d="M50.56 0L128 133.12L204.8 0h-47.36L128 51.2L97.92 0H50.56Z"></path></svg>
--- a/new/frontend/src/components/HelloWorld.vue
+++ b/new/frontend/src/components/HelloWorld.vue
@ -1,95 +0,0 @@
 <script setup lang="ts">
 import { ref } from 'vue'
 import viteLogo from '../assets/vite.svg'
 import heroImg from '../assets/hero.png'
 import vueLogo from '../assets/vue.svg'
 const count = ref(0)
 </script>
 <template>
  <section id="center">
    <div class="hero">
      <img :src="heroImg" class="base" width="170" height="179" alt="" />
      <img :src="vueLogo" class="framework" alt="Vue logo" />
      <img :src="viteLogo" class="vite" alt="Vite logo" />
    </div>
    <div>
      <h1>Get started</h1>
      <p>Edit <code>src/App.vue</code> and save to test <code>HMR</code></p>
    </div>
    <button type="button" class="counter" @click="count++">
      Count is {{ count }}
    </button>
  </section>
  <div class="ticks"></div>
  <section id="next-steps">
    <div id="docs">
      <svg class="icon" role="presentation" aria-hidden="true">
        <use href="/icons.svg#documentation-icon"></use>
      </svg>
      <h2>Documentation</h2>
      <p>Your questions, answered</p>
      <ul>
        <li>
          <a href="https://vite.dev/" target="_blank">
            <img class="logo" :src="viteLogo" alt="" />
            Explore Vite
          </a>
        </li>
        <li>
          <a href="https://vuejs.org/" target="_blank">
            <img class="button-icon" :src="vueLogo" alt="" />
            Learn more
          </a>
        </li>
      </ul>
    </div>
    <div id="social">
      <svg class="icon" role="presentation" aria-hidden="true">
        <use href="/icons.svg#social-icon"></use>
      </svg>
      <h2>Connect with us</h2>
      <p>Join the Vite community</p>
      <ul>
        <li>
          <a href="https://github.com/vitejs/vite" target="_blank">
            <svg class="button-icon" role="presentation" aria-hidden="true">
              <use href="/icons.svg#github-icon"></use>
            </svg>
            GitHub
          </a>
        </li>
        <li>
          <a href="https://chat.vite.dev/" target="_blank">
            <svg class="button-icon" role="presentation" aria-hidden="true">
              <use href="/icons.svg#discord-icon"></use>
            </svg>
            Discord
          </a>
        </li>
        <li>
          <a href="https://x.com/vite_js" target="_blank">
            <svg class="button-icon" role="presentation" aria-hidden="true">
              <use href="/icons.svg#x-icon"></use>
            </svg>
            X.com
          </a>
        </li>
        <li>
          <a href="https://bsky.app/profile/vite.dev" target="_blank">
            <svg class="button-icon" role="presentation" aria-hidden="true">
              <use href="/icons.svg#bluesky-icon"></use>
            </svg>
            Bluesky
          </a>
        </li>
      </ul>
    </div>
  </section>
  <div class="ticks"></div>
  <section id="spacer"></section>
 </template>
--- a/new/frontend/src/composables/useTaskPoller.ts
+++ b/new/frontend/src/composables/useTaskPoller.ts
@ -1,51 +0,0 @@
 import { ref, watch, onUnmounted, type Ref } from 'vue'
 import { getTaskStatus } from '../api/tasks'
 export function useTaskPoller(taskIdRef: Ref<string | null>) {
  const status = ref<string>('')
  const isPolling = ref(false)
  const error = ref<string | null>(null)
  const result = ref<any>(null)
  let timer: any = null
  const start = () => {
    if (!taskIdRef.value) return
    isPolling.value = true
    error.value = null
    status.value = 'PENDING'
    timer = setInterval(async () => {
      try {
        const res = await getTaskStatus(taskIdRef.value!)
        status.value = res.status
        if (res.status === 'SUCCESS') {
          result.value = res
          stop()
        } else if (res.status === 'FAILED') {
          error.value = res.error || 'Task execution failed'
          stop()
        }
      } catch (e: any) {
        error.value = '网络请求失败，请检查后端状态'
        stop()
      }
    }, 2000)
  }
  const stop = () => {
    isPolling.value = false
    if (timer) clearInterval(timer)
  }
  // Restart polling automatically whenever the task ID changes
  watch(taskIdRef, (newVal) => {
    stop()
    if (newVal) start()
  })
  // Clear the timer when the component is unmounted
  onUnmounted(() => stop())
  return { status, isPolling, error, result, stop }
 }
--- a/new/frontend/src/main.ts
+++ b/new/frontend/src/main.ts
@ -1,9 +0,0 @@
 import { createApp } from 'vue'
 import ElementPlus from 'element-plus'
 import 'element-plus/dist/index.css'
 import App from './App.vue'
 const app = createApp(App)
 app.use(ElementPlus)
 app.mount('#app')
--- a/new/frontend/src/style.css
+++ b/new/frontend/src/style.css
@ -1,296 +0,0 @@
 :root {
  --text: #6b6375;
  --text-h: #08060d;
  --bg: #fff;
  --border: #e5e4e7;
  --code-bg: #f4f3ec;
  --accent: #aa3bff;
  --accent-bg: rgba(170, 59, 255, 0.1);
  --accent-border: rgba(170, 59, 255, 0.5);
  --social-bg: rgba(244, 243, 236, 0.5);
  --shadow:
    rgba(0, 0, 0, 0.1) 0 10px 15px -3px, rgba(0, 0, 0, 0.05) 0 4px 6px -2px;
  --sans: system-ui, 'Segoe UI', Roboto, sans-serif;
  --heading: system-ui, 'Segoe UI', Roboto, sans-serif;
  --mono: ui-monospace, Consolas, monospace;
  font: 18px/145% var(--sans);
  letter-spacing: 0.18px;
  color-scheme: light dark;
  color: var(--text);
  background: var(--bg);
  font-synthesis: none;
  text-rendering: optimizeLegibility;
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
  @media (max-width: 1024px) {
    font-size: 16px;
  }
 }
@media (prefers-color-scheme: dark) {
  :root {
    --text: #9ca3af;
    --text-h: #f3f4f6;
    --bg: #16171d;
    --border: #2e303a;
    --code-bg: #1f2028;
    --accent: #c084fc;
    --accent-bg: rgba(192, 132, 252, 0.15);
    --accent-border: rgba(192, 132, 252, 0.5);
    --social-bg: rgba(47, 48, 58, 0.5);
    --shadow:
      rgba(0, 0, 0, 0.4) 0 10px 15px -3px, rgba(0, 0, 0, 0.25) 0 4px 6px -2px;
  }
  #social .button-icon {
    filter: invert(1) brightness(2);
  }
 }
 body {
  margin: 0;
 }
 h1,
 h2 {
  font-family: var(--heading);
  font-weight: 500;
  color: var(--text-h);
 }
 h1 {
  font-size: 56px;
  letter-spacing: -1.68px;
  margin: 32px 0;
  @media (max-width: 1024px) {
    font-size: 36px;
    margin: 20px 0;
  }
 }
 h2 {
  font-size: 24px;
  line-height: 118%;
  letter-spacing: -0.24px;
  margin: 0 0 8px;
  @media (max-width: 1024px) {
    font-size: 20px;
  }
 }
 p {
  margin: 0;
 }
 code,
 .counter {
  font-family: var(--mono);
  display: inline-flex;
  border-radius: 4px;
  color: var(--text-h);
 }
 code {
  font-size: 15px;
  line-height: 135%;
  padding: 4px 8px;
  background: var(--code-bg);
 }
 .counter {
  font-size: 16px;
  padding: 5px 10px;
  border-radius: 5px;
  color: var(--accent);
  background: var(--accent-bg);
  border: 2px solid transparent;
  transition: border-color 0.3s;
  margin-bottom: 24px;
  &:hover {
    border-color: var(--accent-border);
  }
  &:focus-visible {
    outline: 2px solid var(--accent);
    outline-offset: 2px;
  }
 }
 .hero {
  position: relative;
  .base,
  .framework,
  .vite {
    inset-inline: 0;
    margin: 0 auto;
  }
  .base {
    width: 170px;
    position: relative;
    z-index: 0;
  }
  .framework,
  .vite {
    position: absolute;
  }
  .framework {
    z-index: 1;
    top: 34px;
    height: 28px;
    transform: perspective(2000px) rotateZ(300deg) rotateX(44deg) rotateY(39deg)
      scale(1.4);
  }
  .vite {
    z-index: 0;
    top: 107px;
    height: 26px;
    width: auto;
    transform: perspective(2000px) rotateZ(300deg) rotateX(40deg) rotateY(39deg)
      scale(0.8);
  }
 }
 #app {
  width: 1126px;
  max-width: 100%;
  margin: 0 auto;
  text-align: center;
  border-inline: 1px solid var(--border);
  min-height: 100svh;
  display: flex;
  flex-direction: column;
  box-sizing: border-box;
 }
 #center {
  display: flex;
  flex-direction: column;
  gap: 25px;
  place-content: center;
  place-items: center;
  flex-grow: 1;
  @media (max-width: 1024px) {
    padding: 32px 20px 24px;
    gap: 18px;
  }
 }
 #next-steps {
  display: flex;
  border-top: 1px solid var(--border);
  text-align: left;
  & > div {
    flex: 1 1 0;
    padding: 32px;
    @media (max-width: 1024px) {
      padding: 24px 20px;
    }
  }
  .icon {
    margin-bottom: 16px;
    width: 22px;
    height: 22px;
  }
  @media (max-width: 1024px) {
    flex-direction: column;
    text-align: center;
  }
 }
 #docs {
  border-right: 1px solid var(--border);
  @media (max-width: 1024px) {
    border-right: none;
    border-bottom: 1px solid var(--border);
  }
 }
 #next-steps ul {
  list-style: none;
  padding: 0;
  display: flex;
  gap: 8px;
  margin: 32px 0 0;
  .logo {
    height: 18px;
  }
  a {
    color: var(--text-h);
    font-size: 16px;
    border-radius: 6px;
    background: var(--social-bg);
    display: flex;
    padding: 6px 12px;
    align-items: center;
    gap: 8px;
    text-decoration: none;
    transition: box-shadow 0.3s;
    &:hover {
      box-shadow: var(--shadow);
    }
    .button-icon {
      height: 18px;
      width: 18px;
    }
  }
  @media (max-width: 1024px) {
    margin-top: 20px;
    flex-wrap: wrap;
    justify-content: center;
    li {
      flex: 1 1 calc(50% - 8px);
    }
    a {
      width: 100%;
      justify-content: center;
      box-sizing: border-box;
    }
  }
 }
 #spacer {
  height: 88px;
  border-top: 1px solid var(--border);
  @media (max-width: 1024px) {
    height: 48px;
  }
 }
 .ticks {
  position: relative;
  width: 100%;
  &::before,
  &::after {
    content: '';
    position: absolute;
    top: -4.5px;
    border: 5px solid transparent;
  }
  &::before {
    left: 0;
    border-left-color: var(--border);
  }
  &::after {
    right: 0;
    border-right-color: var(--border);
  }
 }
--- a/new/frontend/tsconfig.app.json
+++ b/new/frontend/tsconfig.app.json
@ -1,14 +0,0 @@
 {
  "extends": "@vue/tsconfig/tsconfig.dom.json",
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
    "types": ["vite/client"],
    /* Linting */
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true
  },
  "include": ["src/**/*.ts", "src/**/*.tsx", "src/**/*.vue"]
 }
--- a/new/frontend/tsconfig.json
+++ b/new/frontend/tsconfig.json
@ -1,7 +0,0 @@
 {
  "files": [],
  "references": [
    { "path": "./tsconfig.app.json" },
    { "path": "./tsconfig.node.json" }
  ]
 }
--- a/new/frontend/tsconfig.node.json
+++ b/new/frontend/tsconfig.node.json
@ -1,24 +0,0 @@
 {
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
    "target": "es2023",
    "lib": ["ES2023"],
    "module": "esnext",
    "types": ["node"],
    "skipLibCheck": true,
    /* Bundler mode */
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "moduleDetection": "force",
    "noEmit": true,
    /* Linting */
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true
  },
  "include": ["vite.config.ts"]
 }
--- a/new/frontend/vite.config.ts
+++ b/new/frontend/vite.config.ts
@ -1,7 +0,0 @@
 import { defineConfig } from 'vite'
 import vue from '@vitejs/plugin-vue'
 // https://vite.dev/config/
 export default defineConfig({
  plugins: [vue()],
 })
--- a/src/gui/crash_dump.txt
+++ b/src/gui/crash_dump.txt
@ -1,103 +0,0 @@
 ============================================================
 [2026-05-12 11:14:51]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 130, in <module>
    from src.gui.panels.step9_panel import Step9Panel
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\panels\step9_panel.py", line 24, in <module>
    from src.core.water_quality_inversion_pipeline_GUI import WaterQualityInversionPipeline
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\core\water_quality_inversion_pipeline_GUI.py", line 45, in <module>
    from src.preprocessing.process_water_quality_data import process_water_quality_data
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\preprocessing\process_water_quality_data.py", line 9, in <module>
    from scipy import stats
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\site-packages\scipy\__init__.py", line 143, in __getattr__
    return _importlib.import_module(f'scipy.{name}')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\site-packages\scipy\stats\__init__.py", line 632, in <module>
    from ._multicomp import *
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\site-packages\scipy\stats\_multicomp.py", line 11, in <module>
    from scipy.stats._qmc import check_random_state
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\site-packages\scipy\stats\_qmc.py", line 26, in <module>
    from scipy.sparse.csgraph import minimum_spanning_tree
  File "D:\111\changyongruanjian\anconda\envs\WQ_GUI\Lib\site-packages\scipy\sparse\csgraph\__init__.py", line 188, in <module>
    from ._shortest_path import (
  File "scipy/sparse/csgraph/_shortest_path.pyx", line 21, in init scipy.sparse.csgraph._shortest_path
  File "<frozen importlib._bootstrap>", line 1349, in _find_and_load
 KeyboardInterrupt
 ============================================================
 [2026-05-12 11:57:28]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3123, in <module>
    main()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3093, in main
    _dialog.exec_()
 KeyboardInterrupt
 ============================================================
 [2026-05-28 15:45:11]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3123, in <module>
    main()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3097, in main
    window = WaterQualityGUI()
             ^^^^^^^^^^^^^^^^^
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1352, in __init__
    self.init_ui()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1586, in init_ui
    self.create_content_area()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1943, in create_content_area
    self.step2_panel = Step2Panel()
                       ^^^^^^^^^^^^
 TypeError: Step2Panel.__init__() missing 1 required positional argument: 'session'
 ============================================================
 [2026-05-28 15:45:19]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3123, in <module>
    main()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3097, in main
    window = WaterQualityGUI()
             ^^^^^^^^^^^^^^^^^
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1352, in __init__
    self.init_ui()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1586, in init_ui
    self.create_content_area()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 1943, in create_content_area
    self.step2_panel = Step2Panel()
                       ^^^^^^^^^^^^
 TypeError: Step2Panel.__init__() missing 1 required positional argument: 'session'
 ============================================================
 [2026-05-28 16:00:53]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 2149, in on_step_changed
    self.auto_populate_step_inputs(item_data)
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 2362, in auto_populate_step_inputs
    if step_id not in self.step_dependencies:
                      ^^^^^^^^^^^^^^^^^^^^^^
 AttributeError: 'WaterQualityGUI' object has no attribute 'step_dependencies'. Did you mean: '_init_step_dependencies'?
 ============================================================
 [2026-06-03 13:56:59]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3354, in <module>
    main()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3331, in main
    sys.exit(app.exec_())
             ^^^^^^^^^^^
 KeyboardInterrupt
 ============================================================
 [2026-06-04 09:54:07]
 Traceback (most recent call last):
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3237, in <module>
    main()
  File "D:\111\office\ZHLduijie\1.WQ\WQ_GUI\src\gui\water_quality_gui.py", line 3214, in main
    sys.exit(app.exec_())
             ^^^^^^^^^^^
 KeyboardInterrupt
--- a/src/gui/scaler_params.pkl
+++ b/src/gui/scaler_params.pkl
--- a/tset.py
+++ b/tset.py
@ -1,11 +0,0 @@
 import os
 import sys
 from ctypes import cdll
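 # Locate pyexpat.pyd. xml.parsers.expat is a pure-Python wrapper, so its
 # __file__ points at expat.py; the compiled extension is pyexpat itself.
 import pyexpat
 print(pyexpat.__file__)
 # Inspect its dependencies with Dependency Walker or dumpbin.
 # Run on the command line:
 # dumpbin /dependents C:\Users\HL\.conda\envs\WQ10\Lib\site-packages\pyexpat.pyd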
--- a/封装问题分析报告.md
+++ b/封装问题分析报告.md
@ -1,215 +0,0 @@
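 # Water Quality Inversion GUI: Packaging Issue Analysis Report
 ## 📋 Executive Summary
 **Build status**: ✅ Success  
 **Executable**: `E:\code\WQ\fengzhuang\dist\water_quality_gui.exe`  
 **File size**: 2.57 GB  
 **Build time**: 2025-12-02 14:52-14:59  
 ---
 ## 🔍 Issues Found
 ### 1. ⚠️ Syntax warnings: invalid escape sequences
 The build reported invalid escape sequence warnings in the following files: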
 #### Issue 1: `src/core/glint_removal/get_spectral.py:766`
 ```python
 # ❌ Wrong
 boundary_path = "D:\BaiduNetdiskDownload\yaobao\water_mask.dat"
 # ✅ Correct (fixed)
 boundary_path = r"D:\BaiduNetdiskDownload\yaobao\water_mask.dat"
 ```
 **Problem**: `\B` is not a valid escape sequence
 #### Issue 2: `src/preprocessing/spectral_Preprocessing.py:135`
 ```python
 # ❌ Wrong
 output_spectrum = SS(input_spectrum.values, 'E:\code\WQ\models/scaler_params.pkl')
 # ✅ Correct (fixed)
 output_spectrum = SS(input_spectrum.values, r'E:\code\WQ\models/scaler_params.pkl')
 ```
 **Problem**: `\c` is not a valid escape sequence
 #### Issue 3: `src/core/water_quality_inversion_pipeline.py:2520`
 ```python
 # ❌ Wrong
 parser.add_argument('--work_dir', type=str, default='E:\code\WQ\pipeline_result\work_dir', help='工作目录')
 # ✅ Correct (fixed)
 parser.add_argument('--work_dir', type=str, default=r'E:\code\WQ\pipeline_result\work_dir', help='工作目录')
 ```
 **Problem**: `\c` and `\p` are not valid escape sequences
 #### Issue 4: `src/core/water_quality_inversion_pipeline.py:2591`
 ```python
 # ❌ Wrong
 'csv_path': "D:\BaiduNetdiskDownload\yaobao\csv\input.csv"
 # ✅ Correct (fixed)
 'csv_path': r"D:\BaiduNetdiskDownload\yaobao\csv\input.csv"
 ```
 **Problem**: `\B` and `\c` are not valid escape sequences
 #### Issue 5: `src/postprocessing/box_plot.py:79`
 ```python
 # ❌ Wrong
 save_path = os.path.join(save_dir, f'E:\code\WQ\yaobao925\plot/{safe_column_name}_boxplot.png')
 # ✅ Correct (fixed)
 save_path = os.path.join(save_dir, f'{safe_column_name}_boxplot.png')
 ```
 **Problem**: A hard-coded absolute path that also contains invalid escape sequences
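The fixes above all rely on raw strings. As a minimal sketch of why the `r` prefix matters, the snippet below uses a made-up path (`C:\temp\new_data.csv` is hypothetical and not taken from the project) in which the backslashes happen to form *valid* escapes, so Python silently changes the string instead of warning:

```python
from pathlib import PureWindowsPath

# Hypothetical example path; it does not come from the project.
# In a normal literal, "\t" becomes a tab and "\n" a newline,
# so the path is silently corrupted.
broken = "C:\temp\new_data.csv"
# A raw string keeps every backslash exactly as typed.
raw = r"C:\temp\new_data.csv"
# Doubling each backslash is equivalent to the raw string.
escaped = "C:\\temp\\new_data.csv"

print(repr(broken))
print(raw == escaped)

# PureWindowsPath builds the path without hand-written separators.
built = PureWindowsPath("C:/") / "temp" / "new_data.csv"
print(str(built) == raw)
```

Building paths from components, as in the last lines, also sidesteps the hard-coded absolute path flagged in Issue 5.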
 ---
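 ### 2. ⚠️ Missing hidden imports
 PyInstaller reported the following modules as not found, even though they are listed in the spec file:
 ```
 ERROR: Hidden import 'pyproj.CRS' not found
 ERROR: Hidden import 'pyproj.Transformer' not found
 WARNING: Hidden import "fiona._shim" not found!
 ```
 **Impact**: If these modules are used at runtime, the program may crash
 **Resolution**: 
 - `pyproj.CRS` and `pyproj.Transformer` have been added to the spec file
 - `fiona._shim` is an optional internal module and normally does not affect execution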
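One plausible reading of the two `ERROR` lines, offered here as an assumption rather than something the report states, is that hidden imports must name modules, while `pyproj.CRS` and `pyproj.Transformer` are classes exposed by the `pyproj` package. The sketch below shows a way to check whether a dotted name resolves to a module. The helper name `is_importable_module` is hypothetical, and standard-library names stand in for `pyproj` so the snippet runs anywhere:

```python
import importlib.util


def is_importable_module(name: str) -> bool:
    """Return True if `name` resolves to a module or package, not an attribute."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


# Standard-library stand-ins (illustrative only):
# - json.decoder is a real submodule
# - collections.OrderedDict is a class, not a module
# - no_such_package_xyz.sub has a missing parent package
candidates = ["json.decoder", "collections.OrderedDict", "no_such_package_xyz.sub"]
for name in candidates:
    print(name, is_importable_module(name))
```

Running the same check against the names listed in a spec file, inside the build environment, would show which entries name real modules.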
 ---
 ### 3. ⚠️ Missing DLL dependencies
 The build reported the following DLLs as not found (all are optional dependencies):
 ```
 WARNING: Library not found: could not resolve 'msmpi.dll'
 WARNING: Library not found: could not resolve 'impi.dll'
 WARNING: Library not found: could not resolve 'ze_loader.dll'
 WARNING: Library not found: could not resolve 'pgc.dll'
 WARNING: Library not found: could not resolve 'pgmath.dll'
 WARNING: Library not found: could not resolve 'pgf90.dll'
 WARNING: Library not found: could not resolve 'sycl6.dll'
 ```
 **Impact**: These are optional dependencies of high-performance computing libraries such as MKL and Intel MPI; basic functionality is unaffected
 ---
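 ## ✅ Issues Fixed
 1. ✅ Fixed all invalid escape sequences (added the `r` prefix to use raw strings)
 2. ✅ Fixed the hard-coded path in box_plot.py
 3. ✅ The spec file now lists all required hidden imports
 ---
 ## 🧪 Testing Recommendations
 ### 1. Basic startup test
 Run the test script:
 ```powershell
 cd E:\code\WQ\fengzhuang
 python test_exe.py
 ```
 ### 2. Manual test
 Run the executable directly:
 ```powershell
 E:\code\WQ\fengzhuang\dist\water_quality_gui.exe
 ```
 Check the following:
 - [ ] The GUI window displays correctly
 - [ ] Data file loading
 - [ ] Image processing
 - [ ] Model prediction
 - [ ] Result export
 ### 3. Dependency test
 If the program reports a missing module at runtime:
 1. Review the warnings in `build/water_quality_gui/warn-water_quality_gui.txt`
 2. Add the missing module to `hidden_imports` in the spec file
 3. Rebuild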
 ---
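 ## 🔧 Rebuild Steps
 After fixing the issues, rebuild the executable:
 ```powershell
 # 1. Activate the conda environment
 conda activate insect
 # 2. Rebuild, clearing the PyInstaller cache and old build files first
 pyinstaller --clean E:\code\WQ\fengzhuang\scripts\water_quality_gui.spec
 # 3. Test the executable
 python E:\code\WQ\fengzhuang\test_exe.py
 ```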
 ---
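 ## 📊 Build Statistics
 | Item | Value |
 |------|------|
 | Executable size | 2.57 GB |
 | Build time | ~7 minutes |
 | Python version | 3.12.7 |
 | PyInstaller version | 6.17.0 |
 | Platform | Windows 10 |
 | Bundled packages | ~200+ |
 ---
 ## 🎯 Next Steps
 1. ✅ **Done**: Fix all syntax warnings
 2. 🔄 **Recommended**: Rebuild the executable to apply the fixes
 3. 🧪 **Required**: Run the test script to verify the executable
 4. 📝 **Optional**: If runtime errors occur, check the logs and add the missing modules
 ---
 ## 📞 Troubleshooting
 ### Q1: The program crashes immediately after launch
 **A**: Check for module import errors; see `warn-water_quality_gui.txt`
 ### Q2: Data files cannot be found
 **A**: Make sure the `data/icons` and `data/sub` directories sit next to the executable
 ### Q3: GDAL-related errors
 **A**: Make sure the GDAL DLLs are bundled correctly, and check the environment variables
 ### Q4: The program runs slowly
 **A**: This is expected on first launch, when temporary files have to be unpacked
 ---
 ## 📝 Change Log
 | Date | Change | Author |
 |------|---------|--------|
 | 2025-12-02 | Fixed invalid escape sequence warnings | AI Assistant |
 | 2025-12-02 | Created the test script | AI Assistant |
 | 2025-12-02 | Created the analysis report | AI Assistant |
 ---
 **Report generated**: 2025-12-02  
 **PyInstaller version**: 6.17.0  
 **Python version**: 3.12.7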
--- a/软件说明.md
+++ b/软件说明.md
--- a/软件说明2.md
+++ b/软件说明2.md
@ -1,228 +0,0 @@
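 I'll help you restructure the heading levels of the software manual. Let me first analyze the current heading structure, then adjust the levels sensibly.
 [1 tool called]
 Let me first analyze the heading structure of the current software manual, then restructure it.
 Judging from the document, the current heading levels are:
 - Level 1: `# Heading`
 - Level 2: `## Heading`  
 - Level 3: `### Heading`
 - Level 4: `#### Heading`
 Some levels are not clear enough, however. For example, the module walkthrough uses headings such as `## 4.1 Step 1: Water Mask Generation`, while the content calls for a clearer hierarchy.
 I will restructure the headings to make them clearer and more consistent:
 [1 tool called]
 I'll help you restructure the heading levels of the software manual. Let me first analyze the current structure of the document, then adjust the levels sensibly.
 Judging from the document you provided, the current heading levels are somewhat inconsistent. Here is the restructured hierarchy: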
 ```startLine:1:软件说明.md
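 # Water Quality Parameter Inversion and Analysis System - Software Manual
 ## 1. Software Overview
 ### 1.1 Introduction
 The Water Quality Parameter Inversion and Analysis System is a professional water quality monitoring tool built on remote sensing image processing and machine learning. It integrates the complete workflow of water body identification, sun glint handling, spectral extraction, model training, and predictive analysis.
 ### 1.2 Main Features
 - Automatic water mask generation
 - Sun glint detection and removal  
 - Hyperspectral data preprocessing
 - Spectral extraction for training samples
 - Water index calculation
 - Machine learning model training
 - Sampling point generation and parameter prediction
 - Water quality distribution map visualization
 ### 1.3 Technical Highlights
 - Multiple glint removal algorithms in one tool
 - Adaptive sampling strategy
 - Support for several machine learning models
 - Non-empirical statistical regression analysis
 - Custom regression modeling
 - High-quality visual output
 ## 2. System Requirements
 ### 2.1 Hardware
 - CPU: Intel Core i5 or equivalent, or better
 - Memory: 8 GB RAM (16 GB recommended)
 - Storage: at least 10 GB of free space
 - Graphics: OpenGL 3.0 or later
 ### 2.2 Software
 - Operating system: Windows 10/11, Linux, macOS
 - Python version: 3.12+
 - Required libraries: GDAL, NumPy, Pandas, Scikit-learn, PyQt5, and others
 ## 3. Installation and Configuration
 ### 3.1 Environment Setup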
 ```bash
 # Create a virtual environment
 python -m venv water_quality_env
 source water_quality_env/bin/activate  # Linux/macOS
 water_quality_env\Scripts\activate     # Windows
 # Install dependencies
 pip install -r requirements.txt
 ```
 ### 3.2 Launching the Software
 ```bash
 python water_quality_gui.py
 ```
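 ## 4. Module Reference
 ### 4.1 Step 1: Water Mask Generation
 #### 4.1.1 Overview
 Step 1 generates the water mask file that later steps use to restrict processing to water areas. Two generation modes are supported:
 1. **Use an existing mask file** - use an existing Shapefile or raster file directly
 2. **Generate automatically with NDWI** - extract water automatically by thresholding NDWI (Normalized Difference Water Index)
 #### 4.1.2 Supported Input Formats
 ##### Mask file formats:
 - **Shapefile (.shp)** - vector format; a reference image is required for rasterization
 - **Raster file (.dat, .tif)** - used directly, no rasterization needed
 ##### Reference image formats:
 - **ENVI format (.bsq, .dat)** - supports multi-band hyperspectral data
 - **GeoTIFF (.tif)** - standard raster format
 #### 4.1.3 Parameters
 ##### Existing mask file mode:
 - **Mask file path** - select a water mask file in .shp or .dat format
 - **Reference image path** - required when a .shp file is used; needed for rasterization
 ##### NDWI automatic mode:
 - **Reference image path** - the multi-band image used to compute NDWI
 - **NDWI threshold** - default 0.4, range 0.0-1.0; controls the sensitivity of water extraction
  - Lower threshold: extracts more water (may include non-water areas)
  - Higher threshold: extracts less water (may miss some water areas)
 #### 4.1.4 Implementation
 ##### Rasterization (for .shp files):
 ```python
 def rasterize_shp(shp_filepath, raster_fn_out, img_path, NoData_value=None):
    # Read the geometry of the reference image
    # Rasterize the vector file at the same resolution as the reference image
    # Write the result as an ENVI-format .dat file
    ...  # body omitted in the manual
 ```
 ##### NDWI calculation:
 ```python
 def ndwi(file_path, ndwi_threshold=0.4, output_path=None):
    # Compute NDWI from the green band (552.19 nm) and the NIR band (809.29 nm)
    # NDWI = (Green - NIR) / (Green + NIR)
    # Threshold the index to extract the water area
    ...  # body omitted in the manual
 ```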
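To make the NDWI formula and the default threshold of 0.4 concrete, here is a minimal NumPy sketch. It is not the project's `ndwi` function: the helper name `ndwi_mask`, the division-by-zero handling, and the toy reflectance values are all assumptions made for illustration.

```python
import numpy as np


def ndwi_mask(green, nir, threshold=0.4):
    """Return (ndwi, mask); mask is True where a pixel is classified as water."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = green + nir
    # Guard against division by zero: such pixels are assigned NDWI = 0.
    ndwi = np.zeros_like(denom)
    np.divide(green - nir, denom, out=ndwi, where=denom != 0)
    return ndwi, ndwi > threshold


# Toy 2x2 reflectance values, made up for illustration.
green = np.array([[0.30, 0.10], [0.08, 0.0]])
nir = np.array([[0.05, 0.12], [0.02, 0.0]])
ndwi_values, water = ndwi_mask(green, nir)
print(np.round(ndwi_values, 3))
print(water)
```

Lowering `threshold` marks more pixels as water, which matches the sensitivity trade-off described in the parameter section.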
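 #### 4.1.5 Output
 - **Output mask file** - always an ENVI-format .dat file
 - **File name** - defaults to "water_mask.dat"; the output path can be customized
 #### 4.1.6 Usage Recommendations
 1. **Prefer an existing mask file** - when a precise water boundary .shp file is available
 2. **NDWI suits simple scenes** - use it when no precise water boundary exists
 3. **Tune the NDWI threshold** - adjust it to the characteristics of the water body:
   - Clear water: 0.3-0.4
   - Turbid water: 0.4-0.6
   - Special cases may require experiments to find the best threshold
 #### 4.1.7 Error Handling
 - An error is reported when a file does not exist
 - A .shp file requires an accompanying reference image
 - The output directory must be writable
 This step is the foundation of the whole water quality inversion workflow; a correct water mask is essential for every later step.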
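 ### 4.2 Step 2: Glint Area Detection
 #### 4.2.1 Overview
 Step 2 detects glint areas in the image and generates a glint mask file. Glint is the overly bright area produced when the water surface reflects sunlight, and it degrades the accuracy of water quality parameter inversion. This step offers several detection algorithms so that a suitable method can be chosen for each scene.
 #### 4.2.2 Supported Input Formats
 ##### Required input:
 - **Image file** - multi-band hyperspectral image (.bsq, .dat, .tif)
 - **Water mask** - the water mask file generated in Step 1 (optional, for standalone runs)
 ##### Optional input:
 - **Water mask file** - restricts the detection area and improves detection accuracy
 #### 4.2.3 Detection Methods
 ##### 1. Otsu thresholding (default)
 - **Principle**: determines the best threshold automatically by maximizing between-class variance
 - **Characteristics**: adapts to different images automatically; no manual threshold needed
 - **Use case**: general-purpose glint detection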
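The Otsu principle above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the project's code: the function name `otsu_threshold`, the 256-bin histogram, and the synthetic pixel values are all hypothetical.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the bin edge that maximizes between-class variance."""
    hist, edges = np.histogram(np.asarray(values).ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)              # pixel count at or below each bin
    w1 = w0[-1] - w0                  # pixel count above each bin
    s0 = np.cumsum(hist * centers)    # running sum of intensities
    m0 = s0 / np.maximum(w0, 1)       # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    best = int(np.argmax(between))
    # Use the upper edge of the best bin so the whole bin stays in the lower class.
    return edges[best + 1]


# Synthetic data: 900 dark water pixels and 100 bright glint pixels.
pixels = np.concatenate([np.full(900, 0.1), np.full(100, 0.9)])
t = otsu_threshold(pixels)
glint = pixels > t
print(t, int(glint.sum()))
```

Because the threshold is derived from the histogram itself, no manual value has to be supplied, which is the property the manual highlights for this method.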
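 ##### 2. Z-score statistical method
 - **Principle**: identifies abnormally bright pixels based on the standard deviation
 - **Parameter**: Z-score threshold (default 2.5)
 - **Characteristics**: insensitive to the overall data distribution; suited to normally distributed data
 - **Use case**: data with a relatively uniform distribution
 ##### 3. Percentile threshold method
 - **Principle**: uses a specified percentile as the threshold
 - **Parameter**: percentile (default 95%)
 - **Characteristics**: more robust to outliers
 - **Use case**: data containing extreme outliers
 ##### 4. IQR outlier detection
 - **Principle**: identifies outliers based on the interquartile range
 - **Parameter**: IQR multiplier (default 1.5)
 - **Characteristics**: works well on skewed distributions
 - **Use case**: unevenly distributed data
 ##### 5. Adaptive threshold method
 - **Principle**: local adaptive threshold segmentation
 - **Parameter**: window size (default 15)
 - **Characteristics**: adapts to local brightness changes
 - **Use case**: images with uneven illumination
 ##### 6. Multi-band fusion method
 - **Principle**: fuses the detection results from several bands
 - **Parameters**: list of band wavelengths, weights, sub-method
 - **Characteristics**: combines information from multiple bands for more accurate detection
 - **Use case**: detection of complex glint patterns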
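The three statistical methods (Z-score, percentile, IQR) differ only in how the brightness threshold is derived. The sketch below compares them on synthetic data using the default parameters listed above. The function names and pixel values are assumptions for illustration and are not taken from the project.

```python
import numpy as np


def zscore_mask(x, z_threshold=2.5):
    """Flag values more than z_threshold standard deviations above the mean."""
    x = np.asarray(x, dtype=np.float64)
    std = x.std()
    if std == 0:
        return np.zeros(x.shape, dtype=bool)
    return (x - x.mean()) / std > z_threshold


def percentile_mask(x, percentile=95.0):
    """Flag values above the given percentile of the data."""
    x = np.asarray(x, dtype=np.float64)
    return x > np.percentile(x, percentile)


def iqr_mask(x, iqr_multiplier=1.5):
    """Flag values above Q3 + multiplier * IQR."""
    x = np.asarray(x, dtype=np.float64)
    q1, q3 = np.percentile(x, [25, 75])
    return x > q3 + iqr_multiplier * (q3 - q1)


# Synthetic band: 99 water pixels near 0.1 plus one saturated glint pixel.
pixels = np.concatenate([np.linspace(0.08, 0.12, 99), [0.95]])
counts = {
    "zscore": int(zscore_mask(pixels).sum()),
    "percentile": int(percentile_mask(pixels).sum()),
    "iqr": int(iqr_mask(pixels).sum()),
}
print(counts)
```

On this toy data the percentile rule flags more pixels than the other two, because by construction it always marks roughly a fixed share of the data as glint regardless of how bright those pixels are.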
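 #### 4.2.4 Parameters
 ##### Core parameters:
 - **Glint detection wavelength** - default 750 nm; the band used to extract severe glint areas
 - **Detection method** - six methods to choose from
 - **Maximum connected-component area** - filters out small-area noise, default 50 pixels
 - **Shore buffer size** - avoids false detections along the shore, default 10 pixels
 ##### Method-specific parameters:
 - **Z-score threshold** - threshold for the Z-score method (2.0-3.0)
 - **Percentile** - threshold for the percentile method (90-99)
 - **IQR multiplier** - multiplier for the IQR method (1.0-3.0)
 - **Window size** - window size for the adaptive method (5-30)
 #### 4.2.5 Implementation
 ```python
 def find_severe_glint_area(img_path, water_mask_path=None, glint_wave=750.0, 
                           method='otsu', z_threshold=2.5, percentile=95.0, 
                           iqr_multiplier=1.5, window_size=15, max_area=50, 
                           buffer_size=10):
    # Read the image and the water mask
    # Run glint detection with the selected method
    # Post-process: area filtering, shore buffering
    # Write the glint mask file
    ...  # body omitted in the manual
 ```
 #### 4.2.6 Output
 - **Glint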
--- a/降采样光谱.py
+++ b/降采样光谱.py
@ -1,121 +0,0 @@
 import numpy as np
 import spectral
 import spectral.io.envi as envi
 def downsample_bands_extract(data, factor=3, offset=0):
    """Decimate bands: keep band (offset + 1) out of every `factor` bands."""
    rows, cols, bands = data.shape
    new_bands = bands // factor
    indices = [offset + i * factor for i in range(new_bands)]
    # Bounds guard: drop indices that fall past the last band
    indices = [idx for idx in indices if idx < bands]
    return data[:, :, indices].astype(np.float32)
 def process_bsq_chunked(input_path, output_path, scale=10000, factor=3, offset=0, chunk_lines=500):
    """
    Process the image in row chunks, decimate the bands, and write a valid BSQ file.
    BSQ layout: all rows of one band are stored contiguously, band after band.
    """
    img = spectral.open_image(input_path)
    hdr = img.metadata.copy()
    n_rows, n_cols, n_bands = img.nrows, img.ncols, img.nbands
    new_bands = n_bands // factor   # number of bands after decimation
    new_hdr = hdr.copy()
    new_hdr['samples'] = n_cols
    new_hdr['lines'] = n_rows
    new_hdr['bands'] = new_bands
    new_hdr['data type'] = 12  # ENVI code for unsigned 16-bit integers
    new_hdr['interleave'] = 'bsq'
    # Decimate the wavelengths (prefer the values in the original header)
    if 'wavelength' in hdr:
        waves = np.array(hdr['wavelength'])
        if len(waves) >= new_bands * factor:
            extracted_waves = waves[offset::factor][:new_bands]
            new_hdr['wavelength'] = extracted_waves.tolist()
        else:
            print("Warning: not enough wavelengths in the original header, skipping")
    # Decimate FWHM the same way
    if 'fwhm' in hdr:
        fwhm = np.array(hdr['fwhm'])
        if len(fwhm) >= new_bands * factor:
            new_hdr['fwhm'] = fwhm[offset::factor][:new_bands].tolist()
    out_file = output_path + '.bsq'
    # Key change: BSQ requires all rows of each band to be stored contiguously.
    # Large files do not fit in memory, so each band is staged in its own
    # temp file while reading row chunks, then the bands are merged in order.
    print(f"Processing {n_rows} rows x {n_cols} cols x {new_bands} bands...")
    # One temp file per band (avoids running out of memory)
    import tempfile
    import os
    temp_files = []
    temp_band_data = []
    try:
        # Create one temp file per output band
        for b in range(new_bands):
            fd, temp_path = tempfile.mkstemp(suffix=f'_band_{b}.tmp')
            os.close(fd)
            temp_files.append(temp_path)
            temp_band_data.append(open(temp_path, 'wb'))
        # Pass 1: read row chunks and append each band to its temp file
        for start_row in range(0, n_rows, chunk_lines):
            end_row = min(start_row + chunk_lines, n_rows)
            print(f"Processing rows {start_row}-{end_row-1}...")
            chunk = img.read_subregion((start_row, end_row), (0, n_cols))
            chunk_down = downsample_bands_extract(chunk, factor, offset)
            chunk_scaled = chunk_down * scale
            chunk_uint16 = np.clip(chunk_scaled, 0, 65535).astype(np.uint16)
            # Append each band's rows to the matching temp file
            for b in range(new_bands):
                band_data = chunk_uint16[:, :, b].tobytes()
                temp_band_data[b].write(band_data)
            del chunk, chunk_down, chunk_scaled, chunk_uint16
        # Close all temp files before merging
        for f in temp_band_data:
            f.close()
        # Pass 2: concatenate in BSQ order (all of band 0, then band 1, ...)
        print("Merging into BSQ layout...")
        with open(out_file, 'wb') as fout:
            for b in range(new_bands):
                with open(temp_files[b], 'rb') as fband:
                    # Copy all rows of this band into the final file
                    while True:
                        data = fband.read(1024 * 1024)  # read in 1 MB blocks
                        if not data:
                            break
                        fout.write(data)
                print(f"  Band {b+1}/{new_bands} written")
        print(f"BSQ file written: {out_file}")
    finally:
        # Close any handle still open (e.g. after an error), then delete the temp files
        for f in temp_band_data:
            f.close()
        for temp_path in temp_files:
            try:
                os.remove(temp_path)
            except OSError:
                pass
    # Write the header file
    envi.write_envi_header(output_path + '.hdr', new_hdr)
    print(f"Done! Output: {out_file}")
    print(f"  Size: {n_rows} rows x {n_cols} cols x {new_bands} bands")
 if __name__ == '__main__':
    input_base = r"D:\BaiduNetdiskDownload\yaobao\caijain.hdr"
    output_base = r"D:\BaiduNetdiskDownload\yaobao\test"
    process_bsq_chunked(input_base, output_base, scale=10000, factor=3, offset=0, chunk_lines=2000)
@ -1,4 +0,0 @@


 new_wavelengths = [np.mean(wavelengths[i:i+3]) for i in range(0, len(wavelengths), 3)]
 print(new_wavelengths)
@ -1 +0,0 @@
 <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="37.07" height="36" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 198"><path fill="#41B883" d="M204.8 0H256L128 220.8L0 0h97.92L128 51.2L157.44 0h47.36Z"></path><path fill="#41B883" d="m0 0l128 220.8L256 0h-51.2L128 132.48L50.56 0H0Z"></path><path fill="#35495E" d="M50.56 0L128 133.12L204.8 0h-47.36L128 51.2L97.92 0H50.56Z"></path></svg>