Compare commits: feature/ag...main (12 commits)
| Author | SHA1 | Date |
|---|---|---|
| | 1b4c07fec6 | |
| | 87c48a74a5 | |
| | 85afabcd94 | |
| | 33065f6c09 | |
| | 039a3e1bdc | |
| | 4250a7b221 | |
| | e572a26e6f | |
| | 1cb1b42395 | |
| | 30c6532f23 | |
| | badcce5d2d | |
| | 9e10bf11cf | |
| | d3f5c82d98 | |
84
README.md
@ -28,7 +28,7 @@ pam_deploy_graph/
|
|||||||
params_loader.py # 读取 JSON 或 config.txt 风格参数文件
|
params_loader.py # 读取 JSON 或 config.txt 风格参数文件
|
||||||
llm/ # LLM structured output 接口、真实 HTTP client、提示词、规则 fallback 和 guardrails
|
llm/ # LLM structured output 接口、真实 HTTP client、提示词、规则 fallback 和 guardrails
|
||||||
graph.py # LangGraph StateGraph 集成入口
|
graph.py # LangGraph StateGraph 集成入口
|
||||||
langgraph_runtime.py # LangGraph interrupt runner for the chat manual-confirmation point
|
langgraph_runtime.py # action-level LangGraph runner
|
||||||
mcp_client.py # MCP stdio/HTTP/SSE client、鉴权 token 和配置读取
|
mcp_client.py # MCP stdio/HTTP/SSE client、鉴权 token 和配置读取
|
||||||
interactive.py # 常驻式 CLI 对话框,会话命令、确认和续跑
|
interactive.py # 常驻式 CLI 对话框,会话命令、确认和续跑
|
||||||
cli.py # CLI 入口
|
cli.py # CLI 入口
|
||||||
@ -64,8 +64,8 @@ packaging/
|
|||||||
- 实现 `config.txt.example` 风格和 JSON 风格参数读取。
|
- 实现 `config.txt.example` 风格和 JSON 风格参数读取。
|
||||||
- 实现 fake 全局流程和完整部署流程,便于不触碰真实环境地验证 Agent 路由。
|
- 实现 fake 全局流程和完整部署流程,便于不触碰真实环境地验证 Agent 路由。
|
||||||
- 实现逐 IP 处理骨架:升级、轮询、启动、校验、日志下载。
|
- 实现逐 IP 处理骨架:升级、轮询、启动、校验、日志下载。
|
||||||
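- Implements a pending-rollback-confirmation state after a single-IP failure; rollback is never executed automatically.
|
- Implements pause-on-failure: when an action fails or is blocked by review, the workflow pauses and keeps the current action; once the problem is fixed, `resume` retries from that action.
|
||||||
- Implements the manual confirmation entry point: `confirm --decision approve|reject` only handles a pending rollback confirmation.
|
- Rollback has been split out of the main workflow into an explicit `rollback` command in chat/CLI; the old `confirm` entry point is kept only for compatibility.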
|
||||||
- 实现 checkpoint 自动保存和 `resume` 续跑:全局步骤、成功 IP、单 IP 已完成 action 会跳过。
|
- 实现 checkpoint 自动保存和 `resume` 续跑:全局步骤、成功 IP、单 IP 已完成 action 会跳过。
|
||||||
- 实现 LLM structured output 骨架:意图识别、参数抽取、部署计划生成。
|
- 实现 LLM structured output 骨架:意图识别、参数抽取、部署计划生成。
|
||||||
- 实现 OpenAI-compatible 真实 LLM client,支持 `base_url` / `model` 配置,`api_key` 可为空。
|
- 实现 OpenAI-compatible 真实 LLM client,支持 `base_url` / `model` 配置,`api_key` 可为空。
|
||||||
@ -73,22 +73,27 @@ packaging/
|
|||||||
- 增加规则 fallback `RuleBasedLlmClient`,用于本地开发和测试。
|
- 增加规则 fallback `RuleBasedLlmClient`,用于本地开发和测试。
|
||||||
- 增加 LLM 输出 guardrails,禁止计划中出现可执行脚本命令和非法 action。
|
- 增加 LLM 输出 guardrails,禁止计划中出现可执行脚本命令和非法 action。
|
||||||
- 引入 `langgraph` 依赖,CLI/chat 执行流程统一通过 action 级 LangGraph runtime 调度。
|
- 引入 `langgraph` 依赖,CLI/chat 执行流程统一通过 action 级 LangGraph runtime 调度。
|
||||||
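- The chat/CLI manual-confirmation point is wired to the LangGraph interrupt/checkpointer: the run pauses at a pending rollback confirmation, and `approve/reject` resumes it through `Command(resume=...)`.
|
- CLI/chat execution is scheduled uniformly through the action-level LangGraph runtime; the failure-pause state is written to the business checkpoint, and `resume` re-enters the graph and continues from the breakpoint.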
|
||||||
- 引入 MCP client adapter,可包装 SDK session、普通 callable、stdio server、HTTP/SSE server,并提供 JSON client 配置读取。
|
- 引入 MCP client adapter,可包装 SDK session、普通 callable、stdio server、HTTP/SSE server,并提供 JSON client 配置读取。
|
||||||
- CLI/chat 支持 `--mcp-config` 直接加载 MCP server URL、鉴权和可选 tool 覆盖配置。
|
- CLI/chat 支持 `--mcp-config` 直接加载 MCP server URL、鉴权和可选 tool 覆盖配置。
|
||||||
- 本地已安装 `langgraph` 和 `mcp`,并完成 LangGraph fake 全局流程 smoke。
|
- 本地已安装 `langgraph` 和 `mcp`,并完成 LangGraph fake 全局流程 smoke。
|
||||||
- CLI `analyze` 输出已做敏感字段脱敏。
|
- CLI `analyze` 输出已做敏感字段脱敏。
|
||||||
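- Adds `chat`, a resident CLI dialog that supports natural-language analysis, parameter setting, execution confirmation, rollback confirmation, status view, event view, checkpoint selection, and resume.
|
- Adds `chat`, a resident CLI dialog that supports natural-language analysis, parameter setting, execution confirmation, explicit rollback, status view, event view, checkpoint selection, and resume.
|
||||||
- chat can optionally enable `rich` / `prompt_toolkit` in the development environment; the PyInstaller-packaged build uses plain text input by default to avoid interaction compatibility problems.
|
- chat prefers `rich` / `prompt_toolkit` in both the development environment and the default release package; if the enhanced input fails to initialize, it falls back automatically to plain `input()`.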
|
||||||
- chat 执行前会归一化参数并展示实际写入脚本配置的值;`script_only` / `hybrid_node_mcp` 会提前检查 `ZIP_FILE_PATH` 是否存在。
|
- chat 执行前会归一化参数并展示实际写入脚本配置的值;`script_only` / `hybrid_node_mcp` 会提前检查 `ZIP_FILE_PATH` 是否存在。
|
||||||
- chat 执行中会播报每个 action 的开始、完成或失败;action 执行失败会停在当前 checkpoint,不再误报 LangGraph 不可用。
|
- chat 执行中会播报每个 action 的开始、完成或失败;action 执行失败会停在当前 checkpoint,不再误报 LangGraph 不可用。
|
||||||
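- After each action completes, an LLM/rule review runs once; if the review recommends stopping, the workflow pauses, gives its recommendation, and waits for the user to `resume`.
|
- After each action completes, an LLM/rule review runs once; the action is recorded as completed only if the review passes. If the review recommends stopping, the workflow pauses and waits for the user to `resume`, which retries the current action.
|
||||||
|
- `poll-download-progress` and `poll-upgrade-progress` now perform a single progress query; the workflow calls them repeatedly according to configuration, hands each query result to the LLM/rule review to decide whether the task is complete, and announces progress through chat.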
|
||||||
- `--analyze-actions` 和 `llm action-analysis on` 改为只控制是否把详细审核结果写入 `events`,不再控制审核是否执行。
|
- `--analyze-actions` 和 `llm action-analysis on` 改为只控制是否把详细审核结果写入 `events`,不再控制审核是否执行。
|
||||||
- chat 会播报 action 审核开始、审核完成和审核失败,避免黑盒执行。
|
- chat 会播报 action 审核开始、审核完成和审核失败,避免黑盒执行。
|
||||||
- chat 支持执行中按 `Ctrl+C` 中断,保存 checkpoint 后再 `resume`。
|
- chat 支持执行中按 `Ctrl+C` 中断,保存 checkpoint 后再 `resume`。
|
||||||
|
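- chat supports plain LLM conversation, log-tail analysis, and single-action execution: `ask <question>`, `log analyze <path>`, `action propose <request>`, `action run ...`.
|
||||||
|
- Plain chat conversation prefers OpenAI-compatible streaming output; if the server does not support streaming, it falls back automatically to a regular request. `<think>...</think>` reasoning content is filtered out: it is neither displayed nor written to the run log.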
|
||||||
- chat 支持 `set KEY=VALUE` 和 `load params <路径>` 热更新当前运行参数,并同步回写运行中的 `config.txt` 与 checkpoint。
|
- chat 支持 `set KEY=VALUE` 和 `load params <路径>` 热更新当前运行参数,并同步回写运行中的 `config.txt` 与 checkpoint。
|
||||||
- 支持通过 `--llm-action-analysis-prompt-file`、`PAM_LLM_ACTION_ANALYSIS_PROMPT_FILE` 或 chat 内 `llm config action_analysis_prompt_file=...` 自定义 action 审核提示词。
|
- 支持通过 `--llm-action-analysis-prompt-file`、`PAM_LLM_ACTION_ANALYSIS_PROMPT_FILE` 或 chat 内 `llm config action_analysis_prompt_file=...` 自定义 action 审核提示词。
|
||||||
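- Adds basic tests; the current local result is `57 passed, 2 skipped`.
|
- Adds a unified run log, written by default to `logs/pam_deploy_agent.log`. It covers the key flows (CLI/chat, LLM calls, action routing, script/MCP calls, LangGraph, checkpoints), rotates daily, and keeps 14 rotated files by default.
|
||||||
|
- chat supports `llm test [text]`, which makes one lightweight call with the current LLM client to confirm that the real LLM or the rule fallback loaded correctly.
|
||||||
|
- Adds basic tests; the current local result is `83 passed, 3 skipped`.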
|
||||||
|
|
||||||
未完成:
|
未完成:
|
||||||
|
|
||||||
@ -131,7 +136,13 @@ python -m pam_deploy_graph.cli analyze \
|
|||||||
|
|
||||||
仓库内已提供 [prompts/action_review.txt](/e:/AIcoding/agent_deply/prompts/action_review.txt) 作为“当前默认 action 审核提示词”的落地副本,后续自定义时可以先复制它再改,便于和内置默认行为对照。
|
仓库内已提供 [prompts/action_review.txt](/e:/AIcoding/agent_deply/prompts/action_review.txt) 作为“当前默认 action 审核提示词”的落地副本,后续自定义时可以先复制它再改,便于和内置默认行为对照。
|
||||||
|
|
||||||
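The real LLM call lives in `pam_deploy_graph/llm/openai_compatible.py`, and the prompts are in `pam_deploy_graph/llm/prompts.py`. The `base_params` sent to the LLM are redacted, and `CLIENT_SECRET` never enters the prompt; guardrails validation still runs locally after the plan is generated.
|
The real LLM call lives in `pam_deploy_graph/llm/openai_compatible.py`, and the prompts are in `pam_deploy_graph/llm/prompts.py`. The `base_params` sent to the LLM are redacted, and `CLIENT_SECRET` never enters the prompt; guardrails validation still runs locally after the plan is generated. Plain chat conversation prefers `/chat/completions` streaming and falls back automatically to a non-streaming request when the server does not support it. Plain conversation and log analysis filter out `<think>...</think>`, unclosed `<think>` tags, and internal reasoning content.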
|
||||||
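The `<think>` filtering described above can be sketched as follows. This is a minimal illustration that assumes the full response text is already available; `strip_think` and both regular expressions are hypothetical names, not the project's actual code, and filtering a live stream would need extra buffering that is not shown here.

```python
import re

# Hypothetical helpers: closed blocks are removed first, then any
# unclosed <think> tail that runs to the end of the text.
_THINK_CLOSED = re.compile(r"<think>.*?</think>", re.DOTALL)
_THINK_UNCLOSED = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Return text without <think> reasoning content."""
    text = _THINK_CLOSED.sub("", text)
    text = _THINK_UNCLOSED.sub("", text)
    return text.strip()
```

Removing closed blocks before the unclosed tail keeps any visible text that sits between two reasoning blocks.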
|
|
||||||
|
Inside chat, you can run one lightweight test with the current client to confirm that the real LLM or the rule fallback loaded correctly:
|
||||||
|
|
||||||
|
```text
PAM> llm test Please return a connectivity test result
```
|
||||||
|
|
||||||
如果服务需要鉴权,再补充:
|
如果服务需要鉴权,再补充:
|
||||||
|
|
||||||
@ -280,17 +291,49 @@ PAM> run
|
|||||||
PAM> status
|
PAM> status
|
||||||
PAM> params
|
PAM> params
|
||||||
PAM> events 5
|
PAM> events 5
|
||||||
|
PAM> ask 这个 agent 能做什么
|
||||||
|
PAM> log analyze logs/pam_deploy_agent.log 请帮我看最近异常 --tail 400
|
||||||
|
PAM> action propose 请单独执行 verify-ip 192.168.1.10
|
||||||
|
PAM> action run verify-ip ip=192.168.1.10
|
||||||
|
PAM> action run llm 请单独执行 get-online-ips
|
||||||
|
PAM> llm test
|
||||||
PAM> llm action-analysis on
|
PAM> llm action-analysis on
|
||||||
PAM> llm config action_analysis_prompt_file=prompts/action_review.txt
|
PAM> llm config action_analysis_prompt_file=prompts/action_review.txt
|
||||||
PAM> mcp config mcp_client.example.json
|
PAM> mcp config mcp_client.example.json
|
||||||
PAM> list checkpoints
|
PAM> list checkpoints
|
||||||
PAM> load checkpoint runtime/checkpoints/chat-demo.json
|
PAM> load checkpoint runtime/checkpoints/chat-demo.json
|
||||||
PAM> approve
|
PAM> rollback
|
||||||
PAM> resume
|
PAM> resume
|
||||||
PAM> exit
|
PAM> exit
|
||||||
```
|
```
|
||||||
|
|
||||||
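By default `chat` still requires an explicit `run` inside the session, and executes actions only after the parameters, the target IP range, and the final execution have been confirmed. Greetings such as `你好` or `hello` do not trigger LLM/structured analysis; to analyze a deployment request, describe the deployment task directly or use `analyze <request>` explicitly. After each action completes, an LLM/rule review runs automatically and its start and end are announced; if the review recommends stopping or the review itself fails, the workflow pauses, prints the recommendation, and waits for the user to decide whether to `resume`. `--analyze-actions` only controls whether detailed review results are written to `events`. Press `Ctrl+C` during execution to interrupt; chat saves the current checkpoint and marks the workflow as `user_interrupted`. `set KEY=VALUE` and `load params <path>` sync the update to the current run state, `config.txt`, and the checkpoint. `chat` also supports `--llm-base-url` / `--llm-api-key` / `--llm-model` / `--llm-action-analysis-prompt-file`, `--mcp-config`, and `--analyze-actions`.
|
By default `chat` hands any input that is not a built-in command to the current LLM as plain conversation and does not trigger the deployment workflow automatically; plain conversation is streamed when possible, and `<think>...</think>` reasoning content is filtered out. For structured analysis of a deployment request, use `analyze <request>` explicitly; a full deployment still requires `run`, and actions execute only after the parameters, the target IP range, and the final execution have been confirmed. `log analyze <path> [question] [--tail N] [--max-bytes N]` by default reads only the tail of the log and redacts it before handing it to the LLM. `action propose <request>` only asks the LLM to parse a single-action plan and does not execute it. `action run <action> [ip=...] [KEY=VALUE...]` or `action run llm <request>` shows the action, backend, ip, risk, and parameters, and reuses the existing ActionRouter to run the single action only after the user enters `yes`. After each workflow action and each single action completes, an LLM/rule review runs automatically and its start and end are announced. The review input contains only the current action's structured result and the necessary diagnostic logs; the full runtime `state_summary` is not handed to the model, so that cross-step state does not interfere with its judgment. An action is recorded as completed only when the workflow review passes; if the review recommends stopping or the review itself fails, the workflow pauses, prints the recommendation, and waits for the user to decide whether to `resume` and retry the current action. `create-download-task` supports an optional `PARENT_VERSION_NUMBER`: when non-empty, it is passed to the cloud download API as `parentVersionNumber` to specify which version's rules to inherit; the default empty value is not sent, and the rules of the version currently in use apply. `poll-download-progress` and `poll-upgrade-progress` query progress only once per call; the workflow calls them repeatedly according to `POLL_INTERVAL_SEC`, `DOWNLOAD_POLL_MAX_ATTEMPTS`, and `UPGRADE_POLL_MAX_ATTEMPTS`, and after each return lets the LLM/rule review decide whether the task is complete and announces progress; it does not move on to the next action while the task is incomplete. `verify-ip` is the health check after the application starts; on failure the workflow retries every `VERIFY_INTERVAL_SEC`, up to `VERIFY_MAX_ATTEMPTS` times (by default about every 10 seconds, at most 12 times), and pauses only if the check still fails. A failed per-IP action also pauses the workflow; after the external environment is fixed, `resume` retries from the current action. If a rollback is really needed, run it explicitly with `rollback [IP]`. `llm test [text]` tests whether the current LLM client is usable. `--analyze-actions` only controls whether detailed review results are written to `events`. Press `Ctrl+C` during execution to interrupt; chat saves the current checkpoint and marks the workflow as `user_interrupted`. `set KEY=VALUE` and `load params <path>` sync the update to the current run state, `config.txt`, and the checkpoint. `chat` also supports `--llm-base-url` / `--llm-api-key` / `--llm-model` / `--llm-action-analysis-prompt-file`, `--mcp-config`, and `--analyze-actions`.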
|
||||||
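The split between a single-shot poll action and a workflow-owned loop can be sketched as follows. This is only an illustration under stated assumptions: `poll_until_complete`, `query_once`, and `is_complete` are hypothetical names, and the real workflow also saves checkpoints and announces progress, which is omitted here.

```python
import time
from typing import Callable, Mapping, Tuple


def poll_until_complete(
    query_once: Callable[[], Mapping[str, str]],
    is_complete: Callable[[Mapping[str, str]], bool],
    max_attempts: int,
    interval_sec: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Call a single-shot progress query until the review says it is complete.

    Returns (completed, attempts_used). The caller stays on the current
    action and pauses the workflow when completed is False.
    """
    for attempt in range(1, max_attempts + 1):
        result = query_once()       # one progress query, e.g. poll-download-progress
        if is_complete(result):     # stands in for the LLM/rule review decision
            return True, attempt
        if attempt < max_attempts:  # no wait after the final attempt
            sleep(interval_sec)
    return False, max_attempts
```

With the defaults quoted in this README, a download poll would use `max_attempts=60` and `interval_sec=2`, and `verify-ip` would use `max_attempts=12` and `interval_sec=10`.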
|
|
||||||
|
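Cloud download parameters:

- `PARENT_VERSION_NUMBER`: optional. When a cloud download task is created, it maps to the API parameter `parentVersionNumber`; the default empty value is not sent, which means the rules of the version currently in use are inherited.

Retry and progress-query parameters:

- `POLL_INTERVAL_SEC`: seconds to wait between two progress queries, default `2`.
- `DOWNLOAD_POLL_MAX_ATTEMPTS`: maximum number of cloud download progress queries, default `60`.
- `UPGRADE_POLL_MAX_ATTEMPTS`: maximum number of single-IP push progress queries, default `600`.
- `VERIFY_INTERVAL_SEC`: seconds between retries after a failed `verify-ip` health check, default `10`.
- `VERIFY_MAX_ATTEMPTS`: maximum number of `verify-ip` health check attempts, default `12`.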
|
||||||
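As a rough illustration of how these defaults could be applied, the sketch below resolves the five parameters from an already-parsed parameter mapping. `DEFAULTS` and `resolve_retry_params` are hypothetical names, and treating an empty value as "use the default" is an assumption, not confirmed behavior of `params_loader.py`.

```python
from typing import Dict, Mapping

# Defaults as listed in this README.
DEFAULTS: Dict[str, int] = {
    "POLL_INTERVAL_SEC": 2,
    "DOWNLOAD_POLL_MAX_ATTEMPTS": 60,
    "UPGRADE_POLL_MAX_ATTEMPTS": 600,
    "VERIFY_INTERVAL_SEC": 10,
    "VERIFY_MAX_ATTEMPTS": 12,
}


def resolve_retry_params(params: Mapping[str, object]) -> Dict[str, int]:
    """Return integer retry settings, falling back to defaults for missing or empty values."""
    resolved: Dict[str, int] = {}
    for key, default in DEFAULTS.items():
        raw = params.get(key)
        text = "" if raw is None else str(raw).strip()
        resolved[key] = int(text) if text else default
    return resolved
```

Under the defaults, the worst-case wait is roughly 60 × 2 = 120 seconds for a download, 600 × 2 = 1200 seconds for a single-IP push, and 12 × 10 = 120 seconds for `verify-ip`, not counting the time each query itself takes.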
|
|
||||||
|
## Logging
|
||||||
|
|
||||||
|
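By default the Agent writes its run log to `logs/pam_deploy_agent.log`. The log covers the key flows: CLI/chat input, LLM request and response summaries, action routing, script/MCP calls, LangGraph nodes, checkpoint saves, and pause/resume. The log rotates automatically on the first write after 00:00 local time each day; rotated files are named like `pam_deploy_agent.log.YYYY-MM-DD`, and 14 rotated files are kept by default. Fields such as `CLIENT_SECRET`, `MCP_CLIENT_SECRET`, token, Authorization, api_key, and password are redacted recursively, and long text is truncated. `<think>` content from plain chat conversation and log analysis is filtered out first, so the raw reasoning is never recorded.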
|
||||||
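Recursive redaction of this kind can be sketched as follows. The marker list, the `***` placeholder, and the `max_len` cutoff are assumptions chosen for illustration; the project's actual redaction rules may differ.

```python
from typing import Any

# Hypothetical marker list: a key is treated as sensitive if it contains any of these.
SENSITIVE_MARKERS = ("secret", "token", "authorization", "api_key", "password")


def redact(value: Any, max_len: int = 200) -> Any:
    """Return a copy of value with sensitive fields masked and long strings truncated."""
    if isinstance(value, dict):
        return {
            key: "***"
            if any(marker in str(key).lower() for marker in SENSITIVE_MARKERS)
            else redact(item, max_len)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, max_len) for item in value]
    if isinstance(value, str) and len(value) > max_len:
        return value[:max_len] + "...(truncated)"
    return value
```

Matching on the key name rather than the value means `CLIENT_SECRET` and `MCP_CLIENT_SECRET` are both masked by the single `secret` marker.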
|
|
||||||
|
The log location, level, and retention policy can be adjusted through environment variables:
|
||||||
|
|
||||||
|
```bash
export PAM_AGENT_LOG_FILE=logs/pam_deploy_agent.log
export PAM_AGENT_LOG_LEVEL=INFO
export PAM_AGENT_LOG_RETENTION_DAYS=14
```
|
||||||
|
|
||||||
|
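When debugging LLM or MCP calls, you can temporarily set `PAM_AGENT_LOG_LEVEL` to `DEBUG`. `PAM_AGENT_LOG_RETENTION_DAYS` is the number of rotated daily files to keep; when set to `0`, rotated files are not cleaned up automatically. It is still recommended to keep the log directory in a controlled location.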
|
||||||
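The behavior described here matches what the standard library's `TimedRotatingFileHandler` provides, so a handler could be built roughly as below. This is a sketch under that assumption: the source does not say which handler the Agent actually uses, and `build_log_handler` is a hypothetical name.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler
from typing import Mapping


def build_log_handler(env: Mapping[str, str] = os.environ) -> TimedRotatingFileHandler:
    """Build a daily-rotating file handler from the PAM_AGENT_LOG_* variables."""
    path = env.get("PAM_AGENT_LOG_FILE", "logs/pam_deploy_agent.log")
    retention = int(env.get("PAM_AGENT_LOG_RETENTION_DAYS", "14"))
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    # when="midnight" rotates on the first write after local midnight and names
    # rotated files <path>.YYYY-MM-DD; backupCount=0 keeps every rotated file.
    handler = TimedRotatingFileHandler(
        path, when="midnight", backupCount=retention, encoding="utf-8", delay=True
    )
    handler.setLevel(env.get("PAM_AGENT_LOG_LEVEL", "INFO").upper())
    return handler
```

`delay=True` postpones opening the file until the first record is written, which avoids creating an empty log file when nothing is logged.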
|
|
||||||
预演:
|
预演:
|
||||||
|
|
||||||
@ -310,18 +353,23 @@ fake 完整部署流程验证:
|
|||||||
python -m pam_deploy_graph.cli run-deploy --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json --confirm
|
python -m pam_deploy_graph.cli run-deploy --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
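If an IP fails and enters the pending-rollback-confirmation state, first check `confirmation` in the output, then decide manually:
|
If an action fails or is blocked by review, the workflow saves a checkpoint and pauses; after the external environment is fixed, you can resume directly, and the Agent retries from the current action: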
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python -m pam_deploy_graph.cli confirm --checkpoint runtime/checkpoints/demo.json --decision approve --confirm
|
python -m pam_deploy_graph.cli resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
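`confirm` handles the confirmation through a LangGraph interrupt resume and continues with the remaining graph nodes once confirmed; if the workflow was previously in the `paused` state, `resume` first clears the pause flag and then continues from the checkpoint.
|
To roll back the failed IP, run rollback explicitly. When `--ip` is not passed, the currently failed IP is used; after it finishes, use `resume` to continue the main workflow.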
|
||||||
|
|
||||||
Reject the rollback:
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python -m pam_deploy_graph.cli confirm --checkpoint runtime/checkpoints/demo.json --decision reject --note "人工决定暂不回滚" --confirm
|
python -m pam_deploy_graph.cli rollback --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
python -m pam_deploy_graph.cli resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
```
|
||||||
|
|
||||||
|
You can also specify the IP and the stop-first policy:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python -m pam_deploy_graph.cli rollback --checkpoint runtime/checkpoints/demo.json --ip 192.168.1.10 --stop-first --note "人工决定回滚该 IP" --confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
checkpoint 用于断点续跑,会保存完整运行状态和参数。为了支持真实续跑,Agent 写入 checkpoint 时不会脱敏参数;请把 checkpoint 放在受控目录中。如果不传 `--checkpoint`,流程仍可运行,但不能跨进程 `resume`。
|
checkpoint 用于断点续跑,会保存完整运行状态和参数。为了支持真实续跑,Agent 写入 checkpoint 时不会脱敏参数;请把 checkpoint 放在受控目录中。如果不传 `--checkpoint`,流程仍可运行,但不能跨进程 `resume`。
|
||||||
@ -342,5 +390,5 @@ pytest -q
|
|||||||
|
|
||||||
1. 接入真实 PAM_NODE MCP session,并用 `SessionMcpToolClient` 包装。
|
1. 接入真实 PAM_NODE MCP session,并用 `SessionMcpToolClient` 包装。
|
||||||
2. 在测试环境中做 smoke:HOME 脚本 `get-token/get-node-url` + NODE MCP `get-online-ips`。
|
2. 在测试环境中做 smoke:HOME 脚本 `get-token/get-node-url` + NODE MCP `get-online-ips`。
|
||||||
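3. In the test environment, verify the failure, rollback-confirmation, and resume path of real script actions.
|
3. In the test environment, verify the failure-retry, explicit-rollback, and resume path of real script actions.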
|
||||||
4. 继续细化参数确认、IP 范围确认的交互式 UI 或上层编排。
|
4. 继续细化参数确认、IP 范围确认的交互式 UI 或上层编排。
|
||||||
|
|||||||
@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
name: pam-auto-deply
|
name: pam-auto-deply
|
||||||
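description: An intelligent deployment Skill for PAM HOME/NODE. The Skill is responsible for understanding the user's request, collecting and confirming parameters, choosing the execution mode, orchestrating the main workflow, controlling rollback confirmation, and producing the final summary; the existing deploy.sh / deploy.ps1 provide the action capabilities for creating versions, uploading, publishing, node discovery, cloud download, upgrade, start/stop, verification, log download, and manual rollback. Automatically generating or modifying scripts is forbidden, and the scripts' main flow must not be used for deployment.
|
description: An intelligent deployment Skill for PAM HOME/NODE. The Skill is responsible for understanding the user's request, collecting and confirming parameters, choosing the execution mode, orchestrating the main workflow, controlling progress queries, and producing the final summary; the existing deploy.sh / deploy.ps1 provide the action capabilities for creating versions, uploading, publishing, node discovery, cloud download, upgrade, start/stop, verification, log download, and manual rollback. Automatically generating or modifying scripts is forbidden, and the scripts' main flow must not be used for deployment.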
|
||||||
---
|
---
|
||||||
|
|
||||||
# PAM_AUTO_DEPLY Skill
|
# PAM_AUTO_DEPLY Skill
|
||||||
@ -22,7 +22,7 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- 禁止自动生成、重建、覆盖或修改 `deploy.sh`、`deploy.ps1`、`deploy.bat`、`test_deploy.sh`、`test_deploy.ps1`、`test_deploy.bat`。
|
- 禁止自动生成、重建、覆盖或修改 `deploy.sh`、`deploy.ps1`、`deploy.bat`、`test_deploy.sh`、`test_deploy.ps1`、`test_deploy.bat`。
|
||||||
- 在任何真实调用前,必须先向用户展示归一化后的参数并得到确认。
|
- 在任何真实调用前,必须先向用户展示归一化后的参数并得到确认。
|
||||||
- 在真实部署执行过程中,必须持续向用户展示当前阶段、下一步动作和阶段结果,禁止长时间静默执行。
|
- 在真实部署执行过程中,必须持续向用户展示当前阶段、下一步动作和阶段结果,禁止长时间静默执行。
|
||||||
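- Rollback must never run automatically. The script may only output `PENDING_AGENT_CONFIRMATION(...)`, and the Agent must confirm with the user first.
|
- Rollback must never run automatically; after a failure the main workflow only pauses at the current action. When a rollback is needed, the user must explicitly enter `rollback [IP]` or call the `rollback-ip` action directly.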
|
||||||
|
|
||||||
## 2. 执行模式选择
|
## 2. 执行模式选择
|
||||||
|
|
||||||
@ -68,6 +68,12 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
| `actionType` | `ACTION_TYPE` | 否 | 升级类型,默认 `FULL` |
|
| `actionType` | `ACTION_TYPE` | 否 | 升级类型,默认 `FULL` |
|
||||||
| `timeOut` | `TIMEOUT` | 否 | 接口级超时参数,默认 `120` |
|
| `timeOut` | `TIMEOUT` | 否 | 接口级超时参数,默认 `120` |
|
||||||
| `logName` | `LOG_NAME` | 否 | 日志文件名,默认 `app.log` |
|
| `logName` | `LOG_NAME` | 否 | 日志文件名,默认 `app.log` |
|
||||||
|
| `parentVersionNumber` | `PARENT_VERSION_NUMBER` | No | Which version's rules to inherit for the cloud download; empty by default and not sent, in which case the rules of the version currently in use are inherited |
| `pollIntervalSec` | `POLL_INTERVAL_SEC` | No | Interval between two progress queries, default `2` seconds |
| `downloadPollMaxAttempts` | `DOWNLOAD_POLL_MAX_ATTEMPTS` | No | Maximum number of cloud download progress queries, default `60` |
| `upgradePollMaxAttempts` | `UPGRADE_POLL_MAX_ATTEMPTS` | No | Maximum number of single-IP push progress queries, default `600` |
| `verifyIntervalSec` | `VERIFY_INTERVAL_SEC` | No | Retry interval after a failed `verify-ip` health check, default `10` seconds |
| `verifyMaxAttempts` | `VERIFY_MAX_ATTEMPTS` | No | Maximum number of `verify-ip` health check attempts, default `12` |
|
||||||
|
|
||||||
### 3.2 运行控制参数
|
### 3.2 运行控制参数
|
||||||
|
|
||||||
@ -77,13 +83,12 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- `showUsageOnly`: 是否只说明现有脚本用法而不执行
|
- `showUsageOnly`: 是否只说明现有脚本用法而不执行
|
||||||
- `userSpecifiedIps`: 用户指定的目标 IP 子集
|
- `userSpecifiedIps`: 用户指定的目标 IP 子集
|
||||||
- `allOrNothing`: 是否要求全有或全无
|
- `allOrNothing`: 是否要求全有或全无
|
||||||
- `rollbackApproved`: whether the user has confirmed the rollback
|
- `rollbackApproved`: whether the user has explicitly requested that the rollback be executed
|
||||||
- `osTarget`: 目标脚本入口环境
|
- `osTarget`: 目标脚本入口环境
|
||||||
- `checkpointPath`: 检查点文件路径
|
- `checkpointPath`: 检查点文件路径
|
||||||
- `resumeFromCheckpoint`: 是否按已有检查点断点续试
|
- `resumeFromCheckpoint`: 是否按已有检查点断点续试
|
||||||
- `traceFilePath`: 当前部署统一复用的接口跟踪日志文件路径
|
- `traceFilePath`: 当前部署统一复用的接口跟踪日志文件路径
|
||||||
- `stepIntervalSec`: 全局 action 与 action 之间的执行间隔
|
- `stepIntervalSec`: 全局 action 与 action 之间的执行间隔
|
||||||
- `firstPollDelaySec`: 创建下载任务后,到首次轮询下载进度前的等待间隔
|
|
||||||
- `perIpStepIntervalSec`: 同一台 IP 内部步骤之间的执行间隔
|
- `perIpStepIntervalSec`: 同一台 IP 内部步骤之间的执行间隔
|
||||||
- `perIpIntervalSec`: 一台 IP 完成后到下一台 IP 开始前的间隔
|
- `perIpIntervalSec`: 一台 IP 完成后到下一台 IP 开始前的间隔
|
||||||
- `failurePauseSec`: 某步骤失败后进入下一分支前的等待间隔
|
- `failurePauseSec`: 某步骤失败后进入下一分支前的等待间隔
|
||||||
@ -91,7 +96,6 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
推荐默认值:
|
推荐默认值:
|
||||||
|
|
||||||
- `stepIntervalSec = 2`
|
- `stepIntervalSec = 2`
|
||||||
- `firstPollDelaySec = 2`
|
|
||||||
- `perIpStepIntervalSec = 1`
|
- `perIpStepIntervalSec = 1`
|
||||||
- `perIpIntervalSec = 3`
|
- `perIpIntervalSec = 3`
|
||||||
- `failurePauseSec = 0`
|
- `failurePauseSec = 0`
|
||||||
@ -111,6 +115,7 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- `actionType`
|
- `actionType`
|
||||||
- `timeOut`
|
- `timeOut`
|
||||||
- `logName`
|
- `logName`
|
||||||
|
- `parentVersionNumber`(可选;空值表示不传)
|
||||||
- 用户指定 IP 子集(如有)
|
- 用户指定 IP 子集(如有)
|
||||||
|
|
||||||
确认规则:
|
确认规则:
|
||||||
@ -135,6 +140,7 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- ACTION_TYPE: FULL
|
- ACTION_TYPE: FULL
|
||||||
- TIMEOUT: 120
|
- TIMEOUT: 120
|
||||||
- LOG_NAME: app.log
|
- LOG_NAME: app.log
|
||||||
|
- PARENT_VERSION_NUMBER: -
|
||||||
- 指定IP: 192.168.1.10, 192.168.1.11
|
- 指定IP: 192.168.1.10, 192.168.1.11
|
||||||
- CLIENT_ID: 已提供
|
- CLIENT_ID: 已提供
|
||||||
- CLIENT_SECRET: 已提供
|
- CLIENT_SECRET: 已提供
|
||||||
@ -160,6 +166,12 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- `ACTION_TYPE`
|
- `ACTION_TYPE`
|
||||||
- `TIMEOUT`
|
- `TIMEOUT`
|
||||||
- `LOG_NAME`
|
- `LOG_NAME`
|
||||||
|
- `PARENT_VERSION_NUMBER`
|
||||||
|
- `POLL_INTERVAL_SEC`
|
||||||
|
- `DOWNLOAD_POLL_MAX_ATTEMPTS`
|
||||||
|
- `UPGRADE_POLL_MAX_ATTEMPTS`
|
||||||
|
- `VERIFY_INTERVAL_SEC`
|
||||||
|
- `VERIFY_MAX_ATTEMPTS`
|
||||||
- 命令行只传 action 级控制参数:
|
- 命令行只传 action 级控制参数:
|
||||||
- `--action` / `-Action`
|
- `--action` / `-Action`
|
||||||
- `--ip` / `-Ip`
|
- `--ip` / `-Ip`
|
||||||
@ -168,7 +180,9 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- 不要把整套业务参数直接拼接到命令行。
|
- 不要把整套业务参数直接拼接到命令行。
|
||||||
- `client_secret` 等敏感字段不得通过命令行透传。
|
- `client_secret` 等敏感字段不得通过命令行透传。
|
||||||
- 如果用户明确要求“不落地配置文件”,则本 Skill 不执行真实部署,只说明限制和原因。
|
- 如果用户明确要求“不落地配置文件”,则本 Skill 不执行真实部署,只说明限制和原因。
|
||||||
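- `traceFilePath` and the interval control parameters are not written to `config.txt`; the Agent holds and applies them at runtime.
|
- `traceFilePath` is not written to `config.txt`; the Agent holds and applies it at runtime.
|
||||||
|
- `PARENT_VERSION_NUMBER` is written to `config.txt` but may be empty by default; only when it is non-empty does `create-download-task` send it as the cloud download API parameter `parentVersionNumber`.
|
||||||
|
- The progress-query and health-check retry parameters are written to `config.txt` and are read by both the Agent workflow and the script debugging flow.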
|
||||||
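The "send only when non-empty" rule for `PARENT_VERSION_NUMBER` can be sketched as follows. `build_download_payload` and the `versionNumber` field in the usage line are hypothetical; only the `parentVersionNumber` parameter name comes from this document.

```python
from typing import Any, Dict, Mapping


def build_download_payload(base: Mapping[str, Any], params: Mapping[str, Any]) -> Dict[str, Any]:
    """Copy the base payload and add parentVersionNumber only when it is configured."""
    payload = dict(base)
    raw = params.get("PARENT_VERSION_NUMBER")
    parent = "" if raw is None else str(raw).strip()
    if parent:
        # A non-empty value selects which version's rules to inherit.
        payload["parentVersionNumber"] = parent
    # An empty value is omitted entirely, so the server applies its default.
    return payload
```

For example, `build_download_payload({"versionNumber": "1.2.0"}, {"PARENT_VERSION_NUMBER": ""})` returns the base payload unchanged, with no `parentVersionNumber` key at all.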
|
|
||||||
## 4. 主流程(硬约束)
|
## 4. 主流程(硬约束)
|
||||||
|
|
||||||
@ -193,39 +207,42 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
11. 调用 `get-node-url`。
|
11. 调用 `get-node-url`。
|
||||||
12. 调用 `get-online-ips`。
|
12. 调用 `get-online-ips`。
|
||||||
13. 若用户指定了目标 IP,则基于在线 IP 列表做过滤。
|
13. 若用户指定了目标 IP,则基于在线 IP 列表做过滤。
|
||||||
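14. Call `create-download-task`.
|
14. Call `create-download-task`; if `PARENT_VERSION_NUMBER` is non-empty, the cloud download API call carries `parentVersionNumber`, otherwise the parameter is not sent.
|
||||||
15. Call `poll-download-progress` until the download completes, fails, or times out.
|
15. Repeatedly call `poll-download-progress` as a single-shot progress query; hand each result to the LLM/rule review, until the download completes, fails, or the maximum number of queries is reached.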
|
||||||
16. 按在线 IP 或过滤后的目标 IP 列表逐台执行:
|
16. 按在线 IP 或过滤后的目标 IP 列表逐台执行:
|
||||||
- `upgrade-ip`
|
- `upgrade-ip`
|
||||||
- `poll-upgrade-progress`
|
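- Repeatedly call `poll-upgrade-progress` as a single-shot progress query; hand each result to the LLM/rule review, until the push completes, fails, or the maximum number of queries is reached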
|
||||||
- `start-ip`
|
- `start-ip`
|
||||||
- `verify-ip`
|
- Repeatedly call the `verify-ip` health check; on `SUCCESS=false`, wait `VERIFY_INTERVAL_SEC` and retry, until it succeeds or `VERIFY_MAX_ATTEMPTS` is reached
|
||||||
- `download-log`
|
- `download-log`
|
||||||
17. 汇总每台 IP 的结果。
|
17. 汇总每台 IP 的结果。
|
||||||
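18. If `PENDING_AGENT_CONFIRMATION(...)` appears, immediately stop all automatic follow-up actions and enter the rollback confirmation branch.
|
18. If an action fails, the LLM/rule review asks to stop, or a legacy `PENDING_AGENT_CONFIRMATION(...)` appears, pause at the current action and print the recommendation.
|
||||||
19. Output the final report.
|
19. Output the final report; when a rollback is needed, wait for the user to run `rollback [IP]` explicitly.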
|
||||||
|
|
||||||
主流程补充规则:
|
主流程补充规则:
|
||||||
|
|
||||||
1. 一次完整部署中的所有 action 调用,应复用同一个 `traceFilePath`,禁止每个 action 各自新建独立 trace 文件。
|
1. 一次完整部署中的所有 action 调用,应复用同一个 `traceFilePath`,禁止每个 action 各自新建独立 trace 文件。
|
||||||
2. 全局 action 与下一 action 之间,按 `stepIntervalSec` 等待。
|
2. 全局 action 与下一 action 之间,按 `stepIntervalSec` 等待。
|
||||||
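3. After `create-download-task` succeeds, wait `firstPollDelaySec` before the first `poll-download-progress`.
|
3. After `create-download-task` succeeds, go straight to `poll-download-progress`; while the download is incomplete, wait `POLL_INTERVAL_SEC` and query the current action again.
|
||||||
4. Within the same IP:
|
4. `PARENT_VERSION_NUMBER` affects only `create-download-task` / `download-cloud`; it must not be passed through to the push, start, verify, or log download actions.
|
||||||
|
5. Within the same IP: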
|
||||||
- `upgrade-ip -> poll-upgrade-progress`
|
- `upgrade-ip -> poll-upgrade-progress`
|
||||||
- `poll-upgrade-progress -> start-ip`
|
- `poll-upgrade-progress -> start-ip`
|
||||||
- `start-ip -> verify-ip`
|
- `start-ip -> verify-ip`
|
||||||
- `verify-ip -> download-log`
|
- `verify-ip -> download-log`
|
||||||
之间按 `perIpStepIntervalSec` 等待。
|
之间按 `perIpStepIntervalSec` 等待。
|
||||||
5. 当前一台 IP 处理完成后,到下一台 IP 开始前,按 `perIpIntervalSec` 等待。
|
6. 当前一台 IP 处理完成后,到下一台 IP 开始前,按 `perIpIntervalSec` 等待。
|
||||||
6. 若某步骤失败后需要进入提示、确认或分支流程,可按 `failurePauseSec` 等待。
|
7. 若某步骤失败后需要进入提示、确认或分支流程,可按 `failurePauseSec` 等待。
|
||||||
7. 若某个间隔值为 `0`,表示该层级不等待,直接进入下一动作。
|
8. 若某个间隔值为 `0`,表示该层级不等待,直接进入下一动作。
|
||||||
|
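9. The `poll-download-progress` and `poll-upgrade-progress` script actions perform only one progress query; in the formal workflow, the loop, checkpointing, LLM judgment, and progress announcements are the responsibility of the Agent Runtime.
|
||||||
|
10. When `verify-ip` fails but `VERIFY_MAX_ATTEMPTS` has not been reached, do not proceed to `download-log` and do not record the current action as completed; the formal workflow announces the health check progress, saves a checkpoint, and retries the current action after `VERIFY_INTERVAL_SEC`.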
|
||||||
|
|
||||||
### 4.2 主流程中的强制确认点
|
### 4.2 主流程中的强制确认点
|
||||||
|
|
||||||
以下节点必须等待用户确认,不能自动越过:
|
以下节点必须等待用户确认,不能自动越过:
|
||||||
|
|
||||||
1. 参数确认单确认前。
|
1. 参数确认单确认前。
|
||||||
2. When a rollback condition occurs.
|
2. Before executing `rollback [IP]` or `rollback-ip`.
|
||||||
3. 用户指定 IP 与在线 IP 过滤结果不一致,且会影响部署范围时。
|
3. 用户指定 IP 与在线 IP 过滤结果不一致,且会影响部署范围时。
|
||||||
4. 用户显式要求修改默认间隔策略时。
|
4. 用户显式要求修改默认间隔策略时。
|
||||||
|
|
||||||
@ -238,14 +255,15 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
3. 在每个全局步骤成功后,告知用户该步骤已完成,并说明关键结果。
|
3. 在每个全局步骤成功后,告知用户该步骤已完成,并说明关键结果。
|
||||||
4. 在每个全局步骤失败后,立即告知用户失败阶段、失败原因和后续处理。
|
4. 在每个全局步骤失败后,立即告知用户失败阶段、失败原因和后续处理。
|
||||||
5. 在逐台 IP 处理时,必须告知当前正在处理哪一台 IP。
|
5. 在逐台 IP 处理时,必须告知当前正在处理哪一台 IP。
|
||||||
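6. During cloud download progress polling, report the current progress continuously; do not wait silently for completion.
|
6. During cloud download and single-IP push progress queries, report the current progress after every `poll-*` return; do not wait silently for completion.
|
||||||
7. If execution takes a long time, report continuously by stage; do not wait until everything finishes to summarize once.
|
7. During the `verify-ip` health check, every failed check must report the current attempt number, the maximum number of attempts, and the returned message; do not wait silently for the application to start.
|
||||||
8. If the rollback confirmation state is entered, tell the user explicitly:
|
8. If execution takes a long time, report continuously by stage; do not wait until everything finishes to summarize once.
|
||||||
|
9. If a rollback is recommended after a failure, tell the user explicitly: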
|
||||||
- 哪一台 IP 失败
|
- 哪一台 IP 失败
|
||||||
- 失败阶段
|
- 失败阶段
|
||||||
- 建议是否回滚
|
- 建议是否回滚
|
||||||
- 是否需要 `stopFirst`
|
- 是否需要 `stopFirst`
|
||||||
9. 若当前处于 action 间隔等待中,也必须告诉用户等待时长和下一步动作。
|
10. 若当前处于 action 间隔等待中,也必须告诉用户等待时长和下一步动作。
|
||||||
|
|
||||||
建议的阶段播报格式:
|
建议的阶段播报格式:
|
||||||
|
|
||||||
@ -349,9 +367,10 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
4. 若部分 IP 已成功完成:
|
4. 若部分 IP 已成功完成:
|
||||||
- 默认跳过成功 IP
|
- 默认跳过成功 IP
|
||||||
- 只继续未完成或失败的 IP
|
- 只继续未完成或失败的 IP
|
||||||
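5. If `PENDING_AGENT_CONFIRMATION(...)` is present:
|
5. If a failure pause or a legacy `PENDING_AGENT_CONFIRMATION(...)` is present:
|
||||||
- The checkpoint must preserve this state
|
- The checkpoint must preserve the failed stage, the failure reason, and the review recommendation
|
||||||
- Do not continue with follow-up actions automatically before the user confirms
|
- After the fix, `resume` retries from the currently failed action by default
|
||||||
|
- When a rollback is needed, the user must run `rollback [IP]` explicitly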
|
||||||
6. 若用户要求“从头重新开始”:
|
6. 若用户要求“从头重新开始”:
|
||||||
- 先明确说明将忽略现有检查点
|
- 先明确说明将忽略现有检查点
|
||||||
- 再从第 1 步重新执行
|
- 再从第 1 步重新执行
|
||||||
@ -430,14 +449,14 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
| 12 | 获取在线 IP | `get-online-ips` | 返回 `COUNT>0` 且有 `IP=...` 行 | 停止并报告 `GET_ONLINE_IPS` 失败 |
|
| 12 | 获取在线 IP | `get-online-ips` | 返回 `COUNT>0` 且有 `IP=...` 行 | 停止并报告 `GET_ONLINE_IPS` 失败 |
|
||||||
| 13 | 过滤目标 IP | 按用户指定 IP 与在线 IP 交集过滤 | 过滤结果明确 | 过滤后为空时停止;范围变化需确认 |
|
| 13 | 过滤目标 IP | 按用户指定 IP 与在线 IP 交集过滤 | 过滤结果明确 | 过滤后为空时停止;范围变化需确认 |
|
||||||
| 14 | 创建云下载任务 | `create-download-task` | 返回 `RESULT=TASK_CREATED` | 停止并报告 `CREATE_DOWNLOAD_TASK` 失败 |
|
| 14 | 创建云下载任务 | `create-download-task` | 返回 `RESULT=TASK_CREATED` | 停止并报告 `CREATE_DOWNLOAD_TASK` 失败 |
|
||||||
| 15 | Poll download progress | `poll-download-progress` | `STEP=DONE` or `MSG=success` and `RATE_OF_PROGRESS=100` | Stop and report `POLL_DOWNLOAD_PROGRESS` failure or timeout |
|
| 15 | Query download progress | Repeatedly call single-shot `poll-download-progress` | LLM/rule review decides `progress_complete=true`; or `STEP=DONE` / `MSG=success` and `RATE_OF_PROGRESS=100` | Stop and report `POLL_DOWNLOAD_PROGRESS` failure or timeout |
|
||||||
| 16.1 | Create single-IP push task | `upgrade-ip --ip ...` | Returns `RESULT=TASK_CREATED` | Record the failure and mark `PENDING_AGENT_CONFIRMATION(stopFirst=false)` |
|
| 16.1 | Create single-IP push task | `upgrade-ip --ip ...` | Returns `RESULT=TASK_CREATED` | Pause at the current action and retry with `resume` after the fix; run rollback explicitly when a rollback is needed |
|
||||||
| 16.2 | Poll single-IP push progress | `poll-upgrade-progress --ip ...` | `STEP=DONE` or `FINISH=true` or `MSG=success` and `RATE_OF_PROGRESS=100` | Record the failure and mark `PENDING_AGENT_CONFIRMATION(stopFirst=false)` |
|
| 16.2 | Query single-IP push progress | Repeatedly call single-shot `poll-upgrade-progress --ip ...` | LLM/rule review decides `progress_complete=true`; or `STEP=DONE` / `FINISH=true` / `MSG=success` and `RATE_OF_PROGRESS=100` | Pause at the current action and retry with `resume` after the fix; run rollback explicitly when a rollback is needed |
|
||||||
| 16.3 | Start single IP | `start-ip --ip ...` | The action returns successfully | Record the failure and mark `PENDING_AGENT_CONFIRMATION(stopFirst=true)` |
|
| 16.3 | Start single IP | `start-ip --ip ...` | The action returns successfully | Pause at the current action and retry with `resume` after the fix; run rollback explicitly when a rollback is needed |
|
||||||
| 16.4 | Verify single IP | `verify-ip --ip ...` | Returns `SUCCESS=true` | Record the failure and mark `PENDING_AGENT_CONFIRMATION(stopFirst=true)` |
|
| 16.4 | Verify single IP | Repeatedly call single-shot `verify-ip --ip ...` | Returns `SUCCESS=true` | Retry every `VERIFY_INTERVAL_SEC`; pause at the current action only if the check still fails after `VERIFY_MAX_ATTEMPTS`; run rollback explicitly when a rollback is needed |
|
||||||
| 16.5 | 下载日志 | `download-log --ip ...` | 返回 `LOG_FILE=...` | 记录日志下载失败,但不覆盖原主失败原因 |
|
| 16.5 | 下载日志 | `download-log --ip ...` | 返回 `LOG_FILE=...` | 记录日志下载失败,但不覆盖原主失败原因 |
|
||||||
| 17 | 汇总结果 | 汇总每台 IP 的阶段、失败原因、回滚状态、日志路径 | 报告内容完整 | 若汇总失败,至少保留原始 action 输出 |
|
| 17 | 汇总结果 | 汇总每台 IP 的阶段、失败原因、回滚状态、日志路径 | 报告内容完整 | 若汇总失败,至少保留原始 action 输出 |
|
||||||
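| 18 | Rollback confirmation branch | Enter rollback confirmation when `PENDING_AGENT_CONFIRMATION(...)` is found | The user states clearly whether to roll back | Stop when unconfirmed; never roll back automatically |
|
| 18 | Failure pause or explicit rollback | After a failure, stay at the current action by default; run the rollback only after the user enters `rollback [IP]` | The user explicitly requests a rollback, or runs `resume` after the fix | Never roll back automatically unless explicitly requested |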
|
||||||
| 19 | 最终报告 | 输出最终报告 | 报告包含模式、入口、阶段结果、日志、回滚状态 | 不省略失败细节 |
|
| 19 | 最终报告 | 输出最终报告 | 报告包含模式、入口、阶段结果、日志、回滚状态 | 不省略失败细节 |
|
||||||
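The rule-based side of the completion criteria in rows 15 and 16.2 can be sketched as follows. The table does not spell out operator precedence, so this sketch assumes one reading: `STEP=DONE` or `FINISH=true` is sufficient on its own, and `MSG=success` counts only together with `RATE_OF_PROGRESS=100`. `progress_complete` here is a hypothetical function, not the project's reviewer.

```python
from typing import Mapping


def progress_complete(fields: Mapping[str, str]) -> bool:
    """Rule-based completion check over parsed key=value output (assumed precedence)."""
    if fields.get("STEP", "").strip().upper() == "DONE":
        return True
    if fields.get("FINISH", "").strip().lower() == "true":
        return True
    try:
        rate = float(fields.get("RATE_OF_PROGRESS", ""))
    except ValueError:
        return False  # missing or non-numeric progress cannot confirm completion
    return fields.get("MSG", "").strip().lower() == "success" and rate >= 100
```

A check like this only decides whether to keep polling; failure detection and the LLM review described in the table are separate concerns.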
|
|
||||||
## 5. 通用执行原则
|
## 5. 通用执行原则
|
||||||
@ -456,7 +475,7 @@ description: 面向 PAM HOME/NODE 的智能部署 Skill。由 Skill 负责理解
|
|||||||
- `[FLOW][FAIL]`
|
- `[FLOW][FAIL]`
|
||||||
10. 只允许调用脚本 `action` 入口,禁止调用脚本主流程。
|
10. 只允许调用脚本 `action` 入口,禁止调用脚本主流程。
|
||||||
11. 脚本 action 输出以 `key=value` 为主,Agent 应优先读取这些结果行。
|
11. 脚本 action 输出以 `key=value` 为主,Agent 应优先读取这些结果行。
|
||||||
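12. When a rollback is needed, the script only returns `PENDING_AGENT_CONFIRMATION(stopFirst=...)`, and the Agent must confirm first.
|
12. When a rollback is needed, the Agent may only present the risk and a recommendation; it must not roll back automatically and must wait for the user to run rollback explicitly.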
|
||||||
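Item 11 above says action output is mainly `key=value` lines. A minimal parser for that convention might look like the sketch below; `parse_action_output` is a hypothetical helper, and collecting repeated keys into lists is an assumption made so that multiple `IP=...` lines from `get-online-ips` are all kept.

```python
from typing import Dict, List


def parse_action_output(stdout: str) -> Dict[str, List[str]]:
    """Collect key=value result lines; other lines (banners, logs) are ignored."""
    parsed: Dict[str, List[str]] = {}
    for raw in stdout.splitlines():
        key, sep, value = raw.strip().partition("=")
        # Accept only simple keys such as COUNT, IP, or RATE_OF_PROGRESS.
        if not sep or not key.replace("_", "").isalnum():
            continue
        parsed.setdefault(key, []).append(value)
    return parsed
```

Splitting on the first `=` only means values that themselves contain `=` (for example a URL with a query string) are preserved intact.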
|
|
||||||
## 6. 脚本 action 能力
|
## 6. 脚本 action 能力
|
||||||
|
|
||||||
@ -485,15 +504,15 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action <ActionName> [-Ip
|
|||||||
| `get-node-url` | 获取目标 Node 地址 | 无 |
|
| `get-node-url` | 获取目标 Node 地址 | 无 |
|
||||||
| `get-online-ips` | 获取在线工作站 IP 列表 | 无 |
|
| `get-online-ips` | 获取在线工作站 IP 列表 | 无 |
|
||||||
| `create-download-task` | 创建云下载任务 | 无 |
|
| `create-download-task` | 创建云下载任务 | 无 |
|
||||||
| `poll-download-progress` | Poll download progress | None |
|
| `poll-download-progress` | Query download progress once; whether to keep querying is decided by the Agent workflow and the LLM/rule review | None |
|
||||||
| `download-cloud-to-node` | 创建下载任务并轮询至完成,仅调试使用,不得进入正式主流程 | 无 |
|
| `download-cloud-to-node` | 创建下载任务并轮询至完成,仅调试使用,不得进入正式主流程 | 无 |
|
||||||
| `upgrade-ip` | 为指定 IP 创建推送任务,固定使用 `timeOut=0` | `--ip` / `-Ip` |
|
| `upgrade-ip` | 为指定 IP 创建推送任务,固定使用 `timeOut=0` | `--ip` / `-Ip` |
|
||||||
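| `poll-upgrade-progress` | Poll the push progress of the specified IP | `--ip` / `-Ip` |
|
| `poll-upgrade-progress` | Query the push progress of the specified IP once; whether to keep querying is decided by the Agent workflow and the LLM/rule review | `--ip` / `-Ip` |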
|
||||||
| `start-ip` | 启动指定 IP 应用 | `--ip` / `-Ip` |
|
| `start-ip` | 启动指定 IP 应用 | `--ip` / `-Ip` |
|
||||||
| `stop-ip` | 停止指定 IP 应用 | `--ip` / `-Ip` |
|
| `stop-ip` | 停止指定 IP 应用 | `--ip` / `-Ip` |
|
||||||
| `verify-ip` | 校验指定 IP | `--ip` / `-Ip` |
|
| `verify-ip` | 校验指定 IP | `--ip` / `-Ip` |
|
||||||
| `download-log` | 下载指定 IP 日志压缩包,返回 zip 文件路径 | `--ip` / `-Ip` |
|
| `download-log` | 下载指定 IP 日志压缩包,返回 zip 文件路径 | `--ip` / `-Ip` |
|
||||||
| `rollback-ip` | Roll back the specified IP | `--ip` / `-Ip`, optionally with `--stop-first` / `-RollbackStopFirst` |
|
| `rollback-ip` | Roll back the specified IP; API parameters are sent in the URL query, not in a form body | `--ip` / `-Ip`, optionally with `--stop-first` / `-RollbackStopFirst` |
|
||||||
|
|
||||||
### 6.4 action 输出约定
|
### 6.4 action 输出约定
|
||||||
|
|
||||||
@ -559,9 +578,9 @@ Agent 读取时:
|
|||||||
- `create-download-task`
|
- `create-download-task`
|
||||||
- `upgrade-ip`
|
- `upgrade-ip`
|
||||||
|
|
||||||
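### 7.4 Manual rollback branch
|
### 7.4 Explicit rollback command
|
||||||
|
|
||||||
When the deployment result contains `PENDING_AGENT_CONFIRMATION(...)` and the user explicitly agrees to roll back:
|
When the user explicitly enters `rollback [IP]` or directly asks to roll back a specified IP: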
|
||||||
|
|
||||||
1. 再次向用户确认目标 IP 和 `stopFirst` 值。
|
1. 再次向用户确认目标 IP 和 `stopFirst` 值。
|
||||||
2. 调用 `rollback-ip` action。
|
2. 调用 `rollback-ip` action。
|
||||||
@ -613,19 +632,16 @@ Agent 读取时:
|
|||||||
|
|
||||||
### 8.3 回滚规则
|
### 8.3 回滚规则
|
||||||
|
|
||||||
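Rollback may run only after the Agent has confirmed with the user.
|
Rollback may run only after the user explicitly requests it.
|
||||||
|
|
||||||
There are three kinds of rollback status:
|
Rollback statuses include: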
|
||||||
|
|
||||||
- `ROLLBACK_NOT_RUN`
|
- `ROLLBACK_NOT_RUN`
|
||||||
- `PENDING_AGENT_CONFIRMATION(stopFirst=true|false)`
|
- `ROLLBACK_DONE`
|
||||||
- Results after an actual run:
|
|
||||||
- `ROLLBACK_SUCCESS`
|
|
||||||
- `ROLLBACK_FAILED`
|
- `ROLLBACK_FAILED`
|
||||||
- `ROLLBACK_REQUEST_FAILED`
|
- `REJECTED_BY_OPERATOR`
|
||||||
- `ROLLBACK_VERIFY_FAILED`
|
|
||||||
|
|
||||||
Default confirmation logic:
|
Default recommendations:
|
||||||
|
|
||||||
- 升级失败:建议回滚,`stopFirst=false`
|
- 升级失败:建议回滚,`stopFirst=false`
|
||||||
- 启动失败:建议回滚,`stopFirst=true`
|
- 启动失败:建议回滚,`stopFirst=true`
|
||||||
@ -674,7 +690,9 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action rollback-ip -Ip 1
|
|||||||
- 失败: 1
|
- 失败: 1
|
||||||
- 间隔控制:
|
- 间隔控制:
|
||||||
- stepIntervalSec: 2
|
- stepIntervalSec: 2
|
||||||
- firstPollDelaySec: 2
|
- pollIntervalSec: 2
|
||||||
|
- downloadPollMaxAttempts: 60
|
||||||
|
- upgradePollMaxAttempts: 600
|
||||||
- perIpStepIntervalSec: 1
|
- perIpStepIntervalSec: 1
|
||||||
- perIpIntervalSec: 3
|
- perIpIntervalSec: 3
|
||||||
- failurePauseSec: 0
|
- failurePauseSec: 0
|
||||||
@ -684,7 +702,7 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action rollback-ip -Ip 1
|
|||||||
| --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- |
|
||||||
| 192.168.1.10 | SUCCESS | - | - | logs/deploy_192.168.1.10.zip |
|
| 192.168.1.10 | SUCCESS | - | - | logs/deploy_192.168.1.10.zip |
|
||||||
| 192.168.1.11 | SUCCESS | - | - | logs/deploy_192.168.1.11.zip |
|
| 192.168.1.11 | SUCCESS | - | - | logs/deploy_192.168.1.11.zip |
|
||||||
| 192.168.1.12 | FAILED | VERIFY | PENDING_AGENT_CONFIRMATION(stopFirst=true) | logs/deploy_192.168.1.12.zip |
|
| 192.168.1.12 | FAILED | VERIFY | ROLLBACK_NOT_RUN | logs/deploy_192.168.1.12.zip |
|
||||||
```
|
```
|
||||||
|
|
||||||
A more complete final report template:
|
A more complete final report template:
|
||||||
@ -709,7 +727,7 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action rollback-ip -Ip 1
|
|||||||
| IP | Status | Failed stage | Failure reason | Rollback status | Log |
|
| IP | Status | Failed stage | Failure reason | Rollback status | Log |
|
||||||
| --- | --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- | --- |
|
||||||
| 192.168.1.10 | SUCCESS | - | - | - | logs/deploy_192.168.1.10.log |
|
| 192.168.1.10 | SUCCESS | - | - | - | logs/deploy_192.168.1.10.log |
|
||||||
| 192.168.1.12 | FAILED | VERIFY | Health check failed | PENDING_AGENT_CONFIRMATION(stopFirst=true) | logs/deploy_192.168.1.12.log |
|
| 192.168.1.12 | FAILED | VERIFY | Health check failed | ROLLBACK_NOT_RUN | logs/deploy_192.168.1.12.log |
|
||||||
|
|
||||||
## Checkpoint summary
|
## Checkpoint summary
|
||||||
|
|
||||||
@ -724,9 +742,10 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action rollback-ip -Ip 1
|
|||||||
- get-online-ips
|
- get-online-ips
|
||||||
- create-download-task
|
- create-download-task
|
||||||
|
|
||||||
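## Items awaiting confirmation
|
## Follow-up recommendations
|
||||||
|
|
||||||
- Whether to roll back 192.168.1.12
|
- 192.168.1.12 stopped at verify-ip; after the fix, use resume to retry the current action
|
||||||
|
- If a rollback is confirmed to be necessary, run rollback 192.168.1.12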
|
||||||
```
|
```
|
||||||
|
|
||||||
## 10. Agent execution recommendations
|
## 10. Agent execution recommendations
|
||||||
@ -740,7 +759,7 @@ powershell -File .\deploy.ps1 -ConfigPath .\config.txt -Action rollback-ip -Ip 1
|
|||||||
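- Rollback requires confirmation
|
- Rollback requires confirmation
|
||||||
4. Do not trigger any real deployment action before the parameters are confirmed.
|
4. Do not trigger any real deployment action before the parameters are confirmed.
|
||||||
5. When the user asks only to "generate a script without executing it", state the limitation directly instead of producing a script file, because this Skill forbids automatically generating or modifying scripts.
|
5. When the user asks only to "generate a script without executing it", state the limitation directly instead of producing a script file, because this Skill forbids automatically generating or modifying scripts.
|
||||||
6. If `PENDING_AGENT_CONFIRMATION(...)` appears in the action output, immediately stop any further automatic actions and request confirmation.
|
6. If the legacy `PENDING_AGENT_CONFIRMATION(...)` marker appears in the action output, immediately pause the current workflow and output a recommendation; if a rollback is needed, wait for the user to run rollback explicitly.
|
||||||
7. If a checkpoint exists, first evaluate whether the run can be retried from the breakpoint instead of restarting from scratch by default.
|
7. If a checkpoint exists, first evaluate whether the run can be retried from the breakpoint instead of restarting from scratch by default.
|
||||||
8. Proactively report progress during any long-running stage, especially:
|
8. Proactively report progress during any long-running stage, especially: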
|
||||||
- `create-download-task`
|
- `create-download-task`
|
||||||
|
|||||||
@ -9,3 +9,9 @@ ZIP_FILE_PATH=C:\path\to\pam-2.0.5.zip
|
|||||||
ACTION_TYPE=FULL
|
ACTION_TYPE=FULL
|
||||||
TIMEOUT=120
|
TIMEOUT=120
|
||||||
LOG_NAME=app.log
|
LOG_NAME=app.log
|
||||||
|
PARENT_VERSION_NUMBER=
|
||||||
|
POLL_INTERVAL_SEC=2
|
||||||
|
DOWNLOAD_POLL_MAX_ATTEMPTS=60
|
||||||
|
UPGRADE_POLL_MAX_ATTEMPTS=600
|
||||||
|
VERIFY_INTERVAL_SEC=10
|
||||||
|
VERIFY_MAX_ATTEMPTS=12
|
||||||
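The new polling keys can be read with validated defaults. This is a minimal sketch of that idea, assuming a plain `KEY=VALUE` file with `#` comment lines; `load_poll_settings` and `POLL_DEFAULTS` are hypothetical names, and the real `params_loader.py` may behave differently. The defaults mirror the values added to the scripts in this change.

```python
# Defaults taken from the values this change adds to the deploy scripts.
POLL_DEFAULTS = {
    "POLL_INTERVAL_SEC": 2,
    "DOWNLOAD_POLL_MAX_ATTEMPTS": 60,
    "UPGRADE_POLL_MAX_ATTEMPTS": 600,
    "VERIFY_INTERVAL_SEC": 10,
    "VERIFY_MAX_ATTEMPTS": 12,
}


def load_poll_settings(text: str) -> dict:
    """Parse KEY=VALUE lines, keeping the default when a value is missing or invalid."""
    settings = dict(POLL_DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key not in settings:
            continue  # keys unrelated to polling are ignored in this sketch
        try:
            number = int(value)
        except ValueError:
            continue  # non-numeric value: keep the default
        # Attempt counts must be at least 1; intervals may be 0.
        minimum = 1 if key.endswith("MAX_ATTEMPTS") else 0
        if number >= minimum:
            settings[key] = number
    return settings
```

Falling back to the default on a non-numeric value keeps a malformed config from producing a zero interval or a zero-attempt loop.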
|
|||||||
@ -23,6 +23,8 @@ Notes:
|
|||||||
- deploy.bat is only a wrapper for this script.
|
- deploy.bat is only a wrapper for this script.
|
||||||
- The wrapper avoids cmd.exe delayed-expansion issues with CLIENT_SECRET values
|
- The wrapper avoids cmd.exe delayed-expansion issues with CLIENT_SECRET values
|
||||||
containing exclamation marks.
|
containing exclamation marks.
|
||||||
|
- poll-download-progress and poll-upgrade-progress only query progress once.
|
||||||
|
The Agent workflow repeats them and asks LLM/rules to judge completion.
|
||||||
'@ | Write-Host
|
'@ | Write-Host
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -366,6 +368,12 @@ function Get-PamConfig {
|
|||||||
'ACTION_TYPE' { $config[$key] = $value }
|
'ACTION_TYPE' { $config[$key] = $value }
|
||||||
'TIMEOUT' { $config[$key] = $value }
|
'TIMEOUT' { $config[$key] = $value }
|
||||||
'LOG_NAME' { $config[$key] = $value }
|
'LOG_NAME' { $config[$key] = $value }
|
||||||
|
'PARENT_VERSION_NUMBER' { $config[$key] = $value }
|
||||||
|
'POLL_INTERVAL_SEC' { $config[$key] = $value }
|
||||||
|
'DOWNLOAD_POLL_MAX_ATTEMPTS' { $config[$key] = $value }
|
||||||
|
'UPGRADE_POLL_MAX_ATTEMPTS' { $config[$key] = $value }
|
||||||
|
'VERIFY_INTERVAL_SEC' { $config[$key] = $value }
|
||||||
|
'VERIFY_MAX_ATTEMPTS' { $config[$key] = $value }
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
} else {
|
} else {
|
||||||
@ -384,6 +392,12 @@ function Get-PamConfig {
|
|||||||
ACTION_TYPE = 'FULL'
|
ACTION_TYPE = 'FULL'
|
||||||
TIMEOUT = '120'
|
TIMEOUT = '120'
|
||||||
LOG_NAME = 'app.log'
|
LOG_NAME = 'app.log'
|
||||||
|
PARENT_VERSION_NUMBER = ''
|
||||||
|
POLL_INTERVAL_SEC = '2'
|
||||||
|
DOWNLOAD_POLL_MAX_ATTEMPTS = '60'
|
||||||
|
UPGRADE_POLL_MAX_ATTEMPTS = '600'
|
||||||
|
VERIFY_INTERVAL_SEC = '10'
|
||||||
|
VERIFY_MAX_ATTEMPTS = '12'
|
||||||
}
|
}
|
||||||
|
|
||||||
foreach ($name in $defaults.Keys) {
|
foreach ($name in $defaults.Keys) {
|
||||||
@ -647,8 +661,14 @@ function Wait-DownloadProgress {
|
|||||||
RateOfProgress = ''
|
RateOfProgress = ''
|
||||||
RawResponse = ''
|
RawResponse = ''
|
||||||
}
|
}
|
||||||
|
$maxAttempts = 60
|
||||||
|
[int]::TryParse([string]$Config.DOWNLOAD_POLL_MAX_ATTEMPTS, [ref]$maxAttempts) | Out-Null
|
||||||
|
if ($maxAttempts -lt 1) { $maxAttempts = 60 }
|
||||||
|
$pollIntervalSec = 2
|
||||||
|
[int]::TryParse([string]$Config.POLL_INTERVAL_SEC, [ref]$pollIntervalSec) | Out-Null
|
||||||
|
if ($pollIntervalSec -lt 0) { $pollIntervalSec = 2 }
|
||||||
|
|
||||||
for ($attempt = 0; $attempt -lt 60; $attempt++) {
|
for ($attempt = 0; $attempt -lt $maxAttempts; $attempt++) {
|
||||||
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
||||||
'Target-Node' = $NodeUrl
|
'Target-Node' = $NodeUrl
|
||||||
}
|
}
|
||||||
@ -681,7 +701,7 @@ function Wait-DownloadProgress {
|
|||||||
if ($progressParts.Count -gt 0) {
|
if ($progressParts.Count -gt 0) {
|
||||||
Write-Info ("Step 3.3b: async download progress -> {0}" -f ($progressParts -join ', '))
|
Write-Info ("Step 3.3b: async download progress -> {0}" -f ($progressParts -join ', '))
|
||||||
} else {
|
} else {
|
||||||
Write-Info ("Step 3.3b: async download progress polling... ({0}/60)" -f ($attempt + 1))
|
Write-Info ("Step 3.3b: async download progress polling... ({0}/{1})" -f ($attempt + 1), $maxAttempts)
|
||||||
}
|
}
|
||||||
|
|
||||||
if ($step -eq 'DONE' -or $status -eq 'completed' -or $successFlag -eq 'true' -or (($msg -eq 'success') -and ($progressValue -eq '100'))) {
|
if ($step -eq 'DONE' -or $status -eq 'completed' -or $successFlag -eq 'true' -or (($msg -eq 'success') -and ($progressValue -eq '100'))) {
|
||||||
@ -694,12 +714,64 @@ function Wait-DownloadProgress {
|
|||||||
throw "Node download failed: $message"
|
throw "Node download failed: $message"
|
||||||
}
|
}
|
||||||
|
|
||||||
Start-Sleep -Seconds 2
|
Start-Sleep -Seconds $pollIntervalSec
|
||||||
}
|
}
|
||||||
|
|
||||||
throw 'Node download timed out.'
|
throw 'Node download timed out.'
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function Read-DownloadProgress {
|
||||||
|
param($Config, [string]$Token, [string]$NodeUrl)
|
||||||
|
|
||||||
|
$query = Join-RequestPairs ([ordered]@{
|
||||||
|
applicationName = $Config.APP_NAME
|
||||||
|
moduleName = $Config.MODULE_NAME
|
||||||
|
airportCode = $Config.AIRPORT_CODE
|
||||||
|
versionNumber = $Config.VERSION_NUMBER
|
||||||
|
})
|
||||||
|
$progressUrl = "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/download-cloud/progress?$query"
|
||||||
|
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
||||||
|
'Target-Node' = $NodeUrl
|
||||||
|
}
|
||||||
|
|
||||||
|
$status = Get-ResponseValue -Response $response -Candidates @('status')
|
||||||
|
$successFlag = Get-ResponseValue -Response $response -Candidates @('success')
|
||||||
|
$step = Get-ResponseValue -Response $response -Candidates @('step')
|
||||||
|
$msg = Get-ResponseValue -Response $response -Candidates @('msg')
|
||||||
|
$progressValue = Get-ResponseValue -Response $response -Candidates @('rateOfProgress', 'progress', 'percent', 'data.rateOfProgress', 'data.progress', 'data.percent')
|
||||||
|
$message = Get-ResponseValue -Response $response -Candidates @('message')
|
||||||
|
if (-not $message) { $message = $msg }
|
||||||
|
$script:DownloadProgressState = [ordered]@{
|
||||||
|
Status = [string]$status
|
||||||
|
Success = [string]$successFlag
|
||||||
|
Step = [string]$step
|
||||||
|
Msg = [string]$msg
|
||||||
|
Message = [string]$message
|
||||||
|
RateOfProgress = [string]$progressValue
|
||||||
|
RawResponse = [string]$response
|
||||||
|
}
|
||||||
|
|
||||||
|
$progressParts = [System.Collections.Generic.List[string]]::new()
|
||||||
|
if ($msg) { $progressParts.Add("msg=$msg") }
|
||||||
|
if ($step) { $progressParts.Add("step=$step") }
|
||||||
|
if ($progressValue) { $progressParts.Add("rateOfProgress=$progressValue") }
|
||||||
|
if ($status) { $progressParts.Add("status=$status") }
|
||||||
|
if ($successFlag) { $progressParts.Add("success=$successFlag") }
|
||||||
|
if ($message -and $message -ne $msg) { $progressParts.Add("message=$message") }
|
||||||
|
|
||||||
|
if ($progressParts.Count -gt 0) {
|
||||||
|
Write-Info ("Step 3.3b: async download progress single query -> {0}" -f ($progressParts -join ', '))
|
||||||
|
} else {
|
||||||
|
Write-Info 'Step 3.3b: async download progress single query returned no explicit progress fields.'
|
||||||
|
}
|
||||||
|
|
||||||
|
if ((@($step, $message, $msg, $status) -join ' ') -match '(?i)fail|error') {
|
||||||
|
if (-not $message) { $message = $step }
|
||||||
|
if (-not $message) { $message = $msg }
|
||||||
|
throw "Node download failed: $message"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
function Create-DownloadTask {
|
function Create-DownloadTask {
|
||||||
param($Config, [string]$Token, [string]$NodeUrl)
|
param($Config, [string]$Token, [string]$NodeUrl)
|
||||||
|
|
||||||
@ -710,6 +782,11 @@ function Create-DownloadTask {
|
|||||||
moduleName = $Config.MODULE_NAME
|
moduleName = $Config.MODULE_NAME
|
||||||
timeOut = '0'
|
timeOut = '0'
|
||||||
})
|
})
|
||||||
|
if ($Config.PARENT_VERSION_NUMBER) {
|
||||||
|
$query += '&' + (Join-RequestPairs ([ordered]@{
|
||||||
|
parentVersionNumber = $Config.PARENT_VERSION_NUMBER
|
||||||
|
}))
|
||||||
|
}
|
||||||
|
|
||||||
[void](Invoke-PamWebRequest -Method GET -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/download-cloud?$query" -Token $Token -Headers @{
|
[void](Invoke-PamWebRequest -Method GET -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/download-cloud?$query" -Token $Token -Headers @{
|
||||||
'Target-Node' = $NodeUrl
|
'Target-Node' = $NodeUrl
|
||||||
@ -751,8 +828,14 @@ function Wait-UpgradeProgress {
|
|||||||
LastModify = ''
|
LastModify = ''
|
||||||
RawResponse = ''
|
RawResponse = ''
|
||||||
}
|
}
|
||||||
|
$maxAttempts = 600
|
||||||
|
[int]::TryParse([string]$Config.UPGRADE_POLL_MAX_ATTEMPTS, [ref]$maxAttempts) | Out-Null
|
||||||
|
if ($maxAttempts -lt 1) { $maxAttempts = 600 }
|
||||||
|
$pollIntervalSec = 2
|
||||||
|
[int]::TryParse([string]$Config.POLL_INTERVAL_SEC, [ref]$pollIntervalSec) | Out-Null
|
||||||
|
if ($pollIntervalSec -lt 0) { $pollIntervalSec = 2 }
|
||||||
|
|
||||||
for ($attempt = 0; $attempt -lt 60; $attempt++) {
|
for ($attempt = 0; $attempt -lt $maxAttempts; $attempt++) {
|
||||||
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
||||||
'Target-Node' = $NodeUrl
|
'Target-Node' = $NodeUrl
|
||||||
}
|
}
|
||||||
@ -797,7 +880,7 @@ function Wait-UpgradeProgress {
|
|||||||
if ($progressParts.Count -gt 1) {
|
if ($progressParts.Count -gt 1) {
|
||||||
Write-Info ("Step 3.4a: async upgrade progress -> {0}" -f ($progressParts -join ', '))
|
Write-Info ("Step 3.4a: async upgrade progress -> {0}" -f ($progressParts -join ', '))
|
||||||
} else {
|
} else {
|
||||||
Write-Info ("Step 3.4a: async upgrade progress polling... ip={0} ({1}/60)" -f $Ip, ($attempt + 1))
|
Write-Info ("Step 3.4a: async upgrade progress polling... ip={0} ({1}/{2})" -f $Ip, ($attempt + 1), $maxAttempts)
|
||||||
}
|
}
|
||||||
|
|
||||||
if ($step -eq 'DONE' -or $finish -eq 'true' -or $status -eq 'completed' -or $successFlag -eq 'true') {
|
if ($step -eq 'DONE' -or $finish -eq 'true' -or $status -eq 'completed' -or $successFlag -eq 'true') {
|
||||||
@ -821,12 +904,88 @@ function Wait-UpgradeProgress {
|
|||||||
throw "Node upgrade failed: ip=$Ip, message=$message"
|
throw "Node upgrade failed: ip=$Ip, message=$message"
|
||||||
}
|
}
|
||||||
|
|
||||||
Start-Sleep -Seconds 2
|
Start-Sleep -Seconds $pollIntervalSec
|
||||||
}
|
}
|
||||||
|
|
||||||
throw "Node upgrade timed out: ip=$Ip"
|
throw "Node upgrade timed out: ip=$Ip"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function Read-UpgradeProgress {
|
||||||
|
param(
|
||||||
|
$Config,
|
||||||
|
[string]$Token,
|
||||||
|
[string]$NodeUrl,
|
||||||
|
[string]$Ip
|
||||||
|
)
|
||||||
|
|
||||||
|
$query = Join-RequestPairs ([ordered]@{
|
||||||
|
applicationName = $Config.APP_NAME
|
||||||
|
moduleName = $Config.MODULE_NAME
|
||||||
|
airportCode = $Config.AIRPORT_CODE
|
||||||
|
versionNumber = $Config.VERSION_NUMBER
|
||||||
|
})
|
||||||
|
$progressUrl = "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/progress?$query"
|
||||||
|
$response = Invoke-PamWebRequest -Method GET -Url $progressUrl -Token $Token -Headers @{
|
||||||
|
'Target-Node' = $NodeUrl
|
||||||
|
}
|
||||||
|
$progressResponse = Get-ScopedResponseObject -Response $response -ScopeKey $Ip
|
||||||
|
|
||||||
|
$status = Get-ResponseValue -Response $progressResponse -Candidates @('status')
|
||||||
|
$successFlag = Get-ResponseValue -Response $progressResponse -Candidates @('success')
|
||||||
|
$step = Get-ResponseValue -Response $progressResponse -Candidates @('step')
|
||||||
|
$msg = Get-ResponseValue -Response $progressResponse -Candidates @('msg')
|
||||||
|
$progressValue = Get-ResponseValue -Response $progressResponse -Candidates @('rateOfProgress', 'progress', 'percent', 'data.rateOfProgress', 'data.progress', 'data.percent')
|
||||||
|
$message = Get-ResponseValue -Response $progressResponse -Candidates @('message')
|
||||||
|
$code = Get-ResponseValue -Response $progressResponse -Candidates @('code')
|
||||||
|
$finish = Get-ResponseValue -Response $progressResponse -Candidates @('finish')
|
||||||
|
$lastModify = Get-ResponseValue -Response $progressResponse -Candidates @('lastModify')
|
||||||
|
if (-not $message) { $message = $msg }
|
||||||
|
|
||||||
|
$script:UpgradeProgressState = [ordered]@{
|
||||||
|
Status = [string]$status
|
||||||
|
Success = [string]$successFlag
|
||||||
|
Step = [string]$step
|
||||||
|
Msg = [string]$msg
|
||||||
|
Message = [string]$message
|
||||||
|
RateOfProgress = [string]$progressValue
|
||||||
|
Code = [string]$code
|
||||||
|
Finish = [string]$finish
|
||||||
|
LastModify = [string]$lastModify
|
||||||
|
RawResponse = [string]$response
|
||||||
|
}
|
||||||
|
|
||||||
|
$progressParts = [System.Collections.Generic.List[string]]::new()
|
||||||
|
$progressParts.Add("ip=$Ip")
|
||||||
|
if ($msg) { $progressParts.Add("msg=$msg") }
|
||||||
|
if ($step) { $progressParts.Add("step=$step") }
|
||||||
|
if ($progressValue) { $progressParts.Add("rateOfProgress=$progressValue") }
|
||||||
|
if ($code) { $progressParts.Add("code=$code") }
|
||||||
|
if ($finish) { $progressParts.Add("finish=$finish") }
|
||||||
|
if ($status) { $progressParts.Add("status=$status") }
|
||||||
|
if ($successFlag) { $progressParts.Add("success=$successFlag") }
|
||||||
|
if ($lastModify) { $progressParts.Add("lastModify=$lastModify") }
|
||||||
|
if ($message -and $message -ne $msg) { $progressParts.Add("message=$message") }
|
||||||
|
|
||||||
|
if ($progressParts.Count -gt 1) {
|
||||||
|
Write-Info ("Step 3.4a: async upgrade progress single query -> {0}" -f ($progressParts -join ', '))
|
||||||
|
} else {
|
||||||
|
Write-Info ("Step 3.4a: async upgrade progress single query returned no explicit progress fields: ip={0}" -f $Ip)
|
||||||
|
}
|
||||||
|
|
||||||
|
if ($code -and $code -ne '0') {
|
||||||
|
if (-not $message) { $message = $msg }
|
||||||
|
if (-not $message) { $message = $step }
|
||||||
|
if (-not $message) { $message = "code=$code" }
|
||||||
|
throw "Node upgrade failed: ip=$Ip, message=$message"
|
||||||
|
}
|
||||||
|
|
||||||
|
if ((@($step, $message, $msg, $status) -join ' ') -match '(?i)fail|error') {
|
||||||
|
if (-not $message) { $message = $step }
|
||||||
|
if (-not $message) { $message = $msg }
|
||||||
|
throw "Node upgrade failed: ip=$Ip, message=$message"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
function Invoke-UpgradeRequest {
|
function Invoke-UpgradeRequest {
|
||||||
param($Config, [string]$Token, [string]$NodeUrl, [string]$Ip)
|
param($Config, [string]$Token, [string]$NodeUrl, [string]$Ip)
|
||||||
|
|
||||||
@ -945,16 +1104,16 @@ function Invoke-Rollback {
|
|||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
$body = Join-RequestPairs ([ordered]@{
|
$query = Join-RequestPairs ([ordered]@{
|
||||||
airportCode = $Config.AIRPORT_CODE
|
airportCode = $Config.AIRPORT_CODE
|
||||||
targetIp = $Ip
|
targetIp = $Ip
|
||||||
applicationName = $Config.APP_NAME
|
applicationName = $Config.APP_NAME
|
||||||
moduleName = $Config.MODULE_NAME
|
moduleName = $Config.MODULE_NAME
|
||||||
timeOut = $Config.TIMEOUT
|
timeOut = $Config.TIMEOUT
|
||||||
})
|
})
|
||||||
$response = Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/rollback" -Token $Token -Headers @{
|
$response = Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/rollback?$query" -Token $Token -Headers @{
|
||||||
'Target-Node' = $NodeUrl
|
'Target-Node' = $NodeUrl
|
||||||
} -Body $body -ContentType 'application/x-www-form-urlencoded'
|
}
|
||||||
|
|
||||||
$rollbackSuccess = Get-ResponseValue -Response $response -Candidates @('success')
|
$rollbackSuccess = Get-ResponseValue -Response $response -Candidates @('success')
|
||||||
if ($rollbackSuccess -and $rollbackSuccess -ne 'true') {
|
if ($rollbackSuccess -and $rollbackSuccess -ne 'true') {
|
||||||
@ -1273,7 +1432,7 @@ function Invoke-PamAction {
|
|||||||
'poll-download-progress' {
|
'poll-download-progress' {
|
||||||
$token = Invoke-FlowStep -Name 'Get-Token' -Action { Get-Token -Config $config }
|
$token = Invoke-FlowStep -Name 'Get-Token' -Action { Get-Token -Config $config }
|
||||||
$nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { Get-NodeUrl -Config $config -Token $token }
|
$nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { Get-NodeUrl -Config $config -Token $token }
|
||||||
Invoke-FlowStep -Name 'Wait-DownloadProgress' -Action { Wait-DownloadProgress -Config $config -Token $token -NodeUrl $nodeUrl } | Out-Null
|
Invoke-FlowStep -Name 'Read-DownloadProgress' -Action { Read-DownloadProgress -Config $config -Token $token -NodeUrl $nodeUrl } | Out-Null
|
||||||
Write-DownloadProgressResult
|
Write-DownloadProgressResult
|
||||||
}
|
}
|
||||||
'download-cloud-to-node' {
|
'download-cloud-to-node' {
|
||||||
@ -1287,7 +1446,7 @@ function Invoke-PamAction {
|
|||||||
Require-IpArgument -TargetIp $Ip
|
Require-IpArgument -TargetIp $Ip
|
||||||
$token = Invoke-FlowStep -Name 'Get-Token' -Action { Get-Token -Config $config }
|
$token = Invoke-FlowStep -Name 'Get-Token' -Action { Get-Token -Config $config }
|
||||||
$nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { Get-NodeUrl -Config $config -Token $token }
|
$nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { Get-NodeUrl -Config $config -Token $token }
|
||||||
Invoke-FlowStep -Name "Wait-UpgradeProgress[$Ip]" -Action { Wait-UpgradeProgress -Config $config -Token $token -NodeUrl $nodeUrl -Ip $Ip } | Out-Null
|
Invoke-FlowStep -Name "Read-UpgradeProgress[$Ip]" -Action { Read-UpgradeProgress -Config $config -Token $token -NodeUrl $nodeUrl -Ip $Ip } | Out-Null
|
||||||
Write-UpgradeProgressResult -Ip $Ip
|
Write-UpgradeProgressResult -Ip $Ip
|
||||||
}
|
}
|
||||||
'upgrade-ip' {
|
'upgrade-ip' {
|
||||||
|
|||||||
@ -57,6 +57,16 @@ usage() {
|
|||||||
ACTION_TYPE
|
ACTION_TYPE
|
||||||
TIMEOUT
|
TIMEOUT
|
||||||
LOG_NAME
|
LOG_NAME
|
||||||
|
PARENT_VERSION_NUMBER
|
||||||
|
POLL_INTERVAL_SEC
|
||||||
|
DOWNLOAD_POLL_MAX_ATTEMPTS
|
||||||
|
UPGRADE_POLL_MAX_ATTEMPTS
|
||||||
|
VERIFY_INTERVAL_SEC
|
||||||
|
VERIFY_MAX_ATTEMPTS
|
||||||
|
|
||||||
|
说明:
|
||||||
|
--action poll-download-progress 和 poll-upgrade-progress 只执行一次进度查询。
|
||||||
|
Agent workflow 会重复调用单次进度查询,并在每次返回后交给 LLM/规则审核判断是否完成。
|
||||||
EOF
|
EOF
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -342,6 +352,12 @@ set_defaults() {
|
|||||||
: "${ACTION_TYPE:=FULL}"
|
: "${ACTION_TYPE:=FULL}"
|
||||||
: "${TIMEOUT:=120}"
|
: "${TIMEOUT:=120}"
|
||||||
: "${LOG_NAME:=app.log}"
|
: "${LOG_NAME:=app.log}"
|
||||||
|
: "${PARENT_VERSION_NUMBER:=}"
|
||||||
|
: "${POLL_INTERVAL_SEC:=2}"
|
||||||
|
: "${DOWNLOAD_POLL_MAX_ATTEMPTS:=60}"
|
||||||
|
: "${UPGRADE_POLL_MAX_ATTEMPTS:=600}"
|
||||||
|
: "${VERIFY_INTERVAL_SEC:=10}"
|
||||||
|
: "${VERIFY_MAX_ATTEMPTS:=12}"
|
||||||
}
|
}
|
||||||
|
|
||||||
load_config() {
|
load_config() {
|
||||||
@ -366,7 +382,7 @@ load_config() {
|
|||||||
value="$(strip_inline_comment "$value")"
|
value="$(strip_inline_comment "$value")"
|
||||||
|
|
||||||
case "$key" in
|
case "$key" in
|
||||||
HOME_BASE_URL|CLIENT_ID|CLIENT_SECRET|AIRPORT_CODE|APP_NAME|MODULE_NAME|VERSION_NUMBER|ZIP_FILE_PATH|ACTION_TYPE|TIMEOUT|LOG_NAME)
|
HOME_BASE_URL|CLIENT_ID|CLIENT_SECRET|AIRPORT_CODE|APP_NAME|MODULE_NAME|VERSION_NUMBER|ZIP_FILE_PATH|ACTION_TYPE|TIMEOUT|LOG_NAME|PARENT_VERSION_NUMBER|POLL_INTERVAL_SEC|DOWNLOAD_POLL_MAX_ATTEMPTS|UPGRADE_POLL_MAX_ATTEMPTS|VERIFY_INTERVAL_SEC|VERIFY_MAX_ATTEMPTS)
|
||||||
printf -v "$key" '%s' "$value"
|
printf -v "$key" '%s' "$value"
|
||||||
;;
|
;;
|
||||||
esac
|
esac
|
||||||
@ -961,8 +977,6 @@ get_online_ips() {
|
|||||||
|
|
||||||
poll_download_progress() {
|
poll_download_progress() {
|
||||||
local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumber=${VERSION_NUMBER}"
|
local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumber=${VERSION_NUMBER}"
|
||||||
local attempt=0
|
|
||||||
local max_attempts=60
|
|
||||||
local error_regex='[Ff]ail|[Ee]rror'
|
local error_regex='[Ff]ail|[Ee]rror'
|
||||||
|
|
||||||
DOWNLOAD_PROGRESS_STATUS=""
|
DOWNLOAD_PROGRESS_STATUS=""
|
||||||
@ -973,7 +987,6 @@ poll_download_progress() {
|
|||||||
DOWNLOAD_PROGRESS_RATE=""
|
DOWNLOAD_PROGRESS_RATE=""
|
||||||
DOWNLOAD_PROGRESS_RESPONSE=""
|
DOWNLOAD_PROGRESS_RESPONSE=""
|
||||||
|
|
||||||
while (( attempt < max_attempts )); do
|
|
||||||
local response
|
local response
|
||||||
response=$(http_request "GET" "$progress_url" "" "" "Target-Node: ${NODE_URL}") || return 1
|
response=$(http_request "GET" "$progress_url" "" "" "Target-Node: ${NODE_URL}") || return 1
|
||||||
|
|
||||||
@ -1010,28 +1023,42 @@ poll_download_progress() {
|
|||||||
[[ -n "$status" ]] && progress_parts+=("status=${status}")
|
[[ -n "$status" ]] && progress_parts+=("status=${status}")
|
||||||
[[ -n "$success_flag" ]] && progress_parts+=("success=${success_flag}")
|
[[ -n "$success_flag" ]] && progress_parts+=("success=${success_flag}")
|
||||||
[[ -n "$message" && "$message" != "$msg_value" ]] && progress_parts+=("message=${message}")
|
[[ -n "$message" && "$message" != "$msg_value" ]] && progress_parts+=("message=${message}")
|
||||||
log_info "Step 3.3b: 异步下载进度 -> ${progress_parts[*]}"
|
log_info "Step 3.3b: 异步下载进度单次查询 -> ${progress_parts[*]}"
|
||||||
else
|
else
|
||||||
log_info "Step 3.3b: 异步下载进度轮询中... ($((attempt + 1))/${max_attempts})"
|
log_info "Step 3.3b: 异步下载进度单次查询未返回明确进度字段。"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if [[ "$step_value" == "DONE" || "$status" == "completed" || "$success_flag" == "true" ]]; then
|
if [[ "${step_value} ${message} ${msg_value} ${status}" =~ $error_regex ]]; then
|
||||||
return 0
|
|
||||||
fi
|
|
||||||
|
|
||||||
if [[ "$msg_value" == "success" && "$progress_value" == "100" ]]; then
|
|
||||||
return 0
|
|
||||||
fi
|
|
||||||
|
|
||||||
if [[ "${step_value} ${message} ${msg_value}" =~ $error_regex ]]; then
|
|
||||||
[[ -z "$message" ]] && message="$step_value"
|
[[ -z "$message" ]] && message="$step_value"
|
||||||
[[ -z "$message" ]] && message="$msg_value"
|
[[ -z "$message" ]] && message="$msg_value"
|
||||||
log_error "Node 下载失败: $message"
|
log_error "Node 下载失败: $message"
|
||||||
return 1
|
return 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
return 0
|
||||||
|
}
|
||||||
|
|
||||||
|
download_progress_complete() {
|
||||||
|
[[ "$DOWNLOAD_PROGRESS_STEP" == "DONE" || "$DOWNLOAD_PROGRESS_STATUS" == "completed" || "$DOWNLOAD_PROGRESS_SUCCESS" == "true" ]] && return 0
|
||||||
|
[[ "$DOWNLOAD_PROGRESS_MSG" == "success" && "$DOWNLOAD_PROGRESS_RATE" == "100" ]] && return 0
|
||||||
|
return 1
|
||||||
|
}
|
||||||
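For readers following the Agent side, the same completion rule as `download_progress_complete` can be written in Python. This is an illustrative sketch: `download_complete` is a hypothetical name, and the field values are assumed to be strings, as in the shell variables.

```python
def download_complete(progress: dict) -> bool:
    """Return True when one progress response indicates the download has finished."""
    if (
        progress.get("step") == "DONE"
        or progress.get("status") == "completed"
        or progress.get("success") == "true"
    ):
        return True
    # Fallback rule: a "success" message only counts together with 100% progress.
    return progress.get("msg") == "success" and progress.get("rateOfProgress") == "100"
```

Keeping this check separate from the query is what allows the poll action to return after a single request and leave the completion decision to the caller.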
|
|
||||||
|
wait_download_progress() {
|
||||||
|
local attempt=0
|
||||||
|
local max_attempts="${DOWNLOAD_POLL_MAX_ATTEMPTS:-60}"
|
||||||
|
local interval_sec="${POLL_INTERVAL_SEC:-2}"
|
||||||
|
[[ "$max_attempts" =~ ^[0-9]+$ ]] || max_attempts=60
|
||||||
|
[[ -n "$interval_sec" ]] || interval_sec=2
|
||||||
|
|
||||||
|
while (( attempt < max_attempts )); do
|
||||||
|
poll_download_progress || return 1
|
||||||
|
if download_progress_complete; then
|
||||||
|
return 0
|
||||||
|
fi
|
||||||
attempt=$((attempt + 1))
|
attempt=$((attempt + 1))
|
||||||
sleep 2
|
log_info "Step 3.3b: 异步下载进度未完成,等待下一次查询... (${attempt}/${max_attempts})"
|
||||||
|
sleep "$interval_sec"
|
||||||
done
|
done
|
||||||
|
|
||||||
log_error "Node 下载超时。"
|
log_error "Node 下载超时。"
|
||||||
@ -1040,8 +1067,13 @@ poll_download_progress() {
|
|||||||
|
|
||||||
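The split between a single-shot query and a bounded wait loop can be sketched generically. This is an assumption-labelled illustration rather than the Agent's actual runner: `wait_for_completion`, `query_once`, and `is_complete` are hypothetical names, and the injectable `sleep` parameter exists only to make the loop testable without real delays.

```python
import time
from typing import Callable, Dict


def wait_for_completion(
    query_once: Callable[[], Dict[str, str]],
    is_complete: Callable[[Dict[str, str]], bool],
    max_attempts: int = 60,
    interval_sec: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Repeat a single-shot progress query until it completes or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        progress = query_once()
        if is_complete(progress):
            return progress
        print(f"not complete, waiting for next query... ({attempt}/{max_attempts})")
        sleep(interval_sec)
    raise TimeoutError("progress query did not complete in time")
```

The defaults of 60 attempts and a 2-second interval match `DOWNLOAD_POLL_MAX_ATTEMPTS` and `POLL_INTERVAL_SEC`; a caller waiting on an upgrade would pass 600 attempts instead.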
create_download_task() {
|
create_download_task() {
|
||||||
log_info "Step 3.3: 下载软件包到 Node..."
|
log_info "Step 3.3: 下载软件包到 Node..."
|
||||||
|
local download_query="versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=0"
|
||||||
|
if [[ -n "${PARENT_VERSION_NUMBER:-}" ]]; then
|
||||||
|
download_query="${download_query}&parentVersionNumber=${PARENT_VERSION_NUMBER}"
|
||||||
|
fi
|
||||||
|
|
||||||
http_request "GET" \
|
http_request "GET" \
|
||||||
"${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud?versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=0" \
|
"${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud?${download_query}" \
|
||||||
"" \
|
"" \
|
||||||
"" \
|
"" \
|
||||||
"Target-Node: ${NODE_URL}" \
|
"Target-Node: ${NODE_URL}" \
|
||||||
@ -1050,14 +1082,12 @@ create_download_task() {
|
|||||||
|
|
||||||
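The optional `parentVersionNumber` parameter follows a common pattern: build the required query first, then append the optional pair only when it is set. A minimal Python sketch of that pattern follows; `build_download_query` is a hypothetical helper, and unlike the shell version it URL-encodes values through `urllib.parse.urlencode`.

```python
from urllib.parse import urlencode


def build_download_query(
    version_number: str,
    app_name: str,
    module_name: str,
    parent_version_number: str = "",
) -> str:
    """Build the download-cloud query string, adding parentVersionNumber only when set."""
    params = {
        "versionNumber": version_number,
        "applicationName": app_name,
        "moduleName": module_name,
        "timeOut": "0",
    }
    if parent_version_number:
        params["parentVersionNumber"] = parent_version_number
    return urlencode(params)
```

Leaving the key out when it is empty, rather than sending `parentVersionNumber=`, keeps the request identical to the previous behaviour for configs that do not set it.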
download_cloud_to_node() {
|
download_cloud_to_node() {
|
||||||
create_download_task || return 1
|
create_download_task || return 1
|
||||||
poll_download_progress
|
wait_download_progress
|
||||||
}
|
}
|
||||||
|
|
||||||
poll_upgrade_progress() {
|
poll_upgrade_progress() {
|
||||||
local ip="$1"
|
local ip="$1"
|
||||||
local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumber=${VERSION_NUMBER}"
|
local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumber=${VERSION_NUMBER}"
|
||||||
local attempt=0
|
|
||||||
local max_attempts=600
|
|
||||||
local error_regex='[Ff]ail|[Ee]rror'
|
local error_regex='[Ff]ail|[Ee]rror'
|
||||||
|
|
||||||
UPGRADE_PROGRESS_STATUS=""
|
UPGRADE_PROGRESS_STATUS=""
|
||||||
@ -1071,7 +1101,6 @@ poll_upgrade_progress() {
|
|||||||
UPGRADE_PROGRESS_LAST_MODIFY=""
|
UPGRADE_PROGRESS_LAST_MODIFY=""
|
||||||
UPGRADE_PROGRESS_RESPONSE=""
|
UPGRADE_PROGRESS_RESPONSE=""
|
||||||
|
|
||||||
while (( attempt < max_attempts )); do
|
|
||||||
local response
|
local response
|
||||||
response=$(http_request "GET" "$progress_url" "" "" "Target-Node: ${NODE_URL}") || return 1
|
response=$(http_request "GET" "$progress_url" "" "" "Target-Node: ${NODE_URL}") || return 1
|
||||||
|
|
||||||
@ -1120,17 +1149,9 @@ poll_upgrade_progress() {
|
|||||||
[[ -n "$success_flag" ]] && progress_parts+=("success=${success_flag}")
|
[[ -n "$success_flag" ]] && progress_parts+=("success=${success_flag}")
|
||||||
[[ -n "$last_modify_value" ]] && progress_parts+=("lastModify=${last_modify_value}")
|
[[ -n "$last_modify_value" ]] && progress_parts+=("lastModify=${last_modify_value}")
|
||||||
[[ -n "$message" && "$message" != "$msg_value" ]] && progress_parts+=("message=${message}")
|
[[ -n "$message" && "$message" != "$msg_value" ]] && progress_parts+=("message=${message}")
|
||||||
log_info "Step 3.4a: async push progress -> ${progress_parts[*]}"
|
log_info "Step 3.4a: async push progress single query -> ${progress_parts[*]}"
|
||||||
else
|
else
|
||||||
log_info "Step 3.4a: async push progress polling... ip=${ip} ($((attempt + 1))/${max_attempts})"
|
log_info "Step 3.4a: async push progress single query returned no explicit progress fields: ip=${ip}"
|
||||||
fi
|
|
||||||
|
|
||||||
if [[ "$step_value" == "DONE" || "$finish_value" == "true" || "$status" == "completed" || "$success_flag" == "true" ]]; then
|
|
||||||
return 0
|
|
||||||
fi
|
|
||||||
|
|
||||||
if [[ "$msg_value" == "success" && "$progress_value" == "100" ]] && [[ -z "$code_value" || "$code_value" == "0" ]]; then
|
|
||||||
return 0
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if [[ -n "$code_value" && "$code_value" != "0" ]]; then
|
if [[ -n "$code_value" && "$code_value" != "0" ]]; then
|
||||||
@ -1148,8 +1169,31 @@ poll_upgrade_progress() {
|
|||||||
return 1
|
return 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
return 0
|
||||||
|
}
|
||||||
|
|
||||||
|
upgrade_progress_complete() {
|
||||||
|
[[ "$UPGRADE_PROGRESS_STEP" == "DONE" || "$UPGRADE_PROGRESS_FINISH" == "true" || "$UPGRADE_PROGRESS_STATUS" == "completed" || "$UPGRADE_PROGRESS_SUCCESS" == "true" ]] && return 0
|
||||||
|
[[ "$UPGRADE_PROGRESS_MSG" == "success" && "$UPGRADE_PROGRESS_RATE" == "100" ]] && [[ -z "$UPGRADE_PROGRESS_CODE" || "$UPGRADE_PROGRESS_CODE" == "0" ]] && return 0
|
||||||
|
return 1
|
||||||
|
}
|
||||||
|
|
||||||
|
wait_upgrade_progress() {
|
||||||
|
local ip="$1"
|
||||||
|
local attempt=0
|
||||||
|
local max_attempts="${UPGRADE_POLL_MAX_ATTEMPTS:-600}"
|
||||||
|
local interval_sec="${POLL_INTERVAL_SEC:-2}"
|
||||||
|
[[ "$max_attempts" =~ ^[0-9]+$ ]] || max_attempts=600
|
||||||
|
[[ -n "$interval_sec" ]] || interval_sec=2
|
||||||
|
|
||||||
|
while (( attempt < max_attempts )); do
|
||||||
|
poll_upgrade_progress "$ip" || return 1
|
||||||
|
if upgrade_progress_complete; then
|
||||||
|
return 0
|
||||||
|
fi
|
||||||
attempt=$((attempt + 1))
|
attempt=$((attempt + 1))
|
||||||
sleep 2
|
log_info "Step 3.4a: async push progress not complete, waiting for next query... ip=${ip} (${attempt}/${max_attempts})"
|
||||||
|
sleep "$interval_sec"
|
||||||
done
|
done
|
||||||
|
|
||||||
log_error "Node push timed out: ip=${ip}"
|
log_error "Node push timed out: ip=${ip}"
|
||||||
@@ -1257,10 +1301,11 @@ rollback_ip() {
     fi
 
     local response
+    local rollback_query="airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}"
     if ! response=$(http_request "POST" \
-        "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/rollback" \
-        "airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}" \
-        "application/x-www-form-urlencoded" \
+        "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/rollback?${rollback_query}" \
+        "" \
+        "" \
         "Target-Node: ${NODE_URL}"); then
         printf '%s' "ROLLBACK_REQUEST_FAILED"
         return 0
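The change above moves the rollback parameters from a form-encoded body into the URL query string. A minimal Python sketch of building the same URL is shown below; `build_rollback_url` is a hypothetical helper, and unlike the shell version (which concatenates raw values) it percent-encodes each value with `urllib.parse.urlencode`.

```python
from urllib.parse import urlencode


def build_rollback_url(
    home_base_url: str,
    airport_code: str,
    ip: str,
    app_name: str,
    module_name: str,
    timeout: int,
) -> str:
    """Build the rollback endpoint URL with all parameters in the query string."""
    # Parameter names and order follow the rollback_query string in the hunk above.
    query = urlencode(
        {
            "airportCode": airport_code,
            "targetIp": ip,
            "applicationName": app_name,
            "moduleName": module_name,
            "timeOut": timeout,
        }
    )
    return (
        f"{home_base_url}/node-proxy/{airport_code}"
        f"/api/mcp/version/upgrade/rollback?{query}"
    )
```

Encoding the values is a deliberate difference from the script: it keeps the URL valid if an application or module name ever contains spaces or `&`.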
@@ -1522,7 +1567,7 @@ deploy_one_ip() {
         return
     fi
 
-    if ! run_flow_step "poll_upgrade_progress[${ip}]" poll_upgrade_progress "$ip"; then
+    if ! run_flow_step "wait_upgrade_progress[${ip}]" wait_upgrade_progress "$ip"; then
         local message
         message="$UPGRADE_PROGRESS_MESSAGE"
         [[ -z "$message" ]] && message="$UPGRADE_PROGRESS_MSG"
|
|||||||
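@@ -1,6 +1,6 @@
 # Flowcharts of the current overall logic and structure
 
-This document describes the main modules, run paths, LLM review, manual confirmation points, hot updates, and checkpoint resume logic of the current PAM deployment Agent.
+This document describes the main modules, run paths, LLM review, failure retry, explicit rollback, hot updates, and checkpoint resume logic of the current PAM deployment Agent.
 
 ## Module structure
 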
@ -27,7 +27,7 @@ flowchart TD
|
|||||||
REAL --> AGENT
|
REAL --> AGENT
|
||||||
|
|
||||||
LGR --> AGENT
|
LGR --> AGENT
|
||||||
LGR --> LGCHECK[LangGraph InMemorySaver checkpointer/interrupt]
|
LGR --> LGCHECK[LangGraph InMemorySaver checkpointer]
|
||||||
AGENT --> ROUTER[ActionRouter]
|
AGENT --> ROUTER[ActionRouter]
|
||||||
ROUTER --> SCRIPT[ScriptActionRunner]
|
ROUTER --> SCRIPT[ScriptActionRunner]
|
||||||
ROUTER --> MCP[McpActionRunner]
|
ROUTER --> MCP[McpActionRunner]
|
||||||
@ -68,8 +68,8 @@ flowchart TD
|
|||||||
C --> D[build_action_backends 生成 action 路由表]
|
C --> D[build_action_backends 生成 action 路由表]
|
||||||
D --> E[LangGraph entry 节点]
|
D --> E[LangGraph entry 节点]
|
||||||
|
|
||||||
E --> F{是否存在 pending_confirmation}
|
E --> F{是否已暂停}
|
||||||
F -- 是 --> P[confirm interrupt 节点]
|
F -- 是 --> R[render_report 输出报告]
|
||||||
F -- 否 --> G[global_action 节点循环]
|
F -- 否 --> G[global_action 节点循环]
|
||||||
|
|
||||||
G --> G1[get-token]
|
G --> G1[get-token]
|
||||||
@ -79,15 +79,24 @@ flowchart TD
|
|||||||
G4 --> G5[get-node-url]
|
G4 --> G5[get-node-url]
|
||||||
G5 --> G6[get-online-ips]
|
G5 --> G6[get-online-ips]
|
||||||
G6 --> G7[create-download-task]
|
G6 --> G7[create-download-task]
|
||||||
G7 --> G8[poll-download-progress]
|
G7 --> G8[poll-download-progress 单次查询]
|
||||||
G8 --> H[prepare_ip 节点选择下一个 IP action]
|
G8 --> G9{LLM/规则判断下载完成}
|
||||||
|
G9 -- 未完成且正常 --> G8
|
||||||
|
G9 -- 已完成 --> H[prepare_ip 节点选择下一个 IP action]
|
||||||
|
G9 -- 异常或超时 --> R
|
||||||
|
|
||||||
H --> I[resolve_target_ips 计算目标 IP]
|
H --> I[resolve_target_ips 计算目标 IP]
|
||||||
I --> J[ip_action 节点执行 upgrade-ip]
|
I --> J[ip_action 节点执行 upgrade-ip]
|
||||||
J --> K[ip_action 节点执行 poll-upgrade-progress]
|
J --> K[ip_action 节点执行 poll-upgrade-progress 单次查询]
|
||||||
K --> L[ip_action 节点执行 start-ip]
|
K --> K1{LLM/规则判断推送完成}
|
||||||
|
K1 -- 未完成且正常 --> K
|
||||||
|
K1 -- 已完成 --> L[ip_action 节点执行 start-ip]
|
||||||
|
K1 -- 异常或超时 --> R
|
||||||
L --> M[ip_action 节点执行 verify-ip]
|
L --> M[ip_action 节点执行 verify-ip]
|
||||||
M --> N[ip_action 节点执行 download-log]
|
M --> M1{健康检查通过或达到最大次数}
|
||||||
|
M1 -- 未通过且未超时 --> M
|
||||||
|
M1 -- 已通过 --> N[ip_action 节点执行 download-log]
|
||||||
|
M1 -- 仍未通过且超时 --> R
|
||||||
N --> O{还有下一个 IP}
|
N --> O{还有下一个 IP}
|
||||||
O -- 是 --> J
|
O -- 是 --> J
|
||||||
O -- 否 --> R[render_report 输出报告]
|
O -- 否 --> R[render_report 输出报告]
|
||||||
@ -109,15 +118,15 @@ flowchart LR
|
|||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart TD
|
flowchart TD
|
||||||
A[action 执行完成] --> C[整理 ActionResult 和 AgentState 摘要]
|
A[action 执行完成] --> C[整理当前 ActionResult]
|
||||||
C --> D[敏感字段脱敏并截断长日志]
|
C --> D[敏感字段脱敏;仅在异常时附带必要诊断日志]
|
||||||
D --> E{真实 LLM 是否配置}
|
D --> E{真实 LLM 是否配置}
|
||||||
E -- 是 --> F[OpenAICompatibleLlmClient 输出结构化审核]
|
E -- 是 --> F[OpenAICompatibleLlmClient 输出结构化审核]
|
||||||
E -- 否 --> G[RuleBasedLlmClient 本地规则审核]
|
E -- 否 --> G[RuleBasedLlmClient 本地规则审核]
|
||||||
F --> H{should_continue}
|
F --> H{should_continue}
|
||||||
G --> H
|
G --> H
|
||||||
H -- true --> I[继续后续 action]
|
H -- true --> I[标记 action completed 并继续后续 action]
|
||||||
H -- false --> J[暂停流程并写入 review_context]
|
H -- false --> J[不写 completed,暂停流程并写入 review_context]
|
||||||
J --> K[chat/CLI 播报审核建议并等待 resume]
|
J --> K[chat/CLI 播报审核建议并等待 resume]
|
||||||
F --> L{是否开启 analyze-actions}
|
F --> L{是否开启 analyze-actions}
|
||||||
G --> L
|
G --> L
|
||||||
@@ -128,37 +137,78 @@ flowchart TD
 Notes:
 
 - Every action goes through one review after it completes; this no longer depends on the `--analyze-actions` switch.
+- The review input contains only the current action's structured result and the necessary diagnostic logs. The full runtime `state_summary` is no longer passed in, so historical state does not interfere with the model's judgment.
 - `--analyze-actions` or `llm action-analysis on` only controls whether detailed review results are written to `events`.
+- An action is written to `completed_global_steps` or `ip_states[ip].completed_steps` only when it executed successfully and the review allows the flow to continue.
+- If the review recommends stopping, or the review itself fails, the current action is not counted as completed and `resume` retries it.
 - If the review itself fails, a "stop" review result is still generated and the flow pauses, so execution never continues as a black box.
 
-## Failure, manual confirmation, and resume
+## verify-ip health-check retry
 
+```mermaid
+flowchart TD
+    A[Run verify-ip] --> B[LLM/rule review of the single response]
+    B --> C{Is SUCCESS true}
+    C -- Yes --> D[Clear the retry counter, mark verify-ip completed]
+    C -- No --> E{VERIFY_MAX_ATTEMPTS reached}
+    E -- No --> F[Announce ACTION_PROGRESS and save checkpoint]
+    F --> G[Wait VERIFY_INTERVAL_SEC]
+    G --> A
+    E -- Yes --> H[Pause at verify-ip, write review_context]
+```
+
+Notes:
+
+- `verify-ip` is the health check run after the application starts. On failure it retries every `10` seconds by default, up to `12` times, about two minutes in total.
+- The retry parameters come from `VERIFY_INTERVAL_SEC` and `VERIFY_MAX_ATTEMPTS` and can be hot-updated through `config.txt`, chat `set`, or `load params`.
+- Until the maximum number of attempts is reached, `verify-ip` is not written to completed and `download-log` is not entered; after an interruption or failure, `resume` still continues from `verify-ip`.
+
+## Progress-query action semantics
+
+```mermaid
+flowchart TD
+    A[poll-download-progress / poll-upgrade-progress] --> B[Run one progress query]
+    B --> C[ActionResult returns structured progress fields]
+    C --> D[LLM/rule review of progress_complete]
+    D --> E{Complete}
+    E -- Yes --> F[Write completed, move to the next action]
+    E -- No but healthy --> G[Append ACTION_PROGRESS, save checkpoint]
+    G --> H[Wait POLL_INTERVAL_SEC]
+    H --> A
+    E -- Error --> I[Pause at the current progress action]
+    G --> J{Maximum query count reached}
+    J -- Yes --> I
+    J -- No --> H
+```
+
+- `poll-download-progress` and `poll-upgrade-progress` no longer loop for a long time inside the script; the script, MCP, or fake backend returns the result of a single progress query per call.
+- The LLM/rules use `progress_complete` to decide whether progress is complete. When it is incomplete but healthy, `should_continue=true` and `progress_complete=false`, and the workflow keeps the current action and queries again.
+- The query interval is controlled by `POLL_INTERVAL_SEC`, the maximum number of download queries by `DOWNLOAD_POLL_MAX_ATTEMPTS`, and the maximum number of per-IP push queries by `UPGRADE_POLL_MAX_ATTEMPTS`.
+- Every progress query announces `ACTION_PROGRESS` and saves a checkpoint; after an interruption or failure, `resume` continues from the same progress action.
+
+## Failure, explicit rollback, and resume
+
 ```mermaid
 flowchart TD
     A[Per-IP action execution] --> B{Action failed or business validation failed}
-    B -- No --> C[Record completed_steps and save checkpoint]
-    C --> C1{Does the LLM review allow continuing}
-    C1 -- Yes --> C2[Continue with later actions]
-    C1 -- No --> G[Save checkpoint and pause]
+    B -- No --> C{Does the LLM review allow continuing}
+    C -- Yes --> C1[Record completed_steps and save checkpoint]
+    C1 --> C2[Continue with later actions]
+    C -- No --> G[Do not record completed_steps, save checkpoint and pause]
     B -- Yes --> D[Record ip_state as FAILED]
-    D --> E[download-log best-effort log download]
-    E --> F[Set pending_confirmation=rollback-ip:IP]
+    D --> F[Save failed_stage and failure_reason]
     F --> G[Save checkpoint and pause]
 
-    G --> LG{Started from a CLI/chat graph run}
-    LG -- Yes --> LGI[LangGraph interrupt emits the confirmation request]
-    LGI --> LGRS[approve/reject resumes through Command resume]
-    LGRS --> H{User decision}
-    LG -- No --> H{User decision}
-    H -- approve --> I[confirm_pending runs rollback-ip]
-    I --> J{Rollback succeeded}
-    J -- Yes --> K[Clear pending_confirmation]
-    J -- No --> L[Keep pending_confirmation and wait to be handled again]
-    H -- reject --> M[Mark REJECTED_BY_OPERATOR and clear pending_confirmation]
-
-    K --> N[resume continues the run]
-    M --> N
-    N --> O[Skip completed global steps, successful IPs, and completed per-IP actions]
+    G --> H{User decision}
+    H -- Fix and continue --> I[resume clears paused]
+    I --> J[next_ip_action returns failed_stage]
+    J --> K[Retry the current action]
+    H -- Rollback needed --> L[rollback IP explicitly runs rollback-ip]
+    L --> M{Rollback succeeded}
+    M -- Yes --> N[Mark ROLLBACK_DONE]
+    M -- No --> O[Pause as rollback_failed]
+    N --> P[resume continues the run]
+    P --> Q[Skip completed global steps, successful IPs, rolled-back IPs, and completed per-IP actions]
 ```
 
 ## User interruption and hot updates
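The review gate described in this hunk (an action is marked completed only when it succeeded and the review allows continuing; an incomplete but healthy progress or health check is retried until a maximum is reached) can be sketched as below. This is an illustrative sketch under assumptions: `Review`, `StepState`, `apply_review` and the `"next"`/`"retry"`/`"pause"` outcomes are hypothetical names, not the Agent's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Review:
    """Hypothetical structured review result."""
    should_continue: bool
    progress_complete: bool = True


@dataclass
class StepState:
    """Hypothetical slice of the checkpointed state for one IP."""
    completed_steps: list[str] = field(default_factory=list)
    attempts: dict[str, int] = field(default_factory=dict)


def apply_review(
    state: StepState, action: str, ok: bool, review: Review, max_attempts: int
) -> str:
    """Return "next", "retry" or "pause" and update the state accordingly."""
    if not ok or not review.should_continue:
        # Failed or blocked: leave completed_steps untouched so resume retries.
        return "pause"
    if not review.progress_complete:
        # Healthy but unfinished: count the attempt and keep the same action.
        count = state.attempts.get(action, 0) + 1
        state.attempts[action] = count
        return "pause" if count >= max_attempts else "retry"
    # Succeeded and approved: clear the counter and record completion.
    state.attempts.pop(action, None)
    state.completed_steps.append(action)
    return "next"
```

Because `completed_steps` is only written on the `"next"` path, a checkpoint saved after `"retry"` or `"pause"` still points at the same action.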
@@ -183,13 +233,16 @@ flowchart TD
 ## Checkpoint resume semantics
 
 - `completed_global_steps`: actions already completed in the global stage are skipped.
+- `completed_global_steps` records only global actions that executed successfully and passed review; it is not written early when the review blocks, so `resume` retries that action.
 - `ip_states[ip].status == SUCCESS`: successful IPs are skipped.
-- `ip_states[ip].completed_steps`: actions already completed for the same IP are skipped.
-- `pending_confirmation`: while a confirmation is pending, the deployment flow does not continue; `approve` or `reject` must be issued first.
-- `paused` / `pause_reason`: the flow may pause because of an LLM review block, a user interrupt, a rollback failure, or similar causes; `resume` clears the pause flag first and then continues.
+- `ip_states[ip].rollback_status == ROLLBACK_DONE`: failed IPs that were explicitly rolled back are skipped, and the run continues with the remaining targets.
+- `ip_states[ip].failed_stage`: when a failed IP has not been rolled back, `resume` retries from that action.
+- `ip_states[ip].completed_steps`: actions for the same IP that completed and passed review are skipped; it is not written early when the review blocks, so `resume` retries the current action.
+- `pending_confirmation`: kept only as a compatibility field for old checkpoints and the old confirm entry point; the new failure flow no longer sets it automatically.
+- `paused` / `pause_reason`: the flow may pause because of an action failure, an LLM review block, a user interrupt, a rollback failure, or similar causes; `resume` clears the pause flag first and then continues.
 - `review_context`: stores the review advice, failure reason, IP, and stage from the most recent pause, so chat/CLI can show them to the user.
-- CLI/chat run scheduling is executed by `langgraph_runtime.py` through action-level LangGraph nodes; the confirmation points of chat and CLI confirm use LangGraph interrupt and InMemorySaver.
-- Cross-process resume still reads the business checkpoint JSON; the LangGraph checkpointer handles in-process graph recovery and interrupt resume.
+- CLI/chat run scheduling is executed by `langgraph_runtime.py` through action-level LangGraph nodes; failure pause and resume rely on the business checkpoint JSON.
+- Cross-process resume reads the business checkpoint JSON; the LangGraph checkpointer handles in-process graph state persistence.
 - To support real resume, the checkpoint stores the full parameters; keep it in a controlled directory.
 
 ## Real external capability integration points
|
|||||||
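The resume rules listed in the checkpoint section above can be illustrated with a small selection function. This is a sketch under assumptions: the per-IP state is modelled as a plain dictionary, `next_ip_action` here is a simplified stand-in for the function of the same name in the flowchart, and the action order is taken from the per-IP flow shown earlier in the document.

```python
from typing import Any, Mapping, Optional

# Per-IP action order as drawn in the deployment flowchart.
IP_ACTIONS = (
    "upgrade-ip",
    "poll-upgrade-progress",
    "start-ip",
    "verify-ip",
    "download-log",
)


def next_ip_action(ip_state: Mapping[str, Any]) -> Optional[str]:
    """Return the action to run next for one IP, or None to skip the IP."""
    # Successful and explicitly rolled-back IPs are skipped entirely.
    if ip_state.get("status") == "SUCCESS":
        return None
    if ip_state.get("rollback_status") == "ROLLBACK_DONE":
        return None
    # A failed IP that was not rolled back retries the failed action.
    failed_stage = ip_state.get("failed_stage")
    if failed_stage:
        return failed_stage
    # Otherwise continue with the first action not yet completed.
    done = set(ip_state.get("completed_steps", []))
    for action in IP_ACTIONS:
        if action not in done:
            return action
    return None
```

For example, a state with `failed_stage` set to `verify-ip` and no rollback yields `verify-ip` again, which is the retry behaviour the document attributes to `resume`.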
@ -11,14 +11,14 @@
|
|||||||
- [x] 增加参数确认和目标 IP 范围确认,不只在回滚阶段确认。
|
- [x] 增加参数确认和目标 IP 范围确认,不只在回滚阶段确认。
|
||||||
- [x] 增加 LLM/MCP 配置热加载,例如 `llm config`、`mcp config`。
|
- [x] 增加 LLM/MCP 配置热加载,例如 `llm config`、`mcp config`。
|
||||||
- [x] 增加执行中 `Ctrl+C` 中断处理:保存 checkpoint、标记 `user_interrupted`,再由 `resume` 继续。
|
- [x] 增加执行中 `Ctrl+C` 中断处理:保存 checkpoint、标记 `user_interrupted`,再由 `resume` 继续。
|
||||||
- [x] 将 chat 的人工确认点接入 LangGraph interrupt/checkpointer;`run` 执行到回滚确认点后由 interrupt 暂停,`approve/reject` 通过 `Command(resume=...)` 恢复同一图线程。跨进程续跑仍保留业务 checkpoint JSON。
|
- [x] 将 chat 执行接入 action 级 LangGraph runtime;action 失败或审核阻断后保存 checkpoint 并暂停,`resume` 从当前 action 重试,`rollback [IP]` 作为显式命令单独执行。
|
||||||
|
|
||||||
## LLM action 后分析
|
## LLM action 后分析
|
||||||
|
|
||||||
- [x] 每次 action 完成后,可把 `action`、`backend`、`ok`、`values`、`stderr`、`error_summary` 和当前 `AgentState` 摘要交给 LLM 分析。
|
- [x] 每次 action 完成后,可把 `action`、`backend`、`ok`、`values`、`stderr`、`error_summary` 和当前 `AgentState` 摘要交给 LLM 分析。
|
||||||
- [x] LLM 输出结构化结果:是否异常、异常等级、可能原因、建议动作、是否需要人工确认。
|
- [x] LLM 输出结构化结果:是否异常、异常等级、可能原因、建议动作、是否需要人工确认。
|
||||||
- [x] LLM 分析结果会影响流程是否继续:`should_continue=false` 时自动暂停,并把建议输出给用户。
|
- [x] LLM 分析结果会影响流程是否继续:`should_continue=false` 时自动暂停,并把建议输出给用户。
|
||||||
- [x] 本地保留规则兜底:exit code、`verify-ip SUCCESS=false`、pending confirmation 等硬规则优先于 LLM。
|
- [x] 本地保留规则兜底:exit code、`verify-ip SUCCESS=false`、旧版 pending confirmation 等硬规则优先于 LLM。
|
||||||
- [x] 对 LLM 输入做脱敏,禁止把 `CLIENT_SECRET`、token、Authorization、完整日志原文发送给模型。
|
- [x] 对 LLM 输入做脱敏,禁止把 `CLIENT_SECRET`、token、Authorization、完整日志原文发送给模型。
|
||||||
- [x] 每个 action 都会执行审核;`--analyze-actions` 或 `llm action-analysis on` 只控制是否把详细审核结果写入 `events`。
|
- [x] 每个 action 都会执行审核;`--analyze-actions` 或 `llm action-analysis on` 只控制是否把详细审核结果写入 `events`。
|
||||||
- [x] 支持通过 `--llm-action-analysis-prompt-file`、环境变量或 chat 命令热加载自定义 action 审核提示词。
|
- [x] 支持通过 `--llm-action-analysis-prompt-file`、环境变量或 chat 命令热加载自定义 action 审核提示词。
|
||||||
|
|||||||
@ -70,11 +70,23 @@ cd pam-deploy-agent-linux-x86_64
|
|||||||
|
|
||||||
本次发布包对应的运行时行为也已同步到包内 `README.md`:
|
本次发布包对应的运行时行为也已同步到包内 `README.md`:
|
||||||
|
|
||||||
- 每个 action 完成后都会自动执行一次 LLM/规则审核。
|
- 每个 action 完成后都会自动执行一次 LLM/规则审核,只有审核通过才会把 action 记为 completed。
|
||||||
|
- action 审核输入不包含完整运行态 `state_summary`,只包含当前 action 的结构化结果和必要诊断日志,避免历史状态干扰大模型判断。
|
||||||
|
- `create-download-task` 支持可选 `PARENT_VERSION_NUMBER`,非空时会传给云下载接口的 `parentVersionNumber`;默认空值不发送,继承正在使用的版本规则。
|
||||||
|
- `poll-download-progress` 和 `poll-upgrade-progress` 是单次进度查询 action;Agent workflow 会按配置重复调用,每次返回后交给 LLM/规则判断是否完成并播报进度。
|
||||||
|
- `verify-ip` 会按 `VERIFY_INTERVAL_SEC` / `VERIFY_MAX_ATTEMPTS` 做应用健康检查重试,默认每 10 秒一次、最多 12 次,仍未通过才暂停。
|
||||||
- `--analyze-actions` 只控制是否把详细审核结果写入 `events`。
|
- `--analyze-actions` 只控制是否把详细审核结果写入 `events`。
|
||||||
- chat 支持执行中 `Ctrl+C` 中断后保存 checkpoint,再通过 `resume` 继续。
|
- action 失败或审核阻断后会保存 checkpoint 并暂停;修复外部环境后通过 `resume` 从当前 action 重试。
|
||||||
|
- 回滚不再属于主 workflow 自动分支;需要时使用 chat 内 `rollback [IP]` 或 CLI `rollback --checkpoint ...` 显式执行。
|
||||||
|
- chat 中非内置命令默认交给当前 LLM 普通对话,不会自动触发部署 workflow;普通对话优先流式展示,`<think>...</think>` 思考内容会被过滤;完整部署仍需 `analyze` / `run` 并人工确认。
|
||||||
|
- chat 支持 `ask <问题>`、`log analyze <路径>`、`action propose <需求>`、`action run ...`,可用于普通问答、日志尾部分析和确认后执行单 action;日志分析输出同样会过滤 `<think>` 内容。
|
||||||
|
- chat 支持执行中 `Ctrl+C` 中断后保存 checkpoint,再通过 `resume` 重试当前 action。
|
||||||
- chat 支持 `set KEY=VALUE` 和 `load params <路径>` 热更新当前运行任务参数。
|
- chat 支持 `set KEY=VALUE` 和 `load params <路径>` 热更新当前运行任务参数。
|
||||||
|
- 进度查询和健康检查重试参数可通过 `POLL_INTERVAL_SEC`、`DOWNLOAD_POLL_MAX_ATTEMPTS`、`UPGRADE_POLL_MAX_ATTEMPTS`、`VERIFY_INTERVAL_SEC`、`VERIFY_MAX_ATTEMPTS` 配置。
|
||||||
- 支持通过 `--llm-action-analysis-prompt-file` 或 chat 内 `llm config action_analysis_prompt_file=...` 自定义 action 审核提示词。
|
- 支持通过 `--llm-action-analysis-prompt-file` 或 chat 内 `llm config action_analysis_prompt_file=...` 自定义 action 审核提示词。
|
||||||
|
- chat 支持 `llm test [文本]` 测试当前 LLM client 是否正常加载。
|
||||||
|
- 默认运行日志写入 `logs/pam_deploy_agent.log`,按天切分并默认保留 14 个历史日切文件,可通过 `PAM_AGENT_LOG_FILE`、`PAM_AGENT_LOG_LEVEL` 和 `PAM_AGENT_LOG_RETENTION_DAYS` 调整。
|
||||||
|
- 日志会脱敏 token、secret、api_key、Authorization 等字段;chat 普通对话和日志分析不会记录原始 `<think>` 内容;checkpoint 仍保存完整运行参数,请放在受控目录。
|
||||||
|
|
||||||
## 包大小评估
|
## 包大小评估
|
||||||
|
|
||||||
|
|||||||
@ -34,9 +34,10 @@ pam-deploy-agent-linux-x86_64/
|
|||||||
./run.sh run-deploy --help
|
./run.sh run-deploy --help
|
||||||
```
|
```
|
||||||
|
|
||||||
发布包默认使用普通文本输入,避免 PyInstaller 环境下 `prompt_toolkit` 兼容性问题;输出仍会在可用时使用 `rich` 做更清晰的文本展示。
|
发布包默认会优先使用 `prompt_toolkit` 增强输入,支持更稳定的退格、历史记录和补全;如果增强输入初始化失败,会自动降级到普通 `input()`。输出仍会在可用时使用 `rich` 做更清晰的文本展示。
|
||||||
chat 内的失败回滚确认由 LangGraph interrupt 托管;执行停在确认点后,输入 `approve` 或 `reject [原因]` 会恢复同一个图线程继续处理。
|
action 失败或审核阻断后会保存 checkpoint 并暂停;修复外部环境后输入 `resume` 会从当前 action 重试。回滚不再属于主 workflow 自动分支,需要时在 chat 内输入 `rollback [IP]` 显式执行。
|
||||||
chat 会在执行前归一化并展示实际写入脚本配置的参数;`script_only` / `hybrid_node_mcp` 会先检查 `ZIP_FILE_PATH` 是否存在,避免脚本运行后才用默认路径失败。执行过程中每个 action 都会输出开始、完成或失败状态;每个 action 完成后还会自动进入一次 LLM/规则审核,并播报审核开始和审核结果。
|
chat 会在执行前归一化并展示实际写入脚本配置的参数;`script_only` / `hybrid_node_mcp` 会先检查 `ZIP_FILE_PATH` 是否存在,避免脚本运行后才用默认路径失败。执行过程中每个 action 都会输出开始、完成或失败状态;每个 action 完成后还会自动进入一次 LLM/规则审核,并播报审核开始和审核结果;审核输入只包含当前 action 的结构化结果和必要诊断日志,不会把完整运行态 `state_summary` 交给大模型;只有审核通过才会把 action 记为 completed。`create-download-task` 支持可选 `PARENT_VERSION_NUMBER`,非空时会作为云下载接口参数 `parentVersionNumber` 传入;默认空值不发送,表示继承正在使用的版本规则。
|
||||||
|
`poll-download-progress` 和 `poll-upgrade-progress` 每次只查询一次进度,Agent workflow 会按 `POLL_INTERVAL_SEC`、`DOWNLOAD_POLL_MAX_ATTEMPTS`、`UPGRADE_POLL_MAX_ATTEMPTS` 重复调用,并在每次返回后交给 LLM/规则判断是否完成、向 chat 播报进度。`verify-ip` 健康检查失败时,Agent workflow 会按 `VERIFY_INTERVAL_SEC` 重试,最多 `VERIFY_MAX_ATTEMPTS` 次;默认每 10 秒一次、最多 12 次,仍未通过才暂停。
|
||||||
|
|
||||||
## 交互式使用
|
## 交互式使用
|
||||||
|
|
||||||
@ -71,12 +72,18 @@ PAM> run
|
|||||||
PAM> status
|
PAM> status
|
||||||
PAM> params
|
PAM> params
|
||||||
PAM> events 5
|
PAM> events 5
|
||||||
|
PAM> ask 这个 agent 能做什么
|
||||||
|
PAM> log analyze logs/pam_deploy_agent.log 请帮我看最近异常 --tail 400
|
||||||
|
PAM> action propose 请单独执行 verify-ip 192.168.1.10
|
||||||
|
PAM> action run verify-ip ip=192.168.1.10
|
||||||
|
PAM> action run llm 请单独执行 get-online-ips
|
||||||
|
PAM> llm test
|
||||||
PAM> llm action-analysis on
|
PAM> llm action-analysis on
|
||||||
PAM> llm config action_analysis_prompt_file=prompts/action_review.txt
|
PAM> llm config action_analysis_prompt_file=prompts/action_review.txt
|
||||||
PAM> mcp config mcp_client.example.json
|
PAM> mcp config mcp_client.example.json
|
||||||
PAM> list checkpoints
|
PAM> list checkpoints
|
||||||
PAM> load checkpoint runtime/checkpoints/demo.json
|
PAM> load checkpoint runtime/checkpoints/demo.json
|
||||||
PAM> approve
|
PAM> rollback
|
||||||
PAM> resume
|
PAM> resume
|
||||||
PAM> exit
|
PAM> exit
|
||||||
```
|
```
|
||||||
@ -123,18 +130,23 @@ PAM> exit
|
|||||||
--confirm
|
--confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
处理失败后的回滚确认:
|
失败后从断点重试:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./run.sh confirm --checkpoint runtime/checkpoints/demo.json --decision approve --confirm
|
./run.sh resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
`confirm` 会通过 LangGraph interrupt resume 处理确认,并在确认后继续执行后续图节点;进程中断或需要再次续跑时,再使用 `resume`。
|
需要回滚失败 IP 时显式执行 rollback,未指定 `--ip` 时会使用当前失败 IP:
|
||||||
|
|
||||||
拒绝回滚:
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./run.sh confirm --checkpoint runtime/checkpoints/demo.json --decision reject --note "人工决定暂不回滚" --confirm
|
./run.sh rollback --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
./run.sh resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
```
|
||||||
|
|
||||||
|
也可以指定 IP 和停机策略:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./run.sh rollback --checkpoint runtime/checkpoints/demo.json --ip 192.168.1.10 --stop-first --note "人工决定回滚该 IP" --confirm
|
||||||
```
|
```
|
||||||
|
|
||||||
## LLM 配置
|
## LLM 配置
|
||||||
@@ -176,10 +188,27 @@ The LLM can also be hot-loaded inside chat:
 ```text
 PAM> llm config base_url=https://your-llm.example.com/v1 api_key=your-api-key model=your-model-name
 PAM> llm config action_analysis_prompt_file=prompts/action_review.txt
+PAM> llm test Please return a connectivity test result
 PAM> llm action-analysis on
 PAM> llm fallback
 ```
 
+`llm test [text]` makes one lightweight intent-recognition call with the current LLM client and prints the client type, intent, strategy, and confidence, which makes it easy to confirm that the real LLM or the rule fallback loaded correctly.
+
+## Logging
+
+By default the Agent writes its run log to `logs/pam_deploy_agent.log`. The log covers key flows such as chat/CLI input, LLM request and response summaries, action routing, script/MCP calls, LangGraph nodes, checkpoint saves, and pause/resume. The log is rotated automatically on the first write after 00:00 local time each day; rotated files are named like `pam_deploy_agent.log.YYYY-MM-DD`, and 14 rotated daily files are kept by default.
+
+The log location, level, and retention policy can be adjusted through environment variables:
+
+```bash
+export PAM_AGENT_LOG_FILE=logs/pam_deploy_agent.log
+export PAM_AGENT_LOG_LEVEL=INFO
+export PAM_AGENT_LOG_RETENTION_DAYS=14
+```
+
+Logs recursively redact fields such as `CLIENT_SECRET`, `MCP_CLIENT_SECRET`, token, Authorization, api_key, and password, and truncate long text. `PAM_AGENT_LOG_RETENTION_DAYS` is the number of rotated daily files to keep; when set to `0`, rotated files are not cleaned up automatically. The checkpoint still stores the full run parameters, so keep it in a controlled directory.
+
 ## Strategy notes
 
 - `fake`: uses the fake runner for everything and does not touch the real environment.
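The logging behaviour described in the hunk above (daily rotation, a retention count where `0` disables cleanup, recursive redaction, truncation of long text) can be approximated with the standard library. The sketch below is an assumption about one possible implementation, not the contents of the project's `logging_utils`; `redact`, `build_log_handler`, the `***` mask and the 200-character limit are hypothetical.

```python
import re
from logging.handlers import TimedRotatingFileHandler
from typing import Any, Mapping

# Key fragments treated as sensitive (taken from the field list in the text).
SENSITIVE_KEY = re.compile(r"secret|token|authorization|api_key|password", re.IGNORECASE)


def redact(value: Any, max_len: int = 200) -> Any:
    """Recursively mask sensitive keys and truncate long strings."""
    if isinstance(value, Mapping):
        return {
            key: "***" if SENSITIVE_KEY.search(str(key)) else redact(item, max_len)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item, max_len) for item in value]
    if isinstance(value, str) and len(value) > max_len:
        return value[:max_len] + "...(truncated)"
    return value


def build_log_handler(env: Mapping[str, str]) -> TimedRotatingFileHandler:
    """Create a midnight-rotating handler from the PAM_AGENT_LOG_* variables."""
    path = env.get("PAM_AGENT_LOG_FILE", "logs/pam_deploy_agent.log")
    retention = int(env.get("PAM_AGENT_LOG_RETENTION_DAYS", "14"))
    # backupCount=0 keeps every rotated file, matching the "0 disables cleanup" rule.
    # delay=True postpones opening the file until the first record is written.
    return TimedRotatingFileHandler(
        path, when="midnight", backupCount=retention, encoding="utf-8", delay=True
    )
```

With `when="midnight"`, `TimedRotatingFileHandler` appends a `YYYY-MM-DD` suffix to rotated files, which lines up with the `pam_deploy_agent.log.YYYY-MM-DD` naming mentioned above.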
@ -219,9 +248,16 @@ MCP token 获取方式与 HOME 一致,默认按 `client_credentials` POST 到
|
|||||||
## 注意事项
|
## 注意事项
|
||||||
|
|
||||||
- 执行真实 action 前请确认配置文件中的 `HOME_BASE_URL`、`CLIENT_ID`、`CLIENT_SECRET`、`AIRPORT_CODE`、`APP_NAME`、`MODULE_NAME`、`VERSION_NUMBER`、`ZIP_FILE_PATH`。
|
- 执行真实 action 前请确认配置文件中的 `HOME_BASE_URL`、`CLIENT_ID`、`CLIENT_SECRET`、`AIRPORT_CODE`、`APP_NAME`、`MODULE_NAME`、`VERSION_NUMBER`、`ZIP_FILE_PATH`。
|
||||||
- `chat` 中输入 `你好`、`hello` 这类问候不会触发 LLM/结构化分析;需要分析部署需求时请直接描述部署任务,或显式使用 `analyze <需求>`。
|
- `PARENT_VERSION_NUMBER` 是云下载可选参数;非空时会传给 `download-cloud` 的 `parentVersionNumber`,空值不会发送。
|
||||||
|
- `chat` 中非内置命令默认交给当前 LLM 做普通对话,不会自动触发部署 workflow;普通对话优先流式展示,`<think>...</think>` 思考内容会被过滤;需要分析部署需求时请显式使用 `analyze <需求>`,完整部署仍需 `run` 并逐步确认。
|
||||||
|
- `ask <问题>` 可显式普通对话;`log analyze <路径> [问题] [--tail N] [--max-bytes N]` 默认只读取日志尾部并脱敏后交给 LLM,日志分析输出同样会过滤 `<think>` 内容。
|
||||||
|
- `action propose <需求>` 只展示 LLM 解析出的单 action 计划;`action run <action> [ip=...] [KEY=VALUE...]` 和 `action run llm <需求>` 会在用户输入 `yes` 后才执行单 action。
|
||||||
- 每个 action 完成后都会自动执行一次 LLM/规则审核;`--analyze-actions` 和 `llm action-analysis on` 只控制是否把详细审核结果写入 `events`。
|
- 每个 action 完成后都会自动执行一次 LLM/规则审核;`--analyze-actions` 和 `llm action-analysis on` 只控制是否把详细审核结果写入 `events`。
|
||||||
- 如果审核建议停止、审核本身失败,或用户在执行中按下 `Ctrl+C`,流程都会保存 checkpoint 并进入暂停状态;后续可使用 `resume` 继续。
|
- action 审核输入不包含完整运行态 `state_summary`,只包含当前 action 的结构化结果和必要诊断日志。
|
||||||
|
- `poll-download-progress` 和 `poll-upgrade-progress` 是单次进度查询 action,未完成时不会进入下一个 action;最大查询次数和间隔可通过 `config.txt` 或 chat `set` 热更新。
|
||||||
|
- `verify-ip` 会按 `VERIFY_INTERVAL_SEC` / `VERIFY_MAX_ATTEMPTS` 做健康检查重试,默认每 10 秒一次、最多 12 次。
|
||||||
|
- `llm test [文本]` 可测试当前 LLM client 是否可用。
|
||||||
|
- 如果审核建议停止、审核本身失败,或用户在执行中按下 `Ctrl+C`,流程都会保存 checkpoint 并进入暂停状态;后续可使用 `resume` 重试当前 action。
|
||||||
- `set KEY=VALUE` 和 `load params <路径>` 会热更新当前运行任务的参数,并回写运行中的 `config.txt` 和 checkpoint。
|
- `set KEY=VALUE` 和 `load params <路径>` 会热更新当前运行任务的参数,并回写运行中的 `config.txt` 和 checkpoint。
|
||||||
- `checkpoint` 会保存完整运行参数,请放在受控目录。
|
- `checkpoint` 会保存完整运行参数,请放在受控目录。
|
||||||
- `hybrid_node_mcp`、`resume`、`confirm` 如果需要执行 MCP action,请同时传入 `--mcp-config`。
|
- `hybrid_node_mcp`、`resume`、`rollback` 如果需要执行 MCP action,请同时传入 `--mcp-config`。
|
||||||
|
|||||||
@ -44,6 +44,14 @@ else
|
|||||||
python -m pip install -e .
|
python -m pip install -e .
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
PYINSTALLER_EXTRA_ARGS=()
|
||||||
|
if python -c "import importlib.util; raise SystemExit(0 if importlib.util.find_spec('prompt_toolkit') else 1)"; then
|
||||||
|
PYINSTALLER_EXTRA_ARGS+=(--collect-submodules prompt_toolkit --collect-data prompt_toolkit)
|
||||||
|
fi
|
||||||
|
if python -c "import importlib.util; raise SystemExit(0 if importlib.util.find_spec('rich') else 1)"; then
|
||||||
|
PYINSTALLER_EXTRA_ARGS+=(--collect-submodules rich)
|
||||||
|
fi
|
||||||
|
|
||||||
echo "==> 使用 PyInstaller 生成自带 Python 运行时的可执行目录"
|
echo "==> 使用 PyInstaller 生成自带 Python 运行时的可执行目录"
|
||||||
python -m PyInstaller \
|
python -m PyInstaller \
|
||||||
--clean \
|
--clean \
|
||||||
@ -57,6 +65,7 @@ python -m PyInstaller \
|
|||||||
--collect-submodules pam_deploy_graph \
|
--collect-submodules pam_deploy_graph \
|
||||||
--collect-submodules langgraph \
|
--collect-submodules langgraph \
|
||||||
--hidden-import pam_deploy_graph.cli \
|
--hidden-import pam_deploy_graph.cli \
|
||||||
|
"${PYINSTALLER_EXTRA_ARGS[@]}" \
|
||||||
packaging/pyinstaller_entry.py
|
packaging/pyinstaller_entry.py
|
||||||
|
|
||||||
echo "==> 组装发布目录"
|
echo "==> 组装发布目录"
|
||||||
@ -96,7 +105,8 @@ PAM 部署 Agent 解压即用包
|
|||||||
run-global 执行全局阶段:token、版本、上传、发布、Node URL、下载任务。
|
run-global 执行全局阶段:token、版本、上传、发布、Node URL、下载任务。
|
||||||
run-deploy 执行完整部署流程:全局阶段 + 逐 IP 阶段。
|
run-deploy 执行完整部署流程:全局阶段 + 逐 IP 阶段。
|
||||||
resume 从 checkpoint 继续执行。
|
resume 从 checkpoint 继续执行。
|
||||||
confirm 处理待人工确认事项,目前用于失败 IP 回滚确认。
|
rollback 显式回滚失败 IP;不传 --ip 时使用当前失败 IP。
|
||||||
|
confirm 兼容旧 checkpoint 的人工确认命令,新流程通常不需要使用。
|
||||||
|
|
||||||
通用参数:
|
通用参数:
|
||||||
--config <路径>
|
--config <路径>
|
||||||
@ -110,7 +120,7 @@ PAM 部署 Agent 解压即用包
|
|||||||
hybrid_node_mcp PAM_HOME 走脚本,PAM_NODE 走 MCP。
|
hybrid_node_mcp PAM_HOME 走脚本,PAM_NODE 走 MCP。
|
||||||
|
|
||||||
--checkpoint <路径>
|
--checkpoint <路径>
|
||||||
checkpoint JSON 路径。用于断点续跑和人工确认恢复。
|
checkpoint JSON 路径。用于断点续跑和显式回滚。
|
||||||
示例:runtime/checkpoints/demo.json
|
示例:runtime/checkpoints/demo.json
|
||||||
|
|
||||||
--target-ip <IP>
|
--target-ip <IP>
|
||||||
@ -119,7 +129,7 @@ PAM 部署 Agent 解压即用包
|
|||||||
--mcp-config <路径>
|
--mcp-config <路径>
|
||||||
MCP client JSON 配置文件。通常配置 server_url 和独立鉴权信息;
|
MCP client JSON 配置文件。通常配置 server_url 和独立鉴权信息;
|
||||||
Agent 会从 server list_tools 自动发现 tools。hybrid_node_mcp 策略、
|
Agent 会从 server list_tools 自动发现 tools。hybrid_node_mcp 策略、
|
||||||
resume 或 confirm 需要执行 MCP action 时使用。
|
resume 或 rollback 需要执行 MCP action 时使用。
|
||||||
示例:mcp_client.example.json
|
示例:mcp_client.example.json
|
||||||
|
|
||||||
--confirm
|
--confirm
|
||||||
@ -127,8 +137,9 @@ PAM 部署 Agent 解压即用包
|
|||||||
chat 模式会在会话中要求输入 run,并分别确认参数、目标范围和最终执行。
|
chat 模式会在会话中要求输入 run,并分别确认参数、目标范围和最终执行。
|
||||||
|
|
||||||
--analyze-actions
|
--analyze-actions
|
||||||
每个 action 完成后追加 LLM/规则诊断建议。诊断只作为辅助建议,
|
每个 action 完成后的 LLM/规则审核默认都会执行;该参数只控制
|
||||||
不会自动决定继续、回滚或修改参数。
|
是否把详细审核结果写入 events。审核建议停止时流程会暂停,
|
||||||
|
resume 会重试当前 action。
|
||||||
|
|
||||||
LLM 参数:
|
LLM 参数:
|
||||||
--llm-base-url <URL>
|
--llm-base-url <URL>
|
||||||
@ -140,10 +151,25 @@ LLM 参数:
|
|||||||
--llm-model <模型名>
|
--llm-model <模型名>
|
||||||
LLM 模型名称。也可通过环境变量 PAM_LLM_MODEL 提供。
|
LLM 模型名称。也可通过环境变量 PAM_LLM_MODEL 提供。
|
||||||
|
|
||||||
|
--llm-action-analysis-prompt-file <路径>
|
||||||
|
自定义 action 审核提示词文件。打包内置基线:
|
||||||
|
prompts/action_review.txt
|
||||||
|
|
||||||
LLM 环境变量:
|
LLM 环境变量:
|
||||||
PAM_LLM_BASE_URL
|
PAM_LLM_BASE_URL
|
||||||
PAM_LLM_API_KEY
|
PAM_LLM_API_KEY
|
||||||
PAM_LLM_MODEL
|
PAM_LLM_MODEL
|
||||||
|
PAM_LLM_ACTION_ANALYSIS_PROMPT_FILE
|
||||||
|
|
||||||
|
日志环境变量:
|
||||||
|
PAM_AGENT_LOG_FILE
|
||||||
|
运行日志路径,默认 logs/pam_deploy_agent.log。
|
||||||
|
|
||||||
|
PAM_AGENT_LOG_LEVEL
|
||||||
|
日志级别,默认 INFO。排查 LLM/MCP 时可临时设为 DEBUG。
|
||||||
|
|
||||||
|
PAM_AGENT_LOG_RETENTION_DAYS
|
||||||
|
历史日切日志保留数量,默认 14。设为 0 时不自动清理历史切分文件。
|
||||||
|
|
||||||
示例:
|
示例:
|
||||||
./run.sh chat --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json
|
./run.sh chat --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json
|
||||||
@ -154,8 +180,10 @@ LLM 环境变量:
|
|||||||
|
|
||||||
./run.sh run-deploy --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json --confirm
|
./run.sh run-deploy --config doc_scripts/config.txt.example --strategy fake --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
|
||||||
./run.sh confirm --checkpoint runtime/checkpoints/demo.json --decision approve --confirm
|
# 失败或审核阻断暂停后,修复外部环境并从当前 action 重试:
|
||||||
# 如果进程中断或需要再次续跑:
|
./run.sh resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
# 需要回滚失败 IP 时显式执行:
|
||||||
|
./run.sh rollback --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
./run.sh resume --checkpoint runtime/checkpoints/demo.json --confirm
|
./run.sh resume --checkpoint runtime/checkpoints/demo.json --confirm
|
||||||
|
|
||||||
查看子命令原始参数:
|
查看子命令原始参数:
|
||||||
@ -167,11 +195,14 @@ LLM 环境变量:
|
|||||||
2. doc_scripts 只包含运行必需文件:deploy.sh、config.txt.example、PAM_AUTO_DEPLY_SKILL.md。
|
2. doc_scripts 只包含运行必需文件:deploy.sh、config.txt.example、PAM_AUTO_DEPLY_SKILL.md。
|
||||||
3. prompts/action_review.txt 是当前默认 action 审核提示词基线,可复制后自行修改。
|
3. prompts/action_review.txt 是当前默认 action 审核提示词基线,可复制后自行修改。
|
||||||
4. mcp_client.example.json 是 MCP server URL + 独立鉴权配置示例,需要按真实 MCP server 修改。
|
4. mcp_client.example.json 是 MCP server URL + 独立鉴权配置示例,需要按真实 MCP server 修改。
|
||||||
5. confirm 会通过 LangGraph interrupt resume 处理确认,并继续后续图节点;进程中断时再使用 resume。
|
5. action 失败或审核阻断后会暂停;修复后用 resume 从当前 action 重试,需要回滚时用 rollback 显式执行。
|
||||||
6. chat 会在执行前归一化并展示实际写入脚本配置的参数;script_only / hybrid_node_mcp 会先检查 ZIP_FILE_PATH 是否存在。
|
6. chat 会在执行前归一化并展示实际写入脚本配置的参数;script_only / hybrid_node_mcp 会先检查 ZIP_FILE_PATH 是否存在。
|
||||||
7. chat 执行过程中会播报每个 action 的开始、完成或失败;普通问候不会触发 LLM/结构化分析。
|
7. PARENT_VERSION_NUMBER 是云下载可选参数;空值不发送,非空时传给 parentVersionNumber。
|
||||||
8. chat 内可使用 params、events、list checkpoints、load checkpoint、load params、llm config、mcp config 等命令。
|
8. chat 执行过程中会播报每个 action 的开始、完成或失败;非内置输入默认交给 LLM 普通对话,不会自动触发部署 workflow。
|
||||||
9. checkpoint 会保存完整运行参数,请放在受控目录。
|
9. chat 普通对话优先流式展示;模型返回的 <think>...</think> 思考内容会被过滤,不展示也不写入日志。
|
||||||
|
10. chat 内可使用 ask、log analyze、action propose、action run、params、events、rollback、list checkpoints、load checkpoint、load params、llm config、llm test、mcp config 等命令。
|
||||||
|
11. 日志默认写入 logs/pam_deploy_agent.log,按天切分并默认保留 14 个历史日切文件;日志会脱敏 token、secret、api_key、Authorization 等字段。
|
||||||
|
12. checkpoint 会保存完整运行参数,请放在受控目录。
|
||||||
HELP_TEXT
|
HELP_TEXT
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@ -2,9 +2,14 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
|
||||||
from .constants import ALLOWED_ACTIONS, HOME_ACTIONS, NODE_ACTIONS
|
from .constants import ALLOWED_ACTIONS, HOME_ACTIONS, NODE_ACTIONS
|
||||||
|
from .logging_utils import json_for_log
|
||||||
from .models import AgentState, BackendName, ExecutionStrategy, ActionResult
|
from .models import AgentState, BackendName, ExecutionStrategy, ActionResult
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
def build_action_backends(strategy: ExecutionStrategy) -> dict[str, BackendName]:
|
def build_action_backends(strategy: ExecutionStrategy) -> dict[str, BackendName]:
|
||||||
"""根据执行策略生成每个 action 对应的后端类型。"""
|
"""根据执行策略生成每个 action 对应的后端类型。"""
|
||||||
@ -33,6 +38,13 @@ class ActionRouter:
|
|||||||
backend = state.action_backends.get(action)
|
backend = state.action_backends.get(action)
|
||||||
if not backend:
|
if not backend:
|
||||||
raise ValueError(f"action 未配置路由: {action}")
|
raise ValueError(f"action 未配置路由: {action}")
|
||||||
|
logger.info(
|
||||||
|
"ActionRouter 路由 action run_id=%s action=%s backend=%s kwargs=%s",
|
||||||
|
state.run_id,
|
||||||
|
action,
|
||||||
|
backend,
|
||||||
|
json_for_log(kwargs),
|
||||||
|
)
|
||||||
if backend == "script":
|
if backend == "script":
|
||||||
return self.script_runner.run(
|
return self.script_runner.run(
|
||||||
action,
|
action,
|
||||||
@@ -48,6 +60,13 @@ class ActionRouter:
         mcp_kwargs = dict(kwargs)
         hash_code = mcp_kwargs.pop("hash_code", None) or state.hash_code
         node_url = mcp_kwargs.pop("node_url", None) or state.node_url
+        logger.info(
+            "ActionRouter calling MCP action run_id=%s action=%s hash_code_present=%s node_url_present=%s",
+            state.run_id,
+            action,
+            bool(hash_code),
+            bool(node_url),
+        )
         return self.mcp_runner.run(
             action,
             params=state.params,
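The MCP logging added above records only whether `hash_code` and `node_url` are present, never their values. A minimal sketch of that routing-plus-logging pattern follows. It assumes a hypothetical `route_action` helper and plain callables standing in for the real script and MCP runners, whose interfaces are not fully visible in this diff.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("router_sketch")

# Hypothetical stand-in for the real runners: each backend is a plain callable.
Backend = Callable[[str, dict[str, Any]], dict[str, Any]]


def route_action(
    action: str,
    action_backends: dict[str, str],
    runners: dict[str, Backend],
    **kwargs: Any,
) -> dict[str, Any]:
    """Dispatch an action to its configured backend, logging presence flags only."""
    backend = action_backends.get(action)
    if not backend:
        raise ValueError(f"no route configured for action: {action}")
    # Log whether sensitive values exist, not the values themselves.
    logger.info(
        "routing action=%s backend=%s hash_code_present=%s node_url_present=%s",
        action,
        backend,
        bool(kwargs.get("hash_code")),
        bool(kwargs.get("node_url")),
    )
    return runners[backend](action, kwargs)


def fake_script(action: str, kwargs: dict[str, Any]) -> dict[str, Any]:
    """Toy script backend used only for this sketch."""
    return {"ACTION": action, "BACKEND": "script"}


def fake_mcp(action: str, kwargs: dict[str, Any]) -> dict[str, Any]:
    """Toy MCP backend used only for this sketch."""
    return {"ACTION": action, "BACKEND": "mcp"}


RUNNERS = {"script": fake_script, "mcp": fake_mcp}
# The action-to-backend assignment here is illustrative, not the project's mapping.
BACKENDS = {"upgrade-ip": "mcp", "start-ip": "script"}
```

Calling `route_action("upgrade-ip", BACKENDS, RUNNERS, hash_code="abc")` dispatches to the MCP stand-in and logs `hash_code_present=True` without writing the hash itself to the log.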
File diff suppressed because it is too large
@@ -4,6 +4,7 @@ from __future__ import annotations
 
 import argparse
 import json
+import logging
 from dataclasses import asdict
 
 from .agent import PamDeployAgent
@@ -11,9 +12,12 @@ from .checkpoint_store import load_agent_state, redact_mapping
 from .interactive import run_interactive_chat
 from .langgraph_runtime import LangGraphDeploymentRuntime, LangGraphRunResult
 from .llm import build_llm_client
+from .logging_utils import configure_logging, json_for_log
 from .mcp_factory import build_mcp_runner_from_config
 from .params_loader import load_params_file
 
+logger = logging.getLogger(__name__)
+
 
 def add_llm_args(parser: argparse.ArgumentParser) -> None:
     """Add real-LLM configuration arguments to a subcommand."""
@@ -124,8 +128,23 @@ def main() -> None:
     add_mcp_args(confirm)
     add_action_analysis_arg(confirm)
 
+    rollback = sub.add_parser("rollback")
+    rollback.add_argument("--checkpoint", required=True)
+    rollback.add_argument("--ip", help="IP to roll back; defaults to the current failed IP")
+    rollback.add_argument("--stop-first", dest="stop_first", action="store_true", default=None, help="Stop before rolling back")
+    rollback.add_argument("--no-stop-first", dest="stop_first", action="store_false", default=None, help="Do not stop before rolling back")
+    rollback.add_argument("--note", default="")
+    rollback.add_argument("--confirm", action="store_true")
+    add_llm_args(rollback)
+    add_mcp_args(rollback)
+    add_action_analysis_arg(rollback)
+
     args = parser.parse_args()
+    log_path = configure_logging()
+    logger.info("CLI started command=%s args=%s log_path=%s", args.command, json_for_log(vars(args)), log_path)
     params = load_params_file(args.config) if getattr(args, "config", None) else {}
+    if getattr(args, "config", None):
+        logger.info("Parameter file loaded command=%s config=%s params=%s", args.command, args.config, json_for_log(params))
     llm_client = None
     if args.command != "preview":
         llm_client = build_llm_client(
@@ -136,7 +155,9 @@ def main() -> None:
         )
     mcp_runner = None
     if getattr(args, "mcp_config", None):
+        logger.info("Loading MCP config path=%s", args.mcp_config)
         mcp_runner = build_mcp_runner_from_config(args.mcp_config)
+        logger.info("MCP config loaded path=%s runner=%s", args.mcp_config, type(mcp_runner).__name__)
     agent = PamDeployAgent(
         llm_client=llm_client,
         mcp_runner=mcp_runner,
@@ -144,12 +165,15 @@ def main() -> None:
     )
 
     if args.command == "analyze":
+        logger.info("Starting analyze text_len=%s", len(args.text))
         result = agent.analyze_request(args.text, params)
         payload = redact_mapping({key: asdict(value) for key, value in result.items()})
+        logger.info("analyze finished result=%s", json_for_log(payload))
         print(json.dumps(payload, ensure_ascii=False, indent=2))
         return
 
     if args.command == "chat":
+        logger.info("Entering chat mode strategy=%s checkpoint=%s target_ips=%s", args.strategy, args.checkpoint, args.target_ip)
         run_interactive_chat(
             agent=agent,
             params=params,
@@ -160,11 +184,13 @@ def main() -> None:
         return
 
     if args.command == "preview":
+        logger.info("Running preview strategy=%s", args.strategy)
         print(agent.preview(params, args.strategy))
         return
 
     require_confirm(args)
     if args.command == "run-global":
+        logger.info("Starting run-global strategy=%s checkpoint=%s", args.strategy, args.checkpoint)
         state = agent.create_state(
             params=params,
             execution_strategy=args.strategy,
@@ -177,6 +203,7 @@ def main() -> None:
         return
 
     if args.command == "resume":
+        logger.info("Starting resume checkpoint=%s", args.checkpoint)
         state = load_agent_state(args.checkpoint)
         state.checkpoint_path = state.checkpoint_path or args.checkpoint
         if state.paused:
@@ -186,17 +213,29 @@ def main() -> None:
         return
 
     if args.command == "confirm":
+        logger.info("Starting confirm checkpoint=%s decision=%s note_len=%s", args.checkpoint, args.decision, len(args.note))
         state = load_agent_state(args.checkpoint)
         state.checkpoint_path = state.checkpoint_path or args.checkpoint
-        runtime = LangGraphDeploymentRuntime(agent=agent, flow="deploy")
-        first = runtime.start(state)
-        if first.interrupted:
-            result = runtime.resume(approved=args.decision == "approve", note=args.note)
-            print_graph_result(agent, result)
-            return
-        print_graph_result(agent, first)
+        if not state.pending_confirmation:
+            raise SystemExit("The current checkpoint has nothing pending confirmation; in the new flow, use resume to retry or rollback to roll back explicitly.")
+        state = agent.confirm_pending(state, approved=args.decision == "approve", operator_note=args.note)
+        print(agent.render_report(state))
+        print_pause_payload(agent, state)
         return
 
+    if args.command == "rollback":
+        logger.info("Starting rollback checkpoint=%s ip=%s stop_first=%s note_len=%s", args.checkpoint, args.ip, args.stop_first, len(args.note))
+        state = load_agent_state(args.checkpoint)
+        state.checkpoint_path = state.checkpoint_path or args.checkpoint
+        ip = args.ip or _find_current_failed_ip(state)
+        if not ip:
+            raise SystemExit("No current failed IP found; please pass --ip.")
+        state = agent.rollback_ip(state, ip, stop_first=args.stop_first, operator_note=args.note)
+        print(agent.render_report(state))
+        print_pause_payload(agent, state)
+        return
+
+    logger.info("Starting run-deploy strategy=%s checkpoint=%s target_ips=%s", args.strategy, args.checkpoint, args.target_ip)
     state = agent.create_state(
         params=params,
         execution_strategy=args.strategy,
@@ -207,5 +246,16 @@ def main() -> None:
     print_graph_result(agent, result)
 
 
+def _find_current_failed_ip(state) -> str:
+    """Find a failed IP in the checkpoint state that is suitable for an explicit rollback."""
+    context_ip = str((state.review_context or {}).get("ip", ""))
+    if context_ip and context_ip in state.ip_states:
+        return context_ip
+    for ip, ip_state in state.ip_states.items():
+        if ip_state.get("status") == "FAILED":
+            return ip
+    return ""
+
+
 if __name__ == "__main__":
     main()
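The `rollback` subcommand registers `--stop-first` and `--no-stop-first` against the same `dest` with `default=None`, which gives a three-state flag: explicitly on, explicitly off, or not specified. A minimal standalone sketch of that argparse pattern is below. The `build_parser` and `describe` names are hypothetical, and how the agent interprets `None` is an assumption, since `rollback_ip` is not shown in this diff.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a three-state --stop-first / --no-stop-first flag."""
    parser = argparse.ArgumentParser(prog="rollback-sketch")
    parser.add_argument("--ip", default="")
    # Both flags write to the same destination; default=None means "not specified".
    parser.add_argument("--stop-first", dest="stop_first", action="store_true", default=None)
    parser.add_argument("--no-stop-first", dest="stop_first", action="store_false", default=None)
    return parser


def describe(stop_first: bool | None) -> str:
    """Turn the three-state value into a readable decision (assumed semantics)."""
    if stop_first is None:
        return "use the default behaviour"
    return "stop before rollback" if stop_first else "roll back without stopping"
```

With this parser, `build_parser().parse_args([]).stop_first` is `None`, while passing either flag yields `True` or `False`, so the caller can tell an operator's explicit choice apart from an omitted one.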
@@ -17,6 +17,12 @@ CONFIG_KEYS = (
     "ACTION_TYPE",
     "TIMEOUT",
     "LOG_NAME",
+    "PARENT_VERSION_NUMBER",
+    "POLL_INTERVAL_SEC",
+    "DOWNLOAD_POLL_MAX_ATTEMPTS",
+    "UPGRADE_POLL_MAX_ATTEMPTS",
+    "VERIFY_INTERVAL_SEC",
+    "VERIFY_MAX_ATTEMPTS",
 )
 
 
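The new configuration keys pair an interval (`POLL_INTERVAL_SEC`, `VERIFY_INTERVAL_SEC`) with an attempt cap (`*_MAX_ATTEMPTS`). The polling code that consumes them is not part of this diff, so the following is only a sketch of how such a pair is commonly used. The `poll_until_done` helper is hypothetical, and the `"RUNNING"` step value in the usage note is an assumption; `STEP == "DONE"` mirrors the fake runner's response.

```python
import time
from typing import Any, Callable


def poll_until_done(
    fetch: Callable[[], dict[str, Any]],
    *,
    interval_sec: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch() until STEP is DONE or the attempt budget is exhausted."""
    last: dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        last = fetch()
        if last.get("STEP") == "DONE":
            return last
        # Only sleep when another attempt will follow.
        if attempt < max_attempts:
            sleep(interval_sec)
    raise TimeoutError(f"not done after {max_attempts} attempts; last={last}")
```

Injecting `sleep` keeps the helper testable: a test can pass `slept.append` and feed responses such as `{"STEP": "RUNNING"}` followed by `{"STEP": "DONE"}` without waiting in real time.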
@@ -64,6 +64,12 @@ DEFAULT_PARAMS = {
     "ACTION_TYPE": "FULL",
     "TIMEOUT": 120,
     "LOG_NAME": "app.log",
+    "PARENT_VERSION_NUMBER": "",
+    "POLL_INTERVAL_SEC": 2,
+    "DOWNLOAD_POLL_MAX_ATTEMPTS": 60,
+    "UPGRADE_POLL_MAX_ATTEMPTS": 600,
+    "VERIFY_INTERVAL_SEC": 10,
+    "VERIFY_MAX_ATTEMPTS": 12,
 }
 
 # Fields that must be redacted in logs, reports, and LLM input.
@@ -73,6 +79,12 @@ SENSITIVE_KEYS = {
     "MCP_TOKEN",
     "TOKEN",
     "Authorization",
+    "authorization",
     "access_token",
     "ACCESS_TOKEN",
+    "api_key",
+    "API_KEY",
+    "PAM_LLM_API_KEY",
+    "password",
+    "PASSWORD",
 }
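`SENSITIVE_KEYS` lists each key in several casings (`Authorization`/`authorization`, `api_key`/`API_KEY`). One alternative is to compare keys case-insensitively while walking nested structures. The sketch below illustrates that idea; it is not the project's `redact_mapping` or `json_for_log`, whose implementations are not shown here, and the `redact` helper and `SENSITIVE_KEYS_LOWER` set are hypothetical names.

```python
from typing import Any

# Subset of the keys listed in the diff, stored lower-cased for comparison.
SENSITIVE_KEYS_LOWER = {
    "token",
    "mcp_token",
    "authorization",
    "access_token",
    "api_key",
    "pam_llm_api_key",
    "password",
}


def redact(value: Any, mask: str = "***") -> Any:
    """Return a copy in which values under sensitive keys are replaced by mask."""
    if isinstance(value, dict):
        return {
            key: mask if str(key).lower() in SENSITIVE_KEYS_LOWER else redact(item, mask)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        # Sequences are rebuilt as lists so nested mappings are redacted too.
        return [redact(item, mask) for item in value]
    return value
```

Because the function builds new containers instead of mutating its input, the original parameters remain usable for the actual request while only the logged copy is masked.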
@@ -43,6 +43,14 @@ class FakeActionRunner:
             return {"ACTION": action, "NODE_URL": "https://fake-node.local"}
         if action == "get-online-ips":
             return {"ACTION": action, "COUNT": "2", "IP": ["192.168.1.10", "192.168.1.11"]}
+        if action == "poll-download-progress":
+            return {
+                "ACTION": action,
+                "STEP": "DONE",
+                "RATE_OF_PROGRESS": "100",
+                "MSG": "success",
+                "MESSAGE": "success",
+            }
         if action == "upgrade-ip":
             return {"ACTION": action, "IP": kwargs.get("ip", ""), "RESULT": "TASK_CREATED"}
         if action == "poll-upgrade-progress":
@@ -51,6 +59,7 @@ class FakeActionRunner:
                 "IP": kwargs.get("ip", ""),
                 "STEP": "DONE",
                 "RATE_OF_PROGRESS": "100",
+                "MSG": "success",
                 "MESSAGE": "success",
             }
         if action == "start-ip":
@@ -13,7 +13,7 @@ def build_langgraph(agent: PamDeployAgent | None = None, flow: GraphFlow = "depl
 
     The input state may pass `params` directly, in which case the graph first calls `create_state`; CLI/chat
     use `LangGraphDeploymentRuntime` by default, a runtime that takes an `AgentState` directly
-    and supports interrupt/checkpointer.
+    and relies on the business checkpoint to resume an interrupted run.
     """
     try:
         from langgraph.graph import END, START, StateGraph
File diff suppressed because it is too large
@@ -1,15 +1,18 @@
-"""LangGraph interrupt runner for the chat manual-confirmation points."""
+"""Action-level LangGraph runner for the PAM deployment Agent."""
 
 from __future__ import annotations
 
+import logging
 from dataclasses import dataclass, field
 from typing import Any, Literal
 from uuid import uuid4
 
 from .agent import PamDeployAgent
+from .logging_utils import json_for_log
 from .models import AgentState
 
 GraphFlow = Literal["global", "deploy"]
+logger = logging.getLogger(__name__)
 
 
 @dataclass(slots=True)
@@ -24,7 +27,7 @@ class LangGraphRunResult:
 
 
 class LangGraphDeploymentRuntime:
-    """Schedule deployment actions with LangGraph nodes and host the manual-confirmation interrupt."""
+    """Schedule deployment actions with LangGraph nodes."""
 
     def __init__(
         self,
@@ -39,39 +42,56 @@ class LangGraphDeploymentRuntime:
         self.flow = flow
         self._waiting_confirmation = False
         self._graph = build_deployment_graph(agent=self.agent, flow=self.flow)
+        logger.info(
+            "LangGraph runtime initialized thread_id=%s flow=%s agent=%s",
+            self.thread_id,
+            self.flow,
+            type(self.agent).__name__,
+        )
 
     @property
     def waiting_confirmation(self) -> bool:
-        """Return whether the current LangGraph session is stopped at an interrupt confirmation point."""
+        """Return whether the current LangGraph session is stopped at a legacy interrupt confirmation point."""
         return self._waiting_confirmation
 
     def start(self, state: AgentState) -> LangGraphRunResult:
-        """Run from the given AgentState until it finishes or hits a manual confirmation point."""
+        """Run from the given AgentState until it finishes or the business state pauses."""
         self._waiting_confirmation = False
+        logger.info(
+            "LangGraph start run_id=%s thread_id=%s flow=%s paused=%s pending=%s",
+            state.run_id,
+            self.thread_id,
+            self.flow,
+            state.paused,
+            state.pending_confirmation,
+        )
         return self._consume(self._graph.stream({"agent_state": state}, self._config()))
 
     def resume(self, *, approved: bool, note: str = "") -> LangGraphRunResult:
-        """Hand the manual confirmation result back to LangGraph and continue."""
+        """Resume a legacy LangGraph interrupt confirmation for compatibility; the new flow normally does not use this."""
         try:
             from langgraph.types import Command
         except ImportError as exc:  # pragma: no cover - the caller degrades when the dependency is missing
             raise RuntimeError("langgraph is not installed; cannot resume the interrupt.") from exc
 
         decision = {"approved": approved, "note": note}
+        logger.info("LangGraph resume thread_id=%s decision=%s note_len=%s", self.thread_id, approved, len(note))
         return self._consume(self._graph.stream(Command(resume=decision), self._config()))
 
     def _config(self) -> dict[str, Any]:
         """Build the thread config used by the LangGraph checkpointer."""
-        return {"configurable": {"thread_id": self.thread_id}}
+        return {"configurable": {"thread_id": self.thread_id}, "recursion_limit": 10000}
 
     def _consume(self, chunks: Any) -> LangGraphRunResult:
-        """Consume the LangGraph stream output and extract the state, report, and interrupt request."""
+        """Consume the LangGraph stream output and extract the state, report, and legacy interrupt request."""
         result = LangGraphRunResult()
         for chunk in chunks:
             result.chunks.append(chunk)
+            logger.info("LangGraph chunk=%s", json_for_log(chunk, max_text_len=1600))
             if "__interrupt__" in chunk:
                 result.interrupted = True
                 result.confirmation = _extract_interrupt_value(chunk["__interrupt__"])
+                logger.info("LangGraph interrupt thread_id=%s confirmation=%s", self.thread_id, json_for_log(result.confirmation))
                 continue
 
             for value in chunk.values():
@@ -83,20 +103,30 @@ class LangGraphDeploymentRuntime:
                     result.report = value["report"]
 
         self._waiting_confirmation = result.interrupted
+        logger.info(
+            "LangGraph consume finished thread_id=%s interrupted=%s waiting=%s state_run_id=%s report_len=%s",
+            self.thread_id,
+            result.interrupted,
+            self._waiting_confirmation,
+            result.state.run_id if result.state else "",
+            len(result.report),
+        )
         return result
 
 
 def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy"):
     """Build the action-level LangGraph deployment graph."""
+    logger.info("Building LangGraph deployment graph flow=%s", flow)
     try:
         from langgraph.checkpoint.memory import InMemorySaver
         from langgraph.graph import END, START, StateGraph
-        from langgraph.types import interrupt
     except ImportError as exc:  # pragma: no cover - the caller degrades when the dependency is missing
         raise RuntimeError("langgraph is not installed; cannot enable the deployment graph.") from exc
 
     def entry_node(state: dict[str, Any]) -> dict[str, Any]:
         """Keep an entry node so that existing or restored state is routed uniformly."""
+        agent_state = state["agent_state"]
+        logger.info("LangGraph entry_node run_id=%s pending=%s paused=%s", agent_state.run_id, agent_state.pending_confirmation, agent_state.paused)
         return {"agent_state": state["agent_state"]}
 
     def global_action_node(state: dict[str, Any]) -> dict[str, Any]:
@@ -104,6 +134,7 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
         agent_state = state["agent_state"]
         action = agent.next_global_action(agent_state)
         if action:
+            logger.info("LangGraph global_action_node run_id=%s action=%s", agent_state.run_id, action)
             agent.run_global_action(agent_state, action)
         return {"agent_state": agent_state}
 
@@ -112,8 +143,10 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
         agent_state = state["agent_state"]
         work = agent.next_ip_action(agent_state)
         if work is None:
+            logger.info("LangGraph prepare_ip_node no pending IP action run_id=%s", agent_state.run_id)
             return {"agent_state": agent_state, "current_ip": "", "current_ip_action": ""}
         ip, action = work
+        logger.info("LangGraph prepare_ip_node run_id=%s ip=%s action=%s", agent_state.run_id, ip, action)
         return {"agent_state": agent_state, "current_ip": ip, "current_ip_action": action}
 
     def ip_action_node(state: dict[str, Any]) -> dict[str, Any]:
@@ -122,56 +155,56 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
         ip = str(state.get("current_ip", ""))
         action = str(state.get("current_ip_action", ""))
         if ip and action:
+            logger.info("LangGraph ip_action_node run_id=%s ip=%s action=%s", agent_state.run_id, ip, action)
             agent.run_ip_action(agent_state, ip, action)
         return {"agent_state": agent_state, "current_ip": "", "current_ip_action": ""}
 
-    def confirm_node(state: dict[str, Any]) -> dict[str, Any]:
-        """Hand the confirmation request to a LangGraph interrupt and run the confirmation action after resuming."""
-        agent_state = state["agent_state"]
-        request = agent.build_confirmation_request(agent_state)
-        decision = interrupt(request)
-        approved, note = _parse_confirmation_decision(decision)
-        agent_state = agent.confirm_pending(
-            agent_state,
-            approved=approved,
-            operator_note=note,
-        )
-        return {"agent_state": agent_state}
-
     def report_node(state: dict[str, Any]) -> dict[str, Any]:
         """Render the report for the current state."""
+        agent_state = state["agent_state"]
+        logger.info("LangGraph report_node run_id=%s pending=%s paused=%s", agent_state.run_id, agent_state.pending_confirmation, agent_state.paused)
         return {
             "agent_state": state["agent_state"],
             "report": agent.render_report(state["agent_state"]),
         }
 
     def route_entry(state: dict[str, Any]) -> str:
-        """From the entry, decide whether to go to the global, IP, confirm, or report node."""
+        """From the entry, decide whether to go to the global, IP, or report node."""
         agent_state = state["agent_state"]
         if agent_state.pending_confirmation:
-            return "confirm"
+            logger.info("LangGraph route_entry -> report legacy_pending run_id=%s", agent_state.run_id)
+            return "report"
         if agent.next_global_action(agent_state):
+            logger.info("LangGraph route_entry -> global_action run_id=%s", agent_state.run_id)
             return "global_action"
         if flow == "global":
+            logger.info("LangGraph route_entry -> report run_id=%s", agent_state.run_id)
             return "report"
+        logger.info("LangGraph route_entry -> prepare_ip run_id=%s", agent_state.run_id)
         return "prepare_ip"
 
     def route_after_global(state: dict[str, Any]) -> str:
         """After a global action, continue the global loop or move to the IP stage."""
         agent_state = state["agent_state"]
         if agent.next_global_action(agent_state):
+            logger.info("LangGraph route_after_global -> global_action run_id=%s", agent_state.run_id)
             return "global_action"
         if flow == "global":
+            logger.info("LangGraph route_after_global -> report run_id=%s", agent_state.run_id)
             return "report"
+        logger.info("LangGraph route_after_global -> prepare_ip run_id=%s", agent_state.run_id)
         return "prepare_ip"
 
     def route_after_prepare_ip(state: dict[str, Any]) -> str:
-        """After the IP prepare node, go to confirm, a single-IP action, or report."""
+        """After the IP prepare node, go to a single-IP action or report."""
         agent_state = state["agent_state"]
         if agent_state.pending_confirmation:
-            return "confirm"
+            logger.info("LangGraph route_after_prepare_ip -> report legacy_pending run_id=%s", agent_state.run_id)
+            return "report"
         if state.get("current_ip_action"):
+            logger.info("LangGraph route_after_prepare_ip -> ip_action run_id=%s ip=%s action=%s", agent_state.run_id, state.get("current_ip"), state.get("current_ip_action"))
             return "ip_action"
+        logger.info("LangGraph route_after_prepare_ip -> report run_id=%s", agent_state.run_id)
         return "report"
 
     graph = StateGraph(dict)
@@ -179,7 +212,6 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
     graph.add_node("global_action", global_action_node)
     graph.add_node("prepare_ip", prepare_ip_node)
     graph.add_node("ip_action", ip_action_node)
-    graph.add_node("confirm", confirm_node)
     graph.add_node("report", report_node)
 
     graph.add_edge(START, "entry")
@@ -187,7 +219,6 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
         "entry",
         route_entry,
         {
-            "confirm": "confirm",
             "global_action": "global_action",
             "prepare_ip": "prepare_ip",
             "report": "report",
@@ -205,12 +236,13 @@ def build_deployment_graph(*, agent: PamDeployAgent, flow: GraphFlow = "deploy")
     graph.add_conditional_edges(
         "prepare_ip",
         route_after_prepare_ip,
-        {"confirm": "confirm", "ip_action": "ip_action", "report": "report"},
+        {"ip_action": "ip_action", "report": "report"},
     )
     graph.add_edge("ip_action", "prepare_ip")
-    graph.add_edge("confirm", "entry")
     graph.add_edge("report", END)
-    return graph.compile(checkpointer=InMemorySaver())
+    compiled = graph.compile(checkpointer=InMemorySaver())
+    logger.info("LangGraph deployment graph built flow=%s", flow)
+    return compiled
 
 
 def _extract_interrupt_value(interrupts: Any) -> dict[str, Any]:
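With the `confirm` node removed, the graph reduces to: `entry`, a loop over `global_action`, a `prepare_ip`/`ip_action` loop, then `report`, with a legacy pending confirmation short-circuiting straight to `report`. The sketch below replays that node order in plain Python so the routing can be followed without installing langgraph. `SketchState` and `run_flow` are hypothetical stand-ins, not the project's `AgentState` or runtime.

```python
from dataclasses import dataclass, field


@dataclass
class SketchState:
    """Hypothetical stand-in for AgentState with only the fields the routing needs."""

    global_actions: list[str] = field(default_factory=list)
    ip_actions: list[tuple[str, str]] = field(default_factory=list)
    pending_confirmation: bool = False
    trace: list[str] = field(default_factory=list)


def run_flow(state: SketchState, flow: str = "deploy") -> list[str]:
    """Walk the same node order as the graph and return the visited nodes."""
    state.trace.append("entry")
    if state.pending_confirmation:
        # Legacy pending confirmations no longer enter a confirm node.
        state.trace.append("report")
        return state.trace
    while state.global_actions:
        state.trace.append(f"global_action:{state.global_actions.pop(0)}")
    if flow == "global":
        state.trace.append("report")
        return state.trace
    while True:
        state.trace.append("prepare_ip")
        if state.pending_confirmation or not state.ip_actions:
            break
        ip, action = state.ip_actions.pop(0)
        state.trace.append(f"ip_action:{ip}:{action}")
    state.trace.append("report")
    return state.trace
```

Every IP action costs two node visits (`prepare_ip` plus `ip_action`), so a deployment over many IPs takes many graph steps; that is presumably why `_config` raises `recursion_limit` to 10000.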
@@ -2,6 +2,7 @@
 
 from __future__ import annotations
 
+from collections.abc import Iterable
 from typing import Any, Protocol
 
 from pam_deploy_graph.models import (
@@ -11,6 +12,7 @@ from pam_deploy_graph.models import (
     LlmDeployPlan,
     LlmIntentResult,
     LlmParamResult,
+    LlmSingleActionProposal,
 )
 
 
@@ -40,7 +42,28 @@ class LlmClient(Protocol):
         *,
         action: str,
         result: ActionResult,
-        state_summary: dict[str, Any],
     ) -> LlmActionAnalysis:
         """Analyze the action result and advise whether execution may continue."""
         ...
+
+    def chat(self, text: str, context: dict[str, Any] | None = None) -> str:
+        """Hold a plain natural-language conversation without triggering the deployment workflow."""
+        ...
+
+    def chat_stream(self, text: str, context: dict[str, Any] | None = None) -> Iterable[str]:
+        """Stream a plain natural-language conversation without triggering the deployment workflow."""
+        ...
+
+    def analyze_log(self, log_text: str, question: str | None = None, source_path: str = "") -> str:
+        """Analyze log text and return a summary of anomalies, causes, and suggestions."""
+        ...
+
+    def propose_action(
+        self,
+        text: str,
+        allowed_actions: list[str],
+        params: dict[str, Any],
+        state_summary: dict[str, Any] | None = None,
+    ) -> LlmSingleActionProposal:
+        """Parse natural language into a proposal for a single action call."""
+        ...
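`LlmClient` is a `typing.Protocol`, so any class with matching method names satisfies it without inheriting from it. The sketch below shows that structural typing for the two chat methods only. `ChatClient`, `EchoClient`, and `collect` are hypothetical names for illustration; the real protocol has more methods and is not marked `runtime_checkable` in this diff.

```python
from __future__ import annotations

from collections.abc import Iterable
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ChatClient(Protocol):
    """Hypothetical, reduced version of the chat part of the protocol."""

    def chat(self, text: str, context: dict[str, Any] | None = None) -> str: ...

    def chat_stream(self, text: str, context: dict[str, Any] | None = None) -> Iterable[str]: ...


class EchoClient:
    """Toy implementation: no base class needed, only matching method names."""

    def chat(self, text: str, context: dict[str, Any] | None = None) -> str:
        return f"echo: {text}"

    def chat_stream(self, text: str, context: dict[str, Any] | None = None) -> Iterable[str]:
        # Yield the reply in word-sized pieces, as a streaming backend might.
        for word in self.chat(text, context).split():
            yield word


def collect(client: ChatClient, text: str) -> str:
    """Join the streamed pieces back into one reply."""
    return " ".join(client.chat_stream(text))
```

Returning `Iterable[str]` from `chat_stream` lets a rule-based fallback return a one-element list while an HTTP client yields chunks lazily; callers such as `collect` iterate over both in the same way.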
@@ -3,11 +3,15 @@
 from __future__ import annotations
 
 import os
+import logging
 
+from pam_deploy_graph.logging_utils import json_for_log
 from .base import LlmClient
 from .openai_compatible import OpenAICompatibleLlmClient, load_prompt_text
 from .rule_based import RuleBasedLlmClient
 
+logger = logging.getLogger(__name__)
+
 
 def build_llm_client(
     *,
@@ -25,8 +29,24 @@ def build_llm_client(
         if action_analysis_prompt_path is not None
         else os.getenv("PAM_LLM_ACTION_ANALYSIS_PROMPT_FILE", "")
     )
+    logger.info(
+        "Building LLM client base_url=%s model=%s has_api_key=%s action_prompt_path=%s explicit=%s",
+        actual_base_url,
+        actual_model,
+        bool(actual_api_key),
+        actual_action_prompt_path,
+        json_for_log(
+            {
+                "base_url": base_url,
+                "api_key": api_key,
+                "model": model,
+                "action_analysis_prompt_path": action_analysis_prompt_path,
+            }
+        ),
+    )
 
     if not actual_base_url and not actual_api_key and not actual_model:
+        logger.info("No real LLM configured; using the RuleBasedLlmClient fallback")
         return RuleBasedLlmClient()
 
     missing = []
@ -35,11 +55,14 @@ def build_llm_client(
|
|||||||
if not actual_model:
|
if not actual_model:
|
||||||
missing.append("model")
|
missing.append("model")
|
||||||
if missing:
|
if missing:
|
||||||
|
logger.warning("LLM 配置不完整 missing=%s", missing)
|
||||||
raise ValueError(f"LLM 配置不完整,缺少: {', '.join(missing)}")
|
raise ValueError(f"LLM 配置不完整,缺少: {', '.join(missing)}")
|
||||||
|
|
||||||
return OpenAICompatibleLlmClient(
|
client = OpenAICompatibleLlmClient(
|
||||||
base_url=actual_base_url,
|
base_url=actual_base_url,
|
||||||
api_key=actual_api_key,
|
api_key=actual_api_key,
|
||||||
model=actual_model,
|
model=actual_model,
|
||||||
action_analysis_prompt=load_prompt_text(actual_action_prompt_path),
|
action_analysis_prompt=load_prompt_text(actual_action_prompt_path),
|
||||||
)
|
)
|
||||||
|
logger.info("真实 LLM client 构建完成 client=%s model=%s has_api_key=%s", type(client).__name__, actual_model, bool(actual_api_key))
|
||||||
|
return client
|
||||||
|
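The factory change above keeps the existing selection rule: with no LLM settings at all it falls back to the rule-based client, and with partial settings it raises. The sketch below restates that rule in isolation. `choose_client_kind` is a hypothetical helper, and the `base_url` check is an assumption, since that part of the hunk is not shown in the diff.

```python
def choose_client_kind(base_url: str = "", api_key: str = "", model: str = "") -> str:
    """Return "rule_based" or "openai_compatible"; raise on a partial config."""
    if not base_url and not api_key and not model:
        return "rule_based"  # nothing configured: use the local fallback
    # Assumption: base_url and model are required, api_key may stay empty.
    missing = [name for name, value in (("base_url", base_url), ("model", model)) if not value]
    if missing:
        raise ValueError(f"incomplete LLM config, missing: {', '.join(missing)}")
    return "openai_compatible"
```

Raising on a partial configuration, instead of silently falling back, makes a mistyped setting visible at startup.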
|||||||
@ -7,9 +7,11 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import json
|
import json
|
||||||
|
import logging
|
||||||
|
import time
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
import urllib.request
|
import urllib.request
|
||||||
from collections.abc import Callable
|
from collections.abc import Callable, Iterable, Iterator
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from pam_deploy_graph.constants import (
|
from pam_deploy_graph.constants import (
|
||||||
@ -20,12 +22,25 @@ from pam_deploy_graph.constants import (
|
|||||||
REQUIRED_PARAMS,
|
REQUIRED_PARAMS,
|
||||||
SENSITIVE_KEYS,
|
SENSITIVE_KEYS,
|
||||||
)
|
)
|
||||||
from pam_deploy_graph.models import ExecutionStrategy, LlmDeployPlan, LlmIntentResult, LlmParamResult
|
from pam_deploy_graph.logging_utils import json_for_log, redact_for_log
|
||||||
|
from pam_deploy_graph.models import ExecutionStrategy, LlmDeployPlan, LlmIntentResult, LlmParamResult, LlmSingleActionProposal
|
||||||
from pam_deploy_graph.models import ActionResult, LlmActionAnalysis
|
from pam_deploy_graph.models import ActionResult, LlmActionAnalysis
|
||||||
|
|
||||||
from .prompts import ACTION_ANALYSIS_PROMPT, INTENT_PROMPT, PARAM_PROMPT, PLAN_PROMPT, SYSTEM_PROMPT
|
from .prompts import (
|
||||||
|
ACTION_ANALYSIS_PROMPT,
|
||||||
|
CHAT_PROMPT,
|
||||||
|
INTENT_PROMPT,
|
||||||
|
LOG_ANALYSIS_PROMPT,
|
||||||
|
PARAM_PROMPT,
|
||||||
|
PLAN_PROMPT,
|
||||||
|
SINGLE_ACTION_PROMPT,
|
||||||
|
SYSTEM_PROMPT,
|
||||||
|
)
|
||||||
|
from .text_filter import filter_thinking_chunks, strip_thinking_text
|
||||||
|
|
||||||
JsonTransport = Callable[[str, dict[str, str], dict[str, Any], float], dict[str, Any]]
|
JsonTransport = Callable[[str, dict[str, str], dict[str, Any], float], dict[str, Any]]
|
||||||
|
StreamTransport = Callable[[str, dict[str, str], dict[str, Any], float], Iterable[str]]
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class OpenAICompatibleLlmClient:
|
class OpenAICompatibleLlmClient:
|
||||||
@ -41,6 +56,7 @@ class OpenAICompatibleLlmClient:
|
|||||||
timeout_sec: float = 30,
|
timeout_sec: float = 30,
|
||||||
temperature: float = 0,
|
temperature: float = 0,
|
||||||
transport: JsonTransport | None = None,
|
transport: JsonTransport | None = None,
|
||||||
|
stream_transport: StreamTransport | None = None,
|
||||||
) -> None:
|
) -> None:
|
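||||||
"""Store connection parameters, model parameters, and the replaceable HTTP transport."""
|
"""Store connection parameters, model parameters, and the replaceable HTTP transport."""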
|
||||||
if not base_url:
|
if not base_url:
|
||||||
@ -54,10 +70,22 @@ class OpenAICompatibleLlmClient:
|
|||||||
self.timeout_sec = timeout_sec
|
self.timeout_sec = timeout_sec
|
||||||
self.temperature = temperature
|
self.temperature = temperature
|
||||||
self.transport = transport or _default_transport
|
self.transport = transport or _default_transport
|
||||||
|
self.stream_transport = stream_transport or _default_stream_transport
|
||||||
|
logger.info(
|
||||||
|
"OpenAI-compatible LLM client 初始化 base_url=%s endpoint=%s model=%s has_api_key=%s timeout=%s temperature=%s custom_transport=%s custom_stream_transport=%s",
|
||||||
|
self.base_url,
|
||||||
|
_chat_completions_url(self.base_url),
|
||||||
|
self.model,
|
||||||
|
bool(self.api_key),
|
||||||
|
self.timeout_sec,
|
||||||
|
self.temperature,
|
||||||
|
transport is not None,
|
||||||
|
stream_transport is not None,
|
||||||
|
)
|
||||||
|
|
||||||
def understand_request(self, text: str) -> LlmIntentResult:
|
def understand_request(self, text: str) -> LlmIntentResult:
|
||||||
"""Call the LLM to identify the user's intent."""
|
"""Call the LLM to identify the user's intent."""
|
||||||
payload = self._complete_json(INTENT_PROMPT, {"user_text": text})
|
payload = self._complete_json("understand_request", INTENT_PROMPT, {"user_text": text})
|
||||||
return LlmIntentResult(
|
return LlmIntentResult(
|
||||||
intent=_string(payload, "intent", "deploy"), # type: ignore[arg-type]
|
intent=_string(payload, "intent", "deploy"), # type: ignore[arg-type]
|
||||||
mode_preference=_string(payload, "mode_preference", "未指定"), # type: ignore[arg-type]
|
mode_preference=_string(payload, "mode_preference", "未指定"), # type: ignore[arg-type]
|
||||||
@ -73,6 +101,7 @@ class OpenAICompatibleLlmClient:
|
|||||||
original_base = dict(base_params or {})
|
original_base = dict(base_params or {})
|
||||||
safe_base = _redact_sensitive(original_base)
|
safe_base = _redact_sensitive(original_base)
|
||||||
payload = self._complete_json(
|
payload = self._complete_json(
|
||||||
|
"extract_params",
|
||||||
PARAM_PROMPT,
|
PARAM_PROMPT,
|
||||||
{
|
{
|
||||||
"user_text": text,
|
"user_text": text,
|
||||||
@ -110,6 +139,7 @@ class OpenAICompatibleLlmClient:
|
|||||||
) -> LlmDeployPlan:
|
) -> LlmDeployPlan:
|
||||||
"""Call the LLM to generate a deployment plan."""
|
"""Call the LLM to generate a deployment plan."""
|
||||||
payload = self._complete_json(
|
payload = self._complete_json(
|
||||||
|
"generate_plan",
|
||||||
PLAN_PROMPT,
|
PLAN_PROMPT,
|
||||||
{
|
{
|
||||||
"params": _redact_sensitive(params),
|
"params": _redact_sensitive(params),
|
||||||
@ -134,23 +164,14 @@ class OpenAICompatibleLlmClient:
|
|||||||
*,
|
*,
|
||||||
action: str,
|
action: str,
|
||||||
result: ActionResult,
|
result: ActionResult,
|
||||||
state_summary: dict[str, Any],
|
|
||||||
) -> LlmActionAnalysis:
|
) -> LlmActionAnalysis:
|
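||||||
"""Call the LLM to analyze an action result and return structured diagnostic advice."""
|
"""Call the LLM to analyze an action result and return structured diagnostic advice."""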
|
||||||
payload = self._complete_json(
|
payload = self._complete_json(
|
||||||
|
"analyze_action_result",
|
||||||
self.action_analysis_prompt,
|
self.action_analysis_prompt,
|
||||||
{
|
{
|
||||||
"action": action,
|
"action": action,
|
||||||
"result": {
|
"result": _action_review_result_payload(action, result),
|
||||||
"backend": result.backend,
|
|
||||||
"ok": result.ok,
|
|
||||||
"exit_code": result.exit_code,
|
|
||||||
"tool_name": result.tool_name,
|
|
||||||
"values": _redact_sensitive(result.values),
|
|
||||||
"stderr": _truncate_text(result.stderr),
|
|
||||||
"error_summary": result.error_summary,
|
|
||||||
},
|
|
||||||
"state_summary": _redact_sensitive(state_summary),
|
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
return LlmActionAnalysis(
|
return LlmActionAnalysis(
|
||||||
@ -161,11 +182,78 @@ class OpenAICompatibleLlmClient:
|
|||||||
suggested_action=_string(payload, "suggested_action", ""),
|
suggested_action=_string(payload, "suggested_action", ""),
|
||||||
requires_confirmation=bool(payload.get("requires_confirmation", False)),
|
requires_confirmation=bool(payload.get("requires_confirmation", False)),
|
||||||
should_continue=bool(payload.get("should_continue", True)),
|
should_continue=bool(payload.get("should_continue", True)),
|
||||||
|
progress_complete=_optional_bool(payload.get("progress_complete")),
|
||||||
notes=_string_list(payload.get("notes")),
|
notes=_string_list(payload.get("notes")),
|
||||||
)
|
)
|
||||||
|
|
||||||
def _complete_json(self, instruction: str, input_payload: dict[str, Any]) -> dict[str, Any]:
|
def chat(self, text: str, context: dict[str, Any] | None = None) -> str:
|
||||||
|
"""Call the LLM for plain chat; a JSON response is not required."""
|
||||||
|
return self._complete_text(
|
||||||
|
"chat",
|
||||||
|
CHAT_PROMPT,
|
||||||
|
{
|
||||||
|
"user_text": text,
|
||||||
|
"context": _redact_sensitive(context or {}),
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
def chat_stream(self, text: str, context: dict[str, Any] | None = None) -> Iterable[str]:
|
||||||
|
"""Call the LLM for plain streaming chat; a JSON response is not required."""
|
||||||
|
return self._complete_text_stream(
|
||||||
|
"chat",
|
||||||
|
CHAT_PROMPT,
|
||||||
|
{
|
||||||
|
"user_text": text,
|
||||||
|
"context": _redact_sensitive(context or {}),
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
def analyze_log(self, log_text: str, question: str | None = None, source_path: str = "") -> str:
|
||||||
|
"""Call the LLM to analyze the tail excerpt of a log."""
|
||||||
|
return self._complete_text(
|
||||||
|
"analyze_log",
|
||||||
|
LOG_ANALYSIS_PROMPT,
|
||||||
|
{
|
||||||
|
"source_path": source_path,
|
||||||
|
"question": question or "请分析日志中的异常、可能原因和下一步建议。",
|
||||||
|
"log_tail": redact_for_log(log_text, max_text_len=64000),
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
def propose_action(
|
||||||
|
self,
|
||||||
|
text: str,
|
||||||
|
allowed_actions: list[str],
|
||||||
|
params: dict[str, Any],
|
||||||
|
state_summary: dict[str, Any] | None = None,
|
||||||
|
) -> LlmSingleActionProposal:
|
||||||
|
"""Call the LLM to parse natural language into a proposal for a single action call."""
|
||||||
|
payload = self._complete_json(
|
||||||
|
"propose_action",
|
||||||
|
SINGLE_ACTION_PROMPT,
|
||||||
|
{
|
||||||
|
"user_text": text,
|
||||||
|
"allowed_actions": allowed_actions,
|
||||||
|
"params": _redact_sensitive(params),
|
||||||
|
"state_summary": _redact_sensitive(state_summary or {}),
|
||||||
|
},
|
||||||
|
)
|
||||||
|
action = _string(payload, "action", "")
|
||||||
|
if action not in allowed_actions:
|
||||||
|
action = ""
|
||||||
|
return LlmSingleActionProposal(
|
||||||
|
action=action,
|
||||||
|
ip=_string(payload, "ip", ""),
|
||||||
|
kwargs=_dict(payload.get("kwargs")),
|
||||||
|
reason=_string(payload, "reason", ""),
|
||||||
|
risk_level=_risk_level(payload.get("risk_level")),
|
||||||
|
requires_confirmation=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
def _complete_json(self, operation: str, instruction: str, input_payload: dict[str, Any]) -> dict[str, Any]:
|
||||||
"""Send a chat/completions request and parse the JSON object response."""
|
"""Send a chat/completions request and parse the JSON object response."""
|
||||||
|
started_at = time.perf_counter()
|
||||||
|
endpoint = _chat_completions_url(self.base_url)
|
||||||
request_payload = {
|
request_payload = {
|
||||||
"model": self.model,
|
"model": self.model,
|
||||||
"temperature": self.temperature,
|
"temperature": self.temperature,
|
||||||
@ -183,18 +271,145 @@ class OpenAICompatibleLlmClient:
|
|||||||
headers = {"Content-Type": "application/json"}
|
headers = {"Content-Type": "application/json"}
|
||||||
if self.api_key:
|
if self.api_key:
|
||||||
headers["Authorization"] = f"Bearer {self.api_key}"
|
headers["Authorization"] = f"Bearer {self.api_key}"
|
||||||
|
logger.info(
|
||||||
|
"LLM 请求开始 operation=%s endpoint=%s model=%s timeout=%s has_api_key=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
self.model,
|
||||||
|
self.timeout_sec,
|
||||||
|
bool(self.api_key),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
try:
|
||||||
response = self.transport(
|
response = self.transport(
|
||||||
_chat_completions_url(self.base_url),
|
endpoint,
|
||||||
headers,
|
headers,
|
||||||
request_payload,
|
request_payload,
|
||||||
self.timeout_sec,
|
self.timeout_sec,
|
||||||
)
|
)
|
||||||
content = _message_content(response)
|
content = _message_content(response)
|
||||||
|
logger.info(
|
||||||
|
"LLM 原始响应 operation=%s duration_ms=%s content=%s",
|
||||||
|
operation,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
redact_for_log(content, max_text_len=1600),
|
||||||
|
)
|
||||||
parsed = _loads_json_object(content)
|
parsed = _loads_json_object(content)
|
||||||
if not isinstance(parsed, dict):
|
if not isinstance(parsed, dict):
|
||||||
raise ValueError("LLM 响应必须是 JSON object")
|
raise ValueError("LLM 响应必须是 JSON object")
|
||||||
|
except Exception:
|
||||||
|
logger.exception(
|
||||||
|
"LLM 请求失败 operation=%s endpoint=%s duration_ms=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
raise
|
||||||
|
logger.info(
|
||||||
|
"LLM 请求完成 operation=%s duration_ms=%s response_keys=%s response=%s",
|
||||||
|
operation,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
sorted(parsed.keys()),
|
||||||
|
json_for_log(parsed, max_text_len=1600),
|
||||||
|
)
|
||||||
return parsed
|
return parsed
|
||||||
|
|
||||||
|
def _complete_text(self, operation: str, instruction: str, input_payload: dict[str, Any]) -> str:
|
||||||
|
"""Send a chat/completions request and return the plain-text response."""
|
||||||
|
started_at = time.perf_counter()
|
||||||
|
endpoint = _chat_completions_url(self.base_url)
|
||||||
|
request_payload = {
|
||||||
|
"model": self.model,
|
||||||
|
"temperature": self.temperature,
|
||||||
|
"messages": [
|
||||||
|
{"role": "system", "content": instruction},
|
||||||
|
{
|
||||||
|
"role": "user",
|
||||||
|
"content": "输入 JSON:\n" + json.dumps(input_payload, ensure_ascii=False, sort_keys=True),
|
||||||
|
},
|
||||||
|
],
|
||||||
|
}
|
||||||
|
headers = {"Content-Type": "application/json"}
|
||||||
|
if self.api_key:
|
||||||
|
headers["Authorization"] = f"Bearer {self.api_key}"
|
||||||
|
logger.info(
|
||||||
|
"LLM 文本请求开始 operation=%s endpoint=%s model=%s timeout=%s has_api_key=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
self.model,
|
||||||
|
self.timeout_sec,
|
||||||
|
bool(self.api_key),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
response = self.transport(endpoint, headers, request_payload, self.timeout_sec)
|
||||||
|
content = strip_thinking_text(str(_message_content(response)))
|
||||||
|
except Exception:
|
||||||
|
logger.exception(
|
||||||
|
"LLM 文本请求失败 operation=%s endpoint=%s duration_ms=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
raise
|
||||||
|
logger.info(
|
||||||
|
"LLM 文本请求完成 operation=%s duration_ms=%s content=%s",
|
||||||
|
operation,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
redact_for_log(content, max_text_len=1600),
|
||||||
|
)
|
||||||
|
return content
|
||||||
|
|
||||||
|
def _complete_text_stream(self, operation: str, instruction: str, input_payload: dict[str, Any]) -> Iterable[str]:
|
||||||
|
"""Send a streaming chat/completions request and yield filtered plain-text chunks."""
|
||||||
|
started_at = time.perf_counter()
|
||||||
|
endpoint = _chat_completions_url(self.base_url)
|
||||||
|
request_payload = {
|
||||||
|
"model": self.model,
|
||||||
|
"temperature": self.temperature,
|
||||||
|
"stream": True,
|
||||||
|
"messages": [
|
||||||
|
{"role": "system", "content": instruction},
|
||||||
|
{
|
||||||
|
"role": "user",
|
||||||
|
"content": "输入 JSON:\n" + json.dumps(input_payload, ensure_ascii=False, sort_keys=True),
|
||||||
|
},
|
||||||
|
],
|
||||||
|
}
|
||||||
|
headers = {"Content-Type": "application/json"}
|
||||||
|
if self.api_key:
|
||||||
|
headers["Authorization"] = f"Bearer {self.api_key}"
|
||||||
|
logger.info(
|
||||||
|
"LLM 流式文本请求开始 operation=%s endpoint=%s model=%s timeout=%s has_api_key=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
self.model,
|
||||||
|
self.timeout_sec,
|
||||||
|
bool(self.api_key),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
raw_chunks = self.stream_transport(endpoint, headers, request_payload, self.timeout_sec)
|
||||||
|
for chunk in filter_thinking_chunks(raw_chunks):
|
||||||
|
if chunk:
|
||||||
|
yield chunk
|
||||||
|
except Exception:
|
||||||
|
logger.exception(
|
||||||
|
"LLM 流式文本请求失败 operation=%s endpoint=%s duration_ms=%s input=%s",
|
||||||
|
operation,
|
||||||
|
endpoint,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
json_for_log(input_payload, max_text_len=1600),
|
||||||
|
)
|
||||||
|
raise
|
||||||
|
logger.info(
|
||||||
|
"LLM 流式文本请求完成 operation=%s duration_ms=%s",
|
||||||
|
operation,
|
||||||
|
int((time.perf_counter() - started_at) * 1000),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def _default_transport(
|
def _default_transport(
|
||||||
url: str,
|
url: str,
|
||||||
@ -217,6 +432,41 @@ def _default_transport(
|
|||||||
return decoded
|
return decoded
|
||||||
|
|
||||||
|
|
||||||
|
def _default_stream_transport(
|
||||||
|
url: str,
|
||||||
|
headers: dict[str, str],
|
||||||
|
payload: dict[str, Any],
|
||||||
|
timeout_sec: float,
|
||||||
|
) -> Iterator[str]:
|
||||||
|
"""Send an OpenAI-compatible SSE streaming request using the standard-library urllib."""
|
||||||
|
request = urllib.request.Request(
|
||||||
|
url,
|
||||||
|
data=json.dumps(payload).encode("utf-8"),
|
||||||
|
headers=headers,
|
||||||
|
method="POST",
|
||||||
|
)
|
||||||
|
with urllib.request.urlopen(request, timeout=timeout_sec) as response:
|
||||||
|
for raw_line in response:
|
||||||
|
line = raw_line.decode("utf-8", errors="replace").strip()
|
||||||
|
if not line or line.startswith(":"):
|
||||||
|
continue
|
||||||
|
if line.startswith(("event:", "id:", "retry:")):
|
||||||
|
continue
|
||||||
|
if not line.startswith("data:"):
|
||||||
|
raise ValueError("LLM 流式响应不是 SSE data 格式")
|
||||||
|
data = line[len("data:") :].strip()
|
||||||
|
if data == "[DONE]":
|
||||||
|
break
|
||||||
|
try:
|
||||||
|
decoded = json.loads(data)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
logger.debug("忽略无法解析的 LLM stream data: %s", redact_for_log(data, max_text_len=300))
|
||||||
|
continue
|
||||||
|
chunk = _stream_delta_content(decoded)
|
||||||
|
if chunk:
|
||||||
|
yield chunk
|
||||||
|
|
||||||
|
|
||||||
def load_prompt_text(path: str | None) -> str:
|
def load_prompt_text(path: str | None) -> str:
|
||||||
"""Read a custom prompt file."""
|
"""Read a custom prompt file."""
|
||||||
if not path:
|
if not path:
|
||||||
@ -250,6 +500,35 @@ def _message_content(response: dict[str, Any]) -> Any:
|
|||||||
return content
|
return content
|
||||||
|
|
||||||
|
|
||||||
|
def _stream_delta_content(response: dict[str, Any]) -> str:
|
||||||
|
"""Extract delta.content from an OpenAI-compatible stream chunk."""
|
||||||
|
try:
|
||||||
|
choice = response["choices"][0]
|
||||||
|
except (KeyError, IndexError, TypeError):
|
||||||
|
return ""
|
||||||
|
delta = choice.get("delta") if isinstance(choice, dict) else None
|
||||||
|
if isinstance(delta, dict) and "content" in delta:
|
||||||
|
return str(_content_parts_to_text(delta.get("content")))
|
||||||
|
message = choice.get("message") if isinstance(choice, dict) else None
|
||||||
|
if isinstance(message, dict) and "content" in message:
|
||||||
|
return str(_content_parts_to_text(message.get("content")))
|
||||||
|
text = choice.get("text") if isinstance(choice, dict) else None
|
||||||
|
return str(text) if text is not None else ""
|
||||||
|
|
||||||
|
|
||||||
|
def _content_parts_to_text(content: Any) -> str:
|
||||||
|
"""Convert OpenAI content parts or a string into plain text."""
|
||||||
|
if isinstance(content, list):
|
||||||
|
parts: list[str] = []
|
||||||
|
for item in content:
|
||||||
|
if isinstance(item, dict) and item.get("type") == "text":
|
||||||
|
parts.append(str(item.get("text", "")))
|
||||||
|
elif isinstance(item, str):
|
||||||
|
parts.append(item)
|
||||||
|
return "".join(parts)
|
||||||
|
return "" if content is None else str(content)
|
||||||
|
|
||||||
|
|
||||||
def _loads_json_object(content: Any) -> Any:
|
def _loads_json_object(content: Any) -> Any:
|
||||||
"""Parse message.content into a JSON object."""
|
"""Parse message.content into a JSON object."""
|
||||||
if isinstance(content, dict):
|
if isinstance(content, dict):
|
||||||
@ -274,6 +553,45 @@ def _redact_sensitive(value: Any) -> Any:
|
|||||||
return value
|
return value
|
||||||
|
|
||||||
|
|
||||||
|
def _action_review_result_payload(action: str, result: ActionResult) -> dict[str, Any]:
|
||||||
|
"""Build the action-review input so that normal script logs are not fed to the LLM as errors."""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"backend": result.backend,
|
||||||
|
"ok": result.ok,
|
||||||
|
"exit_code": result.exit_code,
|
||||||
|
"tool_name": result.tool_name,
|
||||||
|
"values": _redact_sensitive(result.values),
|
||||||
|
"error_summary": result.error_summary,
|
||||||
|
}
|
||||||
|
if _needs_diagnostic_log(action, result):
|
||||||
|
diagnostic = _diagnostic_log_text(result)
|
||||||
|
if diagnostic:
|
||||||
|
payload["diagnostic_log"] = diagnostic
|
||||||
|
return payload
|
||||||
|
|
||||||
|
|
||||||
|
def _needs_diagnostic_log(action: str, result: ActionResult) -> bool:
|
||||||
|
"""Hand a small amount of diagnostic log to the LLM only on failure or a business-level anomaly."""
|
||||||
|
if not result.ok or result.error_summary or result.values.get("PENDING_AGENT_CONFIRMATION"):
|
||||||
|
return True
|
||||||
|
if action == "verify-ip":
|
||||||
|
success = result.values.get("SUCCESS")
|
||||||
|
if success is not None and str(success).lower() not in ("true", "1", "yes"):
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def _diagnostic_log_text(result: ActionResult) -> str:
|
||||||
|
"""Prefer the error summary; otherwise use the tail of stderr/stdout/raw_output as diagnostic context."""
|
||||||
|
if result.error_summary:
|
||||||
|
return _truncate_text(result.error_summary)
|
||||||
|
for text in (result.stderr, result.stdout, result.raw_output):
|
||||||
|
stripped = text.strip()
|
||||||
|
if stripped:
|
||||||
|
return _tail_text(stripped)
|
||||||
|
return ""
|
||||||
|
|
||||||
|
|
||||||
def _truncate_text(value: str, limit: int = 1000) -> str:
|
def _truncate_text(value: str, limit: int = 1000) -> str:
|
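||||||
"""Truncate long text sent to the LLM so that complete logs are never passed in."""
|
"""Truncate long text sent to the LLM so that complete logs are never passed in."""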
|
||||||
if len(value) <= limit:
|
if len(value) <= limit:
|
||||||
@ -281,6 +599,14 @@ def _truncate_text(value: str, limit: int = 1000) -> str:
|
|||||||
return value[:limit] + "...[已截断]"
|
return value[:limit] + "...[已截断]"
|
||||||
|
|
||||||
|
|
||||||
|
def _tail_text(value: str, limit: int = 1000) -> str:
|
||||||
|
"""Keep the tail of a long diagnostic log; the cause of an error is usually near the end."""
|
||||||
|
if len(value) <= limit:
|
||||||
|
return value
|
||||||
|
marker = "[已截断]..."
|
||||||
|
return marker + value[-(limit - len(marker)) :]
|
||||||
|
|
||||||
|
|
||||||
def _string(payload: dict[str, Any], key: str, default: str) -> str:
|
def _string(payload: dict[str, Any], key: str, default: str) -> str:
|
||||||
"""Safely read a string field."""
|
"""Safely read a string field."""
|
||||||
value = payload.get(key, default)
|
value = payload.get(key, default)
|
||||||
@ -295,6 +621,31 @@ def _float(payload: dict[str, Any], key: str, default: float) -> float:
|
|||||||
return default
|
return default
|
||||||
|
|
||||||
|
|
||||||
|
def _optional_bool(value: Any) -> bool | None:
|
||||||
|
"""Parse an optional boolean, keeping None when the field is missing."""
|
||||||
|
if value is None:
|
||||||
|
return None
|
||||||
|
if isinstance(value, bool):
|
||||||
|
return value
|
||||||
|
if isinstance(value, str):
|
||||||
|
lowered = value.strip().lower()
|
||||||
|
if lowered in ("", "null", "none"):
|
||||||
|
return None
|
||||||
|
if lowered in ("true", "1", "yes", "y"):
|
||||||
|
return True
|
||||||
|
if lowered in ("false", "0", "no", "n"):
|
||||||
|
return False
|
||||||
|
return bool(value)
|
||||||
|
|
||||||
|
|
||||||
|
def _risk_level(value: Any) -> str:
|
||||||
|
"""Parse the risk level of a single action; invalid values fall back to medium."""
|
||||||
|
text = str(value or "").strip().lower()
|
||||||
|
if text in ("low", "medium", "high"):
|
||||||
|
return text
|
||||||
|
return "medium"
|
||||||
|
|
||||||
|
|
||||||
def _dict(value: Any) -> dict[str, Any]:
|
def _dict(value: Any) -> dict[str, Any]:
|
||||||
"""Always return a dict; invalid values fall back to an empty dict."""
|
"""Always return a dict; invalid values fall back to an empty dict."""
|
||||||
return value if isinstance(value, dict) else {}
|
return value if isinstance(value, dict) else {}
|
||||||
|
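The streaming path added in this file reads an OpenAI-compatible SSE response line by line and yields only the `delta.content` text. The sketch below shows that parsing step on plain strings, without any network call. `iter_sse_text` is a hypothetical name, and unlike `_default_stream_transport` it skips non-`data:` lines silently instead of raising.

```python
import json
from collections.abc import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield delta.content strings from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment (keep-alive)
        if not line.startswith("data:"):
            continue  # event:, id:, and retry: fields carry no text
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel used by OpenAI-compatible servers
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            continue  # ignore fragments that are not valid JSON
        try:
            content = chunk["choices"][0]["delta"].get("content")
        except (KeyError, IndexError, TypeError, AttributeError):
            continue  # chunk does not have the expected shape
        if content:
            yield str(content)
```

Because the function takes any iterable of strings, it can be exercised with a list in a unit test, much as the injectable `stream_transport` parameter allows for the client itself.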
|||||||
@ -7,7 +7,7 @@ SYSTEM_PROMPT = """你是 PAM 智能部署 Agent 的结构化理解与规划组
|
|||||||
- 不生成 shell、PowerShell、bat、curl 等可执行命令。
|
- 不生成 shell、PowerShell、bat、curl 等可执行命令。
|
||||||
- 不回显密钥、token、CLIENT_SECRET、Authorization 等敏感值。
|
- 不回显密钥、token、CLIENT_SECRET、Authorization 等敏感值。
|
||||||
- 只能在允许的 action 集合中选择部署动作。
|
- 只能在允许的 action 集合中选择部署动作。
|
||||||
- 真实执行前必须保留人工确认点:参数确认、目标 IP 范围确认、失败回滚确认。
|
- 真实执行前必须保留人工确认点:参数确认、目标 IP 范围确认;失败后应暂停,修复后 resume 重试,回滚只能由用户显式触发。
|
||||||
"""
|
"""
|
||||||
|
|
||||||
INTENT_PROMPT = """根据用户输入识别意图和执行偏好。
|
INTENT_PROMPT = """根据用户输入识别意图和执行偏好。
|
||||||
@ -38,7 +38,13 @@ PARAM_PROMPT = """从用户输入中抽取 PAM 部署参数和控制信息。
|
|||||||
"ZIP_FILE_PATH": "...",
|
"ZIP_FILE_PATH": "...",
|
||||||
"ACTION_TYPE": "...",
|
"ACTION_TYPE": "...",
|
||||||
"TIMEOUT": "...",
|
"TIMEOUT": "...",
|
||||||
"LOG_NAME": "..."
|
"LOG_NAME": "...",
|
||||||
|
"PARENT_VERSION_NUMBER": "...",
|
||||||
|
"POLL_INTERVAL_SEC": "...",
|
||||||
|
"DOWNLOAD_POLL_MAX_ATTEMPTS": "...",
|
||||||
|
"UPGRADE_POLL_MAX_ATTEMPTS": "...",
|
||||||
|
"VERIFY_INTERVAL_SEC": "...",
|
||||||
|
"VERIFY_MAX_ATTEMPTS": "..."
|
||||||
},
|
},
|
||||||
"extracted_control": {
|
"extracted_control": {
|
||||||
"user_specified_ips": ["..."]
|
"user_specified_ips": ["..."]
|
||||||
@ -77,11 +83,64 @@ ACTION_ANALYSIS_PROMPT = """分析一次 PAM action 执行结果。
|
|||||||
"suggested_action": "...",
|
"suggested_action": "...",
|
||||||
"requires_confirmation": false,
|
"requires_confirmation": false,
|
||||||
"should_continue": true,
|
"should_continue": true,
|
||||||
|
"progress_complete": null,
|
||||||
"notes": ["..."]
|
"notes": ["..."]
|
||||||
}
|
}
|
||||||
|
|
||||||
要求:
|
要求:
|
||||||
- 必须明确给出 `should_continue`:没有问题时为 true;存在需要人工判断的问题时为 false。
|
- 必须明确给出 `should_continue`:没有问题时为 true;存在需要人工判断的问题时为 false。
|
||||||
- 如果 exit_code 非 0、ok=false、verify-ip SUCCESS=false、出现 pending_confirmation,应标记异常。
|
- 如果 exit_code 非 0、ok=false、verify-ip SUCCESS=false、出现 legacy pending_confirmation,应标记异常。
|
||||||
|
- 对 `poll-download-progress`、`poll-upgrade-progress` 必须判断 `progress_complete`:已完成为 true;未完成但正常为 false;非进度 action 可为 null。
|
||||||
|
- 进度 action 未完成但正常时,`has_anomaly=false`、`should_continue=true`、`progress_complete=false`,建议继续查询进度。
|
||||||
|
- 进度 action 完成条件优先看 `STEP=DONE`、`STATUS=completed/done/success`、`SUCCESS=true`、`FINISH=true`,或 `MSG=success` 且 `RATE_OF_PROGRESS=100` 且 `CODE` 为空或 0。
|
||||||
|
- 进度 action 出现 `CODE` 非 0,或 `STEP/MSG/STATUS/MESSAGE` 含 fail/error,应标记异常并 `should_continue=false`。
|
||||||
|
- 主要依据结构化字段 `ok`、`exit_code`、`values`、`error_summary` 判断;不会提供完整运行态摘要,避免被历史状态误导。
|
||||||
|
- `verify-ip SUCCESS=false` 由 runtime 按配置重复检查;单次审核仍应说明当前健康检查未通过。
|
||||||
|
- 只有输入里存在 `diagnostic_log` 时,才把它当作异常诊断上下文。
|
||||||
|
- 脚本正常过程日志不会作为错误依据,不能因为日志来自 stderr 就判定异常。
|
||||||
- 不要输出密钥、token、Authorization 或完整日志原文。
|
- 不要输出密钥、token、Authorization 或完整日志原文。
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
CHAT_PROMPT = """你是 PAM 部署 Agent 的交互助手。
|
||||||
|
|
||||||
|
要求:
|
||||||
|
- 可以回答普通问题、解释当前 Agent 的命令和部署流程。
|
||||||
|
- 不要自动触发部署、回滚、升级、脚本执行或 MCP 调用。
|
||||||
|
- 如果用户想执行完整部署,提示使用 `analyze <需求>` 先分析,确认后再输入 `run`。
|
||||||
|
- 如果用户想单独执行 action,提示使用 `action propose <需求>` 或 `action run ...`,执行前仍需要人工确认。
|
||||||
|
- 不要输出密钥、token、Authorization、CLIENT_SECRET 或 api_key。
|
||||||
|
- 不要输出 `<think>`、`</think>`、`Thinking Process`、`Reasoning Process`、`Chain of Thought`、推理过程、内部思考或隐藏分析内容。
|
||||||
|
- 只输出可以直接展示给用户的最终回答。
|
||||||
|
"""
|
||||||
|
|
||||||
|
LOG_ANALYSIS_PROMPT = """分析 PAM Agent 或部署脚本日志。
|
||||||
|
|
||||||
|
要求:
|
||||||
|
- 优先总结异常现象、可能原因和建议下一步。
|
||||||
|
- 不要输出密钥、token、Authorization、CLIENT_SECRET 或 api_key。
|
||||||
|
- 输入通常是日志尾部摘要,不代表完整文件。
|
||||||
|
- 不要因为日志来自 stderr 就直接判定失败,要结合 ERROR、Exception、fail、状态码和上下文判断。
|
||||||
|
- 不要输出 `<think>`、`</think>`、`Thinking Process`、`Reasoning Process`、`Chain of Thought`、推理过程、内部思考或隐藏分析内容。
|
||||||
|
- 只输出可以直接展示给用户的最终分析结果。
|
||||||
|
"""
|
||||||
|
|
||||||
|
SINGLE_ACTION_PROMPT = """把用户自然语言解析成一次 PAM action 调用建议。
|
||||||
|
|
||||||
|
输出 JSON schema:
|
||||||
|
{
|
||||||
|
"action": "get-token",
|
||||||
|
"ip": "",
|
||||||
|
"kwargs": {},
|
||||||
|
"reason": "...",
|
||||||
|
"risk_level": "low|medium|high",
|
||||||
|
"requires_confirmation": true
|
||||||
|
}
|
||||||
|
|
||||||
|
要求:
|
||||||
|
- `action` 必须来自输入的 allowed_actions;不能识别明确 action 时返回空字符串。
|
||||||
|
- 不要猜测危险 action,不要自动规划多个 action。
|
||||||
|
- 逐 IP action 必须尽量提取 `ip`。
|
||||||
|
- 除 `ip` 外的额外参数放入 `kwargs`。
|
||||||
|
- 所有 action 都必须 `requires_confirmation=true`。
|
||||||
|
- 不要输出密钥、token、Authorization、CLIENT_SECRET 或 api_key。
|
||||||
|
"""
|
||||||
|
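The updated `ACTION_ANALYSIS_PROMPT` above spells out when a `poll-download-progress` or `poll-upgrade-progress` result counts as complete, still running, or failed. The sketch below restates those rules as a deterministic check, which could serve as a cross-check on the LLM's `progress_complete` answer. `classify_progress` is a hypothetical helper, not part of the diff, and it treats values case-insensitively as an assumption.

```python
from typing import Any


def _text(values: dict[str, Any], key: str) -> str:
    """Read a field as a lowercase string; missing or falsy values become ''."""
    return str(values.get(key) or "").strip().lower()


def classify_progress(values: dict[str, Any]) -> str:
    """Return "failed", "complete", or "in_progress" for a progress-poll result."""
    # A non-zero CODE, or fail/error in a status-like field, marks an anomaly.
    if _text(values, "CODE") not in ("", "0"):
        return "failed"
    for key in ("STEP", "MSG", "STATUS", "MESSAGE"):
        if any(word in _text(values, key) for word in ("fail", "error")):
            return "failed"
    # Explicit completion markers take priority.
    if (
        _text(values, "STEP") == "done"
        or _text(values, "STATUS") in ("completed", "done", "success")
        or _text(values, "SUCCESS") == "true"
        or _text(values, "FINISH") == "true"
    ):
        return "complete"
    # Otherwise require MSG=success together with 100 percent progress.
    if _text(values, "MSG") == "success" and _text(values, "RATE_OF_PROGRESS") == "100":
        return "complete"
    return "in_progress"
```

In the prompt's terms, "failed" corresponds to `has_anomaly=true` with `should_continue=false`, "complete" to `progress_complete=true`, and "in_progress" to `progress_complete=false` with `should_continue=true`.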
|||||||
@ -6,10 +6,14 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from collections.abc import Iterable
|
||||||
|
import logging
|
||||||
import re
|
import re
|
||||||
|
from dataclasses import asdict
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from pam_deploy_graph.constants import GLOBAL_ACTION_SEQUENCE, REQUIRED_PARAMS
|
from pam_deploy_graph.constants import GLOBAL_ACTION_SEQUENCE, REQUIRED_PARAMS, SENSITIVE_KEYS
|
||||||
|
from pam_deploy_graph.logging_utils import json_for_log, redact_for_log
|
||||||
from pam_deploy_graph.models import (
|
from pam_deploy_graph.models import (
|
||||||
ActionResult,
|
ActionResult,
|
||||||
ExecutionStrategy,
|
ExecutionStrategy,
|
||||||
@ -17,8 +21,13 @@ from pam_deploy_graph.models import (
|
|||||||
LlmDeployPlan,
|
LlmDeployPlan,
|
||||||
LlmIntentResult,
|
LlmIntentResult,
|
||||||
LlmParamResult,
|
LlmParamResult,
|
||||||
|
LlmSingleActionProposal,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
from .text_filter import strip_thinking_text
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
KEY_ALIASES = {
|
KEY_ALIASES = {
|
||||||
"home_base_url": "HOME_BASE_URL",
|
"home_base_url": "HOME_BASE_URL",
|
||||||
"HOME_BASE_URL": "HOME_BASE_URL",
|
"HOME_BASE_URL": "HOME_BASE_URL",
|
||||||
@ -34,6 +43,9 @@ KEY_ALIASES = {
|
|||||||
"MODULE_NAME": "MODULE_NAME",
|
"MODULE_NAME": "MODULE_NAME",
|
||||||
"versionNumber": "VERSION_NUMBER",
|
"versionNumber": "VERSION_NUMBER",
|
||||||
"VERSION_NUMBER": "VERSION_NUMBER",
|
"VERSION_NUMBER": "VERSION_NUMBER",
|
||||||
|
"parentVersionNumber": "PARENT_VERSION_NUMBER",
|
||||||
|
"PARENT_VERSION_NUMBER": "PARENT_VERSION_NUMBER",
|
||||||
|
"parent_version_number": "PARENT_VERSION_NUMBER",
|
||||||
"zipFilePath": "ZIP_FILE_PATH",
|
"zipFilePath": "ZIP_FILE_PATH",
|
||||||
"ZIP_FILE_PATH": "ZIP_FILE_PATH",
|
"ZIP_FILE_PATH": "ZIP_FILE_PATH",
|
||||||
"actionType": "ACTION_TYPE",
|
"actionType": "ACTION_TYPE",
|
||||||
@ -48,8 +60,86 @@ KEY_ALIASES = {
|
|||||||
class RuleBasedLlmClient:
|
class RuleBasedLlmClient:
|
||||||
"""Lightweight rule-based fallback LLM client."""
|
"""Lightweight rule-based fallback LLM client."""
|
||||||
|
|
||||||
|
def chat(self, text: str, context: dict[str, Any] | None = None) -> str:
|
||||||
|
"""Plain-chat explanation from the rule-based fallback."""
|
||||||
|
logger.info("规则 LLM 普通对话 text=%s context=%s", redact_for_log(text, max_text_len=800), json_for_log(context or {}))
|
||||||
|
lowered = text.lower()
|
||||||
|
if any(word in lowered for word in ("help", "帮助", "怎么用", "命令")):
|
||||||
|
return strip_thinking_text(
|
||||||
|
"当前是本地规则 LLM fallback。可用 `analyze <需求>` 分析部署需求,`run` 执行完整 workflow,"
|
||||||
|
"`action propose <需求>` 解析单个 action,`action run ...` 确认后执行单个 action,"
|
||||||
|
"`log analyze <路径>` 分析日志尾部。"
|
||||||
|
)
|
||||||
|
return strip_thinking_text(
|
||||||
|
"当前未配置真实 LLM,已使用本地规则 fallback。普通闲聊只能给出有限说明;"
|
||||||
|
"如需自然语言问答、日志深度分析或更准确的 action 解析,请配置真实 LLM。"
|
||||||
|
)
|
||||||
|
|
||||||
|
def chat_stream(self, text: str, context: dict[str, Any] | None = None) -> Iterable[str]:
|
||||||
|
"""Streaming-chat compatibility implementation for the rule-based fallback."""
|
||||||
|
yield self.chat(text, context=context)
|
||||||
|
|
||||||
|
def analyze_log(self, log_text: str, question: str | None = None, source_path: str = "") -> str:
|
||||||
|
"""用本地规则分析日志尾部。"""
|
||||||
|
logger.info("规则 LLM 日志分析 source=%s question=%s text_len=%s", source_path, redact_for_log(question or "", max_text_len=300), len(log_text))
|
||||||
|
lines = log_text.splitlines()
|
||||||
|
problem_lines = [
|
||||||
|
line
|
||||||
|
for line in lines
|
||||||
|
if re.search(r"error|exception|fail|traceback|timeout|refused|denied|失败|异常|错误|超时", line, flags=re.IGNORECASE)
|
||||||
|
]
|
||||||
|
summary = [
|
||||||
|
f"日志来源: {source_path or '-'}",
|
||||||
|
f"已分析尾部 {len(lines)} 行。",
|
||||||
|
]
|
||||||
|
if question:
|
||||||
|
summary.append(f"关注问题: {question}")
|
||||||
|
if problem_lines:
|
||||||
|
summary.append(f"发现 {len(problem_lines)} 行疑似异常,最近几条:")
|
||||||
|
summary.extend(f"- {redact_for_log(line, max_text_len=240)}" for line in problem_lines[-5:])
|
||||||
|
summary.append("建议:优先检查以上异常附近的接口返回、网络连通性、认证信息和目标服务状态。")
|
||||||
|
else:
|
||||||
|
summary.append("未在日志尾部发现明显 ERROR/Exception/fail/timeout 关键字。")
|
||||||
|
summary.append("建议:如问题仍存在,请扩大 `--tail` 或提供更具体的问题描述。")
|
||||||
|
return strip_thinking_text("\n".join(summary))
|
||||||
|
|
||||||
|
def propose_action(
|
||||||
|
self,
|
||||||
|
text: str,
|
||||||
|
allowed_actions: list[str],
|
||||||
|
params: dict[str, Any],
|
||||||
|
state_summary: dict[str, Any] | None = None,
|
||||||
|
) -> LlmSingleActionProposal:
|
||||||
|
"""只在用户明确写出 action 名时生成单 action 建议。"""
|
||||||
|
logger.info(
|
||||||
|
"规则 LLM 单 action 解析开始 text=%s allowed=%s state=%s",
|
||||||
|
redact_for_log(text, max_text_len=800),
|
||||||
|
allowed_actions,
|
||||||
|
json_for_log(state_summary or {}),
|
||||||
|
)
|
||||||
|
action = ""
|
||||||
|
lowered = text.lower()
|
||||||
|
for candidate in allowed_actions:
|
||||||
|
if candidate.lower() in lowered:
|
||||||
|
action = candidate
|
||||||
|
break
|
||||||
|
ip_match = re.search(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
|
||||||
|
kwargs = _safe_action_kwargs(self._extract_key_values(text))
|
||||||
|
risk = "high" if action in ("publish-version", "create-download-task", "upgrade-ip", "start-ip", "stop-ip", "rollback-ip") else "medium"
|
||||||
|
proposal = LlmSingleActionProposal(
|
||||||
|
action=action,
|
||||||
|
ip=ip_match.group(0) if ip_match else "",
|
||||||
|
kwargs=kwargs,
|
||||||
|
reason="规则 fallback 仅在输入中出现明确 action 名时生成建议。" if action else "未识别到明确 action 名。",
|
||||||
|
risk_level=risk, # type: ignore[arg-type]
|
||||||
|
requires_confirmation=True,
|
||||||
|
)
|
||||||
|
logger.info("规则 LLM 单 action 解析完成 proposal=%s", json_for_log(asdict(proposal)))
|
||||||
|
return proposal
|
||||||
|
|
||||||
def understand_request(self, text: str) -> LlmIntentResult:
|
def understand_request(self, text: str) -> LlmIntentResult:
|
||||||
"""用关键词规则识别用户意图和执行策略偏好。"""
|
"""用关键词规则识别用户意图和执行策略偏好。"""
|
||||||
|
logger.info("规则 LLM 意图识别开始 text=%s", redact_for_log(text, max_text_len=800))
|
||||||
lowered = text.lower()
|
lowered = text.lower()
|
||||||
reasons: list[str] = []
|
reasons: list[str] = []
|
||||||
intent = "deploy"
|
intent = "deploy"
|
||||||
@@ -82,16 +172,19 @@ class RuleBasedLlmClient:
         if intent == "preview":
             strategy_preference = strategy_preference if strategy_preference != "未指定" else "hybrid_node_mcp"

-        return LlmIntentResult(
+        result = LlmIntentResult(
             intent=intent,  # type: ignore[arg-type]
             mode_preference=mode_preference,  # type: ignore[arg-type]
             strategy_preference=strategy_preference,  # type: ignore[arg-type]
             confidence=0.72 if intent != "deploy" else 0.6,
             reasons=reasons,
         )
+        logger.info("规则 LLM 意图识别完成 result=%s", json_for_log(asdict(result)))
+        return result

     def extract_params(self, text: str, base_params: dict[str, Any] | None = None) -> LlmParamResult:
         """Extract parameters from key=value pairs, Chinese phrases, and IP addresses."""
+        logger.info("规则 LLM 参数抽取开始 text=%s base_params=%s", redact_for_log(text, max_text_len=800), json_for_log(base_params or {}))
         params = dict(base_params or {})
         params.update(self._extract_key_values(text))
         params.update(self._extract_chinese_patterns(text))
@@ -103,12 +196,14 @@ class RuleBasedLlmClient:

         missing = [key for key in REQUIRED_PARAMS if not params.get(key)]
         sensitive = [key for key in ("CLIENT_SECRET", "CLIENT_ID") if params.get(key)]
-        return LlmParamResult(
+        result = LlmParamResult(
             extracted_params=params,
             extracted_control=control,
             missing_required_params=missing,
             sensitive_fields_present=sensitive,
         )
+        logger.info("规则 LLM 参数抽取完成 result=%s", json_for_log(asdict(result)))
+        return result

     def generate_plan(
         self,
@@ -118,6 +213,7 @@ class RuleBasedLlmClient:
         strategy: ExecutionStrategy,
     ) -> LlmDeployPlan:
         """Generate a deterministic deployment plan and risk notes."""
+        logger.info("规则 LLM 计划生成开始 intent=%s strategy=%s params=%s", intent, strategy, json_for_log(params))
         if strategy == "hybrid_node_mcp":
             strategy_text = "PAM_HOME 使用脚本 action,PAM_NODE 使用 MCP"
         elif strategy == "script_only":
@@ -139,22 +235,38 @@ class RuleBasedLlmClient:
         if strategy == "hybrid_node_mcp":
             risk_notes.append("PAM_HOME 当前没有 MCP 能力,HOME 阶段仍会调用脚本 action。")

-        return LlmDeployPlan(
+        result = LlmDeployPlan(
             summary=summary,
             risk_notes=risk_notes,
             planned_actions=list(GLOBAL_ACTION_SEQUENCE),
             requires_confirmation=intent in ("deploy", "query_node_ips", "rollback"),
             execution_strategy=strategy,
         )
+        logger.info("规则 LLM 计划生成完成 result=%s", json_for_log(asdict(result)))
+        return result

     def analyze_action_result(
         self,
         *,
         action: str,
         result: ActionResult,
-        state_summary: dict[str, Any],
     ) -> LlmActionAnalysis:
         """Analyze an action result with local rules, as a fallback when no real LLM is available."""
+        logger.info(
+            "规则 LLM action 审核开始 action=%s result=%s",
+            action,
+            json_for_log(
+                {
+                    "backend": result.backend,
+                    "ok": result.ok,
+                    "exit_code": result.exit_code,
+                    "tool_name": result.tool_name,
+                    "values": result.values,
+                    "error_summary": result.error_summary,
+                },
+                max_text_len=1000,
+            ),
+        )
         notes: list[str] = []
         has_anomaly = not result.ok
         severity = "info"
@@ -162,11 +274,12 @@ class RuleBasedLlmClient:
         suggested_action = "继续观察。"
         requires_confirmation = False
         should_continue = True
+        progress_complete: bool | None = None

         if not result.ok:
             severity = "medium"
             possible_reason = result.error_summary or "action 返回失败状态。"
-            suggested_action = "查看 action stderr/raw_output,确认参数、网络和目标服务状态。"
+            suggested_action = "查看 action 诊断日志、参数、网络和目标服务状态。"
             notes.append("硬规则检测到 action 执行失败。")
             should_continue = False

@@ -188,6 +301,25 @@ class RuleBasedLlmClient:
             notes.append("rollback-ip 失败需要人工处理。")
             should_continue = False

+        if action in ("poll-download-progress", "poll-upgrade-progress"):
+            progress_complete, progress_has_anomaly, progress_reason, progress_note = _analyze_progress_values(action, result.values)
+            if progress_note:
+                notes.append(progress_note)
+            if progress_has_anomaly:
+                has_anomaly = True
+                severity = "high"
+                possible_reason = progress_reason or possible_reason or "进度接口返回失败状态。"
+                suggested_action = "停止后续 action,检查下载/推送任务状态、PAM_HOME/PAM_NODE 日志和接口返回。"
+                should_continue = False
+            elif progress_complete:
+                has_anomaly = has_anomaly or False
+                suggested_action = "进度已完成,可以继续下一个 action。"
+                should_continue = should_continue and True
+            elif result.ok:
+                severity = severity if has_anomaly else "info"
+                suggested_action = "进度未完成,继续查询进度。"
+                should_continue = should_continue and True
+
         if result.values.get("PENDING_AGENT_CONFIRMATION"):
             has_anomaly = True
             severity = "high"
@@ -197,7 +329,7 @@ class RuleBasedLlmClient:
             notes.append("action 返回待人工确认标记。")
             should_continue = False

-        return LlmActionAnalysis(
+        analysis = LlmActionAnalysis(
             action=action,
             has_anomaly=has_anomaly,
             severity=severity,  # type: ignore[arg-type]
@@ -205,8 +337,11 @@ class RuleBasedLlmClient:
             suggested_action=suggested_action,
             requires_confirmation=requires_confirmation,
             should_continue=should_continue,
+            progress_complete=progress_complete,
             notes=notes,
         )
+        logger.info("规则 LLM action 审核完成 analysis=%s", json_for_log(asdict(analysis)))
+        return analysis

     def _extract_key_values(self, text: str) -> dict[str, str]:
         """Extract parameters written as KEY=VALUE."""
@@ -225,6 +360,7 @@ class RuleBasedLlmClient:
             "APP_NAME": r"(?:应用|应用名)\s*[::]?\s*([A-Za-z0-9_.-]+)",
             "MODULE_NAME": r"(?:模块|模块名)\s*[::]?\s*([A-Za-z0-9_.-]+)",
             "VERSION_NUMBER": r"(?:版本|版本号)\s*[::]?\s*([A-Za-z0-9_.-]+)",
+            "PARENT_VERSION_NUMBER": r"(?:继承版本|父版本|规则版本|继承哪个版本的规则)\s*[::]?\s*([A-Za-z0-9_.-]+)",
             "ZIP_FILE_PATH": r"(?:包|软件包|zip)\s*[::]?\s*([A-Za-z]:[\\/][^\s,;]+|/[^\s,;]+)",
         }
         params: dict[str, str] = {}
@@ -233,3 +369,54 @@ class RuleBasedLlmClient:
             if match:
                 params[key] = match.group(1)
         return params
+
+
+def _analyze_progress_values(action: str, values: dict[str, Any]) -> tuple[bool, bool, str, str]:
+    """Analyze progress fields; return completion flag, anomaly flag, reason, and note."""
+    step = _lower_value(values.get("STEP"))
+    status = _lower_value(values.get("STATUS"))
+    msg = _lower_value(values.get("MSG"))
+    message = _lower_value(values.get("MESSAGE"))
+    success = _lower_value(values.get("SUCCESS"))
+    finish = _lower_value(values.get("FINISH"))
+    code = _lower_value(values.get("CODE"))
+    rate = _lower_value(values.get("RATE_OF_PROGRESS"))
+
+    complete = False
+    if step == "done":
+        complete = True
+    elif status in ("completed", "complete", "done", "success", "succeeded"):
+        complete = True
+    elif success in ("true", "1", "yes"):
+        complete = True
+    elif action == "poll-upgrade-progress" and finish in ("true", "1", "yes"):
+        complete = True
+    elif msg == "success" and rate == "100" and (not code or code == "0"):
+        complete = True
+
+    if code and code != "0":
+        return complete, True, f"进度接口返回非 0 CODE: {code}", _progress_note(values)
+    combined = " ".join(item for item in (step, status, msg, message) if item)
+    if re.search(r"fail|error", combined, flags=re.IGNORECASE):
+        return complete, True, values.get("MESSAGE") or values.get("MSG") or values.get("STEP") or "进度接口返回失败状态", _progress_note(values)
+    return complete, False, "", _progress_note(values)
+
+
+def _progress_note(values: dict[str, Any]) -> str:
+    """Collect the core progress fields into a single note."""
+    parts = []
+    for key in ("RATE_OF_PROGRESS", "STEP", "MSG", "STATUS", "SUCCESS", "CODE", "FINISH", "MESSAGE"):
+        value = values.get(key)
+        if value not in (None, ""):
+            parts.append(f"{key}={value}")
+    return "当前进度: " + ", ".join(parts) if parts else "进度接口未返回明确进度字段。"
+
+
+def _lower_value(value: Any) -> str:
+    """Convert a field value to a lowercase string."""
+    return str(value).strip().lower() if value is not None else ""
+
+
+def _safe_action_kwargs(values: dict[str, str]) -> dict[str, str]:
+    """Filter extra single-action arguments so sensitive fields never reach the proposal."""
+    return {key: value for key, value in values.items() if key not in SENSITIVE_KEYS}
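The progress rule added in this hunk can be read as three checks in order: look for a completion signal, let a non-zero `CODE` flag an anomaly regardless of that signal, then fall back to scanning the text fields for `fail` or `error`. The sketch below is a reduced, standalone illustration of that ordering only. `classify_progress` and `_lower` are hypothetical names, and the `SUCCESS`, `FINISH`, and `MESSAGE` fields handled by the real helper are left out for brevity.

```python
import re
from typing import Any


def _lower(value: Any) -> str:
    """Normalize a field value to a stripped lowercase string."""
    return str(value).strip().lower() if value is not None else ""


def classify_progress(values: dict[str, Any]) -> tuple[bool, bool]:
    """Return (complete, anomaly) for one progress payload (illustrative only)."""
    step = _lower(values.get("STEP"))
    status = _lower(values.get("STATUS"))
    msg = _lower(values.get("MSG"))
    code = _lower(values.get("CODE"))
    rate = _lower(values.get("RATE_OF_PROGRESS"))

    # 1. Completion signals, checked independently of any error state.
    complete = (
        step == "done"
        or status in ("completed", "complete", "done", "success", "succeeded")
        or (msg == "success" and rate == "100" and code in ("", "0"))
    )
    # 2. A non-zero CODE is always an anomaly, even if a completion signal is present.
    if code and code != "0":
        return complete, True
    # 3. Otherwise scan the textual fields for failure keywords.
    combined = " ".join(item for item in (step, status, msg) if item)
    return complete, bool(re.search(r"fail|error", combined))


print(classify_progress({"STEP": "done", "CODE": 0}))                     # (True, False)
print(classify_progress({"STEP": "downloading", "RATE_OF_PROGRESS": 40}))  # (False, False)
print(classify_progress({"STATUS": "failed", "CODE": "500"}))             # (False, True)
```

Because the anomaly check runs after the completion check, a payload can report both at once; in the diff above the caller gives the anomaly branch priority and stops the workflow.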
260
pam_deploy_graph/llm/text_filter.py
Normal file
@@ -0,0 +1,260 @@
+"""Filtering utilities for LLM text output."""
+
+from __future__ import annotations
+
+from collections.abc import Iterable, Iterator
+import re
+
+OPEN_THINK_TAG = "<think>"
+CLOSE_THINK_TAG = "</think>"
+REASONING_START_RE = re.compile(
+    r"^\s*(?:[#>\-*]+\s*)*(?:\*\*)?"
+    r"(?:thinking process|thought process|reasoning process|chain of thought|internal reasoning|inner monologue|"
+    r"思考过程|推理过程|内部思考)"
+    r"(?:\*\*)?\s*(?:[::]|\s*$)",
+    flags=re.IGNORECASE,
+)
+FINAL_ANSWER_RE = re.compile(
+    r"^\s*(?:[#>\-*]+\s*)*(?:\*\*)?"
+    r"(?:final answer|final response|answer|response|最终答案|最终回答|正式回答|回答|回复|结论)"
+    r"(?:\*\*)?\s*[::]\s*",
+    flags=re.IGNORECASE | re.MULTILINE,
+)
+REASONING_LINE_RE = re.compile(
+    r"(thinking process|thought process|reasoning process|chain of thought|internal reasoning|inner monologue|"
+    r"analyze the request|determine the response|drafting the response|refining the response|"
+    r"user question|input json|role:|constraints:|requirements:|"
+    r"do not output|do not automatically|hidden analysis|forbidden tags|"
+    r"i need to|i should|i must|i will|i can|must ensure|should briefly|ensure no|keep it concise|"
+    r"思考过程|推理过程|内部思考|分析请求|确定回答|起草回答|优化回答|隐藏分析)",
+    flags=re.IGNORECASE,
+)
+MAX_REASONING_PREFIX_HOLD = 80
+
+
+def strip_thinking_text(text: str) -> str:
+    """Remove think tags, explicit reasoning sections, and their content from plain LLM text output."""
+    filter_ = ThinkingTextStreamFilter()
+    visible = filter_.feed(text) + filter_.finish()
+    return visible.strip()
+
+
+def filter_thinking_chunks(chunks: Iterable[str]) -> Iterator[str]:
+    """Remove reasoning content chunk by chunk so it cannot leak across chunk boundaries."""
+    filter_ = ThinkingTextStreamFilter()
+    for chunk in chunks:
+        visible = filter_.feed(str(chunk))
+        if visible:
+            yield visible
+    tail = filter_.finish()
+    if tail:
+        yield tail
+
+
+class ThinkingTextStreamFilter:
+    """Streaming filter that recognizes think tags and explicit reasoning sections across chunks."""
+
+    def __init__(self) -> None:
+        """Initialize the visible/hidden state and the pending buffer."""
+        self._pending = ""
+        self._inside_think = False
+        self._reasoning_filter = ExplicitReasoningStreamFilter()
+
+    def feed(self, chunk: str) -> str:
+        """Feed one text chunk and return the text that is currently safe to show."""
+        if not chunk:
+            return ""
+        self._pending += chunk
+        output: list[str] = []
+        while self._pending:
+            lowered = self._pending.lower()
+            if self._inside_think:
+                close_index = lowered.find(CLOSE_THINK_TAG)
+                if close_index >= 0:
+                    self._pending = self._pending[close_index + len(CLOSE_THINK_TAG) :]
+                    self._inside_think = False
+                    continue
+                keep = _longest_suffix_prefix(lowered, [CLOSE_THINK_TAG])
+                self._pending = self._pending[-keep:] if keep else ""
+                break
+
+            open_index = lowered.find(OPEN_THINK_TAG)
+            close_index = lowered.find(CLOSE_THINK_TAG)
+            if open_index >= 0 and (close_index < 0 or open_index < close_index):
+                output.append(self._pending[:open_index])
+                self._pending = self._pending[open_index + len(OPEN_THINK_TAG) :]
+                self._inside_think = True
+                continue
+            if close_index >= 0:
+                output.append(self._pending[:close_index])
+                self._pending = self._pending[close_index + len(CLOSE_THINK_TAG) :]
+                continue
+
+            keep = _longest_suffix_prefix(lowered, [OPEN_THINK_TAG, CLOSE_THINK_TAG])
+            if keep:
+                output.append(self._pending[:-keep])
+                self._pending = self._pending[-keep:]
+            else:
+                output.append(self._pending)
+                self._pending = ""
+            break
+        return self._reasoning_filter.feed("".join(output))
+
+    def finish(self) -> str:
+        """Finish filtering; discard unclosed think content and incomplete tags."""
+        if self._inside_think:
+            self._pending = ""
+            self._inside_think = False
+            return self._reasoning_filter.finish()
+        lowered = self._pending.lower()
+        if lowered in _tag_prefixes():
+            self._pending = ""
+            return self._reasoning_filter.finish()
+        tail = self._pending
+        self._pending = ""
+        return self._reasoning_filter.feed(tail) + self._reasoning_filter.finish()
+
+
+class ExplicitReasoningStreamFilter:
+    """Filter explicit reasoning sections emitted as `Thinking Process:` and similar."""
+
+    def __init__(self) -> None:
+        """Initialize the reasoning-section detection state."""
+        self._buffer = ""
+        self._mode = "undecided"
+
+    def feed(self, chunk: str) -> str:
+        """Feed text with think tags already removed; return the displayable content."""
+        if not chunk:
+            return ""
+        if self._mode == "pass":
+            return chunk
+        self._buffer += chunk
+        if self._mode == "suppress":
+            final_text = _extract_after_final_answer_marker(self._buffer)
+            if final_text is not None:
+                self._buffer = ""
+                self._mode = "pass"
+                return final_text
+            return ""
+        if _starts_with_reasoning_marker(self._buffer):
+            self._mode = "suppress"
+            final_text = _extract_after_final_answer_marker(self._buffer)
+            if final_text is not None:
+                self._buffer = ""
+                self._mode = "pass"
+                return final_text
+            return ""
+        if _could_be_reasoning_marker_prefix(self._buffer):
+            return ""
+        self._mode = "pass"
+        visible = self._buffer
+        self._buffer = ""
+        return visible
+
+    def finish(self) -> str:
+        """Finish filtering; emit the ordinary buffer or clean up a suppressed reasoning section."""
+        if not self._buffer:
+            return ""
+        if self._mode == "suppress" or _starts_with_reasoning_marker(self._buffer):
+            visible = _strip_leading_reasoning_section(self._buffer)
+        else:
+            visible = self._buffer
+        self._buffer = ""
+        self._mode = "pass"
+        return visible
+
+
+def _starts_with_reasoning_marker(text: str) -> bool:
+    """Return whether the first non-blank content is an explicit reasoning marker."""
+    return REASONING_START_RE.match(text) is not None
+
+
+def _could_be_reasoning_marker_prefix(text: str) -> bool:
+    """At stream start, decide whether the buffer could still be part of a reasoning marker."""
+    candidate = _normalize_marker_prefix(text)
+    if not candidate:
+        return True
+    markers = (
+        "thinking process",
+        "thought process",
+        "reasoning process",
+        "chain of thought",
+        "internal reasoning",
+        "inner monologue",
+        "思考过程",
+        "推理过程",
+        "内部思考",
+    )
+    return len(candidate) < MAX_REASONING_PREFIX_HOLD and any(marker.startswith(candidate) for marker in markers)
+
+
+def _normalize_marker_prefix(text: str) -> str:
+    """Normalize the start of the stream into a marker prefix that is easy to test."""
+    stripped = text.lstrip()
+    stripped = re.sub(r"^(?:[#>\-*]+\s*)+", "", stripped)
+    stripped = stripped.strip("*").strip()
+    return stripped.lower()
+
+
+def _extract_after_final_answer_marker(text: str) -> str | None:
+    """If a final-answer marker exists, return the text after it."""
+    matches = list(FINAL_ANSWER_RE.finditer(text))
+    if not matches:
+        return None
+    return text[matches[-1].end() :].strip()
+
+
+def _strip_leading_reasoning_section(text: str) -> str:
+    """Drop a reasoning section that opens with an explicit marker; keep the final text after it."""
+    final_text = _extract_after_final_answer_marker(text)
+    if final_text is not None:
+        return final_text
+    lines = text.splitlines()
+    first = _first_non_empty_line_index(lines)
+    if first is None or not _starts_with_reasoning_marker(lines[first]):
+        return text.strip()
+    last_reasoning = first
+    for index in range(first, len(lines)):
+        if _looks_like_reasoning_line(lines[index]):
+            last_reasoning = index
+    return "\n".join(lines[last_reasoning + 1 :]).strip()
+
+
+def _first_non_empty_line_index(lines: list[str]) -> int | None:
+    """Return the index of the first non-blank line."""
+    for index, line in enumerate(lines):
+        if line.strip():
+            return index
+    return None
+
+
+def _looks_like_reasoning_line(line: str) -> bool:
+    """Recognize common explicit reasoning lines."""
+    stripped = line.strip()
+    if not stripped:
+        return False
+    if _starts_with_reasoning_marker(stripped):
+        return True
+    if REASONING_LINE_RE.search(stripped):
+        return True
+    return bool(re.match(r"^\s*\d+\.\s*\*\*[^*]+(?:request|response|answer|constraints|过程|回答)[^*]*\*\*", stripped, flags=re.IGNORECASE))
+
+
+def _longest_suffix_prefix(text: str, targets: list[str]) -> int:
+    """Return the longest length at which the end of text matches a prefix of any target tag."""
+    best = 0
+    for target in targets:
+        max_len = min(len(text), len(target) - 1)
+        for length in range(1, max_len + 1):
+            if text.endswith(target[:length]):
+                best = max(best, length)
+    return best
+
+
+def _tag_prefixes() -> set[str]:
+    """Build every incomplete prefix of the think tags, used to drop a half-written tag at finish."""
+    prefixes = {""}
+    for tag in (OPEN_THINK_TAG, CLOSE_THINK_TAG):
+        prefixes.update(tag[:index] for index in range(1, len(tag)))
+    return prefixes
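The core of the streaming filter above is the hold-back step: when a chunk ends with something that might be the start of a tag (for example `<th`), that tail is kept in the buffer instead of being shown, and the decision is deferred until the next chunk arrives. The following is a minimal standalone sketch of that idea, not the module itself. `held_suffix` and `drop_think` are hypothetical names, matching is case-sensitive here, and the explicit `Thinking Process:` pass is omitted.

```python
from collections.abc import Iterable, Iterator

OPEN, CLOSE = "<think>", "</think>"


def held_suffix(text: str, tags: tuple[str, ...]) -> int:
    """Length of the longest tail of text that is a proper prefix of some tag."""
    best = 0
    for tag in tags:
        for length in range(1, min(len(text), len(tag) - 1) + 1):
            if text.endswith(tag[:length]):
                best = max(best, length)
    return best


def drop_think(chunks: Iterable[str]) -> Iterator[str]:
    """Yield only text outside think blocks, even when tags are split across chunks."""
    pending, hidden = "", False
    for chunk in chunks:
        pending += chunk
        out: list[str] = []
        while pending:
            tag = CLOSE if hidden else OPEN
            index = pending.find(tag)
            if index >= 0:
                if not hidden:
                    out.append(pending[:index])  # text before the opening tag is visible
                pending = pending[index + len(tag):]
                hidden = not hidden
                continue
            # No full tag yet: hold back a possible partial tag at the end.
            keep = held_suffix(pending, (tag,))
            if not hidden:
                out.append(pending[:len(pending) - keep])
            pending = pending[len(pending) - keep:]
            break
        visible = "".join(out)
        if visible:
            yield visible
    # Anything still pending here is an unclosed block or a half-written tag; drop it.


print("".join(drop_think(["Hi <th", "ink>secret</thi", "nk> there"])))  # Hi  there
```

Holding back at most `len(tag) - 1` characters keeps latency bounded: visible text is delayed only while it could still turn into a tag.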
150
pam_deploy_graph/logging_utils.py
Normal file
@@ -0,0 +1,150 @@
+"""Logging configuration and redaction utilities for the agent runtime."""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import re
+from dataclasses import asdict, is_dataclass
+from logging.handlers import TimedRotatingFileHandler
+from pathlib import Path
+from typing import Any
+
+from .constants import SENSITIVE_KEYS
+
+DEFAULT_LOG_FILE = Path("logs") / "pam_deploy_agent.log"
+DEFAULT_LOG_RETENTION_DAYS = 14
+LOG_FILE_ENV = "PAM_AGENT_LOG_FILE"
+LOG_LEVEL_ENV = "PAM_AGENT_LOG_LEVEL"
+LOG_RETENTION_DAYS_ENV = "PAM_AGENT_LOG_RETENTION_DAYS"
+_HANDLER_MARKER = "_pam_deploy_agent_handler"
+_SENSITIVE_NAME_PARTS = ("secret", "token", "authorization", "api_key", "apikey", "password")
+_ASSIGNMENT_PATTERN = re.compile(
+    r"(?i)\b(client_secret|mcp_client_secret|api_key|pam_llm_api_key|token|access_token|authorization|password)\b"
+    r"\s*([:=])\s*([^\s,;]+)"
+)
+_AUTH_BEARER_ASSIGNMENT_PATTERN = re.compile(r"(?i)\b(authorization)\b\s*([:=])\s*bearer\s+[^\s,;]+")
+_BEARER_PATTERN = re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+\-/=]+")
+
+
+def configure_logging(
+    log_file: str | Path | None = None,
+    level: str | int | None = None,
+    retention_days: int | str | None = None,
+) -> Path:
+    """Configure the agent's daily-rotating file log; repeated calls do not add duplicate handlers."""
+    actual_path = Path(log_file or os.getenv(LOG_FILE_ENV) or DEFAULT_LOG_FILE)
+    actual_path.parent.mkdir(parents=True, exist_ok=True)
+    actual_level = _resolve_level(level or os.getenv(LOG_LEVEL_ENV) or "INFO")
+    actual_retention_days = _resolve_retention_days(
+        retention_days if retention_days is not None else os.getenv(LOG_RETENTION_DAYS_ENV),
+    )
+
+    package_logger = logging.getLogger("pam_deploy_graph")
+    package_logger.setLevel(actual_level)
+    package_logger.propagate = False
+
+    marker = str(actual_path.resolve())
+    for handler in list(package_logger.handlers):
+        if getattr(handler, _HANDLER_MARKER, "") == marker:
+            if isinstance(handler, TimedRotatingFileHandler):
+                handler.setLevel(actual_level)
+                handler.backupCount = actual_retention_days
+                return actual_path
+            package_logger.removeHandler(handler)
+            handler.close()
+            break
+
+    handler = TimedRotatingFileHandler(
+        actual_path,
+        when="midnight",
+        interval=1,
+        backupCount=actual_retention_days,
+        encoding="utf-8",
+    )
+    setattr(handler, _HANDLER_MARKER, marker)
+    handler.setLevel(actual_level)
+    handler.setFormatter(
+        logging.Formatter(
+            fmt="%(asctime)s %(levelname)s [%(name)s] %(message)s",
+            datefmt="%Y-%m-%d %H:%M:%S",
+        )
+    )
+    package_logger.addHandler(handler)
+    package_logger.info(
+        "日志已初始化 path=%s level=%s rotation=daily retention_days=%s",
+        actual_path,
+        logging.getLevelName(actual_level),
+        actual_retention_days,
+    )
+    return actual_path
+
+
+def redact_for_log(value: Any, *, max_text_len: int = 1200) -> Any:
+    """Recursively redact and truncate an object for logging so secrets, tokens, and full long texts stay out of the log."""
+    if is_dataclass(value) and not isinstance(value, type):
+        return redact_for_log(asdict(value), max_text_len=max_text_len)
+    if isinstance(value, dict):
+        redacted: dict[str, Any] = {}
+        for key, item in value.items():
+            text_key = str(key)
+            if _is_sensitive_key(text_key):
+                redacted[text_key] = "***"
+            else:
+                redacted[text_key] = redact_for_log(item, max_text_len=max_text_len)
+        return redacted
+    if isinstance(value, (list, tuple, set)):
+        return [redact_for_log(item, max_text_len=max_text_len) for item in value]
+    if isinstance(value, str):
+        return _truncate(_redact_string(value), max_text_len)
+    if value is None or isinstance(value, (bool, int, float)):
+        return value
+    return _truncate(_redact_string(str(value)), max_text_len)
+
+
+def json_for_log(value: Any, *, max_text_len: int = 1200) -> str:
+    """Redact an object and serialize it to JSON suitable for a single log line."""
+    redacted = redact_for_log(value, max_text_len=max_text_len)
+    return json.dumps(redacted, ensure_ascii=False, default=str, sort_keys=True)
+
+
|
def _resolve_level(value: str | int) -> int:
|
||||||
|
"""解析日志级别字符串,非法值降级为 INFO。"""
|
||||||
|
if isinstance(value, int):
|
||||||
|
return value
|
||||||
|
resolved = getattr(logging, str(value).upper(), logging.INFO)
|
||||||
|
return resolved if isinstance(resolved, int) else logging.INFO
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_retention_days(value: int | str | None) -> int:
|
||||||
|
"""解析日志保留天数,非法值使用默认值。"""
|
||||||
|
if value in (None, ""):
|
||||||
|
return DEFAULT_LOG_RETENTION_DAYS
|
||||||
|
try:
|
||||||
|
days = int(str(value).strip())
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
return DEFAULT_LOG_RETENTION_DAYS
|
||||||
|
return max(days, 0)
|
||||||
|
|
||||||
|
|
||||||
|
def _is_sensitive_key(key: str) -> bool:
|
||||||
|
"""判断字段名是否应脱敏。"""
|
||||||
|
if key in SENSITIVE_KEYS:
|
||||||
|
return True
|
||||||
|
normalized = key.lower().replace("-", "_")
|
||||||
|
return any(part in normalized for part in _SENSITIVE_NAME_PARTS)
|
||||||
|
|
||||||
|
|
||||||
|
def _truncate(value: str, limit: int) -> str:
|
||||||
|
"""截断过长字符串。"""
|
||||||
|
if len(value) <= limit:
|
||||||
|
return value
|
||||||
|
return value[:limit] + "...[已截断]"
|
||||||
|
|
||||||
|
|
||||||
|
def _redact_string(value: str) -> str:
|
||||||
|
"""脱敏字符串中的常见 KEY=VALUE 和 Bearer token 片段。"""
|
||||||
|
value = _AUTH_BEARER_ASSIGNMENT_PATTERN.sub(lambda match: f"{match.group(1)}{match.group(2)}***", value)
|
||||||
|
value = _ASSIGNMENT_PATTERN.sub(lambda match: f"{match.group(1)}{match.group(2)}***", value)
|
||||||
|
return _BEARER_PATTERN.sub(lambda match: f"{match.group(1)}***", value)
|
||||||
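The fallback behaviour of the two parsing helpers above can be checked in isolation. The following is a minimal sketch, not the project's module: the function names drop the leading underscore, and the value of `DEFAULT_LOG_RETENTION_DAYS` is an assumption, because the real constant is defined outside this part of the diff.

```python
import logging
from typing import Optional, Union

# Assumed value for illustration only; the real constant is not shown in this diff.
DEFAULT_LOG_RETENTION_DAYS = 7


def resolve_level(value: Union[str, int]) -> int:
    """Same logic as _resolve_level: unknown or non-integer names fall back to INFO."""
    if isinstance(value, int):
        return value
    resolved = getattr(logging, str(value).upper(), logging.INFO)
    return resolved if isinstance(resolved, int) else logging.INFO


def resolve_retention_days(value: Union[int, str, None]) -> int:
    """Same logic as _resolve_retention_days: bad input uses the default, negatives clamp to 0."""
    if value in (None, ""):
        return DEFAULT_LOG_RETENTION_DAYS
    try:
        days = int(str(value).strip())
    except (TypeError, ValueError):
        return DEFAULT_LOG_RETENTION_DAYS
    return max(days, 0)


if __name__ == "__main__":
    print(resolve_level("debug"), resolve_level("no-such-level"))
    print(resolve_retention_days(" 14 "), resolve_retention_days("abc"), resolve_retention_days(-3))
```

The `isinstance(resolved, int)` guard matters because some upper-case attributes of `logging`, such as `BASIC_FORMAT`, are strings rather than level numbers.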
@@ -7,6 +7,7 @@ adapt a callable or SDK session to this interface, so business code is not bound to a specific
 from __future__ import annotations

 import json
+import logging
 import time
 import urllib.parse
 import urllib.request
@@ -16,6 +17,10 @@ from dataclasses import dataclass, field
 from pathlib import Path
 from typing import Any

+from .logging_utils import json_for_log
+
+logger = logging.getLogger(__name__)
+

 @dataclass(frozen=True)
 class McpAuthConfig:
@@ -111,10 +116,21 @@ class McpClientConfig:

 def load_mcp_client_config(path: str | Path) -> McpClientConfig:
     """Read the MCP client JSON config file."""
+    logger.info("Reading MCP client config path=%s", path)
     payload = json.loads(Path(path).read_text(encoding="utf-8"))
     if not isinstance(payload, dict):
         raise ValueError("MCP client config must be a JSON object")
-    return McpClientConfig.from_mapping(payload)
+    config = McpClientConfig.from_mapping(payload)
+    logger.info(
+        "MCP client config loaded path=%s transport=%s server_url=%s command=%s has_auth=%s tool_names=%s",
+        path,
+        config.transport,
+        config.server_url,
+        config.command,
+        config.auth is not None,
+        json_for_log(config.tool_names),
+    )
+    return config


 class FunctionMcpToolClient:
@@ -126,6 +142,7 @@ class FunctionMcpToolClient:

     def call_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
         """Call the underlying function and return the raw result."""
+        logger.info("Function MCP tool call tool=%s arguments=%s", tool_name, json_for_log(arguments))
         return self.caller(tool_name, arguments)


@@ -147,13 +164,19 @@ class SessionMcpToolClient:

     def call_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
         """Call the SDK session and normalize the SDK return value."""
+        logger.info("Session MCP tool call started tool=%s arguments=%s", tool_name, json_for_log(arguments))
         result = self.session.call_tool(tool_name, arguments)
-        return normalize_mcp_sdk_result(result)
+        normalized = normalize_mcp_sdk_result(result)
+        logger.info("Session MCP tool call finished tool=%s result=%s", tool_name, json_for_log(normalized, max_text_len=1600))
+        return normalized

     def list_tools(self) -> list[str]:
         """Get the list of tool names from the SDK session."""
+        logger.info("Session MCP list_tools started")
         result = self.session.list_tools()
-        return normalize_mcp_tool_list(result)
+        tools = normalize_mcp_tool_list(result)
+        logger.info("Session MCP list_tools finished tools=%s", tools)
+        return tools


 class StdioMcpToolClient:
@@ -176,9 +199,19 @@ class StdioMcpToolClient:
         self.env = env
         self.cwd = cwd or None
         self.timeout_seconds = timeout_seconds
+        logger.info(
+            "stdio MCP client initialized command=%s args=%s cwd=%s env_keys=%s timeout=%s",
+            self.command,
+            self.args,
+            self.cwd or "",
+            sorted((self.env or {}).keys()),
+            self.timeout_seconds,
+        )

     def call_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
         """Create a one-off MCP stdio session, call the tool, then close the session."""
+        started_at = time.perf_counter()
+        logger.info("stdio MCP tool call started tool=%s arguments=%s", tool_name, json_for_log(arguments))
         try:
             import anyio
             from mcp import ClientSession
@@ -203,10 +236,23 @@ class StdioMcpToolClient:
                     )
                     return normalize_mcp_sdk_result(result)

-        return anyio.run(call_once)
+        try:
+            result = anyio.run(call_once)
+        except Exception:
+            logger.exception("stdio MCP tool call failed tool=%s duration_ms=%s", tool_name, int((time.perf_counter() - started_at) * 1000))
+            raise
+        logger.info(
+            "stdio MCP tool call finished tool=%s duration_ms=%s result=%s",
+            tool_name,
+            int((time.perf_counter() - started_at) * 1000),
+            json_for_log(result, max_text_len=1600),
+        )
+        return result

     def list_tools(self) -> list[str]:
         """Create a one-off MCP stdio session and read the tool list exposed by the server."""
+        started_at = time.perf_counter()
+        logger.info("stdio MCP list_tools started")
         try:
             import anyio
             from mcp import ClientSession
@@ -227,7 +273,13 @@ class StdioMcpToolClient:
                     result = await session.list_tools()
                     return normalize_mcp_tool_list(result)

-        return anyio.run(list_once)
+        try:
+            tools = anyio.run(list_once)
+        except Exception:
+            logger.exception("stdio MCP list_tools failed duration_ms=%s", int((time.perf_counter() - started_at) * 1000))
+            raise
+        logger.info("stdio MCP list_tools finished duration_ms=%s tools=%s", int((time.perf_counter() - started_at) * 1000), tools)
+        return tools


 class OAuthTokenProvider:
@@ -243,6 +295,12 @@ class OAuthTokenProvider:
         self.timeout_seconds = timeout_seconds
         self._token = ""
         self._expires_at = 0.0
+        logger.info(
+            "MCP OAuth token provider initialized token_url=%s client_id=%s timeout=%s",
+            self.config.token_url,
+            self.config.client_id,
+            self.timeout_seconds,
+        )

     def authorization_headers(self) -> dict[str, str]:
         """Return request headers carrying the token."""
@@ -255,7 +313,9 @@ class OAuthTokenProvider:
         """Get a usable token, reusing the cached one while it has not expired."""
         now = time.time()
         if self._token and now < self._expires_at:
+            logger.info("MCP auth token served from cache expires_in_sec=%s", int(self._expires_at - now))
             return self._token
+        logger.info("MCP auth token refresh started token_url=%s client_id=%s", self.config.token_url, self.config.client_id)
         payload = {
             "grant_type": self.config.grant_type,
             "client_id": self.config.client_id,
@@ -278,6 +338,7 @@ class OAuthTokenProvider:
         expires_in = _safe_float(result.get(self.config.expires_in_field), 3600)
         self._token = token
         self._expires_at = now + max(expires_in - 60, 1)
+        logger.info("MCP auth token refresh finished expires_in=%s cached_until=%s", expires_in, int(self._expires_at))
         return token


@@ -305,14 +366,23 @@ class HttpMcpToolClient:
         self.auth_provider = auth_provider
         self.timeout_seconds = timeout_seconds
         self.sse_read_timeout_seconds = sse_read_timeout_seconds
+        logger.info(
+            "HTTP MCP client initialized url=%s transport=%s has_auth=%s headers=%s timeout=%s sse_read_timeout=%s",
+            self.url,
+            self.transport,
+            self.auth_provider is not None,
+            json_for_log(self.headers),
+            self.timeout_seconds,
+            self.sse_read_timeout_seconds,
+        )

     def call_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
         """Connect to the MCP server, call the tool, then close the session."""
-        return self._run_session(lambda session: session.call_tool(tool_name, arguments))
+        return self._run_session(lambda session: session.call_tool(tool_name, arguments), operation_name=f"call_tool:{tool_name}", arguments=arguments)

     def list_tools(self) -> list[str]:
         """Connect to the MCP server and read the tool names it exposes."""
-        return self._run_session(lambda session: session.list_tools(), normalize_tools=True)
+        return self._run_session(lambda session: session.list_tools(), normalize_tools=True, operation_name="list_tools")

     def _build_headers(self) -> dict[str, str]:
         """Merge the static headers with the dynamic auth token."""
@@ -321,8 +391,23 @@ class HttpMcpToolClient:
             headers.update(self.auth_provider.authorization_headers())
         return headers

-    def _run_session(self, operation: Callable[[Any], Any], *, normalize_tools: bool = False) -> Any:
+    def _run_session(
+        self,
+        operation: Callable[[Any], Any],
+        *,
+        normalize_tools: bool = False,
+        operation_name: str = "operation",
+        arguments: dict[str, Any] | None = None,
+    ) -> Any:
         """Create a one-off HTTP/SSE MCP session and run the given operation."""
+        started_at = time.perf_counter()
+        logger.info(
+            "HTTP MCP session started operation=%s url=%s transport=%s arguments=%s",
+            operation_name,
+            self.url,
+            self.transport,
+            json_for_log(arguments or {}),
+        )
         try:
             import anyio
             from mcp import ClientSession
@@ -357,7 +442,24 @@ class HttpMcpToolClient:
                     result = await operation(session)
                     return normalize_mcp_tool_list(result) if normalize_tools else normalize_mcp_sdk_result(result)

-        return anyio.run(call_once)
+        try:
+            result = anyio.run(call_once)
+        except Exception:
+            logger.exception(
+                "HTTP MCP session failed operation=%s url=%s transport=%s duration_ms=%s",
+                operation_name,
+                self.url,
+                self.transport,
+                int((time.perf_counter() - started_at) * 1000),
+            )
+            raise
+        logger.info(
+            "HTTP MCP session finished operation=%s duration_ms=%s result=%s",
+            operation_name,
+            int((time.perf_counter() - started_at) * 1000),
+            json_for_log(result, max_text_len=1600),
+        )
+        return result


 def normalize_mcp_sdk_result(result: Any) -> Any:
|||||||
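The token caching that the `OAuthTokenProvider` hunks add logging around reduces to two rules: reuse the token while `now < _expires_at`, and on refresh set the expiry to `now + max(expires_in - 60, 1)`. Below is a minimal sketch of just that logic. The class name `CachedTokenSketch`, the injected `fetch` callable, and the injected `clock` are hypothetical names introduced for illustration; the real provider issues an HTTP request whose details are not shown in this diff.

```python
import time
from typing import Callable, Tuple


class CachedTokenSketch:
    """Hypothetical stand-in for the cache logic visible in OAuthTokenProvider."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch  # assumed to return (token, expires_in_seconds)
        self._clock = clock  # injectable so the expiry rule can be tested
        self._token = ""
        self._expires_at = 0.0

    def get_token(self) -> str:
        now = self._clock()
        if self._token and now < self._expires_at:
            return self._token  # cache hit: no refresh needed
        token, expires_in = self._fetch()
        self._token = token
        # Refresh 60 seconds early, but always cache for at least 1 second.
        self._expires_at = now + max(expires_in - 60, 1)
        return token
```

The 60-second margin means a token with `expires_in=120` is reused for 60 seconds and refreshed after that, so a request started just before expiry is less likely to carry a stale token.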
@@ -2,8 +2,10 @@

 from __future__ import annotations

+import logging
 from pathlib import Path

+from .logging_utils import json_for_log
 from .mcp_client import (
     HttpMcpToolClient,
     McpClientConfig,
@@ -13,16 +15,36 @@ from .mcp_client import (
 )
 from .mcp_runner import McpActionRunner

+logger = logging.getLogger(__name__)
+

 def build_mcp_runner_from_config(path: str | Path) -> McpActionRunner:
     """Read the MCP config file and build a runner the Agent can use directly."""
+    logger.info("Building MCP runner config_path=%s", path)
     config = load_mcp_client_config(path)
     client = build_mcp_client(config)
-    return McpActionRunner(client=client, tool_names=config.tool_names or None)
+    runner = McpActionRunner(client=client, tool_names=config.tool_names or None)
+    logger.info(
+        "MCP runner built config_path=%s transport=%s server_url=%s client=%s tool_names=%s",
+        path,
+        config.transport,
+        config.server_url,
+        type(client).__name__,
+        json_for_log(config.tool_names),
+    )
+    return runner


 def build_mcp_client(config: McpClientConfig):
     """Create an MCP client according to the transport type."""
+    logger.info(
+        "Building MCP client transport=%s server_url=%s command=%s has_auth=%s headers=%s",
+        config.transport,
+        config.server_url,
+        config.command,
+        config.auth is not None,
+        json_for_log(config.headers),
+    )
     if config.transport == "stdio":
         return StdioMcpToolClient(
             command=config.command,
@@ -2,11 +2,15 @@

 from __future__ import annotations

+import logging
 from typing import Any, Protocol

+from .logging_utils import json_for_log
 from .models import ActionResult
 from .output_parser import parse_mcp_result

+logger = logging.getLogger(__name__)
+

 class McpToolClient(Protocol):
     """Minimal synchronous interface that an MCP tool client must implement."""
@@ -46,6 +50,11 @@ class McpActionRunner:
         self.client = client
         self.tool_names = tool_names or {}
         self._discovered_tools: list[str] | None = None
+        logger.info(
+            "MCP action runner initialized client=%s explicit_tool_names=%s",
+            type(client).__name__ if client else "",
+            json_for_log(self.tool_names),
+        )

     def run(
         self,
@@ -70,16 +79,34 @@ class McpActionRunner:
             node_url=node_url,
             stop_first=stop_first,
         )
+        logger.info(
+            "MCP action call started action=%s tool=%s arguments=%s",
+            action,
+            tool_name,
+            json_for_log(arguments),
+        )
         try:
             payload = self.client.call_tool(tool_name, arguments)
         except Exception as exc:  # pragma: no cover - defensive exception wrapping
+            logger.exception("MCP action call raised action=%s tool=%s", action, tool_name)
             return parse_mcp_result(action, {}, ok=False, tool_name=tool_name, error=str(exc))
-        return parse_mcp_result(action, payload, ok=True, tool_name=tool_name)
+        logger.info("MCP action raw result action=%s tool=%s payload=%s", action, tool_name, json_for_log(payload, max_text_len=1600))
+        result = parse_mcp_result(action, payload, ok=True, tool_name=tool_name)
+        logger.info(
+            "MCP action parsed action=%s tool=%s ok=%s values=%s error=%s",
+            action,
+            tool_name,
+            result.ok,
+            json_for_log(result.values),
+            result.error_summary,
+        )
+        return result

     def _resolve_tool_name(self, action: str) -> str:
         """Resolve the tool name from the explicit mapping, auto-discovered server tools, and the default convention."""
         explicit = self.tool_names.get(action)
         if explicit:
+            logger.info("MCP tool resolved from explicit mapping action=%s tool=%s", action, explicit)
             return explicit

         discovered = self._list_discovered_tools()
@@ -89,12 +116,14 @@ class McpActionRunner:
             for candidate in candidates:
                 matched = by_lower.get(candidate.lower())
                 if matched:
+                    logger.info("MCP tool auto-matched action=%s tool=%s candidates=%s", action, matched, candidates)
                     return matched
             available = ", ".join(discovered)
             raise ValueError(f"MCP server exposes no tool for action: {action}; discovered: {available}")

         fallback = DEFAULT_NODE_MCP_TOOLS.get(action)
         if fallback:
+            logger.info("MCP tool resolved from default convention action=%s tool=%s", action, fallback)
             return fallback
         raise ValueError(f"action has no MCP tool mapping: {action}")

@@ -108,7 +137,9 @@ class McpActionRunner:
         try:
             self._discovered_tools = list(self.client.list_tools())
         except Exception:
+            logger.exception("MCP tool auto-discovery failed; falling back to the default tool-name convention")
             self._discovered_tools = []
+        logger.info("MCP tool auto-discovery finished tools=%s", self._discovered_tools)
         return self._discovered_tools

     def _build_arguments(
@@ -138,6 +169,8 @@ class McpActionRunner:
             arguments["hashCode"] = hash_code
         if node_url:
             arguments["nodeUrl"] = node_url
+        if action == "create-download-task" and params.get("PARENT_VERSION_NUMBER"):
+            arguments["parentVersionNumber"] = params.get("PARENT_VERSION_NUMBER")
         if action == "rollback-ip":
             arguments["stopFirst"] = stop_first
         return {key: value for key, value in arguments.items() if value not in (None, "")}
|||||||
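The resolution order that `_resolve_tool_name` now logs at each step (explicit mapping, then case-insensitive match against discovered tools, then the default convention) can be sketched as a standalone function. This is an illustration under assumptions: `DEFAULT_TOOLS` is a hypothetical stand-in for `DEFAULT_NODE_MCP_TOOLS`, and the candidate spellings are guessed, because the lines that build `candidates` fall outside the hunks shown.

```python
from typing import Dict, List

# Hypothetical default mapping; the real DEFAULT_NODE_MCP_TOOLS is not shown in this diff.
DEFAULT_TOOLS: Dict[str, str] = {"upgrade-ip": "upgrade_ip"}


def resolve_tool_name(action: str, explicit: Dict[str, str], discovered: List[str]) -> str:
    """Resolve an action to a tool name: explicit mapping, discovery, then defaults."""
    mapped = explicit.get(action)
    if mapped:
        return mapped
    if discovered:
        by_lower = {name.lower(): name for name in discovered}
        # Assumed candidate spellings; the real candidate list is not visible in the diff.
        candidates = [action, action.replace("-", "_")]
        for candidate in candidates:
            matched = by_lower.get(candidate.lower())
            if matched:
                return matched
        # Discovery succeeded but nothing matched: fail instead of guessing a default.
        available = ", ".join(discovered)
        raise ValueError(f"no tool for action: {action}; discovered: {available}")
    fallback = DEFAULT_TOOLS.get(action)
    if fallback:
        return fallback
    raise ValueError(f"action has no MCP tool mapping: {action}")
```

The default convention is only consulted when discovery returns nothing, which is also the state `_list_discovered_tools` leaves behind when `list_tools` raises.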
@@ -11,6 +11,7 @@ IntentName = Literal["deploy", "show_usage", "preview", "query_node_ips", "rollb
 ModePreference = Literal["MCP", "API脚本", "未指定"]
 StrategyPreference = Literal["hybrid_node_mcp", "script_only", "fake", "未指定"]
 ActionAnalysisSeverity = Literal["info", "low", "medium", "high"]
+ActionRiskLevel = Literal["low", "medium", "high"]


 @dataclass(slots=True)
@@ -100,9 +101,22 @@ class LlmActionAnalysis:
     suggested_action: str = ""
     requires_confirmation: bool = False
     should_continue: bool = True
+    progress_complete: bool | None = None
     notes: list[str] = field(default_factory=list)


+@dataclass(slots=True)
+class LlmSingleActionProposal:
+    """Structured LLM proposal for a single action call."""
+
+    action: str
+    ip: str = ""
+    kwargs: dict[str, Any] = field(default_factory=dict)
+    reason: str = ""
+    risk_level: ActionRiskLevel = "medium"
+    requires_confirmation: bool = True
+
+
 @dataclass(slots=True)
 class AgentState:
     """Full state of one deployment run; serializable to a checkpoint."""
@@ -131,3 +145,4 @@ class AgentState:
     pause_reason: str = ""
     review_context: dict[str, Any] = field(default_factory=dict)
     events: list[dict[str, Any]] = field(default_factory=list)
+    poll_attempts: dict[str, int] = field(default_factory=dict)
@@ -2,13 +2,18 @@

 from __future__ import annotations

+import logging
 import subprocess
+import time
 from pathlib import Path
 from typing import Any

+from .logging_utils import json_for_log
 from .models import ActionResult
 from .output_parser import parse_script_result

+logger = logging.getLogger(__name__)
+

 class ScriptActionRunner:
     """Script action runner: builds the command, runs the script, and parses the result."""
@@ -40,6 +45,18 @@ class ScriptActionRunner:
             stop_first=stop_first,
             trace_file_path=trace_file_path,
         )
+        started_at = time.perf_counter()
+        logger.info(
+            "Script action started action=%s command=%s cwd=%s config=%s ip=%s trace=%s timeout=%s",
+            action,
+            json_for_log(command),
+            self.script_base_dir,
+            config_path,
+            ip or "",
+            trace_file_path or "",
+            timeout_sec,
+        )
+        try:
             completed = subprocess.run(
                 command,
                 cwd=str(self.script_base_dir),
@@ -48,7 +65,19 @@ class ScriptActionRunner:
                 timeout=timeout_sec,
                 check=False,
             )
-        return parse_script_result(
+        except Exception:
+            logger.exception("Script action raised action=%s command=%s cwd=%s", action, json_for_log(command), self.script_base_dir)
+            raise
+        duration_ms = int((time.perf_counter() - started_at) * 1000)
+        logger.info(
+            "Script action finished action=%s exit_code=%s duration_ms=%s stdout=%s stderr=%s",
+            action,
+            completed.returncode,
+            duration_ms,
+            json_for_log(completed.stdout, max_text_len=1200),
+            json_for_log(completed.stderr, max_text_len=1200),
+        )
+        result = parse_script_result(
             action=action,
             stdout=completed.stdout,
             stderr=completed.stderr,
@@ -56,6 +85,14 @@ class ScriptActionRunner:
             backend="script",
             tool_name=script_entry,
         )
+        logger.info(
+            "Script action parsed action=%s ok=%s values=%s error=%s",
+            action,
+            result.ok,
+            json_for_log(result.values),
+            result.error_summary,
+        )
+        return result

     def build_command(
         self,
|||||||
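The script runner hunks follow one pattern: record a start time, log the command, run it, log and re-raise on an exception, then log the exit code and duration. A self-contained sketch of that pattern follows. The function name `run_logged` and the `capture_output=True, text=True` arguments are assumptions for illustration, since the corresponding `subprocess.run` arguments fall between the hunks shown; the real runner also redacts values through `json_for_log`, which is omitted here.

```python
import logging
import subprocess
import sys
import time
from typing import List

logger = logging.getLogger("script_runner_sketch")


def run_logged(command: List[str], timeout_sec: float = 30.0) -> subprocess.CompletedProcess:
    """Run a command and log start, failure, and completion with the elapsed time."""
    started_at = time.perf_counter()
    logger.info("script started command=%s timeout=%s", command, timeout_sec)
    try:
        completed = subprocess.run(
            command,
            capture_output=True,  # assumed; not visible in the diff
            text=True,            # assumed; not visible in the diff
            timeout=timeout_sec,
            check=False,          # a non-zero exit code is reported, not raised
        )
    except Exception:
        # Timeouts and launch failures land here; log with traceback, then re-raise.
        logger.exception("script raised command=%s", command)
        raise
    duration_ms = int((time.perf_counter() - started_at) * 1000)
    logger.info("script finished exit_code=%s duration_ms=%s", completed.returncode, duration_ms)
    return completed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    done = run_logged([sys.executable, "-c", "print('ok')"])
    print(done.returncode, done.stdout.strip())
```

Because `check=False` is kept, a failing script still reaches the completion log line with its exit code, and only launch errors or timeouts take the exception path.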
@@ -70,7 +70,7 @@ ACTION_TOOL_SPECS: dict[str, ActionToolSpec] = {
         name="poll_download_progress",
         action="poll-download-progress",
         scope="global",
-        description="Poll cloud download task progress.",
+        description="Query cloud download task progress once; whether to keep querying is decided by the Agent workflow and the LLM review.",
         risk_level="medium",
     ),
     "upgrade-ip": ActionToolSpec(
@@ -85,7 +85,7 @@ ACTION_TOOL_SPECS: dict[str, ActionToolSpec] = {
         name="poll_upgrade_progress",
         action="poll-upgrade-progress",
         scope="ip",
-        description="Poll the upgrade progress of a single workstation.",
+        description="Query the upgrade progress of a single workstation once; whether to keep querying is decided by the Agent workflow and the LLM review.",
         risk_level="medium",
     ),
     "start-ip": ActionToolSpec(
@@ -9,10 +9,19 @@
   "suggested_action": "...",
   "requires_confirmation": false,
   "should_continue": true,
+  "progress_complete": null,
   "notes": ["..."]
 }

 Requirements:
 - `should_continue` must be given explicitly: true when there is no problem; false when there is a problem that needs human judgment.
-- If exit_code is non-zero, ok=false, verify-ip SUCCESS=false, or pending_confirmation appears, flag an anomaly.
+- If exit_code is non-zero, ok=false, verify-ip SUCCESS=false, or a legacy pending_confirmation appears, flag an anomaly.
+- For `poll-download-progress` and `poll-upgrade-progress`, `progress_complete` must be determined: true when complete; false when incomplete but healthy; it may be null for non-progress actions.
+- When a progress action is incomplete but healthy, set `has_anomaly=false`, `should_continue=true`, `progress_complete=false`, and suggest continuing to query progress.
+- To decide whether a progress action is complete, check first for `STEP=DONE`, `STATUS=completed/done/success`, `SUCCESS=true`, `FINISH=true`, or `MSG=success` together with `RATE_OF_PROGRESS=100` and `CODE` empty or 0.
+- If a progress action shows a non-zero `CODE`, or `STEP/MSG/STATUS/MESSAGE` contains fail/error, flag an anomaly and set `should_continue=false`.
+- Judge mainly from the structured fields `ok`, `exit_code`, `values`, and `error_summary`; the full runtime state summary is not provided, to avoid being misled by historical state.
+- `verify-ip SUCCESS=false` is re-checked by the runtime according to configuration; a single review should still state that the current health check did not pass.
+- Treat `diagnostic_log` as anomaly-diagnosis context only when it is present in the input.
+- Normal script progress logs are not evidence of an error; do not flag an anomaly just because a log line came from stderr.
 - Do not output secrets, tokens, Authorization values, or full raw log text.
|||||||
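The progress rules in the prompt above are written for an LLM, but they are concrete enough to express as a deterministic check, which can help when writing test fixtures. The following is a rough sketch only. The function name `judge_progress` and its return shape are hypothetical, it is not the project's rule fallback, and returning `None` for `progress_complete` on an anomaly is an assumption, because the prompt does not say what that field should hold in that case.

```python
from typing import Dict, Optional, Tuple


def judge_progress(values: Dict[str, object]) -> Tuple[bool, Optional[bool]]:
    """Return (has_anomaly, progress_complete) for one progress query result."""

    def norm(key: str) -> str:
        return str(values.get(key, "")).strip().lower()

    code = norm("CODE")
    texts = [norm(key) for key in ("STEP", "MSG", "STATUS", "MESSAGE")]
    # Anomaly: non-zero CODE, or fail/error appearing in any status-like field.
    if code not in ("", "0") or any("fail" in text or "error" in text for text in texts):
        return True, None  # assumed: completion is left undecided on an anomaly
    complete = (
        norm("STEP") == "done"
        or norm("STATUS") in ("completed", "done", "success")
        or norm("SUCCESS") == "true"
        or norm("FINISH") == "true"
        or (norm("MSG") == "success" and norm("RATE_OF_PROGRESS") == "100")
    )
    return False, complete
```

An incomplete but healthy result maps to `(False, False)`, which corresponds to the `has_anomaly=false`, `progress_complete=false` case in which the workflow keeps querying.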
@@ -18,11 +18,13 @@ PARAMS = {
     "MODULE_NAME": "Node",
     "VERSION_NUMBER": "2.0.5",
     "ZIP_FILE_PATH": "C:/pkg.zip",
+    "VERIFY_INTERVAL_SEC": 0,
+    "VERIFY_MAX_ATTEMPTS": 2,
 }


 class BlockingReviewLlmClient:
-    def analyze_action_result(self, *, action, result, state_summary):
+    def analyze_action_result(self, *, action, result):
         return LlmActionAnalysis(
             action=action,
             has_anomaly=True,
@@ -35,11 +37,84 @@ class BlockingReviewLlmClient:
         )


+class BlockingOnceReviewLlmClient:
+    def __init__(self, blocked_action: str = "get-token") -> None:
+        self.blocked_action = blocked_action
+        self.blocked = False
+
+    def analyze_action_result(self, *, action, result):
+        if action == self.blocked_action and not self.blocked:
+            self.blocked = True
+            return LlmActionAnalysis(
+                action=action,
+                has_anomaly=True,
+                severity="high",
+                possible_reason="review blocked once",
+                suggested_action="fix then retry current action",
+                requires_confirmation=True,
+                should_continue=False,
+            )
+        return LlmActionAnalysis(action=action)
+
+
 class BrokenReviewLlmClient:
-    def analyze_action_result(self, *, action, result, state_summary):
+    def analyze_action_result(self, *, action, result):
         raise RuntimeError("review transport failed")


+class ProgressivePollRunner(FakeActionRunner):
+    """Simulate download and push progress that completes only after several queries."""
+
+    def __init__(self) -> None:
+        super().__init__()
+        self.download_progress = ["10", "55", "100"]
+        self.upgrade_progress: dict[str, list[str]] = {}
+
+    def _fixture_for(self, action, kwargs):
+        if action == "poll-download-progress":
+            rate = self.download_progress.pop(0) if self.download_progress else "100"
+            return {
+                "ACTION": action,
+                "STEP": "DONE" if rate == "100" else "RUNNING",
+                "RATE_OF_PROGRESS": rate,
+                "MSG": "success" if rate == "100" else "running",
+                "MESSAGE": f"download {rate}%",
+            }
+        if action == "poll-upgrade-progress":
+            ip = kwargs.get("ip", "")
+            values = self.upgrade_progress.setdefault(str(ip), ["30", "100"])
+            rate = values.pop(0) if values else "100"
+            return {
+                "ACTION": action,
+                "IP": ip,
+                "STEP": "DONE" if rate == "100" else "RUNNING",
+                "RATE_OF_PROGRESS": rate,
+                "MSG": "success" if rate == "100" else "running",
+                "MESSAGE": f"upgrade {rate}%",
+            }
+        return super()._fixture_for(action, kwargs)
+
+
+class FlakyVerifyRunner(FakeActionRunner):
+    """Simulate the health check passing on the second attempt after the application starts."""
+
+    def __init__(self) -> None:
+        super().__init__()
+        self.verify_calls = 0
+
+    def _fixture_for(self, action, kwargs):
+        if action == "verify-ip" and kwargs.get("ip") == "192.168.1.10":
+            self.verify_calls += 1
+            if self.verify_calls == 1:
+                return {
+                    "ACTION": action,
+                    "IP": "192.168.1.10",
+                    "SUCCESS": "false",
+                    "MESSAGE": "application is starting",
+                }
+        return super()._fixture_for(action, kwargs)
+
+
 def test_run_deploy_flow_success(tmp_path: Path):
     agent = PamDeployAgent(fake_runner=FakeActionRunner())
     state = agent.create_state(
@ -55,6 +130,83 @@ def test_run_deploy_flow_success(tmp_path: Path):
|
|||||||
assert all(item["status"] == "SUCCESS" for item in state.ip_states.values())
|
assert all(item["status"] == "SUCCESS" for item in state.ip_states.values())
|
||||||
|
|
||||||
|
|
||||||
|
def test_progress_actions_repeat_until_llm_marks_complete(tmp_path: Path):
|
||||||
|
fake = ProgressivePollRunner()
|
||||||
|
agent = PamDeployAgent(fake_runner=fake)
|
||||||
|
state = agent.create_state(
|
||||||
|
params={**PARAMS, "POLL_INTERVAL_SEC": 0},
|
||||||
|
execution_strategy="fake",
|
||||||
|
config_path=str(tmp_path / "config.txt"),
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
|
calls = [call[0] for call in fake.calls]
|
||||||
|
assert calls.count("poll-download-progress") == 3
|
||||||
|
assert calls.count("poll-upgrade-progress") == 4
|
||||||
|
assert "poll-download-progress" in state.completed_global_steps
|
||||||
|
assert state.poll_attempts == {}
|
||||||
|
assert all(item["status"] == "SUCCESS" for item in state.ip_states.values())
|
||||||
|
progress_events = [event for event in state.events if event["type"] == "ACTION_PROGRESS"]
|
||||||
|
assert any(event["stage"] == "poll-download-progress" and "RATE_OF_PROGRESS=10" in event["message"] for event in progress_events)
|
||||||
|
assert any(event["stage"] == "poll-upgrade-progress" and event["ip"] == "192.168.1.10" for event in progress_events)
|
||||||
|
|
||||||
|
|
||||||
|
def test_progress_timeout_pauses_on_current_action(tmp_path: Path):
|
||||||
|
fake = FakeActionRunner(
|
||||||
|
{
|
||||||
|
"poll-download-progress": {
|
||||||
|
"ACTION": "poll-download-progress",
|
||||||
|
"STEP": "RUNNING",
|
||||||
|
"RATE_OF_PROGRESS": "20",
|
||||||
|
"MSG": "running",
|
||||||
|
"MESSAGE": "download 20%",
|
||||||
|
}
|
||||||
|
}
|
||||||
|
)
|
||||||
|
agent = PamDeployAgent(fake_runner=fake)
|
||||||
|
state = agent.create_state(
|
||||||
|
params={**PARAMS, "POLL_INTERVAL_SEC": 0, "DOWNLOAD_POLL_MAX_ATTEMPTS": 2},
|
||||||
|
execution_strategy="fake",
|
||||||
|
config_path=str(tmp_path / "config.txt"),
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
|
assert state.paused is True
|
||||||
|
assert state.pause_reason == "progress_timeout"
|
||||||
|
assert state.last_failed_step == "poll-download-progress"
|
||||||
|
assert "poll-download-progress" not in state.completed_global_steps
|
||||||
|
assert state.review_context["stage"] == "poll-download-progress"
|
||||||
|
assert state.poll_attempts["global:poll-download-progress"] == 2
|
||||||
|
|
||||||
|
|
||||||
|
def test_verify_ip_retries_until_success_before_marking_failed(tmp_path: Path):
|
||||||
|
fake = FlakyVerifyRunner()
|
||||||
|
agent = PamDeployAgent(fake_runner=fake)
|
||||||
|
state = agent.create_state(
|
||||||
|
params=PARAMS,
|
||||||
|
execution_strategy="fake",
|
||||||
|
config_path=str(tmp_path / "config.txt"),
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
|
assert fake.verify_calls == 2
|
||||||
|
assert state.paused is False
|
||||||
|
assert state.poll_attempts == {}
|
||||||
|
assert state.ip_states["192.168.1.10"]["status"] == "SUCCESS"
|
||||||
|
assert any(
|
||||||
|
event["type"] == "ACTION_PROGRESS"
|
||||||
|
and event["stage"] == "verify-ip"
|
||||||
|
and event["ip"] == "192.168.1.10"
|
||||||
|
for event in state.events
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def test_create_state_writes_absolute_script_config_path_and_normalized_zip(tmp_path: Path):
|
def test_create_state_writes_absolute_script_config_path_and_normalized_zip(tmp_path: Path):
|
||||||
package_path = tmp_path / "pkg.zip"
|
package_path = tmp_path / "pkg.zip"
|
||||||
params = {**PARAMS, "ZIP_FILE_PATH": str(package_path)}
|
params = {**PARAMS, "ZIP_FILE_PATH": str(package_path)}
|
||||||
@ -71,6 +223,7 @@ def test_create_state_writes_absolute_script_config_path_and_normalized_zip(tmp_
|
|||||||
assert Path(state.trace_file_path).is_absolute()
|
assert Path(state.trace_file_path).is_absolute()
|
||||||
config_text = Path(state.config_path).read_text(encoding="utf-8")
|
config_text = Path(state.config_path).read_text(encoding="utf-8")
|
||||||
assert f"ZIP_FILE_PATH={package_path.resolve()}" in config_text
|
assert f"ZIP_FILE_PATH={package_path.resolve()}" in config_text
|
||||||
|
assert "PARENT_VERSION_NUMBER=\n" in config_text
|
||||||
|
|
||||||
|
|
||||||
def test_global_action_requires_hash_code_from_upload_package(tmp_path: Path):
|
def test_global_action_requires_hash_code_from_upload_package(tmp_path: Path):
|
||||||
@ -110,11 +263,49 @@ def test_run_deploy_flow_stops_on_verify_failure(tmp_path: Path):
|
|||||||
|
|
||||||
agent.run_deploy_flow(state)
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
assert state.pending_confirmation == "rollback-ip:192.168.1.10"
|
verify_calls = [call for call in fake.calls if call[0] == "verify-ip" and call[1].get("ip") == "192.168.1.10"]
|
||||||
|
assert len(verify_calls) == 2
|
||||||
|
assert state.pending_confirmation == ""
|
||||||
|
assert state.paused is True
|
||||||
|
assert state.pause_reason == "action_failed"
|
||||||
assert state.ip_states["192.168.1.10"]["status"] == "FAILED"
|
assert state.ip_states["192.168.1.10"]["status"] == "FAILED"
|
||||||
assert state.ip_states["192.168.1.10"]["rollback_status"] == "PENDING_AGENT_CONFIRMATION"
|
assert state.ip_states["192.168.1.10"]["failed_stage"] == "verify-ip"
|
||||||
|
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_NOT_RUN"
|
||||||
assert "192.168.1.11" not in state.ip_states
|
assert "192.168.1.11" not in state.ip_states
|
||||||
assert any(event["type"] == "CONFIRMATION_REQUIRED" for event in state.events)
|
assert any(event["type"] == "ACTION_RETRY_REQUIRED" for event in state.events)
|
||||||
|
assert not any(call[0] == "download-log" for call in fake.calls)
|
||||||
|
|
||||||
|
|
||||||
|
def test_resume_retries_failed_ip_action_without_rollback(tmp_path: Path):
|
||||||
|
fake = FakeActionRunner(
|
||||||
|
{
|
||||||
|
"verify-ip:192.168.1.10": {
|
||||||
|
"ACTION": "verify-ip",
|
||||||
|
"IP": "192.168.1.10",
|
||||||
|
"SUCCESS": "false",
|
||||||
|
"MESSAGE": "health check failed",
|
||||||
|
}
|
||||||
|
}
|
||||||
|
)
|
||||||
|
agent = PamDeployAgent(fake_runner=fake)
|
||||||
|
state = agent.create_state(
|
||||||
|
params=PARAMS,
|
||||||
|
execution_strategy="fake",
|
||||||
|
config_path=str(tmp_path / "config.txt"),
|
||||||
|
)
|
||||||
|
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
fake.fixtures = {}
|
||||||
|
agent.resume_state(state)
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
|
assert state.pending_confirmation == ""
|
||||||
|
assert state.paused is False
|
||||||
|
assert state.last_failed_step == ""
|
||||||
|
assert state.ip_states["192.168.1.10"]["status"] == "SUCCESS"
|
||||||
|
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_NOT_RUN"
|
||||||
|
assert state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
||||||
|
assert not any(call[0] == "rollback-ip" for call in fake.calls)
|
||||||
|
|
||||||
|
|
||||||
def test_action_analysis_event_is_recorded_when_enabled(tmp_path: Path):
|
def test_action_analysis_event_is_recorded_when_enabled(tmp_path: Path):
|
||||||
@ -161,11 +352,35 @@ def test_successful_action_can_be_blocked_by_llm_review(tmp_path: Path):
|
|||||||
assert state.paused is True
|
assert state.paused is True
|
||||||
assert state.pause_reason == "llm_review_blocked"
|
assert state.pause_reason == "llm_review_blocked"
|
||||||
assert state.last_failed_step == "get-token"
|
assert state.last_failed_step == "get-token"
|
||||||
assert state.completed_global_steps == ["get-token"]
|
assert state.completed_global_steps == []
|
||||||
assert state.review_context["stage"] == "get-token"
|
assert state.review_context["stage"] == "get-token"
|
||||||
assert state.review_context["suggested_action"] == "stop and inspect"
|
assert state.review_context["suggested_action"] == "stop and inspect"
|
||||||
|
|
||||||
|
|
||||||
|
def test_resume_retries_llm_blocked_global_action(tmp_path: Path):
|
||||||
|
fake = FakeActionRunner()
|
||||||
|
agent = PamDeployAgent(
|
||||||
|
fake_runner=fake,
|
||||||
|
llm_client=BlockingOnceReviewLlmClient(),
|
||||||
|
)
|
||||||
|
state = agent.create_state(
|
||||||
|
params=PARAMS,
|
||||||
|
execution_strategy="fake",
|
||||||
|
config_path=str(tmp_path / "config.txt"),
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
agent.resume_state(state)
|
||||||
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
|
called_actions = [call[0] for call in fake.calls]
|
||||||
|
assert called_actions[:2] == ["get-token", "get-token"]
|
||||||
|
assert called_actions.count("get-token") == 2
|
||||||
|
assert state.paused is False
|
||||||
|
assert state.completed_global_steps[0] == "get-token"
|
||||||
|
|
||||||
|
|
||||||
def test_action_review_failure_pauses_flow(tmp_path: Path):
|
def test_action_review_failure_pauses_flow(tmp_path: Path):
|
||||||
agent = PamDeployAgent(
|
agent = PamDeployAgent(
|
||||||
fake_runner=FakeActionRunner(),
|
fake_runner=FakeActionRunner(),
|
||||||
@ -184,10 +399,11 @@ def test_action_review_failure_pauses_flow(tmp_path: Path):
|
|||||||
assert state.pause_reason == "llm_review_blocked"
|
assert state.pause_reason == "llm_review_blocked"
|
||||||
assert state.review_context["stage"] == "get-token"
|
assert state.review_context["stage"] == "get-token"
|
||||||
assert "LLM 审核失败" in state.review_context["possible_reason"]
|
assert "LLM 审核失败" in state.review_context["possible_reason"]
|
||||||
|
assert state.completed_global_steps == []
|
||||||
assert any(event["type"] == "ACTION_ANALYSIS_FAIL" for event in state.events)
|
assert any(event["type"] == "ACTION_ANALYSIS_FAIL" for event in state.events)
|
||||||
|
|
||||||
|
|
||||||
def test_confirm_pending_rollback_runs_rollback_and_resume_continues(tmp_path: Path):
|
def test_explicit_rollback_runs_rollback_and_resume_continues(tmp_path: Path):
|
||||||
fake = FakeActionRunner(
|
fake = FakeActionRunner(
|
||||||
{
|
{
|
||||||
"verify-ip:192.168.1.10": {
|
"verify-ip:192.168.1.10": {
|
||||||
@ -206,18 +422,16 @@ def test_confirm_pending_rollback_runs_rollback_and_resume_continues(tmp_path: P
|
|||||||
)
|
)
|
||||||
|
|
||||||
agent.run_deploy_flow(state)
|
agent.run_deploy_flow(state)
|
||||||
request = agent.build_confirmation_request(state)
|
agent.rollback_ip(state, "192.168.1.10")
|
||||||
agent.confirm_pending(state, approved=True)
|
|
||||||
agent.run_deploy_flow(state)
|
agent.run_deploy_flow(state)
|
||||||
|
|
||||||
assert request["type"] == "rollback-ip"
|
|
||||||
assert state.pending_confirmation == ""
|
assert state.pending_confirmation == ""
|
||||||
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_DONE"
|
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_DONE"
|
||||||
assert state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
assert state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
||||||
assert any(call[0] == "rollback-ip" for call in fake.calls)
|
assert any(call[0] == "rollback-ip" for call in fake.calls)
|
||||||
|
|
||||||
|
|
||||||
def test_failed_rollback_keeps_confirmation_pending(tmp_path: Path):
|
def test_failed_explicit_rollback_pauses_without_confirmation(tmp_path: Path):
|
||||||
fake = FakeActionRunner(
|
fake = FakeActionRunner(
|
||||||
{
|
{
|
||||||
"verify-ip:192.168.1.10": {
|
"verify-ip:192.168.1.10": {
|
||||||
@ -242,9 +456,11 @@ def test_failed_rollback_keeps_confirmation_pending(tmp_path: Path):
|
|||||||
)
|
)
|
||||||
|
|
||||||
agent.run_deploy_flow(state)
|
agent.run_deploy_flow(state)
|
||||||
agent.confirm_pending(state, approved=True)
|
agent.rollback_ip(state, "192.168.1.10")
|
||||||
|
|
||||||
assert state.pending_confirmation == "rollback-ip:192.168.1.10"
|
assert state.pending_confirmation == ""
|
||||||
|
assert state.paused is True
|
||||||
|
assert state.pause_reason == "rollback_failed"
|
||||||
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_FAILED"
|
assert state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_FAILED"
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@ -1,4 +1,5 @@
|
|||||||
import builtins
|
import builtins
|
||||||
|
import sys
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
@ -6,7 +7,7 @@ import pytest
|
|||||||
from pam_deploy_graph.agent import PamDeployAgent
|
from pam_deploy_graph.agent import PamDeployAgent
|
||||||
from pam_deploy_graph.fake_runner import FakeActionRunner
|
from pam_deploy_graph.fake_runner import FakeActionRunner
|
||||||
from pam_deploy_graph.interactive import InteractiveCliSession, _build_prompt_input
|
from pam_deploy_graph.interactive import InteractiveCliSession, _build_prompt_input
|
||||||
from pam_deploy_graph.models import LlmActionAnalysis
|
from pam_deploy_graph.models import LlmActionAnalysis, LlmIntentResult, LlmSingleActionProposal
|
||||||
|
|
||||||
|
|
||||||
PARAMS = {
|
PARAMS = {
|
||||||
@ -18,11 +19,13 @@ PARAMS = {
|
|||||||
"MODULE_NAME": "Node",
|
"MODULE_NAME": "Node",
|
||||||
"VERSION_NUMBER": "2.0.5",
|
"VERSION_NUMBER": "2.0.5",
|
||||||
"ZIP_FILE_PATH": "C:/pkg.zip",
|
"ZIP_FILE_PATH": "C:/pkg.zip",
|
||||||
|
"VERIFY_INTERVAL_SEC": 0,
|
||||||
|
"VERIFY_MAX_ATTEMPTS": 2,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class BlockingReviewLlmClient:
|
class BlockingReviewLlmClient:
|
||||||
def analyze_action_result(self, *, action, result, state_summary):
|
def analyze_action_result(self, *, action, result):
|
||||||
return LlmActionAnalysis(
|
return LlmActionAnalysis(
|
||||||
action=action,
|
action=action,
|
||||||
has_anomaly=True,
|
has_anomaly=True,
|
||||||
@ -35,6 +38,108 @@ class BlockingReviewLlmClient:
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class FakeTestableLlmClient:
|
||||||
|
def __init__(self) -> None:
|
||||||
|
self.requests: list[str] = []
|
||||||
|
self.chat_requests: list[tuple[str, dict]] = []
|
||||||
|
self.log_requests: list[tuple[str, str | None, str]] = []
|
||||||
|
self.proposal_requests: list[str] = []
|
||||||
|
|
||||||
|
def understand_request(self, text: str) -> LlmIntentResult:
|
||||||
|
self.requests.append(text)
|
||||||
|
return LlmIntentResult(
|
||||||
|
intent="deploy",
|
||||||
|
mode_preference="MCP",
|
||||||
|
strategy_preference="hybrid_node_mcp",
|
||||||
|
confidence=0.91,
|
||||||
|
reasons=["test ok"],
|
||||||
|
)
|
||||||
|
|
||||||
|
def extract_params(self, text, base_params=None):
|
||||||
|
raise AssertionError("llm test should only call understand_request")
|
||||||
|
|
||||||
|
def generate_plan(self, *, params, intent, strategy):
|
||||||
|
raise AssertionError("llm test should only call understand_request")
|
||||||
|
|
||||||
|
def analyze_action_result(self, *, action, result):
|
||||||
|
return LlmActionAnalysis(action=action)
|
||||||
|
|
||||||
|
def chat(self, text, context=None):
|
||||||
|
self.chat_requests.append((text, context or {}))
|
||||||
|
return f"chat answer: {text}"
|
||||||
|
|
||||||
|
def analyze_log(self, log_text, question=None, source_path=""):
|
||||||
|
self.log_requests.append((log_text, question, source_path))
|
||||||
|
return "log analysis answer"
|
||||||
|
|
||||||
|
def propose_action(self, text, allowed_actions, params, state_summary=None):
|
||||||
|
self.proposal_requests.append(text)
|
||||||
|
return LlmSingleActionProposal(
|
||||||
|
action="verify-ip" if "verify" in text else "get-online-ips",
|
||||||
|
ip="192.168.1.10" if "192.168.1.10" in text else "",
|
||||||
|
reason="test proposal",
|
||||||
|
risk_level="medium",
|
||||||
|
requires_confirmation=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class StreamingChatLlmClient(FakeTestableLlmClient):
|
||||||
|
def __init__(self) -> None:
|
||||||
|
super().__init__()
|
||||||
|
self.stream_requests: list[tuple[str, dict]] = []
|
||||||
|
|
||||||
|
def chat_stream(self, text, context=None):
|
||||||
|
self.stream_requests.append((text, context or {}))
|
||||||
|
yield "第一句。"
|
||||||
|
yield "<think>隐藏思考</think>"
|
||||||
|
yield "第二句。"
|
||||||
|
|
||||||
|
|
||||||
|
class BrokenStreamingChatLlmClient(FakeTestableLlmClient):
|
||||||
|
def chat_stream(self, text, context=None):
|
||||||
|
raise RuntimeError("stream unavailable")
|
||||||
|
|
||||||
|
|
||||||
|
class FlakyVerifyRunner(FakeActionRunner):
|
||||||
|
"""The first verify-ip call fails and later calls succeed; used to cover retrying from the checkpoint."""
|
||||||
|
|
||||||
|
def __init__(self) -> None:
|
||||||
|
super().__init__()
|
||||||
|
self.verify_calls = 0
|
||||||
|
|
||||||
|
def _fixture_for(self, action, kwargs):
|
||||||
|
if action == "verify-ip" and kwargs.get("ip") == "192.168.1.10":
|
||||||
|
self.verify_calls += 1
|
||||||
|
if self.verify_calls == 1:
|
||||||
|
return {
|
||||||
|
"ACTION": "verify-ip",
|
||||||
|
"IP": "192.168.1.10",
|
||||||
|
"SUCCESS": "false",
|
||||||
|
"MESSAGE": "health check failed",
|
||||||
|
}
|
||||||
|
return super()._fixture_for(action, kwargs)
|
||||||
|
|
||||||
|
|
||||||
|
class ChatProgressRunner(FakeActionRunner):
|
||||||
|
"""Make the chat fake deployment produce one visible progress update."""
|
||||||
|
|
||||||
|
def __init__(self) -> None:
|
||||||
|
super().__init__()
|
||||||
|
self.download_progress = ["40", "100"]
|
||||||
|
|
||||||
|
def _fixture_for(self, action, kwargs):
|
||||||
|
if action == "poll-download-progress":
|
||||||
|
rate = self.download_progress.pop(0) if self.download_progress else "100"
|
||||||
|
return {
|
||||||
|
"ACTION": action,
|
||||||
|
"STEP": "DONE" if rate == "100" else "RUNNING",
|
||||||
|
"RATE_OF_PROGRESS": rate,
|
||||||
|
"MSG": "success" if rate == "100" else "running",
|
||||||
|
"MESSAGE": f"download {rate}%",
|
||||||
|
}
|
||||||
|
return super()._fixture_for(action, kwargs)
|
||||||
|
|
||||||
|
|
||||||
def run_session(session: InteractiveCliSession, inputs: list[str]) -> list[str]:
|
def run_session(session: InteractiveCliSession, inputs: list[str]) -> list[str]:
|
||||||
output: list[str] = []
|
output: list[str] = []
|
||||||
iterator = iter(inputs)
|
iterator = iter(inputs)
|
||||||
@ -93,9 +198,27 @@ def test_chat_run_prints_action_progress(tmp_path: Path):
|
|||||||
assert any("分析完成: verify-ip" in item for item in output)
|
assert any("分析完成: verify-ip" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
def test_chat_greeting_does_not_trigger_structured_analysis(tmp_path: Path):
|
def test_chat_run_prints_progress_poll_updates(tmp_path: Path):
|
||||||
|
checkpoint = tmp_path / "checkpoint.json"
|
||||||
session = InteractiveCliSession(
|
session = InteractiveCliSession(
|
||||||
agent=PamDeployAgent(),
|
agent=PamDeployAgent(fake_runner=ChatProgressRunner()),
|
||||||
|
params={**PARAMS, "POLL_INTERVAL_SEC": 0},
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(checkpoint),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["run", "yes", "yes", "yes", "exit"])
|
||||||
|
|
||||||
|
assert any("进度更新: poll-download-progress" in item for item in output)
|
||||||
|
assert any("RATE_OF_PROGRESS=40" in item for item in output)
|
||||||
|
assert session.state is not None
|
||||||
|
assert "poll-download-progress" in session.state.completed_global_steps
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_greeting_goes_to_llm_chat_without_structured_analysis(tmp_path: Path):
|
||||||
|
llm = FakeTestableLlmClient()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
params=PARAMS,
|
params=PARAMS,
|
||||||
strategy="fake",
|
strategy="fake",
|
||||||
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
@ -104,10 +227,208 @@ def test_chat_greeting_does_not_trigger_structured_analysis(tmp_path: Path):
|
|||||||
output = run_session(session, ["你好", "exit"])
|
output = run_session(session, ["你好", "exit"])
|
||||||
|
|
||||||
assert session.last_analysis is None
|
assert session.last_analysis is None
|
||||||
assert any("可以输入 help 查看命令" in item for item in output)
|
assert llm.chat_requests[0][0] == "你好"
|
||||||
|
assert any("正在询问 LLM: FakeTestableLlmClient" in item for item in output)
|
||||||
|
assert any("chat answer: 你好" in item for item in output)
|
||||||
assert not any("已生成结构化理解" in item for item in output)
|
assert not any("已生成结构化理解" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_ask_command_uses_llm_chat(tmp_path: Path):
|
||||||
|
llm = FakeTestableLlmClient()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["ask 这个 agent 能做什么", "exit"])
|
||||||
|
|
||||||
|
assert llm.chat_requests[0][0] == "这个 agent 能做什么"
|
||||||
|
assert llm.chat_requests[0][1]["params"]["CLIENT_SECRET"] == "***"
|
||||||
|
assert any("chat answer: 这个 agent 能做什么" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_ask_uses_streaming_chat_when_available(tmp_path: Path):
|
||||||
|
llm = StreamingChatLlmClient()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["ask 你好", "exit"])
|
||||||
|
|
||||||
|
assert llm.stream_requests[0][0] == "你好"
|
||||||
|
assert llm.chat_requests == []
|
||||||
|
assert any("第一句。" in item for item in output)
|
||||||
|
assert any("第二句。" in item for item in output)
|
||||||
|
assert not any("隐藏思考" in item or "<think>" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_ask_falls_back_when_streaming_fails(tmp_path: Path):
|
||||||
|
llm = BrokenStreamingChatLlmClient()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["ask 你好", "exit"])
|
||||||
|
|
||||||
|
assert llm.chat_requests[0][0] == "你好"
|
||||||
|
assert any("LLM 流式输出失败,改用普通请求" in item for item in output)
|
||||||
|
assert any("chat answer: 你好" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_ask_strips_think_from_non_streaming_chat(tmp_path: Path):
|
||||||
|
class ThinkChatLlmClient(FakeTestableLlmClient):
|
||||||
|
def chat(self, text, context=None):
|
||||||
|
self.chat_requests.append((text, context or {}))
|
||||||
|
return "可见<think>隐藏思考</think>结论"
|
||||||
|
|
||||||
|
llm = ThinkChatLlmClient()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["ask 你好", "exit"])
|
||||||
|
|
||||||
|
assert any("可见结论" in item for item in output)
|
||||||
|
assert not any("隐藏思考" in item or "<think>" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
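The behaviour the preceding tests check, hiding `<think>...</think>` spans from user-visible output, could be implemented along these lines. This is a sketch under assumptions: `strip_think` is a hypothetical helper name, and the diff does not show how `InteractiveCliSession` actually filters model output.

```python
import re

# Non-greedy match so that several separate <think> blocks are removed
# individually; DOTALL lets a block span multiple lines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove every complete <think>...</think> block from text."""
    return _THINK_BLOCK.sub("", text)
```

This version only handles complete blocks inside a single string. A streaming path would also need to buffer output in case a tag is split across chunks, which is not covered here.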
|
def test_chat_log_analyze_reads_tail_and_redacts(tmp_path: Path):
|
||||||
|
llm = FakeTestableLlmClient()
|
||||||
|
log_path = tmp_path / "agent.log"
|
||||||
|
log_path.write_text(
|
||||||
|
"\n".join(
|
||||||
|
[
|
||||||
|
"line 1 CLIENT_SECRET=real-secret",
|
||||||
|
"line 2 ok",
|
||||||
|
"line 3 ERROR failed",
|
||||||
|
]
|
||||||
|
),
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, [f"log analyze {log_path} 请找异常 --tail 2", "exit"])
|
||||||
|
|
||||||
|
log_text, question, source_path = llm.log_requests[0]
|
||||||
|
assert "line 1" not in log_text
|
||||||
|
assert "real-secret" not in log_text
|
||||||
|
assert "line 3 ERROR failed" in log_text
|
||||||
|
assert question == "请找异常"
|
||||||
|
assert source_path == str(log_path)
|
||||||
|
assert any("log analysis answer" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
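The tail-and-redact behaviour asserted above can be illustrated with a small standalone helper. The name `tail_and_redact` and the regular expression are assumptions made for this sketch; the real `log analyze` command may redact more keys and may read the file differently.

```python
import re

# Assumption: secrets appear as KEY=value tokens; only CLIENT_SECRET is handled here.
_SECRET = re.compile(r"(CLIENT_SECRET=)\S+")


def tail_and_redact(text: str, tail: int) -> str:
    """Keep the last `tail` lines of text and mask secret values in them."""
    lines = text.splitlines()
    # Guard tail <= 0 explicitly: lines[-0:] would return the whole list.
    kept = lines[-tail:] if tail > 0 else []
    return "\n".join(_SECRET.sub(r"\1***", line) for line in kept)
```

Taking the tail before redacting keeps the work proportional to the number of lines that are actually sent to the LLM.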
|
def test_chat_log_analyze_strips_think_from_answer(tmp_path: Path):
|
||||||
|
class ThinkLogLlmClient(FakeTestableLlmClient):
|
||||||
|
def analyze_log(self, log_text, question=None, source_path=""):
|
||||||
|
self.log_requests.append((log_text, question, source_path))
|
||||||
|
return "<think>隐藏日志分析</think>日志结论"
|
||||||
|
|
||||||
|
llm = ThinkLogLlmClient()
|
||||||
|
log_path = tmp_path / "agent.log"
|
||||||
|
log_path.write_text("ERROR failed", encoding="utf-8")
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, [f"log analyze {log_path}", "exit"])
|
||||||
|
|
||||||
|
assert any("日志结论" in item for item in output)
|
||||||
|
assert not any("隐藏日志分析" in item or "<think>" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_action_propose_only_shows_plan_without_execution(tmp_path: Path):
|
||||||
|
llm = FakeTestableLlmClient()
|
||||||
|
fake = FakeActionRunner()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(fake_runner=fake, llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["action propose 请 verify-ip 192.168.1.10", "exit"])
|
||||||
|
|
||||||
|
assert llm.proposal_requests == ["请 verify-ip 192.168.1.10"]
|
||||||
|
assert fake.calls == []
|
||||||
|
assert any("单 action 计划" in item for item in output)
|
||||||
|
assert any("- action: verify-ip" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_action_run_llm_requires_confirmation_before_execution(tmp_path: Path):
|
||||||
|
llm = FakeTestableLlmClient()
|
||||||
|
fake = FakeActionRunner()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(fake_runner=fake, llm_client=llm),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["action run llm 请 verify-ip 192.168.1.10", "no", "exit"])
|
||||||
|
|
||||||
|
assert fake.calls == []
|
||||||
|
assert any("已取消单 action 执行" in item for item in output)
|
||||||
|
|
||||||
|
output = run_session(session, ["action run llm 请 verify-ip 192.168.1.10", "yes", "exit"])
|
||||||
|
|
||||||
|
assert ("verify-ip", {"ip": "192.168.1.10"}) in fake.calls
|
||||||
|
assert session.state is not None
|
||||||
|
assert any(event["type"] == "SINGLE_ACTION_DONE" for event in session.state.events)
|
||||||
|
assert any("单 action 执行完成" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_action_run_missing_ip_is_friendly(tmp_path: Path):
|
||||||
|
fake = FakeActionRunner()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(fake_runner=fake),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["action run verify-ip", "exit"])
|
||||||
|
|
||||||
|
assert fake.calls == []
|
||||||
|
assert any("需要提供 ip" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_action_run_manual_executes_fake_action(tmp_path: Path):
|
||||||
|
fake = FakeActionRunner()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(fake_runner=fake),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["action run get-online-ips", "yes", "exit"])
|
||||||
|
|
||||||
|
assert ("get-online-ips", {"ip": None}) in fake.calls
|
||||||
|
assert session.state is not None
|
||||||
|
assert session.state.online_ips == ["192.168.1.10", "192.168.1.11"]
|
||||||
|
assert any("单 action 执行完成" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
def test_chat_preflight_blocks_missing_zip_path_before_confirm(tmp_path: Path):
|
def test_chat_preflight_blocks_missing_zip_path_before_confirm(tmp_path: Path):
|
||||||
missing_package = tmp_path / "missing.zip"
|
missing_package = tmp_path / "missing.zip"
|
||||||
session = InteractiveCliSession(
|
session = InteractiveCliSession(
|
||||||
@ -141,7 +462,28 @@ def test_chat_action_failure_does_not_report_langgraph_unavailable(tmp_path: Pat
|
|||||||
assert not any("LangGraph 确认运行器不可用" in item for item in output)
|
assert not any("LangGraph 确认运行器不可用" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
def test_chat_approve_then_resume_continues_after_failed_ip(tmp_path: Path):
|
def test_chat_resume_retries_failed_ip_without_rollback(tmp_path: Path):
|
||||||
|
fake = FlakyVerifyRunner()
|
||||||
|
session = InteractiveCliSession(
|
||||||
|
agent=PamDeployAgent(fake_runner=fake),
|
||||||
|
params=PARAMS,
|
||||||
|
strategy="fake",
|
||||||
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
|
)
|
||||||
|
|
||||||
|
output = run_session(session, ["run", "yes", "yes", "yes", "exit"])
|
||||||
|
|
||||||
|
assert session.state is not None
|
||||||
|
assert session.state.pending_confirmation == ""
|
||||||
|
assert session.state.paused is False
|
||||||
|
assert session.state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_NOT_RUN"
|
||||||
|
assert session.state.ip_states["192.168.1.10"]["status"] == "SUCCESS"
|
||||||
|
assert session.state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
||||||
|
assert not any(call[0] == "rollback-ip" for call in fake.calls)
|
||||||
|
assert any("进度更新: verify-ip" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
|
def test_chat_explicit_rollback_command_rolls_back_failed_ip(tmp_path: Path):
|
||||||
fake = FakeActionRunner(
|
fake = FakeActionRunner(
|
||||||
{
|
{
|
||||||
"verify-ip:192.168.1.10": {
|
"verify-ip:192.168.1.10": {
|
||||||
@ -159,12 +501,14 @@ def test_chat_approve_then_resume_continues_after_failed_ip(tmp_path: Path):
|
|||||||
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
checkpoint_path=str(tmp_path / "checkpoint.json"),
|
||||||
)
|
)
|
||||||
|
|
||||||
run_session(session, ["run", "yes", "yes", "yes", "approve", "resume", "exit"])
|
output = run_session(session, ["run", "yes", "yes", "yes", "rollback", "resume", "exit"])
|
||||||
|
|
||||||
assert session.state is not None
|
assert session.state is not None
|
||||||
assert session.state.pending_confirmation == ""
|
assert session.state.pending_confirmation == ""
|
||||||
assert session.state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_DONE"
|
assert session.state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_DONE"
|
||||||
assert session.state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
assert session.state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
||||||
|
assert any(call[0] == "rollback-ip" for call in fake.calls)
|
||||||
|
assert any("回滚已完成;如需继续主流程,输入 resume" in item for item in output)
|
||||||
|
|
||||||
|
|
||||||
def test_chat_params_events_and_checkpoint_commands(tmp_path: Path):
|
def test_chat_params_events_and_checkpoint_commands(tmp_path: Path):
|
||||||
@ -185,7 +529,7 @@ def test_chat_params_events_and_checkpoint_commands(tmp_path: Path):
|
|||||||
"yes",
|
"yes",
|
||||||
"yes",
|
"yes",
|
||||||
"yes",
|
"yes",
|
||||||
"events 2",
|
"events 20",
|
||||||
"list checkpoints",
|
"list checkpoints",
|
||||||
"load checkpoint " + str(checkpoint),
|
"load checkpoint " + str(checkpoint),
|
||||||
"exit",
|
"exit",
|
||||||
@ -260,6 +604,24 @@ def test_chat_llm_review_block_message_is_visible(tmp_path: Path):
|
|||||||
assert any("如需继续,输入 resume" in item for item in output)
|
assert any("如需继续,输入 resume" in item for item in output)
|
||||||


def test_chat_llm_test_command_uses_current_client(tmp_path: Path):
    llm = FakeTestableLlmClient()
    session = InteractiveCliSession(
        agent=PamDeployAgent(llm_client=llm),
        params=PARAMS,
        strategy="fake",
        checkpoint_path=str(tmp_path / "checkpoint.json"),
    )

    output = run_session(session, ["llm test 检查模型", "exit"])

    assert llm.requests == ["检查模型"]
    assert any("正在测试 LLM: FakeTestableLlmClient" in item for item in output)
    assert any("LLM 测试通过" in item for item in output)
    assert any("- intent: deploy" in item for item in output)
    assert any("- strategy: hybrid_node_mcp" in item for item in output)


def test_chat_can_hot_load_mcp_config(tmp_path: Path):
|
def test_chat_can_hot_load_mcp_config(tmp_path: Path):
|
||||||
mcp_config = tmp_path / "mcp.json"
|
mcp_config = tmp_path / "mcp.json"
|
||||||
mcp_config.write_text('{"transport": "stdio", "command": "python"}', encoding="utf-8")
|
mcp_config.write_text('{"transport": "stdio", "command": "python"}', encoding="utf-8")
|
||||||
@ -285,3 +647,15 @@ def test_prompt_history_creates_runtime_dir(tmp_path: Path, monkeypatch):
|
|||||||
|
|
||||||
assert callable(prompt)
|
assert callable(prompt)
|
||||||
assert (tmp_path / "runtime").is_dir()
|
assert (tmp_path / "runtime").is_dir()
|
||||||


def test_prompt_toolkit_enabled_when_frozen(tmp_path: Path, monkeypatch):
    pytest.importorskip("prompt_toolkit")
    monkeypatch.chdir(tmp_path)
    monkeypatch.setattr(sys, "frozen", True, raising=False)

    prompt = _build_prompt_input(builtins.input)

    assert callable(prompt)
    assert prompt is not builtins.input
    assert (tmp_path / "runtime").is_dir()
|
|||||||
@ -14,10 +14,12 @@ PARAMS = {
|
|||||||
"MODULE_NAME": "Node",
|
"MODULE_NAME": "Node",
|
||||||
"VERSION_NUMBER": "2.0.5",
|
"VERSION_NUMBER": "2.0.5",
|
||||||
"ZIP_FILE_PATH": "C:/pkg.zip",
|
"ZIP_FILE_PATH": "C:/pkg.zip",
|
||||||
|
"VERIFY_INTERVAL_SEC": 0,
|
||||||
|
"VERIFY_MAX_ATTEMPTS": 2,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def test_langgraph_runtime_interrupts_and_resumes_confirmation(tmp_path: Path):
|
def test_langgraph_runtime_pauses_failure_and_resume_retries(tmp_path: Path):
|
||||||
fake = FakeActionRunner(
|
fake = FakeActionRunner(
|
||||||
{
|
{
|
||||||
"verify-ip:192.168.1.10": {
|
"verify-ip:192.168.1.10": {
|
||||||
@ -39,16 +41,22 @@ def test_langgraph_runtime_interrupts_and_resumes_confirmation(tmp_path: Path):
|
|||||||
|
|
||||||
first = runtime.start(state)
|
first = runtime.start(state)
|
||||||
|
|
||||||
assert first.interrupted is True
|
assert first.interrupted is False
|
||||||
assert runtime.waiting_confirmation is True
|
assert runtime.waiting_confirmation is False
|
||||||
assert first.confirmation["type"] == "rollback-ip"
|
assert first.confirmation == {}
|
||||||
assert first.confirmation["ip"] == "192.168.1.10"
|
assert first.state is not None
|
||||||
|
assert first.state.paused is True
|
||||||
|
assert first.state.pending_confirmation == ""
|
||||||
|
assert first.state.ip_states["192.168.1.10"]["failed_stage"] == "verify-ip"
|
||||||
|
|
||||||
second = runtime.resume(approved=True)
|
fake.fixtures = {}
|
||||||
|
agent.resume_state(first.state)
|
||||||
|
second = runtime.start(first.state)
|
||||||
|
|
||||||
assert second.interrupted is False
|
assert second.interrupted is False
|
||||||
assert runtime.waiting_confirmation is False
|
assert runtime.waiting_confirmation is False
|
||||||
assert second.state is not None
|
assert second.state is not None
|
||||||
assert second.state.pending_confirmation == ""
|
assert second.state.pending_confirmation == ""
|
||||||
assert second.state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_DONE"
|
assert second.state.paused is False
|
||||||
|
assert second.state.ip_states["192.168.1.10"]["rollback_status"] == "ROLLBACK_NOT_RUN"
|
||||||
assert second.state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
assert second.state.ip_states["192.168.1.11"]["status"] == "SUCCESS"
|
||||||
|
|||||||
@ -1,4 +1,5 @@
|
|||||||
from dataclasses import asdict
|
from dataclasses import asdict
|
||||||
|
import json
|
||||||
|
|
||||||
from pam_deploy_graph.agent import PamDeployAgent
|
from pam_deploy_graph.agent import PamDeployAgent
|
||||||
from pam_deploy_graph.checkpoint_store import redact_mapping
|
from pam_deploy_graph.checkpoint_store import redact_mapping
|
||||||
@ -20,13 +21,22 @@ def test_understand_request_prefers_hybrid_for_mcp():
|
|||||||
def test_extract_params_from_key_value_text():
|
def test_extract_params_from_key_value_text():
|
||||||
result = RuleBasedLlmClient().extract_params(
|
result = RuleBasedLlmClient().extract_params(
|
||||||
"HOME_BASE_URL=https://x CLIENT_ID=id CLIENT_SECRET=s AIRPORT_CODE=HET "
|
"HOME_BASE_URL=https://x CLIENT_ID=id CLIENT_SECRET=s AIRPORT_CODE=HET "
|
||||||
"APP_NAME=PAM MODULE_NAME=Node VERSION_NUMBER=2.0.5 ZIP_FILE_PATH=C:/pkg.zip"
|
"APP_NAME=PAM MODULE_NAME=Node VERSION_NUMBER=2.0.5 "
|
||||||
|
"parentVersionNumber=2.0.4 ZIP_FILE_PATH=C:/pkg.zip"
|
||||||
)
|
)
|
||||||
assert result.extracted_params["AIRPORT_CODE"] == "HET"
|
assert result.extracted_params["AIRPORT_CODE"] == "HET"
|
||||||
|
assert result.extracted_params["PARENT_VERSION_NUMBER"] == "2.0.4"
|
||||||
assert result.missing_required_params == []
|
assert result.missing_required_params == []
|
||||||
assert "CLIENT_SECRET" in result.sensitive_fields_present
|
assert "CLIENT_SECRET" in result.sensitive_fields_present
|
||||||
|
|
||||||
|
|
||||||
|
def test_extract_parent_version_from_chinese_text():
|
||||||
|
result = RuleBasedLlmClient().extract_params("请部署版本 2.0.5,云下载继承版本 2.0.4 的规则")
|
||||||
|
|
||||||
|
assert result.extracted_params["VERSION_NUMBER"] == "2.0.5"
|
||||||
|
assert result.extracted_params["PARENT_VERSION_NUMBER"] == "2.0.4"
|
||||||
|
|
||||||
|
|
||||||
def test_analyze_request_returns_structured_objects():
|
def test_analyze_request_returns_structured_objects():
|
||||||
agent = PamDeployAgent()
|
agent = PamDeployAgent()
|
||||||
result = agent.analyze_request(
|
result = agent.analyze_request(
|
||||||
@ -219,13 +229,206 @@ def test_openai_compatible_client_analyzes_action_result_with_redaction():
|
|||||||
ok=False,
|
ok=False,
|
||||||
values={"CLIENT_SECRET": "real-secret", "SUCCESS": "false"},
|
values={"CLIENT_SECRET": "real-secret", "SUCCESS": "false"},
|
||||||
stderr="x" * 1200,
|
stderr="x" * 1200,
|
||||||
error_summary="failed",
|
|
||||||
),
|
),
|
||||||
state_summary={"params": {"CLIENT_SECRET": "real-secret"}},
|
|
||||||
)
|
)
|
||||||
|
|
||||||
serialized_prompt = str(calls[0])
|
serialized_prompt = str(calls[0])
|
||||||
|
input_payload = _llm_input_payload(calls[0])
|
||||||
assert analysis.has_anomaly is True
|
assert analysis.has_anomaly is True
|
||||||
assert analysis.severity == "high"
|
assert analysis.severity == "high"
|
||||||
assert "real-secret" not in serialized_prompt
|
assert "real-secret" not in serialized_prompt
|
||||||
assert "[已截断]" in serialized_prompt
|
assert "state_summary" not in input_payload
|
||||||
|
assert input_payload["result"]["diagnostic_log"].startswith("[已截断]...")
|
||||||


def test_openai_compatible_client_omits_success_script_logs_from_action_review():
    calls = []

    def transport(url, headers, payload, timeout_sec):
        calls.append(payload)
        return {
            "choices": [
                {
                    "message": {
                        "content": (
                            '{"action":"get-online-ips","has_anomaly":false,"severity":"info",'
                            '"possible_reason":"","suggested_action":"continue",'
                            '"requires_confirmation":false,"should_continue":true}'
                        )
                    }
                }
            ]
        }

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        transport=transport,
    )

    client.analyze_action_result(
        action="get-online-ips",
        result=ActionResult(
            action="get-online-ips",
            backend="script",
            ok=True,
            values={"ACTION": "get-online-ips", "COUNT": "1", "IP": ["10.4.1.1"]},
            stdout="ACTION=get-online-ips\nCOUNT=1\nIP=10.4.1.1\n",
            stderr="[INFO] [FLOW][START] get_token\n[INFO] [FLOW][DONE] get_online_ips\n",
        ),
    )

    input_payload = _llm_input_payload(calls[0])
    result_payload = input_payload["result"]
    assert "diagnostic_log" not in result_payload
    assert "stdout" not in result_payload
    assert "stderr" not in result_payload
    assert "[FLOW][START]" not in json.dumps(input_payload, ensure_ascii=False)


def test_openai_compatible_client_supports_plain_chat():
    calls = []

    def transport(url, headers, payload, timeout_sec):
        calls.append(payload)
        return {"choices": [{"message": {"content": "普通回答"}}]}

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        transport=transport,
    )

    answer = client.chat("你好", context={"CLIENT_SECRET": "real-secret"})

    serialized_prompt = str(calls[0])
    assert answer == "普通回答"
    assert "response_format" not in calls[0]
    assert "real-secret" not in serialized_prompt
    assert "不要自动触发部署" in calls[0]["messages"][0]["content"]
    assert "不要输出 `<think>`" in calls[0]["messages"][0]["content"]


def test_openai_compatible_client_strips_think_from_plain_chat():
    def transport(url, headers, payload, timeout_sec):
        return {"choices": [{"message": {"content": "开头<think>内部思考</think>结论"}}]}

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        transport=transport,
    )

    answer = client.chat("你好")

    assert answer == "开头结论"


def test_openai_compatible_client_streams_plain_chat_and_filters_think():
    calls = []

    def stream_transport(url, headers, payload, timeout_sec):
        calls.append((url, headers, payload, timeout_sec))
        return iter(["开头", "<thi", "nk>内部思考", "</think>结论", "。"])

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        stream_transport=stream_transport,
    )

    answer = "".join(client.chat_stream("你好", context={"CLIENT_SECRET": "real-secret"}))

    assert answer == "开头结论。"
    assert calls[0][0] == "https://llm.example/v1/chat/completions"
    assert calls[0][1]["Authorization"] == "Bearer secret-key"
    assert calls[0][2]["stream"] is True
    assert "response_format" not in calls[0][2]
    assert "real-secret" not in json.dumps(calls[0][2], ensure_ascii=False)


def test_openai_compatible_client_analyzes_log_with_redaction():
    calls = []

    def transport(url, headers, payload, timeout_sec):
        calls.append(payload)
        return {"choices": [{"message": {"content": "<think>隐藏分析</think>日志分析"}}]}

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        transport=transport,
    )

    answer = client.analyze_log("ERROR CLIENT_SECRET=real-secret", question="为什么失败", source_path="agent.log")

    input_payload = _llm_input_payload(calls[0])
    assert answer == "日志分析"
    assert input_payload["source_path"] == "agent.log"
    assert input_payload["question"] == "为什么失败"
    assert "real-secret" not in json.dumps(input_payload, ensure_ascii=False)
    assert "不要因为日志来自 stderr" in calls[0]["messages"][0]["content"]
    assert "不要输出 `<think>`" in calls[0]["messages"][0]["content"]


def test_openai_compatible_client_proposes_single_action():
    calls = []

    def transport(url, headers, payload, timeout_sec):
        calls.append(payload)
        return {
            "choices": [
                {
                    "message": {
                        "content": (
                            '{"action":"verify-ip","ip":"192.168.1.10","kwargs":{"timeout_sec":10},'
                            '"reason":"用户要求健康检查","risk_level":"low","requires_confirmation":false}'
                        )
                    }
                }
            ]
        }

    client = OpenAICompatibleLlmClient(
        base_url="https://llm.example/v1",
        api_key="secret-key",
        model="model-a",
        transport=transport,
    )

    proposal = client.propose_action(
        "检查 192.168.1.10",
        ["verify-ip", "get-online-ips"],
        {"CLIENT_SECRET": "real-secret"},
        state_summary={"node_url_present": True},
    )

    input_payload = _llm_input_payload(calls[0])
    assert proposal.action == "verify-ip"
    assert proposal.ip == "192.168.1.10"
    assert proposal.kwargs == {"timeout_sec": 10}
    assert proposal.risk_level == "low"
    assert proposal.requires_confirmation is True
    assert "real-secret" not in json.dumps(input_payload, ensure_ascii=False)


def test_rule_based_client_proposes_only_explicit_action():
    client = RuleBasedLlmClient()

    proposal = client.propose_action("请 verify-ip 192.168.1.10", ["verify-ip"], {}, {})
    unknown = client.propose_action("帮我检查一下", ["verify-ip"], {}, {})

    assert proposal.action == "verify-ip"
    assert proposal.ip == "192.168.1.10"
    assert unknown.action == ""


def _llm_input_payload(request_payload):
    content = request_payload["messages"][1]["content"]
    _, _, raw_json = content.partition("输入 JSON:\n")
    return json.loads(raw_json)
|
|||||||
83
tests/test_llm_text_filter.py
Normal file
@ -0,0 +1,83 @@
from pam_deploy_graph.llm.text_filter import filter_thinking_chunks, strip_thinking_text


def test_strip_thinking_text_removes_complete_block():
    text = "开头<think>这里是很长的内部思考\n不应该展示</think>结论"

    assert strip_thinking_text(text) == "开头结论"


def test_strip_thinking_text_removes_unclosed_block():
    text = "可见内容\n<THINK>未闭合的内部思考不应该展示"

    assert strip_thinking_text(text) == "可见内容"


def test_filter_thinking_chunks_handles_split_tags():
    chunks = ["回答", "<thi", "nk>隐藏", "内容</th", "ink>继续。"]

    visible = list(filter_thinking_chunks(chunks))

    assert "".join(visible) == "回答继续。"


def test_filter_thinking_chunks_drops_unclosed_think_tail():
    chunks = ["回答", "<think>", "隐藏内容"]

    visible = list(filter_thinking_chunks(chunks))

    assert "".join(visible) == "回答"


def test_strip_thinking_text_removes_explicit_thinking_process_without_tags():
    text = """Thinking Process:
1. **Analyze the Request:**
* Input: JSON object containing context and user_text ("你是谁?").
* Role: PAM Deployment Agent Interaction Assistant.
* Constraints:
* Do NOT automatically trigger deployment, rollback, upgrade, script execution, or MCP calls.
* Do NOT output secrets.
2. **Determine the Response:**
* The user is asking about my identity.
* I need to introduce myself briefly.
3. **Drafting the Response:**
* Greeting/Identity: 我是 PAM 部署 Agent 的交互助手。
* Function: 我可以回答普通问题、解释命令和部署流程。
4. **Refining the Response:**
* Keep it concise and friendly.

我是 PAM 部署 Agent 的交互助手。
我可以回答普通问题、解释当前 Agent 的命令和部署流程。
"""

    visible = strip_thinking_text(text)

    assert "Thinking Process" not in visible
    assert "Analyze the Request" not in visible
    assert "Determine the Response" not in visible
    assert "Drafting the Response" not in visible
    assert "我是 PAM 部署 Agent 的交互助手。" in visible
    assert visible.startswith("我是 PAM 部署 Agent")


def test_strip_thinking_text_keeps_content_after_final_answer_marker():
    text = """Reasoning Process:
I should not expose this.
Final Answer: 可以,我只展示最终回答。
"""

    assert strip_thinking_text(text) == "可以,我只展示最终回答。"


def test_filter_thinking_chunks_suppresses_explicit_reasoning_until_finish():
    chunks = [
        "Think",
        "ing Process:\n",
        "I should hide this reasoning.\n",
        "Final Answer: ",
        "这是最终回答。",
    ]

    visible = list(filter_thinking_chunks(chunks))

    assert "".join(visible) == "这是最终回答。"
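The split-tag tests in tests/test_llm_text_filter.py imply a streaming filter that buffers a possible partial tag until it can be resolved. The sketch below is one way such a filter could work; it is not the project's `filter_thinking_chunks` (whose implementation is not shown in this diff), and the names `filter_think_sketch` and `_held_back` are hypothetical. It handles only `<think>` tags, not the untagged "Thinking Process:" heuristics that the later tests cover.

```python
import re
from typing import Iterable, Iterator

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def _held_back(text: str, tag: str) -> int:
    """Length of the longest tail of `text` that could still grow into `tag`."""
    for size in range(min(len(tag) - 1, len(text)), 0, -1):
        if text[-size:].lower() == tag[:size]:
            return size
    return 0


def filter_think_sketch(chunks: Iterable[str]) -> Iterator[str]:
    """Yield only text outside <think>...</think>, even when tags are split."""
    buffer = ""
    hidden = False  # True while inside a think block
    for chunk in chunks:
        buffer += chunk
        while True:
            tag = CLOSE_TAG if hidden else OPEN_TAG
            match = re.search(re.escape(tag), buffer, re.IGNORECASE)
            if match is None:
                break
            if not hidden and match.start() > 0:
                yield buffer[: match.start()]
            buffer = buffer[match.end():]
            hidden = not hidden
        # No complete tag left: hold back only a tail that may be a partial tag.
        keep = _held_back(buffer, CLOSE_TAG if hidden else OPEN_TAG)
        visible, buffer = buffer[: len(buffer) - keep], buffer[len(buffer) - keep:]
        if not hidden and visible:
            yield visible
    if not hidden and buffer:
        yield buffer  # an incomplete "<thi" at end of stream is ordinary text
```

Holding back at most `len(tag) - 1` characters keeps the added latency small while still ensuring that a tag split across chunk boundaries is never emitted. An unclosed block is dropped at end of stream, which matches the behaviour the tests expect.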
66
tests/test_logging_utils.py
Normal file
@ -0,0 +1,66 @@
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

from pam_deploy_graph.logging_utils import configure_logging, json_for_log, redact_for_log


def test_redact_for_log_masks_sensitive_keys_and_inline_assignments():
    payload = {
        "CLIENT_SECRET": "home-secret",
        "api_key": "llm-key",
        "nested": {
            "Authorization": "Bearer token-value",
            "message": "CLIENT_SECRET=abc api_key:xyz Authorization=Bearer raw-token header Bearer plain-token",
        },
    }

    redacted = redact_for_log(payload)
    serialized = json_for_log(payload)

    assert redacted["CLIENT_SECRET"] == "***"
    assert redacted["api_key"] == "***"
    assert redacted["nested"]["Authorization"] == "***"
    assert "home-secret" not in serialized
    assert "llm-key" not in serialized
    assert "token-value" not in serialized
    assert "CLIENT_SECRET=***" in serialized
    assert "api_key:***" in serialized
    assert "Authorization=***" in serialized
    assert "Bearer ***" in serialized
    assert "raw-token" not in serialized
    assert "plain-token" not in serialized


def test_configure_logging_uses_daily_rotation_and_retention(tmp_path: Path):
    log_path = tmp_path / "pam_deploy_agent.log"
    package_logger = logging.getLogger("pam_deploy_graph")
    previous_handlers = list(package_logger.handlers)
    for handler in previous_handlers:
        package_logger.removeHandler(handler)

    try:
        result = configure_logging(log_file=log_path, level="DEBUG", retention_days=3)

        assert result == log_path
        handlers = [handler for handler in package_logger.handlers if isinstance(handler, TimedRotatingFileHandler)]
        assert len(handlers) == 1
        handler = handlers[0]
        assert Path(handler.baseFilename) == log_path.resolve()
        assert handler.when == "MIDNIGHT"
        assert handler.backupCount == 3
        assert package_logger.level == logging.DEBUG

        configure_logging(log_file=log_path, level="INFO", retention_days=5)

        handlers = [handler for handler in package_logger.handlers if isinstance(handler, TimedRotatingFileHandler)]
        assert len(handlers) == 1
        assert handlers[0] is handler
        assert handler.backupCount == 5
        assert package_logger.level == logging.INFO
    finally:
        for handler in list(package_logger.handlers):
            package_logger.removeHandler(handler)
            handler.close()
        for handler in previous_handlers:
            package_logger.addHandler(handler)
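The redaction test in tests/test_logging_utils.py expects both key-based masking and masking of `KEY=value` / `KEY:value` assignments embedded in free text. The following is a minimal sketch of that idea, assuming a fixed list of sensitive key names; `redact_sketch`, `SENSITIVE_KEYS` and the regular expressions are hypothetical and are not taken from `pam_deploy_graph.logging_utils`, whose real rules may differ.

```python
import re
from typing import Any

# Assumed key list for illustration; the project's actual list is not shown.
SENSITIVE_KEYS = ("client_secret", "api_key", "authorization", "token", "password")

# KEY=value or KEY:value, optionally with a "Bearer " prefix before the value.
_INLINE = re.compile(
    r"\b(" + "|".join(SENSITIVE_KEYS) + r")(\s*[=:]\s*)(?:Bearer\s+)?\S+",
    re.IGNORECASE,
)
# A bare "Bearer <token>" that is not part of an assignment.
_BEARER = re.compile(r"\bBearer\s+\S+", re.IGNORECASE)


def redact_sketch(value: Any) -> Any:
    """Return a copy of `value` with secrets replaced by '***'."""
    if isinstance(value, dict):
        return {
            key: "***" if str(key).lower() in SENSITIVE_KEYS else redact_sketch(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact_sketch(item) for item in value]
    if isinstance(value, str):
        text = _INLINE.sub(r"\g<1>\g<2>***", value)
        return _BEARER.sub("Bearer ***", text)
    return value
```

Masking by key first and by pattern second means a secret is hidden whether it arrives as a structured field or inside a log message. A deny-list like this can still miss unfamiliar key names, so it is best treated as a safety net rather than a guarantee.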
@ -215,6 +215,51 @@ def test_mcp_runner_passes_hash_code_and_node_url():
|
|||||||
assert calls[0][1]["nodeUrl"] == "https://pam.node"
|
assert calls[0][1]["nodeUrl"] == "https://pam.node"
|
||||||


def test_mcp_runner_passes_parent_version_only_for_download_task():
    calls = []

    class Client:
        def call_tool(self, tool_name, arguments):
            calls.append((tool_name, arguments))
            return {"ACTION": "create-download-task", "SUCCESS": "true"}

    runner = McpActionRunner(client=Client())

    result = runner.run(
        "create-download-task",
        params={
            "VERSION_NUMBER": "2.0.5",
            "PARENT_VERSION_NUMBER": "2.0.4",
        },
    )

    assert result.ok is True
    assert calls[0][1]["versionNumber"] == "2.0.5"
    assert calls[0][1]["parentVersionNumber"] == "2.0.4"


def test_mcp_runner_omits_blank_parent_version():
    calls = []

    class Client:
        def call_tool(self, tool_name, arguments):
            calls.append((tool_name, arguments))
            return {"ACTION": "create-download-task", "SUCCESS": "true"}

    runner = McpActionRunner(client=Client())

    result = runner.run(
        "create-download-task",
        params={
            "VERSION_NUMBER": "2.0.5",
            "PARENT_VERSION_NUMBER": "",
        },
    )

    assert result.ok is True
    assert "parentVersionNumber" not in calls[0][1]
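The two MCP tests above pin down one rule: `parentVersionNumber` is sent only when `PARENT_VERSION_NUMBER` is non-blank. A minimal sketch of an argument builder with that behaviour might look like the following; `build_download_task_arguments` is a hypothetical helper, not a function from `McpActionRunner`, and treating whitespace-only values as blank is an assumption.

```python
from typing import Any, Dict, Mapping


def build_download_task_arguments(params: Mapping[str, Any]) -> Dict[str, str]:
    """Map deploy params to tool arguments, omitting a blank parent version."""
    arguments = {"versionNumber": str(params["VERSION_NUMBER"])}
    # Assumption: whitespace-only values count as blank.
    parent = str(params.get("PARENT_VERSION_NUMBER") or "").strip()
    if parent:
        arguments["parentVersionNumber"] = parent
    return arguments
```

Omitting the key, rather than sending an empty string, lets the server apply its own default for a missing parent version.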
|
|
||||||
|
|
||||||
def _write_json_config(tmpdir, payload):
|
def _write_json_config(tmpdir, payload):
|
||||||
path = tmpdir / "mcp.json"
|
path = tmpdir / "mcp.json"
|
||||||
path.write_text(__import__("json").dumps(payload), encoding="utf-8")
|
path.write_text(__import__("json").dumps(payload), encoding="utf-8")
|
||||||
|
|||||||
@ -1,3 +1,5 @@
|
|||||||
|
from pathlib import Path
|
||||||
|
|
||||||
from pam_deploy_graph.script_runner import ScriptActionRunner
|
from pam_deploy_graph.script_runner import ScriptActionRunner
|
||||||
|
|
||||||
|
|
||||||
@ -46,3 +48,23 @@ def test_build_powershell_action_command():
|
|||||||
"-RollbackStopFirst",
|
"-RollbackStopFirst",
|
||||||
]
|
]
|
||||||


def test_shell_rollback_uses_query_parameters_not_form_body():
    text = Path("doc_scripts/deploy.sh").read_text(encoding="utf-8")
    start = text.index("rollback_ip()")
    end = text.index("run_manual_rollback()", start)
    rollback_block = text[start:end]

    assert "/api/mcp/version/upgrade/rollback?${rollback_query}" in rollback_block
    assert "application/x-www-form-urlencoded" not in rollback_block


def test_powershell_rollback_uses_query_parameters_not_form_body():
    text = Path("doc_scripts/deploy.ps1").read_text(encoding="utf-8")
    start = text.index("function Invoke-Rollback")
    end = text.index("function Invoke-IpDeploy", start)
    rollback_block = text[start:end]

    assert "/api/mcp/version/upgrade/rollback?$query" in rollback_block
    assert "-Body $body" not in rollback_block
    assert "application/x-www-form-urlencoded" not in rollback_block