diff --git a/doc_scripts/PAM智能部署 Agent Skill 文档.md.md b/doc_scripts/PAM智能部署 Agent Skill 文档.md.md index fecc62a..12f793f 100644 --- a/doc_scripts/PAM智能部署 Agent Skill 文档.md.md +++ b/doc_scripts/PAM智能部署 Agent Skill 文档.md.md @@ -48,6 +48,8 @@ description: Based on the PAM HOME/NODE flow, performs software release, download, upgrade, 8. In Windows script mode, prefer `deploy.ps1` by default; do not default to `deploy.bat`. 9. If the current directory contains only documentation and no real script files, first create the scripts from the reference implementation, then decide whether to run them. 10. `download-cloud` only triggers the cloud download task; afterwards, call the progress endpoint asynchronously and keep showing the status/progress until success, failure, or timeout. +11. In script mode, emit a uniform flow log before and after every key method, including at least `[FLOW][START]`, `[FLOW][DONE]`, and `[FLOW][FAIL]`. +12. When a rollback is found to be necessary, the script only records `PENDING_AGENT_CONFIRMATION(...)` and must not roll back automatically; the Agent must first confirm with the user, then either use the manual rollback entry point or call the rollback endpoint directly. ## Unified deployment flow @@ -61,10 +63,10 @@ description: Based on the PAM HOME/NODE flow, performs software release, download, upgrade, | 2.3 | Publish version | `PUT {HOME_BASE_URL}/api/version/upgrade/profile?...` | | 3.1 | Get Node address | `GET {HOME_BASE_URL}/api/mcp/airport/target-node?airportCode={airportCode}` | | 3.2 | Get online workstation IPs | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/ips?...` | -| 3.3 | Download package to Node | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/download-cloud?...` | -| 3.3b | Poll asynchronously and show download progress | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/download-cloud/progress?...&versionNumer={versionNumber}` | -| 4.1 | Run the upgrade for each IP | `POST {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade` | -| 4.2 | Start application | `POST {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/start-stop` | +| 3.3 | Download package to Node | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/download-cloud?...&timeOut=0` | +| 3.3b | Poll asynchronously and show download progress | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/download-cloud/progress?...&versionNumber={versionNumber}` | +| 4.1 | Run the upgrade for each IP | `POST {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade?airportCode=...&targetIp=...` | +| 4.2 | Start application | `POST {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/start-stop?airportCode=...&targetIp=...&runStart=true` | | 
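4.3 | Health check | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/verify?...` | | 4.4 | Download logs | `GET {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/log-download?...` | | 4.x | Roll back on failure | `POST {HOME_BASE_URL}/node-proxy/{airportCode}/api/mcp/version/upgrade/rollback` | @@ -83,6 +85,13 @@ description: Based on the PAM HOME/NODE flow, performs software release, download, upgrade, When `msg=success`, `step=DONE`, and `rateOfProgress=100`, the cloud download is considered complete; `rateOfProgress` is the download progress value and should be shown continuously. +Additional endpoint parameter conventions: + +- For `POST /api/mcp/version/upgrade`, append the business parameters directly to the `?query` string; do not put them in a form body. +- For `POST /api/mcp/version/upgrade/start-stop`, append the business parameters directly to the `?query` string; do not put them in a form body. +- The start/stop endpoint parameter is always named `runStart`; do not use `runstart` any more. +- The `download-cloud` task-creation endpoint always passes `timeOut=0`, meaning it returns as soon as the task is created; progress is then polled asynchronously through the progress endpoint instead of waiting on a long timeout. + ## MCP mode 1. Call the capabilities provided by PAM MCP directly to complete the flow above; do not generate local script files. @@ -110,13 +119,13 @@ description: Based on the PAM HOME/NODE flow, performs software release, download, upgrade, ## Failure handling and rollback 1. If `Step 2` or `Step 3` fails, abort the whole deployment and report the failed stage. -2. If the `Step 4.1` upgrade fails, record the failure reason for that IP, attempt a rollback, then continue with the other online IPs unless the user requires all-or-nothing. -3. If the `Step 4.3` health check fails, proceed in this order: - - Call `start-stop` to stop the application, `runstart=false` +2. If the `Step 4.1` upgrade fails, record the failure reason for that IP and mark the rollback status as `PENDING_AGENT_CONFIRMATION(stopFirst=false)`; the Agent must first ask the user whether to roll back, then decide whether to call `rollback`. +3. If the `Step 4.3` health check fails, record the failure reason and mark the rollback status as `PENDING_AGENT_CONFIRMATION(stopFirst=true)`; if the user confirms the rollback, proceed in this order: + - Call `start-stop` to stop the application, `runStart=false` - Call `rollback` - Run the health check again - Download the rollback-stage logs -4. Whether the rollback succeeded must also be written into the final report; recording only "rollback attempted" is not enough. +4. 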
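The final report must state clearly whether the rollback was "not executed", "pending Agent confirmation", or "executed, with its result"; recording only "rollback attempted" is not enough. ## Output requirements @@ -125,7 +134,7 @@ description: Based on the PAM HOME/NODE flow, performs software release, download, upgrade, - Mode used: `MCP` or `API脚本` (API script) - Airport, application, module, version - Total online workstations, success count, failure count -- Per-IP status, failed stage, failure reason, rollback result, and log location or log summary +- Per-IP status, failed stage, failure reason, rollback result or `PENDING_AGENT_CONFIRMATION(...)` status, and log location or log summary - In script mode: the names of the files actually generated or executed The output can follow this structure: diff --git a/doc_scripts/PAM智能部署 Shell & Bat 脚本实现.md.md b/doc_scripts/PAM智能部署 Shell & Bat 脚本实现.md.md index 0398a4a..310e7ac 100644 --- a/doc_scripts/PAM智能部署 Shell & Bat 脚本实现.md.md +++ b/doc_scripts/PAM智能部署 Shell & Bat 脚本实现.md.md @@ -28,8 +28,10 @@ - `config.txt` - `deploy.sh` or `deploy.ps1` - Provide `deploy.bat` only when the user explicitly asks for it -7. NODE-side endpoint paths always use `node-proxy`; `download-cloud/progress` must additionally carry `versionNumer`, and download progress is shown continuously through asynchronous polling. +7. NODE-side endpoint paths always use `node-proxy`; `download-cloud/progress` must additionally carry `versionNumber`, and download progress is shown continuously through asynchronous polling. 8. To decide whether `download-cloud/progress` is complete, read `msg`, `step`, and `rateOfProgress` first; `msg=success`, `step=DONE`, and `rateOfProgress=100` together mean the download is complete, where `rateOfProgress` is the download progress value. +9. The main deployment script never rolls back automatically; when a rollback is needed it only outputs `PENDING_AGENT_CONFIRMATION(stopFirst=...)`, and the Agent confirms with the user before calling the manual rollback entry point. +10. The business parameters of `POST /api/mcp/version/upgrade` and `POST /api/mcp/version/upgrade/start-stop` go directly in the URL query and no longer use a form body; the start/stop endpoint parameter is named `runStart`; `download-cloud` always passes `timeOut=0` when creating the task. ## 0.1 Current implementation boundaries @@ -382,14 +384,14 @@ main() { # Step 3.3: Download package to Node log_info "Step 3.3: downloading package to Node..." 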
local download_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud" - local download_params="?versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}" + local download_params="?versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=0" http_request "GET" "${download_url}${download_params}" \ "" \ "airport-code:${AIRPORT_CODE},Target-Node:${NODE_URL}" > /dev/null # Poll download progress - local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumer=${VERSION_NUMBER}" + local progress_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud/progress?applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&airportCode=${AIRPORT_CODE}&versionNumber=${VERSION_NUMBER}" poll_progress "$progress_url" 60 2 if [ $? -ne 0 ]; then log_error "Package download failed" @@ -407,11 +409,10 @@ main() { # 4.1: Run upgrade log_info "Step 4.1: running upgrade..." 
- local upgrade_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade" - local upgrade_data="airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&versionNumber=${VERSION_NUMBER}&action=${ACTION_TYPE}&autoStart=false&timeOut=${TIMEOUT}" + local upgrade_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade?airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&versionNumber=${VERSION_NUMBER}&action=${ACTION_TYPE}&autoStart=false&timeOut=${TIMEOUT}" local upgrade_response - upgrade_response=$(http_request "POST" "$upgrade_url" "$upgrade_data" "Target-Node:${NODE_URL}") + upgrade_response=$(http_request "POST" "$upgrade_url" "" "Target-Node:${NODE_URL}") local upgrade_success upgrade_success=$(echo $upgrade_response | jq -r '.success' 2>/dev/null) @@ -420,19 +421,13 @@ main() { ip_status="FAILURE" ip_message="Upgrade failed" log_error "Upgrade failed: $upgrade_response" - log_warn "Attempting rollback..." - - # Rollback logic - local rollback_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/rollback" - local rollback_data="airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}" - http_request "POST" "$rollback_url" "$rollback_data" "Target-Node:${NODE_URL}" > /dev/null - log_warn "Rollback triggered" + log_warn "Rollback is required, but this script does not run it automatically. Have the Agent confirm with the user, then call the manual rollback entry point." + rollback_result="PENDING_AGENT_CONFIRMATION(stopFirst=false)" else # 4.2: Start application log_info "Step 4.2: starting application..." 
- local start_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop" - local start_data="airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runstart=true" - http_request "POST" "$start_url" "$start_data" "Target-Node:${NODE_URL}" > /dev/null + local start_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop?airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runStart=true" + http_request "POST" "$start_url" "" "Target-Node:${NODE_URL}" > /dev/null # 4.3: Health check log_info "Step 4.3: running health check..." @@ -448,11 +443,9 @@ main() { ip_message=$(echo $verify_response | jq -r '.message // "Unknown error"' 2>/dev/null) log_error "Health check failed: ${ip_message} (raw response: $verify_response)" - # A failed health check can also trigger a rollback (optional) - log_warn "Attempting rollback..." - local rollback_url="${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/rollback" - local rollback_data="airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}" - http_request "POST" "$rollback_url" "$rollback_data" "Target-Node:${NODE_URL}" > /dev/null + # On health check failure, only mark the rollback as pending confirmation; do not run it automatically + log_warn "Rollback is required, but this script does not run it automatically. Have the Agent confirm with the user, then call the manual rollback entry point." + rollback_result="PENDING_AGENT_CONFIRMATION(stopFirst=true)" else log_info "Health check passed" fi diff --git a/doc_scripts/deploy.ps1 b/doc_scripts/deploy.ps1 index 693a574..ad5e920 100644 --- a/doc_scripts/deploy.ps1 +++ b/doc_scripts/deploy.ps1 @@ -1,5 +1,7 @@ param( [string]$ConfigPath = (Join-Path $PSScriptRoot 'config.txt'), + [string]$RollbackIp = '', + [switch]$RollbackStopFirst, [switch]$Help ) @@ -11,6 +13,7 @@ function Show-DeployUsage { @' Usage: powershell -File .\deploy.ps1 [-ConfigPath .\config.txt] + powershell -File .\deploy.ps1 [-ConfigPath .\config.txt] -RollbackIp 192.168.1.10 [-RollbackStopFirst] Notes: - deploy.bat is only a wrapper for this script. 
@@ -22,6 +25,78 @@ Notes: function Write-Info([string]$Message) { Write-Host "[INFO] $Message" } function Write-WarnLog([string]$Message) { Write-Host "[WARN] $Message" } function Write-ErrLog([string]$Message) { Write-Host "[ERROR] $Message" } +$script:ActiveConfigPath = $ConfigPath + +function Write-FlowStart([string]$Name, [string]$Detail = '') { + if ($Detail) { + Write-Info "[FLOW][START] $Name | $Detail" + } else { + Write-Info "[FLOW][START] $Name" + } +} + +function Write-FlowDone([string]$Name, [string]$Detail = '') { + if ($Detail) { + Write-Info "[FLOW][DONE] $Name | $Detail" + } else { + Write-Info "[FLOW][DONE] $Name" + } +} + +function Write-FlowFail([string]$Name, [string]$Detail = '') { + if ($Detail) { + Write-ErrLog "[FLOW][FAIL] $Name | $Detail" + } else { + Write-ErrLog "[FLOW][FAIL] $Name" + } +} + +function Invoke-FlowStep { + param( + [string]$Name, + [scriptblock]$Action, + [string]$Detail = '' + ) + + Write-FlowStart -Name $Name -Detail $Detail + try { + $result = & $Action + Write-FlowDone -Name $Name + return $result + } catch { + Write-FlowFail -Name $Name -Detail $_.Exception.Message + throw + } +} + +function Get-ManualRollbackCommand { + param( + [string]$Ip, + [bool]$StopFirst + ) + + $command = "powershell -File .\deploy.ps1 -ConfigPath `"$script:ActiveConfigPath`" -RollbackIp `"$Ip`"" + if ($StopFirst) { + $command += ' -RollbackStopFirst' + } + + return $command +} + +function Get-PendingRollbackStatus { + param( + [string]$Ip, + [string]$Stage, + [bool]$StopFirst, + [string]$Reason + ) + + $status = "PENDING_AGENT_CONFIRMATION(stopFirst=$($StopFirst.ToString().ToLowerInvariant()))" + $command = Get-ManualRollbackCommand -Ip $Ip -StopFirst $StopFirst + Write-WarnLog "Rollback required: ip=$Ip stage=$Stage reason=$Reason stopFirst=$StopFirst" + Write-WarnLog "This script does not roll back automatically. Have the Agent confirm with the user, then run: $command" + return $status +} function Convert-ResponseContent { param([AllowNull()][string]$Content) @@ -411,7 +486,7 @@ function 
Wait-DownloadProgress { applicationName = $Config.APP_NAME moduleName = $Config.MODULE_NAME airportCode = $Config.AIRPORT_CODE - versionNumer = $Config.VERSION_NUMBER + versionNumber = $Config.VERSION_NUMBER }) $progressUrl = "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/download-cloud/progress?$query" @@ -466,7 +541,7 @@ function Download-CloudToNode { versionNumber = $Config.VERSION_NUMBER applicationName = $Config.APP_NAME moduleName = $Config.MODULE_NAME - timeOut = $Config.TIMEOUT + timeOut = '0' }) [void](Invoke-PamWebRequest -Method GET -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/download-cloud?$query" -Token $Token -Headers @{ @@ -480,7 +555,7 @@ function Download-CloudToNode { function Invoke-UpgradeRequest { param($Config, [string]$Token, [string]$NodeUrl, [string]$Ip) - $body = Join-RequestPairs ([ordered]@{ + $query = Join-RequestPairs ([ordered]@{ airportCode = $Config.AIRPORT_CODE targetIp = $Ip applicationName = $Config.APP_NAME @@ -491,41 +566,41 @@ function Invoke-UpgradeRequest { timeOut = $Config.TIMEOUT }) - Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade" -Token $Token -Headers @{ + Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade?$query" -Token $Token -Headers @{ 'Target-Node' = $NodeUrl - } -Body $body -ContentType 'application/x-www-form-urlencoded' + } } function Start-Application { param($Config, [string]$Token, [string]$NodeUrl, [string]$Ip) - $body = Join-RequestPairs ([ordered]@{ + $query = Join-RequestPairs ([ordered]@{ airportCode = $Config.AIRPORT_CODE targetIp = $Ip applicationName = $Config.APP_NAME moduleName = $Config.MODULE_NAME - runstart = 'true' + runStart = 'true' }) - [void](Invoke-PamWebRequest -Method POST -Url 
"$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/start-stop" -Token $Token -Headers @{ + [void](Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/start-stop?$query" -Token $Token -Headers @{ 'Target-Node' = $NodeUrl - } -Body $body -ContentType 'application/x-www-form-urlencoded') + }) } function Stop-Application { param($Config, [string]$Token, [string]$NodeUrl, [string]$Ip) - $body = Join-RequestPairs ([ordered]@{ + $query = Join-RequestPairs ([ordered]@{ airportCode = $Config.AIRPORT_CODE targetIp = $Ip applicationName = $Config.APP_NAME moduleName = $Config.MODULE_NAME - runstart = 'false' + runStart = 'false' }) - [void](Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/start-stop" -Token $Token -Headers @{ + [void](Invoke-PamWebRequest -Method POST -Url "$($Config.HOME_BASE_URL)/node-proxy/$($Config.AIRPORT_CODE)/api/mcp/version/upgrade/start-stop?$query" -Token $Token -Headers @{ 'Target-Node' = $NodeUrl - } -Body $body -ContentType 'application/x-www-form-urlencoded') + }) } function Verify-Ip { @@ -632,9 +707,13 @@ function Invoke-IpDeploy { Write-Info "Processing IP: $Ip" try { - $upgrade = Invoke-UpgradeRequest -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $upgrade = Invoke-FlowStep -Name "Invoke-UpgradeRequest[$Ip]" -Action { + Invoke-UpgradeRequest -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } } catch { - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'FAILED' @@ -646,10 +725,12 @@ function Invoke-IpDeploy { } if ((Get-ResponseValue -Response $upgrade -Candidates @('success')) -ne 'true') { - $rollback = 
Invoke-Rollback -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip -StopFirst:$false - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip $message = Get-ResponseValue -Response $upgrade -Candidates @('message') if (-not $message) { $message = 'Upgrade failed' } + $rollback = Get-PendingRollbackStatus -Ip $Ip -Stage 'UPGRADE' -StopFirst:$false -Reason $message + $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'FAILED' @@ -661,10 +742,14 @@ function Invoke-IpDeploy { } try { - Start-Application -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + Invoke-FlowStep -Name "Start-Application[$Ip]" -Action { + Start-Application -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } | Out-Null } catch { - $rollback = Invoke-Rollback -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip -StopFirst:$true - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $rollback = Get-PendingRollbackStatus -Ip $Ip -Stage 'START' -StopFirst:$true -Reason 'Application start failed' + $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'FAILED' @@ -676,10 +761,14 @@ function Invoke-IpDeploy { } try { - $verify = Verify-Ip -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $verify = Invoke-FlowStep -Name "Verify-Ip[$Ip]" -Action { + Verify-Ip -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } } catch { - $rollback = Invoke-Rollback -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip -StopFirst:$true - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $rollback = Get-PendingRollbackStatus -Ip $Ip -Stage 'VERIFY' -StopFirst:$true -Reason 'Health check request failed' 
+ $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'FAILED' @@ -691,7 +780,9 @@ function Invoke-IpDeploy { } if ((Get-ResponseValue -Response $verify -Candidates @('success')) -eq 'true') { - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'SUCCESS' @@ -702,10 +793,12 @@ function Invoke-IpDeploy { } } - $rollback = Invoke-Rollback -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip -StopFirst:$true - $logFile = Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip $message = Get-ResponseValue -Response $verify -Candidates @('message') if (-not $message) { $message = 'Health check failed' } + $rollback = Get-PendingRollbackStatus -Ip $Ip -Stage 'VERIFY' -StopFirst:$true -Reason $message + $logFile = Invoke-FlowStep -Name "Download-DeployLog[$Ip]" -Action { + Download-DeployLog -Config $Config -Token $Token -NodeUrl $NodeUrl -Ip $Ip + } return [pscustomobject]@{ Ip = $Ip Status = 'FAILED' @@ -749,18 +842,37 @@ function Write-DeployReport { function Invoke-PamDeploy { param([string]$ConfigPath) - $config = Get-PamConfig -Path $ConfigPath - Test-ZipFile -Config $config + $script:ActiveConfigPath = $ConfigPath + $config = Invoke-FlowStep -Name 'Get-PamConfig' -Detail "path=$ConfigPath" -Action { + Get-PamConfig -Path $ConfigPath + } + Invoke-FlowStep -Name 'Test-ZipFile' -Action { + Test-ZipFile -Config $config + } | Out-Null Write-Info "Deploy start: airport=$($config.AIRPORT_CODE), version=$($config.VERSION_NUMBER), module=$($config.APP_NAME)/$($config.MODULE_NAME)" - $token = Get-Token -Config $config - New-VersionRecord -Config $config -Token $token - $hashCode = 
Upload-Package -Config $config -Token $token - Publish-Version -Config $config -Token $token -HashCode $hashCode - $nodeUrl = Get-NodeUrl -Config $config -Token $token - $ips = Get-OnlineIps -Config $config -Token $token -NodeUrl $nodeUrl - Download-CloudToNode -Config $config -Token $token -NodeUrl $nodeUrl + $token = Invoke-FlowStep -Name 'Get-Token' -Action { + Get-Token -Config $config + } + Invoke-FlowStep -Name 'New-VersionRecord' -Action { + New-VersionRecord -Config $config -Token $token + } | Out-Null + $hashCode = Invoke-FlowStep -Name 'Upload-Package' -Action { + Upload-Package -Config $config -Token $token + } + Invoke-FlowStep -Name 'Publish-Version' -Action { + Publish-Version -Config $config -Token $token -HashCode $hashCode + } | Out-Null + $nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { + Get-NodeUrl -Config $config -Token $token + } + $ips = Invoke-FlowStep -Name 'Get-OnlineIps' -Action { + Get-OnlineIps -Config $config -Token $token -NodeUrl $nodeUrl + } + Invoke-FlowStep -Name 'Download-CloudToNode' -Action { + Download-CloudToNode -Config $config -Token $token -NodeUrl $nodeUrl + } | Out-Null $results = [System.Collections.Generic.List[object]]::new() foreach ($ip in $ips) { @@ -770,6 +882,33 @@ function Invoke-PamDeploy { Write-DeployReport -Config $config -Results $results -TotalCount $ips.Count } +function Invoke-PamManualRollback { + param( + [string]$ConfigPath, + [string]$Ip, + [bool]$StopFirst + ) + + $script:ActiveConfigPath = $ConfigPath + $config = Invoke-FlowStep -Name 'Get-PamConfig' -Detail "path=$ConfigPath" -Action { + Get-PamConfig -Path $ConfigPath + } + + Write-Info "Manual rollback start: airport=$($config.AIRPORT_CODE), ip=$Ip, stopFirst=$StopFirst" + $token = Invoke-FlowStep -Name 'Get-Token' -Action { + Get-Token -Config $config + } + $nodeUrl = Invoke-FlowStep -Name 'Get-NodeUrl' -Action { + Get-NodeUrl -Config $config -Token $token + } + $result = Invoke-FlowStep -Name "Invoke-Rollback[$Ip]" -Action { + 
Invoke-Rollback -Config $config -Token $token -NodeUrl $nodeUrl -Ip $Ip -StopFirst:$StopFirst + } + + Write-Info "Manual rollback done: ip=$Ip result=$result" + Write-Host "ROLLBACK RESULT: $result" +} + if ($Help) { Show-DeployUsage exit 0 @@ -777,7 +916,11 @@ if ($Help) { if ($MyInvocation.InvocationName -ne '.') { try { - Invoke-PamDeploy -ConfigPath $ConfigPath + if ($RollbackIp) { + Invoke-PamManualRollback -ConfigPath $ConfigPath -Ip $RollbackIp -StopFirst:$RollbackStopFirst.IsPresent + } else { + Invoke-PamDeploy -ConfigPath $ConfigPath + } } catch { Write-ErrLog $_ exit 1 diff --git a/doc_scripts/deploy.sh b/doc_scripts/deploy.sh index 9cc2282..0cb496e 100644 --- a/doc_scripts/deploy.sh +++ b/doc_scripts/deploy.sh @@ -6,6 +6,7 @@ set -uo pipefail SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" DEFAULT_CONFIG_PATH="${SCRIPT_DIR}/config.txt" +ACTIVE_CONFIG_PATH="$DEFAULT_CONFIG_PATH" TOKEN="" HASH_CODE="" @@ -25,6 +26,7 @@ usage() { cat <<'EOF' Usage: ./deploy.sh [--config /path/to/config.txt] + ./deploy.sh [--config /path/to/config.txt] --rollback-ip 192.168.1.10 [--rollback-stop-first] Configuration keys: HOME_BASE_URL @@ -45,6 +47,97 @@ log_info() { printf '[INFO] %s\n' "$*"; } log_warn() { printf '[WARN] %s\n' "$*"; } log_error() { printf '[ERROR] %s\n' "$*" >&2; } +log_flow_start() { + local name="$1" + shift || true + if (($#)); then + log_info "[FLOW][START] ${name} | $*" + else + log_info "[FLOW][START] ${name}" + fi +} + +log_flow_done() { + local name="$1" + shift || true + if (($#)); then + log_info "[FLOW][DONE] ${name} | $*" + else + log_info "[FLOW][DONE] ${name}" + fi +} + +log_flow_fail() { + local name="$1" + shift || true + if (($#)); then + log_error "[FLOW][FAIL] ${name} | $*" + else + log_error "[FLOW][FAIL] ${name}" + fi +} + +run_flow_step() { + local flow_name="$1" + shift + + log_flow_start "$flow_name" + if "$@"; then + log_flow_done "$flow_name" + return 0 + else + local exit_code=$? # capture inside else: after "fi", $? would already be 0 + fi 
+ log_flow_fail "$flow_name" "exit=${exit_code}" + return "$exit_code" +} + +run_flow_capture() { + local __var_name="$1" + local flow_name="$2" + shift 2 + + local output="" + log_flow_start "$flow_name" + if output="$("$@")"; then + printf -v "$__var_name" '%s' "$output" + log_flow_done "$flow_name" + return 0 + else + local exit_code=$? # capture inside else: after "fi", $? would already be 0 + fi + printf -v "$__var_name" '%s' "$output" + local detail="exit=${exit_code}" + if [[ -n "$output" ]]; then + detail="${detail} output=${output//$'\n'/ }" + fi + log_flow_fail "$flow_name" "$detail" + return "$exit_code" +} + +manual_rollback_command() { + local ip="$1" + local stop_first="$2" + local command="./deploy.sh --config \"$ACTIVE_CONFIG_PATH\" --rollback-ip \"$ip\"" + if [[ "$stop_first" == "true" ]]; then + command="${command} --rollback-stop-first" + fi + printf '%s' "$command" +} + +pending_rollback_status() { + local ip="$1" + local stage="$2" + local reason="$3" + local stop_first="$4" + local command + + command="$(manual_rollback_command "$ip" "$stop_first")" + printf '[WARN] Rollback required: ip=%s, stage=%s, reason=%s, stopFirst=%s\n' "$ip" "$stage" "$reason" "$stop_first" >&2 + printf '[WARN] This script does not roll back automatically. Have the Agent confirm with the user, then run: %s\n' "$command" >&2 + printf 'PENDING_AGENT_CONFIRMATION(stopFirst=%s)' "$stop_first" +} + timestamp_now() { date '+%Y-%m-%d %H:%M:%S' } @@ -730,7 +823,7 @@ poll_download_progress() { download_cloud_to_node() { log_info "Step 3.3: downloading package to Node..." 
http_request "GET" \ - "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud?versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=${TIMEOUT}" \ + "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/download-cloud?versionNumber=${VERSION_NUMBER}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&timeOut=0" \ "" \ "" \ "Target-Node: ${NODE_URL}" \ @@ -742,27 +835,27 @@ download_cloud_to_node() { upgrade_ip() { local ip="$1" http_request "POST" \ - "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade" \ - "airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&versionNumber=${VERSION_NUMBER}&action=${ACTION_TYPE}&autoStart=false&timeOut=${TIMEOUT}" \ - "application/x-www-form-urlencoded" \ + "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade?airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&versionNumber=${VERSION_NUMBER}&action=${ACTION_TYPE}&autoStart=false&timeOut=${TIMEOUT}" \ + "" \ + "" \ "Target-Node: ${NODE_URL}" } start_application() { local ip="$1" http_request "POST" \ - "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop" \ - "airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runstart=true" \ - "application/x-www-form-urlencoded" \ + "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop?airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runStart=true" \ + "" \ + "" \ "Target-Node: ${NODE_URL}" >/dev/null } stop_application() { local ip="$1" http_request "POST" \ - "${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop" \ - "airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runstart=false" \ - "application/x-www-form-urlencoded" \ + 
"${HOME_BASE_URL}/node-proxy/${AIRPORT_CODE}/api/mcp/version/upgrade/start-stop?airportCode=${AIRPORT_CODE}&targetIp=${ip}&applicationName=${APP_NAME}&moduleName=${MODULE_NAME}&runStart=false" \ + "" \ + "" \ "Target-Node: ${NODE_URL}" >/dev/null } @@ -865,6 +958,28 @@ rollback_ip() { fi } +run_manual_rollback() { + local config_path="$1" + local ip="$2" + local stop_first="$3" + local rollback_result="" + + ACTIVE_CONFIG_PATH="$config_path" + init_runtime + load_config "$config_path" + ensure_dependencies + + log_info "PAM manual rollback started" + log_info "Airport: ${AIRPORT_CODE}, target IP: ${ip}, stopFirst=${stop_first}" + + run_flow_step "get_token" get_token || return 1 + run_flow_step "get_node_url" get_node_url || return 1 + run_flow_capture rollback_result "rollback_ip[${ip}]" rollback_ip "$ip" "$stop_first" || return 1 + + log_info "Manual rollback finished: ip=${ip}, result=${rollback_result}" + printf 'ROLLBACK RESULT: %s\n' "$rollback_result" +} + add_result() { local ip="$1" local status="$2" @@ -892,59 +1007,59 @@ deploy_one_ip() { local ip="$1" log_info "Processing workstation: $ip" - local upgrade_response - if ! upgrade_response=$(upgrade_ip "$ip"); then - local log_file - log_file="$(download_log "$ip")" + local upgrade_response="" + if ! 
run_flow_capture upgrade_response "upgrade_ip[${ip}]" upgrade_ip "$ip"; then + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "FAILED" "UPGRADE" "Upgrade request failed" "ROLLBACK_NOT_RUN" "$log_file" return fi if [[ "$(json_value "$upgrade_response" '.success')" != "true" ]]; then - local rollback_result - rollback_result="$(rollback_ip "$ip" "false")" - local log_file - log_file="$(download_log "$ip")" local message message="$(json_value "$upgrade_response" '.message')" [[ -z "$message" ]] && message="Upgrade failed" + local rollback_result + rollback_result="$(pending_rollback_status "$ip" "UPGRADE" "$message" "false")" + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "FAILED" "UPGRADE" "$message" "$rollback_result" "$log_file" return fi - if ! start_application "$ip"; then + if ! run_flow_step "start_application[${ip}]" start_application "$ip"; then local rollback_result - rollback_result="$(rollback_ip "$ip" "true")" - local log_file - log_file="$(download_log "$ip")" + rollback_result="$(pending_rollback_status "$ip" "START" "Application start failed" "true")" + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "FAILED" "START" "Application start failed" "$rollback_result" "$log_file" return fi - local verify_response - if ! verify_response="$(verify_ip "$ip")"; then + local verify_response="" + if ! 
run_flow_capture verify_response "verify_ip[${ip}]" verify_ip "$ip"; then local rollback_result - rollback_result="$(rollback_ip "$ip" "true")" - local log_file - log_file="$(download_log "$ip")" + rollback_result="$(pending_rollback_status "$ip" "VERIFY" "Health check request failed" "true")" + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "FAILED" "VERIFY" "Health check request failed" "$rollback_result" "$log_file" return fi if [[ "$(json_value "$verify_response" '.success')" == "true" ]]; then - local log_file - log_file="$(download_log "$ip")" + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "SUCCESS" "-" "-" "-" "$log_file" return fi - local rollback_result - rollback_result="$(rollback_ip "$ip" "true")" - local log_file - log_file="$(download_log "$ip")" local message message="$(json_value "$verify_response" '.message')" [[ -z "$message" ]] && message="Health check failed" + local rollback_result + rollback_result="$(pending_rollback_status "$ip" "VERIFY" "$message" "true")" + local log_file="" + run_flow_capture log_file "download_log[${ip}]" download_log "$ip" || true add_result "$ip" "FAILED" "VERIFY" "$message" "$rollback_result" "$log_file" } @@ -979,6 +1094,8 @@ init_runtime() { main() { local config_path="$DEFAULT_CONFIG_PATH" + local manual_rollback_ip="" + local manual_rollback_stop_first="false" while (($#)); do case "$1" in @@ -987,6 +1104,15 @@ main() { config_path="$2" shift 2 ;; + --rollback-ip) + [[ $# -lt 2 ]] && { log_error "--rollback-ip is missing an IP"; exit 1; } + manual_rollback_ip="$2" + shift 2 + ;; + --rollback-stop-first) + manual_rollback_stop_first="true" + shift + ;; -h|--help) usage exit 0 @@ -999,6 +1125,12 @@ main() { esac done + ACTIVE_CONFIG_PATH="$config_path" + if [[ -n "$manual_rollback_ip" ]]; then + run_manual_rollback "$config_path" "$manual_rollback_ip" "$manual_rollback_stop_first" + return + fi + 
init_runtime load_config "$config_path" ensure_dependencies @@ -1007,13 +1139,13 @@ main() { log_info "PAM smart deployment started" log_info "Airport: ${AIRPORT_CODE}, version: ${VERSION_NUMBER}, module: ${APP_NAME}/${MODULE_NAME}" - get_token - create_version - upload_package - publish_version - get_node_url - get_online_ips - download_cloud_to_node + run_flow_step "get_token" get_token || exit 1 + run_flow_step "create_version" create_version || exit 1 + run_flow_step "upload_package" upload_package || exit 1 + run_flow_step "publish_version" publish_version || exit 1 + run_flow_step "get_node_url" get_node_url || exit 1 + run_flow_step "get_online_ips" get_online_ips || exit 1 + run_flow_step "download_cloud_to_node" download_cloud_to_node || exit 1 for ip in "${ONLINE_IPS[@]}"; do deploy_one_ip "$ip" diff --git a/doc_scripts/当前脚本情况总结.md b/doc_scripts/当前脚本情况总结.md index 39243f1..5fa7493 100644 --- a/doc_scripts/当前脚本情况总结.md +++ b/doc_scripts/当前脚本情况总结.md @@ -42,9 +42,15 @@ 9. Start the application 10. Health check 11. Download logs -12. Trigger rollback on failure +12. On failure, mark the rollback as pending confirmation; the manual rollback runs only after the Agent confirms with the user 13. Output the final deployment report +Additional notes on the current endpoint conventions: + +- `/api/mcp/version/upgrade` uses query parameters and no longer uses a form body. +- `/api/mcp/version/upgrade/start-stop` uses query parameters and no longer uses a form body, and the parameter is named `runStart`. +- `download-cloud` always passes `timeOut=0` and is used only to create the download task; progress is then queried asynchronously through `download-cloud/progress`. + ### 2.2 Test scripts `test_deploy.sh` / `test_deploy.ps1` currently support: @@ -70,6 +76,16 @@ 12. `api/mcp/version/upgrade/log-download` 13. `api/mcp/version/upgrade/rollback` +The main deployment script currently does not run `rollback` automatically and instead outputs a pending-confirmation status; when a rollback is actually needed, the Agent should confirm with the user and then call the manual rollback entry point: + +```bash +bash ./deploy.sh --config ./config.txt --rollback-ip 192.168.1.10 --rollback-stop-first +``` + +```powershell +powershell -File .\deploy.ps1 -ConfigPath .\config.txt -RollbackIp 192.168.1.10 -RollbackStopFirst +``` + ## 3. 
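Current ways to run ### 3.1 Windows @@ -231,6 +247,14 @@ bash ./test_deploy.sh --config ./config.txt --mode full --max-ips 1 ### 5.1 Business logs +The main deployment script and the manual rollback entry point currently emit uniform flow logs, for example: + +```text +[FLOW][START] Get-Token +[FLOW][DONE] Get-Token +[FLOW][FAIL] Verify-Ip[192.168.1.10] | ... +``` + The test scripts currently print to the console as they run: ```text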