feat: enhance edge-agent service control and portable packaging

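- Promote the Windows/Linux service control executors from placeholder implementations to usable ones
- Add service control tests covering the main status/start/stop/restart paths
- Enhance the edge-agent startup scripts to prefer the package's private Python runtime
- Enhance the Windows/Linux packaging scripts to support bundling a private Python runtime
- Update the edge-agent README and the current progress summary
- Add a dist ignore rule to keep packaging artifacts out of the repository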
2026-04-09 11:26:42 +08:00
commit 591df2d18e
parent 2c7714268f
10 changed files with 292 additions and 27 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
 .venv/
 data/
 dist/
 __pycache__/
 .pytest_cache/
 *.pyc
--- a/edge-agent/README.md
+++ b/edge-agent/README.md
@@ -37,6 +37,8 @@ C:\Users\MH\AppData\Local\Programs\Python\Python311\python.exe -m pytest edge-ag
   `check_port`
   `check_process`
   `grep_log`
   `windows_service_control`
   `linux_service_control`
 4. current bootstrap implements:
   heartbeat
   pull task
@@ -54,7 +56,10 @@ Current repo includes:
 4. `scripts/package-linux.sh`
 These scripts currently prepare a portable package skeleton and startup entrypoints.
-They do not yet bundle a private Python runtime.
+The Windows packaging script now bundles a private Python runtime into:
 `runtime/python/`
 The Linux packaging script supports bundling a private Python runtime directory, passed as the first argument or via `EDGE_PYTHON_HOME`.
 ## Packaging Direction
@@ -66,3 +71,12 @@ For user-side delivery, this edge agent is intended to be bundled as:
 ## Current Verification Baseline
 Current edge-agent baseline: `10 passed`
 ## Verified Packaging
 Current verified artifact:
 1. Windows portable package zip has been generated and verified to include:
   `start.ps1`
   `app/main.py`
   `runtime/python/python.exe`
--- a/edge-agent/app/executors/linux_service_executor.py
+++ b/edge-agent/app/executors/linux_service_executor.py
@@ -1,8 +1,40 @@
 from __future__ import annotations
 import subprocess
 from typing import Any
 class LinuxServiceExecutor:
    def execute(self, params: dict[str, Any]) -> tuple[bool, str, dict[str, Any], dict[str, Any]]:
-        return False, "linux service executor not implemented", {"params": params}, {}
+        service_name = str(params["service_name"])
        action = str(params.get("action", "status")).lower()
        scope = str(params.get("scope", "system")).lower()
        if action == "status":
            return self._query_status(service_name, action, scope)
        if action in {"start", "stop", "restart"}:
            self._run_systemctl([action, service_name], scope=scope, check=False)
            return self._query_status(service_name, action, scope)
        return False, f"unsupported action: {action}", self._build_data(service_name, action, scope, None), {}
    def _query_status(self, service_name: str, action: str, scope: str) -> tuple[bool, str, dict[str, Any], dict[str, Any]]:
        result = self._run_systemctl(["is-active", service_name], scope=scope, check=False)
        service_status = result.stdout.strip() or result.stderr.strip() or None
        success = result.returncode == 0
        message = "service status queried" if success else (service_status or "service query failed")
        return success, message, self._build_data(service_name, action, scope, service_status), {"raw_output": (result.stdout + "\n" + result.stderr).strip()}
    def _run_systemctl(self, command: list[str], scope: str, check: bool) -> subprocess.CompletedProcess[str]:
        full_command = ["systemctl"]
        if scope == "user":
            full_command.append("--user")
        full_command.extend(command)
        return subprocess.run(full_command, capture_output=True, text=True, check=check)
    def _build_data(self, service_name: str, action: str, scope: str, service_status: str | None) -> dict[str, Any]:
        return {
            "service_name": service_name,
            "action": action,
            "scope": scope,
            "service_status": service_status,
        }
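
Note on the hunk above: the way `_run_systemctl` assembles its argument list can be looked at in isolation. The following is a minimal sketch, not the committed code. The standalone function name `build_systemctl_command` is hypothetical; only the `systemctl` / `--user` ordering is taken from the diff.

```python
from __future__ import annotations


def build_systemctl_command(command: list[str], scope: str = "system") -> list[str]:
    """Return the argv list for a systemctl call (hypothetical helper)."""
    full_command = ["systemctl"]
    # --user targets the calling user's service manager instead of the system one.
    if scope == "user":
        full_command.append("--user")
    full_command.extend(command)
    return full_command


restart_argv = build_systemctl_command(["restart", "demo.service"], scope="user")
status_argv = build_systemctl_command(["is-active", "nginx"])
```

Keeping the scope flag immediately after `systemctl` is what `test_linux_service_executor_restart_user_scope` asserts with `first_command[:2]`.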
--- a/edge-agent/app/executors/windows_service_executor.py
+++ b/edge-agent/app/executors/windows_service_executor.py
@@ -1,8 +1,64 @@
 from __future__ import annotations
 import subprocess
 from typing import Any
 class WindowsServiceExecutor:
    def execute(self, params: dict[str, Any]) -> tuple[bool, str, dict[str, Any], dict[str, Any]]:
-        return False, "windows service executor not implemented", {"params": params}, {}
+        service_name = str(params["service_name"])
        action = str(params.get("action", "status")).lower()
        if action == "status":
            return self._query_status(service_name, action)
        if action == "start":
            status_before = self._query_service_status(service_name)
            if status_before == "RUNNING":
                return True, "service already running", self._build_data(service_name, action, status_before), {}
            self._run_command(["sc.exe", "start", service_name])
            return self._query_status(service_name, action)
        if action == "stop":
            status_before = self._query_service_status(service_name)
            if status_before == "STOPPED":
                return True, "service already stopped", self._build_data(service_name, action, status_before), {}
            self._run_command(["sc.exe", "stop", service_name])
            return self._query_status(service_name, action)
        if action == "restart":
            stop_success, stop_message, stop_data, stop_evidence = self.execute({"service_name": service_name, "action": "stop"})
            if not stop_success:
                return stop_success, stop_message, stop_data, stop_evidence
            start_success, start_message, start_data, start_evidence = self.execute({"service_name": service_name, "action": "start"})
            start_data["previous_action"] = "stop"
            start_evidence["stop"] = stop_evidence
            return start_success, start_message, start_data, start_evidence
        return False, f"unsupported action: {action}", self._build_data(service_name, action, None), {}
    def _query_status(self, service_name: str, action: str) -> tuple[bool, str, dict[str, Any], dict[str, Any]]:
        result = self._run_command(["sc.exe", "query", service_name], check=False)
        service_status = self._parse_service_status(result.stdout + "\n" + result.stderr)
        success = result.returncode == 0 and service_status not in {None, "NOT_FOUND"}
        message = "service status queried" if success else (result.stderr.strip() or result.stdout.strip() or "service query failed")
        return success, message, self._build_data(service_name, action, service_status), {"raw_output": (result.stdout + "\n" + result.stderr).strip()}
    def _query_service_status(self, service_name: str) -> str | None:
        result = self._run_command(["sc.exe", "query", service_name], check=False)
        return self._parse_service_status(result.stdout + "\n" + result.stderr)
    def _run_command(self, command: list[str], check: bool = True) -> subprocess.CompletedProcess[str]:
        return subprocess.run(command, capture_output=True, text=True, check=check)
    def _parse_service_status(self, text: str) -> str | None:
        normalized = text.upper()
        if "FAILED 1060" in normalized or "DOES NOT EXIST" in normalized:
            return "NOT_FOUND"
        for candidate in ("RUNNING", "STOPPED", "START_PENDING", "STOP_PENDING", "PAUSED"):
            if candidate in normalized:
                return candidate
        return None
    def _build_data(self, service_name: str, action: str, service_status: str | None) -> dict[str, Any]:
        return {
            "service_name": service_name,
            "action": action,
            "service_status": service_status,
        }
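
Note on the hunk above: the status detection in `_parse_service_status` is plain substring matching on the upper-cased `sc.exe` output. A minimal standalone sketch of that logic follows. The function name `parse_sc_state` and the constant `KNOWN_STATES` are hypothetical, and the sample strings are illustrative inputs rather than captured `sc.exe` output.

```python
from __future__ import annotations

# State names searched for, in the same order as the diff.
KNOWN_STATES = ("RUNNING", "STOPPED", "START_PENDING", "STOP_PENDING", "PAUSED")


def parse_sc_state(text: str) -> str | None:
    """Map raw query output to a state name, NOT_FOUND, or None (hypothetical helper)."""
    normalized = text.upper()
    # The missing-service markers are checked first so they win over any state name.
    if "FAILED 1060" in normalized or "DOES NOT EXIST" in normalized:
        return "NOT_FOUND"
    for candidate in KNOWN_STATES:
        if candidate in normalized:
            return candidate
    return None


running = parse_sc_state("STATE              : 4  RUNNING")
missing = parse_sc_state("[SC] OpenService FAILED 1060:")  # illustrative sample
unknown = parse_sc_state("")
```

An empty or unrecognized output yields `None`, which `_query_status` in the diff treats as a failed query.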
--- a/edge-agent/scripts/package-linux.sh
+++ b/edge-agent/scripts/package-linux.sh
@@ -3,17 +3,26 @@ set -euo pipefail
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 DIST_DIR="$ROOT_DIR/dist"
-PACKAGE_ROOT="$DIST_DIR/edge-agent-linux"
+TIMESTAMP="$(date +%Y%m%d-%H%M%S)"
-ARCHIVE_PATH="$DIST_DIR/edge-agent-linux.tar.gz"
+PACKAGE_ROOT="$DIST_DIR/edge-agent-linux-$TIMESTAMP"
 RUNTIME_ROOT="$PACKAGE_ROOT/runtime/python"
 ARCHIVE_PATH="$DIST_DIR/edge-agent-linux-$TIMESTAMP.tar.gz"
 PYTHON_HOME="${1:-${EDGE_PYTHON_HOME:-}}"
 if [[ -z "$PYTHON_HOME" ]]; then
  echo "Python runtime directory is required. Pass it as the first argument or set EDGE_PYTHON_HOME." >&2
  exit 1
 fi
 rm -rf "$PACKAGE_ROOT"
 mkdir -p "$PACKAGE_ROOT"
 mkdir -p "$RUNTIME_ROOT"
 mkdir -p "$DIST_DIR"
 cp -r "$ROOT_DIR/app" "$PACKAGE_ROOT/"
 cp "$ROOT_DIR/README.md" "$PACKAGE_ROOT/"
 cp "$ROOT_DIR/pyproject.toml" "$PACKAGE_ROOT/"
-cp "$ROOT_DIR/scripts/start-linux.sh" "$PACKAGE_ROOT/"
+cp "$ROOT_DIR/scripts/start-linux.sh" "$PACKAGE_ROOT/start.sh"
 cp -r "$PYTHON_HOME"/. "$RUNTIME_ROOT"/
 tar -czf "$ARCHIVE_PATH" -C "$PACKAGE_ROOT" .
 echo "$ARCHIVE_PATH"
--- a/edge-agent/scripts/package-windows.ps1
+++ b/edge-agent/scripts/package-windows.ps1
@@ -1,24 +1,53 @@
 param(
    [string]$PythonHome = $env:EDGE_PYTHON_HOME
 )
 $ErrorActionPreference = "Stop"
 function Resolve-PythonHome {
    param([string]$InputPythonHome)
    if ($InputPythonHome) {
        return (Resolve-Path -LiteralPath $InputPythonHome).Path
    }
    $candidates = @(
        "C:\Users\$env:USERNAME\AppData\Local\Programs\Python\Python311",
        "C:\Users\$env:USERNAME\AppData\Local\Programs\Python\Python312",
        "C:\Python311",
        "C:\Python312"
    )
    foreach ($candidate in $candidates) {
        if (Test-Path (Join-Path $candidate "python.exe")) {
            return (Resolve-Path -LiteralPath $candidate).Path
        }
    }
    throw "PythonHome not provided and no local Python runtime directory was found."
 }
 $root = Split-Path -Parent $PSScriptRoot
 $dist = Join-Path $root "dist"
-$packageRoot = Join-Path $dist "edge-agent-windows"
+$timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
-$zipPath = Join-Path $dist "edge-agent-windows.zip"
+$packageRoot = Join-Path $dist "edge-agent-windows-$timestamp"
 $runtimeRoot = Join-Path $packageRoot "runtime\python"
 $zipPath = Join-Path $dist "edge-agent-windows-$timestamp.zip"
 $resolvedPythonHome = Resolve-PythonHome -InputPythonHome $PythonHome
 if (Test-Path $packageRoot) {
    Remove-Item -LiteralPath $packageRoot -Recurse -Force
 }
 if (Test-Path $zipPath) {
    Remove-Item -LiteralPath $zipPath -Force
 }
-New-Item -ItemType Directory -Path $packageRoot | Out-Null
+New-Item -ItemType Directory -Path $packageRoot -Force | Out-Null
 New-Item -ItemType Directory -Path $runtimeRoot -Force | Out-Null
 New-Item -ItemType Directory -Path $dist -Force | Out-Null
 Copy-Item -LiteralPath (Join-Path $root "app") -Destination $packageRoot -Recurse
 Copy-Item -LiteralPath (Join-Path $root "README.md") -Destination $packageRoot
 Copy-Item -LiteralPath (Join-Path $root "pyproject.toml") -Destination $packageRoot
-Copy-Item -LiteralPath (Join-Path $PSScriptRoot "start-windows.ps1") -Destination $packageRoot
+Copy-Item -LiteralPath (Join-Path $PSScriptRoot "start-windows.ps1") -Destination (Join-Path $packageRoot "start.ps1")
 Get-ChildItem -LiteralPath $resolvedPythonHome -Force | Copy-Item -Destination $runtimeRoot -Recurse
 Compress-Archive -Path (Join-Path $packageRoot "*") -DestinationPath $zipPath -Force
 Write-Output $zipPath
--- a/edge-agent/scripts/start-linux.sh
+++ b/edge-agent/scripts/start-linux.sh
@@ -2,10 +2,15 @@
 set -euo pipefail
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
-PYTHON_BIN="$ROOT_DIR/.venv/bin/python"
+RUNTIME_PYTHON="$ROOT_DIR/runtime/python/bin/python3"
 VENV_PYTHON="$ROOT_DIR/.venv/bin/python"
-if [[ ! -x "$PYTHON_BIN" ]]; then
+if [[ -x "$RUNTIME_PYTHON" ]]; then
-  echo "Python runtime not found at $PYTHON_BIN" >&2
+  PYTHON_BIN="$RUNTIME_PYTHON"
 elif [[ -x "$VENV_PYTHON" ]]; then
  PYTHON_BIN="$VENV_PYTHON"
 else
  echo "Python runtime not found at $RUNTIME_PYTHON or $VENV_PYTHON" >&2
  exit 1
 fi
--- a/edge-agent/scripts/start-windows.ps1
+++ b/edge-agent/scripts/start-windows.ps1
@@ -1,10 +1,17 @@
 $ErrorActionPreference = "Stop"
 $root = Split-Path -Parent $PSScriptRoot
-$python = Join-Path $root ".venv\Scripts\python.exe"
+$runtimePython = Join-Path $root "runtime\python\python.exe"
 $venvPython = Join-Path $root ".venv\Scripts\python.exe"
-if (-not (Test-Path $python)) {
+if (Test-Path $runtimePython) {
-    throw "Python runtime not found at $python"
+    $python = $runtimePython
 }
 elseif (Test-Path $venvPython) {
    $python = $venvPython
 }
 else {
    throw "Python runtime not found. Checked: $runtimePython and $venvPython"
 }
 $env:PYTHONPATH = $root
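
Note on the two start scripts above: both pick the bundled runtime first and fall back to the virtualenv. The same priority order can be sketched in Python for the Linux layout. This is an illustration under assumptions, not part of the commit: `resolve_python`, `CANDIDATES`, and `_make_fake_interpreter` are hypothetical names, and the sketch additionally requires the candidate to be a regular file, which the shell `-x` test does not.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path

# Relative locations checked in priority order (Linux layout from start-linux.sh).
CANDIDATES = ("runtime/python/bin/python3", ".venv/bin/python")


def resolve_python(root: Path) -> Path:
    """Return the first executable interpreter under root (hypothetical helper)."""
    for relative in CANDIDATES:
        candidate = root / relative
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate
    checked = " or ".join(str(root / relative) for relative in CANDIDATES)
    raise FileNotFoundError(f"Python runtime not found at {checked}")


def _make_fake_interpreter(path: Path) -> None:
    # Create an executable placeholder file standing in for a real interpreter.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("#!/bin/sh\n")
    path.chmod(0o755)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Only the virtualenv exists: it is used as the fallback.
    _make_fake_interpreter(root / ".venv/bin/python")
    venv_choice = resolve_python(root).relative_to(root).as_posix()
    # Once a bundled runtime exists, it takes priority over the virtualenv.
    _make_fake_interpreter(root / "runtime/python/bin/python3")
    runtime_choice = resolve_python(root).relative_to(root).as_posix()
```

Preferring the bundled runtime means an unpacked portable package does not depend on a developer virtualenv being present on the target machine.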
--- a/edge-agent/tests/test_service_executors.py
+++ b/edge-agent/tests/test_service_executors.py
@@ -0,0 +1,91 @@
 from __future__ import annotations
 from unittest.mock import patch
 from app.executors.linux_service_executor import LinuxServiceExecutor
 from app.executors.windows_service_executor import WindowsServiceExecutor
 class DummyCompletedProcess:
    def __init__(self, stdout: str = "", stderr: str = "", returncode: int = 0) -> None:
        self.stdout = stdout
        self.stderr = stderr
        self.returncode = returncode
 def test_windows_service_executor_status_running() -> None:
    query_output = "STATE              : 4  RUNNING"
    with patch(
        "app.executors.windows_service_executor.subprocess.run",
        return_value=DummyCompletedProcess(stdout=query_output, returncode=0),
    ):
        success, message, data, evidence = WindowsServiceExecutor().execute(
            {"service_name": "Spooler", "action": "status"}
        )
    assert success is True
    assert message == "service status queried"
    assert data["service_status"] == "RUNNING"
    assert "RUNNING" in evidence["raw_output"]
 def test_windows_service_executor_restart() -> None:
    responses = [
        DummyCompletedProcess(stdout="STATE              : 4  RUNNING", returncode=0),
        DummyCompletedProcess(stdout="", returncode=0),
        DummyCompletedProcess(stdout="STATE              : 1  STOPPED", returncode=0),
        DummyCompletedProcess(stdout="STATE              : 1  STOPPED", returncode=0),
        DummyCompletedProcess(stdout="", returncode=0),
        DummyCompletedProcess(stdout="STATE              : 4  RUNNING", returncode=0),
    ]
    with patch(
        "app.executors.windows_service_executor.subprocess.run",
        side_effect=responses,
    ):
        success, message, data, evidence = WindowsServiceExecutor().execute(
            {"service_name": "Spooler", "action": "restart"}
        )
    assert success is True
    assert data["service_status"] == "RUNNING"
    assert data["previous_action"] == "stop"
    assert "stop" in evidence
 def test_linux_service_executor_status_inactive() -> None:
    with patch(
        "app.executors.linux_service_executor.subprocess.run",
        return_value=DummyCompletedProcess(stdout="inactive\n", returncode=3),
    ):
        success, message, data, evidence = LinuxServiceExecutor().execute(
            {"service_name": "nginx", "action": "status"}
        )
    assert success is False
    assert message == "inactive"
    assert data["service_status"] == "inactive"
    assert "inactive" in evidence["raw_output"]
 def test_linux_service_executor_restart_user_scope() -> None:
    responses = [
        DummyCompletedProcess(stdout="", returncode=0),
        DummyCompletedProcess(stdout="active\n", returncode=0),
    ]
    with patch(
        "app.executors.linux_service_executor.subprocess.run",
        side_effect=responses,
    ) as mocked_run:
        success, message, data, evidence = LinuxServiceExecutor().execute(
            {"service_name": "demo.service", "action": "restart", "scope": "user"}
        )
    assert success is True
    assert message == "service status queried"
    assert data["service_status"] == "active"
    assert data["scope"] == "user"
    first_command = mocked_run.call_args_list[0].args[0]
    assert first_command[:2] == ["systemctl", "--user"]
    assert "restart" in first_command
--- a/智能化部署agent-当前进度总结.md
+++ b/智能化部署agent-当前进度总结.md
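@@ -12,6 +12,25 @@
 **Documentation finalized + demo code skeleton in place + main path verified**
 ### 1.1 MVP Progress Tracking (updated each round)
 The figures below are estimated progress against the current MVP goals and are updated on a rolling basis at the end of each round:
 1. Requirements proposal and technical architecture: done
 2. demo backend main path: done
 3. identity / approval / software-a demo interfaces: done
 4. edge onboarding and scheduling path: done
 5. Basic verification executors: done
 6. service control executors: done
 7. Audit / reporting / aggregate metrics: first round done
 8. Failure-path and idempotency tests: first round done
 9. Portable packaging and private runtime: Windows verified; Linux script complete, pending verification
 10. Real-scenario integration testing: in progress
 Current estimated MVP progress:
 **about 85%**
 ---
 ## 2. Completed Documentation Deliverables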
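@@ -144,6 +163,8 @@ The demo interface definition document already covers:
 21. Added `edge-agent` startup scripts and portable packaging scripts, covering two delivery formats: Windows `zip` and Linux `tar.gz`.
 22. Added basic `edge-agent` tests covering the `http_health_check` executor and the main path of the polling scheduler.
 23. Added basic `edge-agent` executor implementations, introducing the `check_port`, `check_process`, and `grep_log` capabilities and wiring them into the tool registry.
 24. Promoted the Windows / Linux service control executors from placeholder implementations to usable versions supporting `status`, `start`, `stop`, and `restart`.
 25. Enhanced the portable packaging scripts to bundle a private Python runtime, and completed an actual packaging verification of the Windows portable package.
 ### 3.8 Current Runnable Scope of the Code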
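@@ -169,14 +190,14 @@ The demo interface definition document already covers:
 11. The local `edge-agent` currently provides:
    startup scripts, packaging scripts, basic executor tests, and polling scheduler tests.
 12. The local `edge-agent` currently has these registered tools:
-   `http_health_check`, `check_port`, `check_process`, `grep_log`
+   `http_health_check`, `check_port`, `check_process`, `grep_log`, `windows_service_control`, `linux_service_control`
 Current test baseline:
 1. 20 tests pass in total.
 2. Regression verification uses `sqlite:///:memory:`.
 3. The main path is no longer "just an interface shell"; it now exhibits minimal closed-loop behavior.
-4. 10 basic tests pass on the `edge-agent` side.
+4. 14 basic tests pass on the `edge-agent` side.
 ---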
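@@ -280,12 +301,12 @@ The demo interface definition document already covers:
 Work that is not yet closed out, or is implemented only in a minimal version, includes:
-1. The local `edge-agent` initialization code and packaging scripts have completed a first round, but do not yet integrate a private Python runtime or a true portable release process.
+1. The local `edge-agent` initialization code and packaging scripts have completed a first round; the Windows private-runtime portable package has been verified, and the Linux private-runtime packaging script still awaits real-world verification.
 2. Verification against real databases: file-based SQLite / PostgreSQL.
 3. Finer-grained linkage of permissions and approval decisions between the identity demo / approval demo and the main task path.
 4. Task-level aggregate metrics have completed a first round, but finer task-level breakdowns can still be added, such as wait-time breakdowns, share of failed steps, and stage-level statistics.
-5. More realistic verification plugin implementations, especially service control, log time-range filtering, and extended process metrics.
+5. More realistic verification plugin implementations, especially log time-range filtering, extended process metrics, and additional health check methods.
-6. Further refinement of deployment and run scripts, including private runtime packaging.
+6. Further refinement of deployment and run scripts, including Linux private-runtime packaging verification and install/upgrade flows.
 7. Extending OpenAPI to the second batch of interfaces.
 8. More test cases and integration scripts.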
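@@ -316,7 +337,7 @@ The demo interface definition document already covers:
 Current status:
-**SQLite / Redis removal / minimal DDL / first batch of OpenAPI / FastAPI skeleton / main interfaces / demo adapter / edge interfaces / first-round task-level aggregate metrics / first-round failure and idempotency tests / edge-agent initialization skeleton / edge-agent startup and packaging scripts / edge-agent basic tests have all completed a first round of implementation.**
+**SQLite / Redis removal / minimal DDL / first batch of OpenAPI / FastAPI skeleton / main interfaces / demo adapter / edge interfaces / first-round task-level aggregate metrics / first-round failure and idempotency tests / edge-agent initialization skeleton / edge-agent startup and packaging scripts / edge-agent basic tests / service control executors / Windows private-runtime portable packaging have all completed a first round of implementation.**
 ---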
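@@ -352,7 +373,7 @@ The demo interface definition document already covers:
 1. Add finer task-level metric breakdowns.
 2. Add further audit details and aggregate summaries.
-3. Continue building out local Agent executors and true portable runtime packaging.
+3. Continue adding more realistic log/process/health-check execution capabilities to the local Agent, and verify Linux private-runtime packaging.
 4. Add the second batch of OpenAPI.
 ### 7.2 What Happens If the Context Is Nearly Full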