Compare commits
12 Commits
74c1db2f14...v1.1.4
| Author | SHA1 | Date | |
|---|---|---|---|
| | b5fc83065c | | |
| | ef31a054c0 | | |
| | ff35510ef0 | | |
| | 21592ae8a0 | | |
| | f01547df35 | | |
| | 4a2532a83b | | |
| | b962265168 | | |
| | 38acca6484 | | |
| | 8d36ef495d | | |
| | 7ac5d54a84 | | |
| | ac3c7e2b4c | | |
| | d8ea772c24 | | |
5
.gitignore
vendored
@@ -1,2 +1,7 @@
.DS_Store
bin/
agent
node.log
node.pid
config.yaml
.DS_Store
50
INSTALL.md
@@ -169,6 +169,43 @@ EOF
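
**Note:** Starting the node with `run.sh` has the advantage that every start automatically pulls the latest code and recompiles.

### 3.1. Configuration

**Configuration priority (highest to lowest):**
1. Environment variable `BACKEND_URL` (highest priority)
2. `backend.url` in the configuration file `config.yaml`
3. Default value

**Important:**
- The `BACKEND_URL` environment variable **overrides** the setting in the configuration file
- Even if a configuration file exists, the environment variable's value takes precedence once it is set
- This ensures that the compiled binary does not hard-code the backend address
- The configuration file is not compiled into the binary; it is read at runtime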
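
The priority order above can be sketched in shell. This is only an illustration, not the agent's actual loader: `resolve_backend_url` is a hypothetical helper, the awk parser assumes a simple two-level `config.yaml` with `url` nested under `backend:`, and the default `http://localhost:8080` is the one quoted elsewhere in this change.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): resolve the backend URL
# using the documented priority: environment variable > config file > default.
resolve_backend_url() {
    local config_file="${1:-config.yaml}"
    local from_file=""

    # 1. The environment variable wins whenever it is non-empty.
    if [ -n "${BACKEND_URL:-}" ]; then
        printf '%s\n' "$BACKEND_URL"
        return 0
    fi

    # 2. Otherwise look for the url key nested under the backend section.
    if [ -f "$config_file" ]; then
        from_file=$(awk '
            /^backend:/ { in_backend = 1; next }
            /^[^ \t#]/  { in_backend = 0 }
            in_backend && /^[ \t]+url:/ {
                sub(/^[ \t]+url:[ \t]*/, "")   # drop the key
                sub(/[ \t]+#.*$/, "")          # drop a trailing comment
                gsub(/"/, "")                  # drop double quotes
                print
                exit
            }
        ' "$config_file")
    fi
    if [ -n "$from_file" ]; then
        printf '%s\n' "$from_file"
        return 0
    fi

    # 3. Fall back to the documented default.
    printf '%s\n' "http://localhost:8080"
}
```

For example, `BACKEND_URL=http://10.0.0.5:8080 resolve_backend_url` prints the environment value even when `config.yaml` sets a different `backend.url`.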

**Using an environment variable (recommended):**
```bash
# Set it in the systemd service file
Environment="BACKEND_URL=http://your-backend-server:8080"

# Or set it on the command line
BACKEND_URL=http://your-backend-server:8080 ./run.sh start
```

**Using a configuration file:**
Create `/opt/linkmaster-node/config.yaml`:
```yaml
server:
  port: 2200
backend:
  url: http://your-backend-server:8080
heartbeat:
  interval: 60
log:
  file: node.log
  level: info
debug: false
```

### 4. Start the service

```bash
@@ -177,7 +214,11 @@ sudo systemctl enable linkmaster-node
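sudo systemctl start linkmaster-node
```

**Note:** Make sure the `BACKEND_URL` environment variable points to the backend server's actual address and port (default 8080), not the frontend address.
**Important:**
- Make sure the `BACKEND_URL` environment variable points to the backend server's actual address and port (default 8080), not the frontend address
- The `BACKEND_URL` environment variable **overrides** the `backend.url` setting in the configuration file (highest priority)
- Even if a configuration file exists, the environment variable's value takes precedence once it is set
- This ensures that the compiled binary does not hard-code the backend address

## Firewall configuration
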
@@ -238,12 +279,19 @@ sudo lsof -i :2200
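
**Solution:**
- Check that the backend address is correct (it should be `http://backend-server:8080`, not the frontend address)
- Check that the `BACKEND_URL` environment variable is set correctly (highest priority)
- Check that `backend.url` in the configuration file `config.yaml` is correct
- Check network connectivity: `ping your-backend-server`
- Check that the port is open: `telnet your-backend-server 8080` or `nc -zv your-backend-server 8080`
- Check the firewall rules (make sure port 8080 on the backend server is open)
- Check that the backend service is running: `curl http://your-backend-server:8080/api/public/nodes/online`
- If a frontend proxy is used, the node must still connect directly to the backend and cannot use the frontend address

**Configuration priority:**
- The `BACKEND_URL` environment variable has the highest priority and overrides the setting in the configuration file
- If both the environment variable and the configuration file are set, the environment variable's value is used
- This ensures that the compiled binary does not hard-code the backend address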
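
The `ping`, `nc` and `curl` checks in the solution list all need the host and port of whichever backend URL is in effect. A minimal sketch of splitting the URL, assuming plain `http://` or `https://` URLs without credentials or IPv6 literals; `split_backend_url` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "host port" for a backend URL.
# A missing port falls back to 80 (http) or 443 (https).
split_backend_url() {
    local url="$1"
    local default_port=80
    case "$url" in
        https://*) default_port=443 ;;
    esac

    local authority="${url#*://}"   # drop the scheme
    authority="${authority%%/*}"    # drop any path

    local host="${authority%%:*}"
    local port="$default_port"
    if [ "$host" != "$authority" ]; then
        port="${authority##*:}"     # an explicit port was given
    fi
    printf '%s %s\n' "$host" "$port"
}
```

Usage could look like `read -r host port < <(split_backend_url "$BACKEND_URL")` followed by `nc -zv "$host" "$port"`.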

## Uninstall

```bash
2
Makefile
@@ -7,7 +7,7 @@ build-linux:
	GOOS=linux GOARCH=amd64 go build -o bin/linkmaster-node-linux ./cmd/agent

build-all:
	@./build-all.sh
	@./all-build.sh

clean:
	rm -rf bin/
326
README.md
@@ -13,6 +13,8 @@ LinkMaster 节点服务,用于执行网络测试任务。
- FindPing: batch ping detection across IP ranges
- Continuous Ping/TCPing tests
- Heartbeat reporting
- Log file output (configurable log file path and level)
- Heartbeat troubleshooting tool

## Installation

@@ -83,10 +85,24 @@ BACKEND_URL=http://your-backend-server:8080 ./run.sh start
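
## Configuration

### Configuration priority

Configuration is loaded in the following priority order (a higher priority overrides a lower one):

1. **Environment variables** (highest priority)
2. **Configuration file** `config.yaml`
3. **Default values**

### Environment variables

- `BACKEND_URL`: backend service address (required, default: http://localhost:8080)
- `BACKEND_URL`: backend service address (**highest priority**; overrides the setting in the configuration file)
- `CONFIG_PATH`: configuration file path (optional, default: config.yaml)
- `LOG_FILE`: log file path (optional, default: node.log)

**Important:**
- The `BACKEND_URL` environment variable **overrides** the `backend.url` setting in the configuration file
- Even if a configuration file exists, the environment variable's value takes precedence once it is set
- This ensures that the compiled binary does not hard-code the backend address

### Configuration file (optional)
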
@@ -96,12 +112,29 @@ BACKEND_URL=http://your-backend-server:8080 ./run.sh start
server:
  port: 2200
backend:
  url: http://your-backend-server:8080
  url: http://your-backend-server:8080  # overridden by the BACKEND_URL environment variable
heartbeat:
  interval: 60
log:
  file: node.log  # log file path (default: node.log; if empty, logs go to standard error)
  level: info     # log level: debug, info, warn, error (default: info)
debug: false
node:
  id: 0           # node ID (obtained automatically via heartbeat)
  ip: ""          # node IP (obtained automatically via heartbeat)
  country: ""     # country (obtained automatically via heartbeat)
  province: ""    # province (obtained automatically via heartbeat)
  city: ""        # city (obtained automatically via heartbeat)
  isp: ""         # ISP (obtained automatically via heartbeat)
```

**Configuration notes:**
- `backend.url`: backend service address; overridden by the `BACKEND_URL` environment variable
- `log.file`: log file path. If empty, logs are written to standard error (stderr)
- `log.level`: log level; supports `debug`, `info`, `warn`, `error`
- `node.*`: node information is obtained and saved automatically via heartbeat; no manual configuration is needed
- The configuration file is not compiled into the binary; it is read at runtime
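
As a small illustration of the `log.level` rule, the sketch below validates a level string against the four documented values. It assumes matching is case-insensitive and that an unknown value falls back to the documented default `info`; the agent itself may handle these cases differently, and `normalize_log_level` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a log level against the documented values.
normalize_log_level() {
    local level
    # Lower-case the input so "WARN" and "warn" are treated alike (assumption).
    level=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
    case "$level" in
        debug|info|warn|error) printf '%s\n' "$level" ;;
        *) printf '%s\n' "info" ;;   # documented default
    esac
}
```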

## Run scripts

Use the `run.sh` script to manage the node. **Every start automatically pulls the latest code and recompiles**:
@@ -268,36 +301,36 @@ BACKEND_URL=http://192.168.1.100:8080 ./run.sh start
- The source directory must exist and be a Git repository
- Go must be installed and on the PATH

### 5. build-all.sh - cross-platform build script
### 5. all-build.sh - cross-platform build script

Builds binaries for multiple operating systems and architectures, with support for parallel builds.
Builds binaries for multiple operating systems and architectures, with support for parallel builds. **The version is read automatically from `version.json`**.

**Usage:**

```bash
# Build all platforms
./build-all.sh
# Build all platforms (automatically uses the version from version.json)
./all-build.sh

# Build only the specified platform
./build-all.sh -p linux/amd64
./all-build.sh -p linux/amd64

# Clean the output directory before building
./build-all.sh -c
./all-build.sh -c

# Set the number of parallel builds
./build-all.sh -j 2
./all-build.sh -j 2

# Set the version
./build-all.sh -v 1.0.0
# Override the version (overrides the version in version.json)
./all-build.sh -v 1.0.0

# Generate only the files without a version suffix
./build-all.sh -s
./all-build.sh -s

# List all supported platforms
./build-all.sh -l
./all-build.sh -l

# Show help
./build-all.sh -h
./all-build.sh -h
```

**Supported platforms:**
@@ -309,53 +342,63 @@ BACKEND_URL=http://192.168.1.100:8080 ./run.sh start
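- `windows/arm64` - Windows ARM64

**Features:**
- Supports parallel builds (4 jobs by default)
- Automatically generates files with and without the version suffix
- Writes output to the `bin/` directory
- Shows build progress and results
- Supports cleaning the output directory
- ✅ **Reads the version automatically from `version.json`** (no need to specify it manually)
- ✅ Supports parallel builds (4 jobs by default)
- ✅ Automatically generates files with and without the version suffix
- ✅ Writes output to the `bin/` directory
- ✅ Shows build progress and results
- ✅ Supports cleaning the output directory

**Output files:**
- `bin/agent-{os}-{arch}` - binary without the version suffix
- `bin/agent-{os}-{arch}-{version}` - binary with the version suffix
- On Windows, the `.exe` extension is added automatically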
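
The naming scheme above can be made concrete with a short sketch. It assumes the `.exe` extension goes at the very end of the name, after the version suffix; the build script itself is not shown here, so treat that placement and the helper name `artifact_names` as assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print both output file names for one platform.
artifact_names() {
    local platform="$1" version="$2"
    local os="${platform%%/*}"     # part before the slash, e.g. linux
    local arch="${platform##*/}"   # part after the slash, e.g. amd64
    local ext=""
    if [ "$os" = "windows" ]; then
        ext=".exe"                 # assumption: extension is appended last
    fi
    printf 'bin/agent-%s-%s%s\n' "$os" "$arch" "$ext"
    printf 'bin/agent-%s-%s-%s%s\n' "$os" "$arch" "$version" "$ext"
}
```

Calling `artifact_names linux/amd64 1.1.3` lists the unversioned name first and the versioned name second.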

### 6. upload.sh - release upload script
**Version management:**
The version is read from the `version.json` file:
```json
{
  "version": "1.1.3",
  "tag": "v1.1.3"
}
```

Uploads the compiled binaries to Releases or publishes them by other means.
### 6. all-upload-release.sh - release upload script

Uploads the compiled binaries to Releases or publishes them by other means. **The version and tag are read automatically from `version.json`, and the token is hard-coded**.

**Usage:**

```bash
# Upload to Gitea Releases (repository information is read automatically from .git/config)
./upload.sh -m gitea -t v1.0.0 -v 1.0.0
# Upload to Gitea Releases (information is read automatically from version.json and .git/config)
./all-upload-release.sh -m gitea

# Specify the Gitea access token
./upload.sh -m gitea -t v1.0.0 -v 1.0.0 -T your_token
# Upload to Gitea Releases (override the version and tag)
./all-upload-release.sh -m gitea -t v1.2.0 -v 1.2.0

# Upload to GitHub Releases
./upload.sh -m github -r owner/repo -t v1.0.0 -v 1.0.0
./all-upload-release.sh -m github -r owner/repo -t v1.0.0 -v 1.0.0

# Upload via SCP
./upload.sh -m scp -H example.com -u user -d /path/to/release
./all-upload-release.sh -m scp -H example.com -u user -d /path/to/release

# Upload via SCP (with a private key)
./upload.sh -m scp -H example.com -u user -d /path/to/release -k ~/.ssh/id_rsa
./all-upload-release.sh -m scp -H example.com -u user -d /path/to/release -k ~/.ssh/id_rsa

# Upload via FTP
./upload.sh -m ftp -H ftp.example.com -u user -d /path/to/release
./all-upload-release.sh -m ftp -H ftp.example.com -u user -d /path/to/release

# Copy to a local directory
./upload.sh -m local -d /path/to/release
./all-upload-release.sh -m local -d /path/to/release

# Package only, do not upload
./upload.sh --pack-only -v 1.0.0
./all-upload-release.sh --pack-only

# Upload the binaries directly instead of an archive
./upload.sh -m scp --no-pack -H example.com -u user -d /path/to/release
./all-upload-release.sh -m scp --no-pack -H example.com -u user -d /path/to/release

# Show help
./upload.sh -h
./all-upload-release.sh -h
```

**Supported upload methods:**
@@ -366,19 +409,21 @@ BACKEND_URL=http://192.168.1.100:8080 ./run.sh start
- `local` - copy to a local directory

**Features:**
- Automatically packages the binaries (tar.gz or zip)
- Automatically creates release notes
- Supports uploading specific platforms
- Supports a custom version and tag
- Supports custom release notes
- Automatically detects and handles an existing Release
- ✅ **Reads the version and tag automatically from `version.json`** (no need to specify them manually)
- ✅ **The token is hard-coded** (no need to specify it manually)
- ✅ Automatically packages the binaries (tar.gz or zip)
- ✅ Automatically creates release notes
- ✅ Supports uploading specific platforms
- ✅ Supports a custom version and tag (overriding the configuration file)
- ✅ Supports custom release notes
- ✅ Automatically detects and handles an existing Release

**Parameters:**
- `-m, --method`: upload method (gitea|github|scp|ftp|local)
- `-v, --version`: version (default: timestamp)
- `-t, --tag`: Git tag (required for Releases)
- `-m, --method`: upload method (gitea|github|scp|ftp|local, default: gitea)
- `-v, --version`: version (default: read from version.json)
- `-t, --tag`: Git tag (default: read from version.json)
- `-p, --platform`: upload only the specified platform
- `-T, --token`: access token (Gitea/GitHub)
- `-T, --token`: access token (hard-coded; this option is deprecated)
- `-H, --host`: host address (SCP/FTP)
- `-u, --user`: user name (SCP/FTP)
- `-d, --dest`: destination path (SCP/FTP/local)
@@ -386,6 +431,24 @@ BACKEND_URL=http://192.168.1.100:8080 ./run.sh start
- `--pack-only`: package only, do not upload
- `--no-pack`: upload the binaries directly instead of an archive

**Version management:**
The version and tag are read from the `version.json` file:
```json
{
  "version": "1.1.3",
  "tag": "v1.1.3"
}
```

**Typical workflow:**
```bash
# 1. Build all platforms (automatically uses the version from version.json)
./all-build.sh

# 2. Upload to Gitea Releases (automatically uses the version and tag from version.json)
./all-upload-release.sh -m gitea
```

### 7. vendor.sh - vendor dependency packaging script

Downloads the project dependencies into the vendor directory so that clients can build directly after cloning, without a network connection.
@@ -464,3 +527,180 @@ go build -mod=vendor -o agent ./cmd/agent
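### GET /api/health

Health check

## Troubleshooting

### Troubleshooting heartbeat synchronization

If a node cannot synchronize its heartbeat, use the troubleshooting script to diagnose the problem:

```bash
# Run the heartbeat troubleshooting script
./check-heartbeat.sh
```

The script automatically checks the following:

1. **Process status** - checks whether the node process is running
2. **Configuration file** - checks that the configuration file exists and is correct
3. **Network connection** - checks whether the backend server can be reached
4. **Log analysis** - analyzes heartbeat-related errors in the log
5. **Manual test** - sends a heartbeat manually to test the connection
6. **System resources** - checks disk space and memory usage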

**Common problems and solutions:**

1. **Process not running**
   ```bash
   ./run.sh start
   ```

2. **Network connection failed**
   - Check that the backend service is running normally
   - Check the firewall rules (make sure the backend port is reachable)
   - Check that BACKEND_URL is configured correctly

3. **Heartbeat sending failed**
   - View the log: `./run.sh logs`
   - Check the backend service log
   - Confirm that the backend `/api/node/heartbeat` endpoint works

4. **Configuration file problems**
   - Check that `config.yaml` is correctly formatted
   - Confirm that the URL in the `BACKEND_URL` environment variable or the configuration file is correct

5. **View detailed logs**
   ```bash
   # Follow the log in real time
   ./run.sh logs

   # View the full log
   ./run.sh logs-all
   ```

### Logging

The node can write logs directly to a file, which makes troubleshooting and monitoring easier.

**Ways to configure logging:**

1. **Environment variable** (recommended)
   ```bash
   LOG_FILE=/var/log/linkmaster-node.log ./run.sh start
   ```

2. **Configuration file**
   Configure it in `config.yaml`:
   ```yaml
   log:
     file: node.log  # log file path
     level: info     # log level: debug, info, warn, error
   ```

3. **Default behavior**
   - Default log file: `node.log` (current directory)
   - Default log level: `info`
   - If no log file is set, logs are written to standard error (stderr)

**Log features:**
- ✅ Creates the log file and directory automatically
- ✅ Append mode; existing logs are not overwritten
- ✅ JSON format, convenient for log analysis
- ✅ Includes caller information (file name and line number)
- ✅ Error-level logs include stack traces

**Viewing logs:**
```bash
# Follow the log in real time
tail -f node.log

# View heartbeat-related logs
grep -i "心跳" node.log

# View error logs
grep -i "error" node.log

# View the last 100 lines
tail -n 100 node.log
```
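
The `grep` commands above can be combined into a quick summary. The sketch below is illustrative only: `summarize_heartbeat_log` is a hypothetical helper, and it assumes heartbeat entries contain the Chinese keyword `心跳` and error entries contain the word `error`, as the commands above imply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count heartbeat-related and error lines in a node log.
summarize_heartbeat_log() {
    local log_file="${1:-node.log}"
    if [ ! -f "$log_file" ]; then
        echo "log file not found: $log_file" >&2
        return 1
    fi
    local total errors
    # grep -c prints 0 but exits with status 1 when nothing matches,
    # so the status is ignored and only the count is kept.
    total=$(grep -c "心跳" "$log_file" || true)
    errors=$(grep -ci "error" "$log_file" || true)
    printf 'heartbeat_lines=%s error_lines=%s\n' "$total" "$errors"
}
```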
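
## Heartbeat mechanism

The node periodically sends a heartbeat to the backend to report its status and obtain node information.

### Heartbeat request fields

The heartbeat request contains the following fields:

- `type`: fixed value `pingServer`
- `version`: protocol version, fixed value `2`
- `host_name`: node host name (read automatically from the system host name)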

### Heartbeat response

The heartbeat response contains the following node information:

- `node_id`: node ID
- `node_ip`: node public IP
- `country`: country
- `province`: province
- `city`: city
- `isp`: ISP

This information is saved to the configuration file automatically and used for subsequent data pushes.
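
When `jq` is not available, individual response fields can be pulled out with `grep` and `sed`. The sketch below assumes a flat JSON object with string or integer values; the real response may nest these fields differently, and `json_field` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract one top-level string or integer field
# from a flat JSON object without jq.
json_field() {
    local key="$1" json="$2"
    printf '%s' "$json" \
        | grep -oE "\"$key\"[[:space:]]*:[[:space:]]*(\"[^\"]*\"|[0-9]+)" \
        | head -n 1 \
        | sed -e 's/^[^:]*:[[:space:]]*//' -e 's/^"//' -e 's/"$//'
}
```

For instance, `json_field node_ip "$BODY"` prints the address without the surrounding quotes, or nothing if the key is absent.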
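
## Continuous testing

The node supports continuous Ping and TCPing tests; the results are pushed to the backend server automatically.

### Features

- ✅ Pushes test results to the backend in real time
- ✅ Batched pushes (reduces HTTP request frequency)
- ✅ Automatically cleans up timed-out tasks
- ✅ Automatic resource cleanup (prevents memory leaks)
- ✅ Detailed debug logging (debug mode)

### Data push

- Test results are pushed automatically to the backend `/api/public/node/continuous/result` endpoint
- Each push includes the node ID, IP, location information, and test results
- If the task does not exist on the backend, the node stops the corresponding task automatically

## Changelog

### v1.1.4 (latest)

**New features:**
- ✨ The heartbeat request has a new `version` field (protocol version, default: 2)
- ✨ The heartbeat request has a new `host_name` field (read automatically from the system host name)
- ✨ The `BACKEND_URL` environment variable can override the backend address in the configuration file
- ✨ Continuous testing enhanced with batched pushes and automatic cleanup

**Improvements:**
- 🔧 Fixed lock management in continuous test data pushes
- 🔧 Fixed a memory leak where the push buffer was not cleaned up when a task stopped
- 🔧 Improved configuration loading; environment variables have the highest priority
- 🔧 Enhanced logging with detailed debug information
- 📝 Improved documentation with sections on configuration priority and the heartbeat mechanism

### v1.1.3

**New features:**
- ✨ Added log file output with a configurable log file path and level
- ✨ Added the heartbeat troubleshooting tool `check-heartbeat.sh`
- ✨ The log file path can be set through the `LOG_FILE` environment variable
- ✨ Log directories are created automatically; relative and absolute paths are supported

**Improvements:**
- 🔧 Improved log initialization to support writing directly to a file
- 🔧 Improved configuration loading to support log settings
- 📝 Improved documentation with a troubleshooting section

### v1.0.0

- 🎉 Initial release
- ✅ HTTP GET/POST tests
- ✅ Ping, DNS, Traceroute, and other network tests
- ✅ Continuous Ping/TCPing tests
- ✅ Heartbeat reporting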
@@ -15,9 +15,48 @@ NC='\033[0m' # No Color
# Project information
PROJECT_NAME="agent"
BUILD_DIR="bin"
VERSION="${VERSION:-$(date +%Y%m%d-%H%M%S)}"
MAIN_PACKAGE="./cmd/agent"

# Path to the version configuration file
VERSION_FILE="version.json"

# Read version information from the version configuration file
read_version_config() {
    local version_file="${VERSION_FILE}"

    if [ ! -f "$version_file" ]; then
        return 1
    fi

    # Check whether jq is available
    if command -v jq &> /dev/null; then
        local version=$(jq -r '.version' "$version_file" 2>/dev/null)

        if [ -n "$version" ] && [ "$version" != "null" ]; then
            echo "$version"
            return 0
        fi
    else
        # Without jq, parse the JSON with grep and sed
        local version=$(grep -o '"version"[[:space:]]*:[[:space:]]*"[^"]*"' "$version_file" 2>/dev/null | sed 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/')

        if [ -n "$version" ]; then
            echo "$version"
            return 0
        fi
    fi

    return 1
}

# Initialize the version (read from the configuration file; fall back to a timestamp on failure)
VERSION_CONFIG=$(read_version_config)
if [ $? -eq 0 ] && [ -n "$VERSION_CONFIG" ]; then
    VERSION="${VERSION:-$VERSION_CONFIG}"
else
    VERSION="${VERSION:-$(date +%Y%m%d-%H%M%S)}"
fi

# Supported platforms
# Format: OS/ARCH
PLATFORMS=(
@@ -40,7 +79,7 @@ usage() {
    echo " -l, --list List all supported platforms"
    echo " -c, --clean Clean the output directory before building"
    echo " -j, --jobs N Number of parallel builds (default: 4)"
    echo " -v, --version VERSION Set the version (default: timestamp)"
    echo " -v, --version VERSION Set the version (default: read from version.json)"
    echo " -s, --simple-only Generate only the files without a version suffix (both are generated by default)"
    echo ""
    echo -e "${BLUE}Examples:${NC}"
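
The version fallback introduced in the build script (an explicit value, then `version.json`, then a timestamp) can be summarized in a standalone sketch. `resolve_version` is a hypothetical helper, not a function from the script, and it only uses the `grep`/`sed` parsing path so that it does not depend on `jq`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the fallback chain:
# explicit value > "version" in version.json > timestamp.
resolve_version() {
    local explicit="${1:-}"
    local version_file="${2:-version.json}"
    local from_file=""

    if [ -n "$explicit" ]; then
        printf '%s\n' "$explicit"
        return 0
    fi

    if [ -f "$version_file" ]; then
        # Same grep/sed approach as the script's jq-less branch.
        from_file=$(grep -o '"version"[[:space:]]*:[[:space:]]*"[^"]*"' "$version_file" \
            | head -n 1 \
            | sed 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/')
    fi
    if [ -n "$from_file" ]; then
        printf '%s\n' "$from_file"
        return 0
    fi

    # Last resort: a timestamp in the same format the script uses.
    date +%Y%m%d-%H%M%S
}
```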
@@ -16,10 +16,59 @@ NC='\033[0m' # No Color
# Project information
PROJECT_NAME="agent"
BUILD_DIR="bin"
VERSION="${VERSION:-$(date +%Y%m%d-%H%M%S)}"
RELEASE_DIR="release"
TEMP_DIR=$(mktemp -d)



# Gitea token (hard-coded)
GITEA_TOKEN="3becb08eee31b422481ce1b8986de1cd645b468e"

# Path to the version configuration file
VERSION_FILE="version.json"

# Read version information from the version configuration file
read_version_config() {
    local version_file="${VERSION_FILE}"

    if [ ! -f "$version_file" ]; then
        return 1
    fi

    # Check whether jq is available
    if command -v jq &> /dev/null; then
        local version=$(jq -r '.version' "$version_file" 2>/dev/null)
        local tag=$(jq -r '.tag' "$version_file" 2>/dev/null)

        if [ -n "$version" ] && [ "$version" != "null" ]; then
            echo "$version|$tag"
            return 0
        fi
    else
        # Without jq, parse the JSON with grep and sed
        local version=$(grep -o '"version"[[:space:]]*:[[:space:]]*"[^"]*"' "$version_file" 2>/dev/null | sed 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/')
        local tag=$(grep -o '"tag"[[:space:]]*:[[:space:]]*"[^"]*"' "$version_file" 2>/dev/null | sed 's/.*"tag"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/')

        if [ -n "$version" ]; then
            echo "$version|$tag"
            return 0
        fi
    fi

    return 1
}

# Initialize the version (read from the configuration file; fall back to a timestamp on failure)
VERSION_CONFIG=$(read_version_config)
if [ $? -eq 0 ] && [ -n "$VERSION_CONFIG" ]; then
    IFS='|' read -r config_version config_tag <<< "$VERSION_CONFIG"
    VERSION="${VERSION:-$config_version}"
    DEFAULT_TAG="${config_tag}"
else
    VERSION="${VERSION:-$(date +%Y%m%d-%H%M%S)}"
    DEFAULT_TAG=""
fi

# Supported platforms
PLATFORMS=(
    "linux/amd64"
@@ -51,7 +100,7 @@ usage() {
    echo " -t, --tag TAG Git tag (required for GitHub/Gitea Releases)"
    echo " -r, --repo REPO Repository (format: owner/repo; read from .git/config by default)"
    echo " -b, --base-url URL Gitea base URL (read from .git/config by default)"
    echo " -T, --token TOKEN Access token (required for Gitea; can also be set through the GITEA_TOKEN environment variable)"
    echo " -T, --token TOKEN Access token (hard-coded; this option is deprecated)"
    echo " -d, --dest DEST Destination path (required for SCP/FTP/local)"
    echo " -H, --host HOST Host address (required for SCP/FTP)"
    echo " -u, --user USER User name (required for SCP/FTP)"
@@ -64,9 +113,9 @@ usage() {
    echo ""
    echo -e "${BLUE}Upload methods:${NC}"
    echo ""
    echo -e "${CYAN}Gitea Releases (read automatically from .git/config):${NC}"
    echo -e "${CYAN}Gitea Releases (read automatically from .git/config and version.json):${NC}"
    echo " $0 -m gitea"
    echo " $0 -m gitea -t v1.0.0 -v 1.0.0"
    echo " $0 -m gitea -t v1.0.0 -v 1.0.0 -T your_token"
    echo ""
    echo -e "${CYAN}GitHub Releases:${NC}"
    echo " $0 -m github -r owner/repo -t v1.0.0 -v 1.0.0"
@@ -185,7 +234,7 @@ check_dependencies() {
check_build_files() {
    if [ ! -d "$BUILD_DIR" ] || [ -z "$(ls -A $BUILD_DIR 2>/dev/null)" ]; then
        echo -e "${RED}Error: the build directory is empty or does not exist${NC}"
        echo "Run ./build-all.sh first to build the project"
        echo "Run ./all-build.sh first to build the project"
        exit 1
    fi

@@ -206,7 +255,7 @@ check_build_files() {

    if [ $found -eq 0 ]; then
        echo -e "${RED}Error: no build files found${NC}"
        echo "Run ./build-all.sh first to build the project"
        echo "Run ./all-build.sh first to build the project"
        exit 1
    fi
}
@@ -385,12 +434,13 @@ upload_gitea() {
        exit 1
    fi

    # The token is hard-coded; make sure the hard-coded token is used
    if [ -z "$token" ]; then
        token="${GITEA_TOKEN}"
    fi

    if [ -z "$token" ]; then
        echo -e "${RED}Error: access token not specified; use -T TOKEN or set the GITEA_TOKEN environment variable${NC}"
        echo -e "${RED}Error: access token is not configured${NC}"
        exit 1
    fi

@@ -682,10 +732,10 @@ upload_local() {
main() {
    local method="gitea"
    local selected_platforms=()
    local tag=""
    local tag="${DEFAULT_TAG}"
    local repo=""
    local base_url=""
    local token=""
    local token="${GITEA_TOKEN}"  # use the hard-coded token
    local dest=""
    local host=""
    local user=""
@@ -727,7 +777,7 @@ main() {
                shift 2
                ;;
            -T|--token)
                token="$2"
                echo -e "${YELLOW}Warning: the token is hard-coded; the -T argument is ignored${NC}"
                shift 2
                ;;
            -d|--dest)
@@ -847,9 +897,10 @@ main() {
            if [ -z "$repo" ]; then
                repo="${git_owner}/${git_repo_name}"
            fi
            if [ -z "$token" ] && [ -n "$git_token" ]; then
                token="$git_token"
            fi
            # The token is hard-coded; do not read it from git config
            # if [ -z "$token" ] && [ -n "$git_token" ]; then
            #     token="$git_token"
            # fi
            echo -e "${CYAN}[Info]${NC} Read repository information from .git/config: ${repo}"
        fi
    fi
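
This change hard-codes the Gitea token in the upload script, whereas the help text it replaces accepted `-T TOKEN` or the `GITEA_TOKEN` environment variable. For comparison, here is a minimal sketch of that earlier lookup order, which keeps the secret out of the repository. `resolve_token` is a hypothetical helper, not a function from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take the access token from the -T argument first,
# then from the GITEA_TOKEN environment variable; fail if neither is set.
resolve_token() {
    local cli_token="${1:-}"
    if [ -n "$cli_token" ]; then
        printf '%s\n' "$cli_token"
    elif [ -n "${GITEA_TOKEN:-}" ]; then
        printf '%s\n' "$GITEA_TOKEN"
    else
        echo "error: no access token; pass -T TOKEN or export GITEA_TOKEN" >&2
        return 1
    fi
}
```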
512
check-heartbeat.sh
Executable file
@@ -0,0 +1,512 @@
#!/bin/bash

# ============================================
# LinkMaster node heartbeat troubleshooting script
# Purpose: diagnose node heartbeat synchronization problems
# ============================================

set -e

# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color

# Script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

# Configuration
BINARY_NAME="agent"
LOG_FILE="node.log"
PID_FILE="node.pid"
CONFIG_FILE="${CONFIG_PATH:-config.yaml}"

# Check results
ISSUES=0
WARNINGS=0

# Print a separator line
print_separator() {
    echo -e "${CYAN}========================================${NC}"
}

# Print a check title
print_check_title() {
    echo -e "\n${BLUE}▶ $1${NC}"
}

# Print a success message
print_success() {
    echo -e "${GREEN}✓ $1${NC}"
}

# Print a warning message
print_warning() {
    echo -e "${YELLOW}⚠ $1${NC}"
    # Plain assignment: ((WARNINGS++)) returns status 1 while the counter is 0,
    # which would abort the script under set -e
    WARNINGS=$((WARNINGS + 1))
}

# Print an error message
print_error() {
    echo -e "${RED}✗ $1${NC}"
    # Same reasoning as in print_warning
    ISSUES=$((ISSUES + 1))
}

# Print an informational message
print_info() {
    echo -e "${CYAN}ℹ $1${NC}"
}

# Get the PID
get_pid() {
    if [ -f "$PID_FILE" ]; then
        PID=$(cat "$PID_FILE")
        if ps -p "$PID" > /dev/null 2>&1; then
            echo "$PID"
        else
            rm -f "$PID_FILE"
            echo ""
        fi
    else
        echo ""
    fi
}

# 1. Check the process status
check_process() {
    print_check_title "Checking process status"

    PID=$(get_pid)
    if [ -z "$PID" ]; then
        print_error "The node process is not running"
        print_info "Start the service with ./run.sh start"
        return 1
    else
        print_success "The node process is running (PID: $PID)"

        # Check how long the process has been running
        if command -v ps > /dev/null 2>&1; then
            RUNTIME=$(ps -o etime= -p "$PID" 2>/dev/null | tr -d ' ')
            if [ -n "$RUNTIME" ]; then
                print_info "Process uptime: $RUNTIME"
            fi
        fi

        # Check the process resource usage
        if command -v ps > /dev/null 2>&1; then
            # Both columns use "=" to suppress the header; awk keeps the two values separate
            CPU_MEM=$(ps -o %cpu= -o %mem= -p "$PID" 2>/dev/null | awk '{print $1"% / "$2"%"}')
            if [ -n "$CPU_MEM" ]; then
                print_info "CPU/memory usage: $CPU_MEM"
            fi
        fi

        return 0
    fi
}

# 2. Check the configuration file
check_config() {
    print_check_title "Checking configuration file"

    if [ ! -f "$CONFIG_FILE" ]; then
        print_warning "Configuration file does not exist: $CONFIG_FILE"
        print_info "Environment variables and default settings will be used"

        # Check the environment variable
        if [ -n "$BACKEND_URL" ]; then
            print_info "Using the BACKEND_URL environment variable: $BACKEND_URL"
        else
            print_warning "BACKEND_URL is not set; the default will be used: http://localhost:8080"
        fi
        # Export the effective URL so that the later network and heartbeat checks can run
        FINAL_BACKEND_URL="${BACKEND_URL:-http://localhost:8080}"
        export FINAL_BACKEND_URL
        return 0
    fi

    print_success "Configuration file exists: $CONFIG_FILE"

    # Read the configuration file contents
    if command -v yq > /dev/null 2>&1; then
        BACKEND_URL_FROM_CONFIG=$(yq eval '.backend.url' "$CONFIG_FILE" 2>/dev/null || echo "")
        HEARTBEAT_INTERVAL=$(yq eval '.heartbeat.interval' "$CONFIG_FILE" 2>/dev/null || echo "")
        NODE_ID=$(yq eval '.node.id' "$CONFIG_FILE" 2>/dev/null || echo "")
        NODE_IP=$(yq eval '.node.ip' "$CONFIG_FILE" 2>/dev/null || echo "")
    else
        # Simple parsing with grep and sed
        BACKEND_URL_FROM_CONFIG=$(grep -E "^\s*url:" "$CONFIG_FILE" | head -1 | sed 's/.*url:\s*//' | tr -d '"' | tr -d "'" || echo "")
        HEARTBEAT_INTERVAL=$(grep -E "^\s*interval:" "$CONFIG_FILE" | head -1 | sed 's/.*interval:\s*//' | tr -d '"' | tr -d "'" || echo "")
        NODE_ID=$(grep -E "^\s*id:" "$CONFIG_FILE" | head -1 | sed 's/.*id:\s*//' | tr -d '"' | tr -d "'" || echo "")
        NODE_IP=$(grep -E "^\s*ip:" "$CONFIG_FILE" | head -1 | sed 's/.*ip:\s*//' | tr -d '"' | tr -d "'" || echo "")
    fi

    # Determine which backend URL is in effect
    if [ -n "$BACKEND_URL" ]; then
        FINAL_BACKEND_URL="$BACKEND_URL"
        print_info "Using the BACKEND_URL environment variable: $FINAL_BACKEND_URL"
    elif [ -n "$BACKEND_URL_FROM_CONFIG" ]; then
        FINAL_BACKEND_URL="$BACKEND_URL_FROM_CONFIG"
        print_info "Using the backend URL from the configuration file: $FINAL_BACKEND_URL"
    else
        FINAL_BACKEND_URL="http://localhost:8080"
        print_warning "No backend URL configured; using the default: $FINAL_BACKEND_URL"
    fi

    if [ -n "$HEARTBEAT_INTERVAL" ]; then
        print_info "Heartbeat interval: ${HEARTBEAT_INTERVAL} seconds"
    else
        print_info "Heartbeat interval: 60 seconds (default)"
    fi

    if [ -n "$NODE_ID" ] && [ "$NODE_ID" != "0" ] && [ "$NODE_ID" != "null" ]; then
        print_success "Node ID is configured: $NODE_ID"
    else
        print_warning "Node ID is not configured or is 0; it will be obtained on the first heartbeat"
    fi

    if [ -n "$NODE_IP" ] && [ "$NODE_IP" != "null" ]; then
        print_success "Node IP is configured: $NODE_IP"
    else
        print_warning "Node IP is not configured; it will be obtained on the first heartbeat"
    fi

    export FINAL_BACKEND_URL
}

# 3. Check the network connection
check_network() {
    print_check_title "Checking network connection"

    if [ -z "$FINAL_BACKEND_URL" ]; then
        print_error "Cannot determine the backend URL; skipping the network check"
        return 1
    fi

    # Extract the host and port
    BACKEND_AUTHORITY=$(echo "$FINAL_BACKEND_URL" | sed -E 's|https?://||' | cut -d'/' -f1)
    BACKEND_HOST=$(echo "$BACKEND_AUTHORITY" | cut -d':' -f1)
    # cut -s prints nothing when there is no ':', so the default-port branch below can run
    BACKEND_PORT=$(echo "$BACKEND_AUTHORITY" | cut -s -d':' -f2)

    if [ -z "$BACKEND_PORT" ]; then
        if echo "$FINAL_BACKEND_URL" | grep -q "https://"; then
            BACKEND_PORT=443
        else
            BACKEND_PORT=80
        fi
    fi

    print_info "Backend address: $BACKEND_HOST:$BACKEND_PORT"

    # Check DNS resolution
    if command -v nslookup > /dev/null 2>&1 || command -v host > /dev/null 2>&1; then
        if command -v nslookup > /dev/null 2>&1; then
            if nslookup "$BACKEND_HOST" > /dev/null 2>&1; then
                print_success "DNS resolution succeeded: $BACKEND_HOST"
            else
                print_error "DNS resolution failed: $BACKEND_HOST"
                return 1
            fi
        elif command -v host > /dev/null 2>&1; then
            if host "$BACKEND_HOST" > /dev/null 2>&1; then
                print_success "DNS resolution succeeded: $BACKEND_HOST"
            else
                print_error "DNS resolution failed: $BACKEND_HOST"
                return 1
            fi
        fi
    fi

    # Check port connectivity
    if command -v nc > /dev/null 2>&1; then
        if nc -z -w 3 "$BACKEND_HOST" "$BACKEND_PORT" 2>/dev/null; then
            print_success "Port connectivity check passed: $BACKEND_HOST:$BACKEND_PORT"
        else
            print_error "Cannot connect to port: $BACKEND_HOST:$BACKEND_PORT"
            print_info "Possible causes: blocked by a firewall, backend service not started, network unreachable"
            return 1
        fi
    elif command -v timeout > /dev/null 2>&1 && command -v bash > /dev/null 2>&1; then
        # Use bash's built-in TCP connection test
        if timeout 3 bash -c "echo > /dev/tcp/$BACKEND_HOST/$BACKEND_PORT" 2>/dev/null; then
            print_success "Port connectivity check passed: $BACKEND_HOST:$BACKEND_PORT"
        else
            print_error "Cannot connect to port: $BACKEND_HOST:$BACKEND_PORT"
            print_info "Possible causes: blocked by a firewall, backend service not started, network unreachable"
            return 1
        fi
    else
        print_warning "Cannot check port connectivity (requires nc or timeout)"
    fi

    # Check the HTTP connection
    HEARTBEAT_URL="${FINAL_BACKEND_URL%/}/api/node/heartbeat"
    print_info "Testing heartbeat endpoint: $HEARTBEAT_URL"

    if command -v curl > /dev/null 2>&1; then
        # curl already prints 000 through -w when the connection fails,
        # so only its exit status is suppressed here
        HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 --max-time 10 \
            -X POST \
            -H "Content-Type: application/x-www-form-urlencoded" \
            -d "type=pingServer" \
            "$HEARTBEAT_URL" 2>/dev/null || true)
        HTTP_CODE="${HTTP_CODE:-000}"

        if [ "$HTTP_CODE" = "200" ]; then
            print_success "Heartbeat endpoint responded normally (HTTP 200)"
        elif [ "$HTTP_CODE" = "000" ]; then
            print_error "Cannot connect to the heartbeat endpoint"
            print_info "Possible causes: network unreachable, backend service not started, blocked by a firewall"
            return 1
        else
            print_warning "Heartbeat endpoint returned an unexpected status code: HTTP $HTTP_CODE"
            print_info "This may be normal, depending on the backend implementation"
        fi
    elif command -v wget > /dev/null 2>&1; then
        HTTP_CODE=$(wget --spider --server-response --timeout=5 --tries=1 \
            --post-data="type=pingServer" \
            --header="Content-Type: application/x-www-form-urlencoded" \
            "$HEARTBEAT_URL" 2>&1 | grep -E "HTTP/" | tail -1 | awk '{print $2}' || true)
        # The pipeline prints nothing when no response arrives, so default to 000
        HTTP_CODE="${HTTP_CODE:-000}"

        if [ "$HTTP_CODE" = "200" ]; then
            print_success "Heartbeat endpoint responded normally (HTTP 200)"
        elif [ "$HTTP_CODE" = "000" ]; then
            print_error "Cannot connect to the heartbeat endpoint"
            return 1
        else
            print_warning "Heartbeat endpoint returned an unexpected status code: HTTP $HTTP_CODE"
        fi
    else
        print_warning "Cannot test the HTTP connection (requires curl or wget)"
    fi

    return 0
}

# 4. Check the log
check_logs() {
    print_check_title "Checking log file"

    if [ ! -f "$LOG_FILE" ]; then
        print_warning "Log file does not exist: $LOG_FILE"
        print_info "If the service has just started, the log file may not have been created yet"
        return 0
    fi

    print_success "Log file exists: $LOG_FILE"

    # Check the log file size
    LOG_SIZE=$(stat -f%z "$LOG_FILE" 2>/dev/null || stat -c%s "$LOG_FILE" 2>/dev/null || echo "0")
    if [ "$LOG_SIZE" -gt 10485760 ]; then
        print_warning "Log file is large: $(($LOG_SIZE / 1024 / 1024))MB"
    fi

    # Look for recent heartbeat records (the Chinese patterns match the agent's log messages)
    print_info "Looking for recent heartbeat records..."

    HEARTBEAT_SUCCESS=$(grep -i "心跳发送成功\|heartbeat.*success\|心跳响应" "$LOG_FILE" 2>/dev/null | tail -5 || true)
    HEARTBEAT_FAILED=$(grep -i "心跳发送失败\|heartbeat.*fail\|发送心跳失败" "$LOG_FILE" 2>/dev/null | tail -5 || true)
    HEARTBEAT_ERROR=$(grep -i "error.*heartbeat\|心跳.*error" "$LOG_FILE" 2>/dev/null | tail -5 || true)

    if [ -n "$HEARTBEAT_SUCCESS" ]; then
        echo -e "${GREEN}Recent successful heartbeat records:${NC}"
        echo "$HEARTBEAT_SUCCESS" | while IFS= read -r line; do
            echo " $line"
        done
    fi

    if [ -n "$HEARTBEAT_FAILED" ]; then
        echo -e "${YELLOW}Recent failed heartbeat records:${NC}"
        echo "$HEARTBEAT_FAILED" | while IFS= read -r line; do
            echo " $line"
        done
        # Plain assignment: ((WARNINGS++)) would abort under set -e while the counter is 0
        WARNINGS=$((WARNINGS + 1))
    fi

    if [ -n "$HEARTBEAT_ERROR" ]; then
        echo -e "${RED}Recent heartbeat error records:${NC}"
        echo "$HEARTBEAT_ERROR" | while IFS= read -r line; do
            echo " $line"
        done
        ISSUES=$((ISSUES + 1))
    fi

    # Check recent errors
    RECENT_ERRORS=$(grep -i "error\|fail\|panic" "$LOG_FILE" 2>/dev/null | tail -10 || true)
    if [ -n "$RECENT_ERRORS" ]; then
        echo -e "${YELLOW}Recent error records (last 10):${NC}"
        echo "$RECENT_ERRORS" | while IFS= read -r line; do
            echo " $line"
        done
    fi

    # Check the time of the last heartbeat
    LAST_HEARTBEAT=$(grep -i "心跳" "$LOG_FILE" 2>/dev/null | tail -1 || true)
    if [ -n "$LAST_HEARTBEAT" ]; then
        print_info "Last heartbeat log entry: $LAST_HEARTBEAT"
    else
        print_warning "No heartbeat records found in the log"
    fi
}

|
||||
# 5. 手动测试心跳
|
||||
test_heartbeat() {
|
||||
print_check_title "手动测试心跳发送"
|
||||
|
||||
if [ -z "$FINAL_BACKEND_URL" ]; then
|
||||
print_error "无法确定后端URL,跳过心跳测试"
|
||||
return 1
|
||||
fi
|
||||
|
||||
HEARTBEAT_URL="${FINAL_BACKEND_URL%/}/api/node/heartbeat"
|
||||
print_info "发送测试心跳到: $HEARTBEAT_URL"
|
||||
|
||||
if command -v curl > /dev/null 2>&1; then
|
||||
RESPONSE=$(curl -s -w "\n%{http_code}" --connect-timeout 10 --max-time 15 \
|
||||
-X POST \
|
||||
-H "Content-Type: application/x-www-form-urlencoded" \
|
||||
-d "type=pingServer" \
|
||||
"$HEARTBEAT_URL" 2>&1)
|
||||
|
||||
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
|
||||
BODY=$(echo "$RESPONSE" | sed '$d')
|
||||
|
||||
if [ "$HTTP_CODE" = "200" ]; then
|
||||
print_success "心跳发送成功 (HTTP 200)"
|
||||
if [ -n "$BODY" ]; then
|
||||
print_info "响应内容: $BODY"
|
||||
|
||||
# 尝试解析JSON响应
|
||||
if echo "$BODY" | grep -q "node_id\|node_ip"; then
|
||||
print_success "响应包含节点信息"
|
||||
echo "$BODY" | grep -o '"node_id":[0-9]*\|"node_ip":"[^"]*"' 2>/dev/null || true
|
||||
fi
|
||||
fi
|
||||
else
|
||||
print_error "心跳发送失败 (HTTP $HTTP_CODE)"
|
||||
if [ -n "$BODY" ]; then
|
||||
print_info "响应内容: $BODY"
|
||||
fi
|
||||
return 1
|
||||
fi
|
||||
elif command -v wget > /dev/null 2>&1; then
|
||||
RESPONSE=$(wget -qO- --post-data="type=pingServer" \
|
||||
--header="Content-Type: application/x-www-form-urlencoded" \
|
||||
--timeout=15 \
|
||||
"$HEARTBEAT_URL" 2>&1)
|
||||
|
||||
if [ $? -eq 0 ]; then
|
||||
print_success "心跳发送成功"
|
||||
if [ -n "$RESPONSE" ]; then
|
||||
print_info "响应内容: $RESPONSE"
|
||||
fi
|
||||
else
|
||||
print_error "心跳发送失败"
|
||||
return 1
|
||||
fi
|
||||
else
|
||||
print_warning "无法测试心跳(需要 curl 或 wget 命令)"
|
||||
return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
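
For comparison, the same probe can be sketched in Go. This is a minimal illustration, not code from this repository: `postHeartbeat` and `newFakeBackend` are hypothetical names, and the JSON returned by the fake backend is invented for the demo. Only the endpoint path, the form field `type=pingServer`, and the 15-second limit come from the script above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"time"
)

// postHeartbeat sends the same form-encoded probe as test_heartbeat.
// Hypothetical helper, not part of the repository.
func postHeartbeat(baseURL string) (int, string, error) {
	client := &http.Client{Timeout: 15 * time.Second}
	// Mirror ${FINAL_BACKEND_URL%/}: drop one trailing slash, if any.
	endpoint := strings.TrimSuffix(baseURL, "/") + "/api/node/heartbeat"
	resp, err := client.PostForm(endpoint, url.Values{"type": {"pingServer"}})
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

// newFakeBackend starts a local stand-in for the backend.
// The response payload is made up for this example.
func newFakeBackend() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/api/node/heartbeat" || r.FormValue("type") != "pingServer" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"node_id":1,"node_ip":"127.0.0.1"}`)
	}))
}

func main() {
	srv := newFakeBackend()
	defer srv.Close()
	status, body, err := postHeartbeat(srv.URL + "/")
	fmt.Println(status, body, err)
}
```

Unlike the wget branch of the script, this variant always has the HTTP status code available, so a non-200 reply can be told apart from a transport failure.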
|
||||
|
||||
# 6. 检查系统资源
|
||||
check_resources() {
|
||||
print_check_title "检查系统资源"
|
||||
|
||||
# 检查磁盘空间
|
||||
if command -v df > /dev/null 2>&1; then
|
||||
DISK_USAGE=$(df -h . | tail -1 | awk '{print $5}' | sed 's/%//')
|
||||
if [ "$DISK_USAGE" -gt 90 ]; then
|
||||
print_error "磁盘空间不足: ${DISK_USAGE}%"
|
||||
elif [ "$DISK_USAGE" -gt 80 ]; then
|
||||
print_warning "磁盘空间紧张: ${DISK_USAGE}%"
|
||||
else
|
||||
print_success "磁盘空间充足: ${DISK_USAGE}%"
|
||||
fi
|
||||
fi
|
||||
|
||||
# 检查内存
|
||||
if command -v free > /dev/null 2>&1; then
|
||||
MEM_INFO=$(free -m | grep Mem)
|
||||
MEM_TOTAL=$(echo "$MEM_INFO" | awk '{print $2}')
|
||||
MEM_AVAIL=$(echo "$MEM_INFO" | awk '{print $7}')
|
||||
if [ -z "$MEM_AVAIL" ]; then
|
||||
MEM_AVAIL=$(echo "$MEM_INFO" | awk '{print $4}')
|
||||
fi
|
||||
|
||||
if [ -n "$MEM_TOTAL" ] && [ -n "$MEM_AVAIL" ]; then
|
||||
MEM_PERCENT=$((MEM_AVAIL * 100 / MEM_TOTAL))
|
||||
if [ "$MEM_PERCENT" -lt 10 ]; then
|
||||
print_error "可用内存不足: ${MEM_AVAIL}MB / ${MEM_TOTAL}MB (${MEM_PERCENT}%)"
|
||||
elif [ "$MEM_PERCENT" -lt 20 ]; then
|
||||
print_warning "可用内存紧张: ${MEM_AVAIL}MB / ${MEM_TOTAL}MB (${MEM_PERCENT}%)"
|
||||
else
|
||||
print_success "内存充足: ${MEM_AVAIL}MB / ${MEM_TOTAL}MB (${MEM_PERCENT}%)"
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
# 主函数
|
||||
main() {
|
||||
echo -e "${CYAN}"
|
||||
echo "========================================"
|
||||
echo " LinkMaster 节点心跳故障排查工具"
|
||||
echo "========================================"
|
||||
echo -e "${NC}"
|
||||
|
||||
# 执行各项检查
|
||||
check_process
|
||||
PROCESS_OK=$?
|
||||
|
||||
check_config
|
||||
|
||||
if [ $PROCESS_OK -eq 0 ]; then
|
||||
check_network
|
||||
NETWORK_OK=$?
|
||||
|
||||
check_logs
|
||||
|
||||
if [ $NETWORK_OK -eq 0 ]; then
|
||||
echo ""
|
||||
read -p "是否执行手动心跳测试? (y/N): " -n 1 -r
|
||||
echo ""
|
||||
if [[ $REPLY =~ ^[Yy]$ ]]; then
|
||||
test_heartbeat
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
check_resources
|
||||
|
||||
# 总结
|
||||
print_separator
|
||||
echo -e "\n${BLUE}排查总结:${NC}"
|
||||
|
||||
if [ $ISSUES -eq 0 ] && [ $WARNINGS -eq 0 ]; then
|
||||
echo -e "${GREEN}✓ 未发现明显问题${NC}"
|
||||
echo -e "${CYAN}如果心跳仍然无法同步,请检查:${NC}"
|
||||
echo " 1. 后端服务是否正常运行"
|
||||
echo " 2. 后端数据库是否正常"
|
||||
echo " 3. 防火墙规则是否正确配置"
|
||||
echo " 4. 查看完整日志: ./run.sh logs-all"
|
||||
else
|
||||
if [ $ISSUES -gt 0 ]; then
|
||||
echo -e "${RED}发现 $ISSUES 个严重问题${NC}"
|
||||
fi
|
||||
if [ $WARNINGS -gt 0 ]; then
|
||||
echo -e "${YELLOW}发现 $WARNINGS 个警告${NC}"
|
||||
fi
|
||||
|
||||
echo -e "\n${CYAN}建议操作:${NC}"
|
||||
echo " 1. 根据上述检查结果修复问题"
|
||||
echo " 2. 重启服务: ./run.sh restart"
|
||||
echo " 3. 查看实时日志: ./run.sh logs"
|
||||
echo " 4. 查看完整日志: ./run.sh logs-all"
|
||||
fi
|
||||
|
||||
print_separator
|
||||
}
|
||||
|
||||
# 运行主函数
|
||||
main
|
||||
@@ -5,6 +5,7 @@ import (
|
||||
"fmt"
|
||||
"os"
|
||||
"os/signal"
|
||||
"path/filepath"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
@@ -14,8 +15,11 @@ import (
|
||||
"linkmaster-node/internal/server"
|
||||
|
||||
"go.uber.org/zap"
|
||||
"go.uber.org/zap/zapcore"
|
||||
)
|
||||
|
||||
var version = "1.1.0" // 编译时通过 -ldflags "-X main.version=xxx" 设置
|
||||
|
||||
func main() {
|
||||
// 加载配置
|
||||
cfg, err := config.Load()
|
||||
@@ -32,7 +36,7 @@ func main() {
|
||||
 	}
 	defer logger.Sync()

-	logger.Info("节点服务启动", zap.String("version", "1.0.0"))
+	logger.Info("节点服务启动", zap.String("version", version))

 	// Initialize error recovery
 	recovery.Init()
|
||||
@@ -80,9 +84,69 @@ func main() {
|
||||
 }

 func initLogger(cfg *config.Config) (*zap.Logger, error) {
-	if cfg.Debug {
-		return zap.NewDevelopment()
+	// Determine the log level
+	var level zapcore.Level
+	logLevel := cfg.Log.Level
+	if logLevel == "" {
+		if cfg.Debug {
+			logLevel = "debug"
+		} else {
+			logLevel = "info"
+		}
 	}
-	return zap.NewProduction()
-}
+
+	switch logLevel {
+	case "debug":
+		level = zapcore.DebugLevel
+	case "info":
+		level = zapcore.InfoLevel
+	case "warn":
+		level = zapcore.WarnLevel
+	case "error":
+		level = zapcore.ErrorLevel
+	default:
+		level = zapcore.InfoLevel
+	}
+
+	// Encoder configuration
+	encoderConfig := zap.NewProductionEncoderConfig()
+	if cfg.Debug {
+		encoderConfig = zap.NewDevelopmentEncoderConfig()
+	}
+	encoderConfig.EncodeTime = zapcore.ISO8601TimeEncoder
+	encoderConfig.EncodeLevel = zapcore.CapitalLevelEncoder
+
+	// Determine the output destination
+	var writeSyncer zapcore.WriteSyncer
+	if cfg.Log.File != "" {
+		// Make sure the log directory exists
+		logDir := filepath.Dir(cfg.Log.File)
+		if logDir != "." && logDir != "" {
+			if err := os.MkdirAll(logDir, 0755); err != nil {
+				return nil, fmt.Errorf("创建日志目录失败: %w", err)
+			}
+		}
+
+		// Open the log file in append mode
+		logFile, err := os.OpenFile(cfg.Log.File, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
+		if err != nil {
+			return nil, fmt.Errorf("打开日志文件失败: %w", err)
+		}
+		writeSyncer = zapcore.AddSync(logFile)
+	} else {
+		// Write to standard error (matches the previous behavior)
+		writeSyncer = zapcore.AddSync(os.Stderr)
+	}
+
+	// Build the core
+	core := zapcore.NewCore(
+		zapcore.NewJSONEncoder(encoderConfig),
+		writeSyncer,
+		level,
+	)
+
+	// Build the logger
+	logger := zap.New(core, zap.AddCaller(), zap.AddStacktrace(zapcore.ErrorLevel))
+
+	return logger, nil
 }
|
||||
|
||||
105
install.sh
@@ -8,6 +8,30 @@
|
||||
|
||||
set -e
|
||||
|
||||
# 错误处理函数
|
||||
error_handler() {
|
||||
local line_number=$1
|
||||
local command=$2
|
||||
echo ""
|
||||
echo -e "${RED}========================================${NC}"
|
||||
echo -e "${RED} 脚本执行出错!${NC}"
|
||||
echo -e "${RED}========================================${NC}"
|
||||
echo -e "${YELLOW}错误位置: 第 ${line_number} 行${NC}"
|
||||
echo -e "${YELLOW}失败命令: ${command}${NC}"
|
||||
echo ""
|
||||
echo -e "${YELLOW}故障排查建议:${NC}"
|
||||
echo " 1. 检查网络连接是否正常"
|
||||
echo " 2. 检查后端地址是否正确: ${BACKEND_URL:-未设置}"
|
||||
echo " 3. 检查是否有足够的磁盘空间和权限"
|
||||
echo " 4. 查看上面的详细错误信息"
|
||||
echo ""
|
||||
echo -e "${YELLOW}查看服务日志: sudo journalctl -u ${SERVICE_NAME:-linkmaster-node} -n 50${NC}"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# 设置错误陷阱
|
||||
trap 'error_handler ${LINENO} "${BASH_COMMAND}"' ERR
|
||||
|
||||
# 颜色输出
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
@@ -96,7 +120,6 @@ detect_fastest_mirror() {
|
||||
|
||||
# Ubuntu/Debian 镜像源列表
|
||||
UBUNTU_MIRRORS=(
|
||||
"mirrors.tuna.tsinghua.edu.cn"
|
||||
"mirrors.huaweicloud.com"
|
||||
"mirrors.163.com"
|
||||
"archive.ubuntu.com"
|
||||
@@ -104,7 +127,6 @@ detect_fastest_mirror() {
|
||||
|
||||
# CentOS/RHEL 镜像源列表
|
||||
CENTOS_MIRRORS=(
|
||||
"mirrors.tuna.tsinghua.edu.cn"
|
||||
"mirrors.huaweicloud.com"
|
||||
)
|
||||
|
||||
@@ -1264,11 +1286,18 @@ EOF
|
||||
|
||||
     # Call the heartbeat API to obtain the node information
     echo -e "${BLUE}发送心跳请求获取节点信息...${NC}"
-    RESPONSE=$(curl -s -X POST "${BACKEND_URL}/api/node/heartbeat" \
+    echo -e "${BLUE}后端地址: ${BACKEND_URL}${NC}"
+
+    # Add timeouts so the request cannot hang for a long time
+    # Temporarily disable exit-on-error with set +e: a failed heartbeat should not block the installation
+    set +e
+    RESPONSE=$(curl -s --connect-timeout 10 --max-time 30 -X POST "${BACKEND_URL}/api/node/heartbeat" \
         -H "Content-Type: application/x-www-form-urlencoded" \
         -d "type=pingServer" 2>&1)
+    CURL_EXIT_CODE=$?
+    set -e # re-enable exit-on-error

-    if [ $? -eq 0 ]; then
+    if [ $CURL_EXIT_CODE -eq 0 ]; then
         # Try to parse the JSON response
         NODE_ID=$(echo "$RESPONSE" | grep -o '"node_id":[0-9]*' | grep -o '[0-9]*' | head -1)
         NODE_IP=$(echo "$RESPONSE" | grep -o '"node_ip":"[^"]*"' | cut -d'"' -f4 | head -1)
|
||||
@@ -1306,8 +1335,10 @@ EOF
|
||||
             echo -e "${YELLOW} 响应: ${RESPONSE}${NC}"
         fi
     else
-        echo -e "${YELLOW}⚠ 心跳请求失败,将在服务启动时重试${NC}"
-        echo -e "${YELLOW} 错误: ${RESPONSE}${NC}"
+        echo -e "${YELLOW}⚠ 心跳请求失败 (退出码: ${CURL_EXIT_CODE}),将在服务启动时重试${NC}"
+        echo -e "${YELLOW} 错误信息: ${RESPONSE}${NC}"
+        echo -e "${YELLOW} 提示: 请检查后端地址是否正确: ${BACKEND_URL}${NC}"
+        echo -e "${YELLOW} 测试连接: curl -v ${BACKEND_URL}/api/public/nodes/online${NC}"
     fi

     # Set the configuration file permissions
|
||||
@@ -1318,18 +1349,52 @@ EOF
|
||||
start_service() {
|
||||
echo -e "${BLUE}启动服务...${NC}"
|
||||
|
||||
sudo systemctl enable ${SERVICE_NAME} > /dev/null 2>&1
|
||||
sudo systemctl restart ${SERVICE_NAME}
|
||||
# 先检查服务文件是否存在
|
||||
if [ ! -f "/etc/systemd/system/${SERVICE_NAME}.service" ]; then
|
||||
echo -e "${RED}✗ 错误: 服务文件不存在${NC}"
|
||||
echo -e "${YELLOW} 路径: /etc/systemd/system/${SERVICE_NAME}.service${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# 检查二进制文件是否存在
|
||||
if [ ! -f "$SOURCE_DIR/agent" ] && [ ! -f "$INSTALL_DIR/$BINARY_NAME" ]; then
|
||||
echo -e "${RED}✗ 错误: 二进制文件不存在${NC}"
|
||||
echo -e "${YELLOW} 检查路径: $SOURCE_DIR/agent 或 $INSTALL_DIR/$BINARY_NAME${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# 启用服务(显示输出以便调试)
|
||||
echo -e "${BLUE}启用服务...${NC}"
|
||||
if ! sudo systemctl enable ${SERVICE_NAME} 2>&1; then
|
||||
echo -e "${RED}✗ 启用服务失败${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# 重新加载 systemd
|
||||
echo -e "${BLUE}重新加载 systemd...${NC}"
|
||||
sudo systemctl daemon-reload
|
||||
|
||||
# 启动服务(显示输出以便调试)
|
||||
echo -e "${BLUE}启动服务...${NC}"
|
||||
if ! sudo systemctl restart ${SERVICE_NAME} 2>&1; then
|
||||
echo -e "${RED}✗ 启动服务失败${NC}"
|
||||
echo -e "${YELLOW}查看详细日志: sudo journalctl -u ${SERVICE_NAME} -n 100 --no-pager${NC}"
|
||||
echo -e "${YELLOW}查看服务状态: sudo systemctl status ${SERVICE_NAME}${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# 等待服务启动
|
||||
echo -e "${BLUE}等待服务启动...${NC}"
|
||||
sleep 3
|
||||
|
||||
# 检查服务状态
|
||||
if sudo systemctl is-active --quiet ${SERVICE_NAME}; then
|
||||
if sudo systemctl is-active --quiet ${SERVICE_NAME} 2>/dev/null; then
|
||||
echo -e "${GREEN}✓ 服务启动成功${NC}"
|
||||
else
|
||||
echo -e "${RED}✗ 服务启动失败${NC}"
|
||||
echo -e "${YELLOW}查看日志: sudo journalctl -u ${SERVICE_NAME} -n 50${NC}"
|
||||
echo -e "${YELLOW}服务状态:${NC}"
|
||||
sudo systemctl status ${SERVICE_NAME} --no-pager -l || true
|
||||
echo -e "${YELLOW}查看详细日志: sudo journalctl -u ${SERVICE_NAME} -n 100 --no-pager${NC}"
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
@@ -1387,20 +1452,29 @@ main() {
|
||||
echo -e "${GREEN} LinkMaster 节点端安装程序${NC}"
|
||||
echo -e "${GREEN}========================================${NC}"
|
||||
echo ""
|
||||
echo -e "${BLUE}后端地址: ${BACKEND_URL}${NC}"
|
||||
echo ""
|
||||
|
||||
echo -e "${BLUE}[1/10] 检测系统类型...${NC}"
|
||||
detect_system
|
||||
|
||||
# 检查是否已安装,如果已安装则先卸载
|
||||
if check_installed; then
|
||||
echo -e "${BLUE}[2/10] 卸载已存在的服务...${NC}"
|
||||
uninstall_service
|
||||
else
|
||||
echo -e "${BLUE}[2/10] 检查已安装服务...${NC}"
|
||||
echo -e "${GREEN}✓ 未检测到已安装的服务${NC}"
|
||||
fi
|
||||
|
||||
# 检测并配置最快的镜像源(在安装依赖之前)
|
||||
echo -e "${BLUE}[3/10] 检测并配置镜像源...${NC}"
|
||||
detect_fastest_mirror
|
||||
|
||||
echo -e "${BLUE}[4/10] 安装系统依赖...${NC}"
|
||||
install_dependencies
|
||||
|
||||
# 优先尝试从 Releases 下载二进制文件
|
||||
echo -e "${BLUE}[5/10] 下载或编译二进制文件...${NC}"
|
||||
if ! download_binary_from_releases; then
|
||||
echo -e "${BLUE}从 Releases 下载失败,开始从源码编译...${NC}"
|
||||
build_from_source
|
||||
@@ -1408,10 +1482,19 @@ main() {
|
||||
echo -e "${GREEN}✓ 使用预编译二进制文件,跳过编译步骤${NC}"
|
||||
fi
|
||||
|
||||
echo -e "${BLUE}[6/10] 创建 systemd 服务...${NC}"
|
||||
create_service
|
||||
|
||||
echo -e "${BLUE}[7/10] 配置防火墙规则...${NC}"
|
||||
configure_firewall
|
||||
|
||||
echo -e "${BLUE}[8/10] 登记节点到后端服务器...${NC}"
|
||||
register_node
|
||||
|
||||
echo -e "${BLUE}[9/10] 启动服务...${NC}"
|
||||
start_service
|
||||
|
||||
echo -e "${BLUE}[10/10] 验证安装...${NC}"
|
||||
verify_installation
|
||||
|
||||
echo ""
|
||||
|
||||
@@ -21,6 +21,11 @@ type Config struct {
|
||||
Interval int `yaml:"interval"` // 心跳间隔(秒)
|
||||
} `yaml:"heartbeat"`
|
||||
|
||||
Log struct {
|
||||
File string `yaml:"file"` // 日志文件路径(空则输出到标准错误)
|
||||
Level string `yaml:"level"` // 日志级别:debug, info, warn, error(默认: info)
|
||||
} `yaml:"log"`
|
||||
|
||||
Debug bool `yaml:"debug"`
|
||||
|
||||
// 节点信息(通过心跳获取并持久化)
|
||||
@@ -42,12 +47,13 @@ func Load() (*Config, error) {
|
||||
 	cfg.Heartbeat.Interval = 60
 	cfg.Debug = false

-	// Read the backend URL from the environment
-	backendURL := os.Getenv("BACKEND_URL")
-	if backendURL == "" {
-		backendURL = "http://localhost:8080"
+	// Default logging configuration
+	logFile := os.Getenv("LOG_FILE")
+	if logFile == "" {
+		logFile = "node.log"
 	}
-	cfg.Backend.URL = backendURL
+	cfg.Log.File = logFile
+	cfg.Log.Level = "info"

 	// Try to read from the configuration file
 	configPath := os.Getenv("CONFIG_PATH")
|
||||
@@ -66,6 +72,30 @@ func Load() (*Config, error) {
|
||||
}
|
||||
}
|
||||
|
||||
// 环境变量优先级最高,覆盖配置文件中的设置
|
||||
// 支持 BACKEND_URL 环境变量覆盖后端地址
|
||||
if backendURL := os.Getenv("BACKEND_URL"); backendURL != "" {
|
||||
cfg.Backend.URL = backendURL
|
||||
}
|
||||
|
||||
// 如果配置文件中没有设置日志文件,使用环境变量或默认值
|
||||
if cfg.Log.File == "" {
|
||||
logFile := os.Getenv("LOG_FILE")
|
||||
if logFile == "" {
|
||||
logFile = "node.log"
|
||||
}
|
||||
cfg.Log.File = logFile
|
||||
}
|
||||
|
||||
// 如果配置文件中没有设置日志级别,使用默认值
|
||||
if cfg.Log.Level == "" {
|
||||
if cfg.Debug {
|
||||
cfg.Log.Level = "debug"
|
||||
} else {
|
||||
cfg.Log.Level = "info"
|
||||
}
|
||||
}
|
||||
|
||||
return cfg, nil
|
||||
}
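
The precedence this change establishes for the backend address (environment variable over configuration file) can be summarized in a few lines. The sketch below is an illustration under stated assumptions: `resolveBackendURL` is a hypothetical helper that does not exist in the repository, and the fallback constant reuses the `http://localhost:8080` value from the removed lines, which may not be the default the new code actually applies.

```go
package main

import "fmt"

// defaultBackendURL is an assumption: it reuses the fallback from the
// lines this diff removes.
const defaultBackendURL = "http://localhost:8080"

// resolveBackendURL is a hypothetical helper that applies the order
// environment variable > configuration file > default.
func resolveBackendURL(envValue, fileValue string) string {
	if envValue != "" {
		return envValue
	}
	if fileValue != "" {
		return fileValue
	}
	return defaultBackendURL
}

func main() {
	fmt.Println(resolveBackendURL("http://env:8080", "http://file:8080"))
	fmt.Println(resolveBackendURL("", "http://file:8080"))
	fmt.Println(resolveBackendURL("", ""))
}
```

In the real `Load` function the first argument would come from `os.Getenv("BACKEND_URL")` and the second from the parsed `config.yaml`.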
|
||||
|
||||
@@ -102,4 +132,3 @@ func GetConfigPath() string {
|
||||
}
|
||||
return configPath
|
||||
}
|
||||
|
||||
|
||||
@@ -28,14 +28,47 @@ type TCPingTask struct {
|
||||
 }

 func NewTCPingTask(taskID, target string, interval, maxDuration time.Duration) (*TCPingTask, error) {
-	// Parse host:port
-	parts := strings.Split(target, ":")
-	if len(parts) != 2 {
-		return nil, fmt.Errorf("无效的target格式,需要 host:port")
+	// Parse host:port; default to port 80 when no port is given
+	var host string
+	var portStr string
+	var port int
+
+	// Check for the bracketed IPv6 form (e.g. [::1]:8080)
+	if strings.HasPrefix(target, "[") {
+		// IPv6 form: use Index rather than LastIndex to find the first closing bracket
+		closeBracket := strings.Index(target, "]")
+		if closeBracket == -1 {
+			return nil, fmt.Errorf("无效的target格式,IPv6地址格式应为 [host]:port")
+		}
+		host = target[1:closeBracket]
+		if closeBracket+1 < len(target) && target[closeBracket+1] == ':' {
+			portStr = target[closeBracket+2:]
+			// Fall back to the default port 80 when the port part is empty (fixes Bug 1)
+			if portStr == "" {
+				portStr = "80"
+			}
+		} else {
+			portStr = "80" // default port
+		}
+	} else {
+		// Plain form: host:port or host
+		lastColonIndex := strings.LastIndex(target, ":")
+		if lastColonIndex == -1 {
+			// No colon: use the default port 80
+			host = target
+			portStr = "80"
+		} else {
+			host = target[:lastColonIndex]
+			portStr = target[lastColonIndex+1:]
+			// Fall back to the default port 80 when the port part is empty
+			if portStr == "" {
+				portStr = "80"
+			}
+		}
 	}

-	host := parts[0]
-	port, err := strconv.Atoi(parts[1])
+	var err error
+	port, err = strconv.Atoi(portStr)
 	if err != nil {
 		return nil, fmt.Errorf("无效的端口: %v", err)
 	}
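
The target parsing in this hunk could also lean on the standard library. The following is a sketch of an alternative, not the code from this change: `parseTarget` and `defaultPort` are hypothetical names, the port range check is an addition, and a bare IPv6 address without brackets (such as `::1`) is rejected here instead of being split at its last colon.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// defaultPort matches the fallback used in the diff.
const defaultPort = "80"

// parseTarget is a hypothetical helper that accepts "host", "host:port",
// "[v6]" and "[v6]:port" and returns the host and numeric port.
func parseTarget(target string) (string, int, error) {
	var host, portStr string
	switch {
	case strings.HasPrefix(target, "[") && strings.HasSuffix(target, "]"):
		// Bracketed IPv6 literal without a port, e.g. "[::1]".
		host, portStr = target[1:len(target)-1], defaultPort
	case !strings.ContainsAny(target, ":[]"):
		// Bare hostname or IPv4 address without a port.
		host, portStr = target, defaultPort
	default:
		var err error
		host, portStr, err = net.SplitHostPort(target)
		if err != nil {
			return "", 0, fmt.Errorf("invalid target %q: %w", target, err)
		}
		if portStr == "" {
			portStr = defaultPort // "host:" with an empty port part
		}
	}
	if host == "" {
		return "", 0, fmt.Errorf("invalid target %q: empty host", target)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return "", 0, fmt.Errorf("invalid port %q", portStr)
	}
	return host, port, nil
}

func main() {
	for _, target := range []string{"example.com", "example.com:443", "[::1]:8080", "[::1]", "example.com:abc"} {
		host, port, err := parseTarget(target)
		fmt.Println(target, host, port, err)
	}
}
```

Delegating to `net.SplitHostPort` keeps the bracket handling in one well-tested place; the trade-off is the stricter treatment of unbracketed IPv6 input noted above.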
|
||||
@@ -80,10 +113,10 @@ func (t *TCPingTask) Start(ctx context.Context, resultCallback func(result map[s
|
||||
if !isRunning {
|
||||
return
|
||||
}
|
||||
|
||||
|
||||
// 执行tcping测试(每次测试完成后立即返回结果)
|
||||
result := t.executeTCPing()
|
||||
|
||||
|
||||
// 再次检查任务是否已停止(执行完成后)
|
||||
t.mu.RLock()
|
||||
isRunning = t.IsRunning
|
||||
@@ -91,7 +124,7 @@ func (t *TCPingTask) Start(ctx context.Context, resultCallback func(result map[s
|
||||
if !isRunning {
|
||||
return
|
||||
}
|
||||
|
||||
|
||||
if resultCallback != nil {
|
||||
resultCallback(result)
|
||||
}
|
||||
@@ -117,7 +150,7 @@ func (t *TCPingTask) Stop() {
|
||||
}
|
||||
t.IsRunning = false
|
||||
t.mu.Unlock()
|
||||
|
||||
|
||||
// 关闭停止通道
|
||||
select {
|
||||
case <-t.StopCh:
|
||||
@@ -125,7 +158,7 @@ func (t *TCPingTask) Stop() {
|
||||
default:
|
||||
close(t.StopCh)
|
||||
}
|
||||
|
||||
|
||||
t.logger.Info("TCPing任务已停止", zap.String("task_id", t.TaskID))
|
||||
}
|
||||
|
||||
@@ -185,4 +218,3 @@ func (t *TCPingTask) executeTCPing() map[string]interface{} {
|
||||
"ip": targetIP,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -8,6 +8,7 @@ import (
|
||||
"io"
|
||||
"net"
|
||||
"net/http"
|
||||
"os"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
@@ -18,6 +19,7 @@ import (
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"go.uber.org/zap"
|
||||
"go.uber.org/zap/zapcore"
|
||||
)
|
||||
|
||||
var continuousTasks = make(map[string]*ContinuousTask)
|
||||
@@ -46,7 +48,52 @@ const (
|
||||
|
||||
func InitContinuousHandler(cfg *config.Config) {
|
||||
backendURL = cfg.Backend.URL
|
||||
logger, _ = zap.NewProduction()
|
||||
|
||||
// 根据配置创建logger
|
||||
var level zapcore.Level
|
||||
logLevel := cfg.Log.Level
|
||||
if logLevel == "" {
|
||||
if cfg.Debug {
|
||||
logLevel = "debug"
|
||||
} else {
|
||||
logLevel = "info"
|
||||
}
|
||||
}
|
||||
|
||||
switch logLevel {
|
||||
case "debug":
|
||||
level = zapcore.DebugLevel
|
||||
case "info":
|
||||
level = zapcore.InfoLevel
|
||||
case "warn":
|
||||
level = zapcore.WarnLevel
|
||||
case "error":
|
||||
level = zapcore.ErrorLevel
|
||||
default:
|
||||
level = zapcore.InfoLevel
|
||||
}
|
||||
|
||||
// 创建编码器配置
|
||||
encoderConfig := zap.NewProductionEncoderConfig()
|
||||
if cfg.Debug {
|
||||
encoderConfig = zap.NewDevelopmentEncoderConfig()
|
||||
}
|
||||
encoderConfig.EncodeTime = zapcore.ISO8601TimeEncoder
|
||||
encoderConfig.EncodeLevel = zapcore.CapitalLevelEncoder
|
||||
|
||||
// 创建核心 - 输出到标准错误(日志文件由main.go统一管理,这里输出到stderr便于调试)
|
||||
core := zapcore.NewCore(
|
||||
zapcore.NewJSONEncoder(encoderConfig),
|
||||
zapcore.AddSync(os.Stderr),
|
||||
level,
|
||||
)
|
||||
|
||||
// 创建logger
|
||||
logger = zap.New(core, zap.AddCaller(), zap.AddStacktrace(zapcore.ErrorLevel))
|
||||
|
||||
logger.Info("持续测试处理器已初始化",
|
||||
zap.String("backend_url", backendURL),
|
||||
zap.String("log_level", logLevel))
|
||||
}
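
The level selection above repeats the logic in `initLogger`, so one shared helper might serve both call sites. The sketch below shows the shape of such a helper using only the standard library. It is an assumption-laden illustration: `resolveLevel` and the local `level` type are hypothetical stand-ins for `zapcore.Level`, chosen so the example runs without the zap dependency.

```go
package main

import "fmt"

// level is a local stand-in for zapcore.Level (hypothetical).
type level int8

const (
	levelDebug level = iota - 1 // -1
	levelInfo                   // 0
	levelWarn                   // 1
	levelError                  // 2
)

// resolveLevel maps the configured name to a level. An empty name falls
// back to debug or info depending on the debug flag; unknown names map
// to info, as in the switch statements in this diff.
func resolveLevel(name string, debug bool) level {
	if name == "" {
		if debug {
			return levelDebug
		}
		return levelInfo
	}
	switch name {
	case "debug":
		return levelDebug
	case "warn":
		return levelWarn
	case "error":
		return levelError
	default:
		return levelInfo
	}
}

func main() {
	fmt.Println(resolveLevel("", true), resolveLevel("warn", false), resolveLevel("bogus", false))
}
```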
|
||||
|
||||
type ContinuousTask struct {
|
||||
@@ -160,7 +207,15 @@ func HandleContinuousStop(c *gin.Context) {
|
||||
 		if task.tcpingTask != nil {
 			task.tcpingTask.Stop()
 		}
-		close(task.StopCh)
+
+		// Close the stop channel
+		select {
+		case <-task.StopCh:
+			// already closed
+		default:
+			close(task.StopCh)
+		}
+
 		delete(continuousTasks, req.TaskID)
 	}
 	taskMutex.Unlock()
|
||||
@@ -170,6 +225,17 @@ func HandleContinuousStop(c *gin.Context) {
|
||||
return
|
||||
}
|
||||
|
||||
// 清理推送缓冲
|
||||
bufferMutex.Lock()
|
||||
if buffer, exists := pushBuffers[req.TaskID]; exists {
|
||||
if buffer.pushTimer != nil {
|
||||
buffer.pushTimer.Stop()
|
||||
}
|
||||
delete(pushBuffers, req.TaskID)
|
||||
logger.Debug("已清理任务推送缓冲", zap.String("task_id", req.TaskID))
|
||||
}
|
||||
bufferMutex.Unlock()
|
||||
|
||||
c.JSON(http.StatusOK, gin.H{"message": "任务已停止"})
|
||||
}
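
The check-then-close pattern used for `StopCh` avoids a double close only while every closer holds the same mutex. An alternative that carries its own guarantee is `sync.Once`. The sketch below is illustrative only: `stopSignal` is a hypothetical type, not something defined in this repository.

```go
package main

import (
	"fmt"
	"sync"
)

// stopSignal is a hypothetical wrapper around a stop channel that can be
// closed safely from several goroutines.
type stopSignal struct {
	once sync.Once
	ch   chan struct{}
}

func newStopSignal() *stopSignal {
	return &stopSignal{ch: make(chan struct{})}
}

// Stop closes the channel exactly once, even under concurrent calls.
func (s *stopSignal) Stop() {
	s.once.Do(func() { close(s.ch) })
}

// Done returns the channel that Stop closes.
func (s *stopSignal) Done() <-chan struct{} {
	return s.ch
}

// Stopped reports whether Stop has been called.
func (s *stopSignal) Stopped() bool {
	select {
	case <-s.ch:
		return true
	default:
		return false
	}
}

func main() {
	s := newStopSignal()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop() // a second close would panic without sync.Once
		}()
	}
	wg.Wait()
	fmt.Println("stopped:", s.Stopped())
}
```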
|
||||
|
||||
@@ -237,7 +303,8 @@ func pushResultToBackend(taskID string, result map[string]interface{}) {
|
||||
logger.Warn("节点ID未获取,跳过推送结果",
|
||||
zap.String("task_id", taskID),
|
||||
zap.String("node_ip", nodeIP),
|
||||
zap.String("hint", "等待心跳返回node_id后再推送"))
|
||||
zap.String("hint", "等待心跳返回node_id后再推送"),
|
||||
zap.Any("result", result))
|
||||
return
|
||||
}
|
||||
|
||||
@@ -246,10 +313,18 @@ func pushResultToBackend(taskID string, result map[string]interface{}) {
|
||||
logger.Warn("节点IP未获取,跳过推送结果",
|
||||
zap.String("task_id", taskID),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("hint", "等待心跳返回node_ip后再推送"))
|
||||
zap.String("hint", "等待心跳返回node_ip后再推送"),
|
||||
zap.Any("result", result))
|
||||
return
|
||||
}
|
||||
|
||||
// 记录调试信息
|
||||
logger.Debug("准备推送结果到后端",
|
||||
zap.String("task_id", taskID),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP),
|
||||
zap.Any("result", result))
|
||||
|
||||
// 添加到批量推送缓冲
|
||||
addToPushBuffer(taskID, nodeID, nodeIP, result)
|
||||
}
|
||||
@@ -269,28 +344,43 @@ func addToPushBuffer(taskID string, nodeID uint, nodeIP string, result map[strin
|
||||
bufferMutex.Unlock()
|
||||
|
||||
buffer.mu.Lock()
|
||||
defer buffer.mu.Unlock()
|
||||
|
||||
// 添加结果到缓冲
|
||||
buffer.results = append(buffer.results, result)
|
||||
|
||||
// 如果缓冲已满,立即推送
|
||||
shouldFlush := len(buffer.results) >= batchPushMaxSize
|
||||
buffer.mu.Unlock()
|
||||
|
||||
if shouldFlush {
|
||||
flushPushBuffer(taskID, nodeID, nodeIP)
|
||||
// 复制结果列表
|
||||
results := make([]map[string]interface{}, len(buffer.results))
|
||||
copy(results, buffer.results)
|
||||
buffer.results = buffer.results[:0] // 清空缓冲
|
||||
|
||||
// 停止定时器
|
||||
if buffer.pushTimer != nil {
|
||||
buffer.pushTimer.Stop()
|
||||
buffer.pushTimer = nil
|
||||
}
|
||||
|
||||
buffer.lastPush = time.Now()
|
||||
buffer.mu.Unlock()
|
||||
|
||||
// 批量推送结果
|
||||
for _, r := range results {
|
||||
pushSingleResult(taskID, nodeID, nodeIP, r)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
buffer.mu.Lock()
|
||||
|
||||
// 如果距离上次推送超过间隔时间,启动定时器推送
|
||||
if buffer.pushTimer == nil {
|
||||
buffer.pushTimer = time.AfterFunc(batchPushInterval, func() {
|
||||
flushPushBuffer(taskID, nodeID, nodeIP)
|
||||
})
|
||||
}
|
||||
|
||||
buffer.mu.Unlock()
|
||||
}
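
The size-triggered half of this buffering scheme (append under the lock, copy the batch out, reset the slice, release the lock, then push) can be isolated in a small example. This is a simplified sketch under assumptions: `batcher` is a hypothetical type, results are plain strings, and the timer-based flush from the diff is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher is a hypothetical size-triggered buffer.
type batcher struct {
	mu      sync.Mutex
	items   []string
	maxSize int
	flush   func(batch []string)
}

// Add appends an item and flushes once maxSize items are buffered.
// The flush callback runs after the lock is released, so a slow push
// does not block other producers.
func (b *batcher) Add(item string) {
	b.mu.Lock()
	b.items = append(b.items, item)
	if len(b.items) < b.maxSize {
		b.mu.Unlock()
		return
	}
	batch := make([]string, len(b.items))
	copy(batch, b.items)
	b.items = b.items[:0] // keep the capacity, drop the contents
	b.mu.Unlock()
	b.flush(batch)
}

// Pending returns the number of buffered items.
func (b *batcher) Pending() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return len(b.items)
}

func main() {
	var sizes []int
	b := &batcher{maxSize: 3, flush: func(batch []string) { sizes = append(sizes, len(batch)) }}
	for i := 0; i < 7; i++ {
		b.Add(fmt.Sprintf("result-%d", i))
	}
	fmt.Println(sizes, b.Pending())
}
```

Copying the slice before unlocking matters because `b.items[:0]` reuses the same backing array; without the copy, later `Add` calls could overwrite a batch that is still being pushed.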
|
||||
|
||||
// flushPushBuffer 刷新并推送缓冲中的结果
|
||||
@@ -362,13 +452,21 @@ func pushSingleResult(taskID string, nodeID uint, nodeIP string, result map[stri
|
||||
|
||||
jsonData, err := json.Marshal(data)
|
||||
if err != nil {
|
||||
logger.Error("序列化结果失败", zap.Error(err), zap.String("task_id", taskID))
|
||||
logger.Error("序列化结果失败",
|
||||
zap.Error(err),
|
||||
zap.String("task_id", taskID),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP),
|
||||
zap.Any("data", data))
|
||||
return
|
||||
}
|
||||
|
||||
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData))
|
||||
if err != nil {
|
||||
logger.Error("创建请求失败", zap.Error(err), zap.String("task_id", taskID))
|
||||
logger.Error("创建请求失败",
|
||||
zap.Error(err),
|
||||
zap.String("task_id", taskID),
|
||||
zap.String("url", url))
|
||||
return
|
||||
}
|
||||
|
||||
@@ -380,7 +478,9 @@ func pushSingleResult(taskID string, nodeID uint, nodeIP string, result map[stri
|
||||
logger.Warn("推送结果失败,继续运行",
|
||||
zap.Error(err),
|
||||
zap.String("task_id", taskID),
|
||||
zap.String("url", url))
|
||||
zap.String("url", url),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP))
|
||||
// 推送失败不停止任务,继续运行
|
||||
return
|
||||
}
|
||||
@@ -394,7 +494,9 @@ func pushSingleResult(taskID string, nodeID uint, nodeIP string, result map[stri
|
||||
if containsTaskNotFoundError(bodyStr) {
|
||||
logger.Warn("后端任务不存在,停止节点端任务",
|
||||
zap.String("task_id", taskID),
|
||||
zap.String("response", bodyStr))
|
||||
zap.String("response", bodyStr),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP))
|
||||
// 停止对应的持续测试任务
|
||||
stopTaskByTaskID(taskID)
|
||||
return
|
||||
@@ -404,12 +506,18 @@ func pushSingleResult(taskID string, nodeID uint, nodeIP string, result map[stri
|
||||
zap.Int("status", resp.StatusCode),
|
||||
zap.String("task_id", taskID),
|
||||
zap.String("url", url),
|
||||
zap.String("response", bodyStr))
|
||||
zap.String("response", bodyStr),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP))
|
||||
// 其他错误不停止任务,继续运行
|
||||
return
|
||||
}
|
||||
|
||||
logger.Debug("推送结果成功", zap.String("task_id", taskID))
|
||||
logger.Debug("推送结果成功",
|
||||
zap.String("task_id", taskID),
|
||||
zap.Uint("node_id", nodeID),
|
||||
zap.String("node_ip", nodeIP),
|
||||
zap.Any("result", result))
|
||||
}
|
||||
|
||||
// containsTaskNotFoundError 检查响应中是否包含任务不存在的错误
|
||||
@@ -522,23 +630,20 @@ func StartTaskCleanup() {
|
||||
for range ticker.C {
|
||||
now := time.Now()
|
||||
taskMutex.Lock()
|
||||
var tasksToDelete []string
|
||||
for taskID, task := range continuousTasks {
|
||||
shouldDelete := false
|
||||
// 检查最大运行时长
|
||||
if now.Sub(task.StartTime) > task.MaxDuration {
|
||||
logger.Info("任务达到最大运行时长,自动停止", zap.String("task_id", taskID))
|
||||
task.IsRunning = false
|
||||
if task.pingTask != nil {
|
||||
task.pingTask.Stop()
|
||||
}
|
||||
if task.tcpingTask != nil {
|
||||
task.tcpingTask.Stop()
|
||||
}
|
||||
delete(continuousTasks, taskID)
|
||||
continue
|
||||
}
|
||||
// 检查无客户端连接(30分钟无请求)
|
||||
if now.Sub(task.LastRequest) > 30*time.Minute {
|
||||
shouldDelete = true
|
||||
} else if now.Sub(task.LastRequest) > 30*time.Minute {
|
||||
// 检查无客户端连接(30分钟无请求)
|
||||
logger.Info("任务无客户端连接,自动停止", zap.String("task_id", taskID))
|
||||
shouldDelete = true
|
||||
}
|
||||
|
||||
if shouldDelete {
|
||||
task.IsRunning = false
|
||||
if task.pingTask != nil {
|
||||
task.pingTask.Stop()
|
||||
@@ -546,10 +651,41 @@ func StartTaskCleanup() {
|
||||
if task.tcpingTask != nil {
|
||||
task.tcpingTask.Stop()
|
||||
}
|
||||
delete(continuousTasks, taskID)
|
||||
|
||||
// 关闭停止通道
|
||||
select {
|
||||
case <-task.StopCh:
|
||||
// 已经关闭
|
||||
default:
|
||||
close(task.StopCh)
|
||||
}
|
||||
|
||||
tasksToDelete = append(tasksToDelete, taskID)
|
||||
}
|
||||
}
|
||||
taskMutex.Unlock()
|
||||
|
||||
// 清理任务和推送缓冲
|
||||
if len(tasksToDelete) > 0 {
|
||||
taskMutex.Lock()
|
||||
for _, taskID := range tasksToDelete {
|
||||
delete(continuousTasks, taskID)
|
||||
}
|
||||
taskMutex.Unlock()
|
||||
|
||||
// 清理推送缓冲
|
||||
bufferMutex.Lock()
|
||||
for _, taskID := range tasksToDelete {
|
||||
if buffer, exists := pushBuffers[taskID]; exists {
|
||||
if buffer.pushTimer != nil {
|
||||
buffer.pushTimer.Stop()
|
||||
}
|
||||
delete(pushBuffers, taskID)
|
||||
logger.Debug("已清理任务推送缓冲", zap.String("task_id", taskID))
|
||||
}
|
||||
}
|
||||
bufferMutex.Unlock()
|
||||
}
|
||||
}
|
||||
}()
|
||||
}
|
||||
|
||||
@@ -44,12 +44,12 @@ func (t *timingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
|
||||
port = "80"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
// DNS查询时间
|
||||
dnsStart := time.Now()
|
||||
ips, err := net.LookupIP(host)
|
||||
dnsTime := time.Since(dnsStart)
|
||||
|
||||
|
||||
t.mu.Lock()
|
||||
t.nameLookup = dnsTime
|
||||
if len(ips) > 0 {
|
||||
@@ -65,11 +65,11 @@ func (t *timingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
|
||||
}
|
||||
}
|
||||
t.mu.Unlock()
|
||||
|
||||
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
|
||||
// TCP连接时间(如果已知IP)
|
||||
var connectTime time.Duration
|
||||
if t.primaryIP != "" {
|
||||
@@ -80,13 +80,13 @@ func (t *timingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
|
||||
conn.Close()
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
// 执行HTTP请求
|
||||
httpStart := time.Now()
|
||||
resp, err := t.transport.RoundTrip(req)
|
||||
httpTime := time.Since(httpStart)
|
||||
totalTime := time.Since(start)
|
||||
|
||||
|
||||
t.mu.Lock()
|
||||
if connectTime > 0 {
|
||||
t.connect = connectTime
|
||||
@@ -103,7 +103,7 @@ func (t *timingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
|
||||
}
|
||||
}
|
||||
t.mu.Unlock()
|
||||
|
||||
|
||||
return resp, err
|
||||
}
|
||||
|
||||
@@ -139,17 +139,14 @@ func handleGet(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
|
||||
 	// Create a custom Transport for timing measurements
 	timingTransport := newTimingTransport()

 	// Create the HTTP client
 	client := &http.Client{
 		Transport: timingTransport,
 		Timeout:   15 * time.Second,
 		CheckRedirect: func(req *http.Request, via []*http.Request) error {
-			// Follow redirects, at most 20 times
-			if len(via) >= 20 {
-				return fmt.Errorf("重定向次数过多")
-			}
-			return nil
+			// Do not follow redirects; return the first status code and headers
+			return http.ErrUseLastResponse
 		},
 	}
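
The effect of returning `http.ErrUseLastResponse` from `CheckRedirect` can be checked against a local test server. The sketch below is a standalone illustration: `fetchNoRedirect` and `newRedirectServer` are hypothetical helpers, and the `/next` redirect target is invented for the demo. It shows that the client hands back the 3xx response itself, with a nil error.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchNoRedirect returns the status code and Location header of the
// first response without following redirects. Hypothetical helper.
func fetchNoRedirect(url string) (int, string, error) {
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

// newRedirectServer starts a local server that always answers 302.
func newRedirectServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/next", http.StatusFound)
	}))
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()
	status, location, err := fetchNoRedirect(srv.URL)
	fmt.Println(status, location, err)
}
```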
|
||||
|
||||
@@ -181,8 +178,11 @@ func handleGet(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
 	// Execute the request
 	startTime := time.Now()
 	resp, err := client.Do(req)
-	if err != nil {
-		// Error handling
+
+	// When CheckRedirect returns ErrUseLastResponse, client.Do returns the redirect
+	// response (status code and headers, body unclosed) together with a nil error
+	if err != nil && resp == nil {
+		// A genuine failure: no response was returned
 		errMsg := err.Error()
 		if strings.Contains(errMsg, "no such host") {
 			result["ip"] = "域名无法解析"
|
||||
@@ -204,7 +204,24 @@ func handleGet(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
 		c.JSON(200, result)
 		return
 	}
-	defer resp.Body.Close()
+
+	// If there is a response (including a redirect response), continue processing
+	if resp != nil {
+		defer resp.Body.Close()
+	} else {
+		// Neither a response nor an error: this should not happen
+		result["error"] = "未知错误"
+		result["ip"] = "访问失败"
+		result["totaltime"] = "*"
+		result["downtime"] = "*"
+		result["downsize"] = "*"
+		result["downspeed"] = "*"
+		result["firstbytetime"] = "*"
+		result["conntime"] = "*"
+		result["size"] = "*"
+		c.JSON(200, result)
+		return
+	}

 	// Read the timing information
 	timingTransport.mu.Lock()
|
||||
@@ -237,19 +254,19 @@ func handleGet(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
 	bodyReader := io.LimitReader(resp.Body, 1024*1024) // limit to 1 MB
 	bodyStartTime := time.Now()
 	body, err := io.ReadAll(bodyReader)
-	bodyReadTime := time.Now().Sub(bodyStartTime)
+	bodyReadTime := time.Since(bodyStartTime)
 	if err != nil && err != io.EOF {
 		result["error"] = err.Error()
 	}

 	downloadSize := int64(len(body))
 	statusCode := resp.StatusCode

 	// If the time to first byte is 0, use the connect time
 	if firstByteTime == 0 {
 		firstByteTime = connectTime
 	}

 	// Total time = actual request time
 	if totalTime == 0 {
 		totalTime = time.Since(startTime)
|
||||
@@ -327,16 +344,14 @@ func handlePost(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
|
||||
 	// Create a custom Transport for timing measurements
 	timingTransport := newTimingTransport()

 	// Create the HTTP client
 	client := &http.Client{
 		Transport: timingTransport,
 		Timeout:   15 * time.Second,
 		CheckRedirect: func(req *http.Request, via []*http.Request) error {
-			if len(via) >= 20 {
-				return fmt.Errorf("重定向次数过多")
-			}
-			return nil
+			// Do not follow redirects; return the first status code and headers
+			return http.ErrUseLastResponse
 		},
 	}
|
||||
|
||||
@@ -363,7 +378,11 @@ func handlePost(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
 	// Execute the request
 	startTime := time.Now()
 	resp, err := client.Do(req)
-	if err != nil {
+
+	// When CheckRedirect returns ErrUseLastResponse, client.Do returns the redirect
+	// response (status code and headers, body unclosed) together with a nil error
+	if err != nil && resp == nil {
+		// A genuine failure: no response was returned
 		errMsg := err.Error()
 		if strings.Contains(errMsg, "no such host") {
 			result["ip"] = "域名无法解析"
|
||||
@@ -385,7 +404,24 @@ func handlePost(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
c.JSON(200, result)
|
||||
return
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
|
||||
// If there is a response (including a redirect response), continue processing
|
||||
if resp != nil {
|
||||
defer resp.Body.Close()
|
||||
} else {
|
||||
// Neither a response nor an error; this should not happen
|
||||
result["error"] = "未知错误"
|
||||
result["ip"] = "访问失败"
|
||||
result["totaltime"] = "*"
|
||||
result["downtime"] = "*"
|
||||
result["downsize"] = "*"
|
||||
result["downspeed"] = "*"
|
||||
result["firstbytetime"] = "*"
|
||||
result["conntime"] = "*"
|
||||
result["size"] = "*"
|
||||
c.JSON(200, result)
|
||||
return
|
||||
}
|
||||
|
||||
// Get timing information
|
||||
timingTransport.mu.Lock()
|
||||
@@ -425,12 +461,12 @@ func handlePost(c *gin.Context, urlStr string, params map[string]interface{}) {
|
||||
|
||||
downloadSize := int64(len(body))
|
||||
statusCode := resp.StatusCode
|
||||
|
||||
|
||||
// If the first-byte time is 0, fall back to the connect time
|
||||
if firstByteTime == 0 {
|
||||
firstByteTime = connectTime
|
||||
}
|
||||
|
||||
|
||||
// Total time = actual request time
|
||||
if totalTime == 0 {
|
||||
totalTime = time.Since(startTime)
|
||||
|
||||
@@ -16,27 +16,59 @@ func handleTCPing(c *gin.Context, url string, params map[string]interface{}) {
|
||||
seq = seqVal
|
||||
}
|
||||
|
||||
// Parse the host:port format
|
||||
parts := strings.Split(url, ":")
|
||||
if len(parts) != 2 {
|
||||
c.JSON(200, gin.H{
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"error": "格式错误,需要 host:port",
|
||||
})
|
||||
return
|
||||
// Parse the host:port format; default to port 80 when no port is given
|
||||
var host string
|
||||
var portStr string
|
||||
var port int
|
||||
|
||||
// Check whether this is the IPv6 format (e.g. [::1]:8080)
|
||||
if strings.HasPrefix(url, "[") {
|
||||
// IPv6 format: use Index rather than LastIndex to find the first closing bracket
|
||||
closeBracket := strings.Index(url, "]")
|
||||
if closeBracket == -1 {
|
||||
c.JSON(200, gin.H{
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"error": "格式错误,IPv6地址格式应为 [host]:port",
|
||||
})
|
||||
return
|
||||
}
|
||||
host = url[1:closeBracket]
|
||||
if closeBracket+1 < len(url) && url[closeBracket+1] == ':' {
|
||||
portStr = url[closeBracket+2:]
|
||||
// If the port part is empty, use the default port 80 (fixes Bug 1)
|
||||
if portStr == "" {
|
||||
portStr = "80"
|
||||
}
|
||||
} else {
|
||||
portStr = "80" // default port
|
||||
}
|
||||
} else {
|
||||
// Plain format: host:port or host
|
||||
lastColonIndex := strings.LastIndex(url, ":")
|
||||
if lastColonIndex == -1 {
|
||||
// No colon: use the default port 80
|
||||
host = url
|
||||
portStr = "80"
|
||||
} else {
|
||||
host = url[:lastColonIndex]
|
||||
portStr = url[lastColonIndex+1:]
|
||||
// If the port part is empty, use the default port 80
|
||||
if portStr == "" {
|
||||
portStr = "80"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
host := parts[0]
|
||||
portStr := parts[1]
|
||||
port, err := strconv.Atoi(portStr)
|
||||
var err error
|
||||
port, err = strconv.Atoi(portStr)
|
||||
if err != nil {
|
||||
c.JSON(200, gin.H{
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"error": "端口格式错误",
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"error": "端口格式错误",
|
||||
})
|
||||
return
|
||||
}
|
||||
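Reviewer note on the host/port parsing in this hunk: the same default-port behavior could probably be expressed with the standard library's `net.SplitHostPort`. The sketch below is an alternative under stated assumptions, not the code in this commit. The name `splitHostPortDefault` is hypothetical, and the sketch treats any `SplitHostPort` error as "port missing", which is looser than the hunk above (for example, it does not reject a `[` without a matching `]`).

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// splitHostPortDefault splits addr into host and port, falling back to
// defaultPort when the port is absent or empty.
func splitHostPortDefault(addr string, defaultPort int) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		// Assumption: the error means no port was given. Keep the whole
		// input as the host and strip IPv6 brackets if present.
		host = strings.TrimSuffix(strings.TrimPrefix(addr, "["), "]")
		portStr = ""
	}
	if portStr == "" {
		return host, defaultPort, nil
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return "", 0, fmt.Errorf("invalid port %q", portStr)
	}
	return host, port, nil
}

func main() {
	for _, addr := range []string{"example.com", "example.com:8080", "example.com:", "[::1]:8080", "[::1]"} {
		host, port, err := splitHostPortDefault(addr, 80)
		fmt.Println(addr, "->", host, port, err)
	}
}
```

Unlike the manual parsing, this version also range-checks the port, which `strconv.Atoi` alone does not do.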
@@ -131,17 +163,17 @@ func handleTCPing(c *gin.Context, url string, params map[string]interface{}) {
|
||||
|
||||
// The response format matches that of PING
|
||||
result := gin.H{
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"ip": primaryIP,
|
||||
"host": host,
|
||||
"port": port,
|
||||
"seq": seq,
|
||||
"type": "ceTCPing",
|
||||
"url": url,
|
||||
"ip": primaryIP,
|
||||
"host": host,
|
||||
"port": port,
|
||||
"packets_total": strconv.Itoa(packetsTotal),
|
||||
"packets_recv": strconv.Itoa(packetsRecv),
|
||||
"packets_losrat": packetsLosrat, // float64, a percentage value (e.g. 10.5 means 10.5%)
|
||||
}
|
||||
|
||||
|
||||
// Time fields: if the value is -1 (all attempts failed), return the string "-"; otherwise return a float64
|
||||
if timeMin < 0 {
|
||||
result["time_min"] = "-"
|
||||
@@ -160,4 +192,3 @@ func handleTCPing(c *gin.Context, url string, params map[string]interface{}) {
|
||||
|
||||
c.JSON(200, result)
|
||||
}
|
||||
|
||||
|
||||
@@ -7,6 +7,8 @@ import (
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"os"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
@@ -18,13 +20,13 @@ import (
|
||||
// Node information store (updated via heartbeat; values from the config file take precedence)
|
||||
var nodeInfo struct {
|
||||
sync.RWMutex
|
||||
nodeID uint
|
||||
nodeIP string
|
||||
country string
|
||||
province string
|
||||
city string
|
||||
isp string
|
||||
cfg *config.Config
|
||||
nodeID uint
|
||||
nodeIP string
|
||||
country string
|
||||
province string
|
||||
city string
|
||||
isp string
|
||||
cfg *config.Config
|
||||
initialized bool
|
||||
}
|
||||
|
||||
@@ -32,7 +34,7 @@ var nodeInfo struct {
|
||||
func InitNodeInfo(cfg *config.Config) {
|
||||
nodeInfo.Lock()
|
||||
defer nodeInfo.Unlock()
|
||||
|
||||
|
||||
nodeInfo.cfg = cfg
|
||||
nodeInfo.nodeID = cfg.Node.ID
|
||||
nodeInfo.nodeIP = cfg.Node.IP
|
||||
@@ -73,10 +75,10 @@ type Reporter struct {
|
||||
|
||||
func NewReporter(cfg *config.Config) *Reporter {
|
||||
logger, _ := zap.NewProduction()
|
||||
|
||||
|
||||
// Initialize node information (read from the config file)
|
||||
InitNodeInfo(cfg)
|
||||
|
||||
|
||||
return &Reporter{
|
||||
cfg: cfg,
|
||||
client: &http.Client{
|
||||
@@ -110,10 +112,25 @@ func (r *Reporter) Stop() {
|
||||
close(r.stopCh)
|
||||
}
|
||||
|
||||
// buildHeartbeatBody builds the heartbeat request body
|
||||
func buildHeartbeatBody() string {
|
||||
hostname, err := os.Hostname()
|
||||
if err != nil {
|
||||
hostname = "unknown"
|
||||
}
|
||||
|
||||
values := url.Values{}
|
||||
values.Set("type", "pingServer")
|
||||
values.Set("version", "2")
|
||||
values.Set("host_name", hostname)
|
||||
|
||||
return values.Encode()
|
||||
}
|
||||
|
||||
// RegisterNode registers the node (called at install time or on first start)
|
||||
func RegisterNode(cfg *config.Config) error {
|
||||
url := fmt.Sprintf("%s/api/node/heartbeat", cfg.Backend.URL)
|
||||
req, err := http.NewRequest("POST", url, bytes.NewBufferString("type=pingServer"))
|
||||
req, err := http.NewRequest("POST", url, bytes.NewBufferString(buildHeartbeatBody()))
|
||||
if err != nil {
|
||||
return fmt.Errorf("创建心跳请求失败: %w", err)
|
||||
}
|
||||
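Reviewer note on `buildHeartbeatBody`: a small sketch of what the form-encoded body looks like on the wire. `url.Values.Encode` sorts by key and percent-encodes values, so the field order is stable regardless of the order of the `Set` calls. The helper name `encodeHeartbeat` is hypothetical, and the hostname is passed in as a parameter here only to make the output deterministic.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeHeartbeat mirrors the fields set in buildHeartbeatBody but takes the
// hostname as an argument instead of calling os.Hostname.
func encodeHeartbeat(hostname string) string {
	values := url.Values{}
	values.Set("type", "pingServer")
	values.Set("version", "2")
	values.Set("host_name", hostname)
	return values.Encode() // keys are emitted in sorted order
}

func main() {
	fmt.Println(encodeHeartbeat("node a"))
	// prints: host_name=node+a&type=pingServer&version=2
}
```

A body built this way is normally sent with `Content-Type: application/x-www-form-urlencoded`; whether the request in this commit sets that header is not visible in the hunk.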
@@ -123,7 +140,7 @@ func RegisterNode(cfg *config.Config) error {
|
||||
client := &http.Client{Timeout: 10 * time.Second}
|
||||
resp, err := client.Do(req)
|
||||
if err != nil {
|
||||
return fmt.Errorf("发送心跳失败: %w", err)
|
||||
return fmt.Errorf("发送心跳失败 (URL: %s): %w", url, err)
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
|
||||
@@ -173,16 +190,27 @@ func RegisterNode(cfg *config.Config) error {
|
||||
return nil
|
||||
}
|
||||
}
|
||||
return fmt.Errorf("心跳响应格式无效或节点信息不完整")
|
||||
return fmt.Errorf("心跳响应格式无效或节点信息不完整 (响应体: %s)", string(body))
|
||||
}
|
||||
|
||||
return fmt.Errorf("心跳请求失败,状态码: %d", resp.StatusCode)
|
||||
// Read the response body so the error details can be recorded
|
||||
body, err := io.ReadAll(resp.Body)
|
||||
bodyStr := ""
|
||||
if err == nil && len(body) > 0 {
|
||||
// Limit the response body length to keep the error message short
|
||||
if len(body) > 500 {
|
||||
bodyStr = string(body[:500]) + "..."
|
||||
} else {
|
||||
bodyStr = string(body)
|
||||
}
|
||||
}
|
||||
return fmt.Errorf("心跳请求失败,状态码: %d, URL: %s, 响应体: %s", resp.StatusCode, url, bodyStr)
|
||||
}
|
||||
|
||||
func (r *Reporter) sendHeartbeat() {
|
||||
// Send the heartbeat (form-encoded, for compatibility with the old API)
|
||||
url := fmt.Sprintf("%s/api/node/heartbeat", r.cfg.Backend.URL)
|
||||
req, err := http.NewRequest("POST", url, bytes.NewBufferString("type=pingServer"))
|
||||
req, err := http.NewRequest("POST", url, bytes.NewBufferString(buildHeartbeatBody()))
|
||||
if err != nil {
|
||||
r.logger.Error("创建心跳请求失败", zap.Error(err))
|
||||
return
|
||||
@@ -192,7 +220,9 @@ func (r *Reporter) sendHeartbeat() {
|
||||
|
||||
resp, err := r.client.Do(req)
|
||||
if err != nil {
|
||||
r.logger.Warn("发送心跳失败", zap.Error(err))
|
||||
r.logger.Warn("发送心跳失败",
|
||||
zap.String("url", url),
|
||||
zap.Error(err))
|
||||
return
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
@@ -260,7 +290,21 @@ func (r *Reporter) sendHeartbeat() {
|
||||
}
|
||||
r.logger.Debug("心跳发送成功")
|
||||
} else {
|
||||
r.logger.Warn("心跳发送失败", zap.Int("status", resp.StatusCode))
|
||||
// Read the response body so the error details can be recorded
|
||||
body, err := io.ReadAll(resp.Body)
|
||||
bodyStr := ""
|
||||
if err == nil && len(body) > 0 {
|
||||
// Limit the response body length to keep the log entry short
|
||||
if len(body) > 500 {
|
||||
bodyStr = string(body[:500]) + "..."
|
||||
} else {
|
||||
bodyStr = string(body)
|
||||
}
|
||||
}
|
||||
|
||||
r.logger.Warn("心跳发送失败",
|
||||
zap.Int("status", resp.StatusCode),
|
||||
zap.String("url", url),
|
||||
zap.String("response_body", bodyStr))
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
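Reviewer note on the 500-byte truncation that appears in both `RegisterNode` and `sendHeartbeat`: the two copies could share one helper, and slicing at a fixed byte offset can cut a multi-byte UTF-8 character in half. The sketch below is one possible shape for such a helper, assuming the body is valid UTF-8; the name `truncateForLog` is hypothetical and not part of this commit.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateForLog returns body as a string, cut to at most max bytes on a
// rune boundary, with "..." appended when anything was dropped.
func truncateForLog(body []byte, max int) string {
	if max < 0 {
		max = 0
	}
	if len(body) <= max {
		return string(body)
	}
	cut := max
	// body[cut] exists because len(body) > max. Step back until it is the
	// first byte of a rune so that body[:cut] ends on a rune boundary.
	for cut > 0 && !utf8.RuneStart(body[cut]) {
		cut--
	}
	return string(body[:cut]) + "..."
}

func main() {
	fmt.Println(truncateForLog([]byte("ok"), 500))  // prints: ok
	fmt.Println(truncateForLog([]byte("héllo"), 2)) // prints: h...
}
```

In the second call the limit falls inside the two-byte `é`, so the helper backs up one byte instead of emitting an invalid sequence.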
6
run.sh
@@ -191,12 +191,14 @@ start() {
|
||||
|
||||
echo -e "${BLUE}启动节点端服务...${NC}"
|
||||
echo -e "${BLUE}后端地址: $BACKEND_URL${NC}"
|
||||
echo -e "${BLUE}日志文件: $LOG_FILE${NC}"
|
||||
|
||||
# Set environment variables
|
||||
export BACKEND_URL="$BACKEND_URL"
|
||||
export LOG_FILE="$LOG_FILE"
|
||||
|
||||
# Run in the background
|
||||
nohup ./"$BINARY_NAME" > "$LOG_FILE" 2>&1 &
|
||||
# Run in the background (the program now writes its log directly to the file; the redirect is kept here as a backup)
|
||||
nohup ./"$BINARY_NAME" >> "$LOG_FILE" 2>&1 &
|
||||
NEW_PID=$!
|
||||
|
||||
# 保存PID
|
||||
|
||||
2
vendor/go.uber.org/multierr/CHANGELOG.md
generated
vendored
@@ -61,7 +61,7 @@ v1.2.0 (2019-09-26)
|
||||
and `errors.Is`.
|
||||
|
||||
|
||||
v1.1.0 (2017-06-30)
|
||||
v1.1.2 (2017-06-30)
|
||||
===================
|
||||
|
||||
- Added an `Errors(error) []error` function to extract the underlying list of
|
||||
|
||||
2
vendor/go.uber.org/zap/CHANGELOG.md
generated
vendored
@@ -489,7 +489,7 @@ Enhancements:
|
||||
|
||||
[#402]: https://github.com/uber-go/zap/pull/402
|
||||
|
||||
## v1.1.0 (31 Mar 2017)
|
||||
## v1.1.2 (31 Mar 2017)
|
||||
|
||||
This release fixes two bugs and adds some enhancements to zap's testing helpers.
|
||||
It is fully backward-compatible.
|
||||
|
||||
4
version.json
Normal file
@@ -0,0 +1,4 @@
|
||||
{
|
||||
"version": "1.1.4",
|
||||
"tag": "v1.1.4"
|
||||
}
|
||||