<fix>[kvm]: fail VM start when host file changed but sync fails #3741
zstack-robot-2 wants to merge 1 commit
Walkthrough
Adjusts the VM host file sync flow in KvmSecureBootExtensions: adds …
Changes
Sequence Diagram(s)
sequenceDiagram
    participant Flow as FlowEngine
    participant Origin as OriginHost
    participant Dest as DestHost
    participant DB as Database
    Flow->>DB: read the locally cached `VmHostFileVO`
    Flow->>Origin: request to read the VM host file
    alt read succeeds
        Origin-->>Flow: return VmHostFile
        Flow->>Flow: context.syncFromOriginHostSuccess = true
        Flow->>Flow: trigger.next()
    else read fails
        Origin-->>Flow: return error
        Flow->>DB: check the cached `VmHostFileVO.changeDate`
        alt changeDate != null
            Flow->>Flow: return operr(...) and fail the flow
        else
            Flow->>Flow: log a warning and trigger.next()
            Flow->>DB: if the cached record exists but has no content, set context.vmHostFile to null
        end
    end
    Flow->>Dest: re-read from or write to the destination host
    alt write succeeds and context.vmBackupFileVO != null
        Flow->>DB: delete the corresponding VmHostBackupFileVO record
    end
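The failure branch in the diagram comes down to a three-way decision. The following is a minimal sketch of that decision only, assuming a hypothetical `SyncFailureDecision` class and `Outcome` enum that do not exist in ZStack; the real flow works on `FlowTrigger` and `VmHostFileVO` rather than on plain values.

```java
import java.util.Date;

/** Hypothetical, self-contained model of the diagram's decision; not the ZStack code. */
public class SyncFailureDecision {
    enum Outcome { CONTINUE, CONTINUE_WITH_WARNING, FAIL }

    /**
     * @param syncSucceeded whether reading the file from the origin host succeeded
     * @param changeDate    cached change date; non-null means the host file has changed
     */
    static Outcome decide(boolean syncSucceeded, Date changeDate) {
        if (syncSucceeded) {
            return Outcome.CONTINUE;
        }
        if (changeDate != null) {
            // The cached copy is known to be stale, so starting the VM is unsafe.
            return Outcome.FAIL;
        }
        // No recorded change: the cached copy is still usable, log and move on.
        return Outcome.CONTINUE_WITH_WARNING;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, null));
        System.out.println(decide(false, new Date()));
        System.out.println(decide(false, null));
    }
}
```

Keeping the decision in one pure function makes each branch of the diagram easy to cover in a unit test.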
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed (1 warning)
7382aa2 to 6215e6d
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java (1)
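321-326: ⚠️ Potential issue | 🟠 Major
Do not treat `context.vmHostFile` as usable before the sync from the origin host has succeeded.
Line 321 assigns `context.vmHostFile` first, so `read-vm-host-file-from-backup` at Lines 379-380 is skipped outright. If the origin-host sync then fails with `changeDate == null` and the MN has no matching `VmHostFileContentVO`, Lines 425-429 only log `skip` and continue, so the VM may end up starting with missing NVRAM/TPM state. `context.vmHostFile` should be set only after the sync is confirmed successful, or after the MN is confirmed to hold usable content.
🤖 Prompt for AI Agents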
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java` around lines 321 - 326, The code assigns context.vmHostFile too early (in KvmSecureBootExtensions) causing read-vm-host-file-from-backup to be skipped and potentially allowing boot with missing NVRAM/TPM; change to first fetch into a local VmHostFileVO (e.g., vmHostFileLocal) using Q.New(VmHostFileVO.class)...find(), then perform the backup-read and existence checks (inspect changeDate and lookup for corresponding VmHostFileContentVO) and only after confirming the source sync succeeded or that MN already has valid VmHostFileContentVO, assign context.vmHostFile = vmHostFileLocal; ensure the error/skip branch (the logic around changeDate and VmHostFileContentVO presence) fails or retries instead of leaving context.vmHostFile set.
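The deferred assignment described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the actual change: `HostFile`, `Context`, and `assignIfUsable` are hypothetical stand-ins for `VmHostFileVO`, the flow context, and whatever check the real flow would perform.

```java
public class DeferredHostFileAssignment {
    /** Hypothetical stand-in for VmHostFileVO; only the UUID matters here. */
    static final class HostFile {
        final String uuid;

        HostFile(String uuid) {
            this.uuid = uuid;
        }
    }

    /** Hypothetical stand-in for the flow context. */
    static final class Context {
        HostFile vmHostFile; // stays null until the record is known to be usable
    }

    /**
     * Publishes the cached record into the context only when the origin sync
     * succeeded or the management node already holds content for it.
     * Returns true if the context was updated.
     */
    static boolean assignIfUsable(Context ctx, HostFile cached,
                                  boolean syncSucceeded, boolean mnHasContent) {
        if (cached != null && (syncSucceeded || mnHasContent)) {
            ctx.vmHostFile = cached;
            return true;
        }
        // Left unset, so a later backup-read step is not skipped.
        return false;
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        HostFile cached = new HostFile("hypothetical-uuid");
        System.out.println(assignIfUsable(ctx, cached, false, false));
        System.out.println(ctx.vmHostFile == null);
    }
}
```

Holding the query result in a local variable until it is verified keeps later steps from mistaking an unverified record for a usable one.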
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 23027116-6022-4ee5-9890-a782a4ce76dc
📒 Files selected for processing (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java
Comment from ye.zou: Code Review
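Verdict: BLOCK 🚫 Regression risk: High. The backup fallback path was unintentionally broken, which affects every scenario that relies on a backup to restore NvRam/TpmState (HA start after a host failure, starting on a different host, and so on). Suggested fix: the core problem is …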
6215e6d to 8837cd2
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java (1)
321-356: ⚠️ Potential issue | 🔴 Critical
Re-query the latest `VmHostFileVO` after a successful sync.
Line 321 pins `context.vmHostFile` to the old record before the sync, but Lines 709-710 in the same file already state that this kind of sync creates a new record. Line 355 now calls `trigger.next()` directly on success, and Line 440 later still fetches content by the old UUID. That may write the old cached content back to the destination host, which effectively discards the result of the successful sync. Consider re-querying the latest `VmHostFileVO` in the success branch and updating `context.path` to match.
🛠 Suggested change
 bus.send(syncMsg, new CloudBusCallBack(trigger) {
     @Override
     public void run(MessageReply reply) {
         if (reply.isSuccess()) {
+            context.vmHostFile = Q.New(VmHostFileVO.class)
+                    .eq(VmHostFileVO_.type, context.type)
+                    .eq(VmHostFileVO_.vmInstanceUuid, context.vmUuid)
+                    .orderByDesc(VmHostFileVO_.lastOpDate)
+                    .limit(1)
+                    .find();
+            if (context.vmHostFile == null) {
+                trigger.fail(operr("failed to find latest %s vm host file for VM[vmUuid=%s] after sync",
+                        context.type, context.vmUuid));
+                return;
+            }
+            context.path = context.vmHostFile.getPath();
             context.syncFromOriginHostSuccess = true;
             trigger.next();
             return;
         }
Also consider adding a regression test that covers the scenario "origin sync succeeds and creates a new VO, and the subsequent write to the destination host must use the new content".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java` around lines 321 - 356, The callback success path currently leaves context.vmHostFile pointing at the pre-sync record, so after reply.isSuccess() re-query the latest VmHostFileVO for context.type and context.vmUuid (e.g. Q.New(VmHostFileVO.class).eq(VmHostFileVO_.type, context.type).eq(VmHostFileVO_.vmInstanceUuid, context.vmUuid).orderByDesc(VmHostFileVO_.lastOpDate).limit(1).find()), update context.vmHostFile and context.path from that fresh VO, then continue with trigger.next(); also add a regression test covering “origin sync creates new VO and subsequent write uses new VO/path” to prevent regressions.
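The "pick the newest record after the sync" step can be illustrated without a database. The sketch below assumes an in-memory list and a hypothetical `HostFile` class with a numeric `lastOpDate`; it mirrors the ordering the suggestion asks for and does not use the ZStack `Q` API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class LatestHostFileLookup {
    /** Hypothetical stand-in for VmHostFileVO. */
    static final class HostFile {
        final String uuid;
        final String path;
        final long lastOpDate; // epoch millis, assumed for illustration

        HostFile(String uuid, String path, long lastOpDate) {
            this.uuid = uuid;
            this.path = path;
            this.lastOpDate = lastOpDate;
        }
    }

    /** Returns the record with the greatest lastOpDate, or empty if there is none. */
    static Optional<HostFile> latest(List<HostFile> records) {
        return records.stream()
                .max(Comparator.comparingLong((HostFile f) -> f.lastOpDate));
    }

    public static void main(String[] args) {
        List<HostFile> records = Arrays.asList(
                new HostFile("old", "/hypothetical/old.fd", 100L),
                new HostFile("new", "/hypothetical/new.fd", 200L));
        // An empty result is treated as a failure, as in the suggested change.
        HostFile picked = latest(records)
                .orElseThrow(() -> new IllegalStateException("no host file found after sync"));
        System.out.println(picked.uuid + " " + picked.path);
    }
}
```

A regression test along these lines would assert that the path written to the destination host comes from the newest record and not from the record read before the sync.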
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java`:
- Line 521: The inline comment in KvmSecureBootExtensions.java (around the code
handling VmHostFileVO/VmHostBackupFileVO in the method where the latest host
file is ensured) contains two typos: change "has" to "have" and "lastest" to
"latest" so the comment reads something like "// now we have latest
VmHostFileVO, and VmHostBackupFileVO should be cleaned." Update the comment near
the code that references VmHostFileVO and VmHostBackupFileVO in class
KvmSecureBootExtensions to improve readability.
8837cd2 to f1e4687
Actionable comments posted: 1
@Override
public void run(FlowTrigger trigger, Map data) {
    // now we have latest VmHostFileVO, and VmHostBackupFileVO should be cleaned.
    SQL.New(VmHostBackupFileVO.class)
            .eq(VmHostBackupFileVO_.uuid, context.vmBackupFileVO.getUuid())
            .delete();
    trigger.next();
}
The associated VmHostFileContentVO is not deleted.
Deleting only VmHostBackupFileVO leaves the associated VmHostFileContentVO records orphaned in the database. Following the pattern in VmHostFileTracker.deleteBackupFileFromDb (see related code snippet 3), deleting a backup file should also delete its content record.
🐛 Suggested fix: also delete the associated content
 @Override
 public void run(FlowTrigger trigger, Map data) {
     // now we have latest VmHostFileVO, and VmHostBackupFileVO should be cleaned.
-    SQL.New(VmHostBackupFileVO.class)
-            .eq(VmHostBackupFileVO_.uuid, context.vmBackupFileVO.getUuid())
-            .delete();
+    new SQLBatch() {
+        @Override
+        protected void scripts() {
+            sql(VmHostFileContentVO.class)
+                    .eq(VmHostFileContentVO_.uuid, context.vmBackupFileVO.getUuid())
+                    .delete();
+            sql(VmHostBackupFileVO.class)
+                    .eq(VmHostBackupFileVO_.uuid, context.vmBackupFileVO.getUuid())
+                    .delete();
+        }
+    }.execute();
     trigger.next();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java`
around lines 519 - 526, In KvmSecureBootExtensions.run, you're only deleting the
VmHostBackupFileVO which leaves associated VmHostFileContentVO rows orphaned;
update the flow to also delete VmHostFileContentVO entries tied to
context.vmBackupFileVO.getUuid() (mirroring the pattern in
VmHostFileTracker.deleteBackupFileFromDb) by issuing a
SQL.New(VmHostFileContentVO.class).eq(...,
context.vmBackupFileVO.getUuid()).delete() before or alongside deleting
VmHostBackupFileVO, ensuring both deletions occur within this flow before
calling trigger.next().
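To see why the content rows must go together with the backup record, here is a small in-memory model. It assumes, as the review states, that a content row shares its UUID with the backup record; `BackupCleanupSketch` and its two maps are hypothetical stand-ins for the two tables, not ZStack classes.

```java
import java.util.HashMap;
import java.util.Map;

public class BackupCleanupSketch {
    // Hypothetical stand-ins for the VmHostBackupFileVO and VmHostFileContentVO tables.
    final Map<String, String> backupFiles = new HashMap<>();  // uuid -> path
    final Map<String, byte[]> fileContents = new HashMap<>(); // uuid -> content

    void addBackup(String uuid, String path, byte[] content) {
        backupFiles.put(uuid, path);
        fileContents.put(uuid, content);
    }

    /** Deletes the content row first, then the backup record, so nothing is orphaned. */
    void deleteBackup(String uuid) {
        fileContents.remove(uuid);
        backupFiles.remove(uuid);
    }

    public static void main(String[] args) {
        BackupCleanupSketch db = new BackupCleanupSketch();
        db.addBackup("hypothetical-uuid", "/hypothetical/nvram.fd", new byte[] {1, 2, 3});
        db.deleteBackup("hypothetical-uuid");
        System.out.println(db.backupFiles.isEmpty() && db.fileContents.isEmpty());
    }
}
```

In the real flow the two deletes would presumably run in a single transaction, which is what the `SQLBatch` form in the suggested fix is meant to provide.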
f1e4687 to 140a351
♻️ Duplicate comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java (1)
519-526: ⚠️ Potential issue | 🟡 Minor
The associated `VmHostFileContentVO` is not deleted.
Currently only `VmHostBackupFileVO` is deleted, which leaves the associated `VmHostFileContentVO` records orphaned in the database. According to the query logic in the `write-vm-host-file-to-dest-host` step (lines 440-444), `VmHostFileContentVO.uuid` and `VmHostBackupFileVO.uuid` share the same value.
🐛 Suggested fix: also delete the associated content
 @Override
 public void run(FlowTrigger trigger, Map data) {
     // now we have latest VmHostFileVO, and VmHostBackupFileVO should be cleaned.
-    SQL.New(VmHostBackupFileVO.class)
-            .eq(VmHostBackupFileVO_.uuid, context.vmBackupFileVO.getUuid())
-            .delete();
+    new SQLBatch() {
+        @Override
+        protected void scripts() {
+            sql(VmHostFileContentVO.class)
+                    .eq(VmHostFileContentVO_.uuid, context.vmBackupFileVO.getUuid())
+                    .delete();
+            sql(VmHostBackupFileVO.class)
+                    .eq(VmHostBackupFileVO_.uuid, context.vmBackupFileVO.getUuid())
+                    .delete();
+        }
+    }.execute();
     trigger.next();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java` around lines 519 - 526, The flow currently only deletes VmHostBackupFileVO and leaves orphaned VmHostFileContentVO rows; update the run(...) block in KvmSecureBootExtensions (the anonymous Flow implementation) to also delete VmHostFileContentVO entries matching context.vmBackupFileVO.getUuid() by adding a SQL.New(VmHostFileContentVO.class).eq(VmHostFileContentVO_.uuid, context.vmBackupFileVO.getUuid()).delete() call alongside the existing SQL.New(VmHostBackupFileVO.class)...delete() so both the backup file and its associated file content are removed.
When syncing VM host file from the origin host fails, check
whether the file has been updated (changeDate != null). If so,
it is unsafe to start the VM with stale cached content, so fail
the flow instead of silently continuing.
Also clean up VmHostBackupFileVO after a successful write to
avoid stale backup records.
Resolves: ZSV-11675
Related: ZSV-11310
Change-Id: I6a73636169727a6f7265766b696c6f6d70747676
sync from gitlab !9609