<feature>[kvm]: support take snapshot on VM offline#3563
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough将 VmHostBackupFileVO 的外键列从 Changes
Sequence DiagramsequenceDiagram
participant VolumeBase as VolumeBase
participant SnapshotSvc as SnapshotService
participant KvmMgr as KvmSecureBootManager
participant VmHostFile as VmHostFileStore
participant DB as Database
VolumeBase->>SnapshotSvc: createSnapshotGroup(..., backupHostFileIfNeeded=true)
SnapshotSvc->>KvmMgr: BackupVmHostFileMsg(vmUuid, hostUuid, toResourceUuidList)
KvmMgr->>VmHostFile: 查询源主机文件 (vmUuid, hostUuid)
VmHostFile-->>KvmMgr: 返回主机文件及内容
KvmMgr->>DB: 查询现有备份记录 WHERE resourceUuid = <src/dst>
DB-->>KvmMgr: 返回匹配备份记录
KvmMgr->>KvmMgr: 为每个目标 resourceUuid 复制/生成备份记录与文件内容
KvmMgr->>DB: 使用 SQLBatch 删除冲突记录并持久化新备份记录/内容
DB-->>KvmMgr: 返回 persisted backupFileUuidList
KvmMgr-->>SnapshotSvc: BackupVmHostFileReply(backupFileUuidList)
SnapshotSvc->>DB: 若有 backupFileUuidList,更新 VmHostBackupFileVO.resourceUuid -> VolumeSnapshotGroupVO.uuid
DB-->>SnapshotSvc: 持久化完成
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
conf/db/zsv/V5.0.0__schema.sql (1)
26-34: 建议添加resourceUuid索引以优化查询性能。代码中存在多处仅按
resourceUuid查询或删除数据的场景(无type条件),例如在KvmSecureBootManager.java:292中的.eq(VmHostBackupFileVO_.resourceUuid, msg.getSrcVmUuid())以及KvmSecureBootExtensions.java:838中的删除操作。虽然(resourceUuid, type)的唯一约束在组合查询时可作为索引,但对于仅按resourceUuid的查询,单独的索引会更高效。建议的修改
CREATE TABLE IF NOT EXISTS `zstack`.`VmHostBackupFileVO` ( `uuid` char(32) NOT NULL UNIQUE, `resourceUuid` char(32) NOT NULL, `type` varchar(64) NOT NULL COMMENT 'NvRam, TpmState', `lastOpDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `createDate` timestamp NOT NULL DEFAULT '1999-12-31 23:59:59', PRIMARY KEY (`uuid`), + INDEX `idxVmHostBackupFileVOResourceUuid` (`resourceUuid`), UNIQUE KEY `ukVmHostBackupFileVO` (`resourceUuid`, `type`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/db/zsv/V5.0.0__schema.sql` around lines 26 - 34, 为提升按 resourceUuid 的查询/删除性能,在 VmHostBackupFileVO 表上添加单列索引 resourceUuid;在 V5.0.0__schema.sql 的 CREATE TABLE 之后或通过 ALTER TABLE 增加索引(例如 INDEX idx_VmHostBackupFileVO_resourceUuid (resourceUuid)),确保该索引与已存在的 UNIQUE KEY ukVmHostBackupFileVO(resourceUuid, type) 共存且不冲突;该改动将改善 KvmSecureBootManager(使用 .eq(VmHostBackupFileVO_.resourceUuid, ...))和 KvmSecureBootExtensions(按 resourceUuid 删除)等处的性能。
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java`:
- Around line 415-447: Currently code always creates a new VmHostBackupFileVO
and corresponding VmHostFileContentVO and persists them, which can produce
duplicate VM-level backup records; change this to first check for an existing
VmHostBackupFileVO by (resourceUuid=hostFile.getVmInstanceUuid(),
type=hostFile.getType()) and if found reuse its UUID (update the existing VO or
overwrite it) so you update contentsNeedPersists with the reused UUID instead of
creating a new file entry, or otherwise deduplicate filesNeedPersists before
calling databaseFacade.persistCollection; ensure VmHostBackupFileVO,
VmHostFileContentVO, filesNeedPersists and contentsNeedPersists are updated
consistently (reuse existing UUIDs) so snapshot failures or concurrent backups
do not insert duplicate (resourceUuid,type) rows.
---
Nitpick comments:
In `@conf/db/zsv/V5.0.0__schema.sql`:
- Around line 26-34: 为提升按 resourceUuid 的查询/删除性能,在 VmHostBackupFileVO 表上添加单列索引
resourceUuid;在 V5.0.0__schema.sql 的 CREATE TABLE 之后或通过 ALTER TABLE 增加索引(例如 INDEX
idx_VmHostBackupFileVO_resourceUuid (resourceUuid)),确保该索引与已存在的 UNIQUE KEY
ukVmHostBackupFileVO(resourceUuid, type) 共存且不冲突;该改动将改善 KvmSecureBootManager(使用
.eq(VmHostBackupFileVO_.resourceUuid, ...))和 KvmSecureBootExtensions(按
resourceUuid 删除)等处的性能。
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8e968404-d633-4bad-afab-af0464e2b790
📒 Files selected for processing (12)
conf/db/zsv/V5.0.0__schema.sqlheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.javastorage/src/main/java/org/zstack/storage/volume/VolumeBase.java
a1e7f8d to
c81171b
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java (1)
335-367:⚠️ Potential issue | 🟠 Majorclone 目的端这段写入仍然不是幂等的。
这里对
msg.getDstVmUuidList()里的每个 VM 都直接插入新的VmHostBackupFileVO,但没有像 Line 440-445 那样先按(resourceUuid, type)覆盖旧记录。只要CloneVmHostFileMsg对同一批目标 VM 重试一次,就会和唯一约束冲突并把 clone 流程打断。🛠️ 可直接对齐成和
BackupVmHostFileMsg一样的覆盖写new SQLBatch() { `@Override` protected void scripts() { + for (VmHostBackupFileVO file : filesNeedPersists) { + sql(VmHostBackupFileVO.class) + .eq(VmHostBackupFileVO_.resourceUuid, file.getResourceUuid()) + .eq(VmHostBackupFileVO_.type, file.getType()) + .delete(); + } if (!filesNeedPersists.isEmpty()) { databaseFacade.persistCollection(filesNeedPersists); } if (!contentsNeedPersists.isEmpty()) { databaseFacade.persistCollection(contentsNeedPersists);
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java`:
- Around line 214-250: The sequence deletes VmHostBackupFileVO records
unconditionally after the "delete-volume-snapshots" step even if some
DeleteVolumeSnapshotMsg operations failed; move the cleanup so it only runs when
all snapshot deletions and group metadata removal succeeded. Concretely, in the
SimpleFlowChain "delete-volume-snapshot-group" (around the
"delete-volume-snapshots" step and the SQL.New(VmHostBackupFileVO.class)
deletion), check the WhileDoneCompletion's ErrorCodeList (or check reply.results
for per-snapshot errors) and only execute the SQL.New(...).eq(...).delete() when
there are no errors (or after the final done callback that confirms group
metadata is deleted); otherwise skip the VmHostBackupFileVO deletion and
propagate the error. Ensure you reference DeleteVolumeSnapshotMsg,
reply.results, the WhileDoneCompletion for "delete-volume-snapshots", and
VmHostBackupFileVO when implementing the conditional move.
In `@storage/src/main/java/org/zstack/storage/volume/VolumeBase.java`:
- Around line 3154-3167: The ungroup path leaves VmHostBackupFileVO rows
orphaned because VolumeBase now reassigns VmHostBackupFileVO.resourceUuid to
group.getUuid() but
VolumeSnapshotGroupBase.handle(APIUngroupVolumeSnapshotGroupMsg) only calls
dbf.remove(self) and doesn't clean or reassign those VmHostBackupFileVO records;
update the ungroup handler (APIUngroupVolumeSnapshotGroupMsg handling in
VolumeSnapshotGroupBase) to perform the same cleanup/migration as done in
VolumeBase (find VmHostBackupFileVO records whose uuid is in the group's
hostBackupFileUuidList and either reassign their resourceUuid to a valid target
or delete them) before deleting the group, or call the same helper used by
VolumeBase to persist/remove refs so no orphaned VmHostBackupFileVO.resourceUuid
remains pointing to a deleted group.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5f23e13c-10ce-4897-8d75-f69208d473d7
📒 Files selected for processing (13)
conf/db/zsv/V5.0.0__schema.sqlheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.javastorage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.javastorage/src/main/java/org/zstack/storage/volume/VolumeBase.java
✅ Files skipped from review due to trivial changes (5)
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.java
🚧 Files skipped from review as they are similar to previous changes (3)
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.java
- conf/db/zsv/V5.0.0__schema.sql
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.java
c81171b to
3d56bf1
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java (1)
217-246:⚠️ Potential issue | 🟠 Major部分快照删除失败时,不要继续清理主机备份文件。
delete-volume-snapshots这一步对失败只记录到reply.results,后续delete-vm-host-backup-files仍然无条件执行。这样 group 还残留部分快照时就先删掉了对应的 UEFI/TPM 备份,后续重试会失去配套元数据。至少要在进入清理步骤前判断是否存在失败;有失败就跳过清理,或把清理延后到 group 元数据真正删除后。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java` around lines 217 - 246, The delete-volume-snapshots step currently always calls trigger.next() regardless of failures recorded in reply.results, which causes delete-vm-host-backup-files to run even when some snapshot deletions failed; modify the WhileDoneCompletion in the "delete-volume-snapshots" step to inspect the ErrorCodeList (or reply.results) and only call trigger.next() when there are no errors (errorCodeList == null or empty); if there are errors, skip calling trigger.next() (or call trigger.fail(...) / set a flow context flag) so that the "delete-vm-host-backup-files" step (the code that deletes VmHostBackupFileVO entries) is not executed when any DeleteVolumeSnapshotMsg (created in DeleteVolumeSnapshotMsg handling) failed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java`:
- Around line 147-149: The DELETE call is using
.set(VmHostBackupFileVO_.resourceUuid, msg.getUuid()).delete(), which is wrong
and causes deletion of the entire VmHostBackupFileVO table because .set() is for
UPDATE; replace the .set(...) with a WHERE equality condition using
.eq(VmHostBackupFileVO_.resourceUuid, msg.getUuid()) so the
SQL.New(VmHostBackupFileVO.class)...delete() only removes rows matching the
resourceUuid from msg.getUuid().
In `@storage/src/main/java/org/zstack/storage/volume/VolumeBase.java`:
- Around line 3176-3185: The Flow step named "presist-vm-host-backup-file" in
VolumeBase (the Flow.of("presist-vm-host-backup-file") .handle(...) block) never
advances the flow because it omits a
trigger.next()/trigger.rollback()/trigger.fail() call; modify the handler to
always call trigger.next() after the SQL update (and call trigger.next() when
hostBackupFileUuidList is empty) so the flow continues to downstream
.done()/.error() handlers—use trigger.next() for success and
trigger.fail()/trigger.rollback() only if the SQL update throws an exception.
---
Duplicate comments:
In
`@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java`:
- Around line 217-246: The delete-volume-snapshots step currently always calls
trigger.next() regardless of failures recorded in reply.results, which causes
delete-vm-host-backup-files to run even when some snapshot deletions failed;
modify the WhileDoneCompletion in the "delete-volume-snapshots" step to inspect
the ErrorCodeList (or reply.results) and only call trigger.next() when there are
no errors (errorCodeList == null or empty); if there are errors, skip calling
trigger.next() (or call trigger.fail(...) / set a flow context flag) so that the
"delete-vm-host-backup-files" step (the code that deletes VmHostBackupFileVO
entries) is not executed when any DeleteVolumeSnapshotMsg (created in
DeleteVolumeSnapshotMsg handling) failed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1827bb0f-b032-46ad-898a-af9a457b178a
📒 Files selected for processing (13)
conf/db/zsv/V5.0.0__schema.sqlheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.javastorage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.javastorage/src/main/java/org/zstack/storage/volume/VolumeBase.java
✅ Files skipped from review due to trivial changes (6)
- plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.java
🚧 Files skipped from review as they are similar to previous changes (3)
- conf/db/zsv/V5.0.0__schema.sql
- plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.java
3d56bf1 to
5c87ed6
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java (1)
217-253:⚠️ Potential issue | 🟠 Major快照部分删除失败时仍会清除备份文件,导致数据不一致。
当前实现中,
delete-volume-snapshots步骤即使有部分快照删除失败(错误被记录到reply.results),WhileDoneCompletion仍会无条件调用trigger.next(),导致后续delete-vm-host-backup-files步骤执行。这会造成:快照组可能还残留部分快照,但关联的 UEFI/TPM 备份文件已被删除。
建议在删除备份文件前检查是否所有快照都成功删除:
🛠️ 建议修复
.then("delete-volume-snapshots", (trigger) -> new While<>(snapshots).step((snapshot, compl) -> { // ... existing code ... }, 5).run(new WhileDoneCompletion(msg) { `@Override` public void done(ErrorCodeList errorCodeList) { - trigger.next(); + if (!errorCodeList.getCauses().isEmpty()) { + // 有快照删除失败,跳过备份文件删除 + trigger.fail(errorCodeList.getCauses().get(0)); + return; + } + trigger.next(); } }))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java` around lines 217 - 253, The delete flow advances to "delete-vm-host-backup-files" even when some snapshot deletions failed; update the WhileDoneCompletion in the "delete-volume-snapshots" step (the While<> over snapshots that produces DeleteSnapshotGroupResult entries in reply.results) to check for failures and stop the chain on error instead of unconditionally calling trigger.next(): inspect the ErrorCodeList passed to WhileDoneCompletion and/or iterate reply.results (DeleteSnapshotGroupResult) for non-null errors, and call trigger.fail(...) (or otherwise propagate an appropriate ErrorCode) when any snapshot deletion failed; only call trigger.next() when there are no errors so backup-file deletion runs only after all snapshots succeeded.
🧹 Nitpick comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java (1)
397-415: SQLBatch 中删除与插入非原子操作,存在数据丢失风险。当前实现先循环删除已有记录,再批量插入新记录。如果
persistCollection在删除完成后失败,会导致数据丢失(旧记录已删,新记录未写入)。建议将整个操作包装在数据库事务中,或考虑使用 upsert 语义确保原子性。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java` around lines 397 - 415, The current SQLBatch implementation in KvmSecureBootManager removes existing VmHostBackupFileVO rows in a loop then calls databaseFacade.persistCollection(filesNeedPersists) and persistCollection(contentsNeedPersists), which is not atomic and can lose data if persistCollection fails after deletes; wrap the delete+insert work in a single transactional boundary (use the platform's transaction API or SQLBatch transactional support) or replace the delete+persist with an upsert semantics so the operation on VmHostBackupFileVO (the loop that calls sql(...).delete()) and the subsequent persistCollection calls for filesNeedPersists and contentsNeedPersists execute within the same transaction to ensure atomicity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java`:
- Around line 328-334: handle(BackupVmHostFileMsg) currently calls
backupVmHostFile without exception handling so any thrown exception will prevent
sending a reply; wrap the call to backupVmHostFile in a try-catch (catch
Exception) inside handle(BackupVmHostFileMsg), in the try set up
BackupVmHostFileReply as now, and in the catch populate the reply with failure
information (error message/stack or set error fields as used by
BackupVmHostFileReply), log the exception, and always call bus.reply(msg, reply)
so the caller is not left waiting; reference these symbols:
handle(BackupVmHostFileMsg), backupVmHostFile, BackupVmHostFileReply and
bus.reply.
---
Duplicate comments:
In
`@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java`:
- Around line 217-253: The delete flow advances to "delete-vm-host-backup-files"
even when some snapshot deletions failed; update the WhileDoneCompletion in the
"delete-volume-snapshots" step (the While<> over snapshots that produces
DeleteSnapshotGroupResult entries in reply.results) to check for failures and
stop the chain on error instead of unconditionally calling trigger.next():
inspect the ErrorCodeList passed to WhileDoneCompletion and/or iterate
reply.results (DeleteSnapshotGroupResult) for non-null errors, and call
trigger.fail(...) (or otherwise propagate an appropriate ErrorCode) when any
snapshot deletion failed; only call trigger.next() when there are no errors so
backup-file deletion runs only after all snapshots succeeded.
---
Nitpick comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java`:
- Around line 397-415: The current SQLBatch implementation in
KvmSecureBootManager removes existing VmHostBackupFileVO rows in a loop then
calls databaseFacade.persistCollection(filesNeedPersists) and
persistCollection(contentsNeedPersists), which is not atomic and can lose data
if persistCollection fails after deletes; wrap the delete+insert work in a
single transactional boundary (use the platform's transaction API or SQLBatch
transactional support) or replace the delete+persist with an upsert semantics so
the operation on VmHostBackupFileVO (the loop that calls sql(...).delete()) and
the subsequent persistCollection calls for filesNeedPersists and
contentsNeedPersists execute within the same transaction to ensure atomicity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dabe943d-a218-4ad4-81c5-884628db3793
📒 Files selected for processing (13)
conf/db/zsv/V5.0.0__schema.sqlheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.javastorage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.javastorage/src/main/java/org/zstack/storage/volume/VolumeBase.java
✅ Files skipped from review due to trivial changes (7)
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.java
🚧 Files skipped from review as they are similar to previous changes (4)
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.java
- conf/db/zsv/V5.0.0__schema.sql
- storage/src/main/java/org/zstack/storage/volume/VolumeBase.java
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.java
| private void handle(BackupVmHostFileMsg msg) { | ||
| BackupVmHostFileReply reply = new BackupVmHostFileReply(); | ||
| List<VmHostBackupFileVO> filesNeedPersists = backupVmHostFile( | ||
| msg.getVmUuid(), msg.getHostUuid(), msg.getToResourceUuidList()); | ||
| reply.setBackupFileUuidList(transform(filesNeedPersists, VmHostBackupFileVO::getUuid)); | ||
| bus.reply(msg, reply); | ||
| } |
There was a problem hiding this comment.
缺少异常处理,可能导致调用方无限等待。
handle(BackupVmHostFileMsg) 方法没有 try-catch 包裹 backupVmHostFile 调用。如果该方法抛出异常(如数据库操作失败),reply 永远不会被发送,调用方将无限等待。
🛠️ 建议修复
private void handle(BackupVmHostFileMsg msg) {
BackupVmHostFileReply reply = new BackupVmHostFileReply();
- List<VmHostBackupFileVO> filesNeedPersists = backupVmHostFile(
- msg.getVmUuid(), msg.getHostUuid(), msg.getToResourceUuidList());
- reply.setBackupFileUuidList(transform(filesNeedPersists, VmHostBackupFileVO::getUuid));
+ try {
+ List<VmHostBackupFileVO> filesNeedPersists = backupVmHostFile(
+ msg.getVmUuid(), msg.getHostUuid(), msg.getToResourceUuidList());
+ reply.setBackupFileUuidList(transform(filesNeedPersists, VmHostBackupFileVO::getUuid));
+ } catch (Exception e) {
+ reply.setError(Platform.operr("failed to backup vm host files: %s", e.getMessage()));
+ }
bus.reply(msg, reply);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java` around
lines 328 - 334, handle(BackupVmHostFileMsg) currently calls backupVmHostFile
without exception handling so any thrown exception will prevent sending a reply;
wrap the call to backupVmHostFile in a try-catch (catch Exception) inside
handle(BackupVmHostFileMsg), in the try set up BackupVmHostFileReply as now, and
in the catch populate the reply with failure information (error message/stack or
set error fields as used by BackupVmHostFileReply), log the exception, and
always call bus.reply(msg, reply) so the caller is not left waiting; reference
these symbols: handle(BackupVmHostFileMsg), backupVmHostFile,
BackupVmHostFileReply and bus.reply.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the ExceptionSafe.aj aspect file
fd "ExceptionSafe.aj" --type f
Length of output: 0
🏁 Script executed:
#!/bin/bash
# Search for ExceptionSafe aspect files with different extensions or locations
fd "ExceptionSafe" --type f
echo "---"
# Search for .aj files in the repo
fd -e aj --type f | head -20
echo "---"
# Search for ExceptionSafe in any java/aj files
rg -rn "ExceptionSafe" --type java -l
rg -rn "ExceptionSafe" -l
Length of output: 4007
🏁 Script executed:
#!/bin/bash
# Read the ExceptionSafeAspect.aj to understand how it works
cat core/src/main/java/org/zstack/core/aspect/ExceptionSafeAspect.aj
echo "==="
# Also read MessageSafeAspect.aj as it might be more relevant for message handling
cat core/src/main/java/org/zstack/core/aspect/MessageSafeAspect.aj
echo "==="
# Read the ExceptionSafe annotation
cat header/src/main/java/org/zstack/header/core/ExceptionSafe.java
Length of output: 2416
🏁 Script executed:
#!/bin/bash
# Check if handleMessage or handle methods in AbstractService use `@MessageSafe`
rg -n "MessageSafe" plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java
echo "==="
# Check AbstractService for `@MessageSafe` annotation
fd "AbstractService.java" | xargs grep -n "MessageSafe" 2>/dev/null
echo "==="
# Check how handleMessage is annotated in AbstractService
fd "AbstractService.java" | xargs cat
Length of output: 464
🏁 Script executed:
#!/bin/bash
# Check how CloudBusImpl3 dispatches messages - look for MessageSafe annotation usage around handleMessage dispatch
rg -n "MessageSafe\|handleMessage\|dispatchMessage" core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java | head -40
echo "==="
# Look for `@MessageSafe` in KVMHost or similar KVM classes that handle messages
rg -n "@MessageSafe" plugin/kvm/src/main/java/org/zstack/kvm/ -l
echo "==="
# Look for typical pattern of how handle(SomeMsg) is protected in KVM
rg -n "@MessageSafe" --type java | head -20
Length of output: 2177
🏁 Script executed:
#!/bin/bash
# Check KvmTpmManager to see how it uses `@MessageSafe`
rg -n "@MessageSafe\|handleMessage\|handle(" plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java | head -30
echo "==="
# Check the full handleMessage method in KvmSecureBootManager
rg -n "@MessageSafe\|handleMessage\|@Override" plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java
Length of output: 126
🏁 Script executed:
#!/bin/bash
# Check KvmTpmManager for `@MessageSafe` and handleMessage
rg -n "MessageSafe|handleMessage" plugin/kvm/src/main/java/org/zstack/kvm/tpm/KvmTpmManager.java | head -30
echo "==="
# Check KvmSecureBootManager for `@MessageSafe` and handleMessage
rg -n "MessageSafe|handleMessage|@Override" plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java
Length of output: 676
@zstack-robot-1 感谢 wenhao.zhang 的说明!相关保护机制确实存在,但经过代码核查,KvmSecureBootManager 目前并未启用该保护。
具体来看:
- 真正提供"异常 → 自动发送 reply error"能力的是
MessageSafeAspect.aj,它作用于带有@MessageSafe注解的handleMessage方法。 - 对比同目录下的
KvmTpmManager,其handleMessage正确地标注了@MessageSafe(Line 107–108)。 - 而
KvmSecureBootManager.handleMessage(Line 146)没有@MessageSafe注解,因此该 Aspect 不会生效。
建议在 handleMessage 上补充 @MessageSafe 注解,与项目中其他 Manager 保持一致:
import org.zstack.core.cloudbus.MessageSafe;
// ...
`@Override`
`@MessageSafe`
public void handleMessage(Message msg) {
if (msg instanceof CloneVmHostFileMsg) {
handle((CloneVmHostFileMsg) msg);
} else if (msg instanceof BackupVmHostFileMsg) {
handle((BackupVmHostFileMsg) msg);
} else {
bus.dealWithUnknownMessage(msg);
}
}这样一旦 handle(BackupVmHostFileMsg) 内部抛出异常,MessageSafeAspect 会自动捕获并通过 bus.replyErrorByMessageType 发送错误 reply,调用方不会无限等待。
✏️ Learnings added
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 3563
File: plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java:328-334
Timestamp: 2026-03-23T09:33:01.215Z
Learning: In ZStack, automatic exception-to-reply-error protection for message handlers is provided by `MessageSafeAspect.aj` (in `core/src/main/java/org/zstack/core/aspect/MessageSafeAspect.aj`), which requires the `MessageSafe` annotation (from `org.zstack.core.cloudbus.MessageSafe`) on the `handleMessage` method. `ExceptionSafeAspect.aj` only applies to methods annotated with `ExceptionSafe` and merely logs/swallows exceptions without sending a reply. `KvmSecureBootManager.handleMessage` was missing `MessageSafe` in PR `#3563`, unlike the correctly annotated `KvmTpmManager.handleMessage`.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 3417
File: plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java:217-221
Timestamp: 2026-03-06T10:24:57.110Z
Learning: In ZStack's KvmSecureBootExtensions.syncVmHostFilesFromHost method (plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java), when reading individual host files (NvRam/TpmState) fails (to.getError() != null), it is intentional to log a warning and continue rather than failing the whole operation. This is a resilience design choice — individual host file sync failures should be tolerated and must not cause the entire completion to fail.
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2823
File: plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java:786-831
Timestamp: 2025-10-28T02:29:38.803Z
Learning: APIUpdateHostnameMsg 的主机名有效性已在 API 层(HostApiInterceptor/VmHostnameUtils)完成校验;KVMHost.handle(UpdateHostnameMsg) 不应重复校验。
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2024
File: compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java:373-388
Timestamp: 2025-04-21T03:20:20.087Z
Learning: 在 ZStack 中,验证逻辑应该抛出 `ApiMessageInterceptionException` 而不是 `OperationFailureException`,以保持异常处理的一致性。这特别适用于 VM 系统的参数验证过程。
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2504
File: plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java:1010-1010
Timestamp: 2025-08-25T03:53:30.749Z
Learning: The OFFLINE_COMMIT_PATH ("/localstorage/snapshot/offlinecommit") for local storage snapshot commit operations is properly handled on both the ZStack management node side (in LocalStorageKvmBackend.java) and the KVM Agent side, providing complete end-to-end offline snapshot commit functionality.
5c87ed6 to
c4d3ef9
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java (1)
228-245:⚠️ Potential issue | 🟠 Major快照删除失败时,这里仍会继续清理主机备份文件。
Line 231 只是把
r.getError()记进reply,Line 238 仍然无条件trigger.next(),所以任意DeleteVolumeSnapshotMsg失败后仍会执行 Line 241-245 的VmHostBackupFileVO删除。这样会把还未成功删除的 snapshot/group 对应的 NVRAM/TPM 元数据提前清掉,后续恢复路径会被破坏。建议先汇总reply/errorCodeList中的失败结果;只有全部快照删除成功时才进入备份文件清理,否则直接trigger.fail(...)返回。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java` around lines 228 - 245, The code unconditionally advances to "delete-vm-host-backup-files" even when individual DeleteVolumeSnapshotMsg calls failed; modify the WhileDoneCompletion.done(ErrorCodeList errorCodeList) in the block using WhileDoneCompletion so that it inspects errorCodeList (and the accumulated reply results returned via DeleteSnapshotGroupResult entries) and only calls trigger.next() when there are no failures; if any snapshot deletion failed, call trigger.fail(...) with an aggregated ErrorCode (or the errorCodeList) to abort and prevent SQL.New(VmHostBackupFileVO.class)...delete() from running; use symbols DeleteSnapshotGroupResult, WhileDoneCompletion.done(ErrorCodeList), trigger.next(), trigger.fail(...), VmHostBackupFileVO.resourceUuid and self.getUuid() to locate and implement the conditional behavior.
🧹 Nitpick comments (1)
header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java (1)
19-19: 考虑使用复数形式backupHostFilesIfNeeded以更准确表达语义。根据 PR 评审意见,此标志用于备份 TPM 和 NVRAM 等多种主机文件。当前命名使用单数 "File",建议改为复数 "Files" 以更好地反映其实际用途。
♻️ 建议的改动
- private boolean backupHostFileIfNeeded; + private boolean backupHostFilesIfNeeded;同时更新对应的 getter 和 setter 方法名。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java` at line 19, Rename the boolean field backupHostFileIfNeeded in class CreateVolumesSnapshotMsg to backupHostFilesIfNeeded and update the corresponding accessor methods (getBackupHostFileIfNeeded / isBackupHostFileIfNeeded and setBackupHostFileIfNeeded) to use the plural form (getBackupHostFilesIfNeeded / isBackupHostFilesIfNeeded and setBackupHostFilesIfNeeded); ensure any usages, references, serializers, and documentation that refer to backupHostFileIfNeeded are updated to the new name to keep compatibility across the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java`:
- Around line 228-245: The code unconditionally advances to
"delete-vm-host-backup-files" even when individual DeleteVolumeSnapshotMsg calls
failed; modify the WhileDoneCompletion.done(ErrorCodeList errorCodeList) in the
block using WhileDoneCompletion so that it inspects errorCodeList (and the
accumulated reply results returned via DeleteSnapshotGroupResult entries) and
only calls trigger.next() when there are no failures; if any snapshot deletion
failed, call trigger.fail(...) with an aggregated ErrorCode (or the
errorCodeList) to abort and prevent SQL.New(VmHostBackupFileVO.class)...delete()
from running; use symbols DeleteSnapshotGroupResult,
WhileDoneCompletion.done(ErrorCodeList), trigger.next(), trigger.fail(...),
VmHostBackupFileVO.resourceUuid and self.getUuid() to locate and implement the
conditional behavior.
---
Nitpick comments:
In
`@header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.java`:
- Line 19: Rename the boolean field backupHostFileIfNeeded in class
CreateVolumesSnapshotMsg to backupHostFilesIfNeeded and update the corresponding
accessor methods (getBackupHostFileIfNeeded / isBackupHostFileIfNeeded and
setBackupHostFileIfNeeded) to use the plural form (getBackupHostFilesIfNeeded /
isBackupHostFilesIfNeeded and setBackupHostFilesIfNeeded); ensure any usages,
references, serializers, and documentation that refer to backupHostFileIfNeeded
are updated to the new name to keep compatibility across the codebase.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f9d40926-34e0-47f8-9845-aab2cf92c73b
📒 Files selected for processing (14)
compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.javaconf/db/zsv/V5.0.0__schema.sqlheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.javaheader/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO.javaheader/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.javaplugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.javastorage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.javastorage/src/main/java/org/zstack/storage/volume/VolumeBase.java
✅ Files skipped from review due to trivial changes (7)
- header/src/main/java/org/zstack/header/vm/additions/VmHostBackupFileVO_.java
- compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerMsg.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotReply.java
- header/src/main/java/org/zstack/header/storage/snapshot/CreateVolumesSnapshotOverlayInnerReply.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/BackupVmHostFileMsg.java
🚧 Files skipped from review as they are similar to previous changes (4)
- plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootExtensions.java
- conf/db/zsv/V5.0.0__schema.sql
- storage/src/main/java/org/zstack/storage/volume/VolumeBase.java
- plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java
c4d3ef9 to
1bff2c1
Compare
|
Comment from ye.zou: Code Review R2: ZSV-11441 — VM offline snapshot + TPM backupVerdict: ✅ CONDITIONAL (R1 → R2: block → conditional) R1 Critical 全部修复
R1 Major 修复情况
仍 open 的问题#1 (minor) 额外加分
|
* Refactor The VM host backup file management mechanism * Changing the foreign key of `VmHostBackupFileVO` from `vmInstanceUuid` to `resourceUuid` * New backup-related message classes and fields have been added, and the backup logic in the KVM and storage modules has been extended to support the backup of EFI/TPM files. Resolves: ZSV-11441 Related: ZSV-11310 Change-Id: I737964696f74707576766677686f697670777477
1bff2c1 to
0435fd7
Compare
Resolves: ZSV-11441
Related: ZSV-11310
Change-Id: I737964696f74707576766677686f697670777477
sync from gitlab !9421