[AI] Add host model cache API surface for 5.5.22#4325

Open
MatheMatrix wants to merge 11 commits into feature-5.5.22-aios from sync/ye.zou/feature/ZSTAC-85984-host-cache-core-5.5.22-aios@@2

Conversation

@MatheMatrix

Owner

Summary

  • Backport the host model cache core API/schema/SDK/testlib prerequisites to feature-5.5.22-aios on the grouped @@2 branch.
  • Fix SDK generation for host cache inventory collection fields so generated SDK output is repeatable.

Root Cause

  • Premium host model cache control-plane tests depend on core API surface and schema helpers that were only present in the later AI feature branch.
  • Host cache reply inventory fields need typed SDK collection output, but the SDK generator previously emitted raw collection types and hand-written SDK edits were overwritten by ./runMavenProfile sdk.

Change

  • Add host cache API/schema support and generated helper outputs required by premium control-plane behavior.
  • Add an opt-in @SDKGeneric marker and SDK generator support for typed collection output.
  • Regenerate/fix host cache SDK output and remove non-repeatable hand-written SDK validation.
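
To illustrate the opt-in idea behind typed collection output, here is a minimal sketch of how a generator could read a field's generic signature only when a marker annotation is present. `SdkGenericSketch`, `SketchReply`, and `TypedFieldSketch` are hypothetical names for illustration. The real `@SDKGeneric` annotation and the logic in `SdkDataStructureGenerator.groovy` are not shown in this conversation and may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical stand-in for the PR's @SDKGeneric marker (assumption: field-level, runtime-retained).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SdkGenericSketch {}

// Hypothetical reply class: one field opts in to typed output, one keeps the raw type.
class SketchReply {
    @SdkGenericSketch
    List<String> inventories;
    List<String> legacy;
}

final class TypedFieldSketch {
    // Returns the type text a generator could emit for the given field.
    static String emittedType(Field field) {
        Type generic = field.getGenericType();
        if (field.isAnnotationPresent(SdkGenericSketch.class) && generic instanceof ParameterizedType) {
            ParameterizedType p = (ParameterizedType) generic;
            StringBuilder sb = new StringBuilder(((Class<?>) p.getRawType()).getName()).append('<');
            Type[] args = p.getActualTypeArguments();
            for (int i = 0; i < args.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(args[i].getTypeName());
            }
            return sb.append('>').toString();
        }
        // Fields without the marker keep the raw collection type.
        return field.getType().getName();
    }

    // Convenience lookup by field name.
    static String emittedType(Class<?> owner, String fieldName) {
        try {
            return emittedType(owner.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

In this sketch, unmarked fields keep their raw type, so the output for existing SDK classes stays unchanged unless a reply opts in.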

Verification

  • Verified together with the paired premium branch in the PR Docker environment.
  • ./runMavenProfile premium passed after rebuilding from :ai through test-premium.
  • Premium focused AITest host cache cases passed with this core branch mounted.
  • AccessKeyBasicCase passed with this core branch mounted.
  • mvn -P premium -pl premium/plugin-premium/ai -am install -DskipTests -Djacoco.skip=true passed.
  • mvn -pl sdk -DskipTests -Djacoco.skip=true compile passed.
  • Earlier clean ./runMavenProfile sdk passed; a later rerun after full local install hit a local zstack-iam2 VerifyError during reflections scan, not an SDK compile error.

Risk

  • Moderate: API/schema prerequisite backport for AI branch. Premium MR depends on this MR.

Resolves: ZSTAC-85984

sync from gitlab !10285

AlanJager added 10 commits June 23, 2026 19:36
The VM model-cache feature needs persistent schema rows and generated client surfaces before premium can expose scheduling and cache-management controls.

Resolves: ZSTAC-0

Change-Id: I424f987899216872309b42439f8ba8a1353ae505
Host model cache identity and reservation lifecycle fields must be non-null to keep database uniqueness and capacity accounting consistent with the premium VO model.

Resolves: ZSTAC-0

Change-Id: Ibbc18ef1436e3ec8366b38c055451fc610fff88a
Host cache storage and policy rows are keyed by hostUuid/sourceRoot and are removed through host cascade, so the DDL now rejects null key components instead of relying on nullable unique-key behavior.

Resolves: ZSTAC-0

Change-Id: If75425fc43b3ba407b807c7440fe1982b376cf9e
Regenerated ApiHelper after adding host model cache API actions.

Resolves: ZSTAC-0

Change-Id: Ic04457599d0e3d3e59d26a7a22156b6785d3e913
Regenerated SDK in verify-case with the premium host model cache branch mounted so update_sdk stays clean.

Tested: verify-case ./runMavenProfile premium
Tested: verify-case ./runMavenProfile sdk

Resolves: ZSTAC-0

Change-Id: I42c8e3c0edc320588097236dc8d207ebfcb8cdce
Add zoneUuid to ModelCenter-derived AI resource SDK inventories and update the 5.5.28 schema migration to persist, backfill, and constrain the zone relation for models, model services, datasets, and model service instance groups.

When existing ModelCenter rows do not have zoneUuid, infer a historical default only from deployed inference service VMs and only when all observed VM zones under the same resource agree on one zone. This keeps ModelCenter as the binding point for new behavior while avoiding an arbitrary zone choice during upgrade.

Constraint: Source branch must end with @@2 for the linked MR set
Rejected: Default to the first ZoneEO row | arbitrary and can bind AI resources to the wrong physical zone
Rejected: Leave all historical derived resources NULL when deployed inference VMs reveal a single zone | loses a reliable existing placement signal
Confidence: high
Scope-risk: moderate
Directive: Keep migration inference conservative; do not backfill from mixed-zone deployed services without an explicit product rule
Tested: Docker verify container ./runMavenProfile premium
Tested: Docker verify container ./runMavenProfile sdk
Tested: Restored 172.20.1.159 MySQL backup 2026-06-03_14-30-01 into MySQL 5.7 and ran beforeMigrate.sql plus V5.5.28__schema.sql twice
Resolves: ZSTAC-75429
Change-Id: If0a2176680abb9911d63b0281779616215d9a787
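
The conservative backfill rule in the zoneUuid commit message above can be sketched in a few lines. This is an illustration only, not the migration itself: the real logic lives in the SQL upgrade script, and `ZoneBackfillSketch` and `inferZoneUuid` are hypothetical names. Ignoring VMs with no recorded zone is also an assumption of this sketch.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

final class ZoneBackfillSketch {
    // Returns a zone only when every observed VM zone for one resource agrees.
    // Zero observations or mixed zones yield empty, which leaves zoneUuid unset.
    static Optional<String> inferZoneUuid(Collection<String> observedVmZoneUuids) {
        Set<String> distinct = new HashSet<>();
        for (String zoneUuid : observedVmZoneUuids) {
            if (zoneUuid != null) { // assumption: VMs without a zone are ignored
                distinct.add(zoneUuid);
            }
        }
        if (distinct.size() == 1) {
            return Optional.of(distinct.iterator().next());
        }
        return Optional.empty();
    }
}
```

An empty result covers both rejected alternatives named in the message: no zone is picked arbitrarily, and a single agreed zone is not discarded.
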
Store VmModelMountVO lastAttachedEpoch in database so restore cleanup can distinguish successful asynchronous attach from stale failure callbacks.

Tested: docker verify-case runMavenProfile premium
Tested: docker verify-case VmModelMountCase

Resolves: ZSTAC-84246

Change-Id: Ib426797059be9401c3e2556ecc31c1879dd04049
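
One way a persisted epoch can separate a stale failure callback from a current one is sketched below. The comparison rule here is an assumption made for illustration. The commit only states that lastAttachedEpoch is stored, and `MountEpochSketch` and its methods are hypothetical names, not the VmModelMountVO API.

```java
final class MountEpochSketch {
    // Stands in for the persisted lastAttachedEpoch column; a plain field in this sketch.
    private long lastAttachedEpoch;

    // Record the epoch of an attach that completed successfully.
    void onAttachSucceeded(long epoch) {
        lastAttachedEpoch = Math.max(lastAttachedEpoch, epoch);
    }

    // A failure callback is treated as stale when its epoch is not newer than the
    // last successful attach (assumed rule), so cleanup is skipped in that case.
    boolean shouldCleanUpOnFailure(long failedEpoch) {
        return failedEpoch > lastAttachedEpoch;
    }
}
```

With this rule, a failure reported for an attempt that was later superseded by a successful attach does not undo the mount.
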
Move the lastAttachedEpoch schema change out of the 5.5.16 upgrade file by restoring that file to match the 5.5.16 release branch. The 5.5.22 upgrade file already carries the ADD_COLUMN migration, which keeps the change scoped to the target release.

Constraint: 5.5.16 release schema must remain byte-for-byte aligned with upstream/5.5.16
Rejected: Leave the column in V5.5.16__schema.sql | would mutate an already released upgrade file
Confidence: high
Scope-risk: narrow
Tested: git diff --exit-code upstream/5.5.16 -- conf/db/upgrade/V5.5.16__schema.sql
Tested: git diff --exit-code upstream/5.5.22 -- conf/db/upgrade/V5.5.22__schema.sql
Tested: git diff --check

Resolves: ZSTAC-84246

Change-Id: Ibf18c6665d9d3c90361ed7f57d65fc609f6a1bfb
Tighten host cache SDK typing, client-side watermark validation, and null-safe query helper conditions.

Resolves: ZSTAC-85984

Change-Id: I86b1215f6698decbe9dfdb0411ba39f42de36c13
Host cache replies now opt in to typed collection generation through SDKGeneric so generated SDK output remains repeatable. The generated action keeps validation on the server side instead of hand-written SDK code that runMavenProfile sdk overwrites.

Resolves: ZSTAC-85984

Change-Id: Ic7d27efe4a1551802f725560b3406544b78efa69
@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Warning

Review limit reached

@MatheMatrix, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 52 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dea7ad7e-bc19-452f-86ff-3a5968666fe1

📥 Commits

Reviewing files that changed from the base of the PR and between 7fb83a1 and b86963e.

📒 Files selected for processing (29)
  • conf/db/upgrade/V5.5.22__schema.sql
  • header/src/main/java/org/zstack/header/rest/SDKGeneric.java
  • rest/src/main/resources/scripts/SdkDataStructureGenerator.groovy
  • sdk/src/main/java/SourceClassMap.java
  • sdk/src/main/java/org/zstack/sdk/AiHostCacheStorageInventory.java
  • sdk/src/main/java/org/zstack/sdk/AiHostCacheStorageStatus.java
  • sdk/src/main/java/org/zstack/sdk/AiHostModelCacheFailureCode.java
  • sdk/src/main/java/org/zstack/sdk/AiHostModelCacheFailurePhase.java
  • sdk/src/main/java/org/zstack/sdk/AiHostModelCacheInventory.java
  • sdk/src/main/java/org/zstack/sdk/AiHostModelCachePolicyInventory.java
  • sdk/src/main/java/org/zstack/sdk/AiHostModelCacheStatus.java
  • sdk/src/main/java/org/zstack/sdk/CleanAiHostModelCacheAction.java
  • sdk/src/main/java/org/zstack/sdk/CleanAiHostModelCacheResult.java
  • sdk/src/main/java/org/zstack/sdk/DatasetInventory.java
  • sdk/src/main/java/org/zstack/sdk/GetAiHostModelCacheCapacityAction.java
  • sdk/src/main/java/org/zstack/sdk/GetAiHostModelCacheCapacityResult.java
  • sdk/src/main/java/org/zstack/sdk/ModelCenterInventory.java
  • sdk/src/main/java/org/zstack/sdk/ModelInventory.java
  • sdk/src/main/java/org/zstack/sdk/ModelServiceInstanceGroupInventory.java
  • sdk/src/main/java/org/zstack/sdk/ModelServiceInventory.java
  • sdk/src/main/java/org/zstack/sdk/QueryAiHostModelCacheAction.java
  • sdk/src/main/java/org/zstack/sdk/QueryAiHostModelCacheResult.java
  • sdk/src/main/java/org/zstack/sdk/RefreshAiHostModelCacheAction.java
  • sdk/src/main/java/org/zstack/sdk/RefreshAiHostModelCacheResult.java
  • sdk/src/main/java/org/zstack/sdk/UpdateAiHostModelCachePolicyAction.java
  • sdk/src/main/java/org/zstack/sdk/UpdateAiHostModelCachePolicyResult.java
  • sdk/src/main/java/org/zstack/sdk/UpdateModelCenterAction.java
  • sdk/src/main/java/org/zstack/sdk/VmModelMountInventory.java
  • testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)
Could not fetch remote config from http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml: TimeoutError: The operation was aborted due to timeout

Use a stable sourceRootIdentity for host cache policy uniqueness and backfill existing rows during upgrade.

Resolves: ZSTAC-85984

Change-Id: I4ada66f7436c1c120a06adbca4128e99065bb03c
@zstack-robot-2

Collaborator

Comment from ye.zou:

Pushed follow-up fix b86963e5ab for host cache policy schema uniqueness/backfill.

Local verification with paired premium branch:

  • skipJacoco=true ./runstablilitycase org.zstack.test.integration.ai.AiHostModelCacheControlPlaneCase 1
  • Result: Tests run: 1, Failures: 0, Errors: 0, BUILD SUCCESS

3 participants