LCORE-2080: Added E2E Steps for Agent Skills #1941

Open
jrobertboos wants to merge 1 commit into lightspeed-core:main from jrobertboos:lcore-2080

Conversation

@jrobertboos jrobertboos commented Jun 17, 2026

Contributor

Description

Added the missing E2E steps for testing agent skills.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor (Composer 2.5)
  • Generated by: Cursor (Composer 2.5)

Related Tickets & Documents

  • Related Issue LCORE-2080
  • Closes LCORE-2080

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added support for skills in end-to-end stack setups, including new sample skills for echoing text and summarizing content.
    • Expanded skill handling so responses can show tool calls and tool results during streaming and non-streaming queries.
  • Tests

    • Added new end-to-end configurations for both server and library modes to exercise skills-enabled flows.
    • Updated skill scenarios to cover loading skills, reading skill resources, and multi-skill interactions.

coderabbitai Bot commented Jun 17, 2026

Contributor

Review Change Stack

Warning

Review limit reached

@jrobertboos, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 30 minutes and 16 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: da18d710-b4c7-4e66-9db1-dc20fce1c189

📥 Commits

Reviewing files that changed from the base of the PR and between 1f11ea7 and 2954a53.

📒 Files selected for processing (14)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml
  • tests/e2e/features/skills.feature
  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
  • tests/e2e/skills/echo/SKILL.md
  • tests/e2e/skills/echo/references/guide.md
  • tests/e2e/skills/summarize/SKILL.md
  • tests/e2e/skills/summarize/references/guide.md
  • tests/e2e/test_list.txt

Walkthrough

Adds e2e skill fixtures, compose mounts, and Lightspeed stack configs for server and library modes. Updates streaming response helpers to capture tool calls and results, and expands the skills feature scenarios to use the new load/read skill flows.

Changes

Skills e2e wiring

| Layer / File(s) | Summary |
| --- | --- |
| Compose mounts and skill fixtures: docker-compose-library.yaml, docker-compose.yaml, tests/e2e/skills/echo/*, tests/e2e/skills/summarize/* | Compose mounts expose the skills test directory, and new echo and summarize skill documents and guides are added. |
| Lightspeed stack configs: tests/e2e/configuration/library-mode/lightspeed-stack-skills*.yaml, tests/e2e/configuration/server-mode/lightspeed-stack-skills*.yaml | New server-mode and library-mode LCS configs set binding, logging, authentication, llama-stack client wiring, data storage, and skills paths. |
| Streaming response helpers: tests/e2e/features/steps/common_http.py, tests/e2e/features/steps/llm_query_response.py | A response-field assertion step is added, and streamed SSE parsing now accumulates and exposes tool calls and tool results. |
| Skills scenarios: tests/e2e/features/skills.feature, tests/e2e/test_list.txt | The skills feature updates tool-name and tool-call assertions across registration, load, read-resource, multi-skill, and progressive disclosure scenarios, and adds the feature to the e2e list. |
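
The SSE accumulation described for the streaming response helpers can be pictured with a minimal sketch. The event names (`token`, `tool_call`, `tool_result`), the payload layout, and the function name `parse_sse_stream` are assumptions made for illustration; they are not taken from the PR's step code.

```python
import json


def parse_sse_stream(raw: str) -> dict:
    """Collect tokens, tool calls and tool results from an SSE body.

    The event schema handled here is hypothetical, not the real LCS schema.
    """
    summary = {"response": "", "tool_calls": [], "tool_results": []}
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        event = json.loads(line[len("data:"):].strip())
        kind, data = event.get("event"), event.get("data", {})
        if kind == "token":
            summary["response"] += data.get("token", "")
        elif kind == "tool_call":
            summary["tool_calls"].append(data)
        elif kind == "tool_result":
            summary["tool_results"].append(data)
    return summary


# Hypothetical stream: one tool call, its result, then one text token.
stream = "\n\n".join(
    [
        'data: {"event": "tool_call", "data": {"id": "call_1", "name": "list_skills"}}',
        'data: {"event": "tool_result", "data": {"id": "call_1", "status": "success"}}',
        'data: {"event": "token", "data": {"token": "Done"}}',
    ]
)
summary = parse_sse_stream(stream)
print(summary["tool_calls"], summary["tool_results"], summary["response"])
```

Keeping tool calls and tool results in separate lists mirrors the `tool_calls` and `tool_results` fields of the non-streaming response, so one set of assertions could in principle serve both paths.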

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • radofuchs
  • tisnik
  • asimurka
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title matches the main change: adding end-to-end support for agent skills and related test steps. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@jrobertboos jrobertboos force-pushed the lcore-2080 branch 3 times, most recently from c201e27 to fe7754f Compare June 23, 2026 16:29

@SkillsConfig
@SkillsConfig @skip
Scenario: Skill tools are registered when skills are configured

Contributor Author

TODO: Need to reflect skill tools (list_skills, load_skill, read_skill_resource) in /tools.

"""
And The token metrics have increased

# --- Error handling: unknown skill ---

Contributor Author

The "Error Paths" scenarios will have to be skipped for now: the skill tools do fail and produce a result, but the result is of a different type, which the response-building code silently discards.

Below is the helpful part of my conversation with Claude about the issue.


Pydantic-ai catches ModelRetry and wraps the error in a RetryPromptPart (not a ToolReturnPart). The FunctionToolResultEvent.part is typed as ToolReturnPart | RetryPromptPart — it can be either.

Where LCS drops it:

In the non-streaming path, build_turn_summary_from_agent_run only processes ToolReturnPart:

query.py
Lines 266-269

        elif isinstance(message, ModelRequest):
            for request_part in message.parts:
                if isinstance(request_part, ToolReturnPart):
                    process_function_tool_result(state, request_part)

In the streaming path, the same filter exists:

streaming.py
Lines 522-524

    part = event.part
    if not isinstance(part, ToolReturnPart):
        return None

Both paths explicitly ignore RetryPromptPart, so the retry/error message for load_skill is never surfaced as a tool_result in the API response.

The result:

  • Both tool calls appear (because both ToolCallPart instances from the ModelResponse are processed)
  • Only the list_skills result appears (because it succeeded and produced a ToolReturnPart)
  • The load_skill result is missing (because it raised ModelRetry → became a RetryPromptPart → silently dropped)
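
The effect of that filter can be reproduced with a small sketch. The two dataclasses below are hypothetical stand-ins for the pydantic-ai part types (only a few fields are modelled), and `collect_tool_results`, the `include_retries` flag, and the `"failure"` status are assumptions for illustration, not LCS code.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the pydantic-ai part types named above; only the
# fields needed for this illustration are modelled.
@dataclass
class ToolReturnPart:
    tool_name: str
    tool_call_id: str
    content: str


@dataclass
class RetryPromptPart:
    tool_name: str
    tool_call_id: str
    content: str


def collect_tool_results(parts, include_retries=False):
    """Build tool_result entries; optionally keep retry prompts as failures."""
    results = []
    for part in parts:
        if isinstance(part, ToolReturnPart):
            status = "success"
        elif include_retries and isinstance(part, RetryPromptPart):
            status = "failure"  # assumed status value, not confirmed by LCS
        else:
            continue  # mirrors the current filter: the part is dropped silently
        results.append(
            {"id": part.tool_call_id, "name": part.tool_name, "status": status}
        )
    return results


parts = [
    ToolReturnPart("list_skills", "call_1", "{}"),
    RetryPromptPart("load_skill", "call_2", "unknown skill"),
]
print(collect_tool_results(parts))                        # load_skill is absent
print(collect_tool_results(parts, include_retries=True))  # load_skill is surfaced
```

With the default filter only the `list_skills` entry survives, which matches the missing `load_skill` result described above.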

]
"""

# --- Full progressive disclosure flow ---

Contributor Author

This will likely be quite flaky: the LLM is given only the "names" of the skills (appended to the system prompt, I think), so it will sometimes call just load_skill and read_skill_resource and skip list_skills entirely.

@jrobertboos jrobertboos force-pushed the lcore-2080 branch 3 times, most recently from bd2b990 to 159e8ae Compare June 25, 2026 13:43
@jrobertboos

Contributor Author

Please Review:

@jrobertboos jrobertboos marked this pull request as ready for review June 25, 2026 15:37

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml`:
- Around line 24-26: The skills discovery config is using a relative path in the
`skills.paths` entry, which can break startup when the working directory
changes. Update the YAML to use the absolute mounted path expected by the stack,
and keep the change localized to the `skills` block in
`lightspeed-stack-skills-directory.yaml` so startup consistently finds the
skills directory.

In `@tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml`:
- Around line 24-26: The skills path in the stack config is CWD-sensitive and
should be pinned to the mounted absolute location instead. Update the
`skills.paths` entry in the YAML so it points to `/app-root/skills/echo` rather
than the relative `skills/echo`, keeping the `skills` configuration
deterministic under the compose mount.

In `@tests/e2e/features/steps/common_http.py`:
- Around line 334-335: The expected JSON in the step implementation still parses
context.text directly, so placeholder tokens like {MODEL} are not substituted
before validation. Update the relevant step in common_http.py to apply the same
placeholder resolution used by the existing partial-body handling before calling
json.loads and validate_json_partially. Keep the fix localized to the step that
consumes context.text and ensure the parsed expected_value reflects substituted
placeholders first.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2beba6a7-1aff-4350-92f7-60524e66a1c4

📥 Commits

Reviewing files that changed from the base of the PR and between 890a6f7 and 1f11ea7.

📜 Review details
⏰ Context from checks skipped due to timeout. (2)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
🧰 Additional context used
📓 Path-based instructions (2)
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
tests/e2e/**/*.{py,feature}

📄 CodeRabbit inference engine (AGENTS.md)

Use behave (BDD) framework for end-to-end testing with Gherkin feature files

Files:

  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
  • tests/e2e/features/skills.feature
🧠 Learnings (4)
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

  • docker-compose-library.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-skills-directory.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-skills.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml
  • docker-compose.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml
📚 Learning: 2026-04-07T09:20:26.590Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.

Applied to files:

  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
📚 Learning: 2026-04-13T13:39:54.963Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.

Applied to files:

  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
📚 Learning: 2026-06-24T13:45:37.249Z
Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

  • tests/e2e/features/steps/common_http.py
  • tests/e2e/features/steps/llm_query_response.py
🪛 LanguageTool
tests/e2e/skills/echo/SKILL.md

[style] ~17-~17: Using “back” with the verb “return” may be redundant.
Context: ...r's input text 2. Return the exact text back to the user without modification For f...

(RETURN_BACK)

🔇 Additional comments (8)
docker-compose-library.yaml (1)

23-23: LGTM!

docker-compose.yaml (1)

90-90: LGTM!

tests/e2e/skills/echo/SKILL.md (1)

1-19: LGTM!

tests/e2e/skills/echo/references/guide.md (1)

1-20: LGTM!

tests/e2e/skills/summarize/SKILL.md (1)

1-22: LGTM!

tests/e2e/skills/summarize/references/guide.md (1)

1-21: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills-directory.yaml (1)

1-26: LGTM!

tests/e2e/configuration/library-mode/lightspeed-stack-skills.yaml (1)

1-26: LGTM!

Comment on lines +24 to +26
skills:
  paths:
    - skills

Contributor

🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Use absolute skills path to avoid CWD-dependent startup failures.

skills is relative; if the service working directory changes, skills discovery can fail at startup. Use /app-root/skills to match the compose mount explicitly.

Proposed change
 skills:
   paths:
-    - skills
+    - /app-root/skills
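
Why the relative entry is CWD-sensitive can be shown with a short sketch. The helper `resolve_skills_path` is hypothetical and only mimics ordinary relative-path resolution; how the stack itself resolves `skills.paths` is not shown in this PR.

```python
from pathlib import PurePosixPath


def resolve_skills_path(entry: str, cwd: str) -> str:
    """Resolve a configured path the way a relative path resolves against CWD."""
    path = PurePosixPath(entry)
    if path.is_absolute():
        return str(path)  # absolute entries ignore the working directory
    return str(PurePosixPath(cwd) / path)


# The relative entry follows the working directory...
print(resolve_skills_path("skills", "/app-root"))  # /app-root/skills
print(resolve_skills_path("skills", "/tmp"))       # /tmp/skills
# ...while the absolute entry always points at the compose mount.
print(resolve_skills_path("/app-root/skills", "/tmp"))  # /app-root/skills
```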

Comment on lines +24 to +26
skills:
  paths:
    - skills/echo

Contributor

🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Pin the skill path to the mounted absolute location.

skills/echo is CWD-sensitive. Prefer /app-root/skills/echo for deterministic resolution against the compose mount.

Proposed change
 skills:
   paths:
-    - skills/echo
+    - /app-root/skills/echo

Comment on lines +334 to +335
expected_value = json.loads(context.text)
validate_json_partially(actual_value, expected_value)

Contributor

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Apply placeholder substitution before parsing expected JSON.

At Line 334, this step parses context.text directly, so placeholders like {MODEL} won’t be resolved here (unlike the existing partial-body step). That can cause false failures in scenario assertions.

Proposed fix
-    expected_value = json.loads(context.text)
+    json_str = replace_placeholders(context, context.text)
+    expected_value = json.loads(json_str)
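
The false failure is easy to reproduce in isolation. In the sketch below, `substitute_placeholders` and `matches_partially` are hypothetical stand-ins for the repository's `replace_placeholders` and `validate_json_partially`, whose real signatures and behaviour may differ.

```python
import json


def substitute_placeholders(text: str, values: dict) -> str:
    """Replace each {NAME} token with its value (hypothetical helper)."""
    for name, value in values.items():
        text = text.replace("{" + name + "}", value)
    return text


def matches_partially(actual, expected) -> bool:
    """Return True when every key in expected is present and equal in actual."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and matches_partially(actual[key], value)
            for key, value in expected.items()
        )
    return actual == expected


step_text = '{"model": "{MODEL}"}'
actual = {"model": "gpt-4o-mini", "provider": "openai"}
values = {"MODEL": "gpt-4o-mini"}

# Parsing first keeps the literal "{MODEL}" string, so the comparison fails.
raw = json.loads(step_text)
# Substituting first yields the concrete model name, so the comparison passes.
resolved = json.loads(substitute_placeholders(step_text, values))
print(matches_partially(actual, raw), matches_partially(actual, resolved))
```

Note that the unsubstituted text is still valid JSON, so `json.loads` raises nothing; the problem only shows up as a mismatch during validation.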

@anik120 anik120 left a comment

Contributor

PS: squashing commits so that a PR has a single commit is the hygienic thing to do (unless multiple commits are by design, in which case the question becomes "why aren't they multiple PRs instead?").

Otherwise they show up as

"fix"

"fix"

"address code rabbit"

when someone is searching through git history trying to figure out what changes were made.

Here's an article I highly recommend reading https://medium.com/@madhav2002/git-hygiene-commits-branching-and-rewriting-history-bc6dee5f953f

refined E2E tests for skills and added necessary step implementations.

close: LCORE-2080

@radofuchs radofuchs left a comment

Contributor

LGTM overall, just a few details.


actual_value = response_body[field]

if not context.text:

Contributor

Use an assert for this.



@then("The response is the last streamed fragment")
def response_is_last_streamed_fragment(context: Context) -> None:

Contributor

This logic is already in the "wait for response to be completed" step. If you need use_streaming_response_data, set it there.

asimurka commented Jun 26, 2026

Contributor

Just a conceptual question: Is the skill invocation really so strict that when you prompt to run a non-existing skill, the LLM really tries to execute it and ends up with failure?

jrobertboos commented Jun 26, 2026

Contributor Author

@asimurka when u prompt the LLM to use a skill, if u are direct enough, it will try to use the load_skill tool with the highlighted skill. e.g. this is what it looks like right now:

INPUT

curl -X 'POST' \
  'http://localhost:8080/v1/query' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "gpt-4o-mini",
  "provider": "openai",
  "query": "load the skill '\''non-existent'\''."
}'

OUTPUT

{
  "conversation_id": "eb995e1ee43557e33d6c43feacf47a4afc73565ba7478294",
  "response": "It appears that there are currently no available skills to load. Please let me know if you need assistance with something else!",
  "rag_chunks": [],
  "referenced_documents": [],
  "truncated": false,
  "input_tokens": 2290,
  "output_tokens": 52,
  "available_quotas": {},
  "tool_calls": [
    {
      "id": "call_51npMnMSenv6Qnp7encji746",
      "name": "load_skill",
      "args": {
        "skill_name": "non-existent"
      },
      "type": "function_call"
    },
    {
      "id": "call_retBuV3RnzjkfsR8ltVNLoq3",
      "name": "list_skills",
      "args": {},
      "type": "function_call"
    }
  ],
  "tool_results": [
    {
      "id": "call_retBuV3RnzjkfsR8ltVNLoq3",
      "status": "success",
      "content": "{}",
      "type": "function_call_output",
      "round": 1
    }
  ]
}

Does that answer your question?
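
The gap in the output above (two tool calls, one tool result) can also be checked mechanically. This is a minimal sketch: `calls_without_results` is a hypothetical helper, and the dictionary is a trimmed copy of the response shown above.

```python
def calls_without_results(response: dict) -> list:
    """Return the names of tool calls that have no matching tool_result id."""
    answered = {result["id"] for result in response.get("tool_results", [])}
    return [
        call["name"]
        for call in response.get("tool_calls", [])
        if call["id"] not in answered
    ]


# Trimmed copy of the response above; unrelated fields are omitted.
response = {
    "tool_calls": [
        {"id": "call_51npMnMSenv6Qnp7encji746", "name": "load_skill"},
        {"id": "call_retBuV3RnzjkfsR8ltVNLoq3", "name": "list_skills"},
    ],
    "tool_results": [
        {"id": "call_retBuV3RnzjkfsR8ltVNLoq3", "status": "success"},
    ],
}
print(calls_without_results(response))  # ['load_skill']
```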

asimurka commented Jun 26, 2026

Contributor

Is it possible that this is just model-specific behavior? I don't think you should be able to influence the model's behavior like this (with a bare prompt).

4 participants