High-level guidelines for ensuring consistent and correct data in reports #
A brief overview of the report generation process #
Report generation is handled mainly by two independent components that use LLMs:
- report generator LLM — the top-level LLM that gathers relevant data via available tools, composes the email content, and sends it.
- query generator tool — a tool that runs an LLM to generate the queries for the semantic DB and returns the results to the report generator LLM.
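For orientation, here is a rough conceptual sketch of how the two components hand off work. Every name in it is a hypothetical stand-in that only illustrates the control flow described above, not the actual implementation.
// Conceptual sketch only — all names here are hypothetical stand-ins.
async function llmGenerateQuery(instruction) {
  // The query generator LLM turns a natural-language instruction into a semantic-DB query.
  return "-- query derived from: " + instruction;
}

async function runAgainstSemanticDb(query) {
  // Placeholder for executing the query against the semantic DB and returning rows.
  return [];
}

async function queryGeneratorTool(instruction) {
  // query generator tool: generate the query, run it, return the results.
  const query = await llmGenerateQuery(instruction);
  return runAgainstSemanticDb(query);
}

async function generateReport(reportConfig) {
  // report generator LLM: gather data via tools, then compose the email content.
  const results = await queryGeneratorTool("support ticket totals per account, last quarter");
  return { subject: reportConfig.subject, body: "metrics composed from " + results.length + " rows" };
}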
Steps taken to fix the report #
- Inspect the LLM trace for a report to identify what is going wrong. One or more of the following may be the issue:
- The top-level LLM is giving insufficient or incorrect instructions to the query generator.
- The query generator is unable to generate the correct query even if the instruction is accurate.
- The required data is not present in the semantic DB.
- If the issue is with the top-level LLM not prompting the query generator correctly, add or refine instructions for this LLM via the report’s system prompt (see the next section for how to do this).
- For additional instructions to the query generator, provide exact queries or guidelines via the workspace default query_generator_additional_instructions.
In this specific case (fixing the weekly, monthly, and quarterly ticket metrics report) the following changes were made:
- Updated the report’s system prompt to include explicit instructions for the top-level LLM such as: generate queries separately for each table in the email, avoid exposing internal details, and how to search for the requested account.
- Because queries were still not always correct, added additional instructions to the query generator via workspace defaults.
- Added simple, structured example queries in the workspace defaults. The examples:
- generate relevant week/month/quarter ranges,
- select filtered accounts based on requirements,
- combine results to produce the desired metrics.
- In some cases, adding an example query was not enough. Explicit guidelines were provided alongside the query (this was required for the quarterly report data); a sketch of how such an example-plus-guidelines block might look follows this list.
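For illustration only, the workspace-default instructions might be structured along these lines. The table name, column names, and SQL dialect below are hypothetical; the actual semantic DB schema and query syntax may differ.
const query_generator_additional_instructions = `
Guidelines:
- Always compute the relevant date range first (current week, month, or quarter) and reuse it in every query.
- Filter to the requested accounts before aggregating; do not aggregate over all accounts.
- Combine the per-range results into the final metrics instead of issuing one large query.

Example query (hypothetical schema):
SELECT account_name, COUNT(*) AS total_tickets
FROM tickets
WHERE created_at >= DATE_TRUNC('quarter', CURRENT_DATE)
GROUP BY account_name;
`;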
Updating a report’s system prompt involves two steps:
- GET the report’s current configuration.
- Then make a PUT request that adds the system prompt alongside the existing configuration.
Use the following fetch command to get a specific report’s current configuration #
await fetch("/api/reports/{report_id}", {
method: "GET"
});
Command for adding the system prompt in the report config #
Copy only the required fields, name and config, from the report configuration. The system_prompt must be placed inside config.email.
await fetch("/api/reports/{report_id}", {
body: JSON.stringify({
"name": "...",
"config": {
"schedule": {
"type": "...",
"expr": "..."
},
"email": {
"system_prompt": "Add your system prompt here",
"prompt": "...",
"subject": "...",
"recipients": [..., ...]
}
}
}),
method: "PUT"
});
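Putting the two requests together: a minimal sketch, assuming the GET response returns the report object as JSON with name and config at the top level (verify the exact response shape before relying on it).
// Sketch only — assumes the GET response body is the report object itself.
const res = await fetch("/api/reports/{report_id}", { method: "GET" });
const report = await res.json();

// Keep only the required fields and add the system prompt under config.email.
const updated = {
  name: report.name,
  config: {
    ...report.config,
    email: {
      ...report.config.email,
      system_prompt: "Add your system prompt here"
    }
  }
};

await fetch("/api/reports/{report_id}", {
  body: JSON.stringify(updated),
  method: "PUT"
});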
Use the following fetch command to verify whether the query generator tool produces the correct query:
await fetch("/api/v2/chatbot/debug/tools/query_semantic_db/run", {
body: JSON.stringify({ "prompt": "what's the total support tickets for each accounts?" }),
method: "POST"
});
Iterate on the prompt to find what works best for the query generator.
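If it helps, this iteration can be scripted. The sketch below simply loops over candidate prompts and prints whatever the debug endpoint returns; no particular response shape is assumed.
// Sketch: try several candidate prompts against the debug endpoint and inspect the output.
const candidatePrompts = [
  "what's the total support tickets for each accounts?",
  "total support tickets per account for the current quarter"
];

for (const prompt of candidatePrompts) {
  const res = await fetch("/api/v2/chatbot/debug/tools/query_semantic_db/run", {
    body: JSON.stringify({ "prompt": prompt }),
    method: "POST"
  });
  // Log the raw response; adjust the parsing once the exact response shape is known.
  console.log(prompt, await res.text());
}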
For more detailed testing of the query generator — where you can modify both the input prompt and the system prompt — follow the verification process below.
The report data had inconsistencies primarily because the query generator tool could not form the correct queries required for the data.
To validate and fix the tool, the main idea was to provide example queries for each requirement and verify the tool’s full iteration in the OpenAI playground. The verification process used was:
- Select the model (gpt-4.1-mini), set temperature to 0, and add the query_db tool.
- Add the system prompt (the project’s query generator system prompt) together with the Aviz-specific workspace prompt for query generation, combined with the new query examples.
- Add the user message using the real LLM-generated user prompt format. To emulate an actual use case, pick the LLM-generated user prompt from an existing report run’s LLM trace.
- Run the prompt in the playground. The LLM will call query_db multiple times because it must inspect the semantic DB data. You must provide correct tool responses (the actual results from the DB) rather than letting the playground auto-complete them. Without correct ticket columns and results, the LLM cannot form the correct query.
- If a similar query exists in a previous report run’s LLM trace, copy and reuse that tool response.
- Otherwise, use the /internal/connection-query endpoint to run the requested query and obtain the actual results (see the sketch after this list).
- The LLM in the playground will process tool results, think, and call query_db additional times until it finalizes its query.
- The playground simulation will behave almost exactly like our state machine for the query generator.
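A rough sketch of fetching real results to paste back as tool responses: the request body shown here (a JSON object with a query field) is an assumption, so check the actual contract of /internal/connection-query before using it.
// Assumption: the endpoint accepts a POST with a JSON body containing the query text.
const res = await fetch("/internal/connection-query", {
  body: JSON.stringify({ "query": "SELECT ..." }), // hypothetical body shape
  method: "POST"
});
// Paste the returned rows into the playground as the query_db tool response.
console.log(await res.text());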
Notes:
- Providing accurate tool responses in the playground is essential; otherwise the LLM will not have the real DB schema or results needed to build correct queries.
- Keep example queries simple and structured; they should demonstrate how to compute date ranges (week/month/quarter), how to filter accounts, and how to combine results into final metrics.