July 18

A typical short prompt...

This is how I program agents with Shotgun. Below is an example of a user-written prompt fragment that goes into the system prompt, is then copied into the LLM together with the repository files, and ultimately produces a code diff to apply.
====

I want to add a dispatcher agent for user messages.

Its functionality:
— During the main loop, the orchestrator listens to what the user writes in the chat.
— If the user's message was solicited by a tool call from another agent (not the dispatcher), the reply is passed directly to that agent (existing behavior).
— If the message is user-initiated, with no preceding tool call requesting information from the user, it is passed to the Dispatcher.
— The Dispatcher's job is to:
a) understand the user's request;
b) reply to the user + call one of the other agents, forwarding the user's request to it (each agent's prompt template gets a user-request section at the top, wrapped in an "if" — i.e., the section is active only when the agent is invoked with a "message from user" input);
c) reply to the user + request additional information from them.

The uninterrupted request-response series with the dispatcher — starting from the user's first message and including the dispatcher's replies whenever it asks the user for more details — accumulates in a "dialogue" variable and is fed as conversation context to the other agents if such a dialogue ends in a tool call to another agent.

The dialogue is carried through the chain of agent calls from the start to the end of the cycle (plan-do-check-act). Once act completes, the system's current dialogue is reset.

The dispatcher works both while other agents are running and while the session is paused.
Since requests to the other agents are synchronous and happen within a single cycle, while messages to/from the dispatcher are asynchronous relative to them, we suppress the delivery of already-awaited responses to the other agents once the dispatcher detects a user message.

An example: the PLAN agent has sent a request to the LLM and is waiting for a response. At that moment the user writes something, and the orchestrator decides the message is for the dispatcher. We must then abort PLAN's wait for the LLM response, hand control to the dispatcher, and send a message to the LLM on its behalf — thereby restoring the synchrony of the cycle.
The dispatcher receives the message and decides what to do. If the dispatcher in turn calls the PLAN agent afterwards, the dispatcher-user dialogue is passed into the plan template, and likewise into do and check, until act fires, the dialogue is reset, and a new cycle begins.

Here is the architect's proposal. I like it. Implement it.
===
# Refactoring Plan: Interactive Message Dispatcher Agent

## 1. Executive Summary & Goals

This plan outlines the design and implementation of an interactive "Message Dispatcher Agent" to handle unsolicited user messages during an active analysis session. The primary objective is to enhance user interactivity, allowing them to steer or interrupt the main Plan-Do-Check-Act (PDCA) agent cycle with new instructions or questions.

**Key Goals:**

1. **Implement the Dispatcher Agent:** Create a new agent responsible for interpreting user-initiated messages and deciding whether to ask for clarification or delegate the request to the main PDCA cycle.
2. **Enable Asynchronous Interruption:** Modify the core orchestrator to allow user messages to interrupt the current synchronous agent task (e.g., waiting for an LLM response), passing control to the Dispatcher.
3. **Introduce "Dialogue" Context:** Establish a persistent "dialogue" context that captures the conversation with the Dispatcher and is passed through the entire PDCA cycle, ensuring subsequent agents are aware of the user's ad-hoc request. This context is cleared upon cycle completion.

## 2. Current Situation Analysis

The existing system is designed around a synchronous, sequential PDCA (Plan-Do-Check-Act) agent cycle orchestrated by the `AgentExecutor`. User interaction is limited to a request-response model, where the user can only send a message when an agent explicitly asks a question using the `ask_user` tool, pausing the session.

**Key Limitations to Address:**

* **Lack of Proactive Control:** The user cannot interrupt or guide the agent's process once a cycle has started. They must wait for the agent to complete its current task or ask a question.
* **Rigid Control Flow:** The `AgentExecutor` in `backend/app/worker/agent_executor.py` operates in a strictly sequential loop, executing one step at a time. It lacks a mechanism to handle out-of-band, asynchronous events like a user message.
* **Stateless Interaction Context:** The current `transient_dialog` in `AgentSession` is designed for a single question-and-answer exchange. There is no mechanism to persist a longer, multi-turn "dialogue" with a user across an entire PDCA cycle.

## 3. Proposed Solution / Refactoring Strategy

### 3.1. High-Level Design / Architectural Overview

We will introduce a Dispatcher agent and modify the `AgentExecutor` to handle asynchronous user messages. The core idea is to transform the main agent loop from a simple sequential execution into an event-driven one, where the orchestrator can react to either the completion of an agent step or a user interruption.

**New Control Flow Diagram:**

```mermaid
graph TD
    subgraph "User"
        A[User sends message]
    end

    subgraph "FastAPI WebSocket Gateway"
        B[analysis.py]
        A --> B
    end

    subgraph "Redis Pub/Sub"
        C["Channel: user-interrupt:{session_id}"]
        B --> C
    end

    subgraph "Arq Worker: AgentExecutor"
        D{Main Loop}
        E[Run LLM for Current Step]
        F[Listen for User Interrupt]
        G[Dispatcher Agent Logic]
        H[PDCA Agent Logic]

        D -- PDCA cycle continues --> E
        D -- Listens concurrently --> F
        C -- Interrupt signal --> F
        F -- Cancels --> E
        F -- Triggers --> G
        G -- Formulates response/delegation --> H
        E -- Completes --> H
        H -- Updates state --> D
    end

    style F fill:#f9f,stroke:#333,stroke-width:2px
    style G fill:#f9f,stroke:#333,stroke-width:2px
```

**Key Architectural Changes:**

1. **Interruptible Worker:** The `AgentExecutor`'s main step execution will be wrapped in an `asyncio.wait` call. This will allow it to concurrently await the LLM response for the current PDCA step *and* listen for an interrupt signal from a Redis Pub/Sub channel.
2. **Dispatcher Logic:** If a user interrupt is received, the ongoing LLM task is cancelled. Control is transferred to a new `_run_dispatcher_step` method, which invokes the Dispatcher agent.
3. **Dialogue Context State:** A new field, `dispatcher_dialogue`, will be added to the `AgentSession` model. This field will store the chain of messages between the user and the Dispatcher. It will be populated by the Dispatcher and cleared by the `ACT` agent at the end of a cycle.

### 3.2. Key Components / Modules

1. **Dispatcher Agent (New)**
* **Responsibility:** A new LLM-driven agent defined by a `dispatcher.md` prompt. Its purpose is to analyze a user's message and decide to either:
a. `clarify`: Ask the user for more information.
b. `delegate`: Formulate a task for the `PLAN` agent and pass control back to the main PDCA cycle.
* This is a *logical* agent; it will be implemented as a new method within `AgentExecutor`, not a new class.

2. **`AgentExecutor` (`backend/app/worker/agent_executor.py`) (Heavy Modification)**
* **Responsibility:** The orchestrator will be refactored to handle the interruptible logic. The `_run_step` method will manage the concurrent waiting for LLM completion and user interrupts. It will contain the new `_run_dispatcher_step` method.

3. **`AgentSession` (`backend/app/models/agent.py`) (Modification)**
* **Responsibility:** The central state document. It will be extended to store the new `dispatcher_dialogue`.

4. **WebSocket Gateway (`backend/app/api/analysis.py`) (Modification)**
* **Responsibility:** The client-facing message handler. It will be updated to distinguish between a `user_reply` (to a specific question) and a `user_initiative` (a new, unsolicited message) and publish interrupts to the appropriate Redis channel.

5. **PDCA Prompts (`backend/app/prompts/PDAC/*.md`) (Modification)**
* **Responsibility:** All four prompts (`plan.md`, `do.md`, `check.md`, `act.md`) will be updated to include an optional context block for the `dispatcher_dialogue`.

### 3.3. Detailed Action Plan / Phases

---

#### **Phase 1: Foundation & Data Model**

* **Objective(s):** Lay the groundwork for the new functionality by updating data structures and creating the new agent's prompt.
* **Priority:** High

* **Task 1.1: Extend `AgentSession` Model**
* **Rationale/Goal:** Add a new field to store the persistent dialogue context for the PDCA cycle.
* **Estimated Effort:** S
* **Deliverable/Criteria for Completion:** In `backend/app/models/agent.py`, `AgentSession` and `AgentSessionUpdate` models are updated with a new field: `dispatcher_dialogue: Optional[List[ChatMessage]] = None`.

* **Task 1.2: Create `dispatcher.md` Prompt**
* **Rationale/Goal:** Define the logic, inputs, and outputs for the new Dispatcher agent.
* **Estimated Effort:** M
* **Deliverable/Criteria for Completion:** A new file `backend/app/prompts/dispatcher.md` is created. It instructs the LLM to analyze a user message and decide between two actions: `clarify_with_user(question: str)` or `delegate_to_planner(user_request_summary: str)`.

* **Task 1.3: Update `AgentService`**
* **Rationale/Goal:** Ensure the service layer can handle updates to the new `dispatcher_dialogue` field.
* **Estimated Effort:** S
* **Deliverable/Criteria for Completion:** No code changes are likely needed as `update_session` using `AgentSessionUpdate` is generic, but a manual review confirms that passing `dispatcher_dialogue` in the update payload works as expected.

---

#### **Phase 2: Backend - Interruption and Dispatch Logic**

* **Objective(s):** Implement the core mechanism for handling asynchronous user messages and dispatching them.
* **Priority:** High

* **Task 2.1: Modify WebSocket Gateway**
* **Rationale/Goal:** Differentiate between solicited replies and new user initiatives.
* **Estimated Effort:** M
* **Deliverable/Criteria for Completion:**
* In `frontend/src/views/AgentChatView.vue`, the message input form is always visible. When the user sends a message not in reply to a direct question, it is sent with `type: "user_initiative"`.
* In `backend/app/api/analysis.py`, the `client_reader` function now handles `user_initiative` messages.
* Upon receiving a `user_initiative`, the handler adds the message to a new field in the `AgentSession` document (e.g., `pending_user_message: ChatMessage`) and publishes the `session_id` to a new Redis channel: `user-interrupt:{session_id}`.
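How the gateway side might branch on the message type and publish the interrupt can be sketched as follows; the handler shape and the `publish`/`save_pending` callables are illustrative assumptions, not the actual `client_reader` signature:

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

async def handle_client_message(
    session_id: str,
    raw: str,
    publish: Callable[[str, str], Awaitable[Any]],
    save_pending: Callable[[str, dict], Awaitable[Any]],
) -> str:
    msg = json.loads(raw)
    if msg.get("type") == "user_initiative":
        # Persist the message first so the worker can pick it up
        # after reacting to the interrupt signal.
        await save_pending(session_id, {"role": "user", "content": msg["content"]})
        await publish(f"user-interrupt:{session_id}", session_id)
        return "interrupt_published"
    return "user_reply"  # existing solicited-reply path, unchanged

# In-memory fakes standing in for Redis and MongoDB.
published: list = []
pending: dict = {}

async def fake_publish(channel: str, payload: str) -> None:
    published.append((channel, payload))

async def fake_save(session_id: str, message: dict) -> None:
    pending[session_id] = message

result = asyncio.run(handle_client_message(
    "s1",
    json.dumps({"type": "user_initiative", "content": "look at auth first"}),
    fake_publish,
    fake_save,
))
```

Persisting before publishing matters: the Pub/Sub signal carries only the `session_id`, so the worker must be able to find the pending message in the session document when it reacts.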

* **Task 2.2: Implement Interruptible `AgentExecutor`**
* **Rationale/Goal:** Refactor the main agent loop to be interruptible.
* **Estimated Effort:** L
* **Deliverable/Criteria for Completion:**
* The `AgentExecutor.run` method is refactored. Instead of a simple `await self._run_step()`, it now sets up a Redis Pub/Sub listener for the `user-interrupt:{session_id}` channel.
* The call to the LLM inside `_run_step` is wrapped in an `asyncio.Task`.
* `asyncio.wait` is used with `return_when=asyncio.FIRST_COMPLETED` to await both the LLM task and the interrupt listener task.
* If the interrupt listener finishes first, the LLM task is cancelled (`llm_task.cancel()`). The orchestrator then transitions to the dispatcher logic.
* If the LLM task finishes first, the orchestrator proceeds as normal.
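The concurrent wait described above can be sketched minimally, with the LLM call and the Redis listener replaced by stand-in coroutines (`call_llm` and `wait_for_interrupt` are illustrative names, not the codebase's):

```python
import asyncio

async def call_llm() -> str:
    # Stand-in for the real LLM request, which can take many seconds.
    await asyncio.sleep(0.2)
    return "llm-response"

async def wait_for_interrupt() -> str:
    # Stand-in for blocking on the user-interrupt:{session_id} Pub/Sub channel.
    await asyncio.sleep(0.05)
    return "user-interrupt"

async def run_step_interruptible() -> str:
    llm_task = asyncio.create_task(call_llm())
    interrupt_task = asyncio.create_task(wait_for_interrupt())
    done, _pending = await asyncio.wait(
        {llm_task, interrupt_task},
        return_when=asyncio.FIRST_COMPLETED,
    )
    if interrupt_task in done:
        # User spoke first: cancel the in-flight LLM call and hand
        # control to the dispatcher logic.
        llm_task.cancel()
        return "dispatch"
    # LLM finished first: drop the listener and continue the PDCA step.
    interrupt_task.cancel()
    return llm_task.result()

outcome = asyncio.run(run_step_interruptible())
```

Note that whichever task loses the race must be explicitly cancelled — `asyncio.wait` does not cancel pending tasks on its own, and a leaked listener task would keep the Redis subscription alive past the step.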

* **Task 2.3: Implement Dispatcher Step Logic**
* **Rationale/Goal:** Create the execution logic for the Dispatcher agent.
* **Estimated Effort:** M
* **Deliverable/Criteria for Completion:**
* A new private method `_run_dispatcher_step` is created in `AgentExecutor`.
* It retrieves the `pending_user_message`, clears it, and adds it to the `dispatcher_dialogue`.
* It calls the LLM using the `dispatcher.md` prompt.
* It parses the response:
* If `clarify_with_user`, it uses the `ask_user` tool and pauses the session. The user's reply will now be part of the `dispatcher_dialogue`.
* If `delegate_to_planner`, it sets the `current_step` to `PLAN` and allows the main loop to proceed to the next iteration, now with the `dispatcher_dialogue` populated.
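Assuming `dispatcher.md` instructs the LLM to answer with a single JSON object naming one of the two actions (the exact response format is an assumption here), the routing inside `_run_dispatcher_step` could look like:

```python
import json

def route_dispatcher_response(raw: str) -> tuple[str, str]:
    decision = json.loads(raw)
    action = decision["action"]
    if action == "clarify_with_user":
        # Pause the session via the ask_user tool; the reply joins the dialogue.
        return ("ask_user", decision["question"])
    if action == "delegate_to_planner":
        # Hand control back to the PDCA loop, starting at PLAN.
        return ("set_step_plan", decision["user_request_summary"])
    raise ValueError(f"unknown dispatcher action: {action}")

kind, payload = route_dispatcher_response(json.dumps({
    "action": "delegate_to_planner",
    "user_request_summary": "analyze the auth service first",
}))
```

Raising on an unknown action keeps a malformed LLM response from silently stalling the loop; the caller can retry or surface the error.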

---

#### **Phase 3: Context Integration and Cleanup**

* **Objective(s):** Ensure the new dialogue context is used by all agents and is properly cleared.
* **Priority:** Medium

* **Task 3.1: Update PDCA Prompts**
* **Rationale/Goal:** Make all agents in the main cycle aware of the user's ad-hoc request.
* **Estimated Effort:** M
* **Deliverable/Criteria for Completion:** The `plan.md`, `do.md`, `check.md`, and `act.md` prompt files are updated with a new optional section:
```jinja2
{% if dispatcher_dialogue %}
## USER INITIATED DIALOGUE
- A user has interrupted the process with the following request. This dialogue has the highest priority.
{{ dispatcher_dialogue }}
{% endif %}
```

* **Task 3.2: Update `_build_llm_context`**
* **Rationale/Goal:** Pass the new context variable to the prompt templates.
* **Estimated Effort:** S
* **Deliverable/Criteria for Completion:** The `_build_llm_context` method in `AgentExecutor` is updated to include `dispatcher_dialogue` from the session object in the context dictionary passed to Jinja2.

* **Task 3.3: Implement Dialogue Cleanup**
* **Rationale/Goal:** Ensure the dialogue context is reset after a cycle completes, preventing it from leaking into the next independent cycle.
* **Estimated Effort:** S
* **Deliverable/Criteria for Completion:** The `commit_state_update` tool function in `backend/app/worker/agent_toolbox.py` is modified. In addition to its current duties, it adds `dispatcher_dialogue=None` to the `AgentSessionUpdate` payload, effectively clearing the context upon successful completion of the `ACT` step.
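The cleanup amounts to one extra key in the update payload; a dict-based sketch for brevity (the real code builds an `AgentSessionUpdate`):

```python
def build_commit_update(existing_update: dict) -> dict:
    update = dict(existing_update)
    # Tying the reset to the ACT commit guarantees the next PDCA cycle
    # starts without the previous cycle's user dialogue.
    update["dispatcher_dialogue"] = None
    return update

update = build_commit_update({"status": "cycle_complete"})
```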

### 3.4. Data Model Changes

* **`backend/app/models/agent.py`:**
* The `AgentSession` and `AgentSessionUpdate` Pydantic models will be extended with two new optional fields:
```python
class AgentSession(BaseModel):
    # ... existing fields
    pending_user_message: Optional[ChatMessage] = None
    dispatcher_dialogue: Optional[List[ChatMessage]] = None

class AgentSessionUpdate(BaseModel):
    # ... existing fields
    pending_user_message: Optional[ChatMessage] = None
    dispatcher_dialogue: Optional[List[ChatMessage]] = None
```

### 3.5. API Design / Interface Changes

* **WebSocket Message Protocol:**
* The frontend will now send a new message type when the user initiates a conversation.
* **New Message from Client:**
```json
{
  "type": "user_initiative",
  "content": "Actually, before you continue, can you analyze the authentication service first?"
}
```
* The existing `user_reply` type will be reserved for answers to direct questions from an agent.

## 4. Key Considerations & Risk Mitigation

### 4.1. Technical Risks & Challenges

* **Complex Concurrency:** The `asyncio.wait` logic in the `AgentExecutor` is the most complex part of this plan. It must be carefully implemented to correctly handle task cancellation, avoid race conditions, and ensure the Redis listener is properly managed.
* **Mitigation:** Write thorough unit tests for the `AgentExecutor` that mock the LLM call and the Redis listener, testing scenarios where the LLM finishes first and where the interrupt happens first.
* **Wasted LLM Calls:** The interruption mechanism will lead to cancelled LLM API calls, which may still incur costs.
* **Mitigation:** This is an accepted trade-off for increased interactivity. The system should log these cancellations clearly to monitor frequency and potential cost impact.
* **State Management Bugs:** Incorrectly managing the `dispatcher_dialogue` (failing to clear it or populating it incorrectly) could lead to confused agent behavior in subsequent cycles.
* **Mitigation:** The cleanup logic in the `ACT` step is critical. Add integration tests that run a full PDCA cycle initiated by the dispatcher and verify that the dialogue is cleared at the end.
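The two race orderings named in the concurrency mitigation can be exercised deterministically by controlling the relative delays of stand-in coroutines — a test sketch, not the real executor:

```python
import asyncio

async def race(llm_delay: float, interrupt_delay: float) -> str:
    async def llm() -> str:
        await asyncio.sleep(llm_delay)
        return "llm"

    async def interrupt() -> str:
        await asyncio.sleep(interrupt_delay)
        return "interrupt"

    llm_task = asyncio.create_task(llm())
    intr_task = asyncio.create_task(interrupt())
    done, _ = await asyncio.wait(
        {llm_task, intr_task}, return_when=asyncio.FIRST_COMPLETED
    )
    winner = "interrupt" if intr_task in done else "llm"
    # Clean up whichever task lost the race.
    for task in (llm_task, intr_task):
        task.cancel()
    return winner

# Interrupt arrives while the "LLM" is still thinking.
interrupt_first = asyncio.run(race(llm_delay=0.2, interrupt_delay=0.01))
# LLM completes before any user message shows up.
llm_first = asyncio.run(race(llm_delay=0.01, interrupt_delay=0.2))
```

Real unit tests would swap the sleeps for mocked LLM and Redis-listener coroutines, but the assertion structure stays the same.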

### 4.2. Dependencies

* **Phase 2 (Backend)** is a prerequisite for all subsequent phases.
* **Frontend changes** depend on the backend API and WebSocket modifications being complete.

### 4.3. Non-Functional Requirements (NFRs) Addressed

* **Usability:** Significantly improved. The user is promoted from a passive observer to an active participant who can guide the agent system.
* **Responsiveness:** The system can now react to user input in near real-time instead of waiting for a full task or cycle to complete.
* **Maintainability:** While the `AgentExecutor` becomes more complex, the logic is centralized. The Dispatcher's concerns are isolated to its own prompt and execution step.

## 5. Success Metrics / Validation Criteria

* **Qualitative:** The system feels more responsive and "alive" from a user's perspective. The user can successfully redirect the agent's focus mid-cycle.
* **Functional:**
* Sending a message while an agent is "thinking" (awaiting LLM) successfully interrupts the current task.
* The Dispatcher agent correctly receives the user's message.
* The Dispatcher's response (clarification or delegation) is executed correctly.
* The `dispatcher_dialogue` context is correctly passed to the `PLAN`, `DO`, `CHECK`, and `ACT` agents.
* The `dispatcher_dialogue` field in the `AgentSession` is set to `null` after the `ACT` step completes a cycle.

## 6. Assumptions Made

* The existing `arq` worker setup is capable of handling the `asyncio` and `redis` libraries required for the interruption mechanism.
* The latency of the Redis Pub/Sub messaging is low enough to be perceived as a near-instantaneous interrupt.
* The trade-off of potentially wasting LLM API calls for the sake of interactivity is acceptable.

## 7. Open Questions / Areas for Further Investigation

* **Interrupt Granularity:** The current plan interrupts at the LLM-call level. Should we consider other interrupt points, for example, during a long-running tool execution? (For now, the LLM call is the most significant and logical point of interruption).
* **Dispatcher Capabilities:** Should the Dispatcher be able to call other tools directly, or is delegating to the `PLAN` agent sufficient? (Delegating to `PLAN` is cleaner and reuses the existing, robust PDCA flow. We will start with this approach).