Agentic Execution
Agentic execution is a first-class concept distinct from the conversation-based agent flow. Where conversations are interactive exchanges between the user and the agent, executions are autonomous, programmatic batch jobs: an execution is submitted via the API, the LLM agent runs to completion without user interaction, and the caller inspects the result afterward.
Each execution is tightly coupled to a configured Agent instance, which already carries the dataset association, system prompt, and tool configuration needed to run.
Core Abstractions
ExecutionStatus
Enum tracking the lifecycle of an execution:
- PENDING: Execution has been created; the Celery task has not started yet
- IN_PROGRESS: The Celery task is actively running the agent
- FINISHED: The execution completed successfully
- FAILED: The execution ended with an error or validation failure
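A caller that inspects results afterward typically needs to know when an execution has stopped changing state. The sketch below is illustrative only: the string values, the `str, Enum` base, and the `is_terminal` helper are assumptions, not the real definition.

```python
from enum import Enum


class ExecutionStatus(str, Enum):
    # Member names come from the list above; the string values are assumed.
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    FINISHED = "finished"
    FAILED = "failed"


# FINISHED and FAILED both describe an execution that has ended.
TERMINAL_STATUSES = frozenset({ExecutionStatus.FINISHED, ExecutionStatus.FAILED})


def is_terminal(status: ExecutionStatus) -> bool:
    """Hypothetical helper: True once the execution will no longer change state."""
    return status in TERMINAL_STATUSES
```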
ExecutionResult
Dataclass returned by BaseAgenticExecutionDefinition.run(). Communicates the full outcome without raising exceptions.
- success (bool): Whether the execution completed successfully
- output (dict): Structured output payload; meaningful only when success=True
- failure_code (ExecutionFailureCode | None): Standardised failure code, set when success=False
- failure_summary (str | None): LLM-generated plain-language explanation of what went wrong
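As a rough illustration, the fields above could map onto a dataclass like the one below, with the caller branching on `success` rather than catching exceptions. This is a sketch rather than the actual definition: the field defaults, the enum member names, the `describe` helper, and the use of `str, Enum` in place of StrEnum are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ExecutionFailureCode(str, Enum):
    # Stand-in enum; only the two codes named in this document are listed.
    VALIDATION_FAILED = "validation_failed"
    MAX_RETRIES_EXCEEDED = "max_retries_exceeded"


@dataclass
class ExecutionResult:
    success: bool
    output: dict = field(default_factory=dict)
    failure_code: Optional[ExecutionFailureCode] = None
    failure_summary: Optional[str] = None


def describe(result: ExecutionResult) -> str:
    """Hypothetical caller-side handling that never needs a try/except."""
    if result.success:
        return f"ok: {sorted(result.output)}"
    return f"{result.failure_code.value}: {result.failure_summary}"


ok = ExecutionResult(success=True, output={"rows": 3})
failed = ExecutionResult(
    success=False,
    failure_code=ExecutionFailureCode.VALIDATION_FAILED,
    failure_summary="Response was not valid JSON.",
)
```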
ExecutionFailureCode
StrEnum defining standardised failure codes. Plugins can extend this enum to add execution-specific codes and point FAILURE_CODES at the subclass.
ExecutionInputType
Pydantic BaseModel subclassed per plugin to declare and validate structured execution input. The server derives the JSON Schema from it for API responses and request validation.
ToolScratchpad
A dict-based store that tools write to and validators read from during a single execution attempt. One instance is created per execution conversation. It is cleared before each retry so validators always see only results from the current attempt.
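The write, read, and clear-before-retry behaviour described above can be sketched as follows. The method names (`write`, `read`, `clear`) and the `rows_written` key are hypothetical; only the dict-backed storage and the clearing between attempts come from the description.

```python
from typing import Any, Dict, Optional


class ToolScratchpad:
    """Illustrative dict-backed scratchpad; the real interface may differ."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def write(self, key: str, value: Any) -> None:
        # Called by a tool to record a result for this attempt.
        self._data[key] = value

    def read(self, key: str, default: Optional[Any] = None) -> Any:
        # Called by a validator to inspect what tools recorded.
        return self._data.get(key, default)

    def clear(self) -> None:
        # Called before a retry so stale results are not visible.
        self._data.clear()


scratchpad = ToolScratchpad()

# Attempt 1: a tool records a value and a validator reads it.
scratchpad.write("rows_written", 0)
first_attempt_value = scratchpad.read("rows_written")

# Before the retry the scratchpad is cleared, so attempt 2 starts empty.
scratchpad.clear()
after_clear_value = scratchpad.read("rows_written")
```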
Validators
ValidatorResponse
Structured return value from BaseExecutionValidator.validate():
- validation_successful (bool): Whether the response passed this validator
- retry_needed (bool): Whether the run loop should retry after failure; defaults to True
- feedback (str | None): Message sent back to the LLM when retry_needed=True; used as failure_summary when retry_needed=False
BaseExecutionValidator
Abstract base class for response validators. Receives the raw response string, the execution input, and the ToolScratchpad, and returns a ValidatorResponse.

    class BaseExecutionValidator(ABC):
        def validate(
            self,
            response: str,
            execution_input: ExecutionInputType,
            tool_scratchpad: Optional[ToolScratchpad] = None,
        ) -> ValidatorResponse:
            ...

The run loop processes the first failing ValidatorResponse it encounters:
- All validators return validation_successful=True → execution succeeds
- validation_successful=False, retry_needed=True → feedback is sent back to the LLM and the attempt retries (up to MAX_RETRIES times)
- validation_successful=False, retry_needed=False → the loop stops immediately; failure_code is set to validation_failed and feedback becomes the failure_summary
Built-in Validators
IsValidJsonValidator: validates that the agent’s response is a valid JSON string. On failure, retries with corrective feedback.
StopExecutionValidator: checks whether the agent called the stop execution tool during the attempt. If it did, the validator halts the run without retrying and uses the recorded stop reason as the failure summary. Place this first in VALIDATORS so it short-circuits before any other checks.
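To make the ordering advice concrete, here is a minimal sketch of how the two built-in validators might behave and why the stop check should come first. It is not the shipped implementation: the scratchpad is modelled as a plain dict, the `stop_reason` key and the `first_failure` helper are hypothetical, and `ValidatorResponse` is a stand-in mirroring the documented fields.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ValidatorResponse:
    # Stand-in mirroring the documented fields.
    validation_successful: bool
    retry_needed: bool = True
    feedback: Optional[str] = None


class IsValidJsonValidator:
    """Sketch: fail with a retry when the response does not parse as JSON."""

    def validate(self, response: str, execution_input: Any,
                 tool_scratchpad: Optional[Dict[str, Any]] = None) -> ValidatorResponse:
        try:
            json.loads(response)
        except json.JSONDecodeError as exc:
            return ValidatorResponse(False, retry_needed=True,
                                     feedback=f"Response is not valid JSON: {exc.msg}")
        return ValidatorResponse(True)


class StopExecutionValidator:
    """Sketch: halt without retrying when the stop tool recorded a reason."""

    STOP_KEY = "stop_reason"  # hypothetical scratchpad key

    def validate(self, response: str, execution_input: Any,
                 tool_scratchpad: Optional[Dict[str, Any]] = None) -> ValidatorResponse:
        reason = (tool_scratchpad or {}).get(self.STOP_KEY)
        if reason is not None:
            return ValidatorResponse(False, retry_needed=False, feedback=reason)
        return ValidatorResponse(True)


def first_failure(validators, response, execution_input, tool_scratchpad):
    """Return the first failing ValidatorResponse, or None if all pass."""
    for validator in validators:
        result = validator.validate(response, execution_input, tool_scratchpad)
        if not result.validation_successful:
            return result
    return None


# The agent called the stop tool and then replied with plain text.
scratchpad = {"stop_reason": "Dataset is empty."}
reply = "I cannot continue."

stop_first = first_failure(
    [StopExecutionValidator(), IsValidJsonValidator()], reply, None, scratchpad)
json_first = first_failure(
    [IsValidJsonValidator(), StopExecutionValidator()], reply, None, scratchpad)
```

With the stop validator first, the run halts with the recorded reason. With the JSON validator first, the same reply would trigger a pointless retry, because the first failure is the one that drives the loop.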
Execution Definition
BaseAgenticExecutionDefinition
The base class provides a concrete run() that implements the validator retry loop, and an abstract execute() that subclasses fill in with the actual per-attempt logic.
Class-level declarations:
EXECUTION_KEY
- Type: ClassVar[str]
- Description: Unique slug identifying and persisting this execution type
AGENT_KEY
- Type: ClassVar[str]
- Description: Must match the AGENT_KEY of the target agent plugin. Multiple execution definition classes may share the same AGENT_KEY.
NAME / DESCRIPTION
- Type: ClassVar[str] / ClassVar[Optional[str]]
- Description: Human-readable label and optional longer description shown in the execution type selector
INPUT_TYPE
- Type: ClassVar[type[ExecutionInputType]]
- Description: Input schema class used for validation and JSON Schema generation; defaults to base ExecutionInputType
VALIDATORS
- Type: ClassVar[list[type[BaseExecutionValidator]]]
- Description: Ordered list of validator classes applied after each execute() call; defaults to [IsValidJsonValidator]
MAX_RETRIES
- Type: ClassVar[int]
- Description: Maximum validator-feedback correction cycles before giving up; default 3
FAILURE_CODES
- Type: ClassVar[type[ExecutionFailureCode]]
- Description: Failure code enum for this definition; override with a subclass to expose execution-specific codes alongside the base ones
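A plugin's definition class might declare these attributes roughly as follows. Everything here is a sketch under assumptions: the base classes are bare stand-ins that carry only the documented defaults (the `None` default for DESCRIPTION is a guess), `WeeklyReportExecution`, `ReportInput`, and the key strings are hypothetical, and FAILURE_CODES and execute() are omitted for brevity.

```python
from typing import ClassVar, List, Optional


class ExecutionInputType:
    """Stand-in; the real class is a Pydantic BaseModel."""


class IsValidJsonValidator:
    """Stand-in for the built-in validator."""


class StopExecutionValidator:
    """Stand-in for the built-in validator."""


class BaseAgenticExecutionDefinition:
    # Stand-in carrying only the defaults documented above.
    EXECUTION_KEY: ClassVar[str]
    AGENT_KEY: ClassVar[str]
    NAME: ClassVar[str]
    DESCRIPTION: ClassVar[Optional[str]] = None
    INPUT_TYPE: ClassVar[type] = ExecutionInputType
    VALIDATORS: ClassVar[List[type]] = [IsValidJsonValidator]
    MAX_RETRIES: ClassVar[int] = 3


class ReportInput(ExecutionInputType):
    """Hypothetical input schema for the example below."""


class WeeklyReportExecution(BaseAgenticExecutionDefinition):
    """Hypothetical execution definition."""

    EXECUTION_KEY = "weekly-report"
    AGENT_KEY = "reporting-agent"  # must match the target agent plugin
    NAME = "Weekly report"
    DESCRIPTION = "Summarises the past week of activity as JSON."
    INPUT_TYPE = ReportInput
    # Stop check first so it short-circuits before the JSON check.
    VALIDATORS = [StopExecutionValidator, IsValidJsonValidator]
    MAX_RETRIES = 2
```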
Methods:
execute(input_data, conversation) -> str (abstract): Performs one attempt and returns the raw LLM response string. The base run() loop calls this repeatedly until all validators pass or retries are exhausted.
run(input_data, conversation) -> ExecutionResult (concrete): Orchestrates the retry loop:
- Calls execute() to get the LLM response
- Runs each validator in order; the first failure drives what happens next
- All pass → returns ExecutionResult(success=True, output=...)
- retry_needed=False → returns immediately with validation_failed
- retry_needed=True with retries remaining → clears tool_scratchpad, sends feedback, retries
- Retries exhausted → asks the LLM for a failure summary and returns max_retries_exceeded
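The loop above can be sketched as follows. This is an approximation under several assumptions, not the real run(): the conversation is a plain list that feedback is appended to, the scratchpad lives on the definition instance as a dict, output is built with `json.loads`, failure codes are plain strings, MAX_RETRIES counts retries after the first attempt, and the failure summary is a formatted string standing in for the LLM-generated one. `ScriptedExecution` is a hypothetical subclass that replays canned replies.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ValidatorResponse:
    validation_successful: bool
    retry_needed: bool = True
    feedback: Optional[str] = None


@dataclass
class ExecutionResult:
    success: bool
    output: dict = field(default_factory=dict)
    failure_code: Optional[str] = None
    failure_summary: Optional[str] = None


class IsValidJsonValidator:
    def validate(self, response, execution_input, tool_scratchpad=None):
        try:
            json.loads(response)
        except json.JSONDecodeError:
            return ValidatorResponse(False, feedback="Respond with valid JSON only.")
        return ValidatorResponse(True)


class BaseAgenticExecutionDefinition(ABC):
    VALIDATORS = [IsValidJsonValidator]
    MAX_RETRIES = 3

    def __init__(self) -> None:
        self.tool_scratchpad: Dict[str, Any] = {}  # assumed location

    @abstractmethod
    def execute(self, input_data, conversation) -> str:
        ...

    def _first_failure(self, response, input_data):
        for validator_cls in self.VALIDATORS:
            result = validator_cls().validate(response, input_data, self.tool_scratchpad)
            if not result.validation_successful:
                return result
        return None

    def run(self, input_data, conversation) -> ExecutionResult:
        retries_used = 0
        while True:
            response = self.execute(input_data, conversation)
            failure = self._first_failure(response, input_data)
            if failure is None:
                return ExecutionResult(success=True, output=json.loads(response))
            if not failure.retry_needed:
                return ExecutionResult(False, failure_code="validation_failed",
                                       failure_summary=failure.feedback)
            if retries_used >= self.MAX_RETRIES:
                # Stand-in for asking the LLM to summarise the failure.
                summary = f"Gave up after {retries_used} retries: {failure.feedback}"
                return ExecutionResult(False, failure_code="max_retries_exceeded",
                                       failure_summary=summary)
            retries_used += 1
            self.tool_scratchpad.clear()  # validators only see the next attempt
            conversation.append(failure.feedback)  # feedback goes back to the LLM


class ScriptedExecution(BaseAgenticExecutionDefinition):
    """Hypothetical definition that replays canned replies instead of an LLM."""

    def __init__(self, replies):
        super().__init__()
        self._replies = iter(replies)

    def execute(self, input_data, conversation) -> str:
        return next(self._replies)


conversation = []
result = ScriptedExecution(["not json", '{"answer": 42}']).run({}, conversation)
```

Here the first reply fails the JSON check, the feedback is appended to the conversation, and the second reply passes, so `result.success` is true after one retry.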
Agent Config Provider
ConfigType
StrEnum controlling which system prompt an agent uses:
- ConfigType.CONVERSATION: Used during regular user-facing conversations
- ConfigType.AGENTIC_EXECUTION_DEFINITION: Used during autonomous agentic execution runs
BaseAgentConfigProvider
ABC exported from each agent plugin. The AgentRegistry calls get_config(config_type) to obtain the appropriate AgentConfig for the current context.

    class BaseAgentConfigProvider(ABC):
        @abstractmethod
        def get_config(self, config_type: ConfigType = ConfigType.CONVERSATION) -> AgentConfig:
            ...

Plugins that need a different prompt for execution runs implement the branching in get_config(). Plugins that always use the same prompt can ignore config_type.
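A provider that branches on config_type might look like the sketch below. The enum string values, the single-field `AgentConfig`, the `ReportingConfigProvider` class, and both prompts are assumptions made for illustration; only the get_config() signature and the two ConfigType members come from the description above.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class ConfigType(str, Enum):
    # Stand-in for the real StrEnum; the string values are assumed.
    CONVERSATION = "conversation"
    AGENTIC_EXECUTION_DEFINITION = "agentic_execution_definition"


@dataclass
class AgentConfig:
    # Stand-in; the real AgentConfig likely carries more than a prompt.
    system_prompt: str


class BaseAgentConfigProvider(ABC):
    @abstractmethod
    def get_config(self, config_type: ConfigType = ConfigType.CONVERSATION) -> AgentConfig:
        ...


class ReportingConfigProvider(BaseAgentConfigProvider):
    """Hypothetical provider with a separate prompt for execution runs."""

    def get_config(self, config_type: ConfigType = ConfigType.CONVERSATION) -> AgentConfig:
        if config_type is ConfigType.AGENTIC_EXECUTION_DEFINITION:
            return AgentConfig(system_prompt="Work autonomously and reply with JSON only.")
        return AgentConfig(system_prompt="You are a helpful reporting assistant.")


provider = ReportingConfigProvider()
chat_config = provider.get_config()
run_config = provider.get_config(ConfigType.AGENTIC_EXECUTION_DEFINITION)
```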