Workflows Module Documentation¶
This module provides workflow management functionality for the LatteReview package, implementing multi-agent review orchestration. The updated version includes support for image-based inputs alongside text inputs in the review workflow.
Overview¶
The workflows module consists of one main class:
ReviewWorkflow
: A class that orchestrates multi-agent review processes with support for sequential rounds, filtering, and handling both text and image inputs.
ReviewWorkflow¶
Description¶
The ReviewWorkflow
class manages review workflows where multiple agents can review items across different rounds. It handles the orchestration of reviews, manages outputs, and provides content validation and cost tracking.
Class Definition¶
class ReviewWorkflow(pydantic.BaseModel):
workflow_schema: List[Dict[str, Any]]
memory: List[Dict] = list()
reviewer_costs: Dict = dict()
total_cost: float = 0.0
verbose: bool = True
Key Attributes¶
workflow_schema
: List of dictionaries defining the review process structurememory
: List storing the workflow's execution historyreviewer_costs
: Dictionary tracking costs per reviewer and roundtotal_cost
: Total accumulated cost of all reviewsverbose
: Flag to enable/disable logging output
Methods¶
__call__()
¶
Execute the workflow on provided data.
async def __call__(self, data: Union[pd.DataFrame, Dict[str, Any]]) -> pd.DataFrame:
"""Execute workflow on DataFrame or dictionary input."""
# Returns DataFrame with review results
run()
¶
Core method to execute the review workflow.
async def run(self, data: pd.DataFrame) -> pd.DataFrame:
"""Execute main review workflow."""
# Process each round sequentially
# Returns updated DataFrame with review results
get_total_cost()
¶
Get total cost of workflow execution.
Internal Methods¶
_format_text_input()
: Format input for reviewers_format_image_input()
: Validate and format image paths_log()
: Handle logging based on verbose setting
Image Input Handling¶
The updated workflow supports image inputs in addition to text inputs. Images are validated for file existence and format before being passed to reviewers.
Usage Examples¶
Creating a Basic Review Workflow¶
from lattereview.workflows import ReviewWorkflow
from lattereview.agents import ScoringReviewer
from lattereview.providers import LiteLLMProvider
# Create reviewers
reviewer1 = ScoringReviewer(
provider=LiteLLMProvider(model="gpt-4o"),
name="Initial",
scoring_task="Initial paper screening"
)
reviewer2 = ScoringReviewer(
provider=LiteLLMProvider(model="gpt-4o"),
name="Expert",
scoring_task="Detailed technical review"
)
# Define workflow schema
workflow_schema = [
{
"round": 'A',
"reviewers": reviewer1,
"text_inputs": ["title", "abstract"],
"image_inputs": ["figure_path"]
},
{
"round": 'B',
"reviewers": reviewer2,
"text_inputs": ["title", "abstract", "round-A_Initial_output"],
"filter": lambda row: row["round-A_Initial_score"] > 3
}
]
# Create and run workflow
workflow = ReviewWorkflow(workflow_schema=workflow_schema)
results = await workflow(input_data)
Understanding Workflow Construction¶
A workflow is defined by a list of dictionaries where each dictionary represents a review round. The rounds are executed sequentially in the order they appear in the list. Each round dictionary must contain:
Required Arguments:
round
: A string identifier for the round (e.g., 'A', 'B', '1', 'initial')reviewers
: A single ScoringReviewer or list of ScoringReviewerstext_inputs
: A string or list of strings representing DataFrame column names to use
Optional Arguments:
image_inputs
: A list of DataFrame column names containing paths to image filesfilter
: A lambda function that determines which rows to review in this round
Handling Results¶
# Execute workflow
workflow = ReviewWorkflow(workflow_schema=workflow_schema)
try:
results_df = await workflow(input_df)
# Access costs
total_cost = workflow.get_total_cost()
per_reviewer = workflow.reviewer_costs
# Access results
round_a_scores = results_df["round-A_Initial_score"]
round_b_reasoning = results_df["round-B_Expert_reasoning"]
except ReviewWorkflowError as e:
print(f"Workflow failed: {e}")
Error Handling¶
The module uses ReviewWorkflowError
for all workflow-related errors:
Common error scenarios:
- Invalid workflow schema
- Missing input columns
- Reviewer initialization failures
- Content validation errors
- Output processing failures
- Invalid image file paths
Best Practices¶
-
Schema Design
-
Use meaningful round identifiers
- Design filter functions carefully
-
Consider round dependencies
-
Data Management
-
Validate input data structure
- Handle missing values appropriately
-
Use appropriate column names
-
Cost Control
-
Monitor per-round costs
- Set appropriate concurrent request limits
-
Track total workflow costs
-
Error Handling
- Implement proper exception handling
- Validate workflow schema
- Monitor review outputs
Limitations¶
- Sequential round execution only
- Memory grows with number of reviews
- No direct reviewer communication
- Limited to DataFrame-based workflows
- Requires async/await syntax
- Filter functions must be serializable