fix: Handle missing keys gracefully in TaskEvaluator #1940
base: main
Conversation
🤖 Devin AI Engineer: I'll be helping with this pull request!
Disclaimer: This review was made by a crew of AI Agents.

Code Review Comment for PR #1940

Overview

This pull request introduces critical improvements to the TaskEvaluator.

Positive Changes

Detailed Analysis of Changes

Example Changes:

```python
# Before
f"Initial Output:\n{data['initial_output']}\n\n"
f"Human Feedback:\n{data['human_feedback']}\n\n"
f"Improved Output:\n{data['improved_output']}\n\n"

# After
f"Initial Output:\n{data.get('initial_output', '')}\n\n"
f"Human Feedback:\n{data.get('human_feedback', '')}\n\n"
f"Improved Output:\n{data.get('improved_output', '')}\n\n"
```

These changes demonstrate a clear transition from direct dictionary access to a more defensive approach using the dict.get() method.

Suggested Further Improvements

While the changes made are positive, here are some suggestions that could further enhance the quality and maintainability of the code:

1. Type Annotations

Incorporating type hints across the codebase will improve readability and help with early error detection.

```python
from typing import Dict

def evaluate_training_data(
    self,
    output_training_data: Dict[str, Dict[str, str]]
) -> str:
    ...
```

2. Use Constants for Dictionary Keys

Utilizing constants for keys will contribute to better maintainability and reduce the risk of typos.

```python
# At module level
TRAINING_DATA_KEYS = {
    'INITIAL_OUTPUT': 'initial_output',
    'HUMAN_FEEDBACK': 'human_feedback',
    'IMPROVED_OUTPUT': 'improved_output'
}

# Usage
f"Initial Output:\n{data.get(TRAINING_DATA_KEYS['INITIAL_OUTPUT'], '')}\n\n"
```

3. Validation of Input Data

Adding validation checks can prevent processing of invalid or empty inputs.

```python
def evaluate_training_data(self, output_training_data):
    if not output_training_data:
        raise ValueError("output_training_data cannot be empty")
    final_aggregated_data = ""
    for key, data in output_training_data.items():
        if not isinstance(data, dict):
            raise TypeError(f"Training data for key {key} must be a dictionary")
        final_aggregated_data += (
            f"Initial Output:\n{data.get('initial_output', '')}\n\n"
            f"Human Feedback:\n{data.get('human_feedback', '')}\n\n"
            f"Improved Output:\n{data.get('improved_output', '')}\n\n"
        )
    return final_aggregated_data
```

Security Considerations

The introduction of dict.get() with empty-string defaults removes a class of unhandled KeyError exceptions.

Testing Recommendations

To ensure the reliability of the modifications, I recommend adding tests that cover the missing-key, empty-input, and non-dict cases handled above.

Final Verdict

The changes from this pull request are commendable and enhance the overall quality of the code. Implementing the suggested improvements can further solidify the code's reliability and maintainability. I approve of the changes with a strong recommendation to consider these enhancements. ✅
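The testing recommendations above can be exercised against a standalone copy of the hardened aggregation logic. This is only a sketch: the `aggregate` function below is a stand-in mirroring the review's suggested code, not the actual crewAI TaskEvaluator method.

```python
# Standalone copy of the review's suggested validation + aggregation logic.
def aggregate(output_training_data):
    if not output_training_data:
        raise ValueError("output_training_data cannot be empty")
    out = ""
    for key, data in output_training_data.items():
        if not isinstance(data, dict):
            raise TypeError(f"Training data for key {key} must be a dictionary")
        out += (
            f"Initial Output:\n{data.get('initial_output', '')}\n\n"
            f"Human Feedback:\n{data.get('human_feedback', '')}\n\n"
            f"Improved Output:\n{data.get('improved_output', '')}\n\n"
        )
    return out

# Missing keys fall back to empty strings instead of raising KeyError.
result = aggregate({"a": {"initial_output": "draft"}})
print("Improved Output:\n\n" in result)

# Invalid inputs are rejected explicitly.
try:
    aggregate({})
except ValueError:
    print("empty input rejected")

try:
    aggregate({"a": "not-a-dict"})
except TypeError:
    print("non-dict entry rejected")
```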
@pythonbyte Please check out this issue: #1935. In their error message, you can see the root issue is that one of the training-data keys is missing. Is it okay if training doesn't have that key? Or, is there a deeper issue?
Description
This PR fixes a KeyError that occurs in the TaskEvaluator when accessing training data dictionary keys.
Core Issue
The TaskEvaluator.evaluate_training_data() method was directly accessing dictionary keys ('initial_output', 'human_feedback', 'improved_output') without checking for their existence, which caused KeyError exceptions when these keys were missing.
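The failure can be reproduced in isolation with a record that lacks one of those keys (the sample dict below is hypothetical, not real training data):

```python
# A training-data record missing 'improved_output' (hypothetical sample).
data = {"initial_output": "first draft", "human_feedback": "be concise"}

try:
    text = f"Improved Output:\n{data['improved_output']}\n\n"
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 'improved_output'
```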
Fix
Modified the code to use the safer dict.get() method with empty string defaults.
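With .get(), a record missing a key formats cleanly by substituting an empty string (the sample dict below is hypothetical):

```python
data = {"initial_output": "first draft", "human_feedback": "be concise"}

# Missing 'improved_output' falls back to '' instead of raising KeyError.
section = (
    f"Initial Output:\n{data.get('initial_output', '')}\n\n"
    f"Human Feedback:\n{data.get('human_feedback', '')}\n\n"
    f"Improved Output:\n{data.get('improved_output', '')}\n\n"
)
print("Improved Output:\n\n" in section)  # True
```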
This change makes the code more resilient by returning empty strings for missing keys instead of raising KeyError.
Testing
The fix addresses the specific KeyError shown in the error trace reported in #1935.
This error was occurring during crew.train() execution when evaluating training data.
Closes #1935