Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added BrowserUse Tool #187

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

MahlerTom
Copy link
Contributor

@MahlerTom MahlerTom commented Jan 14, 2025

Following issue #186, together with Shahar-Y, this PR adds the browser-use/browser-use tool to crewai tools.

Supports [email protected].

Full working example:

import asyncio

from browser_use import Browser, BrowserConfig
from browser_use.browser.context import BrowserContext
from crewai import Agent, Crew, Task

from crewai_tools.tools.browser_use_tool import BrowserUseTool
from langchain_openai.chat_models import ChatOpenAI


def main():

    browser = Browser(config=BrowserConfig(headless=False))

    browser_context = BrowserContext(browser=browser)

    browser_use_tool = BrowserUseTool(
        llm=ChatOpenAI(model="gpt-4o"),
        browser=browser,
        browser_context=browser_context,
    )

    agent = Agent(
        role="Browser Agent",
        goal="Use the browser",
        backstory=(
            "You are the best Browser Agent in the world. "
            "You have a browser that you can interact with using natural language instructions."
        ),
        tools=[browser_use_tool],
        verbose=True,
        llm="gpt-4o",
    )

    task = Task(
        name="Navigate to webpage and summarize article",
        description="Navigate to {webpage} and find the article about 'xAI (company)' and summarize it.",
        expected_output="A summary of the article",
        agent=agent,
    )

    crew = Crew(
        tasks=[task],
        agents=[agent],
        verbose=True,
    )

    crew_result = crew.kickoff(
        inputs={
            "webpage": "https://www.wikipedia.org/",
        }
    )

    print(crew_result.raw)
    loop = asyncio.new_event_loop()
    loop.run_until_complete(browser.close())
    loop.close()


if __name__ == "__main__":
    main()

@joaomdmoura
Copy link
Collaborator

Disclaimer: This review was made by a crew of AI Agents.

Code Review Comment: Browser-Use Tool Implementation

Overview

The implementation of the new Browser-Use Tool within the crewai-tools repository introduces several key files: browser_use_tool.py, an init file to structure the package, a corresponding test file, and documentation via the README. The intended functionality is clear and the code is well-organized, but I have identified several areas for improvement concerning code quality, maintainability, and error handling.

Critical Code Improvements

1. Event Loop Management

The current approach of creating a new event loop per tool instance is inefficient and could lead to unwanted performance issues. Instead, I recommend the following change:

Current Code:

event_loop: asyncio.AbstractEventLoop = Field(
    asyncio.new_event_loop(),
    description="The event loop to use to run browser-use. Creates a new event loop if not provided.",
)

Proposed Improvement:

event_loop: asyncio.AbstractEventLoop = Field(
    None,
    description="The event loop to use to run browser-use. Will use asyncio.get_event_loop() if not provided.",
)

def __init__(self, **data):
    super().__init__(**data)
    if self.event_loop is None:
        try:
            self.event_loop = asyncio.get_running_loop()
        except RuntimeError:
            self.event_loop = asyncio.new_event_loop()

2. Error Handling

The lack of comprehensive error handling can lead to crashes during browser operations. Implementing a safe operation wrapper will help manage this:

Suggested Addition:

async def _safe_browser_operation(self, operation):
    try:
        return await operation
    except Exception as e:
        return f"Browser operation failed: {str(e)}"

3. Type Hints and Documentation

Enhancing type hints across the code will increase clarity. Adding detailed docstrings to methods not only helps with maintainability but also aids future developers.

Example for Docstring:

def _parse_history(self, agent_history_list: AgentHistoryList, max_steps: int) -> str:
    """Parse the browser interaction history into a readable format.
    
    Args:
        agent_history_list: A list of interactions.
        max_steps: The maximum number of steps executed.
        
    Returns:
        str: Formatted string of interaction details.
    """

4. Configuration Validation

Adding a validator for the browser configuration ensures that only valid configurations are accepted:

Proposed Validator:

@field_validator("browser")
@classmethod
def validate_browser(cls, browser: Optional[Browser]) -> Optional[Browser]:
    if browser is not None and not isinstance(browser, Browser):
        raise ValueError("browser must be an instance of Browser")
    return browser

Minor Suggestions

  • Constants Management: Move hardcoded values such as DEFAULT_MAX_STEPS and DEFAULT_MAX_FAILURES to class constants to eliminate magic numbers and improve code readability.
  • Testing Improvements: Strengthen the test suite by including unit tests for various methods, error testing, and browser mocking to expedite test times.

Example for Tests:

def test_browser_use_tool_error_handling():
    tool = BrowserUseTool(llm=MockLLM())
    result = tool._run(instruction="", max_steps=10)
    assert "No instruction provided" in result

Security Considerations

Security should not be overlooked in a browser context. Implement a URL validation method to mitigate potential security risks associated with URL inputs.

def _validate_url(self, url: str) -> bool:
    """Validate URL for security concerns."""
    allowed_protocols = {'http', 'https'}
    parsed = urlparse(url)
    return parsed.scheme in allowed_protocols

Conclusion

Overall, the Browser-Use Tool implementation is a well-structured addition to our repository, featuring thoughtful consideration of user experience and functionality. By addressing the highlighted issues and implementing the recommended changes, we can significantly enhance the robustness and maintainability of the code, ensuring it lives up to the project's standards.

I look forward to seeing the improvements!

)


class BaseBrowserUseTool(BaseTool):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please update this tool to throw an error and tell the user how to import browser_use if it isn't installed. This is the new standard we are implementing in all of our tools to make them more user friendly:

def __init__(self, api_key: Optional[str] = None, **kwargs):
        super().__init__(**kwargs)
        try:
            from firecrawl import FirecrawlApp  # type: ignore
        except ImportError:
            import click

            if click.confirm(
                "You are missing the 'firecrawl-py' package. Would you like to install it?"
            ):
                import subprocess

                subprocess.run(["uv", "add", "firecrawl-py"], check=True)
                from firecrawl import (
                    FirecrawlApp,
                )
            else:
                raise ImportError(
                    "`firecrawl-py` package not found, please run `uv add firecrawl-py`"
                )

        self._firecrawl = FirecrawlApp(api_key=api_key)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although now we don't have nice type hinting :(

pyproject.toml Outdated
@@ -27,6 +27,7 @@ dependencies = [
"linkup-sdk>=0.2.1",
"spider-client>=0.1.25",
"patronus>=0.0.16",
"browser-use (==0.1.16)",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an optional dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -6,7 +6,7 @@ readme = "README.md"
authors = [
{ name = "João Moura", email = "[email protected]" },
]
requires-python = ">=3.10,<=3.13"
requires-python = ">=3.11,<3.13"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep this at python 3.10 lower bound please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants