Bedrock: Agent construct fails with Claude 3.5 v2 & Haiku 3.5 #796

Open
1 task done
mccauleyp opened this issue Nov 12, 2024 · 1 comment
Labels
backlog, bug

mccauleyp commented Nov 12, 2024

Describe the bug

Using Claude 3.5 Sonnet v2 or Claude 3.5 Haiku with the Agent construct deploys successfully, but the resulting agent is broken and returns "Internal server error" responses. These models must be invoked through an inference profile, but the construct provisions them in "on demand" mode, which is not compatible.
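
For reference, the two identifier formats involved look roughly like this. It is only a sketch of the string shapes: the helper names, the example account ID and the default `us` geography prefix are illustrative assumptions, not part of the construct library.

```python
# Hypothetical helpers that only build strings; nothing here calls AWS.

SONNET_3_5_V2 = "anthropic.claude-3-5-sonnet-20241022-v2:0"


def foundation_model_arn(region: str, model_id: str) -> str:
    """ARN used for on-demand invocation (note the empty account segment)."""
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"


def inference_profile_arn(
    region: str, account: str, model_id: str, geo: str = "us"
) -> str:
    """ARN of a cross-region inference profile for the same model.

    `geo` is the geography prefix of the profile; "us" is assumed here.
    """
    return f"arn:aws:bedrock:{region}:{account}:inference-profile/{geo}.{model_id}"


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID.
    print(foundation_model_arn("us-east-1", SONNET_3_5_V2))
    print(inference_profile_arn("us-east-1", "123456789012", SONNET_3_5_V2))
```

The construct currently passes the first form to CloudFormation, while these models appear to need the second.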

Expected Behavior

It should be possible to deploy working agents using these models.

Current Behavior

Agent deployment succeeds but produces "Internal server error" responses.

Reproduction Steps

Create an agent using Sonnet 3.5 v2 or Haiku 3.5, e.g.:

bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0
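
Once the stack is deployed, the failure can be checked by sending the agent a prompt. The helper below is a minimal sketch, assuming a client that behaves like `boto3.client("bedrock-agent-runtime")`; the function name and the placeholder IDs are made up for illustration. With an affected model, the call should surface the "Internal server error" rather than a reply, though this report does not say whether it arrives as an exception or inside the stream.

```python
import uuid


def ask_agent(client, agent_id: str, agent_alias_id: str, text: str) -> str:
    """Send one prompt to a Bedrock agent and return the concatenated reply.

    `client` is expected to expose invoke_agent() like the boto3
    "bedrock-agent-runtime" client does.
    """
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),  # fresh session for every call
        inputText=text,
    )
    parts = []
    # The reply is an event stream; text is carried in "chunk" events,
    # other event types (e.g. traces) are skipped.
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)


# Usage (needs AWS credentials and a deployed agent; IDs are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   print(ask_agent(client, "<agent-id>", "<agent-alias-id>", "hello"))
```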

Possible Solution

I am working around the issue by using the CDK escape hatch to override the CloudFormation foundation model property, which might provide some hints as to how the construct could be modified:

from aws_cdk import Stack, aws_bedrock, aws_iam
from cdklabs.generative_ai_cdk_constructs import bedrock

AGENT_MODEL = bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0
AGENT_INSTRUCTION = "You are a dog, always respond with 'woof woof'."
AGENT_ALIAS_VERSION = "1"


class BedrockResources:
    def __init__(self, scope: Stack) -> None:
        agent_name = "my-agent"
        self.agent = bedrock.Agent(
            scope,
            "Agent",
            name=agent_name,
            instruction=AGENT_INSTRUCTION,
            foundation_model=AGENT_MODEL,
        )
        self._enable_inference_profile(
            scope=scope, agent=self.agent, model=AGENT_MODEL
        )

        self.agent_alias = self.agent.add_alias(
            alias_name=f"{agent_name}-v{AGENT_ALIAS_VERSION}"
        )

    @staticmethod
    def _enable_inference_profile(
        scope: Stack, agent: bedrock.Agent, model: bedrock.BedrockFoundationModel
    ) -> None:
        """Enable models that require or support inference profiles.

        Inference profiles are used for cross-region inference, which improves
        performance by enabling load balancing of requests across regions. Certain
        models like Claude Sonnet 3.5 v2 and Haiku 3.5 must use inference profiles.

        This is not yet supported by the Agent CDK construct, so we override the
        configuration on the underlying CloudFormation property.
        """
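        model_str = model.to_string()
        # Cross-region inference profile IDs are the model ID prefixed with a
        # geography code; "us." is hardcoded here, so this only suits US regions.
        inference_profile_arn = f"arn:aws:bedrock:{scope.region}:{scope.account}:inference-profile/us.{model_str}"  # noqa: E501
        # The profile can route to the model in other regions, hence the wildcard.
        foundation_model_arn = f"arn:aws:bedrock:*::foundation-model/{model_str}"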

        invoke_inference_profile_policy = aws_iam.Policy(
            scope,
            f"InferenceProfilePolicy{agent.name}",
            statements=[
                aws_iam.PolicyStatement(
                    actions=["bedrock:InvokeModel*", "bedrock:GetInferenceProfile"],
                    resources=[foundation_model_arn, inference_profile_arn],
                )
            ],
            roles=[agent.role],
        )

        cfn_agent: aws_bedrock.CfnAgent = agent.node.find_child("Agent")  # type:ignore[assignment]
        cfn_agent.foundation_model = inference_profile_arn
        cfn_agent.node.add_dependency(invoke_inference_profile_policy)
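
The workaround hardcodes the `us.` prefix. If it needs to run outside US regions, the prefix could be derived from the region name. The sketch below assumes the profile geographies are `us`, `eu` and `apac`; that mapping and both function names are assumptions for illustration and should be checked against the inference profiles actually listed for the account.

```python
def geo_prefix_for_region(region: str) -> str:
    """Guess the inference-profile geography prefix from a region name.

    Assumed mapping: us-* -> "us", eu-* -> "eu", ap-* -> "apac".
    GovCloud and any other region are rejected instead of guessed.
    """
    if region.startswith("us-gov-"):
        raise ValueError(f"No assumed geography for region {region!r}")
    if region.startswith("us-"):
        return "us"
    if region.startswith("eu-"):
        return "eu"
    if region.startswith("ap-"):
        return "apac"
    raise ValueError(f"No assumed geography for region {region!r}")


def inference_profile_id(region: str, model_id: str) -> str:
    """Prefix a model ID with the geography assumed for `region`."""
    return f"{geo_prefix_for_region(region)}.{model_id}"


if __name__ == "__main__":
    print(inference_profile_id("eu-central-1", "anthropic.claude-3-5-haiku-20241022-v1:0"))
```

This only works when the region is a concrete string. In an environment-agnostic stack, `scope.region` is an unresolved token at synth time and cannot be inspected this way.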

Additional Information/Context

No response

CDK CLI Version

2.166.0

Framework Version

0.1.279

Node.js Version

v20.11.0

OS

OSX

Language

Python

Language Version

3.12

Region experiencing the issue

us-east-1

Code modification

No

Other information

No response

Service quota

  • I have reviewed the service quotas for this construct

krokoko (Collaborator) commented Nov 12, 2024

Thanks for reporting this issue, @mccauleyp. This should be fixed when #683 is implemented.

krokoko added the backlog label and removed the needs-triage label Nov 12, 2024