Describe the bug
Using Claude 3.5 Sonnet v2 or Claude 3.5 Haiku with the Agent construct produces a successful deployment but a broken agent that returns "Internal server error" responses. These models must be invoked via an inference profile, but the construct provisions them in an "on demand" mode that isn't compatible.
Expected Behavior
It should be possible to deploy working agents using these models.
Current Behavior
Agent deployment succeeds but produces "Internal server error" responses.
Reproduction Steps
Create an agent using Sonnet 3.5 v2 or Haiku 3.5, e.g.:
Possible Solution
I am working around the issue by using the CDK escape hatch to override the CloudFormation foundation model property, which might provide some hints as to how the construct could be modified:
from aws_cdk import Stack, aws_bedrock, aws_iam
from cdklabs.generative_ai_cdk_constructs import bedrock

AGENT_MODEL = bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0
AGENT_INSTRUCTION = "You are a dog, always respond with 'woof woof'."
AGENT_ALIAS_VERSION = "1"


class BedrockResources:
    def __init__(self, scope: Stack) -> None:
        agent_name = "my-agent"
        self.agent = bedrock.Agent(
            scope,
            "Agent",
            name=agent_name,
            instruction=AGENT_INSTRUCTION,
            foundation_model=AGENT_MODEL,
        )
        # Apply the override before creating the alias.
        self._enable_inference_profile(
            scope=scope, agent=self.agent, model=AGENT_MODEL
        )
        self.agent_alias = self.agent.add_alias(
            alias_name=f"{agent_name}-v{AGENT_ALIAS_VERSION}"
        )

    @staticmethod
    def _enable_inference_profile(
        scope: Stack, agent: bedrock.Agent, model: bedrock.BedrockFoundationModel
    ) -> None:
        """Enable models that require or support inference profiles.

        Inference profiles are used for cross-region inference, which improves
        performance by enabling load balancing of requests across regions. Certain
        models like Claude Sonnet 3.5 v2 and Haiku 3.5 must use inference profiles.
        This is not yet supported by the Agent CDK construct, so we can override the
        configuration on the underlying CloudFormation property.
        """
        model_str = model.to_string()
        # The "us." prefix is hard-coded here; this was tested in us-east-1.
        inference_profile_arn = (
            f"arn:aws:bedrock:{scope.region}:{scope.account}"
            f":inference-profile/us.{model_str}"
        )
        # Wildcard region: the profile may route requests to other regions.
        foundation_model_arn = f"arn:aws:bedrock:*::foundation-model/{model_str}"

        # Allow the agent role to invoke the model through the inference profile.
        invoke_inference_profile_policy = aws_iam.Policy(
            scope,
            f"InferenceProfilePolicy{agent.name}",
            statements=[
                aws_iam.PolicyStatement(
                    actions=["bedrock:InvokeModel*", "bedrock:GetInferenceProfile"],
                    resources=[foundation_model_arn, inference_profile_arn],
                )
            ],
            roles=[agent.role],
        )

        # Escape hatch: point the underlying CfnAgent at the inference profile ARN.
        cfn_agent: aws_bedrock.CfnAgent = agent.node.find_child("Agent")  # type:ignore[assignment]
        cfn_agent.foundation_model = inference_profile_arn
        # Ensure the policy exists before the agent is created or updated.
        cfn_agent.node.add_dependency(invoke_inference_profile_policy)
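The two ARN strings built in the workaround can be factored into small pure functions, which makes the hard-coded `us.` prefix explicit and easy to check without deploying anything. This is only a sketch: the function names and the `geo_prefix` parameter are hypothetical and not part of the construct library, and the `us` default simply mirrors the workaround above, so other geographies would likely need a different prefix.

```python
def inference_profile_arn(
    region: str, account: str, model_id: str, geo_prefix: str = "us"
) -> str:
    """Build an inference profile ARN in the same shape as the workaround.

    geo_prefix is an assumption for illustration; the workaround hard-codes "us".
    """
    return (
        f"arn:aws:bedrock:{region}:{account}"
        f":inference-profile/{geo_prefix}.{model_id}"
    )


def foundation_model_arn(model_id: str) -> str:
    """Build a foundation model ARN with a wildcard region, as in the workaround."""
    return f"arn:aws:bedrock:*::foundation-model/{model_id}"


if __name__ == "__main__":
    # Placeholder account ID; the model ID is the Claude 3.5 Sonnet v2 identifier.
    model_id = "anthropic.claude-3-5-sonnet-20241022-v2:0"
    print(inference_profile_arn("us-east-1", "123456789012", model_id))
    print(foundation_model_arn(model_id))
```

In the workaround, `region` and `account` would come from `scope.region` and `scope.account`, and `model_id` from `model.to_string()`.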
Additional Information/Context
No response
CDK CLI Version
2.166.0
Framework Version
0.1.279
Node.js Version
v20.11.0
OS
OSX
Language
Python
Language Version
3.12
Region experiencing the issue
us-east-1
Code modification
No
Other information
No response
Service quota
I have reviewed the service quotas for this construct