feat(jans-cedarling): update PolicyStore parser to support agama-lab generated policies #10098

Closed
wants to merge 38 commits into from

@rmarinn rmarinn commented Nov 10, 2024

Prepare


Description

This PR enhances the PolicyStore struct to support policy stores generated by the Agama Lab Policy Designer, enabling seamless loading of policies in both JSON and YAML formats.

New Features:

  • JSON Loading: Use PolicyStore::load_from_json(json_str) to load policy stores from JSON.
  • YAML Loading: Use PolicyStore::load_from_yaml(yaml_str) to load policy stores from YAML.

Target issue

Users currently cannot load policy stores exported from Agama Lab's Policy Designer directly. This PR makes the parser accept Agama-generated policy stores (see Updated JSON Policy Store Schema).

closes #10038

Implementation Details

This update introduces a revised schema for the PolicyStore, including updates to trusted_issuers and TokenEntityMetadata. The following sections provide detailed information on the changes.

Updated JSON Policy Store Schema

The JSON structure has been updated to include additional fields and configurations:

{
    "cedar_version": "v4.0.0",
    "policy_stores": {
        "some_random_id": {
            "name": "policy name",
            "description": "a brief description about the policy",
            "policies": {...},
            "trusted_issuers": {...},
            "schema": "base_64_encoded_json_cedar_schema"
        }
    }
}
Updated trusted_issuers schema

"trusted_issuers": {
  "some_unique_id" : {
    "name": "name_of_the_trusted_issuer",
    "description": "description for the trusted issuer",
    "openid_configuration_endpoint": "https://<trusted-issuer-hostname>/.well-known/openid-configuration",
    "access_tokens": {
      "trusted": true,
      "principal_identifier": "jti",
      ...
    },
    "id_tokens": { ... },
    "userinfo_tokens": { ... },
    "tx_tokens": { ... }
  },
  ...
}
  • Note: access_tokens now includes new fields like trusted and principal_identifier alongside existing fields.
Updated Token Entity Metadata schema (used for access_tokens, id_tokens, userinfo_tokens, and tx_tokens):
{
  "token_type": {
    "user_id": "<field name in token (e.g., 'email', 'sub', 'uid', etc.) or '' if not used>",
    "role_mapping": "<field for role assignment (e.g., 'role', 'memberOf', etc.) or '' if not used>",
    "claim_mapping": {
      "mapping_target": {
        "parser": "<type of parser ('regex' or 'json')>",
        "type": "<type identifier (e.g., 'Acme::Email')>",
        "...": "Additional configurations specific to the parser"
      }
    }
  }
}
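
The schema above uses an empty string to mean "not used" for `user_id` and `role_mapping`. A minimal sketch of how a loader could normalize that convention follows. The helper `empty_as_none` and the struct `TokenIdentityFields` are hypothetical names chosen for illustration and are not taken from the PR.

```rust
/// Hypothetical helper (not from the PR): an empty string in the policy
/// store means "this field is not used", so it is mapped to `None`.
pub fn empty_as_none(raw: &str) -> Option<String> {
    if raw.is_empty() {
        None
    } else {
        Some(raw.to_string())
    }
}

/// Illustrative subset of the token entity metadata fields.
#[derive(Debug, PartialEq)]
pub struct TokenIdentityFields {
    pub user_id: Option<String>,
    pub role_mapping: Option<String>,
}

impl TokenIdentityFields {
    /// Builds the struct from the raw string values found in the policy store.
    pub fn from_raw(user_id: &str, role_mapping: &str) -> Self {
        Self {
            user_id: empty_as_none(user_id),
            role_mapping: empty_as_none(role_mapping),
        }
    }
}
```

Representing "not used" as `None` rather than `""` means later code can match on the option instead of comparing strings.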
Updated YAML Policy Store Schema

For easier readability and authoring, the YAML format has been simplified:

cedar_version: v4.0.0
name: PolicyStoreOk
description: A test policy store where everything is fine.
policies:
  840da5d85403f35ea76519ed1a18a33989f855bf1cf8:
    description: simple policy example for principal workload
    creation_date: '2024-09-20T17:22:39.996050'
    policy_content: |-
      permit(
          principal is Jans::Workload,
          action in [Jans::Action::"Update"],
          resource is Jans::Issue
      )when{
          principal.org_id == resource.org_id
      };
  444da5d85403f35ea76519ed1a18a33989f855bf1cf8:
    description: simple policy example for principal user
    creation_date: '2024-09-20T17:22:39.996050'
    policy_content: |-
      permit(
          principal is Jans::User,
          action in [Jans::Action::"Update"],
          resource is Jans::Issue
      )when{
          principal.country == resource.country
      };
schema: |-
  namespace Jans {
    type Url = {"host": String, "path": String, "protocol": String};
    entity Access_token = {"aud": String, "exp": Long, "iat": Long, "iss": TrustedIssuer, "jti": String};
    entity Issue = {"country": String, "org_id": String};
    entity Role;
    entity TrustedIssuer = {"issuer_entity_id": Url};
    entity User in [Role] = {"country": String, "email": String, "sub": String, "username": String};
    entity Workload = {"client_id": String, "iss": TrustedIssuer, "name": String, "org_id": String};
    entity id_token = {"acr": String, "amr": String, "aud": String, "exp": Long, "iat": Long, "iss": TrustedIssuer, "jti": String, "sub": String};
    action "Update" appliesTo {

      principal: [Workload, User, Role],
      resource: [Issue],
      context: {}
    };
  }
trusted_issuers:
  IDP1:
    name: 'Google'
    description: 'Consumer IDP'
    openid_configuration_endpoint: 'https://accounts.google.com/.well-known/openid-configuration'
    access_tokens:
        trusted: true
        principal_identifier: jti
    id_tokens:
        user_id: 'sub'
        role_mapping: 'role'
        claim_mapping: {}
  • Default values are implemented for missing fields.
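
As a rough illustration of how missing fields can fall back to defaults, the sketch below mirrors the `access_tokens` entry from the YAML example, which sets only two fields. The struct is a simplified stand-in for the PR's `TokenEntityMetadata`, and the default values shown (for example `trusted: false`) are assumptions, since the description does not list the actual defaults.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the PR's `TokenEntityMetadata`. Field names follow
/// the schema in this description; the default values are assumptions.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct TokenEntityMetadata {
    pub trusted: bool,                          // assumed default: false
    pub principal_identifier: Option<String>,   // assumed default: None
    pub user_id: Option<String>,                // assumed default: None
    pub role_mapping: Option<String>,           // assumed default: None
    pub claim_mapping: HashMap<String, String>, // simplified value type
}

/// Mirrors the `access_tokens` entry of the YAML example: only two fields
/// are given explicitly, the rest come from `Default`.
pub fn example_access_tokens() -> TokenEntityMetadata {
    TokenEntityMetadata {
        trusted: true,
        principal_identifier: Some("jti".to_string()),
        ..Default::default()
    }
}
```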
Updated Rust Implementation

The PolicyStore struct and related methods have been refactored to support the new schema:

pub struct PolicyStore {
    pub name: Option<String>,
    pub description: Option<String>,
    pub cedar_version: Option<Version>,
    pub policies: HashMap<String, PolicyContent>,
    pub cedar_schema: CedarSchema,
    pub trusted_issuers: HashMap<String, TrustedIssuerMetadata>,
    policy_set: PolicySet,
}

impl PolicyStore {
    pub fn load_from_json(json: &str) -> Result<Self, LoadPolicyStoreError> {
        let json_store = serde_json::from_str::<PolicyStoreJson>(json)
            .map_err(LoadFromJsonError::Deserialization)?;
        json_store.try_into().map_err(LoadPolicyStoreError::Json)
    }
    pub fn load_from_yaml(yaml: &str) -> Result<Self, LoadPolicyStoreError> {
        let yaml_store = serde_yml::from_str::<PolicyStoreYaml>(yaml)
            .map_err(LoadFromYamlError::Deserialization)?;
        Ok(yaml_store.into())
    }

    pub fn policy_set(&self) -> &PolicySet {
        &self.policy_set
    }
}
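
In `load_from_json` above, the `?` operator is applied to a `LoadFromJsonError` inside a function that returns `LoadPolicyStoreError`, which only compiles if a `From` conversion between the two exists. The sketch below shows that layering with the standard library alone. The enums are simplified stand-ins (a `String` replaces the `serde_json` error), and `parse_object` is a toy placeholder for real deserialization, not the PR's code.

```rust
/// Stand-in for the JSON-specific error. The PR's enum wraps a
/// `serde_json` error; a `String` replaces it in this sketch.
#[derive(Debug, PartialEq)]
pub enum LoadFromJsonError {
    Deserialization(String),
}

/// Stand-in for the top-level error returned by the loaders.
#[derive(Debug, PartialEq)]
pub enum LoadPolicyStoreError {
    Json(LoadFromJsonError),
}

// Lets `?` lift a format-specific error into the top-level error.
impl From<LoadFromJsonError> for LoadPolicyStoreError {
    fn from(err: LoadFromJsonError) -> Self {
        LoadPolicyStoreError::Json(err)
    }
}

/// Toy placeholder for deserialization: accepts only input that looks
/// like a JSON object.
fn parse_object(raw: &str) -> Result<String, String> {
    let trimmed = raw.trim();
    if trimmed.starts_with('{') && trimmed.ends_with('}') {
        Ok(trimmed.to_string())
    } else {
        Err("expected a JSON object".to_string())
    }
}

/// Same error flow as the PR's `load_from_json`, with a `String` standing in
/// for the parsed policy store.
pub fn load_from_json(json: &str) -> Result<String, LoadPolicyStoreError> {
    let parsed = parse_object(json).map_err(LoadFromJsonError::Deserialization)?;
    Ok(parsed)
}
```

Callers can then match on `LoadPolicyStoreError::Json(..)` to tell which loader failed without inspecting message strings.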
Testing and Validation
  • Unit Tests: Added tests for JSON and YAML loading to ensure the new functionality works correctly.
  • Integration Tests: Tested compatibility with policies exported from Agama Lab Policy Designer.

Test and Document the changes

  • Static code analysis has been run locally and issues have been fixed
  • Relevant unit and integration tests have been added/updated
  • Relevant documentation has been updated if any (i.e. user guides, installation and configuration guides, technical design docs etc)

Please check the below before submitting your PR. The PR will not be merged if there are no commits that start with docs: to indicate documentation changes or if the below checklist is not selected.

  • I confirm that there is no impact on the docs due to the code changes in this PR.

- implement a `ClaimMapping` struct for the new `claim_mapping` field in
  the policy store
- implement deserialize for the `ClaimMapping` struct

Signed-off-by: rmarinn <[email protected]>
- implement `TokenEntityMetadata` struct
- implement `Deserialize` for `TokenEntityMetadata`

Signed-off-by: rmarinn <[email protected]>
- implement new struct TrustedIssuerMetadata
- implement Deserialize for TrustedIssuerMetadata

Signed-off-by: rmarinn <[email protected]>
- Implement AgamaPolicyStore struct.
- Implement Deserialize for AgamaPolicyStore struct.

Signed-off-by: rmarinn <[email protected]>
- change the type of AgamaPolicyStore.cedar_schema from
  cedar_policy::Schema to CedarSchema to make it compatible with the
  existing implementation

Signed-off-by: rmarinn <[email protected]>
- update the token_metadata implementation in the Cedarling PolicyStore
  to support the new schema.

Signed-off-by: rmarinn <[email protected]>
- remove old implementation for IdentitySource struct and related
  implementations. The new implementation, TrustedIssuerMetadata, has
  now been implemented with the main policy store.

Signed-off-by: rmarinn <[email protected]>
@rmarinn rmarinn added the comp-jans-cedarling Touching folder /jans-cedarling label Nov 10, 2024
@rmarinn rmarinn self-assigned this Nov 10, 2024

dryrunsecurity bot commented Nov 10, 2024

DryRun Security Summary

The provided code changes focus on improving the security of the Cedarling application, with updates to the policy store management, JWT token handling, and authorization enforcement, as well as the introduction of robust error handling, input validation, and comprehensive test cases.

Summary:

The provided code changes cover a wide range of updates and improvements to the Cedarling application, with a strong focus on application security. The changes span various components, including the policy store management, JWT token handling, and authorization enforcement.

Key security-related aspects of the changes include:

  1. Robust error handling and input validation when loading and parsing policy store configurations from JSON and YAML formats.
  2. Secure JWT decoding and validation strategies, with support for trusted issuer metadata and OpenID configuration fetching.
  3. Comprehensive test cases covering various edge cases and error scenarios related to token validation, key service, and policy enforcement.
  4. Improvements to the policy store structure, including the introduction of the TrustedIssuerMetadata type and the ability to define fine-grained policies with conditions based on principal and resource attributes.

Overall, the changes demonstrate a security-conscious approach to the application's development, with a focus on ensuring the integrity, reliability, and robustness of the security-critical components. However, it's important to continue reviewing the entire codebase and deployment configuration to identify any potential security vulnerabilities or areas for improvement.

Files Changed:

  1. docs/cedarling/cedarling-policy-store.md: Updates to the documentation for the Cedarling Policy Store, including changes to the JSON schema and policy store structure.
  2. jans-cedarling/bindings/cedarling_python/tests/test_policy_store.py: Improvements to the error handling and test cases for the PolicyStoreSource in the cedarling_python library.
  3. jans-cedarling/bindings/cedarling_python/example.py: Changes to the handling of the policy store location and the use of environment variables.
  4. jans-cedarling/cedarling/src/authz/entities/mod.rs: Implementation of entities for various types of tokens, including access tokens, ID tokens, and user info tokens.
  5. jans-cedarling/cedarling/src/authz/mod.rs: Improvements to the authorization evaluation functionality, including comprehensive logging and error handling.
  6. jans-cedarling/cedarling/src/common/policy_store.rs: Significant refactoring of the PolicyStore and related structures, including the introduction of PolicyContent and TrustedIssuerMetadata.
  7. jans-cedarling/cedarling/src/common/cedar_schema.rs: Improvements to the handling and deserialization of the CedarSchema struct.
  8. jans-cedarling/cedarling/src/common/policy_store/claim_mapping.rs: Introduction of a new module for handling claim mapping configurations.
  9. jans-cedarling/cedarling/src/common/policy_store/json_store.rs: Implementation of a JSON-based policy store.
  10. jans-cedarling/cedarling/src/common/policy_store/token_entity_metadata.rs: Definition of the TokenEntityMetadata struct for managing token-related metadata.
  11. jans-cedarling/cedarling/src/common/policy_store/test.rs: Addition of test cases for the PolicyStore deserialization and error handling.
  12. jans-cedarling/cedarling/src/common/policy_store/trusted_issuer_metadata.rs: Implementation of the TrustedIssuerMetadata struct for managing trusted issuer information.
  13. jans-cedarling/cedarling/src/common/policy_store/yaml_store.rs: Implementation of a YAML-based policy store.
  14. jans-cedarling/cedarling/src/init/policy_store.rs: Refactoring of the load_policy_store function.
  15. jans-cedarling/cedarling/src/init/service_config.rs: Updates to the initialization of the ServiceConfig struct, particularly the handling of trusted issuers and OpenID configuration.
  16. jans-cedarling/cedarling/src/jwt/decoding_strategy.rs: Introduction of JWT decoding strategies with and without validation.
  17. jans-cedarling/cedarling/src/jwt/mod.rs: Updates to the JwtService module, including the replacement of TrustedIssuer with `T

Code Analysis

We ran 9 analyzers against 30 files and 0 analyzers had findings. 9 analyzers had no findings.

Riskiness

🟢 Risk threshold not exceeded.


@mo-auto mo-auto added area-documentation Documentation needs to change as part of issue or PR comp-docs Touching folder /docs kind-feature Issue or PR is a new feature request labels Nov 10, 2024
nynymike previously approved these changes Nov 11, 2024
rmarinn commented Nov 11, 2024

Please don't approve the merge yet; this is still a work in progress. I only opened this PR so I could close the other one.

@rmarinn rmarinn force-pushed the jans-cedarling-10038 branch from 63bea68 to 194e776 Compare November 11, 2024 21:35
- Simplify YAML test files by removing the need for
  a top-level `policy_store` ID
- Ensure YAML test files exclusively contain human-readable Cedar code;
  base64-encoded schemas are now only used for JSON test files.
- Pending: Replace the existing implementation with the new parser.

Signed-off-by: rmarinn <[email protected]>
- split the parsing for TokenEntityMetadata into separate functions per
  field for easier unit-testing

Signed-off-by: rmarinn <[email protected]>
- Refactor deserialization logic to utilize existing helper functions.

Signed-off-by: rmarinn <[email protected]>
- rename AgamaPolicyStore to PolicySource
- move PolicySouceJson into its own file
- move PolicySouceYaml into its own file

Signed-off-by: rmarinn <[email protected]>
- moved a test file to the /test_files directory

Signed-off-by: rmarinn <[email protected]>
@rmarinn rmarinn changed the title from "feat(jans-cedarling): implement parser for agama-lab generated policies" to "feat(jans-cedarling): update PolicyStore parser to support agama-lab generated policies" Nov 13, 2024
@rmarinn rmarinn marked this pull request as ready for review November 13, 2024 09:39
@duttarnab duttarnab left a comment

Please see inline comments.

``` json
"policy_content": "cGVybWl0KAogICAgc..."
```
- **policy_content** : (*String*) The Cedar Policy Encoded in Base64.
Contributor

The policy content is described as a string, but at L55 it is shown as JSON.

"principal_identifier": "some_user123",
"role_mapping": "role",
"token_type": {
"user_id": "<field name in token (e.g., 'email', 'sub', 'uid', etc.) or '' if not used>",
Contributor

Let's say "claim name" instead of "field name" to match the terminology of the OIDC standard.

ref: https://openid.net/specs/openid-connect-basic-1_0.html#IDToken

Contributor Author

I was just following the wiki (Token Entity Metadata Schema). Maybe this should be discussed first? I thought we should simply follow the wiki, because it has already been discussed.

- **token_type:** The type of token being processed, such as `access_tokens`, `id_tokens`, `userinfo_tokens`, and `tx_tokens`.
- **user_id (_Optional_):** The field in the token used to identify the user. If not needed, set to an empty string (`""`).
- **role_mapping (_Optional_):** Indicates which field in the token should be used for role-based access control. If not needed, set to an empty string (`""`).
- **claim_mapping:** Defines how to extract and transform specific claims from the token. Each claim can have its own parser (`regex` or `json`) and type (`Acme::Email`, `Acme::URI`, etc.).
Contributor

I believe it would be easier to understand if the regex parser in claim_mapping is explained with an example—showing how Cedarling would use the regex parser to populate the claim value from the token to the entity. Otherwise, the point remains unclear. Additionally, the json parser could also be explained with an example.

Contributor Author

I was only following this: https://github.com/JanssenProject/jans/wiki/Cedarling-Nativity-Plan. I honestly don't know how this is meant to be used either, so I couldn't provide an example of what Cedarling does after it has parsed the config.
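
Purely as an illustration of the kind of example requested above, and not as a description of Cedarling's actual behavior, the sketch below splits an email-like claim value into two named parts. A real `regex` parser would presumably use named capture groups; a plain split on the first `@` stands in for that here so the snippet needs no external crates. The function name `map_email_claim` and the attribute names `uid` and `domain` are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical result of applying a claim mapping: attribute name -> value.
pub type MappedAttrs = HashMap<String, String>;

/// Splits an email-like claim into two named parts, `uid` and `domain`.
/// A plain split on the first '@' stands in for a regular expression with
/// two named capture groups.
pub fn map_email_claim(claim_value: &str) -> Option<MappedAttrs> {
    let (uid, domain) = claim_value.split_once('@')?;
    // Reject values where either side of the '@' is missing.
    if uid.is_empty() || domain.is_empty() {
        return None;
    }
    let mut attrs = MappedAttrs::new();
    attrs.insert("uid".to_string(), uid.to_string());
    attrs.insert("domain".to_string(), domain.to_string());
    Some(attrs)
}
```

Under this assumed behavior, the extracted parts would become attributes of the entity named by the mapping's `type` (for example `Acme::Email`).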

@djellemah djellemah left a comment

Firstly, I'm strongly opposed to having separate code paths for yaml and json parsing. A large part of the justification for introducing yaml was to maintain exactly the same code paths as json, so that yaml test fixtures will exercise the same code that the json test fixtures were previously exercising.

As far as I know, the Agama Lab Policy Designer team has no plans to provide output in yaml?

Secondly, the purpose of having the alternative String | Object structure as the value for schema and policy_content was to allow for human-readability in test fixtures, while simultaneously providing the Agama Lab Policy Designer team with a backwards-compatible schema that would also allow them to start using human-readable values when and if that works for them.

@olehbozhok olehbozhok left a comment

Actually I agree with @djellemah.

@rmarinn you did a lot of work, but it was not discussed beforehand, and as a result it needs to be redone.
Using different parsing code for YAML and JSON will lead to errors.

    .ok_or_else(|| de::Error::missing_field("parser"))?;

match parser {
    "regex" => {
Contributor

Why can't we just create a struct something like this and deserialize into it?
Definitely with better naming.

struct SomeName{
    #[serde(rename(deserialize = "type"))]
    cedar_type: String,
    regex_expression: String,
    #[serde(flatten)]
    fields: HashMap<String, RegexField>,
}

pub struct PolicyStoreDataJson {
    /// Optional name of the policy store.
    #[serde(deserialize_with = "parse_option_string", default)]
    pub name: Option<String>,
Contributor

This value is mandatory, because it defines the namespace of the cedar-policy schema.

pub struct PolicyStoreJson {
    /// Optional Cedar version information.
    #[serde(deserialize_with = "parse_maybe_cedar_version")]
    cedar_version: Option<Version>,

@@ -184,7 +184,7 @@ impl<'de> serde::Deserialize<'de> for CedarSchema {
 mod deserialize {
     #[derive(Debug, thiserror::Error)]
     pub enum ParseCedarSchemaSetMessage {
-        #[error("unable to decode cedar policy schema base64")]
+        #[error("Failed to decode Base64 encoded string")]
Contributor

It would be great to extend this message.

Suggested change
- #[error("Failed to decode Base64 encoded string")]
+ #[error("Failed to decode Base64 encoded string cedar policy schema")]

    .decode(value)
    .map_err(|e| de::Error::custom(format!("Failed to decode Base64 encoded string: {}", e)))?;
String::from_utf8(buf)
    .map_err(|e| de::Error::custom(format!("Failed to decode Base64 encoded string: {}", e)))
Contributor

In this case we don't decode Base64; we convert the already Base64-decoded bytes to a UTF-8 string, so the error message should say that.
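
The distinction raised in this comment can be sketched as two separate error variants, one per step. This is a minimal standard-library illustration, not the PR's code: `DecodeSchemaError`, `decode_schema`, and the small hand-written `decode_base64` are hypothetical names, and the PR itself relies on a Base64 library rather than a hand-rolled decoder.

```rust
use std::string::FromUtf8Error;

/// One variant per failure point, so the message names the step that failed.
#[derive(Debug)]
pub enum DecodeSchemaError {
    /// The input was not valid Base64.
    Base64(String),
    /// The decoded bytes were not valid UTF-8.
    Utf8(FromUtf8Error),
}

/// Minimal standard-alphabet Base64 decoder, written out only so the sketch
/// has no dependencies. It ignores trailing '=' and does not validate padding.
fn decode_base64(input: &str) -> Result<Vec<u8>, DecodeSchemaError> {
    let trimmed = input.trim_end_matches('=');
    let mut out = Vec::with_capacity(trimmed.len() * 3 / 4);
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for ch in trimmed.bytes() {
        let value = match ch {
            b'A'..=b'Z' => ch - b'A',
            b'a'..=b'z' => ch - b'a' + 26,
            b'0'..=b'9' => ch - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            other => {
                return Err(DecodeSchemaError::Base64(format!(
                    "invalid character {:?}",
                    other as char
                )))
            }
        };
        // Accumulate 6 bits per character and emit a byte once 8 are available.
        buffer = (buffer << 6) | u32::from(value);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1;
        }
    }
    Ok(out)
}

pub fn decode_schema(encoded: &str) -> Result<String, DecodeSchemaError> {
    // Step 1: Base64 -> bytes. A failure here is a Base64 problem.
    let bytes = decode_base64(encoded)?;
    // Step 2: bytes -> String. A failure here is a UTF-8 problem.
    String::from_utf8(bytes).map_err(DecodeSchemaError::Utf8)
}
```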

olehbozhok previously approved these changes Nov 13, 2024

@olehbozhok olehbozhok left a comment

We need to merge this and fix the issues later.

rmarinn commented Nov 14, 2024

I just made a new PR instead of trying to revert this to fix the issues. Please see: #10141.

@rmarinn rmarinn closed this Nov 14, 2024
Development

Successfully merging this pull request may close these issues.

fix(cedarling): Make cedarling Policy Store compatible with Agama Lab