Can't produce AVRO objects that have an array of records property #70

Open
zaidyahya opened this issue Aug 24, 2020 · 2 comments

@zaidyahya

I'm trying to create an Avro schema for an object that is more complex than usual. Specifically, I want it to have a property that holds an array of records, but I can't find a way to define that in the schema. I'm using ksql-datagen, so I rely on the Avro Random Generator when building the schema.

My aim is a structure like this:

class Event {
    String name
    Address address
    List<Reference> references
}

where

class Address {
    String street
}

class Reference {
    String age
}
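
For reference, the same shape can be sketched as plain Python dataclasses and serialized to JSON to show what one generated value would look like. This is only an illustration: the sample values are made up (they mirror the kind of options used in the schema in this issue), and nothing here is part of ksql-datagen or the Avro Random Generator.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Address:
    street: str


@dataclass
class Reference:
    age: str


@dataclass
class Event:
    name: str
    address: Address
    # The property in question: a list of nested records.
    references: List[Reference] = field(default_factory=list)


# Illustrative values only; a data generator would produce these randomly.
sample = Event(
    name="JaneDoe",
    address=Address(street="Account_12"),
    references=[Reference(age="13"), Reference(age="14")],
)

# asdict() converts nested dataclasses (including those inside lists) to dicts.
sample_json = json.dumps(asdict(sample))
print(sample_json)
```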

My Avro schema looks like this:

{
        "namespace": "ksql",
        "name": "event",
        "type": "record",
        "fields": [
            {   "name": "name", 
                "type": {
                    "type": "string",
                    "arg.properties": {
                        "options": [
                            "JaneDoe",
                            "TestUser"
                        ]
                    }
                }
            },
            {   "name": "address",
                "type": {
                    "name": "address",
                    "type": "record",
                    "fields": [
                        {
                            "name": "street",
                            "type": {
                                "type": "string",
                                "arg.properties": {
                                    "regex":"Account_[1-9]{0,2}"
                                }
                            }
                        }
                    ]
                }
            },
            {
                "name": references,
                "type": {
                    "type": "array",
                    "items": {
                        "name": "reference",
                        "type": {
                            "name": "reference",
                            "type": "record",
                            "fields": [
                                {
                                    "name": "age",
                                    "type": {
                                        "type": "string",
                                        "arg.properties": {
                                            "options": [
                                                "13", "14"
                                            ]
                                        }
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        ]
}

I get the error No type: {"name":"reference","type":{"name":"reference","type":"record","fields":[{"name":"age","type":{"type":"string","arg.properties":{"options":["13","14"]}}}]}}

I have tried different ways of defining the array, such as removing the first name and type in items. But that gets me this error, org.apache.kafka.connect.errors.DataException: Invalid Java object for schema type STRUCT: class org.apache.avro.generic.GenericData$Record for field: "null"

A single record property works fine (for example, the address property), but I can't construct an array of records. Could anyone point out what I'm doing wrong?
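
One way to narrow down the "No type" error is to walk the schema and report every place where a schema object lacks a plain string "type". The sketch below is a hypothetical helper (find_untyped_schemas is not part of Avro or ksql-datagen) and assumes that the parser expects the "type" attribute of a schema object to be a type name, which is consistent with the object printed in the error message. It does not address the separate DataException.

```python
def find_untyped_schemas(schema, path="$"):
    """Return paths of schema positions with no usable "type" name.

    Hypothetical helper for illustration; it only covers records,
    arrays, maps, and unions.
    """
    if schema is None:
        return [path]
    problems = []
    if isinstance(schema, list):  # a union: check every branch
        for i, branch in enumerate(schema):
            problems += find_untyped_schemas(branch, f"{path}[{i}]")
    elif isinstance(schema, dict):
        type_name = schema.get("type")
        if not isinstance(type_name, str):
            # "type" is missing or is itself an object at a schema position.
            return [path]
        if type_name == "record":
            for fld in schema.get("fields", []):
                # A field's "type" is itself a schema, so recurse into it.
                problems += find_untyped_schemas(
                    fld.get("type"), f"{path}.{fld.get('name')}"
                )
        elif type_name == "array":
            problems += find_untyped_schemas(schema.get("items"), f"{path}.items")
        elif type_name == "map":
            problems += find_untyped_schemas(schema.get("values"), f"{path}.values")
    return problems


reference_record = {
    "name": "reference",
    "type": "record",
    "fields": [{"name": "age", "type": "string"}],
}

# "items" wrapped in an extra name/type object, as in the schema above.
wrapped = {
    "name": "event",
    "type": "record",
    "fields": [
        {
            "name": "references",
            "type": {
                "type": "array",
                "items": {"name": "reference", "type": reference_record},
            },
        }
    ],
}

# "items" holding the record schema directly.
direct = {
    "name": "event",
    "type": "record",
    "fields": [
        {"name": "references", "type": {"type": "array", "items": reference_record}}
    ],
}

print(find_untyped_schemas(wrapped))  # ['$.references.items']
print(find_untyped_schemas(direct))   # []
```

The wrapped form is flagged at the "items" position, which matches the object shown in the error message; the direct form passes this particular check.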

@zaidyahya changed the title from "Can't produce AVRO objects that have a property array of records" to "Can't produce AVRO objects that have an array of records property" on Aug 24, 2020
@karthikn-kmart

karthikn-kmart commented Oct 8, 2021

Any update on this issue? I have the same requirement.
My datagen schema:
{
        "namespace": "ksql",
        "name": "orders",
        "type": "record",
        "fields": [
            {
                "name": "orderPlacedDatetime",
                "type": {
                    "type": "long",
                    "format_as_time": "unix_long",
                    "arg.properties": {
                        "iteration": {
                            "start": 1,
                            "step": 10
                        }
                    }
                }
            },
            {
                "name": "customerId",
                "type": {
                    "type": "int",
                    "arg.properties": {
                        "range": {
                            "min": 101,
                            "max": 110
                        }
                    }
                }
            },
            {
                "name": "storeId",
                "type": {
                    "type": "string",
                    "arg.properties": {
                        "regex": "store[1-9]"
                    }
                }
            },
            {
                "name": "products",
                "type": {
                    "type": "array",
                    "items": {
                        "name": "product",
                        "type": "record",
                        "fields": [
                            {
                                "name": "keycode",
                                "type": {
                                    "type": "string",
                                    "arg.properties": {
                                        "options": [
                                            "123", "456", "789"
                                        ]
                                    }
                                }
                            },
                            {
                                "name": "quantity",
                                "type": {
                                    "type": "int",
                                    "arg.properties": {
                                        "range": {
                                            "min": 1,
                                            "max": 10
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            }
        ]
}

I'm getting org.apache.kafka.connect.errors.DataException: Invalid Java object for schema with type STRUCT: class org.apache.avro.generic.GenericData$Record

@poruganti-confluent

Try this schema. I can see multiple issues with the schema you posted. First, the name references is not wrapped in double quotes ("). Second, the items of the array should be enclosed in [ ].

{
        "namespace": "ksql",
        "name": "event",
        "type": "record",
        "fields": [
            {
                "name": "name",
                "type": {
                    "type": "string",
                    "arg.properties": {
                        "options": [
                            "JaneDoe",
                            "TestUser"
                        ]
                    }
                }
            },
            {
                "name": "address",
                "type": {
                    "name": "address",
                    "type": "record",
                    "fields": [
                        {
                            "name": "street",
                            "type": {
                                "type": "string",
                                "arg.properties": {
                                    "regex": "Account_[1-9]{0,2}"
                                }
                            }
                        }
                    ]
                }
            },
            {
                "name": "references",
                "type": {
                    "type": "array",
                    "items": [
                        {
                            "name": "reference",
                            "type": {
                                "name": "reference",
                                "type": "record",
                                "fields": [
                                    {
                                        "name": "age",
                                        "type": {
                                            "type": "string",
                                            "arg.properties": {
                                                "options": [
                                                    "13", "14"
                                                ]
                                            }
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            }
        ]
}
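
The quoting problem can be confirmed before the schema ever reaches a schema parser, since an unquoted name is not valid JSON. A minimal sketch using Python's standard json module follows; the two fragments and the is_valid_json helper are illustrative and are not taken from ksql-datagen.

```python
import json

# Reduced fragments of a field declaration, for illustration only.
fragment_unquoted = '{ "name": references, "type": "string" }'
fragment_quoted = '{ "name": "references", "type": "string" }'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(fragment_unquoted))  # False
print(is_valid_json(fragment_quoted))    # True
```

Running the full schema file through a check like this first separates plain JSON syntax errors from problems in the schema's structure.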
