Add Generic RecordBuilder to use with (*Table)InsertRecord (#622)
* Add Generic RecordBuilder to use with (*Table)InsertRecord

I noticed that parca uses `(*Table)InsertRecord` and builds the `arrow.Record` by
hand. Since the table is flat by nature, this is where a generic solution
fits better. Building records by hand is error-prone and adds a long-term
maintenance burden.

This commit adds `Build[T]`, which is parameterized by a struct type `T` and provides an API for
appending values of `T` and retrieving the final `arrow.Record`.

Struct tags describe the schema. (This is a common pattern that works very well in parquet-go.)

Here is an example of the `Sample` schema annotated with tags:

```go
type Sample struct {
	ExampleType string      `frostdb:"example_type,rle_dict,asc(0)"`
	Labels      []Label     `frostdb:"labels,rle_dict,null,dyn,asc(1),null_first"`
	Stacktrace  []uuid.UUID `frostdb:"stacktrace,rle_dict,asc(3),null_first"`
	Timestamp   int64       `frostdb:"timestamp,asc(2)"`
	Value       int64       `frostdb:"value"`
}
```
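To illustrate how such tags can be read at runtime, here is a minimal sketch using only the standard `reflect` package. It is not the parser added by this commit: `columnTag` and `parseTags` are hypothetical names, and the `Sample` type is trimmed to fields that need no external packages. The sketch only splits each tag into a column name and a list of raw options.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Sample mirrors the tagged struct above, trimmed to fields that need no
// external packages.
type Sample struct {
	ExampleType string `frostdb:"example_type,rle_dict,asc(0)"`
	Timestamp   int64  `frostdb:"timestamp,asc(2)"`
	Value       int64  `frostdb:"value"`
}

// columnTag is a hypothetical holder for a parsed tag; it is not a frostdb type.
type columnTag struct {
	Name    string
	Options []string
}

// parseTags reads the `frostdb` tag of every field of the struct value v.
// The first comma-separated element is treated as the column name and the
// rest as raw options. Fields without the tag are skipped.
func parseTags(v any) []columnTag {
	t := reflect.TypeOf(v)
	var out []columnTag
	for i := 0; i < t.NumField(); i++ {
		raw, ok := t.Field(i).Tag.Lookup("frostdb")
		if !ok {
			continue
		}
		parts := strings.Split(raw, ",")
		out = append(out, columnTag{Name: parts[0], Options: parts[1:]})
	}
	return out
}

func main() {
	for _, c := range parseTags(Sample{}) {
		fmt.Println(c.Name, c.Options)
	}
}
```

Keeping the options as raw strings leaves their interpretation (encoding, nullability, sort order) to a later step, which is one possible design among several.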

This is equivalent to:

```go
func SampleDefinition() *schemapb.Schema {
	return &schemapb.Schema{
		Name: "test",
		Columns: []*schemapb.Column{{
			Name: "example_type",
			StorageLayout: &schemapb.StorageLayout{
				Type:     schemapb.StorageLayout_TYPE_STRING,
				Encoding: schemapb.StorageLayout_ENCODING_RLE_DICTIONARY,
			},
			Dynamic: false,
		}, {
			Name: "labels",
			StorageLayout: &schemapb.StorageLayout{
				Type:     schemapb.StorageLayout_TYPE_STRING,
				Nullable: true,
				Encoding: schemapb.StorageLayout_ENCODING_RLE_DICTIONARY,
			},
			Dynamic: true,
		}, {
			Name: "stacktrace",
			StorageLayout: &schemapb.StorageLayout{
				Type:     schemapb.StorageLayout_TYPE_STRING,
				Encoding: schemapb.StorageLayout_ENCODING_RLE_DICTIONARY,
			},
			Dynamic: false,
		}, {
			Name: "timestamp",
			StorageLayout: &schemapb.StorageLayout{
				Type: schemapb.StorageLayout_TYPE_INT64,
			},
			Dynamic: false,
		}, {
			Name: "value",
			StorageLayout: &schemapb.StorageLayout{
				Type: schemapb.StorageLayout_TYPE_INT64,
			},
			Dynamic: false,
		}},
		SortingColumns: []*schemapb.SortingColumn{{
			Name:      "example_type",
			Direction: schemapb.SortingColumn_DIRECTION_ASCENDING,
		}, {
			Name:       "labels",
			Direction:  schemapb.SortingColumn_DIRECTION_ASCENDING,
			NullsFirst: true,
		}, {
			Name:      "timestamp",
			Direction: schemapb.SortingColumn_DIRECTION_ASCENDING,
		}, {
			Name:       "stacktrace",
			Direction:  schemapb.SortingColumn_DIRECTION_ASCENDING,
			NullsFirst: true,
		}},
	}
}
```
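Comparing the tags with `SortingColumns` above suggests that the number in `asc(N)` gives a column's position in the sort order: `timestamp` carries `asc(2)` and `stacktrace` carries `asc(3)`, and they appear in that order in the definition. The following sketch works through that reading with the standard library only. `sortSpec`, `parseAsc`, and `sortingColumns` are hypothetical helpers, not part of frostdb, and the interpretation of `asc(N)` is inferred from the example rather than taken from the `Build` documentation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// sortSpec pairs a column name with the index found in its asc(N) option.
// It is a hypothetical type used only in this sketch.
type sortSpec struct {
	Column string
	Index  int
}

// parseAsc extracts N from an option of the form "asc(N)".
// It reports false for any other option.
func parseAsc(opt string) (int, bool) {
	if !strings.HasPrefix(opt, "asc(") || !strings.HasSuffix(opt, ")") {
		return 0, false
	}
	n, err := strconv.Atoi(opt[len("asc(") : len(opt)-1])
	if err != nil {
		return 0, false
	}
	return n, true
}

// sortingColumns returns the names of all columns that carry an asc(N)
// option, ordered by N.
func sortingColumns(tags []string) []string {
	var specs []sortSpec
	for _, tag := range tags {
		parts := strings.Split(tag, ",")
		for _, opt := range parts[1:] {
			if n, ok := parseAsc(opt); ok {
				specs = append(specs, sortSpec{Column: parts[0], Index: n})
			}
		}
	}
	sort.Slice(specs, func(i, j int) bool { return specs[i].Index < specs[j].Index })
	names := make([]string, len(specs))
	for i, s := range specs {
		names[i] = s.Column
	}
	return names
}

func main() {
	// Tag values copied from the Sample struct, in field order.
	tags := []string{
		"example_type,rle_dict,asc(0)",
		"labels,rle_dict,null,dyn,asc(1),null_first",
		"stacktrace,rle_dict,asc(3),null_first",
		"timestamp,asc(2)",
		"value",
	}
	fmt.Println(sortingColumns(tags))
}
```

Under this reading the result lists `example_type`, `labels`, `timestamp`, `stacktrace`, the same order as `SortingColumns`, while `value` is left out because it has no `asc` option.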

Example usage

```go
b := NewBuild[Sample](memory.DefaultAllocator)
defer b.Release()
samples := NewTestSamples()
b.Append(samples...)
r := b.NewRecord()
```

Please see the `Build` godoc comment for more details and limitations.
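For readers unfamiliar with the idea of turning rows of structs into columns, the following toy sketch shows the general technique with the standard library only. It is not the `Build` implementation from this commit: `toyBuilder`, `newToyBuilder`, and `Point` are hypothetical names, the columns are plain `[]any` slices rather than Arrow arrays, and it assumes `T` is a struct with exported fields.

```go
package main

import (
	"fmt"
	"reflect"
)

// Point is a hypothetical flat row type used only for this illustration.
type Point struct {
	Name  string
	Value int64
}

// toyBuilder is a simplified stand-in for a generic record builder:
// it keeps one slice of values per struct field of T.
type toyBuilder[T any] struct {
	names   []string
	columns [][]any
}

// newToyBuilder inspects T once and prepares one empty column per field.
// It assumes T is a struct type with exported fields.
func newToyBuilder[T any]() *toyBuilder[T] {
	var zero T
	t := reflect.TypeOf(zero)
	b := &toyBuilder[T]{columns: make([][]any, t.NumField())}
	for i := 0; i < t.NumField(); i++ {
		b.names = append(b.names, t.Field(i).Name)
	}
	return b
}

// Append splits each row into its fields and appends every field value
// to the column with the same index.
func (b *toyBuilder[T]) Append(rows ...T) {
	for _, row := range rows {
		v := reflect.ValueOf(row)
		for i := range b.columns {
			b.columns[i] = append(b.columns[i], v.Field(i).Interface())
		}
	}
}

func main() {
	b := newToyBuilder[Point]()
	b.Append(Point{Name: "a", Value: 1}, Point{Name: "b", Value: 2})
	for i, name := range b.names {
		fmt.Println(name, b.columns[i])
	}
}
```

Because the field layout is derived from the type parameter, callers never index columns by hand, which is the maintenance benefit the commit message describes.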

* fix lint: handle returned error

* fix lint: gofumpt

* fix lint: goimports -w -local
gernest authored Dec 12, 2023
1 parent bc23e42 commit 9e1c181
Showing 3 changed files with 679 additions and 5 deletions.
10 changes: 5 additions & 5 deletions dynparquet/example.go
@@ -22,11 +22,11 @@ type Label struct {
 }
 
 type Sample struct {
-	ExampleType string
-	Labels      []Label
-	Stacktrace  []uuid.UUID
-	Timestamp   int64
-	Value       int64
+	ExampleType string      `frostdb:"example_type,rle_dict,asc(0)"`
+	Labels      []Label     `frostdb:"labels,rle_dict,null,dyn,asc(1),null_first"`
+	Stacktrace  []uuid.UUID `frostdb:"stacktrace,rle_dict,asc(3),null_first"`
+	Timestamp   int64       `frostdb:"timestamp,asc(2)"`
+	Value       int64       `frostdb:"value"`
 }
 
 type Samples []Sample