Feature/serialization minimap #23

Merged
mineralntl merged 38 commits into main from feature/serializationMinimap on Jun 24, 2024

mineralntl
Collaborator

No description provided.

@mineralntl mineralntl linked an issue Jun 9, 2023 that may be closed by this pull request
@mineralntl mineralntl force-pushed the feature/serializationMinimap branch from 02dbeb5 to 6e5d9d6 Compare June 20, 2023 19:16

public class TypeMetadata implements Serializable {

-    private Set<String> ingestTypes = Sets.newHashSet();
+    private Set<String> ingestTypes = new TreeSet<>();
Collaborator

If you're going to make these TreeSets, then the variable's declared type should be changed to SortedSet as well.
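
A minimal sketch of what that suggestion could look like, assuming the field is only ever backed by a `TreeSet`. The class name `SortedSetFieldSketch` and its accessor methods are hypothetical and are not taken from the actual `TypeMetadata` class.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for the relevant part of TypeMetadata.
public class SortedSetFieldSketch {

    // Declaring the field as SortedSet (not Set) makes the ordering guarantee
    // part of the type, so callers can rely on it without knowing the implementation.
    private final SortedSet<String> ingestTypes = new TreeSet<>();

    public void addIngestType(String ingestType) {
        ingestTypes.add(ingestType);
    }

    public SortedSet<String> getIngestTypes() {
        return ingestTypes;
    }

    public static void main(String[] args) {
        SortedSetFieldSketch sketch = new SortedSetFieldSketch();
        sketch.addIngestType("ingest2");
        sketch.addIngestType("ingest1");
        // Elements come back in natural (lexicographic) order regardless of insertion order.
        System.out.println(sketch.getIngestTypes()); // prints [ingest1, ingest2]
    }
}
```

With the narrower declared type, ordering-specific methods such as `first()` and `headSet(...)` become available without a cast.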

public String toString() {
StringBuilder sb = new StringBuilder();

Set<String> fieldNames = Sets.newHashSet();
for (String ingestType : typeMetadata.keySet()) {
// create and append ingestTypes mini-map
Collaborator

I'm looking at the minimap for one of the tests and I'm wondering if the serialization schema can be simplified (and by extension, if the serialization/deserialization code can be simplified).

Take this for example:

# schema proposed in merge request
dts:[0:ingest1,1:ingest2];types:[0:DateType,1:IntegerType,2:LcType];FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]

# alternate schema
ingest1,ingest2;DateType,IntegerType,LcType;FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]

# 1. Split the serialized string on semicolons. If three pieces exist, then we're dealing with a minified string.
# 2. Split the ingest type component on commas into an array. Now we have an array of ingest types.
# 3. Split the Types component on commas into an array. Now we have an array of Types.
# 4. Split each field component on commas and iterate through it. As you iterate, the ingest type and Type indexes are looked up in the two arrays created earlier.
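
The four steps above can be sketched roughly as follows. This is not the code from the pull request; it uses only the JDK and makes several assumptions: each pair inside the brackets is read as `ingestTypeIndex:typeIndex`, field names contain no colon, "three pieces" is read as "at least three", and the class name `MiniMapParserSketch` and the nested-map return type are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MiniMapParserSketch {

    /** Parses the alternate schema into fieldName -> (ingestType -> type names). */
    public static Map<String, Map<String, List<String>>> parse(String serialized) {
        // Step 1: split on semicolons: ingest types, types, then one piece per field.
        String[] pieces = serialized.split(";");
        if (pieces.length < 3) {
            throw new IllegalArgumentException("Not a minified string: " + serialized);
        }
        // Steps 2 and 3: the first two pieces are the lookup arrays.
        String[] ingestTypes = pieces[0].split(",");
        String[] types = pieces[1].split(",");

        // Step 4: every remaining piece looks like FIELD:[i:t,i:t,...].
        Map<String, Map<String, List<String>>> result = new TreeMap<>();
        for (int i = 2; i < pieces.length; i++) {
            String piece = pieces[i];
            int colon = piece.indexOf(':');
            String fieldName = piece.substring(0, colon);
            // Drop the leading ":[" and the trailing "]".
            String body = piece.substring(colon + 2, piece.length() - 1);
            Map<String, List<String>> byIngestType =
                    result.computeIfAbsent(fieldName, k -> new TreeMap<>());
            for (String pair : body.split(",")) {
                String[] idx = pair.split(":");
                // Assumption: first index is the ingest type, second is the Type.
                String ingestType = ingestTypes[Integer.parseInt(idx[0])];
                String type = types[Integer.parseInt(idx[1])];
                byIngestType.computeIfAbsent(ingestType, k -> new ArrayList<>()).add(type);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "ingest1,ingest2;DateType,IntegerType,LcType;"
                + "FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]";
        System.out.println(parse(s));
    }
}
```

Under those assumptions, `FIELD1:[0:2,1:0]` resolves to `ingest1 -> LcType` and `ingest2 -> DateType`.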

Serialization should be simpler as well: it amounts to calling Joiner.on(comma/colon/semicolon).join(ingest types/types/etc.).
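
A rough sketch of that join-based serialization, for illustration only. To keep it runnable without Guava, it swaps the JDK's `String.join` in for `Joiner`. The class name `MiniMapWriterSketch`, the nested-map input shape, and the `ingestTypeIndex:typeIndex` pair order are assumptions, not the pull request's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class MiniMapWriterSketch {

    /** Serializes fieldName -> (ingestType -> type names) into the alternate schema. */
    public static String serialize(Map<String, Map<String, List<String>>> fields) {
        // Collect distinct ingest types and type names in sorted order so indexes are stable.
        SortedSet<String> ingestTypeSet = new TreeSet<>();
        SortedSet<String> typeSet = new TreeSet<>();
        for (Map<String, List<String>> byIngestType : fields.values()) {
            ingestTypeSet.addAll(byIngestType.keySet());
            byIngestType.values().forEach(typeSet::addAll);
        }
        List<String> ingestTypes = new ArrayList<>(ingestTypeSet);
        List<String> types = new ArrayList<>(typeSet);

        // The first two pieces are plain comma-joined lists; list position is the index.
        List<String> pieces = new ArrayList<>();
        pieces.add(String.join(",", ingestTypes));
        pieces.add(String.join(",", types));

        // One piece per field, in sorted field-name order: FIELD:[i:t,i:t,...].
        Map<String, Map<String, List<String>>> sortedFields = new TreeMap<>(fields);
        for (Map.Entry<String, Map<String, List<String>>> field : sortedFields.entrySet()) {
            Map<String, List<String>> sortedByIngestType = new TreeMap<>(field.getValue());
            List<String> pairs = new ArrayList<>();
            for (Map.Entry<String, List<String>> entry : sortedByIngestType.entrySet()) {
                for (String type : entry.getValue()) {
                    pairs.add(ingestTypes.indexOf(entry.getKey()) + ":" + types.indexOf(type));
                }
            }
            pieces.add(field.getKey() + ":[" + String.join(",", pairs) + "]");
        }
        return String.join(";", pieces);
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> fields = new TreeMap<>();
        fields.put("FIELD1", Map.of("ingest1", List.of("LcType"), "ingest2", List.of("DateType")));
        fields.put("FIELD2", Map.of("ingest1", List.of("IntegerType"), "ingest2", List.of("LcType")));
        fields.put("FIELD3", Map.of("ingest1", List.of("DateType"), "ingest2", List.of("DateType")));
        System.out.println(serialize(fields));
    }
}
```

Sorting the ingest types and type names before assigning indexes keeps the output deterministic, which is also the motivation behind the TreeSet change discussed earlier in this review.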

apmoriarty
apmoriarty previously approved these changes Feb 27, 2024
hlgp
hlgp previously approved these changes Mar 4, 2024
@mineralntl mineralntl dismissed stale reviews from hlgp and apmoriarty via 66d34f4 March 6, 2024 20:46
@mineralntl
Collaborator Author

After further testing, fixed the code that was unintentionally concatenating dataTypes.

@mineralntl mineralntl requested a review from avgAGB May 22, 2024 18:29
@mineralntl mineralntl merged commit 5b62bf6 into main Jun 24, 2024
2 checks passed
@mineralntl mineralntl deleted the feature/serializationMinimap branch June 24, 2024 17:42
jwomeara added a commit that referenced this pull request Jun 26, 2024
jwomeara added a commit that referenced this pull request Jun 26, 2024
commit 56a2269762b7ccd8986790aa9f0d235172ff3161
Author: Whitney O'Meara <[email protected]>
Date:   Wed Jun 26 20:40:33 2024 +0000

    updated type-utils version

commit ab0098dbe86f2d15bf2a18c1b74dc22623183c30
Author: Whitney O'Meara <[email protected]>
Date:   Wed Jun 26 18:51:44 2024 +0000

    Reapply "Feature/serialization minimap (#23)"

    This reverts commit 8255413.

commit 1a772f6480d1edf6fa383e2e6f4dc6138209b88e
Merge: 8255413 ebca7ce
Author: Whitney O'Meara <[email protected]>
Date:   Wed Jun 26 18:16:19 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit ebca7ce
Author: Moon Moon <[email protected]>
Date:   Wed Jun 26 13:31:05 2024 -0400

    Preventing npe from empty data or ingest type string (#38)

commit 8255413
Author: Whitney O'Meara <[email protected]>
Date:   Tue Jun 25 13:32:06 2024 +0000

    Revert "Feature/serialization minimap (#23)"

    This reverts commit 5b62bf6.

commit 7760b83
Author: Moriarty <[email protected]>
Date:   Tue Jun 25 09:45:29 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit 6a34dce
Author: Moriarty <[email protected]>
Date:   Tue Jun 25 09:45:27 2024 +0000

    [maven-release-plugin] prepare release 4.0.2

commit f589568
Merge: add487b 5b62bf6
Author: Whitney O'Meara <[email protected]>
Date:   Mon Jun 24 22:08:51 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit 5b62bf6
Author: Moon Moon <[email protected]>
Date:   Mon Jun 24 13:42:14 2024 -0400

    Feature/serialization minimap (#23)

    * Adding ability to parse string with minimap

    * Creating mini-map during serialization

    * Changing to TreeSet for ordering purposes

    * WIP forming new mini-map string

    * Ensuring ordered types during serialization

    * Removing duplicate unit test

    * Adding ability to parse string with minimap

    * Creating mini-map during serialization

    * Changing to TreeSet for ordering purposes

    * WIP forming new mini-map string

    * Ensuring ordered types during serialization

    * Removing duplicate unit test

    * Moving hard coded strings

    * Formatting

    * Removing old method calls

    * Removing unnecessary exception throwing

    * Formatting

    * Updating to remove HashSet to preserve ordering

    * Updating unit tests

    * Updating unit tests again

    * Updating unit tests again again

    * Formatting

    * Removing old methods

    * Adding in fieldName creation

    * Updates based on testing

    * Returning immutable map

    * Fixing concatenated dataTypes

    ---------

    Co-authored-by: Ivan Bella <[email protected]>

commit 6f2d4d4
Author: Moriarty <[email protected]>
Date:   Mon Jun 24 12:15:21 2024 -0400

    Support field cardinality across a date range (#37)

    * Add seeking filter for the F column to support getting field cardinality across a date range

    * guard against empty ranges

    * move log messages from debug to trace

commit 44482bd
Author: Moriarty <[email protected]>
Date:   Tue Jun 18 10:57:37 2024 -0400

    Cleanup code, logging formats for AllFieldMetadataHelper. Wrap scanner in try-with-resources blocks (#36)

commit f288080
Author: Moriarty <[email protected]>
Date:   Fri Jun 14 09:30:16 2024 -0400

    MetadataHelper address try-with-resources warnings (#35)

    * Wrap scanners in try-with-resources

    * Additional instances of try-with-resources

commit c33f5ea
Author: Moriarty <[email protected]>
Date:   Thu Jun 6 11:54:16 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit e86baa2
Author: Moriarty <[email protected]>
Date:   Thu Jun 6 11:54:14 2024 +0000

    [maven-release-plugin] prepare release 4.0.1

commit cb733db
Author: Moriarty <[email protected]>
Date:   Wed Jun 5 08:07:04 2024 -0400

    Add table test for the MetadataHelper, update docs, general code cleanup (#34)

commit add487b
Merge: bada436 2948b55
Author: Whitney O'Meara <[email protected]>
Date:   Thu May 23 04:18:11 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit 2948b55
Author: Whitney O'Meara <[email protected]>
Date:   Mon May 20 17:49:29 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit a30194b
Author: Whitney O'Meara <[email protected]>
Date:   Mon May 20 17:49:27 2024 +0000

    [maven-release-plugin] prepare release 4.0.0

commit 870ccfd
Author: Whitney O'Meara <[email protected]>
Date:   Mon May 20 17:48:56 2024 +0000

    updated to tagged release

commit 570d8a9
Author: Whitney O'Meara <[email protected]>
Date:   Mon May 20 12:32:27 2024 -0400

    Feature/query microservices (#33)

    * bumped release version

    * bumped versions for some modules

    * Updated with latest changes from main/integration

    * Updated package names for commons.lang3 classes due to type-utils fix
Development

Successfully merging this pull request may close these issues.

Alternate TypeMetadata serialization schema
6 participants