Feature/serialization minimap #23
02dbeb5 to 6e5d9d6
…urityAgency/datawave-metadata-utils into feature/serializationMinimap
public class TypeMetadata implements Serializable {

-    private Set<String> ingestTypes = Sets.newHashSet();
+    private Set<String> ingestTypes = new TreeSet<>();
If you're going to make these TreeSets then the variable should be migrated to a SortedSet
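A minimal sketch of what that suggestion could look like, assuming the field is only ever backed by a TreeSet. The class name `IngestTypesSketch` and its accessor methods are hypothetical and are not taken from the TypeMetadata source; the point is only that declaring the field as `SortedSet` makes the ordering guarantee visible in the type.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for the relevant part of TypeMetadata.
class IngestTypesSketch {

    // Declared as SortedSet so callers can rely on ordering without
    // knowing the concrete implementation is a TreeSet.
    private final SortedSet<String> ingestTypes = new TreeSet<>();

    void add(String ingestType) {
        ingestTypes.add(ingestType);
    }

    SortedSet<String> getIngestTypes() {
        return ingestTypes;
    }
}
```

With the wider `Set<String>` declaration the code still behaves the same; the narrower type just documents the ordering that the serialized form depends on.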
public String toString() {
    StringBuilder sb = new StringBuilder();

    Set<String> fieldNames = Sets.newHashSet();
    for (String ingestType : typeMetadata.keySet()) {
        // create and append ingestTypes mini-map
I'm looking at the minimap for one of the tests and I'm wondering if the serialization schema can be simplified (and by extension, if the serialization/deserialization code can be simplified).
Take this for example:
# schema proposed in merge request
dts:[0:ingest1,1:ingest2];types:[0:DateType,1:IntegerType,2:LcType];FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]
# alternate schema
ingest1,ingest2;DateType,IntegerType,LcType;FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]
# 1. Split the serialized string by semicolon. If at least three pieces exist, then we're dealing with a minified string.
# 2. Split the ingest type component by comma into an array. Now we have an array of ingest types.
# 3. Split the Types component by comma into an array. Now we have an array of Types.
# 4. Split each field component by comma and iterate through the pairs. As you iterate, the ingest type and Type indexes get plugged into the two arrays created earlier.
Serialization should be simpler as well. It's calling Joiner.on(comma/colon/semicolon).join(ingesttype/type/etc).
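The steps above can be sketched roughly as follows. This is an illustration of the alternate schema only, not the code in this merge request: the class `MiniMapSketch`, its method names, and the nested-map shape (field name to ingest type to types) are all assumptions, and `String.join` from the JDK stands in for Guava's `Joiner` so the example has no dependencies. It also assumes that "minified" means at least three semicolon-separated pieces, and that names never contain commas, colons, semicolons, or brackets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of datawave-metadata-utils.
class MiniMapSketch {

    // Returns field name -> ingest type -> type names.
    static Map<String, Map<String, SortedSet<String>>> parse(String serialized) {
        String[] parts = serialized.split(";");
        if (parts.length < 3) {
            throw new IllegalArgumentException("not a minified string: " + serialized);
        }
        String[] ingestTypes = parts[0].split(","); // step 2
        String[] types = parts[1].split(",");       // step 3

        Map<String, Map<String, SortedSet<String>>> result = new TreeMap<>();
        for (int i = 2; i < parts.length; i++) {    // step 4
            // Each piece looks like FIELD1:[0:2,1:0]
            int colon = parts[i].indexOf(':');
            String field = parts[i].substring(0, colon);
            String body = parts[i].substring(colon + 1);  // [0:2,1:0]
            body = body.substring(1, body.length() - 1);  // 0:2,1:0
            Map<String, SortedSet<String>> byIngest =
                    result.computeIfAbsent(field, k -> new TreeMap<>());
            for (String pair : body.split(",")) {
                String[] idx = pair.split(":");
                String ingestType = ingestTypes[Integer.parseInt(idx[0])];
                String type = types[Integer.parseInt(idx[1])];
                byIngest.computeIfAbsent(ingestType, k -> new TreeSet<>()).add(type);
            }
        }
        return result;
    }

    static String serialize(Map<String, Map<String, SortedSet<String>>> metadata) {
        // Collect the sorted, de-duplicated lookup arrays first.
        SortedSet<String> ingestSet = new TreeSet<>();
        SortedSet<String> typeSet = new TreeSet<>();
        for (Map<String, SortedSet<String>> byIngest : metadata.values()) {
            ingestSet.addAll(byIngest.keySet());
            for (SortedSet<String> t : byIngest.values()) {
                typeSet.addAll(t);
            }
        }
        List<String> ingestTypes = new ArrayList<>(ingestSet);
        List<String> types = new ArrayList<>(typeSet);

        List<String> parts = new ArrayList<>();
        parts.add(String.join(",", ingestTypes));
        parts.add(String.join(",", types));
        for (Map.Entry<String, Map<String, SortedSet<String>>> field : metadata.entrySet()) {
            List<String> pairs = new ArrayList<>();
            for (Map.Entry<String, SortedSet<String>> e : field.getValue().entrySet()) {
                int ingestIdx = ingestTypes.indexOf(e.getKey());
                for (String type : e.getValue()) {
                    pairs.add(ingestIdx + ":" + types.indexOf(type));
                }
            }
            parts.add(field.getKey() + ":[" + String.join(",", pairs) + "]");
        }
        return String.join(";", parts);
    }

    public static void main(String[] args) {
        String s = "ingest1,ingest2;DateType,IntegerType,LcType;"
                + "FIELD1:[0:2,1:0];FIELD2:[0:1,1:2];FIELD3:[0:0,1:0]";
        System.out.println(serialize(parse(s)).equals(s)); // prints true
    }
}
```

The round trip only reproduces the input because the lookup arrays and field names are kept sorted, which is the same reason the merge request moves from HashSet to TreeSet.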
Changed unintentionally cat'ing dataTypes after further testing
commit 56a2269762b7ccd8986790aa9f0d235172ff3161
Author: Whitney O'Meara
Date:   Wed Jun 26 20:40:33 2024 +0000

    udpated type-utils version

commit ab0098dbe86f2d15bf2a18c1b74dc22623183c30
Author: Whitney O'Meara
Date:   Wed Jun 26 18:51:44 2024 +0000

    Reapply "Feature/serialization minimap (#23)"

    This reverts commit 8255413.

commit 1a772f6480d1edf6fa383e2e6f4dc6138209b88e
Merge: 8255413 ebca7ce
Author: Whitney O'Meara
Date:   Wed Jun 26 18:16:19 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit ebca7ce
Author: Moon Moon
Date:   Wed Jun 26 13:31:05 2024 -0400

    Preventing npe from empty data or ingest type string (#38)

commit 8255413
Author: Whitney O'Meara
Date:   Tue Jun 25 13:32:06 2024 +0000

    Revert "Feature/serialization minimap (#23)"

    This reverts commit 5b62bf6.

commit 7760b83
Author: Moriarty
Date:   Tue Jun 25 09:45:29 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit 6a34dce
Author: Moriarty
Date:   Tue Jun 25 09:45:27 2024 +0000

    [maven-release-plugin] prepare release 4.0.2

commit f589568
Merge: add487b 5b62bf6
Author: Whitney O'Meara
Date:   Mon Jun 24 22:08:51 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit 5b62bf6
Author: Moon Moon
Date:   Mon Jun 24 13:42:14 2024 -0400

    Feature/serialization minimap (#23)

    * Adding ability to parse string with minimap
    * Creating mini-map during serialization
    * Changing to TreeSet for ordering purposes
    * WIP forming new mini-map string
    * Ensuring ordered types during serialization
    * Removing duplicate unit test
    * Adding ability to parse string with minimap
    * Creating mini-map during serialization
    * Changing to TreeSet for ordering purposes
    * WIP forming new mini-map string
    * Ensuring ordered types during serialization
    * Removing duplicate unit test
    * Moving hard coded strings
    * Formatting
    * Removing old method calls
    * Removing unnecessary exception throwing
    * Formatting
    * Updating to remove HashSet to preserve ordering
    * Updating unit tests
    * Updating unit tests again
    * Updating unit tests again again
    * Formatting
    * Removing old methods
    * Adding in fieldName creation
    * Updates based on testing
    * Returning immutable map
    * Fixing concatenated dataTypes

    ---------

    Co-authored-by: Ivan Bella

commit 6f2d4d4
Author: Moriarty
Date:   Mon Jun 24 12:15:21 2024 -0400

    Support field cardinality across a date range (#37)

    * Add seeking filter for the F column to support getting field cardinality across a date range
    * guard against empty ranges
    * move log messages from debug to trace

commit 44482bd
Author: Moriarty
Date:   Tue Jun 18 10:57:37 2024 -0400

    Cleanup code, logging formats for AllFieldMetadataHelper. Wrap scanner in try-with-resources blocks (#36)

commit f288080
Author: Moriarty
Date:   Fri Jun 14 09:30:16 2024 -0400

    MetadataHelper address try-with-resources warnings (#35)

    * Wrap scanners in try-with-resources
    * Additional instances of try-with-resources

commit c33f5ea
Author: Moriarty
Date:   Thu Jun 6 11:54:16 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit e86baa2
Author: Moriarty
Date:   Thu Jun 6 11:54:14 2024 +0000

    [maven-release-plugin] prepare release 4.0.1

commit cb733db
Author: Moriarty
Date:   Wed Jun 5 08:07:04 2024 -0400

    Add table test for the MetadataHelper, update docs, general code cleanup (#34)

commit add487b
Merge: bada436 2948b55
Author: Whitney O'Meara
Date:   Thu May 23 04:18:11 2024 +0000

    Merge remote-tracking branch 'origin/main' into feature/mapService

commit 2948b55
Author: Whitney O'Meara
Date:   Mon May 20 17:49:29 2024 +0000

    [maven-release-plugin] prepare for next development iteration

commit a30194b
Author: Whitney O'Meara
Date:   Mon May 20 17:49:27 2024 +0000

    [maven-release-plugin] prepare release 4.0.0

commit 870ccfd
Author: Whitney O'Meara
Date:   Mon May 20 17:48:56 2024 +0000

    updated to tagged release

commit 570d8a9
Author: Whitney O'Meara
Date:   Mon May 20 12:32:27 2024 -0400

    Feature/query microservices (#33)

    * bumped release version
    * bumped versions for some modules
    * Updated with latest changes from main/integration
    * Updated package names for commons.lang3 classes due to type-utils fix