-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLDR-18217 Remove Iterable interface from CLDRFile, add iteratorDefault, iterableDefault #4285
base: main
Are you sure you want to change the base?
Conversation
…lt, iterableDefault -Instead of Iterable interface, use new methods iterableDefault, iterableWithoutExtras -Private boolean CLDRFile.DEFAULT_ITERATION_INCLUDES_EXTRAS determines whether iteratorDefault and iterableDefault skip extras; either way they skip paths with null values -Replace original iterator methods (zero or more parameters) with iteratorDefault or iteratorWithoutExtras -Iterable and iterator are distinct, as in iterableWithoutExtras versus iteratorWithoutExtras -Call iteratorWithoutExtras/iterableWithoutExtras directly where tests would otherwise fail with DEFAULT_ITERATION_INCLUDES_EXTRAS true
This goes further than #4280 |
@@ -105,7 +105,7 @@ private static void showEnglish(PrintWriter out) throws IOException { | |||
String locale = "en"; | |||
Set<String> sorted = | |||
Builder.with(new TreeSet<String>()) | |||
.addAll(cldrFile.iterator()) | |||
.addAll(cldrFile.iteratorDefault()) | |||
.addAll(cldrFile.getExtraPaths()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: if iteratorDefault includes extra paths, then getExtraPaths would be superfluous here, except that iteratorDefault skips paths with null values (getStringValue(path) == null)
alternatively fullIterator() could be called here
An essential difference from #4280 is that here, iteration always skips paths with null values (except for fullIterable which remains unchanged). Because of this, none of the tests need to call the iterableWithoutExtras or iteratorWithoutExtras |
@@ -98,7 +99,7 @@ | |||
* http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/DTDHandler.html | |||
*/ | |||
|
|||
public class CLDRFile implements Freezable<CLDRFile>, Iterable<String>, LocaleStringProvider { | |||
public class CLDRFile implements Freezable<CLDRFile>, LocaleStringProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Major change: CLDRFile no longer implements Iterable. Instead, a method must be called to get an iterable/iterator
@@ -1462,22 +1463,70 @@ public static Set<String> getMatchingXMLFiles(File sourceDirs[], Matcher m) { | |||
return s; | |||
} | |||
|
|||
@Override | |||
public Iterator<String> iterator() { | |||
private final boolean DEFAULT_ITERATION_INCLUDES_EXTRAS = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this boolean determines whether default iterator/iterable includes extra paths
|
||
public Iterator<String> iteratorDefault() { | ||
if (DEFAULT_ITERATION_INCLUDES_EXTRAS) { | ||
return Iterators.filter(fullIterable().iterator(), p -> getStringValue(p) != null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fullIterable is unchanged. If DEFAULT_ITERATION_INCLUDES_EXTRAS is true, iteratorDefault is the same as fullIterable except that it skips paths with null values
return dataSource.iterator(); | ||
} | ||
|
||
public synchronized Iterator<String> iterator(String prefix) { | ||
private synchronized Iterator<String> iteratorWithoutExtras(String prefix) { | ||
return dataSource.iterator(prefix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original iterator
methods are renamed to iteratorWithoutExtras
and they are now private and rarely used if DEFAULT_ITERATION_INCLUDES_EXTRAS is true
@@ -3081,7 +3146,7 @@ public String getWinningPath(String path) { | |||
public Collection<String> getExtraPaths() { | |||
Set<String> toAddTo = new HashSet<>(); | |||
toAddTo.addAll(getRawExtraPaths()); | |||
for (String path : this) { | |||
for (String path : this.iterableWithoutExtras()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iterableWithoutExtras
is required here to prevent infinite recursion, since the caller is getExtraPaths
@@ -25,7 +25,7 @@ private CharacterFallbacks() { | |||
DtdData.getInstance(DtdType.supplementalData).getDtdComparator(null); | |||
|
|||
for (Iterator<String> it = | |||
characterFallbacks.iterator("//supplementalData/characters/", comp); | |||
characterFallbacks.iteratorDefault("//supplementalData/characters/", comp); | |||
it.hasNext(); ) { | |||
String path = it.next(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the callers of iterator
outside of CLDRFile have become iteratorDefault
. Note, BTW, that this loop, and many like it, should be simplified by calling iterableDefault
instead of iteratorDefault
. I've postponed that simplification, pending decision on the overall approach. Also, including the word "default" in the method names might be temporary; for now, it helps to indicate which code is affected by changing the default behavior from excluding to including extra paths
I'm requesting review even though this is still a DRAFT PR. Question is how to proceed, not necessarily to merge this as-is. |
@@ -48,7 +48,7 @@ public static void main(String[] args) { | |||
Set<String> children = entry.getValue(); | |||
for (String child : children) { | |||
CLDRFile file = cldrFactory.make(child, false); | |||
for (String path : file) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit here for the many clients that just need for (String path : cldrFile)
? Why didn't we just leave the iteratable interface on and make it equivalent to iterableDefault - what's the downside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, i read the issue and comment thread. I don't quite follow it- to me they are two different usage models, whether you are trying to work with what's actually in the file, or what you could find in the file (as with TestCoverage). TestCoverage, the impetus for this issue, is to me the special case. But anyway, no objection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only reason to use explicit methods instead of the iterable interface is to facilitate finding all the usages and enabling multiple versions, including/excluding extra paths, and including/excluding paths without values. Once the dust has settled and we know what we want, it may be appropriate to restore the iterable interface.
The distinction between "what's actually in the file" and "what you could find in the file" is hard for me to interpret. For an XML file on disk, it's concrete. For an XMLSource it's more abstract, since the "constructed" paths with fallback code values are "in the file" (the XMLSource in RAM) even though they're not in the XML file on disk. For a CLDRFile it's even more nebulous since the "extra paths" are contained in the CLDRFile object but they don't have values, and if you loop through the paths in the CLDRFile you may or may not encounter the extra paths depending on which method you use for looping. So the concept of being "actually in the file" doesn't seem well defined to me.
TestCoverageLevel is one impetus for this issue. Another is CldrXmlWriter, which failed to write paths even though they had values that had been voted on. Those paths were "in the file" in the sense that they were in the vote table in the database, but they were not "in the file" in the sense of being in the XMLSource, and the question of whether they were "in the CLDRFile" was nebulous due to their having started out as "extra paths" and then been voted for.
My impression, based partly on the work of making this PR, is that the distinction between "paths with values" and "paths without values" is important and relatively well defined. There's no need to treat "extra paths" per se as not really being "in the file". For some purposes, like constructing a DataPage, all the paths are needed, including those without values. For other purposes, it's appropriate to skip the paths that don't have values.
This PR seems to imply that these kinds of iteration are generally useful:
- Including all paths
- Including all paths that have (non-null) values
A third kind, including all paths except extra paths, is only useful in a very limited context, in CLDRFile when adding extra paths, and that's to avoid infinite recursion.
-new HashSet<>(getRawExtraPaths()) is just slightly shorter code -To get root LSR codes, use fullIterable, not necessary to skip any paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a massive change. I was thinking that the change from
for (String s : f) {
to
for (String s : f.iterableDefault()) {
was just an intermediate change while you were investigating, adding alternative iteration, and fixing cases that shouldn't use the default.
Since you've done that now, I don't see why we should make this change.
@macchiati "This is a massive change..." Yes, this is a draft PR. As I wrote earlier, "I'm requesting review even though this is still a DRAFT PR. Question is how to proceed, not necessarily to merge this as-is." Also, as I wrote in response to @srl295 , "Once the dust has settled and we know what we want, it may be appropriate to restore the iterable interface." How should we proceed? I suggest:
I would make a new PR in which the changes are mostly confined to CLDRFile.java. Does that sound right? |
Yes, that sounds good! Great work investigating this!
|
-Instead of Iterable interface, use new methods iterableDefault, iterableWithoutExtras
-Private final boolean CLDRFile.DEFAULT_ITERATION_INCLUDES_EXTRAS determines whether iteratorDefault and iterableDefault skip extras; either way they skip paths with null values
-Replace original iterator methods (zero or more parameters) with iteratorDefault or iteratorWithoutExtras
-Iterable and iterator are distinct, as in iterableWithoutExtras versus iteratorWithoutExtras
-Call iteratorWithoutExtras/iterableWithoutExtras directly where tests would otherwise fail with DEFAULT_ITERATION_INCLUDES_EXTRAS true
-Make methods private if possible without failing tests; surprisingly, all the WithoutExtras methods can be private
CLDR-18217
ALLOW_MANY_COMMITS=true