-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update the performance regression detection method to reduce false positives. #58
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this from the commit.
// Calculate means of both datasets | ||
double meanBefore = StatUtils.mean(before); | ||
double meanAfter = StatUtils.mean(after); | ||
// Calculate the difference in means (regression value) | ||
double regressionValue = (meanAfter - meanBefore) / meanBefore; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code would be more logical/readable if these lines were moved into the if (pValue <0.05) {
block.
for (int i = 0; i < rawDataList.size(); i++) { | ||
IonDecimal score = (IonDecimal) rawDataList.get(i); | ||
rawData.add(score.bigDecimalValue()); | ||
public static double detectRegression(double[] before, double[] after) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, this is detecting whether there is a statistically significant change. It looks like it will return a non-zero value both for a significant regression and a significant improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, this is detecting whether there is a statistically significant change. It looks like it will return a non-zero value both for a significant regression and a significant improvement.
Yes, and only significant regression value will be added to the comparisonResults
map which is used for deciding whether ion-java-regression-detection
workflow should fail or not. Here is the comment about the conditional check
List<Double> filteredData = new ArrayList<>(); | ||
for (double value : data) { | ||
if (value >= lowerBound && value <= upperBound) { | ||
filteredData.add(value); | ||
} | ||
} | ||
return scoreStruct; | ||
return filteredData.stream().mapToDouble(d -> d).toArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this a lot more concise. (I'm only 95% sure the syntax of suggestion is correct, but I know the concept is sound.)
return Arrays.stream(data).filter(d -> lowerBound <= d && d <= upperBound).toArray();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion.
if (score.getType().equals(IonType.FLOAT)) { | ||
IonFloat scoreFloat = (IonFloat) score; | ||
return scoreFloat.bigDecimalValue(); | ||
public static double[] preProcess(String benchmarkResult, String keyWord) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use a more descriptive name than preProcess
. How about loadKeywordSpecificBenchmarkResults
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sounds good to me. I'll update it in the next commit.
IonStruct benchmarkResultStruct = readHelper(benchmarkResultFilePath); | ||
IonStruct parameterStruct = (IonStruct) benchmarkResultStruct.get(PARAMETERS); | ||
return parameterStruct.get(keyWord); | ||
public static DoubleStream toDouble(IonList data){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be private. You should also just return List<Double>
or double[]
unless you're going to be passing around DoubleStream
to other functions. (E.g. if the input and output of removeOutliers
was DoubleStream
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will change to private in the next commit.
The raw data is a nested IonList
, and toDouble
is used for transform the inner IonList
to DoubleStream
while flattening the raw data. The return value of toDouble
is not directly passing to removeOutliers
. Here is where the data flattening happened. In this case, should we still consider the returned data type as List<Double>
or double[]
?
private final static String GC_ALLOCATE = "·gc.alloc.rate"; | ||
private final static String HEAP_USAGE = "Heap usage"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain to me why it's okay to get rid of these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain to me why it's okay to get rid of these?
These two constant variables were used for composing the regression detection summary to represent which metric has regression. For now, we compose the regression detection summary by iterating the comparisonResults. The map comparisonResults
contains key-value pairs where the keys are metrics that have regression detected and the values are the amounts they have regressed. The keys already include the names of deleted variables, so the deleted information does not need to be initialized separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! The new technique is simpler (in the code we own, anyway) and more sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like an accidental commit?
@@ -44,7 +44,7 @@ public class Main { | |||
|
|||
+ " ion-java-benchmark run-suite (--test-ion-data <file_path>) (--benchmark-options-combinations <file_path>) <output_file>\n" | |||
|
|||
+ " ion-java-benchmark compare (--benchmark-result-previous <file_path>) (--benchmark-result-new <file_path>) <output_file>\n" | |||
+ " ion-java-benchmark compare (--benchmark-result-previous <file_path>) (--benchmark-result-new <file_path>)\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was <output_file>
not used? Or did you determine that it is not useful, and we should always write to stdout
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, <output_file>
is not useful for now. We will write out the regression results.
double[] previousData = preProcess(benchmarkResultPrevious, benchmarkScoreKeyword); | ||
double[] newData = preProcess(benchmarkResultNew, benchmarkScoreKeyword); | ||
double comparisonResult = detectRegression(previousData, newData); | ||
if (comparisonResult > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replying to this comment
detectRegression()
method will return a non-zero value both for a significant regression and a significant improvement and only significant regression (when comparison result > 0) will be added to the final comparisonResults
map.
Issue #, if available:
#53
Description of changes:
This PR optimized the method of detecting performance regression by making these changes:
removeOutliers
anddetectRegression
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.