-
Notifications
You must be signed in to change notification settings - Fork 737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast Path Math.min/max_F/D #20999
base: master
Are you sure you want to change the base?
Fast Path Math.min/max_F/D #20999
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@hzongaro can you review and merge this? |
@@ -12173,7 +12173,38 @@ J9::Power::CodeGenerator::inlineDirectCall(TR::Node *node, TR::Register *&result | |||
return true; | |||
} | |||
break; | |||
|
|||
case TR::java_lang_Math_max_F: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this section of code? on OMR side, generateMaxMin
is expecting f/d max/min IlOpcode
already (i.e. the call-node has been transformed, so you will not run to this section). please double-check it ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll mention that from my understanding of the Z code they do both. Calls either transformed to f/d max/min ILOpcode during RecognizedCallTransformer, or inlined in inlineDirectCall
if it remained a call.
Only benefit in runs where RecognizedCallTransformer is not performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it remained a call and evaluator chose to call generateMaxMin
, it will crash, won't it?
switch (node->getOpCodeValue())
{
case TR::imax:
case TR::imin:
cmp_op = TR::InstOpCode::cmp4; break;
case TR::iumax:
case TR::iumin:
cmp_op = TR::InstOpCode::cmpl4; break;
case TR::lmax:
case TR::lmin:
cmp_op = TR::InstOpCode::cmp8; break;
case TR::lumax:
case TR::lumin:
cmp_op = TR::InstOpCode::cmpl8; break;
case TR::fmax:
case TR::fmin:
case TR::dmax:
case TR::dmin:
cmp_op = TR::InstOpCode::fcmpu; break;
default: TR_ASSERT(false, "assertion failure"); break; <=== assert here?
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My testing so far shows that all calls should be caught in the transformation, so removing the code should have no effect. But my testing code likely doesn't cover all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luke-li-2003 please attach some jit logs compiling the targeted method both before and after your changes, thus allowing us to see it did kick in. also, it is a little surprise that performance benefit is not as big as expected.
Re-enable the fast-pathing of Math.min/max for floating points with the behaviours around +/-0.0 and NaN correctly handled. Signed-off-by: Luke Li <[email protected]>
d2205de
to
8c3ca55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Here are the logs of the master branch build and the build with my changes. trace.default.log Relevant excerpts Before:
After:
|
Re-enable the fast-pathing of Math.min/max for floating points with the behaviours around +/-0.0 and NaN correctly handled.
Depends on eclipse-omr/omr#7617