Fast Path Math.min/max_F/D #20999

luke-li-2003 · 2025-01-22T17:09:30Z

Re-enable the fast-pathing of Math.min/max for floating points with the behaviours around +/-0.0 and NaN correctly handled.
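
For context, the two edge cases named above are exactly where a plain compare-and-select differs from Java's `Math.max`: Java returns NaN if either argument is NaN and treats -0.0 as strictly smaller than +0.0. Below is a minimal C++ sketch of that difference. The names `naiveMax` and `javaStyleMax` are illustrative only and do not come from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Naive compare-and-select, i.e. what "a > b ? a : b" gives you.
static double naiveMax(double a, double b)
   {
   return a > b ? a : b;
   }

// Hypothetical reference following the Java Math.max(double, double) rules.
static double javaStyleMax(double a, double b)
   {
   // Rule 1: any NaN operand makes the result NaN.
   if (std::isnan(a) || std::isnan(b))
      return std::numeric_limits<double>::quiet_NaN();
   // Rule 2: +0.0 is considered greater than -0.0.
   if (a == 0.0 && b == 0.0)
      return std::signbit(a) ? b : a;
   return a > b ? a : b;
   }
```

With these definitions, `naiveMax(NaN, 1.0)` yields 1.0 and `naiveMax(0.0, -0.0)` yields -0.0, whereas `javaStyleMax` yields NaN and +0.0 respectively.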

Depends on eclipse-omr/omr#7617

luke-li-2003 · 2025-01-22T17:09:46Z

Issue: https://github.ibm.com/runtimes/openj9-jit-power/issues/416

rmnattas

LGTM

luke-li-2003 · 2025-01-23T16:06:49Z

@hzongaro can you review and merge this?

zl-wang · 2025-01-27T19:22:22Z

runtime/compiler/p/codegen/J9TreeEvaluator.cpp

@@ -12173,7 +12173,38 @@ J9::Power::CodeGenerator::inlineDirectCall(TR::Node *node, TR::Register *&result
            return true;
            }
         break;
-
+      case TR::java_lang_Math_max_F:


Do we need this section of code? On the OMR side, generateMaxMin expects the node to already carry an f/d max/min IL opcode (i.e. the call node has already been transformed, so execution will never reach this section). Please double-check.

rmnattas · 2025-01-27

From my understanding of the Z code, they do both: calls are either transformed to the f/d max/min IL opcodes during RecognizedCallTransformer, or inlined in inlineDirectCall if they remained calls. The inlineDirectCall path only helps in runs where RecognizedCallTransformer is not performed.

zl-wang · 2025-01-27

If it remained a call and the evaluator chose to call generateMaxMin, it will crash, won't it?

    switch (node->getOpCodeValue())
       {
       case TR::imax:
       case TR::imin:
          cmp_op = TR::InstOpCode::cmp4;
          break;
       case TR::iumax:
       case TR::iumin:
          cmp_op = TR::InstOpCode::cmpl4;
          break;
       case TR::lmax:
       case TR::lmin:
          cmp_op = TR::InstOpCode::cmp8;
          break;
       case TR::lumax:
       case TR::lumin:
          cmp_op = TR::InstOpCode::cmpl8;
          break;
       case TR::fmax:
       case TR::fmin:
       case TR::dmax:
       case TR::dmin:
          cmp_op = TR::InstOpCode::fcmpu;
          break;
       default:
          TR_ASSERT(false, "assertion failure");
          break;   // <=== assert here?
       }
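
To make the concern concrete, here is a small standalone C++ model of that dispatch. It deliberately does not use the real OMR types: `ILOp`, `CmpOp`, and `selectCompareOp` are hypothetical stand-ins, and the `call` enumerator represents a Math.max/min call node that was never transformed into a max/min opcode.

```cpp
#include <cassert>

// Hypothetical stand-ins for the IL opcodes and compare instructions.
enum class ILOp
   {
   imax, imin, iumax, iumin, lmax, lmin, lumax, lumin,
   fmax, fmin, dmax, dmin,
   call   // an untransformed Math.max/min call node
   };

enum class CmpOp { cmp4, cmpl4, cmp8, cmpl8, fcmpu };

// Picks the compare instruction for a max/min opcode. Returns false where
// the switch quoted above would reach its default case and assert.
static bool selectCompareOp(ILOp op, CmpOp &cmpOp)
   {
   switch (op)
      {
      case ILOp::imax:  case ILOp::imin:  cmpOp = CmpOp::cmp4;  return true;
      case ILOp::iumax: case ILOp::iumin: cmpOp = CmpOp::cmpl4; return true;
      case ILOp::lmax:  case ILOp::lmin:  cmpOp = CmpOp::cmp8;  return true;
      case ILOp::lumax: case ILOp::lumin: cmpOp = CmpOp::cmpl8; return true;
      case ILOp::fmax:  case ILOp::fmin:
      case ILOp::dmax:  case ILOp::dmin:  cmpOp = CmpOp::fcmpu; return true;
      default:                            return false;
      }
   }
```

In this model a `dmax` node selects `fcmpu`, while a node that is still a `call` falls through to the default branch, which is the situation the question above is about.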

luke-li-2003 · 2025-01-27

My testing so far suggests that all calls are caught by the transformation, so removing the code should have no effect. However, my test code likely doesn't cover all cases.

zl-wang · 2025-01-27

@luke-li-2003 please attach some JIT logs from compiling the targeted method both before and after your changes, so we can see that the fast path kicked in. Also, it is a little surprising that the performance benefit is not as big as expected.

Re-enable the fast-pathing of Math.min/max for floating points with the behaviours around +/-0.0 and NaN correctly handled. Signed-off-by: Luke Li <[email protected]>

zl-wang

LGTM

luke-li-2003 · 2025-01-27T23:44:42Z

Here are the logs of the master branch build and the build with my changes.

trace.default.log
trace.mathMinMax.log

Relevant excerpts

Before:

 \\ Main.main([Ljava/lang/String;)V
 \\   34 JBinvokestatic 5 Main.max(DD)D
 \\       2 JBinvokestatic 12 java/lang/Math.max(DD)D

    0x718b5ff6aa34 000000b8 [    0x718b24585390] 4bfffce1          2 	bl 	0000718B5FF6A714		; Direct Call "java/lang/Math.max(DD)D"
 PRE: [D_GPR_0128 : gr2] [FPR_0065 : fp0] [FPR_0068 : fp1] [D_GPR_0130 : gr11] [D_GPR_0131 : gr12] [D_GPR_0132 : gr0] [D_GPR_0133 : gr3] [D_GPR_0134 : gr4] [D_GPR_0135 : gr5] [D_GPR_0136 : gr6] [D_GPR_0137 : gr7] [D_GPR_0138 : gr8] [D_GPR_0139 : gr9] [D_GPR_0140 : gr10] [D_FPR_0141 : fp2] [D_FPR_0142 : fp3] [D_FPR_0143 : fp4] [D_FPR_0144 : fp5] [D_FPR_0145 : fp6] [D_FPR_0146 : fp7] [D_FPR_0147 : fp8] [D_FPR_0148 : fp9] [D_FPR_0149 : fp10] [D_FPR_0150 : fp11] [D_FPR_0151 : fp12] [D_FPR_0152 : fp13] [D_FPR_0153 : fp14] [D_FPR_0154 : fp15] [D_FPR_0155 : fp16] [D_FPR_0156 : fp17] [D_FPR_0157 : fp18] [D_FPR_0158 : fp19] [D_FPR_0159 : fp20] [D_FPR_0160 : fp21] [D_FPR_0161 : fp22] [D_FPR_0162 : fp23] [D_FPR_0163 : fp24] [D_FPR_0164 : fp25] [D_FPR_0165 : fp26] [D_FPR_0166 : fp27] [D_FPR_0167 : fp28] [D_FPR_0168 : fp29] [D_FPR_0169 : fp30] [D_FPR_0170 : fp31] [D_CCR_0171 : cr0] 
POST: [D_GPR_0128 : gr2] [FPR_0129 : fp0] [FPR_0068 : fp1] [D_GPR_0130 : gr11] [D_GPR_0131 : gr12] [D_GPR_0132 : gr0] [D_GPR_0133 : gr3] [D_GPR_0134 : gr4] [D_GPR_0135 : gr5] [D_GPR_0136 : gr6] [D_GPR_0137 : gr7] [D_GPR_0138 : gr8] [D_GPR_0139 : gr9] [D_GPR_0140 : gr10] [D_FPR_0141 : fp2] [D_FPR_0142 : fp3] [D_FPR_0143 : fp4] [D_FPR_0144 : fp5] [D_FPR_0145 : fp6] [D_FPR_0146 : fp7] [D_FPR_0147 : fp8] [D_FPR_0148 : fp9] [D_FPR_0149 : fp10] [D_FPR_0150 : fp11] [D_FPR_0151 : fp12] [D_FPR_0152 : fp13] [D_FPR_0153 : fp14] [D_FPR_0154 : fp15] [D_FPR_0155 : fp16] [D_FPR_0156 : fp17] [D_FPR_0157 : fp18] [D_FPR_0158 : fp19] [D_FPR_0159 : fp20] [D_FPR_0160 : fp21] [D_FPR_0161 : fp22] [D_FPR_0162 : fp23] [D_FPR_0163 : fp24] [D_FPR_0164 : fp25] [D_FPR_0165 : fp26] [D_FPR_0166 : fp27] [D_FPR_0167 : fp28] [D_FPR_0168 : fp29] [D_FPR_0169 : fp30] [D_FPR_0170 : fp31] [D_CCR_0171 : cr0] 
    0x718b5ff6aa38 000000bc [    0x718b2460e9b0] c82e0050          2 	lfd 	fp1, [gr14, 80]		; spilled for dcall # #SPILL8		# SymRef  <#SPILL8_466     0x718b2460e850>[#466  Auto +80] [flags 0x80000000 0x0 ]

After:

 \\ Main.main([Ljava/lang/String;)V
 \\   34 JBinvokestatic 5 Main.max(DD)D
 \\       2 JBinvokestatic 12 java/lang/Math.max(DD)D

    0x788a54707fa8 000000ac [    0x788a4c7bcb00] fc001000          2 	fcmpu 	cr0, fp0, fp2
    0x788a54707fac 000000b0 [    0x788a4c7bcba0] 4183000c          2 	bun 	cr0, Label L0083
    0x788a54707fb0 000000b4 [    0x788a4c7bcc40] f0001500          2 	xsmaxdp 	vsr0, vsr0, vsr2
    0x788a54707fb4 000000b8 [    0x788a4c7bcce0] 48000008          2 	b 	Label L0082	
    0x788a54707fb8 000000bc [    0x788a4c7bcd70]                   2 	Label L0083:	
    0x788a54707fb8 000000bc [    0x788a4c7bce00] fc00102a          2 	fadd 	fp0, fp0, fp2
    0x788a54707fbc 000000c0 [    0x788a4c7bcf40]                   2 	Label L0082:	; (End of internal control flow)	
 PRE: 
POST: [CCR_0080 : cr0] [FPR_0065 : fp0] [FPR_0065 : fp0] [FPR_0066 : fp2]
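
The "After" sequence reads as: compare the operands with `fcmpu`, branch to the `fadd` when they are unordered (at least one is NaN), otherwise take `xsmaxdp`. A rough C++ model of that control flow follows. It assumes that `xsmaxdp` returns +0.0 when given a mix of signed zeros, as the Power ISA describes; `xsmaxdpModel` and `fastPathMax` are illustrative names, not code from the patch.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for xsmaxdp on non-NaN inputs (assumption: +0.0 beats -0.0).
static double xsmaxdpModel(double a, double b)
   {
   if (a == b)
      return std::signbit(a) ? b : a;   // only matters for +0.0 vs -0.0
   return a > b ? a : b;
   }

// Mirrors the emitted control flow: fcmpu + bun, then xsmaxdp or fadd.
static double fastPathMax(double a, double b)
   {
   bool unordered = std::isnan(a) || std::isnan(b);   // fcmpu cr0 ; bun
   if (unordered)
      return a + b;                // fadd: a NaN operand propagates to the result
   return xsmaxdpModel(a, b);      // xsmaxdp
   }
```

The `fadd` on the unordered path is what produces the NaN result that Java requires; the ordered path leaves signed-zero handling to the max instruction.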

luke-li-2003 mentioned this pull request Jan 22, 2025

Fast Path Math.min/max_F/D eclipse-omr/omr#7617

Open

rmnattas approved these changes Jan 22, 2025

hzongaro added comp:jit arch:power labels Jan 23, 2025

zl-wang requested changes Jan 27, 2025

luke-li-2003 force-pushed the FastPathMathMinMaxFD branch from d2205de to 8c3ca55 Compare January 27, 2025 19:47

zl-wang approved these changes Jan 27, 2025
