This repository has been archived by the owner on Feb 2, 2024. It is now read-only.
I am trying to use HPAT to accelerate an ETL process. HPAT gives a significant speedup for the data frame transformation on a multi-core CPU, but I have run into an issue that I have not been able to figure out.
When the jitted function returns the data frame, I get either no speedup or an error. A minimal example is listed as follows.
In the baseline case, time python test_hpat3.py takes 30.31s.
time mpiexec -n 2 python test_hpat3.py
real 0m33.568s
time mpiexec -n 8 python test_hpat3.py
real 0m32.557s
time mpiexec -n 16 python test_hpat3.py
real 0m37.037s
time mpiexec -n 32 python test_hpat3.py
real 0m48.858s
For this example program, running with MPI is slower than the baseline at every process count I tried, and the slowdown is worst at 16 and 32 processes.
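To put numbers on the slowdown, here is a minimal sketch that converts the "real" times reported above into a speedup relative to the 30.31s baseline. The helper names (to_seconds, speedups) and the REAL_TIMES table are hypothetical and exist only for this illustration; they are not part of HPAT or of test_hpat3.py.

```python
import re

# Wall-clock ("real") times copied from the runs above, keyed by MPI process count.
REAL_TIMES = {2: "0m33.568s", 8: "0m32.557s", 16: "0m37.037s", 32: "0m48.858s"}

# Baseline from the plain `python test_hpat3.py` run (no mpiexec).
BASELINE_SECONDS = 30.31


def to_seconds(real: str) -> float:
    """Convert a `time`-style string such as '0m33.568s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if match is None:
        raise ValueError(f"unrecognized time format: {real!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedups(times=REAL_TIMES, baseline=BASELINE_SECONDS):
    """Return baseline / wall-clock time for each process count (values > 1 mean faster)."""
    return {n: baseline / to_seconds(t) for n, t in times.items()}


if __name__ == "__main__":
    for n, s in speedups().items():
        print(f"{n:>2} processes: {s:.2f}x of baseline speed")
```

Every ratio comes out below 1, which matches the observation that none of the MPI runs beats the single-process baseline.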
The behavior is different when I remove the return df from the jitted function: in that case the speedup grows as the number of processes increases.
In addition, if I use even more processes, an error is reported.
time mpiexec -n 44 python test_hpat3.py
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 68909 RUNNING AT CR3PPM-SER010
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
real 0m38.648s
user 22m1.044s
sys 4m20.828s
I am quite new to HPAT, so I am not sure whether this slowdown and error are expected.
Could you explain what is happening and suggest how to deal with it? Let me know if you need any other information.
Since I would like to feed the data frame onward after the ETL process, how can I return it from the HPAT jitted function?
Thank you so much.
Best regards,
Hongyuan Liu
Software configuration:
hpat 0.30.0 py37hc547734_15 intel/label/test
numba 0.45.0 py37h962f231_0