A problem in the using my own data #3

Open
githubcooperation opened this issue May 6, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@githubcooperation

I ran the following command:
python BEN_DA.py -t E:\ProgramResearch\BEN\data\train -l E:\ProgramResearch\BEN\data\label -r E:\ProgramResearch\BEN\data\raw-all -prefix sen2 -check RIA
During training, I find that the loss is NaN:
image
The following are my training data and my label data:
image
image
image
image
Can you give me some suggestions for solving this problem? Thank you!

@grandjeanlab

I have the same issue.

Here is the data I use:
https://surfdrive.surf.nl/files/index.php/s/0mmRiXTveveiq8C

And the command line:
python /opt/BEN/BEN_DA.py -t epi_train_small -l epi_label_small -r epi_train -weight weight/Rat-EPI-94T-CAMRI_2022_07191156/.hdf5 -prefix stdrat_func_240517

Any idea?

@grandjeanlab

This is also the case when using the data and command from the tutorial here: https://www.youtube.com/watch?v=VZFNDh3MliA

@yu02019
Owner

yu02019 commented May 17, 2024

Hi Joanes,

Thank you for reporting this issue.

I ran the data you provided, and here are the results and logs. It appears that BEN is working well with this dataset.
Uploading epi_train_predict.zip…

However, based on your questions, I discovered a bug related to the "weight path" in the inference call to BEN. Please allow me some time to fix this issue and update the source code.

Thank you for your patience.

Best regards

@yu02019
Owner

yu02019 commented May 17, 2024

The link I provided above seems to be unavailable. Please refer to this temporary link: https://drive.google.com/file/d/1itIhlgW2dpUMwy_qmrymv2HsXwd1XUoE/view?usp=sharing

yu02019 added the "bug" (Something isn't working) label on May 17, 2024
@grandjeanlab

Thanks for the fast response. Might this be installation related? I run BEN on Linux within a Singularity container (the only way I found to install TF 1.15.4). Here is the definition file used to build the container: https://github.com/grandjeanlab/apptainer/tree/main/ubuntu_ben. Maybe you could consider making Docker / Singularity versions available in the future?

Here are the Python packages I have installed:
absl-py 0.10.0
asn1crypto 0.24.0
astor 0.8.1
certifi 2024.2.2
cryptography 2.1.4
cycler 0.11.0
decorator 4.4.2
gast 0.2.2
google-pasta 0.2.0
grpcio 1.32.0
h5py 2.10.0
idna 2.6
imageio 2.15.0
importlib-metadata 2.0.0
Keras 2.2.4
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
keyring 10.6.0
keyrings.alt 3.0
kiwisolver 1.3.1
Markdown 3.2.2
matplotlib 3.3.2
networkx 2.5.1
nibabel 3.0.0
numpy 1.16.0
opencv-python 4.1.2.30
opt-einsum 3.3.0
pandas 0.23.4
Pillow 8.4.0
pip 20.2.3
protobuf 3.13.0
pycrypto 2.6.1
pygobject 3.26.1
pyparsing 3.1.2
python-apt 1.6.5+ubuntu0.3
python-dateutil 2.9.0.post0
pytz 2024.1
PyWavelets 1.1.1
pyxdg 0.25
PyYAML 6.0.1
scikit-image 0.16.2
scipy 1.5.4
seaborn 0.9.0
SecretStorage 2.3.1
setuptools 50.3.0
SimpleITK 2.0.0
six 1.11.0
tensorboard 1.15.0
tensorflow-estimator 1.15.1
tensorflow-gpu 1.15.4
termcolor 1.1.0
Werkzeug 1.0.1
wheel 0.30.0
wrapt 1.12.1
zipp 3.2.0

@grandjeanlab

Is the weight path issue related to BEN expecting the weights from unet_fp32_all_BN_NoCenterScale_polyic_epoch15_bottle256_04012051 if no weights are specified in the BEN_DA interface?

I put those in my path, but that didn't work either.
Let me know if I can do something else to help troubleshoot.

@yu02019
Owner

yu02019 commented May 17, 2024

I suppose all you have to do in the scripts is maintain the proper weight path. This issue is not caused by the installation.

In the previous code, ".hdf5" needed to be added in BEN_infer.py but not in BEN_DA.py, which could cause confusion.

To address this issue, I added a line of code to unify the path calling methods. Now, both scripts (BEN_infer.py and BEN_DA.py) require only the folder path of the weight files.

weight += '' if weight.endswith('.hdf5') else '.hdf5'
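As a rough illustration of what that line does, here is a minimal standalone sketch. The function name `normalize_weight_path` and the example paths are hypothetical and not part of BEN; only the `endswith('.hdf5')` rule comes from the line above.

```python
def normalize_weight_path(weight: str) -> str:
    """Append '.hdf5' unless the path already ends with it.

    Mirrors the one-line rule quoted above; the function itself is a
    hypothetical wrapper, not code taken from BEN.
    """
    return weight if weight.endswith(".hdf5") else weight + ".hdf5"


if __name__ == "__main__":
    # Example paths are made up for illustration.
    for candidate in ("weight/demo/", "weight/demo/.hdf5", "weight/demo"):
        print(repr(candidate), "->", repr(normalize_weight_path(candidate)))
```

Under this reading, a folder path with a trailing separator and a path that already ends in `.hdf5` resolve to the same file, while a folder path without the trailing separator resolves to a different one, so the trailing separator may matter.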

Image 111

@grandjeanlab

Cool, but that does not seem to be the cause. I still get empty masks when using the weights generated from BEN_DA.
The log does not show any kind of error. This is with your demo dataset.

log.txt

Do you have any ideas about what I could try?

@yu02019
Owner

yu02019 commented May 20, 2024

I haven't encountered this bug (loss being NaN) before. Initially, I thought it might be due to the model not correctly loading the pre-trained weights. However, it now seems more likely to be an environment-related issue.

Here are my logs and the installed Python libraries. I am running this on Windows 10 with a Titan RTX GPU.
conda list.txt
pip list.txt
log-our-20240520.txt

This is just my speculation, but could it be a problem with the CUDA environment? I am using a conda-managed environment with the following setup (shown by the command "conda list"):

cudatoolkit               10.0.130                      0
cudnn                     7.6.5                cuda10.0_0

I am not sure whether a pip-managed environment calls the host machine's CUDA installation directly. If it does, you would need to keep the CUDA version on the host machine compatible with the installed TensorFlow version. Alternatively, you could try installing in a conda virtual environment.
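One way to compare an environment against the setup above is to scan the output of "conda list" for the two CUDA packages. The sketch below is only an illustration: the helper names are hypothetical, and the expected versions (CUDA 10.0, cuDNN 7.x) are an assumption based on the listing above and on TensorFlow's published tested build configurations for tensorflow-gpu 1.15, so they should be verified before being relied on.

```python
# Assumed expectations for tensorflow-gpu 1.15 (verify against the official
# tested-configurations table): CUDA 10.0 and a cuDNN 7.x release.
EXPECTED_PREFIXES = {"cudatoolkit": "10.0", "cudnn": "7."}


def parse_conda_list(text):
    """Map package name -> version from `conda list` style output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def check_cuda_packages(text, expected=EXPECTED_PREFIXES):
    """Return 'ok', 'mismatch', or 'missing' for each expected package."""
    packages = parse_conda_list(text)
    report = {}
    for name, prefix in expected.items():
        version = packages.get(name)
        if version is None:
            report[name] = "missing"
        elif version.startswith(prefix):
            report[name] = "ok"
        else:
            report[name] = "mismatch"
    return report


if __name__ == "__main__":
    sample = (
        "cudatoolkit               10.0.130                      0\n"
        "cudnn                     7.6.5                cuda10.0_0\n"
    )
    print(check_cuda_packages(sample))
```

A "missing" result does not necessarily mean CUDA is absent; in a pip-managed environment the libraries may come from the host machine instead, which is the situation speculated about above.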

@grandjeanlab

Hi,
I tried conda management instead of containers, and that seems to work.

Before closing the issue, here is what worked,
using CUDA 10.2 on AlmaLinux 8.6:

conda create -y -n ben python=3.6
source activate ben
git clone --branch doc https://github.com/yu02019/BEN.git
cd BEN
pip install -r requirements.txt

Or on Google Colab, using a trick I found (but lost the link to):

%env PYTHONPATH = # /env/python

!wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
!chmod +x Miniconda3-py38_4.12.0-Linux-x86_64.sh
!./Miniconda3-py38_4.12.0-Linux-x86_64.sh -b -f -p /usr/local
!conda update conda -y
import sys
sys.path.append('/usr/local/lib/python3.8/site-packages')

!conda create -y -n ben python=3.6

%%shell
eval "$(conda shell.bash hook)"
conda activate ben
pip install ipykernel
git clone --branch doc https://github.com/yu02019/BEN.git
cd BEN
pip install -r requirements.txt

Every subsequent cell needs to start with the following lines. Note that this only works for running Python scripts!

%%shell
eval "$(conda shell.bash hook)"
conda activate ben
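A possible alternative to repeating the activation lines, not the approach used above, is conda's own `conda run -n <env>` subcommand, which executes a single command inside a named environment. The sketch below only builds such a command; the helper name `ben_command` is hypothetical, the script arguments are copied from the command line earlier in this thread, and whether this behaves well inside Colab is an assumption that has not been tested here.

```python
import shlex


def ben_command(script, args=(), env="ben"):
    """Build a `conda run` command that runs a Python script in `env`.

    Hypothetical helper: it only assembles the argument list and does not
    execute anything.
    """
    return ["conda", "run", "-n", env, "python", script, *args]


if __name__ == "__main__":
    cmd = ben_command(
        "BEN_DA.py",
        ["-t", "epi_train_small", "-l", "epi_label_small", "-r", "epi_train"],
    )
    # Print the command as a shell-quoted string for inspection.
    print(shlex.join(cmd))
    # To actually run it (requires conda and the "ben" environment):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Like the activation trick, this only helps for launching Python scripts as separate processes; it does not switch the notebook's own kernel to the "ben" environment.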
