Make the code more generalizable to non-human template #53

Merged
merged 18 commits into from
Jul 12, 2023

Conversation

NadiaBlostein
Contributor

@NadiaBlostein NadiaBlostein commented Jul 10, 2023

This PR covers the execution of steps 1.1 to 1.4 (see README).

In order to test this PR:

  • clone the repository, and get the latest version of this branch

    git checkout nb/preprocess_segment
    git pull
    
  • get the data:

    git clone [email protected]:datasets/philadelphia-pediatric
    cd philadelphia-pediatric
    git checkout a99400038a98d074e79e69b955aec6d6fefe2abb
    git-annex get sub-101
    git-annex get sub-102
    
  • edit the config file with the following information:

    {
        "path_data": "ABSOLUTE/PATH/TO/DATA/",
        "include-list": "sub-101 sub-102",
        "data_type": "anat",
        "contrast": "t1",
        "suffix_image": "_rec-composed_T1w",
        "first_disc": "1",
        "last_disc": "18"
    }
  • Test this PR

    • Read the README
    • Test section 1.1 --> 1.4
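
A quick sanity check on the config file can catch problems such as a missing comma between entries before the pipeline is run. The following is only a sketch under assumptions: `REQUIRED_KEYS` and `load_config` are hypothetical names that do not come from the repository, and the key list is simply copied from the example config in this description.

```python
import json

# Keys copied from the example config in this PR description.
# Treating all of them as required is an assumption made for illustration.
REQUIRED_KEYS = {
    "path_data", "include-list", "data_type", "contrast",
    "suffix_image", "first_disc", "last_disc",
}


def load_config(text):
    """Parse a JSON config string and check that the expected keys exist."""
    config = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise KeyError(f"missing config keys: {sorted(missing)}")
    # Disc numbers are stored as strings in the example, so convert them.
    config["first_disc"] = int(config["first_disc"])
    config["last_disc"] = int(config["last_disc"])
    return config
```

Calling `load_config(open("config.json").read())` before launching the scripts would fail early with a `json.JSONDecodeError` or a `KeyError` instead of failing partway through processing.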

Before testing this PR, this should be done: neuropoly/data-management#248

Fixes #25, #29, #31, #34, #37, #50, #51

coord = im_discs.getNonZeroCoordinates(sorting = 'z', reverse_coord = True)
coord_physical = []
for c in coord:
    if c.value <= last_disc or c.value in [48, 49, 50, 51, 52]:
Member

could this be a problem in non-human templates?
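
One possible way to address this would be to pass the extra label values as a parameter instead of hard-coding them. This is only a sketch: `select_labels` and `extra_labels` are hypothetical names, and labels are represented here as plain `(z, value)` tuples rather than the coordinate objects used in the actual code.

```python
def select_labels(labels, last_disc, extra_labels=()):
    """Keep labels up to last_disc, plus any explicitly allowed extras.

    labels: iterable of (z, value) tuples, a stand-in for real coordinates.
    extra_labels: values kept regardless of last_disc. It is empty by
    default so that no species-specific values are assumed.
    """
    extras = set(extra_labels)
    return [(z, v) for z, v in labels if v <= last_disc or v in extras]


labels = [(120, 1), (90, 5), (40, 20), (150, 49)]
print(select_labels(labels, last_disc=18))
# [(120, 1), (90, 5)]
print(select_labels(labels, last_disc=18, extra_labels=[49]))
# [(120, 1), (90, 5), (150, 49)]
```

With this shape, the list `[48, 49, 50, 51, 52]` could live in the config of a human template and be left empty for other species.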

centerline.save_centerline(fname_output = fname_centerline)
print(subject_name + ' SC segmentation does not exist. Extracting centerline from ' + fname_image)
im_seg = Image(fname_image).change_orientation('RPI')
param_centerline = ParamCenterline(algo_fitting = 'optic', smooth = smooth, degree = 5, minmax = minmax)
Member

PEP8


list_centerline.append(centerline)
tqdm_bar.update(1)
tqdm_bar.close()
os.chdir(current_path)
return list_centerline

# def compute_ICBM152_centerline(dataset_info): ###### FIX
# def compute_ICBM152_centerline(dataset_info): # ??????
Member

Replace the ?????? with a link to a GitHub issue that points to the line of code you don't understand.

Member

@jcohenadad jcohenadad left a comment

  • As per previous discussions, the configuration file should not include ALL parameters, only those needed to reproduce the experiments. Parameters that should not be included: jobs (because it depends on local hardware) and path-output (because it depends on the user's preferred organization).
  • path_data and path_output should be written with "-" instead of "_". Please change this everywhere appropriate.
  • script: remove from config file
  • path_output --> should not be in the config file
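
The split suggested above could look roughly like the following, where machine-specific and user-specific settings are passed on the command line while the config file keeps only what is needed to reproduce the experiment. This is a sketch, not the repository's actual interface: `build_parser` and the option names `--config`, `--path-output` and `--jobs` are assumptions.

```python
import argparse


def build_parser():
    """Command-line options for settings that vary per machine or per user."""
    parser = argparse.ArgumentParser(description="Hypothetical pipeline launcher")
    parser.add_argument("--config", required=True,
                        help="JSON file holding the reproducible parameters")
    parser.add_argument("--path-output", required=True,
                        help="where to write results (user-specific)")
    parser.add_argument("--jobs", type=int, default=1,
                        help="number of parallel jobs (hardware-specific)")
    return parser


# Parse an explicit argument list so the example runs without a real CLI call.
args = build_parser().parse_args(
    ["--config", "config.json", "--path-output", "/tmp/out", "--jobs", "4"]
)
print(args.path_output, args.jobs)  # /tmp/out 4
```

Note that `argparse` exposes `--path-output` as `args.path_output`, so hyphenated option names remain easy to use from Python.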

Nadia Blostein and others added 3 commits July 10, 2023 15:24
@jcohenadad
Member

Suggestions about PATH_OUT. If setting PATH_OUT to be PATH_DATA/derivatives/labels, the following will happen:

├── sub-XXX  <---- your dataset
│   └── anat
│       └──sub-XXX_T1w.nii.gz
...
...
└── derivatives
    └── labels
        ├── results  <---- empty
        ├── qc  <---- has stuff
        ├── log  <---- has stuff
        ├── processed_data  <---- empty
        ├── sub-XXX
        │   └── anat
        │       └──sub-XXX_T1w_labelYYY.nii.gz  <---- segmentation and/or disc label to use for template generation
        ...

In general, PATH_OUT would be set to a local directory (e.g., scratch space on a cluster), and the useful data would then be copied back into the dataset under derivatives/labels.

However, I could see the advantages of the proposed approach:

  • no need to copy from PATH_OUT to the input data folder (ie: less prone to human error)
  • more logging info associated with the processing.

Cons:

  • the dataset becomes cluttered with additional processing qc/logging, which is not the end of the world...
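
For reference, the location of a label file in the layout shown above can be computed as in the sketch below. The helper `label_path` and its parameter names are hypothetical; only the directory structure (derivatives/labels/sub-XXX/anat) and the file naming pattern are taken from this thread.

```python
from pathlib import PurePosixPath


def label_path(path_data, subject, data_type, suffix_image, label):
    """Build the derivatives/labels path for one subject's label file."""
    filename = f"{subject}{suffix_image}_{label}.nii.gz"
    return (PurePosixPath(path_data) / "derivatives" / "labels"
            / subject / data_type / filename)


print(label_path("/data", "sub-101", "anat", "_T1w", "label-disc"))
# /data/derivatives/labels/sub-101/anat/sub-101_T1w_label-disc.nii.gz
```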

preprocess_segment.sh (outdated)
@jcohenadad jcohenadad changed the title Nb/preprocess segment Make the code more generalizable to non-human template Jul 12, 2023
Comment on lines +70 to +71
FILE="${SUBJECT}${IMAGE_SUFFIX}.nii.gz"
FILESEG="${SUBJECT}${IMAGE_SUFFIX}_label-SC_seg.nii.gz"
Member

This should look in the derivatives folder of the dataset.

echo "Not found. Proceeding with automatic segmentation."
# Segment spinal cord
sct_deepseg_sc -i ${FILE} -o ${FILESEG} -c ${CONTRAST} -qc ${PATH_QC} -qc-subject ${SUBJECT}
# TODO: MOVE THAT FILE UNDER derivatives/labels
Member

todo

Member

@NadiaBlostein after thinking more about it, I think we should not move the outputs in the derivatives folder of the dataset automatically. The reason is that, if someone tries running the script, it will overwrite the data in the derivatives, which will be an annoyance.

Instead, we should do this move while doing the manual correction of the segmentations/labels. In fact, this is what is currently done by other projects. Also see #58

Contributor Author

@NadiaBlostein NadiaBlostein Jul 12, 2023

Lines 73 and 89 of preprocess_segment.sh check if the files exist to avoid overwriting anything. This also allows the straightening.cache, straight_ref.nii.gz, warp_curve2straight.nii.gz and warp_straight2curve.nii.gz for each subject to be saved separately.

However, if you prefer us to remain consistent with what everyone else does, I’ll take a look through the other projects to find a repository that I could use as a blueprint for this.
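
The kind of existence check described above can be illustrated as follows. This is a sketch in Python rather than the shell code of the script itself, and `copy_if_absent` is a hypothetical helper name.

```python
import shutil
import tempfile
from pathlib import Path


def copy_if_absent(src, dst):
    """Copy src to dst unless dst already exists; return True if copied."""
    src, dst = Path(src), Path(dst)
    if dst.exists():
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "seg.nii.gz"
    dst = Path(tmp) / "derivatives" / "labels" / "seg.nii.gz"
    src.write_bytes(b"first run")
    first = copy_if_absent(src, dst)   # destination is new, so it is copied
    src.write_bytes(b"second run")
    second = copy_if_absent(src, dst)  # destination exists, so it is skipped
    kept = dst.read_bytes()

print(first, second, kept)  # True False b'first run'
```

A check like this protects existing files, although it also means a rerun silently keeps older outputs, which is part of what the discussion here is about.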

Member

preprocess_segment.py is not used anymore, is it? I’m not sure I understand your comment.

Also: straightening.cache etc. have nothing to do with my comment because these files were not copied anyway. I think there is a misunderstanding that will likely be better resolved in a meeting.

Contributor Author

My apologies, I made a typo (was on phone); was referring to preprocess_segment.sh. Will correct comment now.

I understood that you may have at least wanted the subject-specific warp files saved. We'll chat next week.

Member

> My apologies, I made a typo (was on phone); was referring to preprocess_segment.sh. Will correct comment now.

👍

> I understood that you may have at least wanted the subject-specific warp files saved. We'll chat next week.

No. What I meant in the meeting is that we want these subject-specific files in their own folders (as opposed to a flat directory where there could be file conflicts). I did not mean that these data should ultimately be stored in the git-annexed source data.

Comment on lines 96 to 98
mv "${SUBJECT}${IMAGE_SUFFIX}_label-SC_seg_labeled_discs.nii.gz" "${SUBJECT}${IMAGE_SUFFIX}_label-disc.nii.gz"
# TODO: MOVE THAT FILE UNDER derivatives/labels
mv "${SUBJECT}${IMAGE_SUFFIX}_label-SC_seg_labeled.nii.gz" "${SUBJECT}${IMAGE_SUFFIX}_label-disc_levels.nii.gz"
Member

@jcohenadad jcohenadad Jul 12, 2023

should be moved to derivatives

scrap that (see https://github.com/neuropoly/template/pull/53/files#r1261649236)

# Copy source images
rsync -avzh $PATH_DATA/$SUBJECT .
rsync -avzh $PATH_DATA/$SUBJECT/$DATA_TYPE/* .
Member

Subject files should not all be placed in a flat directory. Instead, each subject should have its own directory to avoid file conflicts. Please do it as we do in: https://spine-generic.readthedocs.io/analysis-pipeline.html
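
The file conflict mentioned above can be seen in a small sketch. `output_path` is a hypothetical helper; the file name `straight_ref.nii.gz` is one of the per-subject outputs mentioned earlier in this thread.

```python
from pathlib import PurePosixPath


def output_path(path_out, subject, filename, per_subject=True):
    """Return where a subject's output file would be written."""
    base = PurePosixPath(path_out)
    return (base / subject / filename) if per_subject else (base / filename)


subjects = ["sub-101", "sub-102"]
# In a flat directory, both subjects map to the same path and collide.
flat = {output_path("/out", s, "straight_ref.nii.gz", per_subject=False)
        for s in subjects}
# With one directory per subject, every path is distinct.
nested = {output_path("/out", s, "straight_ref.nii.gz") for s in subjects}
print(len(flat), len(nested))  # 1 2
```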

Successfully merging this pull request may close these issues.

Removal of the segment_batching_script
2 participants