Add option to output all residues involved in each patch #32

Open
wants to merge 2 commits into
base: main

Conversation

vhoer
Collaborator

@vhoer vhoer commented Jun 5, 2024

To allow comparing patches, a patch index column is added to both the output file and the residue output.

write_patches is now called only once and has been changed to handle multiple columns at once, which simplifies patch numbering.

vhoer added 2 commits June 5, 2024 18:25
--resout).
To compare patches, a patch index column is added to the outputfile and residue output.
@fwaibl
Contributor

fwaibl commented Jun 6, 2024

Hi,

that's a great idea, thanks for implementing it. A few comments from my side:

  • The current format of the residue output file seems a bit complicated to parse into a regular format (I guess pd.read_csv -> str.split -> Series.explode might do the trick). For easy loading, a different format (JSON, or an "exploded" csv file) might be better. On the other hand, it would look ugly and long on the standard output.
  • It would also be possible to do the same per atom and let the user group into residues as they see fit. Or go for the most complete version and output every vertex with its assigned atom, surface area and patch. From this, the user could obtain the residue grouping with a single pandas groupby operation, with additional area information at the vertex, atom, or residue level.
  • ... at that point, we might also think about implementing an npz output analogous to the commandline_hydrophobic script. I generally prefer csv over npz because it is human-readable, but it would be good to unify the output a bit and include more information.
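
A minimal sketch of the pd.read_csv -> str.split -> Series.explode route mentioned in the first point. The column names (`patch`, `type`, `residues`) and the `;` separator are assumptions made up for illustration; the actual layout of the residue output in this PR may differ.

```python
import io

import pandas as pd

# Hypothetical residue output: one row per patch, all residues packed into one cell.
# Column names and the ';' separator are assumptions, not the real file format.
raw = io.StringIO(
    "patch,type,residues\n"
    "0,positive,ARG12;LYS45;LYS46\n"
    "1,negative,ASP3;GLU7\n"
)


def load_exploded(handle, column="residues", sep=";"):
    """Read the packed CSV and return one row per (patch, residue) pair."""
    df = pd.read_csv(handle)
    # Turn the packed string into a list of residue labels per row.
    df[column] = df[column].str.split(sep)
    # Expand each list element into its own row, repeating the other columns.
    return df.explode(column).reset_index(drop=True)


tidy = load_exploded(raw)
print(tidy)
```

Writing this "exploded" table directly would spare the user the split step, at the cost of a longer output on stdout.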

@vhoer
Collaborator Author

vhoer commented Jun 6, 2024

Hey,
I agree with all points in principle. To me, the residue level is the optimal "coarse-graining" level of output to get a fast overview. However, a full output of all vertices with their closest atom in particular might be useful as well. I'll get on that whenever I find the time.
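
For illustration, this is roughly how a per-vertex table could be reduced back to the residue level with a groupby. The column names and all values below are hypothetical; no such output exists in the PR yet.

```python
import pandas as pd

# Hypothetical per-vertex output: one row per surface vertex with its closest
# atom, residue, patch index and area. All names and numbers are made up.
vertices = pd.DataFrame(
    {
        "patch": [0, 0, 0, 1, 1],
        "residue": ["ARG12", "ARG12", "LYS45", "ASP3", "GLU7"],
        "atom": ["NH1", "NH2", "NZ", "OD1", "OE1"],
        "area": [1.5, 2.0, 3.0, 1.0, 2.5],
    }
)

# Residue-level overview: summed area of each residue within each patch.
per_residue = vertices.groupby(["patch", "residue"], as_index=False)["area"].sum()

# Patch-level totals come from the same table with a coarser grouping key.
per_patch = vertices.groupby("patch")["area"].sum()

print(per_residue)
print(per_patch)
```

Grouping by `["patch", "residue", "atom"]` instead would give the atom-level view from the same file.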

@fwaibl
Contributor

fwaibl commented Sep 19, 2024

Couldn't we use a with-statement for res_output? Otherwise, this looks good to me. I'd vote for a consistent output format at some point, but that doesn't need to happen in this PR.
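
A sketch of what the with-statement could look like, assuming res_output is a text handle that is either a file opened from a path or standard output when no path is given. The helper names `open_res_output` and `write_residues` and the row format are hypothetical, not taken from the PR.

```python
import contextlib
import sys


def open_res_output(path=None):
    """Return a context manager that yields a writable text handle.

    With a path, the file is closed on exit. Without one, sys.stdout is
    wrapped in nullcontext so that leaving the block does not close it.
    """
    if path is None:
        return contextlib.nullcontext(sys.stdout)
    return open(path, "w")


def write_residues(rows, path=None):
    """Write one line per patch: the patch index, then its residues."""
    with open_res_output(path) as res_output:
        for patch, residues in rows:
            res_output.write(f"{patch},{' '.join(residues)}\n")
```

This keeps a single code path for both targets and guarantees the file is closed even if writing raises.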

2 participants