
Different visualization area in a view of pymol3d in the tutorial #228

Open
xavgit opened this issue Nov 22, 2024 · 8 comments


xavgit commented Nov 22, 2024

Hi,
in the tutorial section https://prolif.readthedocs.io/en/stable/notebooks/docking.html#docking
about
"You can also compare two different poses on the same view, and the protein residues that have different interactions in the other pose or are missing will be highlighted in magenta:"
the visualization area on the left is wider than the one on the right.
How can I get the same width on both sides?
Is it possible to compare more than two poses with the given code?

Thanks.

Saverio


P.S. Thanks for this very useful package.

Member

cbouy commented Nov 24, 2024

Hi @xavgit

They should be the same size. Have you tried specifying a smaller size for the view? The default is size=(900, 600), but you can adjust it so that, hopefully, everything fits in your window.

Author

xavgit commented Nov 25, 2024

Hi,
thanks, I'll try.
If possible, I have another question.
Regarding the instructions

df = fp.to_dataframe(index_col="Pose")
# show only the 5 first poses
df.head(5)

how can I get only the list of interacting residues
shown at the protein level?
In the tutorial this is:
protein TYR38.A TYR109.A TRP125.A LEU126.A ... PRO338.B PHE346.B LEU348.B PHE351.B ASP352.B THR355.B TYR359.B
I'd like to know only which residues interact with the poses of a given ligand.

Thanks.

Saverio

Author

xavgit commented Nov 25, 2024

Hi,
I've used the following code on a WQXGA monitor:

from prolif.plotting.complex3d import Complex3D

pose_index = 0
comp3D = Complex3D.from_fingerprint(
    fp, pose_iterable[pose_index], protein_mol, frame=pose_index
)

pose_index = 4
other_comp3D = Complex3D.from_fingerprint(
    fp, pose_iterable[pose_index], protein_mol, frame=pose_index
)

view = comp3D.compare(other_comp3D , size = ( 800 , 500 ) )
view

The result is the same or similar with the browser in full screen.
I don't know if the added size = ( 600 , 600 ) is in the right place.

Thanks.

Saverio

Member

cbouy commented Nov 26, 2024

Regarding the view size I really don't know what to say, have you tried with a different browser?

For your question around extracting the list of residues:

# for all poses combined
df.columns.get_level_values("protein").unique().to_list()

# for a specific pose
pose_num = 1
(pose := df.iloc[pose_num])[pose].index.get_level_values("protein").unique()
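
A minimal, self-contained sketch of both lookups on a toy dataframe. The residues and values below are made up for illustration; in practice `df` would come from `fp.to_dataframe(index_col="Pose")`, and the sketch assumes the default boolean output (no interaction counts).

```python
import pandas as pd

# Toy stand-in for fp.to_dataframe(index_col="Pose"); all values are invented.
columns = pd.MultiIndex.from_tuples(
    [
        ("UNL1", "ASP230", "VdWContact"),
        ("UNL1", "ASN244", "HBAcceptor"),
        ("UNL1", "ASN244", "VdWContact"),
        ("UNL1", "PHE286", "Hydrophobic"),
    ],
    names=["ligand", "protein", "interaction"],
)
df = pd.DataFrame(
    [
        [True, True, False, False],   # pose 0
        [True, False, False, True],   # pose 1
    ],
    columns=columns,
    index=pd.Index([0, 1], name="Pose"),
)

# Residues seen in any pose: unique labels of the "protein" column level.
all_residues = df.columns.get_level_values("protein").unique().to_list()

# Residues seen in one pose: keep only the columns that are True for that row.
pose_num = 1
pose = df.iloc[pose_num]
pose_residues = pose[pose].index.get_level_values("protein").unique().to_list()

print(all_residues)
print(pose_residues)
```

Here `pose[pose]` acts as a boolean mask, so it relies on the row holding True/False values.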

Member

cbouy commented Dec 6, 2024

@xavgit did you find any solution to this on your side? Just to know if I can close this issue or not

Author

xavgit commented Dec 28, 2024

Hi,
thanks for your suggestions, and sorry for
taking so long to reply.
I had stopped working on this topic, but now
I've resumed.
Regarding the different visualization areas, I've noticed
that the problem doesn't exist if the width and height
are equal or not too different, whereas if the width
is much greater than the height, the two visualization areas
differ along the width dimension.
I guess that this is not a prolif problem.
I can try later with a different browser and
report the results.

For the instructions
pose_num = 1
(pose := df.iloc[pose_num])[pose].index.get_level_values("protein").unique()

after adding .to_list() as in the first example you provided,
I get an error for one compound.

What I've done:
I docked about 10000 drugs using ad-gpu and then converted the results to SDF as described
in your tutorial.
Then, for each SDF file, I generated the IFP as a dataframe and pickled it as a .pkl file.
After reloading each dataframe, I used the provided instructions:

pose_num = 1
(pose := df.iloc[pose_num])[pose].index.get_level_values("protein").unique().tolist()
and I get the following, with an error for one molecule:

DB00496
All Poses
['ASP230', 'VAL231', 'GLU232', 'GLY233', 'HID235', 'ASN244', 'PRO281', 'PRO282', 'GLN285', 'PHE286', 'HID288', 'TRP326', 'ALA327', 'PHE330', 'GLU331', 'ASN392', 'LEU393', 'GLN394', 'ASN406', 'ALA407', 'HID408', 'ASP413', 'MET416', 'THR417', 'MG664', 'MG665']
Best Pose
['ASP230', 'VAL231', 'GLY233', 'ASN244', 'HID235', 'GLU232', 'GLN285']

DB00497
All Poses
['ASP230', 'VAL231', 'GLU232', 'GLY233', 'ASN244', 'PRO281', 'PHE286', 'PHE330', 'GLU331', 'MG664', 'MG665']
Best Pose
Traceback (most recent call last):
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/indexing.py", line 1714, in _get_list_axis
    return self.obj._take_with_is_copy(key, axis=axis)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 4153, in _take_with_is_copy
    result = self.take(indices=indices, axis=axis)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 4133, in take
    new_data = self._mgr.take(
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 891, in take
    indexer = maybe_convert_indices(indexer, n, verify=verify)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/indexers/utils.py", line 282, in maybe_convert_indices
    raise IndexError("indices are out-of-bounds")
IndexError: indices are out-of-bounds

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/xxxx/Desktop/1000ns_dist_rst_dockings/cluster_1/All_3D/ad4gpu_ad4_sf_no_knphs/prolif_analysis/ifpg_df_dict.py", line 17, in <module>
    print( ( pose := ifpg_df_d[ key ].iloc[1] )[ pose ].index.get_level_values( 'protein' ).unique().to_list() )
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/series.py", line 1153, in __getitem__
    return self._get_with(key)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/series.py", line 1191, in _get_with
    return self.iloc[key]
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/indexing.py", line 1743, in _getitem_axis
    return self._get_list_axis(key, axis=axis)
  File "/home/xxxx/.local/lib/python3.10/site-packages/pandas/core/indexing.py", line 1717, in _get_list_axis
    raise IndexError("positional indexers are out-of-bounds") from err
IndexError: positional indexers are out-of-bounds

where All Poses is for df.columns.get_level_values("protein").unique().to_list()
and Best Pose is for (pose := df.iloc[pose_num])[pose].index.get_level_values("protein").unique().to_list()

The expression
df.columns.get_level_values("protein").unique().to_list()
gave no problems for the first 330 compounds.

I've printed the dataframe of the molecule for which there is the error.

$ ./inspect_pkl_file.py ifpg_df_DB00497_docking_res_ad4_sf.pkl
ligand UNL1
protein ASP230 VAL231 GLU232 GLY233 ASN244
interaction VdWContact VdWContact VdWContact VdWContact HBAcceptor VdWContact
Pose
0 4 4 4 1 1 5
1 7 4 4 1 1 5
2 6 4 4 1 1 5
3 5 4 4 1 1 5
4 6 4 4 1 1 5
5 6 4 4 1 1 5
6 6 3 4 1 1 5
7 6 4 4 1 1 5
8 6 4 4 1 1 5
9 5 3 4 1 1 5
10 6 3 4 1 1 5
11 6 4 4 1 1 5
12 5 4 4 1 1 5
13 6 4 4 1 1 5
14 7 4 4 1 1 5
15 7 3 4 1 1 5
16 6 4 4 1 1 5
17 5 4 4 1 1 5
18 6 4 4 1 1 5
19 5 4 4 1 1 5
20 5 4 4 1 1 5
21 6 4 4 1 1 5
22 6 4 4 1 1 5
23 5 3 4 1 1 5
24 6 4 4 1 1 5
25 6 4 4 1 1 5
26 5 4 4 1 1 5
27 5 3 4 1 1 5
28 7 4 4 1 1 5
29 6 3 4 1 1 5
30 6 4 4 1 1 5
31 6 4 4 1 1 5

ligand
protein PRO281 PHE286 PHE330
interaction VdWContact Hydrophobic VdWContact Hydrophobic VdWContact
Pose
0 1 14 6 8 18
1 1 13 6 7 17
2 1 14 6 8 17
3 1 14 6 8 17
4 1 13 6 7 17
5 2 14 6 8 18
6 1 14 6 7 17
7 1 14 6 7 17
8 1 14 6 8 17
9 1 12 6 7 17
10 1 14 6 7 17
11 1 14 6 8 17
12 1 13 6 8 17
13 1 14 6 8 17
14 1 13 6 7 17
15 1 13 6 7 17
16 1 14 6 8 17
17 1 14 6 8 17
18 1 14 6 8 17
19 1 14 6 8 17
20 1 14 6 8 17
21 1 14 6 7 17
22 1 14 6 7 17
23 1 14 6 7 17
24 1 14 6 8 18
25 1 14 6 8 17
26 1 13 6 7 17
27 1 12 6 7 16
28 1 13 6 7 17
29 1 14 6 7 17
30 1 14 6 7 17
31 1 13 6 8 17

ligand
protein GLU331 MG664 MG665
interaction VdWContact VdWContact VdWContact
Pose
0 3 4 6
1 3 4 7
2 3 4 6
3 3 4 7
4 2 4 7
5 3 4 7
6 2 4 7
7 3 4 7
8 3 4 7
9 2 4 6
10 2 4 7
11 3 4 7
12 3 4 7
13 3 4 7
14 3 4 7
15 3 4 7
16 3 4 6
17 3 4 7
18 3 4 6
19 3 4 6
20 3 4 6
21 2 4 6
22 3 4 7
23 2 4 6
24 3 4 6
25 3 4 7
26 3 4 7
27 2 4 6
28 3 4 7
29 2 4 6
30 3 4 7
31 3 4 7

Any suggestion?

Thanks.

Saverio

Here is the .pkl file, in case you want to inspect it:
ifpg_df_DB00497_docking_res_ad4_sf.pkl.txt

Member

cbouy commented Dec 28, 2024

I think this happens when all the poses hit the exact same residues (so only the count changes, and there's no 0 in the dataframe).

In (pose := df.iloc[pose_num])[pose].index.get_level_values("protein").unique() the [pose] part ensures that we only look at residues without 0 but this errors out when there's none...

When this happens the output for each pose is the same as for "all poses combined" (i.e. df.columns.get_level_values("protein").unique().to_list()) so you can directly check for zeros in the df and use that directly
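
A hedged sketch of a more defensive lookup. The traceback above passes through `Series._get_with` and `self.iloc[key]`, which suggests that, at least with that pandas version, `pose[pose]` was treated as positional indexing because the row holds integer counts rather than booleans. This is one reading of the traceback and is not confirmed in this thread. Building an explicit boolean mask with `pose > 0` avoids relying on the dtype. The helper name `residues_for_pose` and the toy values are hypothetical.

```python
import pandas as pd


def residues_for_pose(df, pose_num):
    """Residues with at least one interaction in the given pose.

    Hypothetical helper: works for boolean and for count dataframes,
    because the mask is always built explicitly with `> 0`.
    """
    pose = df.iloc[pose_num]
    present = pose[pose > 0]  # boolean mask, never positional indexing
    return present.index.get_level_values("protein").unique().to_list()


# Toy count dataframe; residues and counts are invented for illustration.
columns = pd.MultiIndex.from_tuples(
    [
        ("UNL1", "ASP230", "VdWContact"),
        ("UNL1", "PHE286", "Hydrophobic"),
        ("UNL1", "MG665", "VdWContact"),
    ],
    names=["ligand", "protein", "interaction"],
)
counts = pd.DataFrame(
    [
        [4, 14, 6],  # pose 0: every column is non-zero
        [7, 0, 7],   # pose 1: no PHE286 interaction
    ],
    columns=columns,
    index=pd.Index([0, 1], name="Pose"),
)

print(residues_for_pose(counts, 0))
print(residues_for_pose(counts, 1))

# The same helper also accepts a boolean dataframe.
print(residues_for_pose(counts > 0, 1))
```

With this mask, a pose where every column is non-zero simply returns the same residues as the "all poses combined" list, so no special case is needed.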

Author

xavgit commented Dec 29, 2024

Hi,
thanks.
I've read your explanation, but it is not entirely clear to me.
I'll examine a case where the error does not occur, to compare and
better understand what you have explained.

Saverio
