Error encountered during Pandas pivot #14

Open
evanseitz opened this issue Feb 21, 2022 · 0 comments
Calling Pandas pivot raises an error about repeated entries for me, which halts the program laid out in [1].

I only hit this error with my own synthetic data, which can contain duplicate sequence/bin combinations (pivot refuses to reshape when an index/column pair occurs more than once). No such error arises with the demo sort-seq file provided (2010, file_S2.txt). I was able to fix it by following the recommendation at [2] and making the following change:

pivot_df = sub_df.pivot(index='x', values='ct', columns='bin').fillna(0).astype(int) #before
pivot_df = sub_df.pivot_table(index='x', values='ct', columns='bin').fillna(0).astype(int) #after
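For anyone trying to reproduce this, here is a minimal sketch of the failure and the workaround. The toy DataFrame is made up for illustration (it is not the MAVE-NN dataset); only the column names 'x', 'bin', and 'ct' come from the snippet above. One thing worth noting: pivot_table aggregates duplicates with the mean by default, so for count data it may be preferable to pass aggfunc='sum'. Whether sum or mean is the right choice for this pipeline is an assumption on my part.

```python
import pandas as pd

# Hypothetical toy data: sequence 'AAA' appears twice in bin 0.
sub_df = pd.DataFrame({
    "x":   ["AAA", "AAA", "CCC", "GGG"],
    "bin": [0, 0, 1, 0],
    "ct":  [2, 4, 5, 1],
})

# pivot() cannot reshape when an (index, column) pair is duplicated.
try:
    sub_df.pivot(index="x", values="ct", columns="bin")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates duplicates; the default aggfunc is the mean.
mean_df = (
    sub_df.pivot_table(index="x", values="ct", columns="bin")
    .fillna(0)
    .astype(int)
)

# Assumption: for counts, summing the duplicate rows may be what is wanted.
sum_df = (
    sub_df.pivot_table(index="x", values="ct", columns="bin", aggfunc="sum")
    .fillna(0)
    .astype(int)
)
```

With the default mean, the two 'AAA' rows in bin 0 are averaged, and the trailing astype(int) would silently truncate any fractional result; with aggfunc='sum' they are added together instead.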

[1] https://mavenn.readthedocs.io/en/latest/datasets/dataset_sortseq.html
[2] https://stackoverflow.com/questions/11232275/pandas-pivot-warning-about-repeated-entries-on-index
