Highlights of Download stats on main page. #285

Open
haesleinhuepf opened this issue Oct 16, 2024 · 3 comments
@haesleinhuepf
Member

It would be great if we had a script that determines the Zenodo records that have recently been downloaded most often, similar to this notebook, and then updates the main page of the website. For example, it could replace a placeholder such as {most_downloaded} with some actual content.
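The placeholder idea could look roughly like the sketch below. It is only an illustration under assumptions: the function names are made up, and each record is a plain dict with hypothetical keys `title`, `url` and `downloads` that a caller would fill from the download stats (for the period of interest) before calling it.

```python
def render_most_downloaded(records, top_n=3):
    """Return a Markdown bullet list of the top_n records by download count."""
    # Assumption: every record is a dict with "title", "url" and "downloads".
    ranked = sorted(records, key=lambda r: r["downloads"], reverse=True)[:top_n]
    return "\n".join(
        f"* [{r['title']}]({r['url']}) ({r['downloads']} downloads)"
        for r in ranked
    )


def fill_placeholder(page_text, records, placeholder="{most_downloaded}", top_n=3):
    """Replace the placeholder in the page text with the rendered list."""
    return page_text.replace(placeholder, render_most_downloaded(records, top_n))
```

A script run by a workflow could read the main page source, call `fill_placeholder` on its text, and write the result back.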

Optional: It could download the most downloaded file, turn it into a PNG, and show it on the main page. This obviously won't work for all kinds of content, but it should for PPTX or PDF files.

git-bob comment: Given a Zenodo record URL, how would you automatically download the first file from this record using the Zenodo API and, if it's a PPT or PDF, turn the first slide or page into a PNG?

Contributor

This message was generated by git-bob (version: 0.9.0, model: github_models:gpt-4o, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To download the first file from a Zenodo record and convert it to a PNG if it's a PPT or PDF:

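import subprocess
import tempfile
from pathlib import Path

import requests
from pdf2image import convert_from_bytes, convert_from_path  # needs poppler installed

def download_first_file_from_zenodo(record_id, output_path='first_page.png'):
    # Fetch record metadata
    url = f"https://zenodo.org/api/records/{record_id}"
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    data = response.json()

    # Get the first file's name and download link.
    # Assumption: file entries carry a 'key' (file name) but no MIME type,
    # so the file type is derived from the file extension.
    first_file = data['files'][0]
    file_url = first_file['links']['self']
    suffix = Path(first_file['key']).suffix.lower()

    # Download the file
    response = requests.get(file_url, timeout=120)
    response.raise_for_status()

    # Check if the file is PPT or PDF and convert
    if suffix == '.pdf':
        # The PDF is held in memory, so convert_from_bytes is used instead of convert_from_path
        pages = convert_from_bytes(response.content, first_page=1, last_page=1)
    elif suffix in ('.pptx', '.ppt'):
        # python-pptx cannot render slides to images, so the slides are first
        # converted to PDF with LibreOffice (assumes `soffice` is on the PATH)
        with tempfile.TemporaryDirectory() as tmp:
            source = Path(tmp) / f"slides{suffix}"
            source.write_bytes(response.content)
            subprocess.run(['soffice', '--headless', '--convert-to', 'pdf',
                            '--outdir', tmp, str(source)], check=True)
            pages = convert_from_path(str(source.with_suffix('.pdf')),
                                      first_page=1, last_page=1)
    else:
        raise ValueError(f"Unsupported file type: {first_file['key']}")

    # Save the first page or slide as a PNG
    pages[0].save(output_path, 'PNG')
    return output_path

# Example usage
download_first_file_from_zenodo('1234567')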

Replace '1234567' with the actual Zenodo record ID.
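Since the question starts from a record URL rather than an ID, a small helper could extract the ID first. This is a sketch under assumptions: the helper name is hypothetical, and it only covers the URL shapes `https://zenodo.org/records/<id>`, the older `https://zenodo.org/record/<id>`, and DOI links ending in `zenodo.<id>`.

```python
import re
from urllib.parse import urlparse


def zenodo_record_id(url):
    """Extract the numeric record ID from a Zenodo record URL or DOI link."""
    path = urlparse(url).path
    # Matches /records/<id>, /record/<id> and .../zenodo.<id>
    match = re.search(r"(?:/records?/|zenodo\.)(\d+)", path)
    if match is None:
        raise ValueError(f"No Zenodo record ID found in: {url}")
    return match.group(1)
```

The returned string can then be passed on, for example `download_first_file_from_zenodo(zenodo_record_id(url))`.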

@haesleinhuepf
Member Author

haesleinhuepf commented Oct 16, 2024

Hi @marabuuu ,

as discussed yesterday, could you please check if this works or can be made to work? (-:

Edit: Feel free to let git-bob do the hard part of the job :-)

Thanks!

Best,
Robert

@marabuuu
Collaborator

Yes, I am happy to look into this :)

marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024
marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024
marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024