Highlights of Download stats on main page. #285

haesleinhuepf · 2024-10-16T12:47:25Z

It would be great if we had a script that determines the most downloaded Zenodo records of the recent past, similar to this notebook, and that then updates the website's main page. For example, it could replace a placeholder such as {most_downloaded} with some actual content.
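A minimal sketch of the placeholder idea, assuming each record has already been reduced to a dictionary with the hypothetical keys `title`, `url` and `downloads` (how those counts are fetched, and over which time window, is left open here). The function names `top_downloaded` and `fill_placeholder` are made up for illustration:

```python
def top_downloaded(records, n=3):
    """Return the n records with the highest download count, highest first."""
    return sorted(records, key=lambda r: r.get("downloads", 0), reverse=True)[:n]


def fill_placeholder(page_text, records, placeholder="{most_downloaded}"):
    """Replace the placeholder with a markdown bullet list of the given records."""
    lines = [
        f"* [{r['title']}]({r['url']}) ({r['downloads']} downloads)"
        for r in records
    ]
    return page_text.replace(placeholder, "\n".join(lines))
```

Using plain string replacement keeps the main page editable by hand; the script only touches the placeholder.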

Optional: It could download the most downloaded file, turn it into a PNG and show it on the main page. This will obviously not work for all content types, but it should for PPTX or PDF files.

git-bob comment: Given a Zenodo record URL, how would you automatically download the first file from this Zenodo record using the Zenodo API and, if it's a PPT or PDF, turn the first slide or page into a PNG?

github-actions · 2024-10-16T12:48:53Z

This message was generated by git-bob (version: 0.9.0, model: github_models:gpt-4o, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To download the first file from a Zenodo record and convert it to a PNG if it's a PPT or PDF:

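import requests
from io import BytesIO
from pdf2image import convert_from_bytes  # requires poppler to be installed
from PIL import Image
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def download_first_file_from_zenodo(record_id, output_path='first_page.png'):
    # Fetch record metadata
    url = f"https://zenodo.org/api/records/{record_id}"
    response = requests.get(url)
    response.raise_for_status()
    data = response.json()

    # Get the first file's download link and infer its type from the file
    # name (assumes each file entry stores the name under 'key')
    first_file = data['files'][0]
    file_url = first_file['links']['self']
    extension = first_file['key'].rsplit('.', 1)[-1].lower()

    # Download the file
    response = requests.get(file_url)
    response.raise_for_status()

    if extension == 'pdf':
        # The PDF is in memory, so use convert_from_bytes (convert_from_path
        # expects a file path); render only the first page
        img = convert_from_bytes(response.content, first_page=1, last_page=1)[0]

    elif extension == 'pptx':
        # python-pptx cannot render slides. As a stand-in, take the first
        # picture embedded in the first slide; a real slide rendering would
        # need an external tool (e.g. converting the PPTX to PDF first).
        prs = Presentation(BytesIO(response.content))
        pictures = [shape for shape in prs.slides[0].shapes
                    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE]
        if not pictures:
            raise ValueError("No picture found on the first slide")
        img = Image.open(BytesIO(pictures[0].image.blob))

    else:
        raise ValueError(f"Unsupported file type: {extension}")

    # Save the first page or slide picture as a PNG
    img.save(output_path, 'PNG')

# Example usage
download_first_file_from_zenodo('1234567')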

Replace '1234567' with the actual Zenodo record ID.
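The question above starts from a record URL rather than an ID. A small sketch for extracting the ID, assuming URLs of the form `https://zenodo.org/records/<id>`, the older `https://zenodo.org/record/<id>`, or a DOI link ending in `zenodo.<id>`; the helper name `zenodo_record_id` is hypothetical:

```python
import re


def zenodo_record_id(url):
    """Extract the numeric record ID from a Zenodo record URL or DOI link."""
    # Matches ".../records/123", ".../record/123" and "...zenodo.123"
    match = re.search(r"(?:records?/|zenodo\.)(\d+)", url)
    if match is None:
        raise ValueError(f"No Zenodo record ID found in: {url}")
    return match.group(1)
```

The returned string can then be passed to `download_first_file_from_zenodo`.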

haesleinhuepf · 2024-10-16T12:50:21Z

Hi @marabuuu ,

as discussed yesterday, could you please check if this works or can be made to work? (-:

Edit: Feel free to let git-bob do the hard part of the job :-)

Thanks!

Best,
Robert

marabuuu · 2024-10-16T13:07:45Z

Yes, I am happy to look into this :)

haesleinhuepf assigned marabuuu Oct 16, 2024

marabuuu added a commit to marabuuu/training that referenced this issue Oct 19, 2024

determine most downloaded zenodo record last week and start creating …

db9afe2

…png NFDI4BIOIMAGE#285

marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024

create png and save in folder highlights NFDI4BIOIMAGE#285

8abdfc1

marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024

add date to highlight.png NFDI4BIOIMAGE#285

45eb900

marabuuu added a commit to marabuuu/training that referenced this issue Oct 23, 2024

Update readme with png file from script NFDI4BIOIMAGE#285

1d5b675
