[Enhancement]: Smart speed #3557

Open
DankMemeGuy opened this issue Oct 28, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@DankMemeGuy

Type of Enhancement

None

Describe the Feature/Enhancement

The podcast app Overcast includes a Smart Speed feature that shortens pauses during playback. It can save roughly 10-20% of listening time. For podcasts or audiobooks, enabling this could significantly reduce listening time without altering the natural flow of the audio.

You can read more here: https://medium.com/@eped/overcasts-smart-speed-vs-real-time-a759549ab48b

Why would this be helpful?

This feature would enhance the user experience by allowing listeners to consume content more efficiently, saving time while maintaining audio quality. It’s particularly useful for lengthy content like audiobooks and long-form podcasts, where small pauses add up, making the listening process quicker and more seamless.

Future Implementation (Screenshot)

Not needed

Audiobookshelf Server Version

2.16

Current Implementation (Screenshot)

No response

@DankMemeGuy DankMemeGuy added the enhancement New feature or request label Oct 28, 2024
@pwinnski

That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.

All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

@DankMemeGuy
Author

> That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.
>
> All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

I think it was more challenging in 2015 when he was building out his software, but times have changed. When I asked ChatGPT, this is what it suggested:

To implement an on-the-fly "Smart Speed" feature within Audiobookshelf, we’d ideally modify its playback engine to handle dynamic silence reduction during streaming rather than preprocessing the audio file. Since Audiobookshelf is a Node.js application, we'll focus on integrating this functionality within a Node.js audio pipeline.

Here's how you might go about adding this feature using ffmpeg to process audio in real time. The following example script uses ffmpeg to remove silence dynamically and stream the audio back to the player.

Prerequisites

Ensure ffmpeg is installed on your server, as it will perform the real-time audio processing. On Debian or Ubuntu, you can install it using:

sudo apt-get install ffmpeg

Real-Time Silence Reduction in Node.js

Here’s an example of how you could add real-time silence reduction to Audiobookshelf's playback by creating a route handler that processes the audio stream. The handler passes the file through FFmpeg, which detects silence and trims it before the output is streamed to the player.

const { spawn } = require('child_process');
const path = require('path');
const express = require('express');
const app = express();

// Placeholder library root; adjust to wherever the audio files live
const AUDIO_ROOT = 'path/to/audiobooks';

app.get('/stream/:audioFile', (req, res) => {
    // basename() drops any directory parts, so "../" cannot escape AUDIO_ROOT
    const audioFilePath = path.join(AUDIO_ROOT, path.basename(req.params.audioFile));

    // Spawn an FFmpeg process to reduce silence dynamically
    const ffmpeg = spawn('ffmpeg', [
        '-loglevel', 'error',            // Keep progress output off stderr
        '-i', audioFilePath,             // Input file
        // A negative stop_periods trims silence throughout the file, not only at the start
        '-af', 'silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-40dB:detection=peak',
        '-f', 'mp3',                     // Output format
        'pipe:1'                         // Output to stdout for streaming
    ]);

    // Set response headers
    res.setHeader('Content-Type', 'audio/mpeg');

    // Pipe FFmpeg output to the response
    ffmpeg.stdout.pipe(res);

    // Log FFmpeg errors
    ffmpeg.stderr.on('data', (data) => {
        console.error(`FFmpeg error: ${data}`);
    });

    // Handle a failed spawn, e.g. ffmpeg is not installed
    ffmpeg.on('error', (err) => {
        console.error(`Failed to start FFmpeg: ${err.message}`);
        if (!res.headersSent) res.status(500);
        res.end();
    });

    // Close response when FFmpeg finishes
    ffmpeg.on('close', () => {
        res.end();
    });

    // Handle client disconnects; on recent Node versions the request's 'close'
    // event can fire as soon as the request has been read, so listen on the response
    res.on('close', () => {
        ffmpeg.kill('SIGINT');
    });
});

// Start server
app.listen(3000, () => {
    console.log('Audiobookshelf streaming server is running on port 3000');
});

Explanation of the Code

  1. FFmpeg Audio Filter: We use the silenceremove filter from FFmpeg, which detects and removes silence based on customizable thresholds:

    • stop_periods=-1: A negative value makes the filter trim silence throughout the file rather than only at the beginning or end.
    • stop_duration=0.5: Silence must last at least 0.5 seconds before it is trimmed, so shorter pauses are kept.
    • stop_threshold=-40dB: Treats audio quieter than -40 dB as silence.
    • detection=peak: Uses peak values rather than the default RMS for detecting silence.
  2. Streaming with Express: This code sets up an Express route (/stream/:audioFile) to serve processed audio on-the-fly. FFmpeg streams directly to the client by piping its stdout to the HTTP response.

  3. Handling Connections: The server kills the FFmpeg process if the client disconnects (res.on('close')), saving resources.
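
If the thresholds need to be tunable per user, the filter string could be assembled from a small options object instead of being hard-coded. This is only a rough sketch: the helper name and its option names (minPauseSeconds, thresholdDb, keepSeconds) are made up for illustration and are not part of Audiobookshelf or FFmpeg. It uses silenceremove's stop_* options with a negative stop_periods, which the FFmpeg docs describe as the way to remove silence from the middle of a file.

```javascript
// Hypothetical helper: builds a silenceremove filter string for mid-file pauses.
// The option names below are illustrative, not an existing API.
function buildSilenceRemoveFilter(options = {}) {
    const {
        minPauseSeconds = 0.5, // silence must last this long before it is trimmed
        thresholdDb = -40,     // audio below this level counts as silence
        keepSeconds = 0.2,     // amount of silence left in place of each trimmed pause
    } = options;

    if (minPauseSeconds < 0 || keepSeconds < 0) {
        throw new RangeError('durations must not be negative');
    }

    return 'silenceremove=' + [
        'stop_periods=-1',                  // negative: keep trimming throughout the file
        `stop_duration=${minPauseSeconds}`,
        `stop_threshold=${thresholdDb}dB`,
        `stop_silence=${keepSeconds}`,
    ].join(':');
}
```

The returned string would be passed as the value of the -af argument. Leaving a little silence behind (keepSeconds) is a guess at what keeps speech from sounding clipped, so the default would need tuning by ear.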

Integration into Audiobookshelf

To integrate this directly into Audiobookshelf, you would:

  • Modify its existing audio streaming routes to pass audio files through this ffmpeg processing stream.
  • Ensure that your server can handle the CPU load, as real-time audio processing can be resource-intensive.

This approach avoids pre-processing entire files while providing an on-the-fly feature similar to Overcast's Smart Speed. It suits streaming-based applications and fits Node.js-based projects like Audiobookshelf.

Relevant:

  1. https://ffmpeg.org/ffmpeg-filters.html#silencedetect
  2. https://ffmpeg.org/ffmpeg-filters.html#silenceremove
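
The silencedetect filter linked above only reports silence and does not change the audio, so it could be used to estimate how much time trimming would save before committing to anything. A minimal sketch, assuming the log was captured from ffmpeg's stderr and that each detected pause is reported with a "silence_duration: N" field; the function name is hypothetical.

```javascript
// Example command whose stderr would be fed to the function below:
//   ffmpeg -i book.mp3 -af silencedetect=noise=-40dB:d=0.5 -f null -

// Sums every silence_duration value found in a silencedetect log.
function totalSilenceSeconds(ffmpegLog) {
    const pattern = /silence_duration:\s*([0-9]+(?:\.[0-9]+)?)/g;
    let total = 0;
    for (const match of ffmpegLog.matchAll(pattern)) {
        total += Number(match[1]);
    }
    return total;
}

// Illustrative log lines, not output from a real file
const sampleLog = [
    '[silencedetect @ 0x1] silence_start: 12.5',
    '[silencedetect @ 0x1] silence_end: 14 | silence_duration: 1.5',
    '[silencedetect @ 0x1] silence_start: 60.25',
    '[silencedetect @ 0x1] silence_end: 61 | silence_duration: 0.75',
].join('\n');

console.log(totalSilenceSeconds(sampleLog)); // 2.25
```

Dividing the total by the file's duration gives a rough upper bound on the savings, which could be compared against the 10-20% figure quoted for Overcast.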

That would be the streaming option. A pre-processed option would be less CPU-intensive but would need more storage (assuming the original file is kept intact). Alternatively, there could be an option at upload time to trim silence, along with a choice of whether to preserve the original. That would avoid the complexity of rewriting the audio engine, since it would only need that ffmpeg pipeline at the upload stage.
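
For what it's worth, here is roughly how I picture the upload-time version. It is a sketch under assumptions: planTrimJob, the ".trimmed" file naming, and the keepOriginal flag are all hypothetical and not anything Audiobookshelf exposes today. It only plans the ffmpeg call; the caller would run it with child_process.spawn('ffmpeg', job.args) and delete the original afterwards if requested.

```javascript
const path = require('path');

// Hypothetical helper: plans an upload-time trim job that writes a new file
// next to the original instead of modifying it in place.
function planTrimJob(inputPath, { keepOriginal = true } = {}) {
    const { dir, name, ext } = path.parse(inputPath);
    const outputPath = path.join(dir, `${name}.trimmed${ext}`);

    const args = [
        '-n',            // never overwrite an existing output file
        '-i', inputPath,
        '-af', 'silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-40dB',
        outputPath,
    ];

    return { args, outputPath, deleteOriginalAfter: !keepOriginal };
}

const job = planTrimJob('books/chapter1.mp3', { keepOriginal: false });
console.log(job.outputPath);
```

Since the output is re-encoded, there would be some quality loss on lossy formats, which is one more reason to default to keeping the original.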

@TaylorMichaelHall

> That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.
>
> All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

Pocket Casts has been doing this for about as long: https://support.pocketcasts.com/knowledge-base/playback-effects/#htoc-trim-silence
