[Enhancement]: Smart speed #3557

DankMemeGuy · 2024-10-28T14:54:53Z

Type of Enhancement

None

Describe the Feature/Enhancement

The podcast app Overcast includes a Smart Speed feature that trims pauses during audio playback. This can save approximately 10-20% of listening time. For podcasts or audiobooks, enabling this could significantly reduce listening time without altering the natural flow of the audio.

You can read more here: https://medium.com/@eped/overcasts-smart-speed-vs-real-time-a759549ab48b

Why would this be helpful?

This feature would enhance the user experience by allowing listeners to consume content more efficiently, saving time while maintaining audio quality. It’s particularly useful for lengthy content like audiobooks and long-form podcasts, where small pauses add up, making the listening process quicker and more seamless.

Future Implementation (Screenshot)

Not needed

Audiobookshelf Server Version

2.16

Current Implementation (Screenshot)

No response

pwinnski · 2024-10-30T15:00:31Z

That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.

All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

DankMemeGuy · 2024-11-01T02:37:32Z

> That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.
>
> All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

I think it was more challenging in 2015 when he was building out his software, but times have changed. I asked ChatGPT, and this is what it suggested:

To implement an on-the-fly "Smart Speed" feature within Audiobookshelf, we’d ideally modify its playback engine to handle dynamic silence reduction during streaming rather than preprocessing the audio file. Since Audiobookshelf is a Node.js application, we'll focus on integrating this functionality within a Node.js audio pipeline.

Here's how you might go about adding this feature using ffmpeg to process audio in real-time. The following example script uses ffmpeg to dynamically remove silence and stream audio back to the player.

Prerequisites

Ensure ffmpeg is installed on your server, as it will perform real-time audio processing. You can install it using:

sudo apt-get install ffmpeg
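
Before enabling any silence trimming, the server could check at startup that the binary is actually reachable. A minimal sketch, assuming ffmpeg is expected on the PATH; the helper name isFfmpegAvailable is hypothetical and not part of Audiobookshelf:

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: returns true when `<command> -version` runs successfully.
// `command` defaults to 'ffmpeg' on the PATH; pass a full path if it lives elsewhere.
function isFfmpegAvailable(command = 'ffmpeg') {
    const result = spawnSync(command, ['-version'], { stdio: 'ignore' });
    // result.error is set when the binary cannot be spawned (for example ENOENT)
    return !result.error && result.status === 0;
}

if (!isFfmpegAvailable()) {
    console.error('ffmpeg was not found; install it before enabling silence trimming.');
}
```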

Real-Time Silence Reduction in Node.js

Here’s an example of how you could add real-time silence reduction to Audiobookshelf's playback by creating a route handler that processes the audio stream. This approach passes the file through FFmpeg, which detects silence and trims it in real time before the output is streamed to the player.

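const { spawn } = require('child_process');
const path = require('path');
const express = require('express');
const app = express();

// Directory that holds the audio files (placeholder path)
const AUDIO_ROOT = path.resolve('path/to/audiobooks');

app.get('/stream/:audioFile', (req, res) => {
    // Keep only the base name so a request cannot escape AUDIO_ROOT
    const audioFilePath = path.join(AUDIO_ROOT, path.basename(req.params.audioFile));

    // Spawn an FFmpeg process to reduce silence dynamically
    const ffmpeg = spawn('ffmpeg', [
        '-loglevel', 'error',            // Write only errors to stderr, not progress logs
        '-i', audioFilePath,             // Input file
        // Silence removal filter: trims leading silence and pauses throughout the file
        '-af', 'silenceremove=start_periods=1:start_silence=0.5:start_threshold=-40dB:stop_periods=-1:stop_duration=0.5:stop_threshold=-40dB:detection=peak',
        '-f', 'mp3',                     // Output format
        'pipe:1'                         // Output to stdout for streaming
    ]);

    // Set response headers
    res.setHeader('Content-Type', 'audio/mpeg');

    // Pipe FFmpeg output to the response
    ffmpeg.stdout.pipe(res);

    // Handle a failure to start FFmpeg (for example, binary not installed)
    ffmpeg.on('error', (err) => {
        console.error(`Failed to start FFmpeg: ${err.message}`);
        if (!res.headersSent) res.status(500);
        res.end();
    });

    // Log FFmpeg error output
    ffmpeg.stderr.on('data', (data) => {
        console.error(`FFmpeg error: ${data}`);
    });

    // Close response when FFmpeg finishes
    ffmpeg.on('close', () => {
        res.end();
    });

    // Stop FFmpeg when the connection closes (including client disconnects)
    res.on('close', () => {
        ffmpeg.kill('SIGINT');
    });
});

// Start server
app.listen(3000, () => {
    console.log('Audiobookshelf streaming server is running on port 3000');
});

Explanation of the Code

FFmpeg Audio Filter: We use the silenceremove filter from FFmpeg, which detects and removes silence based on customizable thresholds:
- start_periods=1: Trims silence at the beginning of the audio, up to the first non-silent section.
- start_silence=0.5: Keeps at most 0.5 seconds of that leading silence after trimming.
- start_threshold=-40dB and stop_threshold=-40dB: Treat audio quieter than -40 dB as silence.
- stop_periods=-1: A negative value makes the filter also trim silence in the middle of the file, not only at the end.
- stop_duration=0.5: Requires a pause to last at least 0.5 seconds before it is trimmed.
- detection=peak: Uses peak values for detecting silence.
Streaming with Express: This code sets up an Express route (/stream/:audioFile) to serve processed audio on-the-fly. FFmpeg streams directly to the client by piping its stdout to the HTTP response.
Handling Connections: The server stops the FFmpeg process when the connection closes (res.on('close')), so an abandoned stream does not keep using CPU.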
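
The filter string can also be assembled from named options, which makes the threshold and pause length easier to tune per user or per library. A minimal sketch: buildSilenceFilter and its option names are hypothetical, and the stop_* options are the silenceremove parameters that FFmpeg documents for trimming pauses in the middle of a file rather than only at the start.

```javascript
// Hypothetical helper: builds an FFmpeg `silenceremove` filter string.
// thresholdDb: level below which audio counts as silence (0 or negative, in dB)
// minPauseSeconds: how long a pause must last before it is trimmed
// detection: 'peak' or 'rms'
function buildSilenceFilter({ thresholdDb = -40, minPauseSeconds = 0.5, detection = 'peak' } = {}) {
    if (!Number.isFinite(thresholdDb) || thresholdDb > 0) {
        throw new RangeError('thresholdDb must be a finite number no greater than 0');
    }
    if (!Number.isFinite(minPauseSeconds) || minPauseSeconds <= 0) {
        throw new RangeError('minPauseSeconds must be a positive number');
    }
    if (!['peak', 'rms'].includes(detection)) {
        throw new RangeError("detection must be 'peak' or 'rms'");
    }
    const options = [
        'start_periods=1',                   // trim leading silence
        `start_threshold=${thresholdDb}dB`,
        'stop_periods=-1',                   // negative: also trim pauses mid-file
        `stop_duration=${minPauseSeconds}`,
        `stop_threshold=${thresholdDb}dB`,
        `detection=${detection}`,
    ];
    return `silenceremove=${options.join(':')}`;
}

// Example: a stricter threshold and shorter pauses
console.log(buildSilenceFilter({ thresholdDb: -50, minPauseSeconds: 0.3 }));
```

The returned string is what would be passed after '-af' in the FFmpeg argument list.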

Integration into Audiobookshelf

To integrate this directly into Audiobookshelf, you would:

- Modify its existing audio streaming routes to pass audio files through this ffmpeg processing stream.
- Ensure that your server can handle the CPU load, as real-time audio processing can be resource-intensive.

This approach avoids the need to pre-process entire files while providing an on-the-fly feature similar to Overcast's Smart Speed. It suits streaming-based applications and fits Node.js-based servers like Audiobookshelf.
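
One way to keep the CPU load bounded is to cap how many FFmpeg processes run at once and turn away extra requests. A minimal sketch: the TranscodeLimiter class is hypothetical, and the default of one slot per CPU core minus one is an assumption, not a measured recommendation.

```javascript
const os = require('os');

// Hypothetical limiter: counts active transcodes and refuses new ones past a cap.
class TranscodeLimiter {
    constructor(maxConcurrent = Math.max(1, os.cpus().length - 1)) {
        this.maxConcurrent = maxConcurrent;
        this.active = 0;
    }

    // Returns true and reserves a slot, or false when the cap is reached.
    tryAcquire() {
        if (this.active >= this.maxConcurrent) return false;
        this.active += 1;
        return true;
    }

    // Frees a slot; call once per successful tryAcquire().
    release() {
        if (this.active > 0) this.active -= 1;
    }
}

// Usage inside a route handler (sketch):
//   if (!limiter.tryAcquire()) return res.status(503).send('Server busy');
//   ffmpeg.on('close', () => limiter.release());
const limiter = new TranscodeLimiter(2);
console.log(limiter.tryAcquire(), limiter.tryAcquire(), limiter.tryAcquire());
```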

Relevant:

That would be a streaming option. A pre-processed option would be less CPU-intensive but would need more storage (assuming the original file is kept intact). Alternatively, there could be an option at upload time to choose whether to trim silence and whether to preserve the original. That would avoid the complexity of rewriting the audio engine, since it would only need the FFmpeg pipeline at the upload stage.
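
As a rough sketch of that upload-time idea: FFmpeg cannot rewrite a file in place, so the trimmed copy is written next to the original, and the caller decides afterwards whether to keep both. The function name planSilenceTrim, the ".trimmed" suffix, and the preserveOriginal option are all hypothetical.

```javascript
const path = require('path');

// Filter that trims leading silence and pauses of 0.5 s or longer below -40 dB
const TRIM_FILTER =
    'silenceremove=start_periods=1:start_threshold=-40dB:stop_periods=-1:stop_duration=0.5:stop_threshold=-40dB';

// Hypothetical planner: returns the FFmpeg arguments and output path for one upload.
function planSilenceTrim(inputPath, { preserveOriginal = true } = {}) {
    const { dir, name, ext } = path.parse(inputPath);
    const trimmedPath = path.join(dir, `${name}.trimmed${ext}`);
    return {
        // '-n' makes FFmpeg exit instead of overwriting an existing output file
        args: ['-n', '-i', inputPath, '-af', TRIM_FILTER, trimmedPath],
        trimmedPath,
        // When true, the caller renames trimmedPath over inputPath after FFmpeg succeeds
        replaceOriginal: !preserveOriginal,
    };
}

console.log(planSilenceTrim('/books/a/ch1.mp3', { preserveOriginal: false }));
```

The args array can be passed to child_process.spawn('ffmpeg', args). Keeping the rename as a separate step means a failed run never damages the original file.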

TaylorMichaelHall · 2024-11-05T01:22:54Z

> That feature is the result of Marco Arment having developed a custom audio engine for iOS. It is 100% a mobile client feature, and the result of a unique and huge effort by a very talented developer. It is also very much iOS/iPadOS/WatchOS only.
>
> All that to say, there's a reason that Overcast is the only app that does that, even ten years later! https://marco.org/2014/07/16/overcast

Pocket Casts has been doing this for about as long: https://support.pocketcasts.com/knowledge-base/playback-effects/#htoc-trim-silence

DankMemeGuy added the enhancement New feature or request label Oct 28, 2024
