Stream binary audio data from http request for librosa analysis

Question

I have a large audio file streaming from a web service.

I would like to load the audio data into librosa for batched stream analysis.

I took a look at librosa.core.stream, where the description mentions:

Any codec supported by soundfile is permitted here.

But I can't seem to figure out how I can feed the binary batch data from requests:

import requests
import numpy as np

audio_url = "http://localhost/media/audioplayback.m4a"

response = requests.get(
    audio_url,
    stream=True,
)

for chunk in response.iter_content(chunk_size=4096):
    npChunk = np.frombuffer(chunk, dtype=np.float64)
    # Load chunk data into librosa

I know I need to convert the audio format, but I'm not sure what the recommended way to do this is. I know it is possible to load the data directly into a numpy array instead of calling librosa.stream, but I can't figure out which combination of soundfile, audioread, or GStreamer would do the format conversion.

I am using python==3.6.5 inside a conda environment inside Windows Subsystem for Linux.

Any help would be greatly appreciated! Thank you!

I am testing with youtube-dl audio URL as an [example](https://stackoverflow.com/a/50881927/9168936) — Rohit Mistry, Aug 10 '19 at 23:13
**Update**: made some progress with ffmpeg using [this approach](http://zulko.github.io/blog/2013/10/04/read-and-write-audio-files-in-python-using-ffmpeg/), but the amplitude data seems corrupted when viewed as a line plot or played back in `IPython.display.Audio(...)` — Rohit Mistry, Aug 11 '19 at 01:12
Can you provide an example URL with the kind of audio stream you are looking at? — Jon Nordby, Aug 16 '19 at 10:08
This comment has some hints on how to do this with Gstreamer https://stackoverflow.com/questions/3507746/use-python-gstreamer-to-decode-audio-to-pcm-data — Jon Nordby, Aug 16 '19 at 10:28
audiofile does not appear to accept anything but a file path. soundfile supports file-like objects, but expects the entire file to be available (no streaming). — Jon Nordby, Aug 16 '19 at 11:11
An example audio URL from youtube is too long for SO markdown comment. I just get the URL using command `youtube-dl --skip-download --extract-audio --format bestaudio --get-url https://www.youtube.com/watch?v=Sv7y4rbm-9Q` — Rohit Mistry, Aug 17 '19 at 18:24
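One way to follow the ffmpeg route mentioned in the update above is to let ffmpeg fetch and decode the URL and to read raw PCM from its standard output. The sketch below is a minimal illustration, not a tested solution for this exact stream: it assumes `ffmpeg` is on the PATH, and the names `pcm16_to_float` and `stream_blocks` are hypothetical. A mismatch between the PCM format ffmpeg writes and the dtype passed to `np.frombuffer` is one possible cause of corrupted-looking amplitudes, so the sketch pins both to little-endian signed 16-bit.

```python
import subprocess

import numpy as np


def pcm16_to_float(raw: bytes) -> np.ndarray:
    """Convert little-endian signed 16-bit PCM bytes to float32 in [-1, 1)."""
    usable = len(raw) - (len(raw) % 2)  # drop a trailing half-sample, if any
    samples = np.frombuffer(raw[:usable], dtype="<i2")
    return samples.astype(np.float32) / 32768.0


def stream_blocks(url: str, sr: int = 22050, block_samples: int = 4096):
    """Yield mono float32 blocks decoded by an ffmpeg subprocess.

    Assumes an `ffmpeg` executable is available on the PATH.
    """
    cmd = [
        "ffmpeg", "-loglevel", "error",
        "-i", url,               # ffmpeg fetches and demuxes the URL itself
        "-f", "s16le",           # raw PCM output, no container
        "-acodec", "pcm_s16le",  # signed 16-bit little-endian samples
        "-ac", "1",              # downmix to mono
        "-ar", str(sr),          # resample to the requested rate
        "-",                     # write to stdout
    ]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            raw = proc.stdout.read(block_samples * 2)  # 2 bytes per sample
            if not raw:
                break
            yield pcm16_to_float(raw)
    finally:
        proc.stdout.close()
        proc.wait()
```

Each yielded block is a plain float32 array at sample rate `sr`, so it can be passed to librosa feature functions block by block, for example `for y in stream_blocks(audio_url): ...`.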

score -2 · Answer 1 · answered Aug 27 '19 at 09:56

My current solution is this:

You need to install pydub

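import io

import librosa
from flask import request  # assumption: `request` is Flask's request object
from pydub import AudioSegment


def load_uploaded_audio():  # hypothetical name; the original snippet had no enclosing function
    # Read the whole upload into memory (this does not stream).
    audio_bytes = request.files['audio_data'].stream.read()
    s = io.BytesIO(audio_bytes)
    # Decode with pydub, then convert to 16-bit, 16 kHz, mono.
    audioObj = AudioSegment.from_file(s)
    audioObj = audioObj.set_sample_width(2).set_frame_rate(16000).set_channels(1)
    # export() defaults to MP3, so request WAV explicitly.
    audioObj.export("input_audio.wav", format="wav")
    # sr=16000 makes librosa return 16 kHz audio without a separate resample step.
    wav, sr = librosa.load("input_audio.wav", sr=16000)
    return wav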

My frontend code is this:

recorder.onComplete = function(recorder, blob) {
    console.log("Encoding complete");
    createDownloadLink(blob, recorder.encoding);

    // START AJAX HERE
    var fd = new FormData();
    fd.append('audio_data', blob);
    console.log('Transcribing...');
    document.getElementById('res_stat').innerHTML = "Waiting For Server's Response";

    $.ajax({
        type: 'POST',
        url: "/audio",
        data: fd,
        processData: false,
        contentType: false,
        dataType: "json",
        success: function(text) {
            console.log("Output Received");
            document.getElementById("predOut").value = text.text;
        }
    });

    console.log("Waiting For Server's Response");
};

Thanks @thethiny, but this is not helpful for my needs. I'd have to use pydub to load the entire track into memory, then save it to disk, then load it from disk into librosa. That is a trivial workaround, but I'd like to work with a stream instead, where the length of the stream may be unknown at run time — Rohit Mistry, Sep 14 '19 at 16:25
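For a stream of unknown length read with `requests`, one possible approach is to pipe the encoded chunks through a decoder subprocess while reading its output at the same time. The sketch below is only an illustration under assumptions: `pipe_through` and `ffmpeg_pcm_cmd` are hypothetical helper names, `ffmpeg` is assumed to be installed, and some containers may not decode from a non-seekable pipe (for example MP4/M4A files whose index is stored at the end of the file). A writer thread feeds the subprocess so that writing input cannot block reading output.

```python
import subprocess
import threading


def ffmpeg_pcm_cmd(sr=22050):
    """Build an ffmpeg command that reads stdin and writes mono s16le PCM to stdout."""
    return ["ffmpeg", "-loglevel", "error", "-i", "pipe:0",
            "-f", "s16le", "-acodec", "pcm_s16le",
            "-ac", "1", "-ar", str(sr), "pipe:1"]


def pipe_through(cmd, chunks, block_bytes=8192):
    """Feed byte chunks to a subprocess and yield blocks of its output."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def pump():
        # Runs in a separate thread: push input until it ends or the process exits.
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        except BrokenPipeError:
            pass  # the subprocess exited before consuming all input
        finally:
            try:
                proc.stdin.close()  # signals end of input to the subprocess
            except BrokenPipeError:
                pass

    threading.Thread(target=pump, daemon=True).start()
    try:
        # read() returns b"" at end of output, which stops the iterator.
        yield from iter(lambda: proc.stdout.read(block_bytes), b"")
    finally:
        proc.stdout.close()
        proc.wait()
```

With the request from the question, usage might look like `for block in pipe_through(ffmpeg_pcm_cmd(22050), response.iter_content(chunk_size=4096)): ...`, where each `block` holds 16-bit samples that `np.frombuffer(block, dtype="<i2")` can convert. Keeping `block_bytes` even keeps every full block aligned to whole samples.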
