How is digital audio defined?

Question

I want to make a voice changer. I have found descriptions of some effects, but I don't understand exactly what digital audio looks like.

How much information can I get from different audio formats, or input audio streams?

Can I get frequency, pitch and what the channels are?

Should I apply filters and other effects to the original track, or to its spectral representation obtained via a Fourier transform?

Where can I get that kind of information?

I don't get it. Not even the first sentence. I thought from the title that this would be some sort of philosophical debate! — n00dles, Feb 22 '17 at 16:39
@MarcW I just want to know what information I can get from an audio file. — Barsik the Cat, Feb 23 '17 at 11:47
Digital audio is simply digital data representing the voltage level of the audio signal at each sampling point, quantized to the available bit depth. Digital audio has a sampling rate, which determines how many times per second the audio signal is "sampled".
I suggest you read up on digital audio sampling theory and then read up on file formats and audio interfaces such as "portaudio". This will give you a better idea of how to deal with audio in a processing context. — Mark, Feb 23 '17 at 11:49
All you can get from an audio file is the samples. Check out the libsndfile documentation; it will give you a useful interface to most sound file formats and allow you to access and manipulate the audio. — Mark, Feb 23 '17 at 11:50
To expand on what @Mark said, digital audio typically represents fluctuations in air pressure over time, which is what sound is (it can represent something else, but that's not very useful). Although this data contains, both theoretically and practically, complete information about the frequency content of a signal, you have to process it to find that information; there's no field in the audio file that says that from 1.034s to 1.038s there are sine waves of such-and-such amplitudes at such-and-such frequencies. — Linuxios, Feb 23 '17 at 16:49
When it comes to pitch, things are much more complicated, since pitch isn't as much a physical phenomenon as it is a psychoacoustic phenomenon. We could probably be more helpful if you narrowed down your question: are you asking about general techniques for modifying vocal sounds, or are you asking for the basics of digital audio processing? (Both are probably too broad.) — Linuxios, Feb 23 '17 at 16:53
@Linuxios I think I should start from the basics: before I go into making a voice changer, I need to understand what I'm going to work with. I came here to ask the question because all the info I found only talked about sampling rate and bit depth, but said nothing about what exactly each sample contains. — Barsik the Cat, Feb 24 '17 at 03:55
And what exactly do you mean by "voice changer", out of curiosity? — Linuxios, Feb 24 '17 at 07:20
@Linuxios yeah, it mostly does make sense. The voice changer I'm talking about is an application that can take a pre-recorded voice or microphone input and apply different effects, mostly to mimic someone else's voice. — Barsik the Cat, Feb 24 '17 at 07:59
You may want to take a look at this question: http://sound.stackexchange.com/questions/38093/how-to-make-your-voice-sound-like-another-persons-voice — Michael Hansen Buur, Feb 25 '17 at 13:46

score 1 · Accepted Answer · answered Feb 23 '17 at 18:02

There are many audio formats, but generally the file header and format chunk can quite easily give you some specific format information, if that's all you need: things like the number of channels, samples per second, bits per sample, etc. Frequency and other signal information is locked up in the sample data, which has to be analyzed to extract it.
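To make that split concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under assumptions, not the way any particular tool does it: the helper names (`write_test_tone`, `read_wav`, `dominant_frequency`) and the file `tone.wav` are made up for this example, it only handles 16-bit PCM WAV, and the frequency estimate is a naive DFT rather than a proper FFT. The format fields come straight from the header, while the frequency has to be computed from the samples.

```python
import cmath
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, freq_hz=440.0, rate=8000, n_frames=800):
    """Write a mono 16-bit PCM sine wave so the example has a file to read."""
    samples = [int(20000 * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(n_frames)]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % n_frames, *samples))


def read_wav(path):
    """Return (format info taken from the header, list of integer samples)."""
    with wave.open(path, "rb") as w:
        info = {
            "channels": w.getnchannels(),
            "sample_rate": w.getframerate(),
            "bits_per_sample": w.getsampwidth() * 8,
            "frames": w.getnframes(),
        }
        raw = w.readframes(info["frames"])
    # Assumption: 16-bit little-endian PCM. In a multi-channel file the
    # samples of the channels would be interleaved in this list.
    count = len(raw) // 2
    samples = list(struct.unpack("<%dh" % count, raw))
    return info, samples


def dominant_frequency(samples, rate):
    """Naive DFT: return the frequency in Hz of the strongest bin."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * rate / n


path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_test_tone(path)
info, samples = read_wav(path)
peak_hz = dominant_frequency(samples, info["sample_rate"])
print(info)          # format information read from the header
print(samples[:5])   # each sample is just a signed amplitude value
print(peak_hz)       # frequency found by analyzing the samples
```

Each sample is nothing more than one signed number per channel per sampling instant. For the generated test tone the strongest bin should land at 440 Hz; for real speech you would analyze short overlapping windows with an FFT library instead of this slow loop.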

A MediaInfo report of a .wav file, most of which is reported by the header and format chunks:
