I want to make a voice changer. I found descriptions of some effects, but I don't really understand what digital audio looks like.
How much information can I get from different audio formats, or input audio streams?
Can I get the frequency, the pitch, and the channel layout?
Should I apply filters and other effects to the original track, or to its spectral representation obtained via a Fourier transform?
Where can I get that kind of information?

I suggest you read up on digital audio sampling theory first, then on file formats and audio interfaces such as PortAudio. That will give you a better idea of how to handle audio in a processing context.
– Mark Feb 23 '17 at 11:49
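
To make the question of "what digital audio looks like" a bit more concrete, here is a minimal sketch in Python using only the standard library. It assumes uncompressed 16-bit PCM WAV data; compressed formats such as MP3 would have to be decoded first. The header gives the channel count, sample rate and sample width directly, while frequency content has to be estimated from the samples. The function names (`make_test_wav`, `read_wav`, `dominant_frequency`) are made up for this illustration, and the naive DFT is only meant to show the idea; real code would more likely use an FFT library.

```python
import io
import math
import struct
import wave


def make_test_wav(freq_hz=440.0, rate=8000, n_samples=400):
    """Build a mono 16-bit PCM WAV in memory (a hypothetical test tone)."""
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes = 16 bits per sample
        w.setframerate(rate)   # samples per second
        w.writeframes(frames)
    buf.seek(0)
    return buf


def read_wav(wav_file):
    """Return (channels, sample_rate, sample_width, first-channel samples).

    Assumes 16-bit little-endian PCM, which is an assumption of this sketch.
    """
    with wave.open(wav_file, "rb") as r:
        channels = r.getnchannels()
        rate = r.getframerate()
        width = r.getsampwidth()
        raw = r.readframes(r.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Channels are interleaved: L R L R ... so take every `channels`-th value.
    return channels, rate, width, list(values[::channels])


def dominant_frequency(samples, rate):
    """Estimate the strongest frequency with a naive O(n^2) DFT."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):  # skip DC, stop below the Nyquist bin
        step = 2 * math.pi * k / n
        re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * rate / n  # convert bin index to Hz


if __name__ == "__main__":
    channels, rate, width, samples = read_wav(make_test_wav())
    print("channels:", channels, "rate:", rate, "bytes/sample:", width)
    print("dominant frequency (Hz):", dominant_frequency(samples, rate))
```

For the generated test tone, the estimate should come out at or very near 440 Hz. Note that for a real voice the strongest spectral peak is not necessarily the perceived pitch, so treat this as a starting point rather than a pitch detector.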