I'm new to Machine Learning.
I've just finished the Coursera course. :)
For my first practical attempt I wanted to "analyse" a local used-car sales website in order to build a model that would "predict" the final price.
I have a problem with "encoding" the car features. Some of them are "discrete" (make, model, gearbox encoded as 1 - manual, 2 - automatic, 3 - semi-automatic, fuel encoded as 1 - petrol, 2 - diesel, 3 - electric, etc.), and some are continuous (engine volume, engine power, mileage, etc.).
The issue is that some of these features might be absent, as it is not compulsory to fill them all in.
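For concreteness, this is roughly how a listing ends up represented. It is only a sketch: the field names and the `encode` helper are my own illustration, and `None` is just a stand-in for whatever the missing-value marker should be.

```python
# Integer codes for the discrete features, as described above.
GEARBOX = {"manual": 1, "automatic": 2, "semi-automatic": 3}
FUEL = {"petrol": 1, "diesel": 2, "electric": 3}


def encode(listing):
    """Turn a raw listing dict into a feature list; None marks an absent field."""
    return [
        GEARBOX.get(listing.get("gearbox")),  # None when the field was not filled in
        FUEL.get(listing.get("fuel")),
        listing.get("engine_volume"),
        listing.get("mileage"),
    ]


# Hypothetical listing where the seller skipped fuel type and mileage.
listing = {"gearbox": "manual", "engine_volume": 1.6}
features = encode(listing)
print(features)
```

The `None` entries are exactly the values I do not know how to fill in.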
My main question is: should I use some special value for representing a missing feature?
I don't feel like using "0" (zero) would do any good, since "0 * theta = 0": absolutely any "theta" would do in this particular case. Should I set it to, say, "-1" or something? What is a common approach to this?
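To show what I mean about zero, here is a minimal sketch assuming a plain linear hypothesis (the dot product of "theta" and the features), as in the course. The feature values and both "theta" vectors are made up for illustration.

```python
def predict(theta, x):
    """Linear hypothesis: dot product of parameters and features."""
    return sum(t * v for t, v in zip(theta, x))


# Hypothetical listing: [bias term, engine volume, mileage],
# where mileage is missing and has been stored as 0.
x = [1.0, 2.0, 0.0]

theta_a = [500.0, 3000.0, -0.05]
theta_b = [500.0, 3000.0, 123.0]  # only the mileage weight differs

price_a = predict(theta_a, x)
price_b = predict(theta_b, x)
print(price_a == price_b)  # the mileage weight has no effect on this example
```

Both parameter vectors give the same prediction for this listing, so this example says nothing about what the mileage weight should be.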
And what about feature scaling in that case?