
I have a problem. My dataset contains categorical variables, and I need to convert them to numeric because I want to compare the accuracy of logistic regression and a neural network. I used createDataPartition from caret, but the resulting train and test sets contain factor variables rather than numeric ones. If I partition the data this other way instead:

n_train <- round(0.7 * nrow(data_d))
split1 <- sample(rep(c(0, 1), times = c(n_train, nrow(data_d) - n_train)))

the split is not representative of the 0s and 1s in the numeric variables. How can I fix this?
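
A minimal sketch of the kind of split being asked for, assuming the outcome column is named y.Yes as in the encoded data further down. The toy data frame, its values, and the seed are made up for illustration. The idea is to sample row indices within each class, so the columns stay numeric while both sets keep roughly the same share of 0s and 1s:

```r
# Stratified 70/30 split in base R: sample row indices within each class.
set.seed(42)  # arbitrary seed, only for reproducibility

# Toy stand-in for the one-hot encoded data (hypothetical values).
toy <- data.frame(
  y.Yes  = rep(c(0, 1), times = c(70, 30)),
  tenure = rnorm(100)
)

# Group row indices by class, then draw 70% from each group.
idx_by_class <- split(seq_len(nrow(toy)), toy$y.Yes)
train_idx <- unlist(lapply(idx_by_class, function(i) {
  i[sample.int(length(i), size = round(0.7 * length(i)))]
}), use.names = FALSE)

train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Columns are still numeric, and the class shares can be compared directly.
str(train)
prop.table(table(train$y.Yes))
prop.table(table(test$y.Yes))
```

With caret, createDataPartition(factor(toy$y.Yes), p = 0.7, list = FALSE) should give comparable row indices. Because it only returns indices, they can be applied to the numeric data frame, with the factor created solely for the partition call.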

DATASET and DATA MANIPULATION:

'data.frame':   7032 obs. of  20 variables:
 $ y               : chr  "No" "No" "Yes" "No" ...
 $ gender          : chr  "Female" "Male" "Male" "Male" ...
 $ SeniorCitizen   : chr  "0" "0" "0" "0" ...
 $ Partner         : chr  "Yes" "No" "No" "No" ...
 $ Dependents      : chr  "No" "No" "No" "No" ...
 $ tenure          : chr  "1" "34" "2" "45" ...
 $ PhoneService    : chr  "No" "Yes" "Yes" "No" ...
 $ MultipleLines   : chr  "No" "No" "No" "No" ...
 $ InternetService : chr  "DSL" "DSL" "DSL" "DSL" ...
 $ OnlineSecurity  : chr  "No" "Yes" "Yes" "Yes" ...
 $ OnlineBackup    : chr  "Yes" "No" "Yes" "No" ...
 $ DeviceProtection: chr  "No" "Yes" "No" "Yes" ...
 $ TechSupport     : chr  "No" "No" "No" "Yes" ...
 $ StreamingTV     : chr  "No" "No" "No" "No" ...
 $ StreamingMovies : chr  "No" "No" "No" "No" ...
 $ Contract        : chr  "Month-to-month" "One year" "Month-to-month" "One year" ...
 $ PaperlessBilling: chr  "Yes" "No" "Yes" "No" ...
 $ PaymentMethod   : chr  "Electronic check" "Mailed check" "Mailed check" "Bank transfer (automatic)" ...
 $ MonthlyCharges  : chr  "29.85" "56.95" "53.85" "42.3" ...
 $ TotalCharges    : chr  "29.85" "1889.5" "108.15" "1840.75" ...
.....

After one-hot encoding and further manipulation, I have this:
'data.frame':   7032 obs. of  27 variables:
 $ y.Yes                                  : num  0 0 1 0 1 1 0 0 1 0 ...
 $ gender.Male                            : num  0 1 1 1 0 0 1 0 0 1 ...
 $ SeniorCitizen.1                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Partner.Yes                            : num  1 0 0 0 0 0 0 0 1 0 ...
 $ Dependents.Yes                         : num  0 0 0 0 0 0 1 0 0 1 ...
 $ tenure                                 : num  -1.2802 0.0643 -1.2394 0.5124 -1.2394 ...
 $ PhoneService.Yes                       : num  0 1 1 0 1 1 1 0 1 1 ...
 $ MultipleLines.Yes                      : num  0 0 0 0 0 1 1 0 1 0 ...
 $ InternetService.DSL                    : num  1 1 1 1 0 0 0 1 0 1 ...
 $ InternetService.Fiber.optic            : num  0 0 0 0 1 1 1 0 1 0 ...
 $ InternetService.No                     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ OnlineSecurity.Yes                     : num  0 1 1 1 0 0 0 1 0 1 ...
 $ OnlineBackup.Yes                       : num  1 0 1 0 0 0 1 0 0 1 ...
 $ DeviceProtection.Yes                   : num  0 1 0 1 0 1 0 0 1 0 ...
 $ TechSupport.Yes                        : num  0 0 0 1 0 0 0 0 1 0 ...
 $ StreamingTV.Yes                        : num  0 0 0 0 0 1 1 0 1 0 ...
 $ StreamingMovies.Yes                    : num  0 0 0 0 0 1 0 0 1 0 ...
 $ Contract.Month.to.month                : num  1 0 1 0 1 1 1 1 1 0 ...
 $ Contract.One.year                      : num  0 1 0 1 0 0 0 0 0 1 ...
 $ Contract.Two.year                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ PaperlessBilling.Yes                   : num  1 0 1 0 1 1 1 0 1 0 ...
 $ PaymentMethod.Bank.transfer..automatic.: num  0 0 0 1 0 0 0 0 0 1 ...
 $ PaymentMethod.Credit.card..automatic.  : num  0 0 0 0 0 0 1 0 0 0 ...
 $ PaymentMethod.Electronic.check         : num  1 0 0 0 1 1 0 0 1 0 ...
 $ PaymentMethod.Mailed.check             : num  0 1 1 0 0 0 0 1 0 0 ...
 $ MonthlyCharges                         : num  -1.162 -0.261 -0.364 -0.748 0.196 ...
 $ TotalCharges                           : num  -0.994 -0.174 -0.96 -0.195 -0.94 ...
...

The dput output runs to thousands of numbers, so I can't post it here.
