A classification model trained on data this imbalanced may not see enough of the rare class to learn what distinguishes it from the majority class. In my view, an SVM with class weights tends to cope better in such situations.
The svm() function in the e1071 package has a parameter called class.weights: a named vector of weights for the different classes, used for asymmetric class sizes. It might solve the problem you are facing.
Sample code:
library(e1071)
# Weights: (example not particularly sensible)
i2 <- iris
# Merge "virginica" into "versicolor" to get two unequal classes (50 vs 100)
levels(i2$Species)[3] <- "versicolor"
# Convert the dependent variable to binary (0/1) format
levels(i2$Species)[levels(i2$Species) == "versicolor"] <- "1"
levels(i2$Species)[levels(i2$Species) != "1"] <- "0"
summary(i2$Species) # Summary of dependent
weights <- 100 / table(i2$Species) # Creating a named vector of weights
weights # a named vector which contains weights for each class
model <- svm(Species ~ ., data = i2, class.weights = weights)
In your dataset roughly 2% of the observations are class 1 and 98% are class 0, so you should pass a named vector of weights with 98 for class 1 and 2 for class 0 (assuming you want to give equal importance to each class).
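A minimal sketch of that 98/2 weighting, for illustration only. The data frame d, its columns x1, x2 and y, and the simulated values are hypothetical stand-ins for your dataset; only svm() and its class.weights argument come from e1071.

```r
library(e1071)

# Hypothetical stand-in data: 980 rows of class "0" and 20 rows of class "1"
set.seed(42)
y <- factor(c(rep("0", 980), rep("1", 20)), levels = c("0", "1"))
shift <- ifelse(y == "1", 2, 0)  # rare class is shifted so it is learnable
d <- data.frame(
  x1 = rnorm(length(y), mean = shift),
  x2 = rnorm(length(y), mean = shift),
  y  = y
)

# Named weight vector: names must match the factor levels of the response.
# The rare class gets the large weight, the common class the small one.
w <- c("0" = 2, "1" = 98)

model <- svm(y ~ ., data = d, class.weights = w)

# Confusion table on the training data, to see how the rare class is handled
pred <- predict(model, d)
table(predicted = pred, actual = d$y)
```

The weights only matter relative to each other, so c("0" = 1, "1" = 49) should behave the same way. If you care more about one kind of error than the other, adjust the ratio rather than keeping it strictly inverse to the class frequencies.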