You can use a chi-squared test for $k$ independent samples to test whether your classifiers (your models for predicting pass/fail) yield the same fraction of misclassified cases. If $H_0$ is rejected, i.e. there are differences among the model performances, you can use pairwise chi-squared tests to check whether one prediction model with a specific variable subset performs significantly better than the others.
The general test design is a $k \times 2$ chi-squared test where one column holds the number of correctly predicted cases and the other the number of wrongly predicted cases (simple counts) in each of the $k$ rows. Each of the $k$ models uses one of the variable subsets:
- Model 1: IV-1
- Model 2: IV-2
- Model 3: IV-3
- Model 4: IV-1 and IV-2
- Model 5: IV-1 and IV-3
- …
- Model 7: IV-1 and IV-2 and IV-3

(with three independent variables there are $2^3 - 1 = 7$ non-empty subsets).
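As a minimal sketch of the overall $k \times 2$ test, assuming hypothetical correct/misclassified counts for each of the seven models (the numbers below are made up for illustration), `scipy.stats.chi2_contingency` does the work:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = models, columns = [correct, misclassified]
table = np.array([
    [70, 30],   # Model 1: IV-1
    [65, 35],   # Model 2: IV-2
    [60, 40],   # Model 3: IV-3
    [80, 20],   # Model 4: IV-1 and IV-2
    [75, 25],   # Model 5: IV-1 and IV-3
    [72, 28],   # Model 6: IV-2 and IV-3
    [85, 15],   # Model 7: IV-1 and IV-2 and IV-3
])

# Omnibus test of H0: all models have the same misclassification rate
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

For a $7 \times 2$ table the test has $(7-1)(2-1) = 6$ degrees of freedom; a small $p$ rejects the hypothesis that all models misclassify at the same rate.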
In the second testing round you compare the models pairwise: Model 1 (IV-1) with Model 2 (IV-2), Model 1 with Model 3 (IV-3), Model 2 with Model 3, Model 1 with Model 4 (IV-1 and IV-2), Model 4 with Model 5 (IV-1 and IV-3), etc., in a series of pairwise chi-squared tests. Ordering the models by their chi-squared significance probabilities should point you to the significantly best-performing variable subset.
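The pairwise round can be sketched as a loop of $2 \times 2$ tests over all model pairs (again with made-up counts; note that `chi2_contingency` applies Yates' continuity correction to $2 \times 2$ tables by default):

```python
from itertools import combinations

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical [correct, misclassified] counts per model
counts = {
    "Model 1 (IV-1)": [70, 30],
    "Model 2 (IV-2)": [65, 35],
    "Model 4 (IV-1 and IV-2)": [80, 20],
}

# Each pair of models forms a 2x2 table: rows = models, cols = [correct, wrong]
for (name_a, row_a), (name_b, row_b) in combinations(counts.items(), 2):
    chi2, p, dof, _ = chi2_contingency(np.array([row_a, row_b]))
    print(f"{name_a} vs {name_b}: chi2 = {chi2:.2f}, p = {p:.4f}")
```

Each $2 \times 2$ comparison has a single degree of freedom, and the printed $p$-values are what you would then rank the variable subsets by.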
You have only $3$ prediction variables, so a Bonferroni correction for multiple testing should not be necessary. With larger variable sets the number of pairwise comparisons grows fast, and a Bonferroni correction is recommended.
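If you do apply Bonferroni, the adjustment is just to test each pair at $\alpha/m$, where $m$ is the number of pairwise comparisons. A sketch with hypothetical $p$-values:

```python
# Bonferroni: with m pairwise comparisons, test each at alpha / m.
# All p-values below are hypothetical, for illustration only.
m = 21                                 # all pairs of 7 models: 7 * 6 / 2
alpha = 0.05
p_values = [0.001, 0.004, 0.03, 0.20]  # from pairwise chi-squared tests

threshold = alpha / m                  # per-comparison significance level
significant = [p for p in p_values if p < threshold]
print(f"per-test threshold = {threshold:.5f}, significant: {significant}")
```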
SPSS or any other statistical package can certainly perform the chi-squared tests for you.
As a reference book I recommend: S. Siegel and N. J. Castellan, *Nonparametric Statistics for the Behavioral Sciences*, McGraw-Hill.