I have a dataset in .csv format as shown:

NRC_CLASS,L1_MARKS_FINAL,L2_MARKS_FINAL,L3_MARKS_FINAL,S1_MARKS_FINAL,S2_MARKS_FINAL,S3_MARKS_FINAL,
FAIL,7,12,12,24,4,30,
PASS,49,36,46,51,31,56,
FAIL,59,35,42,18,18,45,
PASS,61,30,51,33,30,52,
PASS,68,30,35,53,45,54,
2,82,77,75,32,36,56,
FAIL,18,35,35,32,21,35,
2,86,56,46,44,37,60,
1,94,45,62,70,50,59,

The first column gives the overall grade:

FAIL - Fail
PASS - Pass class
1 - First class
2 - Second class
D - Distinction

This is followed by marks of each student in 6 subjects.

Is there any way I can find out which subject's performance makes a difference to the overall outcome?

I am using Weka and have used J48 to build a tree.
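For illustration, one rough way to compare subjects is to score each one by the information gain of its best single threshold split. This is related to, but not the same as, the gain ratio criterion that J48 (Weka's C4.5 implementation) uses when it picks splits. The sketch below is plain Python rather than Weka, uses only the nine sample rows shown above, and the function names (`entropy`, `best_split_gain`) are made up for this example.

```python
import math
from collections import Counter

# Subject columns and the nine sample rows from the question
# (class label first, then the six marks in column order).
SUBJECTS = ["L1_MARKS_FINAL", "L2_MARKS_FINAL", "L3_MARKS_FINAL",
            "S1_MARKS_FINAL", "S2_MARKS_FINAL", "S3_MARKS_FINAL"]
ROWS = [
    ("FAIL", 7, 12, 12, 24, 4, 30),
    ("PASS", 49, 36, 46, 51, 31, 56),
    ("FAIL", 59, 35, 42, 18, 18, 45),
    ("PASS", 61, 30, 51, 33, 30, 52),
    ("PASS", 68, 30, 35, 53, 45, 54),
    ("2", 82, 77, 75, 32, 36, 56),
    ("FAIL", 18, 35, 35, 32, 21, 35),
    ("2", 86, 56, 46, 44, 37, 60),
    ("1", 94, 45, 62, 70, 50, 59),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest information gain over all binary splits 'value <= t'."""
    n = len(labels)
    base = entropy(labels)
    best = 0.0
    # Skip the maximum so the right-hand side is never empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        best = max(best, gain)
    return best


labels = [row[0] for row in ROWS]
gains = {
    name: best_split_gain([row[i + 1] for row in ROWS], labels)
    for i, name in enumerate(SUBJECTS)
}

# Print subjects from most to least informative on this tiny sample.
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:.3f}")
```

Nine rows are far too few to draw conclusions from; the ranking would only be meaningful on the full dataset. It also looks at each subject in isolation, so it ignores interactions between subjects.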

/* UPDATE */

The summary of J48 classifier is:

=== Summary ===

Correctly Classified Instances       30503               92.5371 %
Incorrectly Classified Instances      2460                7.4629 %
Kappa statistic                          0.902 
Mean absolute error                      0.0332
Root mean squared error                  0.1667
Relative absolute error                 10.8867 %
Root relative squared error             42.7055 %
Total Number of Instances            32963 

Also I discretized the marks data into 10 bins with useEqualFrequency set to true. The summary of J48 now is:

=== Summary ===

Correctly Classified Instances       28457               86.3301 %
Incorrectly Classified Instances      4506               13.6699 %
Kappa statistic                          0.8205
Mean absolute error                      0.0742
Root mean squared error                  0.2085
Relative absolute error                 24.3328 %
Root relative squared error             53.4264 %
Total Number of Instances            32963 
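
For reference, equal-frequency discretization chooses cut points so that each bin holds roughly the same number of instances, rather than spanning the same range of marks. The sketch below is a plain-Python approximation of that idea, not Weka's actual filter code; the helper names `equal_frequency_edges` and `assign_bin` are invented here, and it uses 3 bins on the nine sample `L1_MARKS_FINAL` values only to keep the example small.

```python
from bisect import bisect_left


def equal_frequency_edges(values, n_bins):
    """Cut points that put roughly the same number of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    edges = []
    for k in range(1, n_bins):
        idx = k * n // n_bins
        if idx == 0:
            continue  # fewer values than bins: no room for this cut
        # Place the cut halfway between the two neighbouring observed values.
        cut = (ordered[idx - 1] + ordered[idx]) / 2
        # Ties can produce repeated cuts; keep the edges strictly increasing.
        if not edges or cut > edges[-1]:
            edges.append(cut)
    return edges


def assign_bin(value, edges):
    """Index of the bin a value falls into (a value equal to an edge goes low)."""
    return bisect_left(edges, value)


# L1_MARKS_FINAL from the nine sample rows in the question.
l1_marks = [7, 49, 59, 61, 68, 82, 18, 86, 94]
edges = equal_frequency_edges(l1_marks, n_bins=3)
bins = [assign_bin(m, edges) for m in l1_marks]
print(edges)  # [54.0, 75.0]
print(bins)   # [0, 0, 1, 1, 1, 2, 0, 2, 2]
```

With heavily tied marks the bins cannot be made exactly equal in size, so the number of distinct bins may come out lower than requested.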
Anvith

1 Answer

You seem to have an ordinal response variable. Not knowing this grading system, I cannot tell what the order is, but you will know. One modern approach (probably less well known when you wrote the question 9 years ago) is ordinal regression. Try perusing the related tags, e.g. [tag:ordered-logit], and search this site; there are many good posts. Some: