This question might be silly, but how do I know which category (value) within a feature is significant?
Say I have some exam result data of students from four different countries:

How do I know which country (A, B, C or D) is significant for whether a student passes? That is, can I say that students from countries A, B and C are more likely to pass, while students from country D are more likely to fail? Or, because countries B and C have so much less data, can I only draw conclusions for countries A and D? How do I determine the threshold for this?
One idea I have is to run a chi-square test for each country and see which countries are significant (at a 5% significance level, i.e. 95% confidence). However, this does not seem to work well when one country has far more data than the others, since the expected values will be skewed towards the distribution of the country with more data.
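To make the idea concrete, here is a minimal sketch of the per-country test I have in mind: a 2x2 Pearson chi-square test of one country against all the others pooled together. The counts are hypothetical placeholders, not my real data, and the function name `one_vs_rest_chi2` is made up for illustration. It also returns the smallest expected cell count, since the chi-square approximation is usually considered unreliable when that falls below about 5.

```python
import math

# Hypothetical placeholder counts as (passed, failed); NOT the real data.
counts = {"A": (120, 80), "B": (9, 1), "C": (6, 4), "D": (60, 140)}

def one_vs_rest_chi2(counts, country):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) on the 2x2 table: `country` vs. all other countries pooled.

    Returns (chi2 statistic, p-value, smallest expected cell count).
    Assumes every row and column total is non-zero.
    """
    p, f = counts[country]
    rest_p = sum(v[0] for k, v in counts.items() if k != country)
    rest_f = sum(v[1] for k, v in counts.items() if k != country)

    observed = [[p, f], [rest_p, rest_f]]
    row = [p + f, rest_p + rest_f]   # students per group
    col = [p + rest_p, f + rest_f]   # total passes, total fails
    n = row[0] + row[1]

    # Expected counts under independence of country and pass/fail.
    expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # Survival function of the chi-square distribution with 1 dof.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    min_expected = min(min(r) for r in expected)
    return chi2, p_value, min_expected

for name in counts:
    chi2, p_value, min_expected = one_vs_rest_chi2(counts, name)
    print(f"{name}: chi2={chi2:.2f}, p={p_value:.4f}, min expected={min_expected:.1f}")
```

Running one test per country also means making several comparisons at once, so I assume the 5% threshold would need some adjustment (for example a Bonferroni correction), and for the small countries an exact test may be more appropriate than the chi-square approximation.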
Any thoughts will be appreciated.
Edit: I ask because my next step is to find out which country's students have more influence on the pass/fail outcome in my data. I will be computing some sort of "influence measure" for each country. From my very limited statistics knowledge, I guess my null hypothesis would be that country has no influence, i.e. that country and the pass/fail result are independent?
(For reference, the measure I want to compute is shown below, using country A as an example. It tells me how many times more likely a student from country A is to pass than a student from the other countries. Based on the data above, we can already see that country B will have a much higher "influence" on "pass" in my data than the other countries. But country B has much less data. How can we test whether country B's result is significant?)
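As a rough illustration of the kind of check I am after, the sketch below assumes the measure can be read as a relative risk: the pass rate in one country divided by the pass rate in all other countries combined. That reading, the helper name `relative_risk_ci`, and the counts are all assumptions for illustration, not my real data. The interval uses the standard log-scale normal approximation; if it excludes 1, the country's pass rate differs from the rest at roughly the 5% level, and a small sample shows up as a wide interval.

```python
import math

def relative_risk_ci(passed, failed, rest_passed, rest_failed, z=1.96):
    """Relative risk of passing (country vs. all others) with an
    approximate 95% confidence interval on the log scale.

    Assumes `passed` and `rest_passed` are both non-zero.
    """
    n1 = passed + failed            # students in the country
    n2 = rest_passed + rest_failed  # students in all other countries
    rr = (passed / n1) / (rest_passed / n2)

    # Standard error of log(RR) under the usual normal approximation.
    se = math.sqrt(1 / passed - 1 / n1 + 1 / rest_passed - 1 / n2)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical placeholder counts: a small country with 9 passes and 1 fail,
# against 186 passes and 224 fails in the other countries combined.
rr, lower, upper = relative_risk_ci(9, 1, 186, 224)
print(f"RR={rr:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

With so few students in the small country the normal approximation is itself shaky, so I would treat the interval as indicative only.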
