3

My code is like this:

public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        // not sure what to put here
    }
}

I know the role of GetHashCode in this context; what I'm missing is how to produce the InvariantCulture, IgnoreNonSpace and IgnoreCase version of obj so that I can return its hash code.

I could strip the diacritics and the case from obj myself and then return its hash code, but I wonder if there's a better alternative.

Andre Pena
  • Are you going to use this with a dictionary, `HashSet`, or anything else that uses hash tables? – Rawling Oct 22 '12 at 19:14
  • What is the purpose of this class? Could you provide an example of syntax this would allow you to avoid? – paparazzo Oct 22 '12 at 20:00

2 Answers

5

Returning 0 from GetHashCode() works (as pointed out by @Michael Perrenoud) because Dictionary and HashSet call Equals() only when GetHashCode() returns the same value for both objects.
The rule is that GetHashCode() must return the same value for any two objects that Equals() considers equal.
The drawback is that HashSet (or Dictionary) performance degrades to that of a List: every item lands in the same bucket, so a lookup has to call Equals() against each stored item.
A faster approach is to convert the string to an accent-insensitive, case-insensitive form and return that string's hash code.
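
To make the cost of a constant hash code concrete, here is a minimal sketch that counts how often Equals() runs during a single failed lookup. ConstantHashComparer and ConstantHashDemo are hypothetical names invented for this illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for illustration only: it counts Equals calls
// and always returns the same hash code.
class ConstantHashComparer : IEqualityComparer<string>
{
    public int EqualsCalls;

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    // Every item gets hash code 0, so they all share one bucket.
    public int GetHashCode(string obj)
    {
        return 0;
    }
}

static class ConstantHashDemo
{
    // Returns how many Equals calls one failed lookup needs
    // in a set holding `count` distinct items.
    public static int EqualsCallsForMiss(int count)
    {
        var comparer = new ConstantHashComparer();
        var set = new HashSet<string>(comparer);
        for (int i = 0; i < count; i++)
        {
            set.Add("item" + i);
        }

        comparer.EqualsCalls = 0; // ignore the calls made while filling the set
        set.Contains("missing");
        return comparer.EqualsCalls;
    }
}
```

With a well-distributed hash code the same lookup would normally need only a handful of Equals() calls, whereas here the count grows with the number of stored items.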

Code to remove accents (diacritics), taken from this post:

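// Requires: using System.Globalization; using System.Linq; using System.Text;
static string RemoveDiacritics(string text)
{
    // Decompose each character into base character + combining marks (FormD),
    // drop the combining marks, then recompose what is left (FormC).
    return string.Concat(
        text.Normalize(NormalizationForm.FormD)
        .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                                        UnicodeCategory.NonSpacingMark)
    ).Normalize(NormalizationForm.FormC);
}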

Comparer code:

using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        // Hash a normalized form (no diacritics, upper case) so that strings
        // Equals() treats as equal produce the same hash code.
        return obj != null ? RemoveDiacritics(obj).ToUpperInvariant().GetHashCode() : 0;
    }

    // Static: the helper does not use any instance state.
    private static string RemoveDiacritics(string text)
    {
        return string.Concat(
            text.Normalize(NormalizationForm.FormD)
            .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                                          UnicodeCategory.NonSpacingMark)
          ).Normalize(NormalizationForm.FormC);
    }
}
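
As a possible alternative to stripping diacritics by hand, CompareInfo exposes a GetHashCode(string, CompareOptions) overload, so the hash code can come from the same culture and options that the comparison uses. The sketch below assumes a runtime where that overload is public (as far as I know, .NET Core 2.0+ and .NET Framework 4.7.1+, but check your target). CompareInfoEqualityComparer is a hypothetical name chosen for this example.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Sketch: CompareInfo supplies both the comparison and the hash code,
// using one culture and one set of options for both.
public class CompareInfoEqualityComparer : IEqualityComparer<string>
{
    private static readonly CompareInfo Info = CultureInfo.InvariantCulture.CompareInfo;

    private const CompareOptions Options =
        CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

    public bool Equals(string x, string y)
    {
        return Info.Compare(x, y, Options) == 0;
    }

    public int GetHashCode(string obj)
    {
        // Assumption: the runtime provides CompareInfo.GetHashCode(string, CompareOptions).
        return obj != null ? Info.GetHashCode(obj, Options) : 0;
    }
}
```

It would be used the same way as the comparer above, for example `new HashSet<string>(new CompareInfoEqualityComparer())`. The appeal of this design is that Equals() and GetHashCode() cannot drift apart, because neither relies on a hand-written normalization step.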
Mitsugui
-1

Ah, excuse me, I had my methods mixed up. When I implemented something like this before, I just returned the hash code of the object itself (`return obj.GetHashCode();`) so that it would always enter the Equals method.


Okay, after much confusion I believe I've got myself straight. Returning zero, always, forces the comparer to fall back on the Equals method. I'm looking for the code where I implemented this so I can post it here as proof.


Here's the code to prove it:

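// Requires: using System.Collections.Generic;
class MyArrayComparer : EqualityComparer<object[]>
{
    public override bool Equals(object[] x, object[] y)
    {
        if (ReferenceEquals(x, y)) { return true; }
        if (x == null || y == null) { return false; }
        if (x.Length != y.Length) { return false; }
        for (int i = 0; i < x.Length; i++)
        {
            // object.Equals handles null elements, unlike x[i].Equals(y[i])
            if (!object.Equals(x[i], y[i]))
            {
                return false;
            }
        }
        return true;
    }

    public override int GetHashCode(object[] obj)
    {
        // Constant hash: every lookup falls through to Equals
        return 0;
    }
}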
Mike Perrenoud
  • I think you've got it backwards. Perhaps YOU should be using comments. I didn't downvote you, but I suspect that whoever did read the question and your answer and deemed it unhelpful. – Wug Oct 22 '12 at 18:48
  • @Wug, did you read my edit or the cached version of my answer? It would be helpful to get your arrogant feedback on my edit. – Mike Perrenoud Oct 22 '12 at 18:49
  • I'm not the downvoter, but I think that, when you pass an IEqualityComparer to a Dictionary or HashSet, the collection uses the IEqualityComparer to compute the hash codes of its items (it can't use the items' own GetHashCode, because they aren't being compared in the default way). Always returning zero will confuse these dictionaries. – Andre Pena Oct 22 '12 at 18:49
  • @AndréPena, I'm getting myself confused. Return zero, always, and that will force it to use the `Equals` method. – Mike Perrenoud Oct 22 '12 at 18:50
  • @BigM: If you edited it within 5 minutes of posting it, the edit gets lumped in with the original post and an edit history isn't visible. I'm just calling it like I see it, no need to get all defensive. – Wug Oct 22 '12 at 18:52
  • @Wug, it was your tone. You stated `Perhaps YOU`, the caps shows your arrogance and attitude. I'm sure it makes you feel better to comment like that, but it's not necessary. – Mike Perrenoud Oct 22 '12 at 18:54
  • Capitals equals emphasis. Yes, there was tone in what I posted. It's hard to convey tone with text, so I added some. Don't read hostility into every line you see. – Wug Oct 22 '12 at 18:55
  • @Wug, it's not hostility, it's arrogance. But maybe when people are like that with you it gives you a warm fuzzy feeling. At any rate, I've got myself straight and the answer is correct, I've proven the model more than once and it's even on an answer here on Stack Overflow. – Mike Perrenoud Oct 22 '12 at 18:58