4

I am looking for some built-in function which is able to find out in how many character two string differ.

For example:

CharDiff('SO0035F', 'SO005F') = 1

CharDiff('12345', '34512') = 0
jannagy02
  • 2,291
  • 1
  • 25
  • 42
  • 1
    I'm pretty sure there is no built-in function for that in sql-server or it would be listed here: http://msdn.microsoft.com/en-US/us-en/library/ms181984.aspx but it should not be too hard to write your own function for that, besides that, what would be your expected result for example for `CharDiff('SO0035F', 'SO035F')` (0 exists in both strings but the count of it varies) should this also return 1? – DrCopyPaste May 12 '14 at 11:25
  • Yes, I see that it was not clearly written, I think I should write my own function for this. The answer of your question is that it should check not only the existance of the character in the other string but that how many times it apperas it also matters. – jannagy02 May 12 '14 at 11:38
  • What result do you expect when comparing 'ab' and 'cde'? – msi77 May 13 '14 at 14:52
  • I expect 5, but now it is clear that it can't be solved without wrinting my own function. – jannagy02 May 14 '14 at 10:47
  • @jannagy02 as you wrote your last comment after my answer, did you consider my answer? – DrCopyPaste May 15 '14 at 09:40
  • @DrCopyPaste Of course, but if you read my question again, I emphasized that BUILT-IN. But I will probably get back to this question in some days, please be patient if it help me, I will accept :) – jannagy02 May 15 '14 at 13:53

2 Answers2

4

I needed something slightly different. I needed a comparison of '1234' to show as 2 characters different than '1243' even though they contain the same characters.

I came up with the following:

 CREATE FUNCTION dbo.CharDiff (@string1 NVARCHAR(MAX), @string2 NVARCHAR(MAX))
 RETURNS INT
 AS
 BEGIN
     DECLARE @diffs INT = 0
     WHILE LEN(@string1) > 0 AND LEN(@string2) > 0
     BEGIN
        IF SUBSTRING(@string1,1,1) <> SUBSTRING(@string2,1,1)
            SELECT @diffs = @diffs + 1
        SELECT @string1 = SUBSTRING(@string1,2,9999)
        SELECT @string2 = SUBSTRING(@string2,2,9999)
     END
     RETURN @diffs + LEN(@string1) + LEN(@string2)
 END
Scott R. Frost
  • 1,955
  • 1
  • 20
  • 25
2

As this page lists all string functions available in SQL Server, I am pretty sure there is no built-in functionality for exactly this use case available.

However, with a little help from this post I came up with the following which seems to fit your needs:

CREATE FUNCTION dbo.CharDiff (@string1 NVARCHAR(MAX), @string2 NVARCHAR(MAX))
RETURNS INT
AS
BEGIN

    DECLARE @allDifferences INT = 0
    DECLARE @charCount1 INT
    DECLARE @charCount2 INT

    --do this as long as both strings are longer than 0
    WHILE LEN(@string1) > 0 AND LEN(@string2) > 0
    BEGIN
        --get char count for the character at index 1 in string 1
        SELECT @charCount1 = (LEN(@string1) - LEN(REPLACE(@string1, SUBSTRING(@string1, 1, 1), '')))
        --get char count for the character at index 1 in string 1 but for string2
        SELECT @charCount2 = (LEN(@string2) - LEN(REPLACE(@string2, SUBSTRING(@string1, 1, 1), '')))

        --strip all chars that now have been counted from string 2
        SELECT @string2 = REPLACE(@string2, SUBSTRING(@string1, 1, 1),'')
        --strip all chars that now have been counted from string 1
        SELECT @string1 = REPLACE(@string1, SUBSTRING(@string1, 1, 1),'')

        --add difference to counting variable
        SELECT @allDifferences = @allDifferences + ABS(@charCount1 - @charCount2)
    END

    --is there any rest length on any of those string?
    SELECT @allDifferences = @allDifferences + ABS(LEN(@string1) - LEN(@string2))


    RETURN @allDifferences
END

Basically I just count the occurence of the character that currently exists on index 1 in the first string in both strings. Then I delete all those already counted characters from both strings (that way index 1 holds another character each iteration), as long as both strings have any characters in them. The rest length of a string that might still exist after that loop can simply be added to @allDifferences.

Community
  • 1
  • 1
DrCopyPaste
  • 3,913
  • 1
  • 21
  • 55