3

I've always found the lack of a built-in aggregate function that just strings together values using some user defined separator, frustrating.

The closest thing I know of is the XML hack:

select s.string as [text()] 
from strings s 
for xml path('')

But in my opinion that's a very strange way of doing it and you still have to handle the case with trailing or leading separator junk. So, I thought I'd write a CLR aggregate:

select dbo.Fold(s.string, ', ') 
from strings s

That's kind of nice. Except this thing doesn't perform very well when the number of rows reaches 5000. I don't really expect to run it against that much data but to my surprise the performance of the aggregate degrades rather rapidly as the data set grows. I didn't understand why at first, then I realized that the way the UDF is handled in SQL Server is the problem.

I'm using an UDF with a StringBuilder class, because of this I have to implement IBinarySerialize. But for reasons I do not understand SQL Server is calling the Write/Read pair between each iteration of Accumulate. Obviously this creates a performance problem when the string gets big. Is there anyway to avoid this costly serialization?

FYI, the XML hack are orders of magnitudes faster.

John Leidegren
  • 58,171
  • 19
  • 126
  • 149

1 Answers1

1

What you are doing is implementing the MySQL GROUP_CONCAT

See these for inspiration...

Personally, I use the XML trick.

Community
  • 1
  • 1
gbn
  • 408,740
  • 77
  • 567
  • 659
  • The XML trick appears to be the right choice in this matter, it's just a horrible syntax. The group_concat stuff maintains a frequency table while running the aggregate, I'm guessing it performance is just as bad but it's an interesting approach. – John Leidegren Jun 13 '11 at 11:02