5

How do I remove all non-alphanumeric strings from a list of strings (List<string>)?

I found this regex, !word.match(/^[[:alpha:]]+$/), but how can I obtain, in C#, a new list that contains only the strings that are purely alphanumeric?

Konamiman
LeMoussel

4 Answers

10

You can use LINQ for this. Assuming theList is a list, an array, or any other sequence containing your strings:

var theNewList = theList.Where(item => item.All(ch => char.IsLetterOrDigit(ch)));

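Append .ToList() or .ToArray() if you need a concrete collection rather than a lazily evaluated IEnumerable<string>. This works because the String class implements IEnumerable<char>, so All can test each character in turn. Note that char.IsLetterOrDigit is Unicode-aware, so letters and digits outside the ASCII range also pass.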
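For completeness, here is a minimal, self-contained sketch of the same approach. The class and method names (AlphanumericFilterDemo, KeepAlphanumeric) are made up for illustration. One deliberate difference from the one-liner: All returns true for an empty sequence, so an empty string would pass the filter; the sketch assumes you want to drop empty strings and adds a length check. Remove it if that assumption does not fit your case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlphanumericFilterDemo
{
    // Hypothetical helper: returns a new list holding only the strings
    // that consist entirely of letters and digits (Unicode-aware).
    public static List<string> KeepAlphanumeric(IEnumerable<string> source)
    {
        return source
            // Assumption: empty strings are not wanted. Without the length
            // check, All() would accept them because it is vacuously true.
            .Where(item => item.Length > 0 && item.All(ch => char.IsLetterOrDigit(ch)))
            .ToList();
    }

    public static void Main()
    {
        var theList = new List<string> { "aa", "kzozd__", "4edz45", "", "a b" };
        var theNewList = KeepAlphanumeric(theList);

        // "kzozd__" contains underscores, "" is empty, "a b" contains a space.
        Console.WriteLine(string.Join(", ", theNewList)); // prints: aa, 4edz45
    }
}
```

The original list is left untouched; the method builds and returns a new one.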

Konamiman
3
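  // Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
  // Matches strings made up only of ASCII letters and digits (the empty string also matches).
  Regex rgx = new Regex("^[a-zA-Z0-9]*$");
  List<string> list = new List<string>() { "aa", "a", "kzozd__", "4edz45", "5546", "4545asas" };
  List<string> list1 = new List<string>();
  foreach (var item in list)
  {
      if (rgx.IsMatch(item))
      {
          list1.Add(item); // keep only the purely alphanumeric strings
      }
  }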
Ghassen
1

With LINQ + regex, you can use this:

list = list.Where(s => Regex.IsMatch(s, "^[\\p{L}0-9]*$")).ToList();

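In ^[\\p{L}0-9]*$, \p{L} matches any Unicode letter (the backslash is doubled because it sits inside a regular C# string literal), while 0-9 covers only the ASCII digits; add \p{Nd} to the character class if Unicode decimal digits should match too. If you want ASCII only, ^[a-zA-Z0-9]*$ will work just as well. Both patterns also match the empty string; use + instead of * to exclude it.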
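To make the difference between the Unicode and ASCII patterns concrete, here is a small sketch that runs both over the same input. The class name, the Filter helper, and the sample data are invented for illustration, and the patterns use + rather than * on the assumption that empty strings should be rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexFilterDemo
{
    // \p{L} = any Unicode letter, \p{Nd} = any Unicode decimal digit.
    // A verbatim string (@"...") avoids doubling the backslashes.
    public static readonly Regex UnicodeAlnum = new Regex(@"^[\p{L}\p{Nd}]+$");

    // ASCII letters and digits only.
    public static readonly Regex AsciiAlnum = new Regex("^[a-zA-Z0-9]+$");

    // Hypothetical helper: keeps the strings that match the given pattern.
    public static List<string> Filter(IEnumerable<string> source, Regex pattern)
    {
        return source.Where(s => pattern.IsMatch(s)).ToList();
    }

    public static void Main()
    {
        // "\u00E9t\u00E9" is the French word for "summer", written with escapes
        // so the example does not depend on the source file's encoding.
        var list = new List<string> { "abc123", "\u00E9t\u00E9", "kzozd__", "" };

        // Unicode pattern keeps "abc123" and the accented word.
        Console.WriteLine(Filter(list, UnicodeAlnum).Count); // prints 2

        // ASCII pattern keeps only "abc123".
        Console.WriteLine(Filter(list, AsciiAlnum).Count);   // prints 1
    }
}
```

Creating each Regex once and reusing it avoids re-parsing the pattern for every string, which can matter for long lists.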

M.kazem Akhgary
wingerse
0

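Here's a static helper method that returns a new list containing only the alphanumeric strings from the input list (the input itself is not modified). Because the pattern uses +, empty strings are excluded as well: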

    public static List<string> RemoveAllNonAlphanumeric(List<string> Input)
    {
        var TempList = new List<string>();
        foreach (var CurrentString in Input)
        {
            if (Regex.IsMatch(CurrentString, "^[a-zA-Z0-9]+$"))
            {
                TempList.Add(CurrentString);
            }
        }
        return TempList;
    }
GeorgDangl