6

I want to get a list of all possible values for a regular expression.

Input :

2W
9WW
7W0W3

where W can be any digit from 0 to 9. i.e. W = [0-9]

Output:

20,21,22,....29
900,901,...910,911,...999
70003,70013,70023,...71003,72003,...79093

What I did :

I'm using Java and decided to create an ArrayList of Integers.

I created a method ArrayList<Integer> getNumbers(String regex).

ArrayList<Integer> getNumbers(String regex){

ArrayList<Integer> fullList = new ArrayList<Integer>();

char[] cArray = regex.toCharArray(); //converted the string into a character array.

   for(int i=1;i<cArray.length;i++) {

       if(cArray[i] == 'W') {              

           for(int j=0;j<10;j++) {
               //I'm not sure what goes here
              fullList.add(the number with 'w' at this index replaced by 'j');
           }               
       }

   }
return fullList;
}

Is there any better way or library functions available to generate all such numbers?

How can I achieve this?

halfer
  • 19,471
  • 17
  • 87
  • 173
Nikhil
  • 6,187
  • 9
  • 29
  • 58
  • This is opposite to what regex does. I do not know if there are libraries that generate possible strings that match a pattern. – Wiktor Stribiżew Aug 27 '15 at 08:04
  • Regex has got nothing to do with this question.. Its all about logic here – TheLostMind Aug 27 '15 at 08:14
  • Python package `sre_yield` provides a list of all matching strings for a regular expression pattern. – IceArdor Aug 27 '15 at 08:16
  • 1
    If you're going to do your own string processor to solve this problem, don't build up the array list explicitly in memory. Create an iterator that returns one value at a time. Your RAM will thank you. – IceArdor Aug 27 '15 at 08:18

2 Answers2

2

This is not quite a regex-based problem, but from an algorithmic perspective you can do the followings:

  • Count the number of W's in your string.
  • Based on the number of W's, create the product of range(0,9), for example if you have 2 W you need to create the products of two [0...9] lists that will be something like 0,0-0,1-0,2-...-9,9.
  • Loop over the combinations and replace them with a simple string formatting. For instance when you are iterating over a triple combination suppose with 3 variables i,j,k you want want to replace them in a string like 7W0W3W, you can do "7%d0%dW%d"%(i,j,k).

And if you are looking for a general regex to wrap all the cases you can use a regex like (w) (w in a capture group) then you need to first access to position of the match groups and replace them with combination items (i,j,k,..).

Mazdak
  • 100,514
  • 17
  • 155
  • 179
2

It's better to call the input string as "pattern", not "regular expression". Also it's probably better to create a "virtual" list which generates the strings on demand. Here's the sample implementation:

public static List<String> getNumbers(String pattern) {
    final char[] chars = pattern.toCharArray();
    int size = 1;
    for(char ch : chars)
        if(ch == 'W') {
            if(size == 1_000_000_000)
                throw new IllegalArgumentException("Too many 'W' to fit the list");
            size*=10;
        }
    final int finalSize = size;
    return new AbstractList<String>() {

        @Override
        public String get(int index) {
            char[] res = chars.clone();
            for(int i=res.length-1; i>=0; i--) {
                if(res[i] == 'W') {
                    res[i] = (char) ('0'+(index % 10));
                    index/=10;
                }
            }
            return new String(res);
        }

        @Override
        public int size() {
            return finalSize;
        }
    };
}

First we count the number of 'W' chars and calculate the target list size accordingly. Then we return an implementation of AbstractList which for given list index replaces the 'W' symbols with the remainders of index division by 10. This list does not take the memory, it generates the String only when you request it. If you want to get the hard-copy of such list, you can use new ArrayList<>(getNumbers(pattern)).

Tagir Valeev
  • 92,683
  • 18
  • 210
  • 320