
I'm currently working on a dictionary, and while trying to reformat my data for easier reading and writing using regex, I ran into a StackOverflowError. My Word class consists of two attributes: word_target and word_explain.

I tried to format the data as follows:

WORD_TARGET
<word_target>
WORD_EXPLAIN
<word_explain>
END_OF_WORD

Writing method:

public static void exportDictionary() {
    try {
        PrintWriter writer = new PrintWriter(EXPORTED_DICTIONARY_DATABASE_PATH);
        System.out.println(words.size());
        for (Word word : words) {
            writer.printf("WORD_TARGET\n%s\nWORD_EXPLAIN\n%s\nEND_OF_WORD\n", word.getWord_target(), word.getWord_explain());
        }
        writer.flush();
        writer.close();
    } catch (IOException e) {
        System.out.println(e.getMessage());
    }
}

Reading method:

public static void loadOriginalDictionaryData() throws IOException {
    BufferedReader reader = new BufferedReader(new FileReader(new File("src\\main\\resources\\test.txt")));
    Pattern regex = Pattern.compile("WORD_TARGET\n((.|\n)*)\nWORD_EXPLAIN\n((.|\n)*)\nEND_OF_WORD\n");
    String line;
    String temp = "";
    while ((line = reader.readLine()) != null) {
        temp += line + "\n";
        if (line.equals("END_OF_WORD")) {
            Matcher matcher = regex.matcher(temp);
            if (matcher.find()) {
                // parameters stand for word_target and word_explain
                insertWord(matcher.group(1), matcher.group(3));
                temp = "";
            }
        }
    }
    reader.close();
}

Everything worked fine for medium-sized strings, but when I tried to match strings consisting of 100+ lines I got a StackOverflowError because, as I understand it, the matcher uses recursion. I am aware that this problem can be solved by replacing the (.|\n)* group with a character class containing every possible character. However, the database that I have is a bit wacky, so I would be forced to add everything on the keyboard, including special characters such as \t and \n. I'm new to both Java and regex, so this is a bit overwhelming for me.

So please help me build a proper pattern, or suggest a better way to read the data.

Thank you for all the support <3

  • Never use `(.|\n)*?`. See [this YT video of mine](https://www.youtube.com/watch?v=SEobSs-ZCSE) to see why. There are always better solutions. In Java, you can compile the pattern with `Pattern.DOTALL`, or replace `.` with `(?s:.)`. – Wiktor Stribiżew Apr 15 '22 at 19:10
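
The suggestion in the comment above could be sketched roughly as follows. This is only an illustration, not the commenter's own code: the class name `DotallEntryParser` and the `parse` helper are made up for the example, and it assumes the marker lines never occur inside a word or its explanation. With `Pattern.DOTALL`, a plain `.` also matches line terminators, so the `(.|\n)` alternation (the part that makes the matcher recurse once per character) is no longer needed. The quantifiers are reluctant (`.*?`) so that each group stops at the first following marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class DotallEntryParser {
    // DOTALL makes '.' match '\n' as well, so no (.|\n) alternation is required.
    // Group 1 is the word, group 2 is the explanation.
    private static final Pattern ENTRY = Pattern.compile(
            "WORD_TARGET\n(.*?)\nWORD_EXPLAIN\n(.*?)\nEND_OF_WORD\n",
            Pattern.DOTALL);

    /** Returns one {word_target, word_explain} pair per entry found in the text. */
    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        Matcher matcher = ENTRY.matcher(text);
        while (matcher.find()) {
            entries.add(new String[] {matcher.group(1), matcher.group(2)});
        }
        return entries;
    }

    public static void main(String[] args) {
        String text = "WORD_TARGET\nhello\nWORD_EXPLAIN\na greeting\nused when meeting\nEND_OF_WORD\n";
        for (String[] entry : parse(text)) {
            System.out.println(entry[0] + " => " + entry[1]);
        }
    }
}
```

In the reading method from the question, the same pattern could replace the original one, with `matcher.group(2)` taking the place of `matcher.group(3)` because the inner helper groups are gone.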

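
As for "a better way to read the data": since every marker sits on its own line, one option is to skip regex entirely and track which section the reader is currently in. The following is a minimal sketch under that assumption. `LineEntryParser`, `Section`, and `parse` are hypothetical names, and the sketch assumes that no word or explanation contains a line consisting solely of one of the three markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, named only for this example.
final class LineEntryParser {
    private enum Section { NONE, TARGET, EXPLAIN }

    /** Returns one {word_target, word_explain} pair per complete entry. */
    static List<String[]> parse(List<String> lines) {
        List<String[]> entries = new ArrayList<>();
        List<String> target = new ArrayList<>();
        List<String> explain = new ArrayList<>();
        Section section = Section.NONE;
        for (String line : lines) {
            switch (line) {
                case "WORD_TARGET":
                    // A new entry starts: discard any unfinished one.
                    target.clear();
                    explain.clear();
                    section = Section.TARGET;
                    break;
                case "WORD_EXPLAIN":
                    section = Section.EXPLAIN;
                    break;
                case "END_OF_WORD":
                    // Only emit entries that went through both sections.
                    if (section == Section.EXPLAIN) {
                        entries.add(new String[] {
                                String.join("\n", target), String.join("\n", explain)});
                    }
                    section = Section.NONE;
                    break;
                default:
                    if (section == Section.TARGET) {
                        target.add(line);
                    } else if (section == Section.EXPLAIN) {
                        explain.add(line);
                    }
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "WORD_TARGET", "hello", "WORD_EXPLAIN", "a greeting", "END_OF_WORD");
        for (String[] entry : parse(lines)) {
            System.out.println(entry[0] + " => " + entry[1]);
        }
    }
}
```

The list of lines could come from `java.nio.file.Files.readAllLines(path)`, and each returned pair could then be passed to `insertWord` as in the question. Because the loop never hands a large string to the regex engine, the length of an explanation does not affect stack depth.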