I'm currently working on a dictionary application, and while reformatting my data for easier reading and writing with regex, I ran into a StackOverflowError. My Word class consists of two attributes: word_target and word_explain.
I tried to format the data as follows:
WORD_TARGET
<word_target>
WORD_EXPLAIN
<word_explain>
END_OF_WORD
Writing method:
public static void exportDictionary() {
    // try-with-resources flushes and closes the writer automatically
    try (PrintWriter writer = new PrintWriter(EXPORTED_DICTIONARY_DATABASE_PATH)) {
        System.out.println(words.size());
        for (Word word : words) {
            writer.printf("WORD_TARGET\n%s\nWORD_EXPLAIN\n%s\nEND_OF_WORD\n", word.getWord_target(), word.getWord_explain());
        }
    } catch (IOException e) {
        System.out.println(e.getMessage());
    }
}
Reading method:
public static void loadOriginalDictionaryData() throws IOException {
    // This is the pattern that overflows the stack on long entries
    Pattern regex = Pattern.compile("WORD_TARGET\n((.|\n)*)\nWORD_EXPLAIN\n((.|\n)*)\nEND_OF_WORD\n");
    // try-with-resources closes the reader even if an exception is thrown
    try (BufferedReader reader = new BufferedReader(new FileReader(new File("src\\main\\resources\\test.txt")))) {
        String line;
        StringBuilder temp = new StringBuilder();
        while ((line = reader.readLine()) != null) {
            temp.append(line).append("\n");
            if (line.equals("END_OF_WORD")) {
                Matcher matcher = regex.matcher(temp);
                if (matcher.find()) {
                    // groups 1 and 3 hold word_target and word_explain
                    insertWord(matcher.group(1), matcher.group(3));
                    temp.setLength(0);
                }
            }
        }
    }
}
Everything worked fine for medium-sized strings, but when I tried to match strings of 100+ lines I got a StackOverflowError because, as I understand it, the matcher uses recursion. I am aware that this problem can be solved by replacing the (.|\n)* group with a character class that lists every possible character. However, my database is a bit messy, so I would be forced to add everything on the keyboard, including special characters such as \t and \n. I'm new to both Java and regex, so this is a bit overwhelming for me.
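For comparison, here is a minimal sketch of a pattern that avoids the (.|\n)* alternation without listing characters by hand. It assumes the marker lines never occur inside the word or explanation text. With the Pattern.DOTALL flag, `.` also matches line terminators, and as far as I understand a plain `.*?` is matched with a loop rather than one recursive call per character. The class name `DotallDictionaryRegex` and the `parse` helper are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallDictionaryRegex {
    // DOTALL lets '.' match '\n' too, so no (.|\n) alternation is needed.
    // The reluctant .*? stops at the first marker that follows it.
    static final Pattern ENTRY = Pattern.compile(
            "WORD_TARGET\n(.*?)\nWORD_EXPLAIN\n(.*?)\nEND_OF_WORD\n", Pattern.DOTALL);

    // Hypothetical helper: returns {word_target, word_explain} pairs found in text.
    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        Matcher matcher = ENTRY.matcher(text);
        while (matcher.find()) {
            entries.add(new String[] { matcher.group(1), matcher.group(2) });
        }
        return entries;
    }

    public static void main(String[] args) {
        // Build an explanation with many lines, including tab characters.
        StringBuilder explain = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            if (i > 0) {
                explain.append('\n');
            }
            explain.append("line\t").append(i);
        }
        String text = "WORD_TARGET\napple\nWORD_EXPLAIN\n" + explain + "\nEND_OF_WORD\n";

        List<String[]> entries = parse(text);
        System.out.println("entries: " + entries.size());
        System.out.println("explanation length: " + entries.get(0)[1].length());
    }
}
```

Note that the groups are now numbered 1 and 2 instead of 1 and 3, because the inner (.|\n) groups are gone.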
So please help me build a proper pattern, or suggest a better way to read the data.
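Since the file is already read line by line, one possible "better way" is to skip regex entirely and track which section the current line belongs to. The following is only a sketch under the same assumption that the marker lines never appear inside the data. `LineBasedDictionaryReader`, `read` and `parse` are hypothetical names; the real code would call insertWord(target, explain) where this sketch adds to a list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineBasedDictionaryReader {
    // Reads entries by switching sections whenever a marker line is seen.
    static List<String[]> read(BufferedReader reader) throws IOException {
        List<String[]> entries = new ArrayList<>();
        List<String> target = new ArrayList<>();
        List<String> explain = new ArrayList<>();
        List<String> current = null; // section being filled, null outside an entry
        String line;
        while ((line = reader.readLine()) != null) {
            switch (line) {
                case "WORD_TARGET":
                    target.clear();
                    explain.clear();
                    current = target;
                    break;
                case "WORD_EXPLAIN":
                    current = explain;
                    break;
                case "END_OF_WORD":
                    if (current != null) {
                        // The original code would call insertWord(...) here.
                        entries.add(new String[] {
                                String.join("\n", target), String.join("\n", explain) });
                    }
                    current = null;
                    break;
                default:
                    if (current != null) {
                        current.add(line); // ordinary data line
                    }
            }
        }
        return entries;
    }

    // Convenience wrapper for in-memory text, mainly for trying the reader out.
    static List<String[]> parse(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return read(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because nothing is matched recursively, the length of an explanation should only affect memory use, not stack depth.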
Thank you for all the support <3