I'm in the planning stages of programming my own Katakana => Romaji converter. However, I've noticed that every other converter out there just converts literally. I want to try a more "intelligent" approach. (Note that the result will only be a default; the user will be able to tweak the transliteration if they so choose.)
As an example, take the name of a character from one of my stories: ネックスス
Converters will give the result Nekkususu, but nobody unfamiliar with Romaji would read that as Nexus.
So I'm trying to figure out some rules for better transliteration. So far, I have the following:
- The substring `kkus` can be replaced with `x`
- A `u` at the end is most likely silent and can be dropped (these two rules already "fix" Nexus)
- `Tu` is better as `Tsu`, and `Ti` as `Chi`.
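
To make the three rules concrete, here is a rough sketch of how I picture them in code. It is Python, and everything in it is an assumption on my part: the function name `soften_romaji`, the `drop_final_u` switch, the order in which the rules run, and the idea that the input is the literal romaji a conventional converter would already produce.

```python
def soften_romaji(literal: str, drop_final_u: bool = True) -> str:
    """Apply my three tentative rules to literally converted romaji.

    `literal` is assumed to be the output of a plain kana-to-romaji
    converter, e.g. "nekkususu". This is a naive sketch, not a
    complete or linguistically verified rule set.
    """
    s = literal.lower()

    # Rule 1: the substring "kkus" is replaced with "x".
    s = s.replace("kkus", "x")

    # Rule 3: "tu" is written "tsu" and "ti" is written "chi".
    # "tsu" contains neither "tu" nor "ti", so the order of these
    # two replacements does not matter.
    s = s.replace("tu", "tsu")
    s = s.replace("ti", "chi")

    # Rule 2: a single trailing "u" is assumed silent and dropped.
    # This is the rule I trust least, so it can be switched off.
    if drop_final_u and len(s) > 1 and s.endswith("u"):
        s = s[:-1]

    return s.capitalize()


print(soften_romaji("nekkususu"))
```

With the rules applied in this order, `nekkususu` first becomes `nexusu` and then loses its final `u`. The same sketch also shows a weakness: once `tu` has become `tsu`, the trailing-`u` rule will happily strip that one too, which is part of why I am asking about exceptions.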
However, that's where my knowledge ends. What I'd like to know is, for starters, are the above three rules correct? Are there exceptions I should be aware of? Are there other rules that could make my converter more accurate?
As mentioned before, perfection isn't required because the user may adjust the result, but I would feel better about the feature if it at least made an effort.