
This is what I want to do:

I have a large Anki collection (over 20,000 notes of Japanese vocabulary). I want to add Japanese pitch/accent information to each note.

Possible solutions? I don't know how to do this. I know of an Anki add-on (here) that partially does it, but it is buggy and its source data is far from complete. I also know of online resources such as Weblio that display a number corresponding to each word's pitch/accent pattern. What I don't know is how to combine these resources to get that data into my collection.

In theory, what would I need in order to bulk-add pronunciation metadata to a large list of notes?

Update

I have since discovered that the GitHub repository does in fact contain a file with pitch/accent data for a very large number of words. The problem is that the add-on only appends that data when the term in the note is identical to an entry in the source data. For example, if 監督 is in the source data, it will not write the pitch/accent to an Anki note containing 監督する, because the terms are not identical.

So the pitch/accent data is available. But I still have no idea how to manipulate that data to enable a bulk-add to an Anki collection, while avoiding the problem described above.
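To make the matching problem concrete, here is a minimal Python sketch of a lookup that tries an exact match first and then falls back to stripping a few common suffixes. The dictionary contents, the suffix list, and the function name `lookup_accent` are all assumptions for illustration; the format of the real data file in the repository is not shown here.

```python
from typing import Mapping, Optional

# Hypothetical sample of the accent data: term -> accent pattern number.
# These entries are placeholders; the real file would be loaded instead.
ACCENTS = {"監督": "0", "勉強": "0", "食べる": "2"}

# Illustrative list of suffixes to strip when the exact term is missing.
SUFFIXES = ("する", "な", "に", "の")


def lookup_accent(term: str, accents: Mapping[str, str]) -> Optional[str]:
    """Return the accent for `term`, or None if no heuristic matches."""
    term = term.strip()
    # Heuristic 1: exact match.
    if term in accents:
        return accents[term]
    # Heuristic 2: strip a known suffix and retry with the stem.
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) > len(suffix):
            stem = term[: -len(suffix)]
            if stem in accents:
                return accents[stem]
    # Nothing matched: leave this term for manual review.
    return None


print(lookup_accent("監督する", ACCENTS))
```

With this sketch, 監督する falls back to the entry for 監督. Whether the accent of the bare stem is the right value for the longer form is a separate question that the code does not answer.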

Tommi
kandyman
  • It would require some clever programming, but I can see how I would write a program that iterates over the input file and uses different heuristics to match search terms. The last 10% might be faster to do by hand. – Peter M. - stands for Monica Mar 28 '18 at 20:45
  • @PeterMasiar Sounds interesting. What do you mean by the last 10%? – kandyman Mar 28 '18 at 21:38
  • The last 10% of the words, the ones not matched by the previous heuristics. I would probably write a script that reads an Excel spreadsheet (or CSV, comma-separated values), uses different heuristics to match words/sounds, and then edit the last 10% by hand. Then generate the Anki deck. If you don't know much programming, Python is a perfect fit: Python 3.x understands Unicode natively, is great for parsing text files in flexible ways, and is recommended as a first language for beginners. – Peter M. - stands for Monica Mar 28 '18 at 21:43
  • @PeterMasiar Do you think it would be possible to keep the learning history of the items in the deck, so that the pitch/accent data is added to the notes without affecting anything else about them? Are you interested in taking on a project like this? – kandyman Mar 28 '18 at 22:27
  • I have no idea whether it is possible to export the learning history, add some properties, and rebuild the deck. It could be; the Anki developers would know. It is certainly an interesting project and I wish I had time to pursue it, but alas, I will not have time for the next few years. Contact the Anki developers; with some luck, someone might do it, and cheaply. But adding new data and starting the deck from scratch, with no learning history, is not that big a deal. Just honestly answer "easy", and after a few iterations the entries are widely spaced for repetition again. – Peter M. - stands for Monica Mar 29 '18 at 03:17
  • But for a deck of 20K entries, starting them again would be a real pain, I agree. :-/ – Peter M. - stands for Monica Mar 29 '18 at 03:35
  • Sorry to hear you can't do it. As a matter of interest, how much do you think someone would charge for a project like that? Do you have a rough estimate? – kandyman Mar 29 '18 at 21:17
  • It depends on many variables. I assume an hourly rate, so finding someone who knows the Anki core (and does not need to charge you for learning it) would make the project cheaper. Also, because Anki is a "free-libre" project, developers might do it very cheaply if you agree to share the developed code back with the community and it is in line with the development they want to do anyway. – Peter M. - stands for Monica Mar 30 '18 at 03:09
  • It never hurts to ask on the development forum. A few times when I requested a feature from an open-source project (not Anki), the developer not only implemented it for free but also thanked me for the suggestion. Other times, a developer agreed to work for half the usual rate ($50 instead of $100 per hour) if I agreed to share the code with the community (it was for a commercial extension for a commercial website, and was NOT implemented). – Peter M. - stands for Monica Mar 30 '18 at 03:20
  • The forum for support and for contacting the developers seems to be https://anki.tenderapp.com/discussions. AnkiDroid is open source, but some other parts are commercial, so you need to do more research. Sorry I cannot help more. Also consider that when you ask volunteers to add a feature to free software (for free), you are asking those volunteers to work for free. Volunteers do only what THEY want. :-) – Peter M. - stands for Monica Mar 30 '18 at 03:26
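
The script-plus-manual-cleanup workflow discussed in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: the notes are assumed to be exported as tab-separated text with the term in the first column, and the function names `build_import_rows` and `to_tsv` are made up for this example. As far as I know, Anki's text import updates existing notes whose first field matches instead of creating new ones, which should leave the review history alone, but that should be verified on a backup copy of the collection first.

```python
import csv
import io


def build_import_rows(notes_tsv, accents):
    """Split exported notes into matched rows and unmatched terms.

    notes_tsv: tab-separated text, term assumed to be in the first column.
    accents:   dict mapping term -> accent pattern (assumed format).
    """
    matched, unmatched = [], []
    for row in csv.reader(io.StringIO(notes_tsv), delimiter="\t"):
        if not row:
            continue
        term = row[0].strip()
        accent = accents.get(term)
        # Fallback heuristic: retry without a trailing する.
        if accent is None and term.endswith("する"):
            accent = accents.get(term[:-2])
        if accent is None:
            unmatched.append(term)  # the part to finish by hand
        else:
            # Keep the ORIGINAL term so it still matches the note's first field.
            matched.append([term, accent])
    return matched, unmatched


def to_tsv(rows):
    """Serialize rows as tab-separated text suitable for a text import."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


notes = "監督する\tto direct\n勉強\tstudy\n謎\triddle\n"
matched, unmatched = build_import_rows(notes, {"監督": "0", "勉強": "0"})
print(to_tsv(matched), end="")
print("left for manual review:", unmatched)
```

The matched rows keep the note's own first field, so the import can find the existing note, while the unmatched list is the remainder that would be edited by hand.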
