I’ve been learning Japanese since 1998 and while I’m pretty good for a gaijin living away from Japan, progress has been slow and sporadic. I feel that part of the problem is that Japanese learning apps and websites aren’t interactive and smart enough. I’ve tried dozens, some of them for months, and they each have many severe usability issues or glaring mistakes that confuse and discourage learners. For instance, here are two examples of mispronunciations from the iOS app Nihongo, a modern dictionary:
The first example (1) should be pronounced いれて (irete) in this context. That kanji is shared by multiple verbs, so it’s sometimes pronounced はいる (hairu) depending on surrounding words and context. No learning app performs this analysis right now. It’s even more complicated with proper nouns (example 3). These often have dozens of exceptional pronunciations, and even Japanese adults find it challenging to pronounce the names of small cities or unusual last names. With deep learning, this should be solvable given enough examples, e.g. using an RNN such as an LSTM.
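To make the ambiguity concrete, here is a toy, rule-based sketch. It is not the deep learning approach suggested above. It assumes the kanji in question is 入 (the post does not name it), and `readingByNextKana` and `readVerb` are hypothetical names. The idea: the kana that follows the kanji decides how the kanji itself is read.

```typescript
// Toy rule table for the kanji 入 (an assumption: the post does not name the kanji).
// Key: the kana right after the kanji. Value: the reading of the kanji itself.
const readingByNextKana: { [nextKana: string]: string } = {
  "れ": "い",   // 入れる / 入れて are read いれる / いれて
  "る": "はい", // 入る is read はいる
  "っ": "はい", // 入って is read はいって
};

// Returns the kana reading of a verb form starting with 入,
// or null when the toy rules do not cover the word.
function readVerb(word: string): string | null {
  if (word.charAt(0) !== "入" || word.length < 2) {
    return null;
  }
  const kanjiReading = readingByNextKana[word.charAt(1)];
  return kanjiReading === undefined ? null : kanjiReading + word.slice(1);
}
```

Hand-written rules like these break quickly: in the expression 気に入る, for example, 入る is read いる rather than はいる. That is the kind of wider context a learned sequence model would need to pick up from examples.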
In my spare time, from Q4-2017 to Q1-2018, I decided to explore the feasibility of single-handedly creating a cross-platform app that would help me and my kids practice Japanese and learn more efficiently.
I was also hoping to:
Eventually, integrate deep learning, e.g. recognizing handwritten kanji, automatically inferring the pronunciation of ambiguous words, etc.
Assess if sustainable revenue (e.g. $50K) could be achieved within a reasonable timeframe (e.g. 1 year)
I had zero interest in learning both Android and iOS native app development, e.g. using Java/Kotlin and Swift/Objective C. That’s too many programming stacks for my taste.
I didn’t want to have to buy a Mac - I’m happy with Windows and Linux, I don’t have the patience to master a 3rd OS.
Main technical concerns
Performance - e.g. how responsive is the app?
Ease of debugging, automated testing, etc. I wanted this to be a fun learning experience using modern tools, e.g. avoid hair-pulling debugging sessions using printfs and long test cycles.
Robustness - even if only 5% of users experienced crashes, there was no point in attempting commercialization.
Two major options
I explored popular frameworks to develop portable mobile apps. I quickly eliminated:
Unity. I already knew Unity, but I felt it was overkill for a “small” app. The startup time was too high (more than 1 second), and I was also concerned that integrating hardware-accelerated deep learning would be tricky down the line.
Xamarin. I tried a few apps and was unimpressed with the runtime performance. As with Unity, C# could have made integrating deep learning difficult down the line.
I first tried developing the app using Qt, QML and C++, since I had recent experience with this development environment, albeit on Windows Desktop. A few lessons I learnt while developing small proof-of-concepts:
Getting started with Qt + Android was relatively easy. There are good samples and very little C++ is required.
A Mac development environment is required for iOS. I was able to install a Hackintosh VM over Linux. It was a bit slow but it worked. There was however the concern that future Apple updates would break my development environment. So I was mentally prepared to buy a real Mac if I decided to go ahead with commercialization.
I needed to bundle a real Japanese TrueType font to avoid the [Han unification nightmare](https://en.wikipedia.org/wiki/Han_unification). That took several MBs, slowing down incremental compilation and deployment. The solution was to bundle the font in a resource file separate from the QML files. That way frequent changes to the (relatively small) QML files didn’t require re-uploading everything.
After optimization, turn-around time to recompile and redeploy on Android was about 15 seconds (acceptable), and about 1 minute on iOS (painful).
Qt offers many options for displaying UI. I used the relatively new Qt Quick 2 and was pleased with it. The runtime performance was excellent even during stress tests - a consistent 60fps when displaying a few screenfuls of sentences and <100 ms when loading new screens.
I was originally developing on Windows 10 and often ran into the 260-character path limitation in some of the Android build tools. This was very annoying - the error messages were often cryptic. I tried Windows junctions (equivalent of symbolic links) and while that solved some problems, it wasn’t a simple long-term solution with git. I ended up simply keeping my paths really short, e.g. renaming “claforte/learning/tutorials/001_some_kind_of_tutorial” into “cl/lrn/tuts/001_smof”. Felt like MS-DOS.
While Qt comes with useful small components, the ecosystem is limited. Building an eye-pleasing interactive design would likely have required me to spend weeks doing graphic design work, for which I have neither talent nor interest. That’s the #1 reason why I decided to try…
2nd attempt: using React Native
Dealing with native packages was a lot of hassle and, again, required both the Apple and Android toolchains.
Still ran into 260-character path problems on Windows. Same work-around - keep the paths you control really short.
3rd attempt: using Expo
I then upgraded to Expo and boy, am I happy I did:
Expo curates and integrates the most common, high-quality React Native packages. The native packages are precompiled into standard apps that streamline development. This eliminates most incompatibility problems I ran into previously.
No more need to compile and deploy on Android or Mac. I never had to deal with adb again. All you do is install the standard Expo app on your phone, scan your QR code, and you’re done.
Developing with React Native and Expo was a load of fun:
Hot-reloading works great all the time. It often took less than 5 seconds for my code changes to be reloaded.
Runtime performance isn’t as great as Qt but still very good. I got 30fps without difficulty. I was however careful not to load too much stuff in each screen, e.g. loading only the first 10 matches in the dictionary at first.
The ecosystem is exciting - lots of useful packages, and most of them don’t need native compilation. With Expo’s mindshare increasing, these components are generally compatible right away. Some of these components are beautiful and helped me achieve a polished look easily.
Automated testing with Jest was really fun and fast. The only issue I ran into here was debugging the client-server communication. I know I should have written more mocks, but for my purposes, it felt simpler to have the client tests cover some of the integration tests as well.
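The point above about loading only the first 10 dictionary matches can be sketched as a small paging helper. This is a minimal illustration, not the app’s actual code: `SearchResult` and `searchPage` are hypothetical names, and it assumes the search is a simple prefix match over an in-memory list.

```typescript
// Hypothetical result shape, for illustration only.
interface SearchResult {
  word: string;    // written form, e.g. kanji
  reading: string; // kana reading
}

// Returns one page of prefix matches. Scanning stops as soon as the page
// is full, so the first screen never pays for the whole result set.
function searchPage(
  entries: SearchResult[],
  query: string,
  offset = 0,
  limit = 10
): SearchResult[] {
  const page: SearchResult[] = [];
  let skipped = 0;
  for (const entry of entries) {
    if (page.length >= limit) {
      break;
    }
    const isMatch =
      entry.word.indexOf(query) === 0 || entry.reading.indexOf(query) === 0;
    if (!isMatch) {
      continue;
    }
    if (skipped < offset) {
      skipped++; // this match belongs to an earlier page
      continue;
    }
    page.push(entry);
  }
  return page;
}
```

A screen would call `searchPage(entries, query)` for the initial render, then `searchPage(entries, query, 10)` and so on as the user scrolls.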
(BTW, my daughter Emily stars in the videos, and my wife Misako recorded the audio sentences.)
Other technical issues
Asynchronous code is both a blessing and a curse in Node. IMHO there are too many ways to express asynchronous concepts and most tools (e.g. debuggers) can’t handle it - especially when Babel transpiling makes things even more complicated. Tracing a small asynchronous function was ridiculously complicated. At least, with the momentum behind React Native and Expo, I’m hopeful these problems will get resolved in the next year or so.
Server and NoSQL database
I used feathersjs and the default NeDB NoSQL database to build a quick-and-dirty server including a real-time SocketDB API. The Feathers CLI generators are very helpful. The documentation was so-so. (It’s since improved a bit.)
One task that took much longer than I anticipated was importing and processing the Japanese dictionary. It was provided as a 90MB XML file, covering ~170K entries and definitions. I first tried to load the whole thing at once, and ran out of memory (12GB). I had to implement a SAX parser. It took me a while to get the abstraction at the right level so that the code wouldn’t be tag-specific all over the place. Another issue was that the loading code could run really fast (thousands of entries per second) but uploading the entries to the server to be saved was 20 times slower. The SAX parsing code was based on callbacks, while the Feathers client dealt with asynchronous functions I had to await. My first attempts at plugging these two together always ended in promise timeouts or connection timeouts. It took me a day to step back and realize the simplest way was to implement a two-pass process instead.
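A minimal sketch of what such a two-pass process could look like. The post does not say how the two passes were split, so this is an assumption: pass 1 runs the callback-based parser and only buffers entries, and pass 2 uploads them in sequential batches. All names here (`DictEntry`, `parseEntries`, `collectEntries`, `uploadInBatches`) and the `<word>`/`<gloss>` tags are hypothetical. `parseEntries` is a regex-based toy standing in for a real streaming SAX parser, and the `upload` parameter stands in for whatever client call saves a batch.

```typescript
// Hypothetical entry shape; a real dictionary schema is richer.
interface DictEntry {
  word: string;
  gloss: string;
}

// Toy stand-in for a callback-based parser: calls onEntry once per <entry>.
// A real SAX parser would stream the file from disk instead of taking a string.
function parseEntries(xml: string, onEntry: (entry: DictEntry) => void): void {
  const entryPattern =
    /<entry><word>(.*?)<\/word><gloss>(.*?)<\/gloss><\/entry>/g;
  let match: RegExpExecArray | null;
  while ((match = entryPattern.exec(xml)) !== null) {
    onEntry({ word: match[1], gloss: match[2] });
  }
}

// Pass 1: purely synchronous. The callback only buffers; no network calls,
// so the fast parser never has to wait on the slow server.
function collectEntries(xml: string): DictEntry[] {
  const entries: DictEntry[] = [];
  parseEntries(xml, (entry) => {
    entries.push(entry);
  });
  return entries;
}

// Pass 2: upload in batches, awaiting each batch before sending the next,
// so pending requests never pile up. Returns the number of entries uploaded.
async function uploadInBatches(
  entries: DictEntry[],
  batchSize: number,
  upload: (batch: DictEntry[]) => Promise<void>
): Promise<number> {
  let uploaded = 0;
  for (let start = 0; start < entries.length; start += batchSize) {
    const batch = entries.slice(start, start + batchSize);
    await upload(batch);
    uploaded += batch.length;
  }
  return uploaded;
}
```

Keeping the passes separate means callbacks and `await` never have to be interleaved: the parser runs to completion first, and only then does any asynchronous code start. If the parsed entries were too large to hold in memory, pass 1 could write them to an intermediate file instead of an array.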
Overall, this learning experience has been positive. I did however put this specific Japanese app project on the back burner. Mainly, I don’t think I can turn this into a profitable product - the market for Japanese learning software is just too small and the effort to turn this prototype into a fleshed-out product wouldn’t be worth it.
For the time being I’m focused on other exciting projects that involve more Deep Learning and 3D graphics.