Machine Translation Opening Japan to the World

Hitoshi Isahara

Machine translation technology is making progress every year, and against the backdrop of big data technology development, hopes are high for achieving ever greater accuracy. However, while machine translation among Western languages is approaching the level of practical use, there are still many hurdles which hamper machine translation from Japanese to other languages. We talked about this topic with Professor Hitoshi Isahara, who has been tackling these issues in his work at the cutting edge of machine translation research since the 1980s.

Interview and report by Madoka Tainaka

Obstacles to Effective Japanese Machine Translation

Professor Hitoshi Isahara came to Toyohashi University of Technology in 2010 to further his pursuit of machine translation research. This research was a continuation of his work with the Ministry of International Trade and Industry’s (MITI) Electrotechnical Laboratory, and the Ministry of Posts and Telecommunications’ Communications Research Laboratory (currently the National Institute of Information and Communications Technology, NICT).

“I have worked on a variety of research projects, such as machine translation of Japanese into Chinese, Thai, Malay, and Indonesian, the creation of a database of spoken Japanese, and the development of Japanese-Chinese and Chinese-Japanese language processing technology. Unfortunately, while the accuracy of machine translation from Japanese into those other languages is improving, it is still inadequate. When translating from Japanese to English, for instance, in addition to the differences in word order, Japanese has the peculiarity of context dependence: it often omits subjects, and this becomes an obstacle for translation. In order to use machine translation for outbound information transmission, the quality needs to be fully guaranteed, but in reality the output of machine translation systems cannot yet be relied upon. In order to make the output useful, we need to employ all kinds of ingenuity,” says Professor Isahara.

In fact, in the 1980s, Japan was the world leader in machine translation research. In addition to academic research by universities, many major electronics companies also invested in machine translation system development. Before long, however, business interest in the field petered out. Professor Isahara says that the reason behind this is that they lacked the concept of providing a “service” to users. Companies simply applied the existing business model of creating, packaging and selling a system to the commercialization of machine translation technology, but this model produced unsatisfactory results for users.

“In addition, one must consider the fact that in Japan there was little concept of systematically documenting and recycling information, so there was no structure in place for incremental system development and the provision of continuous service. Even for professional translators, knowledge of the domain and prior detailed information are essential. Likewise, for machine translation, the construction of a frequently updated, user-friendly database as well as the development of operating techniques and mechanisms are just as indispensable as improvements in the accuracy of the translation engine.”

The three essential steps for machine translation

In this context, Professor Isahara cites the following three elements as essential to the machine translation process: (1) create Japanese text that is easy to translate by machine, (2) extract salient terms that fit with the field, and (3) build a post-editing environment.

“First of all, just following Step 1, making the Japanese input easy to translate, is quite effective. I then researched what kind of input sentences would make the machine translation go smoothly, without lowering the quality of the Japanese.”

To test this, he asked for cooperation from a local business, getting them to rewrite their company manual based on his rules for easy-to-translate Japanese, so-called “controlled language,” and conducted an experiment to measure the accuracy of the translation. The rules were simple: to include subjects and objects wherever possible, to avoid long sentences, and to avoid complex expressions. In the context of such research developments, and given the essential role of controlled language, the International Organization for Standardization is currently promoting international standardization in this field.
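
As a rough illustration of how rules of this kind could be checked automatically, the following is a minimal sketch and not Professor Isahara's actual rule set or tooling. The length threshold `MAX_CHARS` is a hypothetical value, and the test for an explicit subject (looking for the particles は or が) is a deliberately crude stand-in for real linguistic analysis. The rule about complex expressions is not covered.

```python
import re

# Hypothetical threshold; the article does not give a concrete limit.
MAX_CHARS = 50
# Crude proxy for "the sentence states its subject": presence of a
# topic or subject particle. A real checker would need a parser.
SUBJECT_PARTICLES = ("は", "が")


def split_sentences(text):
    """Split Japanese text on the full stop '。', dropping empty pieces."""
    return [s.strip() for s in re.split(r"。", text) if s.strip()]


def check_sentence(sentence):
    """Return a list of warnings for a single sentence."""
    warnings = []
    if len(sentence) > MAX_CHARS:
        warnings.append("too long")
    if not any(p in sentence for p in SUBJECT_PARTICLES):
        warnings.append("no explicit subject marker")
    return warnings


def check_text(text):
    """Return (sentence, warnings) pairs for every sentence in the text."""
    return [(s, check_sentence(s)) for s in split_sentences(text)]
```

Calling `check_text` on a draft manual would flag sentences for a human writer to reconsider; it does not rewrite anything itself.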

Step 2, the extraction of terms, means accumulating a large volume of documents related to a particular field, and from that collection, semi-automatically extracting often-used “moderately long phrases”. “For example, these are phrases such as ‘the effect of carbon dioxide on global warming’ or ‘gas decompression characteristics when fissures occur in the pipeline’. These are automatically extracted and carefully examined by those well versed in the field. Appropriate parallel translation glossaries are prepared in advance.”
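
The automatic part of such an extraction can be pictured as counting word sequences that recur across a document collection. The sketch below is only an illustration under stated assumptions, not the method used in the research: the function name `extract_candidate_phrases` and its parameters are hypothetical, and it splits on whitespace, so Japanese text would first need a word segmenter.

```python
from collections import Counter


def extract_candidate_phrases(documents, min_len=3, max_len=6, min_count=2):
    """Count word n-grams of moderate length and keep the recurring ones.

    documents: list of strings, assumed to be whitespace-tokenizable.
    min_len, max_len: phrase length bounds in words (hypothetical defaults).
    min_count: how often a phrase must occur to be kept.
    """
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for n in range(min_len, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    # Most frequent first; the list is meant for review by a domain expert.
    return [(p, c) for p, c in counts.most_common() if c >= min_count]
```

The output is only a candidate list. As the article notes, people well versed in the field would still examine it before any phrase enters a parallel translation glossary.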

The final step is Step 3, post-editing. For this, Professor Isahara has incorporated the use of social crowdsourcing. Currently the task of post-editing cannot be omitted, since the quality of the document will suffer without adjusting the translated text. However, relying on professional translators is very costly. Therefore, as a cost-effective solution, he recruits volunteers with suitable knowledge in the field to assist with the work. The Toyohashi University of Technology website (English version) has been equipped with a machine translation engine and editing tools, and he found that with the help of foreign students in post-editing, it is possible to get an accurate translation.

“Our foreign students know a lot about the university, and so are well able to manage quality control. For example, several students collaborated to correct a text translated from English into Indonesian, with the result that they were able to achieve a level of accuracy close to that of a professional translator.”

Involving social communities of various fields

Presently, in the business world, translation is generally left up to professional translators from the get-go, but the merit of bringing in machine translation is that even if the accuracy is modest, you can speed up the process without incurring any costs. In the current context of trends such as globalization and an influx of foreigners to Japan, as well as a revitalized inbound market, the demand for outbound information through machine translation will surely continue to rise.

“For example, Japan is getting ready to host the Rugby World Cup in 2019, and from now on, there will certainly be more and more articles posted in Japanese. When that happens, if we can get help with post-editing from rugby fans, we anticipate that we will be able to transmit information fairly well in English. By gathering and studying the results of those translations, we will also be able to further improve translation accuracy. In particular, I would be so happy if senior citizens who have a lot of knowledge and ability and want to contribute to society would participate in social crowdsourcing. Although it is basically volunteer work, we might need to prepare some incentives, such as giving autographs of famous players for each contribution,” says Professor Isahara. In fact, a joint research project on translating rugby articles has already been launched by Toyohashi University of Technology in collaboration with Rikkyo University and NICT.

In the future, Professor Isahara says that he would like to facilitate better public relations in English between Japan and the world through volunteer networks in various fields of interest to foreigners, such as railways, cameras, and ramen. Furthermore, he is concentrating on developing links with IT companies to create shared databases of manuals for business use and other purposes. He has already launched such a joint project with Microsoft Japan and BroadBand Tower, Inc. Regarding future prospects, Professor Isahara says that the translation of various languages will expand from the base of Japanese-English translation, which may eventually result in the creation of a new international community. Although the issue of quality assurance remains a challenge, Professor Isahara will continue to strive to make machine translation more useful to society.

Collaborative Research Project on Machine Translation

Reporter's Note

Professor Isahara, having initially worked on natural language processing, eventually switched to translation engine research and has been devoting his energies to this field ever since. He is currently shifting the basis of his research to more directly practical applications, through the extraction of terms and the creation of a framework that aims for practical use.

“I have been engaged in Japanese machine translation since the early days, and I am still continually searching for ways to make it more usable. As I have gotten older, I think that my inclination to be useful to society has gotten stronger. If we do nothing, Japan’s outbound information transmission is going to fall increasingly behind. Even if the accuracy of current machine translation is insufficient, it is far better than not having it at all. Therefore, I am always striving to improve this,” says Professor Isahara.

I want to be optimistic about how much we can contribute to machine translation technology innovation, through the framework that Professor Isahara presents and through crowdsourcing, with the power of social communities brought to life by science.

Researcher Profile

Dr. Hitoshi Isahara

Dr. Hitoshi Isahara studied natural language processing through the master's level at Kyoto University. After graduation, he was engaged in research on machine translation and natural language processing, and received his PhD (Engineering) in 1995. He has held various important posts, including President of the Asia-Pacific Association for Machine Translation and President of the International Association for Machine Translation. In recognition of these achievements, he was appointed a conference ambassador by the Japan National Tourism Organization in 2015. Currently, Dr. Isahara is a Director of the Information and Media Center at Toyohashi University of Technology. His research interests are machine translation, lexical semantics, and association by human.

Reporter Profile

Madoka Tainaka

Madoka Tainaka is a freelance editor, writer and interpreter. She graduated in Law from Chuo University, Japan. She has served as a chief editor of “Nature Interface” magazine and as a member of a committee for the promotion of Information and Science Technology at MEXT (Ministry of Education, Culture, Sports, Science and Technology).
