SCIPRESS FORMA
Forma, Vol. 18 (No. 2), pp. 97-117, 2003
Original Paper

A Comparative Study of Translated Texts through the Analysis of Their Word Spectra: Application to a Text in Botchan

Kazuya Hayata

Department of Socio-Informatics, Sapporo Gakuin University, Ebetsu 069-8555, Japan
E-mail address: hayata@earth.sgu.ac.jp

(Received February 21, 2003; Accepted April 9, 2003)

Keywords: Word Spectrum, Statistical Linguistics, Translated Texts, Hellinger Distance, Spiral Mapping

Abstract. Frequency distributions of word-length data (word spectra) are presented for translations of paragraphs sampled from the novel Botchan by Soseki Natsume. The translations are in English, German, French, Spanish, Russian, Filipino, Malay, and Indonesian. The divergence between different spectra for the same language is measured by calculating the Hellinger distance, $D_H^2$. The results show that, irrespective of language, this distance remains of the same order of magnitude, specifically $D_H^2 \sim 10^{-2}$. In addition, a method for visualizing the evolution of the data is proposed.
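As an illustration of the kind of measurement described in the abstract, the following is a minimal sketch of how a word-length spectrum and a squared Hellinger distance between two spectra might be computed. It is not the paper's procedure. The function names `word_spectrum` and `hellinger_sq`, the tokenization rule (runs of letters, so an apostrophe splits a word), and the normalization with a factor of 1/2 (which bounds the value between 0 and 1) are all assumptions made here; the paper may use different conventions.

```python
import math
import re
from collections import Counter


def word_spectrum(text):
    """Return the relative frequency of each word length found in text.

    Assumption: a "word" is a maximal run of Unicode letters, and its
    length is its number of characters.
    """
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def hellinger_sq(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Assumption: the convention D_H^2 = (1/2) * sum_k (sqrt(p_k) - sqrt(q_k))^2
    is used, so identical spectra give 0 and non-overlapping spectra give 1.
    """
    lengths = set(p) | set(q)
    return 0.5 * sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2
        for k in lengths
    )


if __name__ == "__main__":
    # Two hypothetical renderings of the same sentence, for illustration only.
    spectrum_a = word_spectrum("The teacher walked slowly to the old school")
    spectrum_b = word_spectrum("The master went to the old schoolhouse at a slow pace")
    print(hellinger_sq(spectrum_a, spectrum_b))
```

Under this convention, a value on the order of $10^{-2}$ would indicate two spectra that overlap closely, which is how the abstract's result for same-language translations can be read.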

