Research Article


DOI: 10.26650/acin.842758   IUP: 10.26650/acin.842758

Two Level Kazakh Morphology

Züleyha Yiner, Atakan Kurt

We present a comprehensive two-level morphological analysis of contemporary Kazakh, together with an implementation on the Nuve Framework and a morphological disambiguation test data set. Our study differs from similar studies in several ways: (i) It covers both derivational and inflectional morphology to a greater extent. (ii) Our implementation, which consists of orthographic rules, morphotactics, a root lexicon of roughly 24 thousand roots, and a lexicon of roughly 150 suffixes, is open source and can be downloaded, reviewed, and tested. (iii) Roughly 10 thousand manually disambiguated parses are available as a morphological disambiguation data set. (iv) The implementation is easily extensible: it can be modified or extended with new rules without any programming. (v) We are able to tackle emerging problems quickly and easily, since Nuve is maintained by our study group. (vi) Our implementation can directly handle separately written morphemes, digraphs, and similar cases. (vii) Nuve also includes a Turkish morphological parser/generator, which enables morphology-based machine translation between Turkish and other Turkic languages, since these closely related languages have much in common lexically, morphologically, and syntactically.
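To make the interplay of a root lexicon, morphotactics, and orthographic rules concrete, the following is a minimal Python sketch of a toy analyzer for the Kazakh plural suffix only. It is an illustration under stated assumptions and not Nuve's rule format or API (Nuve itself is a C# library): the four-word lexicon, the tag `+LAr/PLU`, and all function names are hypothetical, and the consonant classes are a simplified version of the plural alternation (л/д/т with vowel harmony а/е).

```python
# Toy two-level style analyzer for the Kazakh plural suffix.
# Everything here (lexicon, tags, function names) is hypothetical and
# only illustrates the idea; it is not Nuve's API or rule format.

BACK = set("аоұы")            # back vowels (simplified inventory)
FRONT = set("әеөүі")          # front vowels (simplified inventory)
VOWELS = BACK | FRONT
TAKES_L = set("рйу")          # after these (and vowels) the suffix starts with л
TAKES_D = set("лмнңжз")       # after these the suffix starts with д
                              # any other final consonant selects т

# Hypothetical root lexicon: root -> part of speech.
ROOTS = {"бала": "Noun", "кітап": "Noun", "қыз": "Noun", "үй": "Noun"}


def is_front(root: str) -> bool:
    """Vowel harmony: decide by the last harmonic vowel of the root."""
    for ch in reversed(root):
        if ch in FRONT:
            return True
        if ch in BACK:
            return False
    return False  # no harmonic vowel found: default to back


def plural_surface(root: str) -> str:
    """Realize the lexical form root+LAr as a surface string."""
    last = root[-1]
    if last in VOWELS or last in TAKES_L:
        consonant = "л"
    elif last in TAKES_D:
        consonant = "д"
    else:
        consonant = "т"
    vowel = "е" if is_front(root) else "а"
    return root + consonant + vowel + "р"


def analyze(surface: str) -> list[str]:
    """Return every parse licensed by the toy lexicon and rules."""
    parses = []
    for root, pos in ROOTS.items():
        if surface == root:
            parses.append(f"{root}/{pos}")
        elif surface == plural_surface(root):
            parses.append(f"{root}/{pos} +LAr/PLU")
    return parses


if __name__ == "__main__":
    for word in ["балалар", "кітаптар", "қыздар", "үйлер", "кітаплар"]:
        print(word, analyze(word))
```

The sketch keeps the lexicon and the alternation rules as data and small functions separate from the matching loop, which mirrors the claim that such a description can be extended by editing rules rather than program logic. An ill-formed string such as `кітаплар` receives no parse because no rule licenses л after п.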

İki Düzeyli Kazak Morfolojisi

This study presents a comprehensive two-level morphology of contemporary Kazakh. The work is implemented on the Nuve Framework and tested with a disambiguation data set. Our study differs from similar ones in several respects: (i) It covers both derivational and inflectional morphology more broadly than similar studies. (ii) Our implementation, which consists of two-level orthographic rules, morphotactic rules, a lexicon of roughly 24 thousand words, and a suffix lexicon of roughly 150 entries, has been released as open source. It can be downloaded, reviewed, and tested by third parties. (iii) The implementation can easily be extended by modifying existing rules or adding new ones. No programming is required. (iv) Because the Nuve Framework is developed by our study group, we can solve newly emerging problems easily and quickly. (v) The implementation easily handles cases such as separately written suffixes and letters composed of two symbols. (vi) Nuve also includes the two-level morphology of Turkish. This makes morphology-based machine translation possible between Turkish and the Turkic languages, which are highly similar in vocabulary, word structure, and sentence structure.


Citations

APA

Yiner, Z., & Kurt, A. (2021). Two Level Kazakh Morphology. Acta Infologica, 5(1), 79-98. https://doi.org/10.26650/acin.842758



TIMELINE


Submitted: 24.12.2020
Accepted: 26.04.2021
Published Online: 29.06.2021

LICENCE


Attribution-NonCommercial (CC BY-NC)

This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.

