Efficiency of Machine Translation in the Language Processing Process; Using Context Clues in Finding the [Exact] Meaning of Quranic Words [In Persian]

Shams, Zaynab; Chehreh, Sepideh

doi:10.52547/jsal.6.2.101

Journal of Studies in Applied Language (JSAL)

Volume 6, Issue 2 (4-2023) JSAL 2023, 6(2): 101-130

‎ 10.52547/jsal.6.2.101

‎ 20.1001.1.29809304.1402.6.2.5.8

Shams Z, Chehreh S. (2023). Efficiency of Machine Translation in the Language Processing Process; Using Context Clues in Finding the [Exact] Meaning of Quranic Words [In Persian]. JSAL. 6(2), 101-130. doi:10.52547/jsal.6.2.101
URL: http://jsal.ierf.ir/article-1-31-en.html

Zaynab Shams*¹, Sepideh Chehreh²

1- PhD student of Qur'anic and Hadith Sciences, Faculty of Theology, Kashan University, Iran, Z.Shams@qom.ac.ir
2- Master of Artificial Intelligence, Islamic Azad University Science and Research Branch, Iran

Abstract:

Translation is the transfer of a text's content from the source language into the target language by finding semantic equivalents between the two languages. The most important problems facing translation are ambiguities in vocabulary and sentence structure. One classification distinguishes five important types of lexical ambiguity (categorical ambiguity, homophones, homographs, polysemy, and transitive ambiguity) and two important types of structural ambiguity (real structural ambiguity and systemic ambiguity). Machine translation (MT), a part of natural language processing (NLP) within computational linguistics and artificial intelligence, is one of the automatic techniques that convert unstructured text into structured data; once text has been converted into data, further analysis can be applied to extract useful information. This article, compiled using the library method, proposes a theoretical plan for resolving issues surrounding word meaning in machine translation of the Quran. Its purpose is to support a better understanding of the meaning of Quranic words by drawing on context clues and the styles of expression. In the proposed method, a more suitable equivalent in the target language is chosen by applying the rule of context together with text mining techniques. In this plan, context is considered at the scale of words, and it can be extended to other types of context if the conditions are met. In short, the plan has two steps: first, prioritizing (weighting) the words adjacent to the target word (any word within the range of verses that are agreed to have been revealed together); then, comparing the word with its homonymous (polysemous) uses and comparing the equivalents of the word with the equivalents of other words (synonymization).
To make the results more accurate, further specifications of the words can be prepared manually, in tables that record, for example, whether the verses are Meccan or Medinan, the order of revelation of the Surahs, and the concepts and interpretations given for Quranic words in dictionaries such as Lisan al-Arab by Ibn Manzur and The Book of Vocabulary in the Strange Qur'an by Al-Ragheb Al-Isfahani. Indexing techniques are used to obtain the input data. In the pre-processing stage, less important data (stop words such as “al-lazi (which)”, “al-lati (that is)”, “lam (not)”, “k'ana (was)”, “kaannama (as if)”, etc.) should be removed to get a better output. To change the shape of the data, diacritics can be removed to make coding easier, and to reduce the sample size, the infix of the words can be used. To prepare a record of specifications for each input word according to the rule of context clues, a tokenizer must first be created and applied to the primary data; then, across the entire collection of input verses, a weight is assigned to each word based on two criteria: spatial proximity and frequency of repetition. The closer a word is to the target word, or the more often it is repeated, the more weight it receives, which represents a stronger semantic connection, and vice versa. Naturally, words in the same verse (with the same verse number) have a greater influence than words in other verses at a greater distance. For the frequency criterion, weighted frequency (TF/IDF weight) is used to show the importance of a word in the surah: the TF/IDF value increases in proportion to the number of times a word appears in each surah or set of input verses, and is offset by the number of verses in the Surah that contain the word.
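The pre-processing and proximity weighting described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the inverse-distance formula, the `cross_verse_factor` parameter, and the short stop-word list (the Arabic-script forms of the examples given above) are all hypothetical choices.

```python
import re
from collections import defaultdict

# Arabic short-vowel marks (U+064B..U+0652), superscript alef (U+0670)
# and tatweel (U+0640). The exact set to strip is an assumption.
DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")

# Illustrative stop-word list (assumption), stored without diacritics.
STOP_WORDS = {"الذي", "التي", "لم", "كان", "كأنما"}


def strip_diacritics(text):
    """Remove diacritic marks so that surface forms are easier to compare."""
    return DIACRITICS.sub("", text)


def tokenize(verse):
    """Strip diacritics, split on whitespace, and drop stop words."""
    return [t for t in strip_diacritics(verse).split() if t not in STOP_WORDS]


def proximity_weights(verses, target, cross_verse_factor=0.5):
    """Weight context words by closeness to each occurrence of `target`.

    `verses` is an ordered list of token lists, one list per verse.
    Same-verse words get 1 / (token distance); words in other verses get
    cross_verse_factor / (verse distance). Contributions accumulate, so
    words that recur near the target also gain weight.
    """
    weights = defaultdict(float)
    for vi, tokens in enumerate(verses):
        for ti, tok in enumerate(tokens):
            if tok != target:
                continue
            for vj, other in enumerate(verses):
                for tj, word in enumerate(other):
                    if word == target:
                        continue
                    if vi == vj:
                        weights[word] += 1.0 / abs(ti - tj)
                    else:
                        weights[word] += cross_verse_factor / abs(vi - vj)
    return dict(weights)
```

Calling `proximity_weights([tokenize(v) for v in verses], target)` on a group of verses returns a dictionary of context words and their weights. Keeping `cross_verse_factor` below 1 is one way to express that words in the same verse should influence the target more than words in neighbouring verses.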
Finally, it was concluded that using the contiguity of words and the semantic relations between them, together with text mining techniques, yields a better understanding of the vocabulary, which in turn leads to a more appropriate choice of the equivalent word in the target language.
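The weighted-frequency (TF/IDF) criterion mentioned in the abstract can be illustrated with a small sketch. It follows one possible reading of the description: term frequency is counted over the whole surah (or set of input verses), and the inverse document frequency treats each verse as a document. The function name and the logarithmic form of the balancing term are assumptions, not details taken from the article.

```python
import math
from collections import Counter


def tfidf_weights(verses):
    """Return a TF/IDF-style weight for every word in a surah.

    `verses` is a list of token lists, one per verse.
    tf  = occurrences of the word in the surah / total tokens in the surah
    idf = log(number of verses / number of verses containing the word)
    """
    n_verses = len(verses)
    total_tokens = sum(len(v) for v in verses)
    if total_tokens == 0:
        return {}
    term_counts = Counter(tok for v in verses for tok in v)
    # set(v) ensures each verse is counted at most once per word
    verse_counts = Counter(tok for v in verses for tok in set(v))
    return {
        tok: (count / total_tokens) * math.log(n_verses / verse_counts[tok])
        for tok, count in term_counts.items()
    }
```

Under this reading, a word that occurs in every verse receives a weight of zero, while a word that is frequent overall but concentrated in a few verses receives a higher weight.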

Keywords: Computational Linguistics, Sociolinguistics, Machine Translation, Qur'an, Context Correlation, Finding Equivalents for the Words

Type of Study: Research | Subject: Sociolinguistics
Received: 2021/09/24 | Accepted: 2022/07/28 | Published: 2023/04/22

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.