The Adequacy and Acceptability of Machine Translation in Translating the Islamic Texts

Islamic translation is a distinct sub-discipline of applied linguistics. It is one of the most important areas of translation because it conveys religious values and an eternal message. Historically, the first translation work was of religious books. This study evaluates the adequacy and acceptability of four machine translation (MT) systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts. It also evaluates the translation outputs against functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity). The findings indicated that Google Translate is the most adequate and acceptable of the four systems in translating Islamic texts. The findings also revealed that Google Translate produces acceptable Islamic translation outputs with regard to these functional characteristics and sub-characteristics, which the study attributes to advances in Google Translate.


Introduction
The emergence of machine translation (MT) has played a significant role in translating Arabic texts in particular, because human translators cannot handle the huge volume of text that needs to be transferred into other languages.
Evaluation methods have recently come to play an extremely significant role in the advancement of computer-mediated translation systems. With the emergence of Islamic English, there is a need for exact Arabic-to-English MT output. MT is taking on new dimensions, mainly in the pursuit of accurate output, especially from Arabic into English.
Machine translation refers to transferring text from one language to another by means of software. Daniel & Martin (2009) defined MT as an automatic process of transferring text from one human language to a target language using context information. Gerber (2012, p. 7) supports this view: "the goal of the translation process is to take Arabic source text drawn from many different genres, both spoken and written, and translate it into fluent English while preserving all of the meaning present in the original Arabic text. Translation agencies will use their own best practices to produce high quality translations. While we trust that each agency has its own mechanism of quality control, we provide the following specific guidelines so that all translations are guided by some common principles".
Islamic translation is a distinct sub-discipline of applied linguistics with its own characteristics, and the translator should know the skills it requires and the issues it raises. It is one of the most important areas of translation because it conveys religious values and an eternal message. Historically, the first translation work was of religious books: the Torah, the Bible, and the Quran. There is therefore a demand to evaluate the adequacy and acceptability of free MT systems in translating Islamic texts. For this evaluation, a comparative analysis of translated Islamic texts is conducted according to specific criteria concentrating on the characteristics of output quality: syntax, terminology, readability, and fidelity. Development in Arabic-to-English MT is important for obtaining a relatively correct translation of Arabic text. Even though a number of Arabic MT systems have attained a somewhat satisfactory level of output, transferring exact information from the source language into the target language still requires additional processing in almost all MT systems.

Research Problem
Research on Arabic-to-English MT systems is very important in the field of translation. However, few studies have evaluated Arabic-English MT systems.
In addition, there is a need for research evaluating the adequacy and acceptability of Arabic-English MT systems in translating Islamic texts, given the huge volume of Arabic Islamic text translated in recent years. Previous studies analysed a variety of literary, economic, legal, journalistic, and technical texts, but not Islamic texts. To the best of the researcher's knowledge, no research has addressed the adequacy and acceptability of Arabic-English MT systems in translating Islamic texts.

Significance of the Study
Previous studies of MT have investigated the efficiency of MT systems, the evaluation of Arabic-to-English MT output for literary, technical, and legal texts, the problems of Arabic MT, and morphological and syntactic representation in MT. This study differs from those studies. It focuses on evaluating the adequacy and acceptability of four MT systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts. In addition, it evaluates the Islamic translation output against functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity). The study is also significant because it is a serious attempt to better understand the effectiveness of MT in translating Islamic texts, and it is beneficial for translators, students, educators, and scholars in the field of translation.

Review of Literature
The evaluation of MT systems is an important area of inquiry, both for determining the adequacy and acceptability of current MT systems and for assessing the quality of their output. Evaluation of MT output has been addressed in many studies (see Abu-Al-Sha'r & AbuSeileek, 2013; Hebresha & Ab Aziz, 2013; Carpuat et al., 2010; Gerber, 2012; Marcu et al., 2012; Huck, Stein, & Ney, 2011; Abu-Al-Sha'r, 2009; Galley et al., 2009; Attia, 2008; Salem et al., 2008; Habash, 2007; Al-Otoum, 2006). Recently, the evaluation process has played an extremely significant role in the advancement of MT systems.
Abu-Al-Sha'r & AbuSeileek (2013) investigated the advancement of MT systems between 2008 and 2013. They evaluated seven MT systems through a comparative analysis and re-evaluation of texts according to set criteria. The corpus contained a variety of literary, economic, legal, journalistic, and technical texts. The findings indicated some advancement in the output characteristics of the Google MT system in comparison with the other six MT systems.
Hebresha & Ab Aziz (2013) designed an automatic translation system for translating Classical Arabic texts into English using a rule-based approach. The system includes analysis, transfer, and generation stages. To trace its effectiveness, a comparative evaluation was conducted between the MT output and human translation. The evaluation reported an output accuracy of 89.4%, which suggests that a rule-based approach affords good output in translating Classical Arabic into English.
Huck, Stein, & Ney (2011) conducted a study entitled "Advancements in Arabic-to-English Hierarchical Machine Translation". They investigated a number of advanced methods and models in statistical MT from Arabic into English, focusing on the framework of hierarchical phrase-based translation. They gathered complementary techniques that had previously been examined in isolation and mainly on different language pairs. The combination of techniques and models yielded a noteworthy improvement over a baseline using a standard set of models. The resulting hierarchical systems were competitive on the large-scale National Institute of Standards and Technology (NIST) Arabic-to-English translation task.
Shenassa & Khalvandi (2008) designed an evaluation system to analyse different English translations of the Quran using computational linguistics.
Abu-Al-Sha'r & AbuSeileek (2013, p. 527) state that "Translation from Arabic into English is a complex and demanding process where productivity could be determined by the quality and range of its dictionary".
Al-Otoum (2006) evaluated two Arabic-to-English MT systems, namely Tran Sphere and An-nakel. He examined the overall quality of the translations produced by these two systems against four functional criteria: readability, fidelity, terminology, and syntax. The results indicated that the output of both systems suffered from low faithfulness, language problems, mistranslated terms, and inadequacy.
Izwaini (2006) evaluated three Arabic-English MT systems (Google, Sakhr, and Systran) to identify the problems of Arabic MT, their causes, and possible solutions, examining a variety of frequent linguistic problems. The findings identified two deficiencies in Google's output concerning writing format. Sakhr produced better output owing to its handling of diacritics. Systran produced literal translations with problems in grammar and word order, left many items untranslated, and yielded distorted language.

Research Methodology
This section describes the research methodology of the study. It presents the objectives and questions of the study and describes in detail the corpora under investigation and the research instrument.

Objectives of the Study
This study evaluates the adequacy and acceptability of four MT systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts. In addition, it evaluates the Islamic translation output against functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity). Adequacy refers to the quality and correctness of the translation output, whereas acceptability refers to the linguistic appropriateness of the MT output.

Questions of the Study
1) Are there any statistically significant differences between the MT systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts?
2) Are there any statistically significant differences in Islamic translation output based on functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity)?

Corpora
The corpora used in this research consist of Islamic texts from five genres, including supplication, part of a sermon, the pillars of Islam, and Hadith (sayings of the Prophet of Islam), translated from Arabic into English by four MT systems: World Lingo, Babylon Translation, Google Translate, and Bing Translator.
This study evaluates four functional characteristics: syntax, terminology, readability, and fidelity. The Islamic texts were translated by the four MT systems, and a different scale was applied to each characteristic. A panel of three specialist referees in translation, each holding a Ph.D. in applied linguistics, evaluated the MT outputs. They first worked through 10% of the sample outputs together and discussed points of difference until agreement was reached. They then assessed the output texts individually. The inter-rater reliability between them was 0.85, which is acceptable for the purposes of this study. Across the full set of translations, it was 0.84 for fidelity, 0.89 for syntax, 0.86 for terminology, and 0.81 for readability.
Readability and fidelity were each measured on a 4-point scale (0 to 3), where 0 is the lowest result and 3 indicates that the meaning is clear in spite of other mistakes; this scale establishes whether the content of the Islamic texts is accurately transferred by the four MT systems. Syntax was measured on a 5-point scale. Terminology was measured on a 2-point scale to verify whether the translation of each term is accurate. The overall marks were then organized according to the findings for the output on each metric.

Findings of the Study
This section presents the findings of the study under the following subheadings: readability, fidelity, syntax, and terminology of the selected sample of Islamic texts as translated by the four MT systems, followed by the overall evaluation of the four MT systems.

Readability of Selected Sample of Islamic Texts by Four MT Systems
Table 1 presents the mean readability scores of the 30 statements under study for the four MT systems. Readability was measured on a 4-point scale from 0 to 3, which captures the extent to which the source language (SL) text is accurately transferred into the target language (TL) text; 0 is the lowest score and 3 is the highest. A score of (3) shows that the meaning is correctly conveyed, clear, and acceptable from the first reading; (2) shows that the meaning of the output seems clear with some justification; (1) indicates that less than 50% of the meaning can be understood; and (0) shows that the meaning is completely unclear.
The findings in Table 1 indicate that Google Translate achieved a high level of clarity in transferring the texts from Arabic into English (80%) and that its output is more acceptable than that of the other MT systems. The mean readability of Bing Translator's output is 63.3%, which also seems acceptable; that of Babylon Translation is 53.3%; and World Lingo is the least acceptable, with a mean readability of 13.3%. It should be noted that scores of (3) and (2) on the scale count as positive points.
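The reported percentages are all multiples of 1/30 (80% corresponds to 24 of the 30 statements, and 13.3% to 4 of 30), which is consistent with reading each figure as the share of statements rated (2) or (3). The Python sketch below illustrates that reading only; the study does not spell out this calculation, so it should be taken as an assumption, and the scores in it are hypothetical placeholders rather than the study's data.

```python
def positive_share(scores, positive=(2, 3)):
    """Return the percentage of statements whose score counts as positive.

    Assumption: a statement is "positive" when rated 2 or 3 on the 0-3 scale.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    hits = sum(1 for s in scores if s in positive)
    return 100.0 * hits / len(scores)


# Hypothetical ratings for 30 statements (NOT the study's data):
# 24 statements rated 2 or 3, and 6 statements rated 0 or 1.
example_scores = [3] * 14 + [2] * 10 + [1] * 4 + [0] * 2

print(round(positive_share(example_scores), 1))  # 80.0
```

Under the same assumption, the 2-point terminology scale would use `positive=(1,)`, so that only correctly translated terms are counted.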

Fidelity of Selected Sample of Islamic Texts by four MT Systems
Table 2 presents the mean fidelity scores of the 30 statements under study for the four MT systems. Fidelity was measured on a 4-point scale from 0 to 3, which captures the extent to which the information in the 30 statements is transferred completely and faithfully; 0 is the lowest score and 3 is the highest. A score of (3) shows that almost all the information is faithfully conveyed from the first reading; (2) shows that the information is relatively faithful; (1) indicates that less than 50% of the information is faithfully transferred; and (0) indicates that the information is not translated faithfully.
The findings in Table 2 demonstrate that Google Translate achieved a high level of faithfulness in transferring the Islamic texts (73.4%) compared with the other three MT systems. Bing Translator, Babylon Translation, and World Lingo all scored below 50%, which indicates unfaithful translation.

Syntax of Selected Sample of Islamic Texts by Four MT Systems
Table 3 presents the mean syntax scores of the 30 statements under study for the four MT systems. Syntax, which covers all grammatical problems, was measured on a 5-point scale from 1 to 5 that captures the grammatical correctness of the statements; 1 is the lowest score and 5 is the highest. A score of (5) shows that the statement is grammatically perfect; (4) indicates that the statement is almost perfect but has a few minor problems; (3) shows that the statement is reasonably grammatical but has some less serious problems; (2) indicates that the statement is almost ungrammatical, with many serious problems that affect the meaning; and (1) shows that the statement is a fully ungrammatical fragment.

Terminology of Selected Sample of Islamic Texts by Four MT Systems
Table 4 presents the mean terminology scores of the 30 statements under study for the four MT systems. Terminology, which reflects the accuracy of the translation, was measured on a 2-point scale from 0 to 1 that records whether a term is translated correctly: (0) is assigned to terms that are wrongly translated, and (1) to terms that are correctly translated. The mean scores for Google Translate (83.3%) and Bing Translator (60%) exceed the 50% level, which indicates that Google Translate in particular generates accurate term equivalents for the Islamic texts. The findings also revealed that the mean terminology score of Babylon Translation is 46.7%, while World Lingo mistranslated the majority of terms.

The Overall Evaluation of the Four MT Systems
To evaluate the outputs of the four MT systems, the percentages of the means on the four scales (readability, fidelity, syntax, and terminology) were calculated. Table 5 presents the means of the evaluation of the Islamic texts translated by the four systems. The findings indicate that Google Translate achieved the highest performance in readability, fidelity, syntax, and terminology. Table 5 also shows that the percentages for fidelity and syntax are below 50% for Babylon Translation and World Lingo, whereas the percentages for readability and terminology are higher than 50% for Google Translate and Bing Translator. Table 6 presents the means of each functional sub-characteristic for the four MT systems, expressed as a percentage out of 25% for each sub-characteristic.
The findings indicate that Google Translate achieved the highest percentage in readability, fidelity, syntax, and terminology. Table 7 presents the percentages of the functional characteristics of the General Software Quality (GSQ). These functional characteristics total 100%, divided into three criteria: suitability (25%), accuracy (50%), and well-formedness (25%).
The percentages for readability, fidelity, syntax, and terminology are mapped onto suitability, accuracy, and well-formedness. Readability stands for suitability (25%), fidelity and terminology together make up accuracy (50%), and syntax stands for well-formedness (25%). The overall percentages of suitability, accuracy, and well-formedness show that Google Translate is the most adequate and acceptable of the four systems, taking into account the criteria of readability, terminology, fidelity, and syntax. Google Translate achieved the highest percentage (78.34%), followed by Bing Translator, Babylon Translation, and World Lingo, which is below 50%.
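As a worked check of this weighting, the Python sketch below recombines the four sub-characteristic percentages that the study reports for Google Translate (each out of 25) into the three GSQ criteria. The function and variable names are illustrative and not part of the study.

```python
def gsq_breakdown(readability, fidelity, syntax, terminology):
    """Combine four sub-characteristic scores (each out of 25) into GSQ criteria."""
    suitability = readability            # readability stands for suitability (out of 25)
    accuracy = fidelity + terminology    # fidelity and terminology form accuracy (out of 50)
    well_formedness = syntax             # syntax stands for well-formedness (out of 25)
    total = suitability + accuracy + well_formedness  # GSQ total (out of 100)
    return {
        "suitability": round(suitability, 2),
        "accuracy": round(accuracy, 2),
        "well_formedness": round(well_formedness, 2),
        "total": round(total, 2),
    }


# Sub-characteristic percentages reported in the study for Google Translate.
google = gsq_breakdown(readability=20.0, fidelity=18.35, syntax=19.17, terminology=20.82)
print(google)
```

With these inputs, accuracy comes to 39.17 out of 50 and the total to 78.34, matching the figures the study reports for Google Translate.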

Discussion, Conclusion & Recommendations
The discussion of the results is presented in two parts. The first addresses the first research question: whether there are statistically significant differences between the MT systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts. The second addresses the second question: whether there are any statistically significant differences in Islamic translation output based on functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity).
A comparison of the mean scores of the outputs of the four MT systems revealed that Google Translate is the most adequate and acceptable of the four in translating Islamic texts, as seen in the tables.
The findings for the second question indicated that Google Translate achieved a very high percentage on the four sub-functional characteristics (readability 20%, fidelity 18.35%, syntax 19.17%, and terminology 20.82%) compared with Bing Translator, Babylon Translation, and World Lingo. Bing Translator comes second, with readability its highest percentage (15.8%) among the four standards. Tables 5 and 6 present the means of the overall evaluation of the four MT systems on the four scales: readability, fidelity, syntax, and terminology. Each standard accounts for 25% of the GSQ (General Software Quality), which totals 100%. Table 7 shows the overall percentages of suitability, accuracy, and well-formedness of the translation output. The findings (suitability 20%, accuracy 39.17 out of 50, and well-formedness 19.17, for a total of 78.34%) show that Google Translate is the most adequate and acceptable of the four systems, followed by Bing Translator, Babylon Translation, and World Lingo, which is below 50%. The findings of this study are also in accord with those reported by earlier studies on the effectiveness of MT systems (see, for instance, Abu-Al-Sha'r & AbuSeileek, 2013; Arabglot, 2012; Abu Alsha'r, 2008). These findings help explain why some translation outputs are unintelligible, indecipherable, unfaithful, and inaccurate. Abu-Al-Sha'r & AbuSeileek (2013, p. 534) state that "Google Translate can make intelligent guesses as to what an appropriate translation should be. In addition, the alternatives suggested by Google gear the focus towards the urgent need to a perfect transfer of the translation output in the present time".
Another explanation that supports this result, according to Google (2016), is that "When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation for you. By detecting patterns in documents that have already been translated by human translators, Google Translate can make intelligent guesses as to what an appropriate translation should be. This process of seeking patterns in large amounts of text is called 'statistical machine translation'. Since the translations are generated by machines, not all translation will be perfect. The more human-translated documents that Google Translate can analyse in a specific language, the better the translation quality will be. This is why translation accuracy will sometimes vary across languages."
To sum up, Google Translate is the most adequate and acceptable of the four systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator) in translating Islamic texts. The findings of the present study also indicate that Google Translate produces acceptable Islamic translation output with regard to the functional characteristics (accuracy, suitability, and well-formedness) and sub-characteristics (syntax, terminology, readability, and fidelity), owing to advances in Google Translate. Abu-Al-Sha'r & AbuSeileek (2013) support these findings, stating that Google Translate's progress in producing satisfactory Arabic translation has exceeded expectations, owing to a better understanding of the unique characteristics of the Arabic language and to the adoption of the most suitable processing approaches.
The findings of the current study point to a critical need for further research in this area to fill the gap in the literature. The researcher recommends further studies with a larger number of Islamic texts to present a clearer picture of the phenomenon under investigation. Further studies could also be carried out to verify or disprove these findings. In addition, this paper is restricted to four MT systems (World Lingo, Babylon Translation, Google Translate, and Bing Translator); further studies may investigate other MT systems and linguistic features.

Table 1.
Mean output scores for readability

Table 2.
Mean output scores for fidelity

Table 3 indicates that the output of Google Translate has the fewest grammatical problems of the four MT systems. Google Translate achieved a very high percentage compared with the other systems: Bing Translator, Babylon Translation, and World Lingo.

Table 3.
Mean output scores for syntax

Table 4.
Mean output scores for terminology

Table 5 also indicates that the percentages of readability, fidelity, syntax, and terminology are below 50% for Bing Translator, Babylon Translation, and World Lingo.

Table 6.
Overall evaluations of sub-functional characteristics (25% each)

Table 7.
Percentages of each criterion in the General Software Quality (GSQ)