Evaluation of Students' Mathematical Ability in Afghanistan's Schools Using Cognitive Diagnosis Models

Cognitive diagnosis models (CDMs) are restricted latent class models that can be used to analyze response data from educational or psychological tests. The Deterministic Input Noisy Output "AND" gate (DINA) model and the Deterministic Input Noisy Output "OR" gate (DINO) model are two popular CDMs for educational assessment and evaluation. They take different views on how cognitive skills relate to the likelihood of a correct item response. This study aims to compare these two models and to compare girls and boys under cognitive diagnosis modeling. In addition, it aims to determine the 8th-grade students' level of mathematics at the school level. A data set from the Trends in International Mathematics and Science Study (TIMSS) 2011 mathematics assessment, which measures 13 attributes and includes 32 questions, is used to examine the mathematical abilities of students in Grade 8. The sample of 274 students includes 129 girls and 145 boys, selected by multistage cluster sampling from Ghor province. Under the cognitive diagnosis assessment framework, the DINA model and the DINO model are used. The results demonstrated that the highest probability of mastery belonged to attribute 4 (0.4836), whereas the lowest probability belonged to attributes 24 and 32 (0.12).


INTRODUCTION
Cognitive diagnostic models (CDMs; Rupp, Templin, & Henson, 2010) describe an individual's educational test performance in terms of a set of specific discrete skills, called attributes, each of which the individual may or may not have mastered. In this way, they provide a detailed description of the examinee's strengths and weaknesses in the domain being tested. The set of possible attribute profiles for a given test defines the proficiency classes to which examinees can be assigned.
Cognitive diagnostic models (CDMs) are statistical and psychometric models developed to identify examinees' mastery of fine-grained skills based on a predefined matrix. Cognitive diagnostic tests can be used to identify which combinations of skills an examinee does or does not possess (Su, 2013). The purpose of each of these models is to classify examinees according to the skills required by the test. CDMs offer advantages that are not available with other approaches. First of all, many other models, including item response theory (IRT) models, assume that the data are statistically unidimensional and require this as a prerequisite for calibrating the data and estimating the parameters. In most such models, unidimensionality is necessary for locating examinees on a hypothetical continuum. One important feature of CDMs is that they do not need this assumption. Unidimensionality is somewhat problematic in educational settings because research has shown that academic instruments typically measure a set of attributes or skills, each of which can form a separate statistical dimension (Afzali, 2016).

In response to these difficulties, a number of researchers (Ayers, Nugent, & Dean, 2008; Chiu, 2008; Chiu & Douglas, 2013; Chiu, Douglas, & Li, 2009; Park & Lee, 2011; Willse, Henson, & Templin, 2007) have explored the potential of nonparametric classification techniques as heuristic or approximate methods for assigning examinees to proficiency classes. (A heuristic uses clever computational shortcut strategies to obtain a solution that is very close, if not identical, to the optimal solution.) Software for implementing these techniques can be developed from efficient cluster analysis programs that are readily available in the major statistical packages. The Deterministic Input Noisy Output "AND" gate (DINA) model (Junker & Sijtsma 2001; Macready & Dayton 1977) and the Deterministic Input Noisy Output "OR" gate (DINO) model (Templin and Henson 2006) are two popular cognitive diagnosis models (CDMs). CDMs for educational assessment (DiBello, Roussos, & Stout 2007; Haberman & von Davier 2007; Leighton & Gierl 2007; Rupp, Templin, & Henson 2010) decompose an examinee's ability in a domain into binary cognitive skills called attributes, each of which an examinee may or may not have mastered. Distinct profiles of attributes define different proficiency classes. From the observed item scores, maximum likelihood estimates of the model parameters are obtained that are then used to assign examinees to the different proficiency classes. Software for fitting the DINA model and the DINO model using marginal maximum likelihood estimation via the Expectation Maximization (EM) algorithm (MMLE-EM) is available through the package CDM implemented in R (Robitzsch, Kiefer, George, & Uenlue 2016).
The DINA model and the DINO model represent different views on how the mastery of attributes and the probability of a correct item response are related. The DINA model is a conjunctive model, meaning that only mastery of all attributes required for an item maximizes the probability of a correct response. In contrast, the DINO model is a disjunctive model, which means that mastery of a subset of the required attributes is a sufficient condition for maximizing the probability of a correct response (for a detailed discussion of these concepts, consult Henson, Templin, & Willse 2009).
Recently, however, Liu, Xu, and Ying (2011) demonstrated that the DINO model and the DINA model share a "dual" relation: One model can be expressed in terms of the other, and which of the two models is fitted to a given data set is essentially irrelevant because, after appropriate transformations, the item parameter estimates are identical (as is shown in detail below) and thus, the estimates of examinees' proficiency class memberships are identical too. This also means that the two models must share the same theoretical properties: what applies to one model automatically holds for the other model. Hence, one proof fits both models, and one set of simulations suffices to cover both models.
General CDMs have become the new theoretical standard in cognitively diagnostic modeling (de la Torre 2011; Henson, Templin, & Willse 2009; Rupp, Templin, & Henson 2010; von Davier 2005, 2008, 2014). In this article, a proof of the duality of the DINA model and the DINO model is presented that is tailored to the form and parameterization of general CDMs. The presentation is preceded by a brief review of some key technical concepts concerning CDMs. As an example of how the duality of the DINA model and the DINO model allows separate proofs for the two models to be condensed into a single proof, a compact proof of the condition of completeness of the Q-matrix is presented that covers both models. The Discussion summarizes the practical and theoretical implications of the DINA-DINO duality. In this study, we attempt to find the level of slipping and guessing among students of Grade 8 using the DINA and DINO models. In particular, the Trends in International Mathematics and Science Study (TIMSS), a quadrennial assessment administered by the International Association for the Evaluation of Educational Achievement (IEA) since 1995, evaluates the mathematics and science abilities of fourth and eighth graders. TIMSS has been administered every four years in many countries (for example, in 1999, 2003, 2007, 2011, 2015, and 2019), but Afghanistan has not been eligible to take part.
Thus, in Afghanistan, there is no study that evaluates students' mathematical abilities using cognitive diagnosis models, and there is only limited research on undergraduate mathematics education. This article therefore addresses the following objectives and questions.

Contribution to the literature
• The present study is concerned with identifying the strengths and weaknesses of Afghan students in eighth-grade mathematics skills, using the TIMSS 2011 database across all of its research methodology areas.
• This study contributes to education practices by incorporating skill hierarchies with assessments.
• The simulation analysis would provide valuable information about the potential inaccuracy of parameter estimates due to misspecification of the relationships between attributes and possible attribute profiles.

1. To evaluate the application of CDMs in identifying school students' mathematical abilities at Grade 8.

2. To determine the 8th-grade students' level of mathematics at the school level.

In Afghanistan, most students in departments of mathematics who have only basic mathematical skills are at a distinct disadvantage, and a significant proportion of students fail to learn higher-level subjects. Academic failure in this subject and weak results in national and international tests are due to this weakness. This study therefore pays particular attention to constructing the hierarchy of the subject when educational programs are developed, and to following a cognitive diagnosis model in the learning process and in planning, so that the two support each other. Establishing prerequisite knowledge for curriculum concepts, and planning before high-level skills are taught, can have a significant impact on the quality of mathematics education and learning in Afghanistan's middle schools.
Research on the impact of cognitive theory on test design has been very limited (Gierl & Zhou, 2008; Leighton et al., 2004). Most examples of CDM applications in the literature are limited to no more than eight attributes (Hartz, 2002; Rupp & Templin, 2008b) because of the long computing time for models with larger numbers of attributes and items. If the number of latent classes can be reduced from $2^K$, where $K$ is the number of attributes, the sample size needed to obtain stable parameter estimates from CDM calibrations will decrease. This will also result in faster computing time. One solution to decrease the number of latent classes is to impose hierarchical structures (Leighton et al., 2004) on skills. The resulting approach is able to assess and analyze more attributes by reducing the number of possible latent classes and the sample size requirement (de la Torre, 2008; de la Torre & Lee, 2010). De la Torre (2012) suggested two methods to estimate attributes with hierarchical structures. First, keeping the EM algorithm as is, but without any gain in efficiency, the prior value of attribute patterns not possible under the hierarchy can be set to 0. Second, for greater efficiency, but requiring minor modifications of the EM algorithm, attribute patterns not possible under the hierarchy can be dropped.
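The reduction in latent classes can be illustrated with a minimal sketch. It enumerates all binary attribute profiles and keeps only those that respect a prerequisite hierarchy; the function name `permissible_profiles`, the choice of four attributes, and the purely linear hierarchy are assumptions made for illustration, not the structure used in this study.

```python
from itertools import product

def permissible_profiles(n_attributes, prerequisites):
    """Enumerate binary attribute profiles that respect a hierarchy.

    prerequisites is a list of (a, b) pairs (0-based, hypothetical)
    meaning attribute a must be mastered before attribute b can be.
    """
    profiles = []
    for alpha in product((0, 1), repeat=n_attributes):
        # b mastered (1) is only allowed when a is mastered (1).
        if all(alpha[a] >= alpha[b] for a, b in prerequisites):
            profiles.append(alpha)
    return profiles

K = 4  # illustrative number of attributes
linear = [(k, k + 1) for k in range(K - 1)]  # attribute k precedes k + 1

unrestricted = permissible_profiles(K, [])
restricted = permissible_profiles(K, linear)
print(len(unrestricted), len(restricted))  # 16 5
```

Under a linear hierarchy the $2^K$ profiles shrink to $K + 1$, which corresponds to the second of the two strategies described above (dropping impossible patterns rather than giving them zero prior weight).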

DINA Model
The DINA model (Deterministic Input, Noisy "And" Gate; Haertel 1989; Junker and Sijtsma 2001) is a commonly discussed diagnostic classification model (DCM; Junker and Sijtsma 2001; de la Torre and Douglas 2004; Templin and Henson 2006). The probability of a correct response defined by the DINA model is a function of a latent response variable $\eta_{ic}$:

$$\eta_{ic} = \prod_{a=1}^{A} \alpha_{ca}^{q_{ia}}$$

Here, $\alpha_{ca}$ is the real knowledge status (master or non-master) of the $c$-th examinee on the $a$-th attribute, and $q_{ia}$ is the element of the Q-matrix that defines whether the $a$-th attribute is required by the $i$-th item. If the $c$-th examinee has mastered all the attributes that the Q-matrix specifies for the item, then $\eta_{ic} = 1$; otherwise, $\eta_{ic} = 0$.
Given $\eta_{ic}$, the probability of a correct response $P(X_{ic} = 1 \mid \eta_{ic})$ is defined by the DINA model for the $i$-th item as:

$$P(X_{ic} = 1 \mid \eta_{ic}) = (1 - s_i)^{\eta_{ic}} \, g_i^{1 - \eta_{ic}}$$

Here the slipping parameter $s_i$ is the probability of an incorrect response to the $i$-th item when the $c$-th individual has mastered all the attributes measured by the item. The guessing parameter $g_i$ is the probability of a correct response to the $i$-th item when the $c$-th individual has not mastered all of the attributes measured by the item. The DINA model is a conjunctive model that uses two parameters (slip and guessing) to define the probability of a correct response. A positive response is most likely when all attributes measured by the item have been mastered. Lacking one or more of the attributes measured by that item reduces an examinee's probability of a correct response to the level of guessing.
The guessing parameter, $g_i$, is the probability of responding to an item correctly for a respondent who has not mastered at least one required attribute:

$$g_i = P(X_{ic} = 1 \mid \eta_{ic} = 0)$$

If a respondent masters all the required attributes, $\eta_{ic} = 1$, the probability of responding to the item correctly is equal to the probability of not slipping on the item, $1 - s_i$. On the other hand, if the respondent fails to master at least one of the required attributes, $\eta_{ic} = 0$, the probability of responding to the item correctly drops to the probability of guessing on the item, $g_i$. The DINA model order-constrains the slipping and guessing parameters: $1 - s_i$ is assumed to be greater than $g_i$; thus, the probability of responding to an item correctly is guaranteed to be always higher for the respondents who mastered all the measured attributes than for the respondents who lacked at least one of the measured attributes, regardless of the magnitudes of the slipping and guessing parameters (Rupp et al., 2010).

In the formulation of the attribute mastery indicator $\eta_{ic}$ given above, $A$ is the total number of attributes measured, and $q_{ia}$ indicates whether attribute $a$ is measured by item $i$. The possible values that $q_{ia}$ takes are 0 or 1. The other indicator, $\alpha_{ca}$, identifies whether the respondent in latent class $c$ mastered attribute $a$, which takes values of 0 or 1 as well. Since the attribute mastery indicator, $\eta_{ic}$, is created through multiplication of each $\alpha_{ca}$ for every measured attribute, lack of a single measured attribute would cause the value of $\eta_{ic}$ to be 0.
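The conjunctive rule can be made concrete with a small numerical sketch. The Q-matrix, attribute profiles, and item parameters below are hypothetical values chosen for illustration, and the function names `dina_eta` and `dina_prob` are not taken from any package.

```python
import numpy as np

def dina_eta(alpha, Q):
    """Latent response under the conjunctive (DINA) rule.

    alpha: (classes x attributes) 0/1 mastery profiles.
    Q:     (items x attributes) 0/1 Q-matrix.
    Returns a (classes x items) 0/1 array.
    """
    # Count required attributes that a profile has NOT mastered.
    missing = (1 - alpha) @ Q.T
    return (missing == 0).astype(int)

def dina_prob(alpha, Q, slip, guess):
    """P(correct) = 1 - slip where eta = 1, guess where eta = 0."""
    eta = dina_eta(alpha, Q)
    return np.where(eta == 1, 1 - slip, guess)

# Hypothetical example: 2 items, 2 attributes, 3 attribute profiles.
Q = np.array([[1, 0],
              [1, 1]])
alpha = np.array([[0, 0],
                  [1, 0],
                  [1, 1]])
slip = np.array([0.10, 0.20])
guess = np.array([0.20, 0.25])

print(dina_eta(alpha, Q))
print(dina_prob(alpha, Q, slip, guess))
```

In this sketch the profile that has mastered only the first attribute answers the first item at the non-slipping rate but falls back to guessing on the second item, which requires both attributes.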

DINO Model
The deterministic input, noisy "or" gate model, known as DINO, is a compensatory CDM (Templin, 2004; Templin & Henson, 2006) because it assumes that the lack of one measured attribute can be compensated for by another attribute. More specifically, mastery of at least one attribute compensates for a deficit in all the other measured attributes. As in the DINA model, the slipping and the guessing parameters are estimated at the item level. The DINO model works with a disjunctive condensation rule in which the presence of at least one measured attribute guarantees a high probability of endorsing an item (Rupp et al., 2010).
The DINO model estimates the probability of a correct response for item $i$ in latent class $c$ as follows:

$$\pi_{ic} = P(X_{ic} = 1 \mid \omega_{ic}) = (1 - s_i)^{\omega_{ic}} \, g_i^{1 - \omega_{ic}}$$

where $\pi_{ic}$ is the probability of a correct response, $X_{ic}$ is the observed response, $\omega_{ic}$ is the latent response variable, and $s_i$ and $g_i$ are, respectively, the slipping and the guessing parameters (Rupp et al., 2010). The latent response variable in the DINO model above is defined as follows:

$$\omega_{ic} = 1 - \prod_{a=1}^{A} (1 - \alpha_{ca})^{q_{ia}}$$

where $q_{ia}$ specifies whether attribute $a$ is measured by item $i$ and takes values of 0 or 1. The other indicator, $\alpha_{ca}$, identifies whether the respondent in latent class $c$ mastered attribute $a$, which takes values of 0 or 1 as well.
If attribute $a$ is not measured by item $i$, $q_{ia}$ takes a value of 0, and consequently the value of $1 - \alpha_{ca}$ does not matter. On the other hand, if attribute $a$ is measured by item $i$, $q_{ia}$ takes a value of 1, and accordingly $1 - \alpha_{ca}$ counts toward the final value that $\omega_{ic}$ takes. If the respondent in latent class $c$ masters attribute $a$, $\alpha_{ca}$ takes a value of 1, and thereby $1 - \alpha_{ca}$ is 0. However, if the respondent in latent class $c$ does not master attribute $a$, $1 - \alpha_{ca}$ is 1. Because the occurrence of $\omega_{ic} = 1$ depends on the existence of at least one 0 in the multiplication term, mastering at least one attribute greatly increases the probability of endorsing the item. The DINO model is useful when only one of several attributes needs to be mastered (Rupp et al., 2010). The slipping, $s_i$, and the guessing, $g_i$, parameters of the DINO model are defined in the same way as in the DINA model.
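The disjunctive rule can be sketched numerically in the same way. The Q-matrix, profiles, and item parameters are hypothetical, and `dino_omega` and `dino_prob` are illustrative names rather than functions of an existing package.

```python
import numpy as np

def dino_omega(alpha, Q):
    """Latent response under the disjunctive (DINO) rule.

    omega = 1 as soon as at least one required attribute is mastered,
    which equals 1 - prod_a (1 - alpha_a) ** q_a.
    """
    mastered = alpha @ Q.T  # number of required attributes mastered
    return (mastered > 0).astype(int)

def dino_prob(alpha, Q, slip, guess):
    """P(correct) = 1 - slip where omega = 1, guess where omega = 0."""
    omega = dino_omega(alpha, Q)
    return np.where(omega == 1, 1 - slip, guess)

# Hypothetical example: 2 items, 2 attributes, 3 attribute profiles.
Q = np.array([[1, 0],
              [1, 1]])
alpha = np.array([[0, 0],
                  [0, 1],
                  [1, 1]])
slip = np.array([0.10, 0.20])
guess = np.array([0.20, 0.25])

print(dino_omega(alpha, Q))
print(dino_prob(alpha, Q, slip, guess))
```

Here the profile that has mastered only the second attribute already reaches the non-slipping rate on the second item, although that item lists two attributes in its Q-matrix row.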
Compared with the DINA model, the major difference is the way the latent response variable is calculated. Under the DINO model, mastering any one of the required attributes is enough to make a correct or positive response highly likely. The two models thus represent different views on how the mastery of cognitive skills and the probability of a correct item response are related. Nevertheless, Liu, Xu, and Ying (2011) demonstrated that the DINO model and the DINA model share a "dual" relation: one model can be expressed in terms of the other, both models can be fitted by the same software, and which of the two models is fitted to a given data set is essentially irrelevant because the results are identical. (As an aside, note that the characterization of the special relationship between the DINA model and the DINO model as "dual" deviates from the well-defined meaning of this term in operations research; for details, consult Papadimitriou & Steiglitz, 1998.)
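The dual relation can be checked numerically. The sketch below assumes the usual form of the transformation, which is not spelled out in this article: attribute profiles and responses are complemented, and the slipping and guessing parameters swap roles. The Q-matrix and parameter values are hypothetical.

```python
import itertools
import numpy as np

def dina_prob(alpha, Q, slip, guess):
    """DINA: correct at rate 1 - slip only if no required attribute is missing."""
    eta = ((1 - alpha) @ Q.T) == 0
    return np.where(eta, 1 - slip, guess)

def dino_prob(alpha, Q, slip, guess):
    """DINO: correct at rate 1 - slip if any required attribute is mastered."""
    omega = (alpha @ Q.T) > 0
    return np.where(omega, 1 - slip, guess)

# Hypothetical Q-matrix (4 items x 3 attributes) and item parameters.
Q = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1],
              [1, 1, 1]])
slip = np.array([0.10, 0.20, 0.15, 0.30])
guess = np.array([0.20, 0.25, 0.10, 0.05])

# All 2**3 attribute profiles.
profiles = np.array(list(itertools.product((0, 1), repeat=3)))

p_dino = dino_prob(profiles, Q, slip, guess)
# Assumed dual: DINA on complemented profiles with slip and guess swapped
# gives the probability of the complemented (incorrect) response.
p_dina_dual = dina_prob(1 - profiles, Q, slip=guess, guess=slip)

print(np.allclose(p_dino, 1 - p_dina_dual))  # True
```

The check passes for every profile because a DINO latent response of 1 on a profile is exactly a DINA latent response of 0 on the complemented profile.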

Q-matrix
The analysis of most CDMs is based on an item-attribute incidence matrix called a Q-matrix (Tatsuoka, 1983). The diagnostic power of CDMs relies on the construction of a Q-matrix with attributes that are theoretically appropriate and empirically supported. Studies on the Q-matrix can generally be categorized as exploratory or confirmatory. Exploratory approaches intend to discover the Q-matrix from the data when the whole Q-matrix is unknown. Confirmatory approaches aim to refine a given Q-matrix in which some elements are assumed to be known. Although an entirely exploratory approach has no information about the number of attributes in advance, an approach that is given the number of attributes is still regarded as exploratory here as long as it estimates the whole Q-matrix (Chung, 2014). After the attributes measured by the test have been defined and identified, the next step is to construct the Q-matrix.
To form the Q-matrix in this study, the protocol (codebook) of the questions was first translated. A copy of the Grade 8 mathematics questions from TIMSS 2011, together with the attributes and the coding protocol, was then provided to 3 mathematics teachers with bachelor's degrees, who had 6, 8, and 10 years of teaching experience, respectively. They were asked to construct the Q-matrix separately and independently. The result is a two-dimensional matrix in which the columns contain the skills and the rows contain the questions, with each entry specifying by 1 or 0 whether the question measures the attribute. The attributes are explained in the Analysis section, Table 3.
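A Q-matrix of this kind can be represented as a binary array with one row per question and one column per attribute. The sketch below combines three independently coded matrices by majority vote; the majority rule is an assumption made for illustration, since the article does not state how the teachers' matrices were reconciled, and the small matrices are invented.

```python
import numpy as np

def consensus_q(rater_matrices):
    """Combine raters' 0/1 Q-matrices by majority vote (assumed rule)."""
    stacked = np.stack([np.asarray(q) for q in rater_matrices])
    votes = stacked.sum(axis=0)  # raters coding each entry as 1
    return (2 * votes > stacked.shape[0]).astype(int)

def attribute_coverage(Q):
    """Number of items that measure each attribute (column sums)."""
    return np.asarray(Q).sum(axis=0)

# Hypothetical codings: 3 items (rows) x 2 attributes (columns).
rater_1 = [[1, 0], [1, 1], [0, 1]]
rater_2 = [[1, 0], [0, 1], [0, 1]]
rater_3 = [[1, 1], [1, 1], [0, 0]]

Q = consensus_q([rater_1, rater_2, rater_3])
print(Q)
print(attribute_coverage(Q))
```

The column sums give a quick check on how many items measure each attribute, which is useful when a minimum number of items per attribute is required.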

METHOD
A quantitative research approach was used to collect data for the current research. A total of 274 Afghan students within 4 schools participated. In each classroom, 16 different classes of Afghanistan mathematics tests were assigned randomly to students.

Research Participants and Research Tool
The R package CDM was used to fit the response data. According to the opinions of the mathematics experts consulted in this research, the test contains 8 linear hierarchical attributes. It is given annually at approximately 4 test centers in high schools of Ghor province, Afghanistan. The sample comprised 274 students from different areas of Firozkoh city, 145 boys and 129 girls. The examinees were around 11 to 17 years old.
The same questionnaire was used as in Taiwan. Each student was asked to answer only one of eight booklets, and only Booklets 1, 3, 5, and 7 were used for the current study. These booklets were selected based on the criterion that each attribute analyzed in the study had to be included in at least three items. For the purpose of comparisons across subgroups, the schools were classified into rural and urban groups: some schools were located in geographically isolated village or rural areas, and others were located in the middle of Firozkoh city.

Research Analysis
Mathematics response data sets of the students in Grade 8 in Afghanistan were analyzed in this study. Students responded to multiple-choice and constructed-response items, which assessed four content domains: Number, Algebra, Geometry, and Data and Chance. The DINA model and the DINO model were used to fit the response data. The test was composed of 32 items, including 15 multiple-choice and 17 constructed-response items. There were 129 female and 145 male participants in this study.
Quantitative analyses were carried out in the process of test development and Q-matrix construction. The TIMSS 2011 eighth-grade mathematics data sets from the students of Afghanistan were analyzed together with the Q-matrix using R. TIMSS aims to improve the teaching and learning of mathematics and science by providing data on student progress in relation to different types of curricula and educational practices.
Table 3 shows the marginal probability of mastering each of the thirteen attributes. According to the results in the table, the highest probability of mastery belongs to attribute 4 (0.4836) and the lowest belongs to attributes 24 and 32 (0.12). Table 4 shows the marginal probability of mastery for each item. According to the results in the table, the highest probability of mastery belongs to item 4 (0.489) and the lowest belongs to item 21 (0.0875). Table 5 shows the guessing and slipping parameters based on the DINA model. According to Table 5, the lowest guessing parameter belongs to Item #32 and the highest to Item #4, while the lowest slipping coefficient belongs to Item #2 and the highest to Item #14. The slipping coefficient indicates the probability of responding incorrectly for those who possess the skills needed to answer the question. The smaller the guessing and slipping parameters, the better the fit between the diagnostic measurement and the empirical data in CDMs (Ravand, Barati, & Widhiarso, 2012).
The average values of the guessing and slipping parameters in the DINA model are 0.1537 and 0.3462. The mean guessing parameter indicates that students who have not mastered all the required skills for an item still have, on average, a 15.37 percent chance of choosing the correct response, and the mean slipping parameter indicates that students who have mastered all the skills required for an item still have, on average, a chance of about 34.6 percent of choosing the incorrect response. The most informative items on a test are the ones whose slipping and guessing probabilities are low (Rupp et al., 2010). Generally speaking, small guessing and slipping parameters indicate a good fit between the diagnostic assessment design, the response data, and the postulated DINA model.
Among the item parameters based on the DINA model, the lowest guessing coefficients belong to items #32 and #31, with 3.2E-110 and 9.10E-16, and the highest guessing coefficients belong to items #4 and #12, with values of 0.364 and 0.3332. These coefficients give the probability of answering the question correctly for students who do not have the skills needed to answer it. The lowest slip values belong to items #2, #24, and #32, all equal to 0, and the highest slip coefficients belong to items #14 and #10, with values of 0.7533 and 0.7315. This coefficient indicates the probability of answering the question incorrectly for students who have the skills needed to answer it.
Among the item parameters based on the DINO model, the lowest guessing coefficient belongs to item #32, with 1.08E-145, and the highest belongs to item #4, with a value of 0.3742. The lowest slip values belong to items #27, #30, and #31, all equal to 0, and the highest slip coefficient belongs to item #22, with a value of 0.7787.
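The statement that low guessing and slipping values make an item informative can be expressed as a simple calculation. The sketch below applies it to the average DINA parameters reported above; the function name `check_item` is hypothetical, and treating the difference $(1 - s) - g$ as a discrimination measure is a common convention assumed here rather than an analysis reported in this study.

```python
def check_item(guess, slip):
    """Return (monotone, discrimination) for one item.

    monotone:       True when 1 - slip > guess, the DINA order constraint.
    discrimination: (1 - slip) - guess, the gap in success probability
                    between masters and non-masters of the item.
    """
    discrimination = 1.0 - slip - guess
    return (1.0 - slip) > guess, discrimination

# Average DINA parameters reported in the text.
mean_guess, mean_slip = 0.1537, 0.3462

monotone, gap = check_item(mean_guess, mean_slip)
print(monotone, round(gap, 4))  # True 0.5001
```

On average, then, mastering all the skills required for an item raises the probability of a correct response by about 0.50 over guessing.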

DISCUSSION
This research aimed to evaluate the application of two popular core cognitive diagnosis models, the Deterministic Input Noisy "And" gate (DINA) model and the Deterministic Input Noisy "Or" gate (DINO) model, by identifying school students' mathematical abilities at Grade 8. The analysis first examined the probability of mastery for every attribute in the questionnaire. The results demonstrated that the highest probability of mastery belonged to attribute 4 (0.4836), whereas the lowest belonged to attributes 24 and 32 (0.12). A second descriptive analysis examined the probability of mastery for every item in the questionnaire. The results showed that the highest probability of mastery belonged to item 4 (0.489), whereas the lowest belonged to item 21 (0.0875). Rahimi et al. (2018) also found that most of the attributes were not mastered in each skill, except for the status of the individuals in the SUM skill. In addition, de la Torre and Lee (2010) focused on one CDM, the deterministic inputs, noisy "and" gate (DINA) model, and the invariance property of its parameters. Using simulated data involving different attribute distributions, they found that the DINA model parameters are absolutely invariant when the model perfectly fits the data. Another related study was conducted by Ravand (2016), who demonstrated the application of the G-DINA model to the reading comprehension data of a high-stakes test. The study showed that Syntax was the easiest and Inference the most difficult attribute. The second most difficult attribute was Main Idea, followed by Detail and Vocab. The same results were also found by (Grabe & Stoller; Lumley, 1993). Moreover, the findings of that study are in line with those of Baghaei and Ravand (2015), who applied the linear logistic test model to the same data.
Further, Chiu and Köhn (2015) proved that an extension to the statistical framework of the ACTCD, originally developed for test data conforming to the Reduced Reparameterized Unified Model or the General Diagnostic Model, is also valid for both the DINA model and the DINO model. Additionally, Kaya and Leite (2017) presented longitudinal models for CDMs and indicated that the proposed models provide adequate convergence and correct classification rates. Finally, Yamaguchi and Okada (2018) examined which CDMs better fit actual data by comparatively fitting representative CDMs to TIMSS 2007 assessment data across seven countries. First, CDMs were shown to have a better fit than the item response theory models. Second, main effects models generally had a better fit than other parsimonious models or the saturated models. Related to the second finding, the fit of traditional parsimonious models such as the DINA and DINO models was not optimal.
Thus, related studies show that CDMs have been applied in different contexts, such as mathematics and language. However, they also show that not enough studies have been conducted in the context of mathematics.

CONCLUSION
This research aimed to evaluate the application of two popular core cognitive diagnosis models, the Deterministic Input Noisy "And" gate (DINA) model and the Deterministic Input Noisy "Or" gate (DINO) model, by identifying school students' mathematical abilities at Grade 8. It also tried to determine the 8th-grade students' level of mathematics at the school level. The research applied the Trends in International Mathematics and Science Study (TIMSS) 2011 mathematics assessment in order to evaluate the DINA and DINO models by examining the mathematical abilities of students in Grade 8. The assessment measured 13 attributes and included 32 questions.
First, a descriptive analysis was carried out with the DINA model to show the probability of mastery for every attribute in the questionnaire. The results demonstrated that the highest probability of mastery belonged to attribute 4 (0.4836), whereas the lowest belonged to attributes 24 and 32 (0.12). Then, another descriptive analysis was carried out to show the probability of mastery for every item in the questionnaire. The results showed that the highest probability of mastery belonged to item 4 (0.489), whereas the lowest belonged to item 21 (0.0875). Second, the same analysis was carried out with the DINO model to obtain the guess and slip parameters of each item. The lowest guessing coefficient belonged to item #32, with 1.08E-145, and the highest belonged to item #4, with a value of 0.3742. In addition, the lowest slip values belonged to items #27, #30, and #31, all equal to 0, and the highest slip coefficient belonged to item #22, with a value of 0.7787. This coefficient indicates the probability of answering the question incorrectly for students who have the skills needed to answer it.
In addition, analyses in R were carried out to show the levels of guessing and slipping in the DINA and DINO models. The average values of the guessing and slipping parameters are 0.1537 and 0.3461. The mean guessing parameter shows that some participants who had not mastered all the required skills for an item still chose the correct response, and the mean slipping parameter shows that some participants who had mastered all the required skills for an item still chose the incorrect response.
The guess and slip parameters of each item were also calculated based on the DINA model. The lowest guessing coefficient belonged to item #32, with 3.19E-110, and the highest belonged to item #4, with a value of 0.364. These coefficients give the probability of answering the question correctly for students who did not have the skills needed to answer it.
The lowest slip values belonged to items #2, #24, and #32, all equal to 0, while the highest slip coefficient belonged to item #14, with a value of 0.7533. This coefficient specifies the probability of answering the question incorrectly for students who had the skills needed to answer it.