Exploring Cluster Changes in Students' Knowledge Structures Throughout General Chemistry

Chemistry is traditionally perceived as difficult to comprehend. Its mastery requires that a variety of concepts be linked to form an organized knowledge system. The connections need to be made not only between the concepts associated with the macroscopic level of the chemistry triplet but also between the submicroscopic and symbolic levels. Many factors influence a learner's success in bridging concepts between these levels. In this study, the aim was to identify and examine the changes in general chemistry students' knowledge structures by utilizing Word Association Tests. Although many studies have examined knowledge structures and aspects of the chemistry triplet, almost none has considered both at the same time. This study highlights the interconnectedness between the chemistry triplet and changing knowledge structures in overall student populations and in high- and low-achieving students. It provides insights on why students fail to understand chemistry and suggests ideas for future research as limiting factors were noted.


INTRODUCTION
Learning is influenced by the learners' prior knowledge, epistemology, and ability levels (Tyson, Venville, Harrison, & Treagust, 1997). The knowledge and abilities that students possess before instruction play an important role as a potential source of learning difficulties (Hewson & Hewson, 1983). Problems in scientific explanations, multiple meanings and representations, use of models, or terminology and language (Abraham, Williamson, & Westbrook, 1994; De Jong, Blonder, & Oversby, 2013; Ebenezer & Erickson, 1996; Haidar & Abraham, 1991; Markic, Broggy, & Childs, 2013) may all significantly hinder students' conceptual understanding of chemistry concepts. This is a problem since basic comprehension of chemistry, such as the structure of matter or bonding theory, is crucial to continued learning in advanced chemistry classes (Ebenezer, 2001). Since many students have gaps in understanding basic elements of the structure of matter and concepts related to it, they may later fail to understand more advanced topics, e.g., acid-base chemistry, electrochemistry, or chemical equilibrium (Adadan & Savasci, 2012; Gilbert, de Jong, Justi, Treagust & van Driel, 2003; Uzuntiryaki & Geban, 2005).
This study aimed to shed light on the development of students' cognitive structures related to basic elements of chemistry knowledge during a set of courses in undergraduate general chemistry using Word Association Tests (WAT) (Johnson, 1967, 1969).

THEORETICAL FRAMEWORK
Constructivist theory suggests that learning is based on the interaction between the learner and the environment as well as the construction of concepts through experience in relation to a learner's prior knowledge (Ausubel, Novak, & Hanesian, 1986). Following this theory, the results of learning are cognitive structures, or organized knowledge (Tsai, 2001), that are developed by the individual. There is, however, no single accepted definition of the term 'cognitive structure', and there is also limited information available on how these structures are formed (Taber, 2008). Nevertheless, the term is widely used in the literature and different interpretations are available (Nakiboglu, 2008; West, Fensham, & Garrand, 1985; White, 1985). Interpretations of the term 'cognitive structure' can be related to other terms like 'structural knowledge' (Jonassen & Marra, 1994) or 'knowledge structure' (Nakiboglu, 2008). The term 'knowledge structure' is subsequently used in this paper, as the authors believe that the term best represents the structures that the study aimed to visualize. Studies that use 'structural knowledge' focus on how conceptual understanding is structured as the interrelationships between concepts and related terms are made (Liu & Ebenezer, 2018), which is also our understanding of knowledge structures (Derman & Eilks, 2016). In this sense, learning can be interpreted as the formation and transformation of knowledge structures by integrating new information and experiences, which can affect the knowledge structure as a whole.
According to Nakhleh (1992), incorrect understandings (in other words, misconnections) hinder learning when the learner attempts to integrate new information into preexisting knowledge structures. Misinterpretations and misconceptions, or alternative conceptions, may arise. Because of the resilience of misconceptions, teaching for the development of scientifically reliable conceptual understanding is difficult to achieve (Othman, Treagust, & Chandrasegaran, 2008). Thus, knowledge of student misconceptions and their related knowledge structures is a valuable component of teachers' pedagogical content knowledge in designing and operating effective instruction (Magnusson, Krajcik, & Borko, 1999).
These claims hold true in general, and for chemistry education in particular, since chemistry knowledge is composed of a highly structured set of concepts that encompass each of chemistry's phenomenological-macroscopic, sub-microscopic, symbolic, and procedural representational levels (Dori & Sasson, 2008; Johnstone, 1991, 2000). The first three are widely referred to as the chemistry triplet (Johnstone, 2000), while the fourth, process, is suggested to be the way that the other levels interrelate with one another and is an intermediate between one or more of the other representations (Dori & Hameiri, 1998; Dori & Sasson, 2008).
Some of the most often mentioned concerns in advanced chemistry learning are gaps in students' understanding of the relations among the three representational levels and the resulting misunderstandings of basic chemistry concepts. The difficulties students face in solving chemistry problems are often caused by insufficiently settled concepts, e.g., deficiencies in knowledge, misconceptions, or missed connections between terms, concepts, and representational levels (Gilbert, de Jong, Justi, Treagust & van Driel, 2003; Nakiboglu, 2008; Taber, 2008). Research suggests that students must develop a high degree of abstract thinking skills to learn chemistry well (Blake & Nordland, 1978) because chemistry is full of abstract and theoretical concepts, technical language, and models, e.g., in the fields of the structure of matter (Adadan, Trundle, & Irving, 2010; Eilks, 2013; Liu & Lesniak, 2005) or bonding theory (Levy Nahum, Mamlok-Naaman, Hofstein, & Taber, 2010; Othman et al., 2008).
Since the 1980s, different methods have been suggested to research the nature of knowledge structures in science (Lee, 1986, 1988), such as concept mapping, interviews, or Word Association Tests. Çalik et al. (2005) mentioned, however, several difficulties in finding effective research methods for examining students' knowledge structures in chemistry. It even seems possible that knowledge structures change during examination, e.g., through questions and prompts given during the research.
Word Association Tests (WATs) were originally suggested by Johnson in the 1960s (Johnson, 1967, 1969) and recently became a common tool in science education research. WATs try to map concepts in student understanding and their interrelations to form knowledge structures (Bahar & Hansell, 2000; Derman & Eilks, 2016; Nakiboglu, 2008; Schizas, Katrana, & Stamou, 2013; Shavelson, 1972, 1974). WATs are a tool to examine aspects of the knowledge structure of an individual in a specific domain. The method is, however, suggested to be better used for the analysis of a large group of participants to capture a more representative structure for the targeted group (e.g., a cohort of general chemistry students). WATs are also suggested to allow insights into the structure and workings of human memory (Petrey, 1977; Thomson & Tulving, 1970).

Contribution to the literature
• The difficulty of understanding chemical concepts has been studied widely in the literature; however, few researchers have investigated the relationship between students' performance in general chemistry courses and their knowledge structures, which are difficult to visualize. This study is one of the very few studies that utilized two programs, Gephi and JPathfinder, in a way that made the structures rich and easy to understand.
• Among the studies focusing on students' knowledge structures, most rely on the structures determined in one semester or quarter. This study aimed to analyze the evolution of structures collected over an academic year for the entire general chemistry curriculum.
• Most studies generate structures based on the data collected from a small group of students, which might not reflect the majority's knowledge adequately. In this study, a total of 1914 students participated and completed all the surveys. In addition to yielding structures that represent a larger group, the high number of participants enabled the team to determine the eccentricity values for the structures, which help the reader quickly determine the central concepts in the students' minds.
Studies on the development of and change in knowledge structures of basic chemistry concepts, e.g., among students exposed to general chemistry instruction, are still rare in the literature. None relates the knowledge structures explicitly to the three representational levels of chemistry (Johnstone, 1991) and their extension by the process domain (Dori & Hameiri, 1998). Correspondingly, this study intended to answer the following questions:
1. How do undergraduate chemistry students' knowledge structures change over time after taking multiple chemistry courses in general and with reference to the three representational levels of chemistry in particular?
2. Do high- and low-achieving students' knowledge structures change differently over the course of the general chemistry series?

METHODOLOGY

Participants and Design
After approval was obtained from the Institutional Review Board, participants for this study were recruited from undergraduates enrolled in general chemistry at a public university in northern California. The numbers of students who completed each of the three surveys are as follows: Chemistry 2A, 617 students; Chemistry 2B, 541 students; and Chemistry 2C, 756 students.
In order to examine the relationship between students' achievement in the course and the knowledge structures generated, the students were categorized as high- and low-achievers. Due to differences in when the surveys had to be administered and the type of data available, high- and low-achieving students were defined differently when analyzing the data from each course. In Chemistry 2A, the only data available was each student's score on the chemistry placement test that all students were required to pass before they could enroll in the course. The test was scored out of 44. A score of 24 or above meant that a student passed the test and could enroll directly in the course; a student who obtained a lower score was required to pass a workload chemistry class before starting the chemistry series. Of the students who participated, 58.2% passed the placement test and 41.8% did not. It was decided that this was not a large enough gap to clearly reveal differences between top and bottom students. Thus, all the students' scores were sorted numerically and the 200 most extreme scores from each end were used. Gender was not considered when extracting this data. In Chemistry 2B, the high-achieving data pool was composed of students who had achieved an 'A' or a 'B' in Chemistry 2A, which was 303 survey respondents (50.3%). The low-achieving students included those who had earned a 'C' in Chemistry 2A, which was 299 students (49.7%). In Chemistry 2C, the high-achieving students were those who had obtained either an 'A' or a 'B' in Chemistry 2A as well as an 'A' in Chemistry 2B, which was 187 students (24.7%). The low-achieving students were those who had earned a 'C' in both of the previous courses, which was 183 students (24.2%). The extremes were chosen in this case in the hope that they would reveal a strong difference in the knowledge structures and because the sample sizes of the two groups were very similar.

Word Association Test (WAT) and relatedness coefficients: Qualitative to quantitative data
Three separate Word Association Tests (WAT), one for each in a series of three courses, were prepared as surveys. Each survey was offered in the penultimate week of each quarter over the span of an academic year via an online link that students could choose to complete. Participation was rewarded with extra credit. The Institutional Review Board at the university approved the study before it was conducted.
In the Chemistry 2A survey, students were asked to list the first 10 words they thought of when they read each of 9 stimulus words: atom, bonding, energy, matter, change, forces, stoichiometry, structure, and reaction. Each stimulus word was selected because of its relevance to the course; the group that selected them consisted of two chemistry education professors and three graduate students. For Chemistry 2B and 2C, the number of stimulus words increased to 13 and 17, respectively. In Chemistry 2B, the new stimuli were acid/base, solubility, equilibrium, and spontaneity, while in Chemistry 2C, electrochemistry, periodic trends, coordination chemistry, and kinetics were added. It was predicted that students would not complete the surveys if the time asked of them was significant; thus, only 5 responses were requested. This was the most significant difference across the courses. Data analysis for each survey proceeded in the same fashion.
To begin the process of generating each knowledge structure, the qualitative survey responses were transformed into quantitative data. The initial step in this process was to create lists of the 25 most popular response words for each stimulus. Each student's response was first standardized to an agreed-upon code. For example, "hydrogen," "boron," and any other element given as a response were all coded as "element" due to their similarity, since in each case students were thinking of elements, which are composed of atoms. The purpose of this coding was to streamline the data while being as mindful of student intent as possible. The codes for each response were determined in group meetings composed of one professor of chemistry education and several undergraduate students.
Once all the responses had been coded, the frequency of each response for each stimulus was determined by counting the number of times students wrote it at each response position and weighting each count by a frequency factor, because students who gave a word as their first response associated it more closely with the stimulus than students who gave the same word as their fifth response. $F_{tot}$ is a response's total frequency, $F_1$ is the number of times a word appeared as the first response, $F_2$ is the number of times it appeared as the second response, and so on. The equation used to calculate the total frequency for a single response word is shown below:

$$F_{tot} = 1 \cdot F_1 + 0.8 \cdot F_2 + 0.6 \cdot F_3 + 0.4 \cdot F_4 + 0.2 \cdot F_5 \tag{1}$$

The responses with the twenty-four highest frequencies comprised the lists used to generate the relatedness coefficients, with the stimulus itself as the first word in each list (Gulacar, 2014). It was thought that if students put a stimulus word as a response word for another stimulus, the connection between the two must be significant. However, stimuli often did not appear as responses for themselves; thus, if each stimulus had not been included in its own list in some way, valuable connections would have been lost.

The relatedness coefficients, which are a measure of the strength of the connections between stimuli, were calculated using a formula that was developed by Garskof and Houston (1963). Each place on the list was assigned a value from twenty-five, which is given to each stimulus, down to one, which is given to the twenty-fourth most common response. For each match between two lists, the values of the matching word in the two lists were multiplied together. The products of these pairs were added up and divided by the sum of the squares of the integers from 1 through 25, minus 1, which is the value that two identical lists would have. Below is an example calculation using a list of ten words instead of twenty-five. The method of calculation, however, would be the same.
Columns A and C in Figure 1 contain the stimuli followed by their responses; columns B and D contain the values assigned to each. Pairs of words are highlighted in different colors. Note the pair in yellow; bonding is the stimulus in one list and a response in the other. In this example, the student associates bonding with structure, but if the stimuli had not been assigned a value, this connection would have been lost. These sample lists generate a relatedness coefficient as calculated below:

$$RC = \frac{10 \times 8 + 5 \times 5 + 4 \times 3 + 3 \times 1 + 2 \times 7 + 1 \times 4}{10^2 + 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 + 3^2 + 2^2 + 1^2 - 1} = 0.36$$

This procedure results in the relatedness coefficient between the pair of stimulus words; the larger the value, the more similar the two lists for each stimulus word are to each other. These coefficients are transformed into visual links between stimuli.
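The weighted frequency and the relatedness coefficient described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `total_frequency` and `relatedness_coefficient` are hypothetical, and so are the two ten-word lists, which were chosen only so that their matching pairs reproduce the products in the sample calculation (the words in Figure 1 other than bonding and structure are not given in the text). Each list is assumed to contain no repeated words.

```python
def total_frequency(position_counts, weights=(1.0, 0.8, 0.6, 0.4, 0.2)):
    """Weighted frequency of one response word.

    position_counts[i] is how often the word was given as response i + 1;
    earlier positions carry larger weights.
    """
    return sum(w * count for w, count in zip(weights, position_counts))


def relatedness_coefficient(list_a, list_b):
    """Relatedness of two ranked word lists of equal length n.

    The first word (the stimulus) is valued n, the last word 1. Values of
    words shared by both lists are multiplied and summed, then divided by
    (1^2 + 2^2 + ... + n^2) - 1.
    """
    n = len(list_a)
    if len(list_b) != n:
        raise ValueError("both lists must have the same length")
    value_a = {word: n - i for i, word in enumerate(list_a)}
    value_b = {word: n - i for i, word in enumerate(list_b)}
    shared = value_a.keys() & value_b.keys()
    numerator = sum(value_a[w] * value_b[w] for w in shared)
    denominator = sum(k * k for k in range(1, n + 1)) - 1
    return numerator / denominator


# HYPOTHETICAL lists (stimulus first); not the actual survey data.
bonding_list = ["bonding", "covalent", "ionic", "electron", "polar",
                "atom", "molecule", "energy", "shape", "element"]
structure_list = ["structure", "lewis", "bonding", "shape", "geometry",
                  "atom", "element", "molecule", "crystal", "energy"]

rc = relatedness_coefficient(bonding_list, structure_list)
print(round(rc, 2))
```

With these lists the numerator is 138 and the denominator is 384, which rounds to the 0.36 obtained in the sample calculation.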

JPathfinder, R, and Gephi: Transforming quantitative data into visual data
The relatedness coefficients were entered into JPathfinder, which contains an algorithm developed by Schvaneveldt (1990) that converts relatedness coefficients from similarity values into distances. For each set of data, the RC values were arranged into upper triangular matrices that paired the stimulus words, which are now referred to as nodes in the knowledge structure, in all possible combinations. The conversion to distance values determines how far apart the nodes are in the knowledge structure: the higher the value of the RC, the more associated a pair of stimulus words is, and thus the distance between them should be shorter than that between two seemingly unrelated nodes. JPathfinder also provides eccentricity values for each node in the network: the smaller the deviation from the standard orbital, the more central the concept is in the network. The smaller the eccentricity value for a node, the more central that node is.

Distance matrices output by JPathfinder were entered as matrix objects in R, a language and environment for statistical computing. The data were transformed from a matrix to a distance object, from which a multidimensional scaling (MDS) function developed by Spekkink (2015) generated coordinates for the placement of each node in the knowledge structure. The resultant 2D coordinate system placed the nodes in proximity to each other according to the similarity conversion from JPathfinder. These dimensions were indexed to each stimulus word and entered as an edge table, along with the distance values and linkage combinations, into an open visualization program called Gephi.
The knowledge structure was analyzed using a combination of two layouts: Multidimensional Scaling (MDS) and Network Splitter 3D. The first organized the nodes by their proximity in space, and the second grouped the nodes into clusters of closely related topics. Nodes were color coded for visibility.
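The arrangement of pairwise values into an upper triangular matrix can be illustrated with a short Python sketch. The conversion rule used here, distance = 1 - RC, is an assumption made only for illustration; the text does not state the rule that JPathfinder applies, and the function name `upper_triangular_distances` and the three coefficients are hypothetical. The sketch only shows the property the text relies on: a larger relatedness coefficient yields a shorter distance.

```python
from itertools import combinations


def upper_triangular_distances(stimuli, coefficients):
    """Arrange pairwise distances in an upper triangular matrix.

    `coefficients` maps frozenset({a, b}) to the relatedness coefficient of
    that pair. Cells on or below the diagonal are left as None.
    """
    size = len(stimuli)
    matrix = [[None] * size for _ in range(size)]
    # combinations() yields every pair exactly once with row < col.
    for row, col in combinations(range(size), 2):
        pair = frozenset((stimuli[row], stimuli[col]))
        # ASSUMPTION: linear conversion; JPathfinder's rule may differ.
        matrix[row][col] = 1.0 - coefficients[pair]
    return matrix


# HYPOTHETICAL coefficients for three stimulus words.
stimuli = ["atom", "bonding", "energy"]
coefficients = {
    frozenset(("atom", "bonding")): 0.36,
    frozenset(("atom", "energy")): 0.10,
    frozenset(("bonding", "energy")): 0.25,
}
distances = upper_triangular_distances(stimuli, coefficients)
for label, row in zip(stimuli, distances):
    print(label, row)
```

In this toy matrix, atom and bonding have the largest coefficient and therefore the shortest distance, so a layout based on these distances would place them closest together.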

The extended chemistry triplet
The generated networks were further analyzed using the extended chemistry triplet. Each stimulus word was classified according to one or more levels of the extended chemistry triplet, namely the phenomenological, sub-microscopic, and symbolic levels (Johnstone, 1991) and the process domain (Dori & Hameiri, 1998), in the hope that this would reveal patterns in students' thinking. Words were categorized by collecting ideas from different perspectives (undergraduates, graduate students, and professors) and making decisions based on these differing ideas as well as on the general chemistry curriculum and how it is taught.
The top 10 responses that the 2C student population gave for each word were examined, and each classification was decided based on what the responses suggested about students' thought processes as they completed the survey. The 2C responses were chosen because they encompassed all of the stimulus words, and the stimuli that they shared had the same response words as the 2A and 2B responses, only in a slightly different order. In cases where it was determined that more than one category from the quadruplet was applicable, the most relevant classification is left un-italicized in the table.

RESULTS AND DISCUSSION
As shown in the following sections, students' knowledge structures evolved over the course of the series. As students learn new concepts, the structure of their knowledge constantly changes (Hovardas & Korfiatis, 2006). This evolution resulted in changes of proximity between the nodes in the knowledge structure: distances between the pre-existing concepts from 2A fluctuate depending on the other connections made with new concepts introduced in the series. This new knowledge then influenced the positioning of nodes within the students' network. Therefore, knowledge structures for 2B and 2C could be compared directly with those from 2A and with each other.

The Overall Knowledge Structures
Significant shifts in the positions of certain stimulus words (stoichiometry, reaction, change, forces, and energy) occurred from one course to the next as new concepts were introduced. In Chemistry 2A (Figure 2), the general student population associated stoichiometry more with stimuli such as atom and matter. In both 2B (Figure 3) and 2C (Figure 4), this concept was associated more closely with reaction and change.

Figure 2. Overall knowledge structure for 2A students with each concept marked with its relation to the four representational levels of chemistry. The pentagon represents the macroscopic, the diamond represents the submicroscopic, the square represents the symbolic, and the triangle represents the process.
Likewise, from 2A to subsequent courses, forces was also transferred to the opposite side of the knowledge structure. Initially, forces was associated with reaction, whereas in 2B and 2C this node established closer connections with bonding and structure.
While reaction, change, and energy did not shift as much as stoichiometry and forces, their positions relative to each other changed within the cluster they maintained. The distance that separated reaction and change in 2A decreased in the subsequent courses. In Chemistry 2B, this closely associated pair was not grouped with energy, which was grouped more closely with forces and thus with bonding as well. This grouping was brief; as students learned topics such as kinetics and spontaneity in 2C, energy was again associated with the "reaction" cluster.
Also, consistency was achieved between all three groups in parts of each structure. The cluster containing matter, atom, structure, and bonding remained relatively the same in each knowledge structure, except for the small changes in distances between concepts as students made more connections with knowledge gained in each course.
It is interesting to note that, for the most part, students incorporated new concepts from 2B and 2C into already-existing clusters. The exceptions to this are acid/base and solubility in 2B and electrochemistry in 2C. Acid/base and solubility were unclustered in 2B, the class in which they were introduced. In 2C, they shifted closer together, although they remained unassociated with any other cluster. Electrochemistry, which appears only in the 2C graph, formed no close associations with any other topics.

Figure 3. Overall knowledge structure for 2B students with each concept marked with its relation to the four representational levels of chemistry. The pentagon represents the macroscopic, the diamond represents the submicroscopic, the square represents the symbolic, and the triangle represents the process.

Figure 4. Overall knowledge structure for 2C students with each concept marked with its relation to the four representational levels of chemistry. The pentagon represents the macroscopic, the diamond represents the submicroscopic, the square represents the symbolic, and the triangle represents the process.

The extended chemistry triplet
For further analysis, all of the stimulus words were classified as representing one of the three representational levels of chemistry (macroscopic, submicroscopic, and symbolic; Johnstone, 1991) or the process domain (Dori & Hameiri, 1998). Although all the words can be described in any category at any level, the authors decided to categorize them based on the students' top 10 response words in the hope of better understanding students' thought processes. Johnstone (1991) has suggested that one of the reasons students might have difficulty solving chemical problems is that they fail to understand concepts and terms at more than one representational level, while sophisticated chemical thinking often requires thinking about concepts at more than one level. The analysis indicates that most of the time, students can relate each stimulus word to only one level. Table 1 shows all of the stimulus words with their chemistry triplet classifications.
The level of analysis was deepened visually by marking each stimulus word on each knowledge structure using a symbol to indicate each representational level of chemistry. In cases where more than one of the chemistry triplet categories was determined to be relevant, the larger the symbol, the more directly correlated that category was determined to be to the stimulus word in question.
The knowledge structure for 2A students (Figure 2) shows a rough pattern of clustering, but no single statement can be made about any of the categories of the representational levels of chemistry. Both poles of the structure contain stimuli that were identified across all areas of the representational levels of chemistry. The left portion of the structure is roughly submicroscopic, although forces, which is primarily submicroscopic, is on the opposite side of the structure. The right side of the network contains most of the macroscopically identified words, although matter, which sits on the opposite side of the graph, is also primarily macroscopic.
The sole word classified as symbolic, stoichiometry, exists in its own cluster. The two process-identified words, change and reaction, appear on the right side of the graph as part of a larger cluster.

The structure of 2B students (Figure 3) shows a stronger pattern of clustering among the submicroscopic-classified stimuli than the 2A structure, while the macroscopic stimuli became more spread out. Note that the placement of forces has shifted from the right side of the structure to the left, where it forms a cluster with other submicroscopic words. When compared to the 2A structure, the macroscopically identified matter has shifted further away from the submicroscopic cluster on the left. The process words all form a part of the same cluster, albeit one that is not composed solely of them. Stoichiometry also became associated with this cluster. However, when looking more closely at this cluster, a high degree of variation in classifications can be seen. Reaction, equilibrium, and change, which are all process- and macroscopically-identified, are clustered with one macroscopic word (spontaneity) and one symbolic one (stoichiometry). This pulling-in of stoichiometry is particularly noteworthy when compared to the 2A structure. This variation and the shifting placement of stoichiometry raise the question of how these stimuli are interacting with each other: Do students truly associate them with one another more strongly than they do with the other stimuli because of their shared classification? Or are they separate sub-clusters whose proximity to [...] The process category could be significant here because the process level is defined as something in between other representations and not necessarily as its own defined category (Dori & Sasson, 2008). Thus, it could make sense that its cluster is more varied.
The 2C structure (Figure 4) shows a much more consolidated cluster of submicroscopic stimuli, again located on the left side of the structure. A cluster of macroscopic topics continued to be present on the right and has also become more defined, but, as with the 2B graph, more of the macroscopic-identified stimuli continued to be spread elsewhere on the structure. However, the symbolic stimuli were more widespread across the structure compared to that of the 2B students.
The stimuli introduced in the 2C WAT could be part of the explanation for this. In contrast to the stimuli introduced in 2B, most of which were macroscopic, two of the four new words in the 2C WAT (periodic trends and coordination chemistry) were presented primarily symbolically, with a third (electrochemistry) presented as partially symbolic. However, these did not form a new symbolic cluster and instead were more associated with other, already-existing clusters. Students did not associate them at the symbolic level; instead, the most meaningful connections they created were with submicroscopic and macroscopic topics, and thus these new words in 2C were incorporated into pre-existing clusters. The most significant example of this is coordination chemistry, which students closely associated with structure and thus was tightly incorporated into the submicroscopic cluster.
It was also noted that stoichiometry, in contrast to the other symbolically identified stimuli, pulled out of the cluster on the right and became associated with no other stimulus, as was observed in the CHE 2A structure.

Eccentricity Values
In order to enrich the interpretations of the knowledge structures and reveal what concepts the students determined to be truly central, eccentricity values for each knowledge structure were determined. When the similarity relationships were put into the MDS coordinate system, the nodes were thrown randomly into space and related to one another based on distances found in JPathfinder. Therefore, we cannot take the central concept to be in the exact center of the Gephi visualization, as it behaves analogously to a magnetic field where nodes attract and repel each other according to other nodes' distances from one another (Spekkink, 2019). Eccentricity data provides the most useful means to find the concept(s) that students access within their knowledge structure to understand their current general chemistry course. The higher the eccentricity value, the more extraneous the stimulus is to the rest of the structure. Thus, according to the JPathfinder output, central concepts have the lowest eccentricity value among all stimuli, while concepts with the highest eccentricity values were those that students deemed to be the least related to other concepts. Table 2 lists the eccentricity values for each stimulus word, with the central concepts marked with an asterisk.
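The idea of reading centrality from eccentricity can be sketched as follows. The sketch assumes the standard graph-theoretic definition, in which a node's eccentricity is the number of links on the shortest path to the node farthest from it; the text does not state how JPathfinder computes its values, so this is an assumption. The function names and the small network below are hypothetical and are not one of the knowledge structures in this study.

```python
from collections import deque


def eccentricities(edges):
    """Eccentricity of every node in a connected, undirected network."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    result = {}
    for start in adjacency:
        # Breadth-first search gives the link count to every other node.
        hops = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    queue.append(neighbor)
        if len(hops) != len(adjacency):
            raise ValueError("network is not connected")
        result[start] = max(hops.values())
    return result


def central_nodes(edges):
    """Nodes with the lowest eccentricity, i.e., the network center."""
    ecc = eccentricities(edges)
    lowest = min(ecc.values())
    return sorted(node for node, value in ecc.items() if value == lowest)


# HYPOTHETICAL links between stimulus words, for illustration only.
edges = [("matter", "atom"), ("atom", "energy"), ("bonding", "energy"),
         ("energy", "reaction"), ("reaction", "change"), ("forces", "energy")]

print(eccentricities(edges))
print(central_nodes(edges))
```

In this toy network, energy can reach every other word within two links and is therefore the center, while matter and change sit at the periphery with the highest eccentricity.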
The central concept for each knowledge structure, as determined by JPathfinder, remains largely consistent throughout the general chemistry series. The central concepts for the overall courses are as follows: in 2A, the centers were energy and forces; in 2B and 2C, the center was energy.
It is interesting to note that, based on the eccentricity data, forces begins as a central concept in 2A. By 2C, however, it has become one of the least central concepts. This can also be explained by curriculum changes, as intermolecular or intramolecular forces are not highly emphasized in 2C. There is a greater emphasis on kinetics, reaction mechanisms, nuclear energy, and organic chemistry, and it is possible that students forget the underlying principles for the reactions and interactions happening at the particulate level (Johnstone, 1991). Another notable aspect of this data is how similar the centrality data for 2B and 2C are. With the exception of bonding and forces, the eccentricity value of each stimulus is the same in both courses. This reflects the lack of changes seen in the knowledge structures between 2B and 2C and provides more support for the idea that upon transitioning from 2B to 2C, students did not change their existing knowledge structures so much as add in new concepts.

High- and Low-Achieving Students
A new set of knowledge structures was generated for the high- and low-achieving students, who were identified based on their scores on the Chemistry Placement Test and their grades in CHE 2A, 2B, and 2C. These structures were compared to determine the characteristics of high- and low-achieving students' knowledge structures. However, contrary to the findings reported in other studies utilizing concept maps (McGowen, 2013; von der Heidt, 2014), the analysis of clusters in these structures indicated no significant difference between them. One explanation might be that knowledge structures develop similarly but are used more effectively by higher-achieving students. Another reason could be that the WAT assessment on its own is not able to reveal how students utilize their knowledge and does not account for the presence of a gap in student ability. It may be useful to compare students' knowledge structures to data on student performance in solving algorithmic and conceptual problems. One final reason for this lack of difference between high- and low-achieving students' structures could be related to the limitations of how Gephi determines the clusters. Gephi uses the relative distances of the nodes (stimulus words) from each other to create clusters (Spekkink, 2019). While doing this, it is possible that the program ignores certain variations in the distances and still groups the words of the low-achieving students in the same way as it groups those of the high-achieving students.
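The last point, that a clustering step can absorb differences in the underlying distances, can be illustrated with a minimal sketch. The grouping rule used here (link any two words whose distance is at or below a cutoff, then take the connected groups) is a deliberately simple stand-in and is not Gephi's clustering procedure. The distance values, the cutoff, and the function name are hypothetical.

```python
def threshold_clusters(distances, threshold):
    """Group nodes into connected components, linking pairs with distance <= threshold.

    This is a simplified stand-in rule for illustration, not Gephi's algorithm.
    """
    nodes = sorted({node for pair in distances for node in pair})
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for (a, b), distance in distances.items():
        if distance <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical pairwise distances for two groups of students.
high_achieving = {
    ("energy", "forces"): 0.20, ("energy", "bonding"): 0.30,
    ("forces", "bonding"): 0.25, ("energy", "rate"): 0.80,
    ("forces", "rate"): 0.90, ("bonding", "rate"): 0.85,
}
low_achieving = {
    ("energy", "forces"): 0.35, ("energy", "bonding"): 0.45,
    ("forces", "bonding"): 0.40, ("energy", "rate"): 0.70,
    ("forces", "rate"): 0.75, ("bonding", "rate"): 0.65,
}

print(threshold_clusters(high_achieving, 0.5))
print(threshold_clusters(low_achieving, 0.5))
```

Although every distance differs between the two hypothetical groups, both produce the same clusters, because only the position of each distance relative to the cutoff matters under this rule.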

LIMITATIONS
As the Gephi-generated concept maps are not physical representations of students' minds but visually represented mathematical networks used for analytical purposes, the number of students who participated in the study had a direct effect on the structures.
The subset of students who participated in the WAT was one limitation. This is significant because of the way the WAT is analyzed. As the number of students who participate in the survey decreases, the frequencies of each word become smaller and more words with the same frequency occur.
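The effect of sample size on response frequencies can be sketched with a small example. The response words and counts below are hypothetical and are not drawn from the study's WAT data. The sketch only shows how ties in frequency become more common when fewer students respond.

```python
from collections import Counter


def tied_words(responses):
    """Count distinct response words whose frequency is shared with another word."""
    word_counts = Counter(responses)
    # How many distinct words occur with each frequency value.
    frequency_counts = Counter(word_counts.values())
    return sum(1 for count in word_counts.values() if frequency_counts[count] > 1)


# Hypothetical responses to a single stimulus word.
full_sample = (["heat"] * 5 + ["work"] * 4 + ["kinetic"] * 3
               + ["bond"] * 2 + ["light"] * 1)
small_sample = ["heat", "heat", "work", "kinetic", "bond"]

print(tied_words(full_sample))   # every word has a distinct frequency
print(tied_words(small_sample))  # several words share the same frequency
```

With the larger hypothetical sample, the responses can be ranked unambiguously. With the smaller one, three of the four words occur equally often, so their ordering carries no information.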
Another limitation presented itself across all surveys, but it was especially apparent in the 2C survey because of its length. More students started the survey than completed it, and each WAT was structured so that 2A words were listed first, then 2B, then 2C. If a student chose to stop participating halfway through, or wrote in answers that were not meaningful, the later stimuli were affected negatively.

CONCLUSIONS
This study revealed the evolution of the knowledge structures of undergraduate students enrolled in general chemistry. Change and progress in the development of students' knowledge structures were identified through Word Association Tests (Johnson, 1967, 1969). Over the course of general chemistry learning, networks become richer in connections as students are exposed to more concepts and because more stimulus words can be provided in each WAT.
Evolution of a general chemistry student's knowledge structure is evident in the changes in the organization and placement of key concepts within the structure over the course of the student's education. Comparison of the location of each node and its proximity to other closely associated nodes across the Chemistry 2A, 2B, and 2C knowledge structures reveals aspects of students' comprehension of learned topics and their relation to one another after completion of the series. The knowledge structures in Figures 2 to 4 indicate that students' knowledge becomes more networked over time, but also that certain concepts are reinterpreted and connected differently to one another after certain phases of instruction.
The interpretation of knowledge structures paired with the different representational levels of chemistry, namely the macroscopic, sub-microscopic, and symbolic levels (Johnstone, 1991) and the process domain (Dori & Hameiri, 1998), adds a new thought to the literature. However, it became clear that students do not simply form separate clusters for the different representational levels. It is also clear that students connected certain stimuli to one representational level, whereas they found other stimuli to be simultaneously represented at multiple levels. Nevertheless, it seems they do not recognize that all the given stimuli have to be considered and interpreted on all relevant representational levels, as suggested by Johnstone (1991) for forming an expert view of chemical knowledge.
Further applications of WATs might take this association of stimuli with representational levels into better consideration when studies and corresponding tests are designed. Different stimuli can be given that more clearly represent one of the representational levels of chemistry or the process domain. Teaching may do the same when concepts are introduced in class.
The study did not identify any significant differences between high- and low-achieving students' knowledge structures. It is not fully clear whether there are specific reasons for the similarities observed in the structures or whether a limitation of the study prevented the authors from determining the differences. Further research should examine this point, either with a different and purposefully selected sample, with a revised WAT, or with techniques such as interviews, WATs combined with concept maps, or WATs combined with think-aloud approaches.