Effects of Grade Level and Object Size on Students’ Measurement Estimation Performance

The current study examined the effects of grade level and object size on the ability to estimate the measurement of objects. Fifth- (n = 198) and sixth-grade (n = 208) students and freshman undergraduate students (n = 71) from Taipei city, Taiwan, participated in the study. The findings indicated a tendency for measurement estimation ability to increase with grade level. Overall, estimation performance was most accurate with medium-sized to-be-estimated objects (TBEOs), while estimates of small-sized TBEOs were more accurate than those of large-sized TBEOs. A tendency to underestimate the measurements of large-sized TBEOs was observed in all groups. Those with good estimation abilities showed a preference for using body parts and convenient objects as references. Moreover, the integration of measurement units constructed from previous experience and eyeballing was an essential skill used by good estimators. Suggestions for measurement estimation instruction are discussed.


INTRODUCTION
Measurement estimation is an important domain of mathematics education (Ministry of Education [in Taiwan], 2010; National Council of Teachers of Mathematics [NCTM], 2006). Skills in measurement estimation aid students in speedily judging the reasonableness of an estimated answer obtained by physical measurements or in efficiently completing tasks involving measurements (Gooya, Khosroshahi, & Teppo, 2011). Nevertheless, students tend to perform poorly on assessments of measurement estimations, with close to or fewer than half of participants achieving a criterion level of reasonable accuracy (e.g., item 2013-4M3 #10 M1461E1, National Center for Educational Statistics, 2013).
Researchers have argued that grade (level of schooling) (Tretter, Jones, Andre, Negishi, & Minogue, 2006) and object size (Jones, Tretter, Taylor, & Oppewal, 2008) are influential factors affecting student performance in measurement estimation. Some researchers (e.g., Swan & Jones, 1980) have found a positive association between grade and estimation performance. However, this association has not been observed consistently across studies on elementary school students (e.g., Desli & Giakoumi, 2017; Montague & Van Garderen, 2003; Pike & Forrester, 1997). One might argue that such inconsistency could result from specific factors such as the curricula (Montague & Van Garderen, 2003) and instruction (Desli & Giakoumi, 2017; Pike & Forrester, 1997) provided in school mathematics. Since measurement estimation is included in mathematics curricula in many countries (e.g., Ministry of Education, 2010, 2018; NCTM, 2006; Ruwisch & Huang, 2018), it is worthwhile to examine closely the influence of grade on students' measurement estimation performance.
There is some evidence from studies in the science field that object size influences estimation accuracy. For example, students from the fifth to 12th grades were found to accurately estimate the size of objects between 10 cm and 10 m, but their accuracy declined for both bigger (e.g., 1 billion meters) and smaller (e.g., 1 micrometer) objects. Such results are also evidenced in Jones, Tretter, Taylor, and Oppewal's (2008) data collected from novice and experienced teachers. Nevertheless, it remains unclear whether estimation performance differs between elementary school students and undergraduate students, who are at a higher education level, when they estimate the measurements of to-be-estimated objects (TBEOs) of different sizes. This is particularly so for TBEOs of sizes that are likely to be regarded either as mathematics-based or as classroom objects (1 cm through 1 m).
Students often over- and under-estimate object measurements (Jones, Forrester, Gardner, Andre, & Taylor, 2012), and their over- and under-estimations may differ depending on the measurement attributes. For instance, young elementary school students (Forrester, Latham, & Shire, 1990) and middle school students (Jones et al., 2012) were found to underestimate length measurements but overestimate area and volume measurements. In contrast, Forrester and Shire (1994) found that underestimations of volume by students increase as the volume of the TBEO increases. Despite the inconsistency of the findings, an understanding of students' errors may help diagnose significant ways of mathematical thinking and strategies (Ryan & Williams, 2007). Given that errors occur frequently when estimating the measurements of both large-sized and small-sized objects (Jones et al., 2008), examining whether student error patterns differ when they estimate the measurements of TBEOs of different sizes is essential for gaining insights into students' difficulties in estimating measurements.
Previous studies (Joram, Subrahmanyam, & Gelman, 1998; Montague & van Garderen, 2003) have shown that students who are skilled in measurement estimations tend to use effective strategies (e.g., benchmark comparison) for reaching reasonable estimates. However, the types of strategies that skilled students employ for estimating the measurements of school objects of various sizes have not been thoroughly explored. In addition, information on good estimators' thinking about their use of estimation strategies is lacking.
The current study aimed to explore the effects of grade and object size on student performance in the measurement estimations of classroom objects. Furthermore, student error patterns were inspected for estimating large-sized and small-sized TBEOs, including length, area, and volume estimations. Another purpose of the study was to investigate strategies used by students who perform well in measurement estimation and to understand successful mathematical thinking while using these strategies.

Mathematical Thinking Involved in Measurement Estimation
Measurement estimation is a mental process of determining the measurement of an attribute of an object without tools (Joram et al., 1998). The idea of measurement estimation is closely related to conceptual understanding of physical measurement because visualization, mental comparisons, and operations of units are common requirements (Joram et al., 1998; Joram, Gabriele, Bertheau, Gelman, & Subrahmanyam, 2005). Thus, both perceptual and inferential abilities are required for the successful estimation of measurements (Forrester et al., 1990). Perceptual abilities include an accurate visualization of the magnitude of objects, while inferential abilities encompass knowledge of the relative magnitude of numbers and measurement units, which pertains to benchmark knowledge (e.g., the use of measure unit), and computational operations. These skills include proportional reasoning, which requires an understanding of multiplicative relationships, and spatial abilities such as the construction of spatial imagery for processing mental representations (Jones, Taylor, & Broadwell, 2009; Jones et al., 2012).
The accurate estimation of length, area, and volume measurements of a three-dimensional (3-D) object is crucial for understanding the physical world. The capability of estimating these spatial measurements facilitates the development of a sense of space occupied by objects (Jones et al., 2012). Furthermore, the measurement of length, area, and volume varies in complexity, and therefore the processes of estimating these attributes differ (Jones et al., 2012). For example, to estimate a length, students across grades tend to imagine iterating a unit of length with an object and then report the estimated length (Lehrer, 2003). Compared to the estimation of length, the estimation of area (or volume) of a two-dimensional (2-D) (or 3-D) object requires a higher level of proportional reasoning (Jones et al., 2012) and spatial thinking. Thus, students tend to be better skilled in the estimation of length measurement than of area (Pike & Forrester, 1997) or volume (Huang, 2016).

Contribution to the literature
• The current study examined effects of grade level and object size on students' measurement estimation ability via mixed methods. How object size impacts students' estimation performance and the strategies used by students who were skilled in estimation across the fifth- and sixth-grade groups and undergraduates were inspected.
• Students' estimation ability grew with the increase in grade level. Overall, students performed the best on estimating medium-sized objects. They were better able to estimate small-sized than large-sized objects. All grade groups tended to underestimate the measurements of large-sized TBEOs. Good estimators showed a preference for using body parts and convenient objects as references for making estimations.

Effect of Grade on Measurement Estimations
The role of grade in students' measurement estimation performance has attracted considerable attention. Pike and Forrester (1997) examined 62 students aged 6-11 using two sets of tasks presented on a laptop computer screen, including a textbook format and a story context corresponding to the textbook format. The results revealed no significant grade effects on estimation performance in either length or area estimation. Despite the results, all the students across grades performed better on the textbook-format tasks, which were close to the classroom-based activity, than on the other set of tasks. This suggests that students' prior classroom-based experience of measurement estimations in the textbook context may have had some influence on their estimation performance.
In another study, Montague and Van Garderen (2003) found that the fourth graders, who received the mathematics curricula reflecting the NCTM standards, performed as well as the sixth graders, who did not receive the same mathematics curriculum. Furthermore, in a recent investigation on Greek students' ability to estimate lengths, Desli and Giakoumi (2017) found no significant grade differences in the performance of the third and fifth graders when metric units were used. Although the third and fifth graders were introduced only to metric units in the Greek mathematics curriculum, grade effects were only observed when nonstandard units were used. Desli and Giakoumi argued that an inadequate instruction of metric units may constrain the progress of students' competence in measurement estimation.
In contrast, Tretter, Jones, Andre, Negishi, and Minogue (2006) described the importance of learning experience for constructing an understanding of object size from extremely small to very large based on the data of students' performance in estimating scientific phenomena. The results revealed that students from elementary school through doctoral programs universally held distinct conceptual categories of scale and used different unit sizes as references for objects of various sizes. For instance, elementary school students used "centimeter" as a measurement unit for small-sized objects (e.g., textbooks) and "meter" (or body length) as a unit for large-sized objects (e.g., an elephant). Tretter et al. also claimed that students' thinking about size demands an understanding of concepts of scale. Greater knowledge of measurements and experiences in direct and indirect comparisons may help facilitate the development of scale concepts, which in turn improves estimation abilities. It has also been suggested that students' estimation ability can be developed through measurement knowledge and physical measurement skills learned in elementary mathematics, including concepts of measurement units, comparing objects and sizes, unit conversion, and the use of estimation strategies. Tretter et al. stated that, as learning experience increases, the ability to use proportional reasoning and visual-spatial skills improves gradually with the enrichment of physical experience of visualizing scale and experience of the mental operation of measurement units. This in turn helps students make progress in conceptualizing the size of objects and scale (Jones et al., 2008).

Influence of Object Size on Measurement Estimation
Forrester, Latham, and Shire (1990) examined the influences of object size on elementary school students by using various estimation tasks in which the lengths of the measure units and the lengths of the TBEOs used in the area and volume tasks were both less than 30 cm. The findings revealed that object size influenced the performance of young students aged 5-8 (N = 70) on the estimation of area and volume tasks, but not on length estimation tasks involving the lengths of steps, jumping, and lying down. In the science field, the conceptions of spatial scale (distance) of 215 participants, ranging from the fifth to 12th grades as well as doctoral students, have been surveyed. The findings revealed that across grades students used one or more unit(s) as quick mental reference(s) but they differed in their ability to estimate measurements of objects of different sizes. When the sizes of the TBEOs were close to the range of human scale such as body length, which is a convenient unit for making comparisons, the estimates were accurate. However, the estimation accuracy decreased when the TBEO size was either very small or very large.
Moreover, Huang, Heinze, Ruwisch, Hoth, and Chang (2019) examined 240 seventh to ninth graders' ability to estimate lengths by using a length estimation assessment in which the sizes of the TBEOs were between 1 mm and 1 m. The results revealed that all the students performed better in the context where the TBEOs were not small (> 12 cm) but were touchable than in the small-object contexts (≤ 12 cm) with touchable and untouchable objects. Taken together, object size seems to be an influential factor in students' measurement estimation performance. Despite discrepancies in the definitions of object sizes between studies, the length of objects provided in elementary mathematics textbooks and tasks was commonly between 1 mm and 100 cm (Desli & Giakoumi, 2017; Forrester & Shire, 1994; Huang et al., 2019).

As to the effect of object size on measurement estimation, some researchers (Joram et al., 1998) have described the different mathematical skills needed for processing the estimations of TBEOs of various sizes, and suggested that the complexity of processing skills may influence estimation accuracy. Joram, Subrahmanyam, and Gelman (1998) suggested that the use of units and unit iteration is the most frequent process for estimating the measurements of an object. Thus, the size of a TBEO may affect the accuracy of an estimate which involves calculations of the number of measure units used. For estimating the measurement of a large-sized object, for example, multiplicative thinking and complicated computations involving more measurement units are needed, and therefore, bigger estimation errors are likely to be made. In contrast, it has been indicated that for estimating the measurement of a small-sized object with the naked eye, the ability to accurately compare a measure unit with a TBEO and an understanding of fractions (e.g., a fraction of a meter) are required for estimation accuracy. Thus, in addition to unit comparison, the complexity of mathematical skills (e.g., understanding of fractions, proportional reasoning, and calculation) required for estimating the measurements of objects may vary depending on the size of the TBEOs. The more complex the mathematical skills needed for processing, the greater the errors that are likely to be made.

Over- and Under-estimates of Measurements
Estimation error patterns (underestimations and overestimations) may vary with the measurement attributes (length, area, and volume) that are altered. Forrester et al. (1990) found that young elementary school students were prone to underestimate length, while students who had learned multiplication tended to use multiplication and overestimate both area and volume. This tendency has been replicated in the findings of Jones et al.'s (2012) study on 39 middle school students' performance of solving length estimation tasks (e.g., dowel rods and a line drawn on paper).
Based on the performance of 67 elementary school students, Forrester and Shire (1994) showed that when a single dimension increased in size, the estimates remained mostly correct. However, an increase in two or three dimensions resulted in an increasing tendency to underestimate. This tendency was particularly strong in younger students (aged 8-9, n = 24). In contrast, the older students (aged 10-11 years, n = 43) were able to compensate for an increase in more than one dimension. On the one hand, these findings suggest that grade may play a role in estimation error. On the other hand, object sizes may affect whether students make under- or overestimations. Jones et al. (2008) investigated 16 experienced science teachers' and 50 novice teachers' concepts of spatial scale. The results showed that teachers more frequently overestimated the size of objects on the small scale than on the large scale, while tending to underestimate the sizes of objects on the large scale compared to the small scale. Jones et al. argued that estimation errors may result from inadequate knowledge of scale for different levels and less experience of various size scales.

Strategies for Measurement Estimation
The use of effective strategies helps reach a reasonable estimation accuracy (Jones et al., 2012; Joram et al., 2005). Hildreth (1983) classified the strategies used in the estimation of length and area measurements by 72 students ranging in grade from fifth grade to college freshmen. Strategies were classified as appropriate or inappropriate. The appropriate strategies, which led to more accurate estimations, included the use of benchmarks, prior information about the TBEOs or measurement units, and area formulas. Inappropriate approaches, which resulted in poor estimations, included the use of unsuitable measurement units, guessing, and incorrect procedures for estimating area measures.
Likewise, Montague and van Garderen (2003) reported that estimation performance was associated with the level of sophistication of estimation strategy. Good estimators tended to use sophisticated strategies, which were similar to the type of appropriate strategies categorized by Hildreth (1983), for example, using benchmarks. In contrast, the less skilled estimators preferred to use less sophisticated strategies, which were similar to the inappropriate strategies in Hildreth's (1983) study, such as using unsuitable units and guessing.
In addition, eyeballing, which means relying on visualizing only, is also a strategy employed by elementary school students for making measurement estimations. Typical descriptions include "I use my eyes" and "I just looked at it and knew." Visualizing, which is needed for students with normal sight while making estimations, is a perceptual-based approach rather than a use of benchmarks or a justified strategy (Desli & Giakoumi, 2017). Thus, the use of eyeballing is labeled as a less sophisticated strategy based on Montague and van Garderen (2003). Given that the use of appropriate strategies is associated with estimation performance, investigating the strategies used by good estimators and exploring the ideas that good estimators hold when using eyeballing may aid researchers' understanding of how effective thinkers organize measurement knowledge and use strategies related to expertise.

Research Questions and Hypotheses
The study addressed the following four questions:
1. How do grade and object size influence student performance on measurement estimation?
2. For each grade group, are there differences in the frequency of overestimations and underestimations of the measurement of large-sized objects?
3. For each grade group, are there differences in the frequency of overestimations and underestimations of the measurement of small-sized objects?
4. What strategies are used by good estimators across grades for estimating the measurements of objects?
In the present study, for each grade group, the estimation errors were inspected through students' estimated answers to the length and area tasks involving large-sized objects, respectively. Similarly, the estimation errors were examined through students' answers to the length, area, and volume tasks involving small-sized objects, respectively. Three hypotheses were tested. The first postulates that there is an interaction between grade and object size in measurement estimation performance. The second states that there are differences in the overestimation and underestimation patterns in the specific measurement estimations of the large-sized objects in each grade group. The third states that there are differences in the overestimation and underestimation patterns in the specific measurement estimations of the small-sized objects in each grade group.

METHOD

Participants
A total of 477 students from three grade groups participated in the study: the fifth graders (n = 198, 101 boys and 97 girls) with a mean age of 11.08 years (M = 133 months, SD = 3.40); the sixth graders (n = 208, 98 boys and 110 girls) with a mean age of 12.18 years (M = 146.20 months, SD = 5.27), and the first-year undergraduate students (n = 71, 15 male and 56 female) with a mean age of 19.31 years (M = 231.72 months, SD = 14.61). The elementary school groups were recruited from 11 public elementary schools in Taipei city, Taiwan. The undergraduate students were recruited from two departments of the Education College of a public university in Taipei city.
Due to the coordination between testing schedules and academic calendars, the data from the undergraduate group were collected approximately two months earlier than those collected from the elementary school groups. After general testing, one-on-one interviews were conducted with 26 students who were identified as good estimators, including 14 fifth graders, nine sixth graders, and three undergraduate students. The process for identifying good estimators is described in the following section.

Instrument
A 17-item paper-and-pencil assessment developed by Huang (2016) was used to collect data. The assessment included six length estimations, six area estimations, and five volume estimations. The item formats were fill-in-the-blank, drawing a line according to a prescribed length, and multiple-choice, which required judgements on the reasonableness of answers (see Appendix A).
The estimation assessment included three sections categorized by the size of the TBEOs. (1) The small-sized (S) section. The S section contained six items in which the TBEOs had lengths (for length tasks) or side lengths (i.e., the length of one dimension of the 2- or 3-D TBEOs) between 1 and 10 cm. The section included one length item, two area items, and three volume items. (2) The medium-sized (M) section. The M section consisted of five items in which the TBEOs had lengths or side lengths between 11 and 50 cm. This section included three length items and two volume items. (3) The large-sized (L) section. The L section contained six items in which the TBEOs had lengths or side lengths between 51 and 100 cm. The L section included two length items and four area items.
According to Huang's (2016) study, the content validity of the estimation assessment instrument was examined by a panel of five experts, comprising mathematics researchers and educators. The split-half reliability of the assessment was 0.74 when tested with a sample of students in the fifth and sixth grades (N = 201). The average values of task difficulty of the S, M, and L sections were 0.45, 0.51, and 0.38, respectively.
The estimation assessment was presented on an A3-sized worksheet and could be completed in approximately 40 minutes. The characteristics of the estimation items, TBEOs, size sections, and answers for the estimation assessment are presented in Appendix B.
The TBEOs (or benchmarks prescribed) were presented in three ways as follows: (1) Pictorial presentation. The TBEOs were presented as pictures with (Q15) or without (Q1, Q2, Q7) a physical demonstration of the benchmarks prescribed. (2) Physical presentation. The objects were presented in two ways: (a) both the TBEOs and benchmarks prescribed (Q5-1, Q5-2, and Q11) or (b) the TBEOs only (Q3, Q4-1, Q6, Q8-1, Q10, Q12-1, and Q14). (3) Other presentations. Neither a TBEO nor a benchmark prescribed was presented in this category. Examples include drawing a straight line according to a given length (Q13), drawing a rectangle (or square) that matched a given area measurement (Q16), and finding one classroom object with an area of 1 m² (Q9). The majority of the TBEOs and some benchmarks prescribed were statically presented on the front desk or blackboard in the classroom. Students were permitted to touch, but not move, the objects under the supervision of a trained research assistant.
To collect the strategies used for estimating the length, area, and volume measurements, three open-ended questions required students to write down the estimation methods they used. The three questions included one each for the length (Q12-1) and area (Q8-1) estimations of large-sized objects, and one for the volume estimation of a medium-sized object (Q4-1).
Good estimators (see definition below) were asked the following question at a follow-up interview: "What methods do you use most frequently for estimating the length or area or volume of an object?" The interviewees' responses were audio-taped and transcribed for analysis.

Scoring, Identification of Good Estimators, Classifications of Estimations, and Strategies
Scoring. Each item was scored from 0 to 2 points according to the percent error of the estimate, with reference to Swan and Jones' (1980) and Coburn's (1987) suggestions. A score of 2 indicated a "reasonable" estimate within ±10% of the actual value, whereas a score of 1 indicated an "acceptable" estimate between ±10% and ±25% of the actual value. An estimate more than 25% above or below the actual value was considered "inappropriate" and given a score of 0. All of the items were evaluated using these percent-error values, except the item (Q9) referring to the classroom object of 1 m² in area.
The position of a target item and the distance of the target item from the estimator may affect the accuracy of an estimate (Sternberg & Sternberg, 2017); therefore, the percent-error values for the item (Q9) referring to the classroom object were extended as in Huang's (2016) study as follows: 2 points were given for an estimate within ±25% of the actual value, while 1 point was given for an estimate between ±25% and ±50% of the actual value. If an estimate was more than 50% above or below the actual value, the item was scored as 0. The total possible score of the entire measurement estimation assessment was 34 points. The sub-total scores of the S, M, and L sections were 12, 10, and 12 points, respectively.
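The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the instrument's actual scoring procedure: the function names `percent_error` and `score_item` and the `extended` flag are hypothetical, and treating an error that falls exactly on a band boundary as belonging to the higher score is an assumption.

```python
def percent_error(estimate: float, actual: float) -> float:
    """Signed percent error of an estimate relative to the actual value."""
    return (estimate - actual) / actual * 100.0


def score_item(estimate: float, actual: float, extended: bool = False) -> int:
    """Score one item as 2 (reasonable), 1 (acceptable), or 0 (inappropriate).

    extended=True applies the wider Q9 bands (25% and 50%) instead of
    the standard bands (10% and 25%).
    """
    inner, outer = (25.0, 50.0) if extended else (10.0, 25.0)
    error = abs(percent_error(estimate, actual))
    if error <= inner:
        return 2
    if error <= outer:  # assumption: an error exactly on the outer bound scores 1
        return 1
    return 0
```

For a hypothetical TBEO with an actual length of 100 cm, an estimate of 120 cm would score 1 under the standard bands and 2 under the extended Q9 bands.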
Identification of good estimators. For each grade group, the participants who scored within the top 25% of the total assessment score were identified as good estimators. The number of good estimators in each group was as follows: the fifth-grade group, 49 students who scored ≥ 17 points; the sixth-grade group, 52 students whose scores were ≥ 20 points; and the undergraduate group, 19 students whose scores were ≥ 25 points.
Frequency of over- and under-estimation. The frequencies of over- and under-estimations were calculated based on students' responses to the following five fill-in-the-blank items. For the length estimation, over- and under-estimations were examined by two items: Q1 (a small-sized TBEO) and Q12-1 (a large-sized TBEO), respectively. For the area estimation, over- and under-estimations were examined by Q7 (a small-sized TBEO) and Q8-1 (a large-sized TBEO), respectively. For volume estimation, over- and under-estimations were examined by Q5-1 involving a small-sized TBEO.
An estimate at least 25% below the actual value was defined as an underestimate, while an estimate at least 25% above the actual value was defined as an overestimate. The reasonable and acceptable answers, which were both classified as acceptable answers, and non-numeric responses (e.g., a blank) were excluded from the calculation of over- and under-estimate frequencies.
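The classification above can be illustrated with a short Python sketch. The names `classify_estimate` and `error_frequencies` are hypothetical, representing a non-numeric response as `None` is an assumption, and the example values are invented for illustration rather than taken from the study's data.

```python
from collections import Counter
from typing import Optional


def classify_estimate(estimate: Optional[float], actual: float) -> str:
    """Classify a response as underestimate, overestimate, acceptable, or non-numeric."""
    if estimate is None:  # assumption: blanks and other non-numeric responses arrive as None
        return "non-numeric"
    error = (estimate - actual) / actual * 100.0
    if error <= -25.0:
        return "underestimate"
    if error >= 25.0:
        return "overestimate"
    return "acceptable"


def error_frequencies(estimates, actual: float) -> Counter:
    """Count under- and overestimates only; other responses are excluded."""
    labels = (classify_estimate(e, actual) for e in estimates)
    return Counter(label for label in labels if label in ("underestimate", "overestimate"))
```

Only the two error categories are tallied, mirroring the exclusion of acceptable and non-numeric responses from the frequency counts.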
Categorization of estimation strategy. The estimation methods described by the good estimators were classified based on the eight categories of estimation strategies (Hildreth, 1983; Joram et al., 1998).
(a) Eyeballing. Using the eyes or looking at objects.
(b) Body parts. The use of body parts as references, such as the width of a finger or the length of a hand span.
(c) Previous knowledge (or experience). Using a form of judgement based on experience or previously learned measurement knowledge such as the use of formulas for area and volume measurements.
(d) Mental ruler. Employing a mental reference unit (e.g., 10 cm or 1 m) for measurement comparison.
(e) Object. Using objects that are nearby as measurement reference units.
(f) Guessing. Guessing represents a gross estimate without methodological thinking.
(g) Mixed methods. Combining two or three of the strategies mentioned above, for example, integrating the use of eyeballing and a body part reference.
(h) Others. This category contained a response of "Do not know," an unclear description such as "a unit," or a blank.

Data Analysis
Repeated measures ANOVAs were performed to examine the effects of grade and object size on estimation. For each grade group, χ² nonparametric tests were conducted to examine differences in frequencies of over- and under-estimations for each measurement estimation task. Lastly, the types of strategies used by the good estimators were coded and the frequencies of various strategies were calculated.
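As a rough illustration of the χ² comparison of under- and overestimate frequencies, the following Python sketch computes a one-sample (goodness-of-fit) χ² statistic with 1 degree of freedom. It assumes that the expected frequencies are an equal split between the two categories, which the text does not state explicitly; the function name `chi_square_two_categories` and the example counts are hypothetical.

```python
import math


def chi_square_two_categories(under: int, over: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test for two categories against an equal split.

    Returns the chi-square statistic (df = 1) and its p-value.
    """
    total = under + over
    expected = total / 2  # assumption: equal expected frequencies
    chi2 = ((under - expected) ** 2 + (over - expected) ** 2) / expected
    # For df = 1, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 30 underestimates and 10 overestimates.
chi2, p = chi_square_two_categories(30, 10)
print(f"chi2(1, N = 40) = {chi2:.2f}, p = {p:.4f}")
```

The closed-form p-value holds only for two categories (1 degree of freedom); a statistics library would be the more general choice for other designs.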
Two trained raters worked independently to score written answers on the assessment for approximately 11% of the participants (n = 54). Interrater agreement on the scores of the estimation questions reached 100% agreement. Moreover, a Cohen's Kappa analysis testing the reliability of the coding of written strategies was assessed at 0.91, p < 0.001. Table 1 presents the average means and standard deviations of estimation performance on one item from each size section, and overall performance on the entire estimation assessment by grade. As can be seen in Table  1, performances in all three grade groups followed a similar trend. Performance was greatest in the M section of the assessment, followed by the S and then the L sections.

Students' Estimation Performance in Various Size Sections across Grades
The effect of grade and object size on estimation performance was examined by a 3 (grades) × 3 (sizes) ANOVA with repeated measures on object size. A Mauchly's test indicated that the assumption of sphericity had been violated, χ 2 (2) = 7.34, p < 0.05; therefore, the degrees of freedom were corrected using Huynh-Feldt estimates of sphericity (Ɛ = 0.99). The results showed statistically significant main effects of grade (F [2, 474] = 43.96, p < 0.001, η 2 = .16) and object size (F [1.99, 941.44] = 118.49, p < 0.001, η 2 = 0.20) but there was no grade-by-size interaction, (F [3.97, 941.44] = 2.10, p = 0.08). Bonferroni corrected post hoc tests revealed that the undergraduate group outperformed both the fifth-and sixth-grade groups, and the sixthgrade group performed better than the fifth-grade group.
For the effect of object size, Bonferroni corrected post hoc tests revealed that all grade groups performed best on the M section (ps < 0. 001) and performed better on the S section than on the L section (ps < 0. 001). Multiple comparisons showed the following results: (a) the fifthgrade group performed best on the M section, while performances on the S and L sections were not significantly different; (b) both the sixth-grade and undergraduate groups obtained the highest scores on the M section and performed better on the S than on the L section. Table 2 shows the frequencies and percentages of the four types of estimated answers (i.e., acceptable, blank, under-estimate, and over-estimate) in each grade by measurement attribute and object size. For each grade group, x 2 nonparametric tests were conducted to compare the differences between underestimate and overestimate frequencies across measurement attributes and sizes.

Comparisons of the Frequency of Over- and Under-Estimations
The patterns of estimation error for the large-sized TBEOs are as follows: (1) There was a higher frequency of underestimations than overestimations for the length assessment among the fifth graders (χ²[1, 90] = 30.04, p < 0.001) and sixth graders (χ²[1, 72] = 26.89, p < 0.001). There were no overestimates for the length estimation assessment in the undergraduate group, and therefore statistical analyses were not performed. (2) There was a higher frequency of underestimations than overestimations for the area estimation assessment
With respect to the pattern of estimation errors for the small-sized TBEOs, the results of each group are as follows. (1)
Table 3 shows the frequencies of various strategies used by the good estimators in each grade group. Across grades, about 80% or more of the good estimators used effective strategies that facilitated estimations, including body parts as references, previous experiences, mental rulers, objects, and a mix of two types of strategies. Furthermore, body parts and objects were used as references more frequently than other strategies. The sixth graders used eyeballing and a mix of two types of strategies more frequently than the other groups. Despite this, some good estimators in the elementary group, but not in the undergraduate group, used "guessing." Furthermore, a few good estimators in all grades reported the use of strategies that fell into the "others" category, which included unclear or ineffective methods.

Strategies Used by Good Estimators
As can be seen in Table 3, the good estimators in the elementary school group used mixed strategies more frequently in the assessments of area and volume estimations than in the assessment of length estimation. In contrast, the frequencies of mixed strategies used by undergraduates for area and length estimations seemed equal. The types of estimation strategies used by the undergraduates were similar to those used by the elementary school groups with the exception that the undergraduates did not use previous experience or guessing.
The follow-up interview data provided further insights into the interviewees' estimation strategies, including how they used measurement units for mental operations when using eyeballing and mental rulers. Ten interviewees who used eyeballing and five interviewees who used mental rulers indicated that they imagined standard lengths (e.g., 1 cm, 10 cm, 15 cm, and 1 m) and used them as measurement units (or mental rulers) in the mental operation of unit iterations. They also expressed that the size of the TBEO affected which mental ruler they would use as a reference, in spite of the frequent use of small units such as 1 cm, 1 cm², and 1 cm³. For example, one fifth-grade interviewee who used eyeballing indicated, "If the object looked very big, I would measure it using a mental image of 1 m. I would first (mentally) measure the length and width before applying a formula." Similarly, one sixth-grade interviewee who used eyeballing described, "I usually try to estimate by eyeballing it. I mostly use the images of a 1-m stick and a 10-cm ruler in my head. If I have some objects nearby that are about one meter or something, I would probably use those things to estimate." Most of the interviewees were familiar with the lengths of their body parts and pencils, and they tended to use these references flexibly for comparison. For example, one sixth-grade interviewee who used body parts expressed, "If the object I see is very long or large, I would estimate using the number of footsteps. If it is something smaller, I would compare it to something around one centimeter… or fifty centimeters, which is half a step. Because one centimeter is about the thickness of my finger now. Or, I would use comparison because I knew the size of a 1-cm³ cube." Furthermore, most of the elementary school interviewees knew the size of a 1-m² square. This knowledge was based on previous learning experiences from activities such as measuring and making a 1-m² space using sheets of newspaper.
Interestingly, the interview data indicated that elementary school interviewees tended to describe their strategy as guessing when they were uncertain about the accuracy of an estimated answer or when they lacked confidence in the mental ruler they had used. For example, a fifth-grade interviewee who used the width of her finger as a measurement unit, without being completely certain of its length, described her estimation as "roughly… roughly…, guessing."
Additionally, the interviewees stated that they estimated the side lengths of the 2-D and 3-D TBEOs and used measurement formulas for estimating area and volume. However, some interviewees said that they did not mention the use of measurement formulas as much when they described their methods for estimating lengths in the written assessment.

DISCUSSION
The current study examined the effects of grade and object size on students' performance of measurement estimations of school objects. The results indicate that grade and object size individually influence estimations, but that there are no interaction effects between the two factors. Overall, the undergraduate students performed significantly better than the elementary school students, while the sixth graders outperformed the fifth graders. All groups performed best in the medium-sized TBEO tasks. Moreover, both the sixth graders and undergraduates performed better in the small-sized TBEO tasks than in those involving large-sized TBEOs, while the fifth graders performed equivalently in the tasks estimating small and large TBEOs. Thus, the current findings did not support the hypothesis that there are interaction effects between grade and object size on students' estimation performance.
For all grade groups, the frequency of underestimations was greater than that of overestimations in tasks involving large-sized TBEOs. The results support the second hypothesis that there are differences in the frequencies of overestimations and underestimations across measurements in each grade group for large objects. However, the frequencies of underestimations and overestimations did not differ for most tasks involving small-sized TBEOs across measurements in each group. There were two exceptions: the sixth-grade group overestimated the length of the item and the fifth-grade group overestimated the volume of the item. Thus, the results partially support the third hypothesis.
Good estimators used similar strategies in their estimations of large-sized objects, such as the use of body parts or objects as references. Guessing was used by some good estimators in the elementary school group. Furthermore, low frequencies of the use of unclear or ineffective strategies were found across grade groups. Most of the interviewees were inclined to use measurement units such as 1 cm, 10 cm, and 1 m for comparing the TBEOs when using the eyeballing or mental ruler strategies. The following discussion centers on specific findings, and instructional implications are suggested to strengthen students' ability to estimate measurements of school objects.

The Role of Grade and Object Size in Estimation Performance
The findings indicate that estimation performance improved with grade. This effect of grade on estimation performance is consistent with previous results. With an increase in grade comes an increase in base knowledge of scales, in conjunction with an understanding of proportional reasoning (Jones et al., 2012) and experience with physical measurement. Interviews with the good estimators in the current study provide evidence that experience with real measurement helps the development of measurement estimation skills.
Despite the differences among grade groups, students performed poorly on the estimation assessment overall, given that most of the TBEOs were touchable. The overall percentage correct was approximately 42% for the fifth graders, 47% for the sixth graders, and 61% for the undergraduates. Bright (1979) suggested that touchable TBEOs or measurement units lead to more reasonable estimates due to a decrease in the necessary mental operations (i.e., imagining or recalling the sizes of TBEOs); therefore, the results suggest that the participants in the current study were below-average estimators. The conceptual understanding of scale and size (Jones et al., 2008), along with activities that help build a set of scale references, should be highlighted in elementary school mathematics.
All grade groups performed best on estimations of medium-sized TBEOs that had lengths between 11 and 50 cm. According to the scoring scheme used in the current study, all groups were able to estimate within 10% to 25% of the actual measurements of the medium-sized TBEOs, reaching acceptable accuracy. These findings are similar to those obtained by Ruwisch, Heid, and Weiher (2015), who found that fourth-grade students performed better on estimation of medium-sized length measurements (i.e., between 8 and 46 cm).
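The idea of accuracy within a percentage band can be illustrated with a small classifier that sorts an answer into the four answer types used in Table 2 (acceptable, blank, under-estimate, and over-estimate). This is only a sketch: the function name is hypothetical, and treating the tolerance as a single relative-error threshold per task is an assumption, since the full scoring scheme is described elsewhere in the paper.

```python
def classify_estimate(actual, estimate, tolerance):
    """Classify one answer by its relative error against the actual measurement.

    `tolerance` is the accepted relative error (e.g., 0.10 or 0.25); which
    tolerance applies to which task is an assumption of this sketch.
    """
    if estimate is None:
        return "blank"
    relative_error = (estimate - actual) / actual
    if abs(relative_error) <= tolerance:
        return "acceptable"
    return "under-estimate" if relative_error < 0 else "over-estimate"


# Hypothetical answers for a 30-cm object scored with a 25% tolerance.
answers = [36, 20, 45, None]
labels = [classify_estimate(30, answer, 0.25) for answer in answers]
print(labels)
```

The same 36-cm answer would count as an over-estimate under a stricter 10% tolerance, which shows how the choice of band affects the proportion of acceptable answers.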
Furthermore, both the sixth graders and undergraduates in the current study made more accurate estimations of small-sized measurements (i.e., between 1 and 10 cm in length) than large-sized measurements (i.e., between 51 and 100 cm in length). However, this did not hold true for the fifth graders. The possible reasons why the fifth graders performed differently from the other two groups are discussed later. Overall, these findings are in agreement with those of another study, suggesting that estimating large objects is challenging for students.
In the current study, most of the TBEOs were classroom objects that were familiar to students. The effects of object size on estimation performance may result from operations of unit iteration, which are used frequently for measurement estimations (Desli & Giakoumi, 2017; Joram et al., 1998). Calculations for obtaining estimated measurements of the small-sized TBEOs are less complicated than those needed for estimating the measurement of the large-sized TBEOs used in the study. Therefore, the cognitive demands needed for comparing a measurement unit of small-sized objects can be reduced (Joram et al., 1998), which in turn increases estimation performance. The complicated operations of unit iteration and calculations necessary for measurement estimations may lead to students' difficulty in making accurate measurement estimations of large-sized TBEOs.
With respect to the result that measurement estimation performance was more accurate for medium-sized than for small-sized TBEOs, estimation difficulty is related to the object attribute (length, area, or volume) to be estimated (Jones et al., 2012). As shown in the previous section, the M section consisted of more length estimation tasks and fewer area (or volume) estimation tasks than the S and L sections; therefore, the task difficulty level of the M section was lower than that of the other sections, which may explain why the results were better for this section. A higher level of proportional reasoning is demanded for estimating measurements of 2-D and 3-D objects than of objects of just one dimension (Pike & Forrester, 1997). Thus, area and volume measurement estimations are more difficult for elementary school students than length measurement estimation.
In contrast, the total number of tasks that required estimation of area and volume measurements in the S section (five tasks) and L section (four tasks) was close. It can, therefore, be reasoned that the sixth graders and undergraduates performed better in the S section than in the L section due to the size discrepancy of the TBEOs (not due to the attribute to be estimated).
The fifth graders, on the other hand, obtained similar scores on both the S and L sections. One possible explanation for this finding is that more than half of the estimations in the S and L sections pertained to area and volume, which may be equally challenging to fifth graders. Skills involved in volume measurement are formally taught at the fifth-grade level in Taiwan (Ministry of Education, 2010). Due to inexperience in estimating volume measurements, the fifth graders had limited, but equivalent, competencies in the S and L sections. Further studies examining the effects of object size and measurement attributes on measurement estimation performance are needed.

Frequency of Over- and Under-Estimations by Object Size
In agreement with Jones et al. (2008), the current study found a high frequency of underestimations of length measurements of large-sized TBEOs. A tendency to underestimate the area measurements of large-sized TBEOs was also found in the pattern of estimation errors made by all three grade groups. Similarly, Forrester et al. (1990) found that for area (and volume) estimations, as the ratio between size of measurement unit and size of TBEO increased, estimates shifted from overestimations to underestimations. An inclination to underestimate the measurements of large-sized objects may result from the use of small measurement units, as revealed in the interviews. Using small measurement units for estimating large objects may easily lead to computational errors, due to the complex processing of unit iterations.
Despite a tendency to underestimate the measurements of large-sized TBEOs, there was no similar evidence for small-sized TBEOs. The findings revealed that, in all three grade groups, the frequencies of under- and overestimations were close for most of the small-sized objects, with two exceptions. These results imply that the patterns of estimation error made by the elementary school and undergraduate groups seem alike when they estimated the measurements of small-sized objects, although the performance of the undergraduates was superior to that of the elementary school groups.
There was a greater frequency of overestimations than underestimations among the fifth graders (volume) and sixth graders (length) for the small-sized TBEOs. Overestimations of length measurements by the sixth graders may result from difficulty with the processing of unit iterations. Processing of unit iterations may lead to estimation errors because of inconsistent unit size or overlapping of units (Joram et al., 2005). Overestimations of volume by the fifth graders may have resulted from multiplication computation errors. Forrester et al. (1990) suggested that using multiplication computations for estimating area and volume measurements may lead to overestimates. However, further investigations into strategies used by elementary school students in measurement estimation are necessary to clarify the reasons behind such estimation errors.

Good Estimators' Strategies Used for Estimating Measurements
Good estimators across grade groups described a preference for the use of body parts and nearby objects for estimating the various measurements of school objects in the study. The instruction given prior to task completion recommended the use of body parts or convenient objects for more accurate measurement estimations (Ministry of Education, 2010), and as such, may have biased the participants.
A few good estimators in the elementary group used the strategy of guessing, which has previously been noted as a less effective strategy (Montague & Van Garderen, 2003). As revealed in the interviews, some elementary school students considered their strategy to be guessing in the following cases: (1) an estimated answer that was not precise, and (2) an estimated outcome obtained by using an inexact length as a measurement unit. Additionally, these younger students used other ineffective strategies (e.g., unclear and blank). In contrast, most of the good estimators in the undergraduate group tended to use more effective strategies, such as referencing of body parts, mental rulers, and objects, instead of guessing. The use of more effective strategies may have led to better performance among undergraduates than among the younger groups. The findings imply that enhancing knowledge of linear size and scale through measurement activities, which has been suggested for improving the efficiency of estimation strategies (Tretter et al., 2016), needs to be taken into account for strengthening elementary school students' ability to estimate measurements.
Previously, Huang (2016) found that approximately 12% of fifth and sixth graders (N = 948, 483 fifth graders and 465 sixth graders) used ineffective strategies, while only 5% (20/360) of the fifth- and sixth-grade participants who were identified as good estimators in the current study did so. These two samples differed in terms of estimator characteristics; that is, the current sample consisted of good estimators, while the previous sample represented general estimators rather than good estimators specifically. Such a distinction in estimator characteristics between the two samples may have led to the difference in the percentage of participants using ineffective strategies. This implies that the use of effective strategies is associated with estimators' performance of measurement estimation.
Approximately half of the interviewees in the current study described using mental measurement units (e.g., 1 cm, 10 cm, 1 m, 1 cm², and 1 m²) that they had constructed from previous experience of measurement for making comparisons when they used eyeballing. Mentally constructing measurement units and making good use of them as references facilitate estimation accuracy. These results suggest that one essential skill in good estimation performance is the use of eyeballing in combination with mental rulers.
During the interviews, the interviewees mentioned the use of mental rulers, previous experience, and measurement formulas for estimations, but these strategies were not clearly described in their written assessments. For example, results from the written portion of the undergraduate assessment did not indicate the use of measurement formulas or previous experience. The participants may have simplified their descriptions when answering the written assessment; thus, future studies on strategies used for appropriate estimations should include interviews with participants.
Collectively, the current findings suggest that an increase in grade is associated with an increase in measurement estimation ability. Generally, students performed best in estimating the measurements of medium-sized TBEOs, while estimations of small TBEOs were more accurate than those of large TBEOs. A tendency to underestimate the measurements of the large-sized TBEOs was observed in all groups. Surprisingly, the sixth graders tended to overestimate the length of small-sized TBEOs, while the fifth graders were inclined to overestimate the volume of small-sized TBEOs. Those with good estimation abilities showed a preference for using body parts and convenient objects as references for their estimations. In addition, the integration of measurement units constructed from previous experiences with eyeballing was an essential skill used by good estimators. Findings from the current study should be considered when developing measurement curricula, specifically at the elementary school level.

Limitations and Future Directions
This study had several limitations that affect its generalizability. The majority of TBEOs in the current study were touchable. Reasonable estimates can be obtained by using body parts as measurement tools with touchable objects. Future studies using untouchable TBEOs should be conducted to examine performance in mentally determining the measurements of objects and patterns of estimation error when making measurement estimations. In addition, because the assessment did not include volume estimation of large-sized objects, it is unknown whether there is a tendency to underestimate the volume of large-sized objects, as was found for length and area.
Despite the limitations of the current study, the results suggest that experiences with physical measurements facilitate construction of measurement units as references and aid skills in estimating measurements of objects. Investigating how students who are skilled in measurement estimations select effective strategies over less effective strategies (e.g., guessing) for estimating measurements of objects is a promising area for future work.