The Impact of Peer Instruction on Ninth Grade Students’ Trigonometry Knowledge

Abstract In this study, we conducted peer instruction (PI) activities to promote student participation in the learning process and test the hypothesis that PI improves student achievement. Two ninth-grade classes were randomly assigned as treatment and control groups. Pre-test and post-test data were obtained for measuring mathematics achievement in trigonometry. Data were analyzed using analysis of covariance procedures with an alpha significance level of 0.05. Results indicated no significant effects of peer instruction on achievement. This study implies that more robust studies are needed to reveal the real effect of PI.


Introduction
Peer instruction (PI), which has been extensively adopted in science classes, is a form of active learning, and the overall procedure is a version of the think-pair-share technique (PRAHL, 2017). It is usually defined as a chance for classmates to discuss concepts or to share answers to questions in the classroom, where they have further occasions for additional communication with their teachers (KNIGHT; BRAME, 2018). The overall aim of this study is to compare the effects of PI enhanced with the concept test and of regular instruction on ninth-grade students' achievement in trigonometry.
Therefore, we tested the hypothesis that peer instruction enhances meaningful learning or transfer, defined as the student's ability to solve novel problems or the ability to extend what has been learned in one context to new contexts.
In a typical PI sequence, the instructor poses a conceptual question and allows students the opportunity to think individually and record their answers independently, often by voting using clickers or online response systems (BALTA; PERERA-RODRÍGUEZ; HERVÁS-GÓMEZ, 2018). The answers are shown as a response graph (usually a histogram) on a screen in front of the class. Students can see from the response graph that others hold different views.
Then, before they submit a new response, students form small groups to discuss their responses with their peers, to clarify their reasoning, and to convince each other. After a few minutes, students can answer the question again and perhaps make a different choice. A new response graph is then displayed, which now includes the correct answer and leads to a classroom-wide discussion (with the teacher involved). This discussion aims to establish the reasoning behind the correct answer. When the discussion is over, and everyone agrees, a new cycle begins. The whole procedure, from start to finish, takes about five to eight minutes. This is the core of PI, which can improve teaching and promote learning through discussion with peers (CROUCH; MAZUR, 2001).
According to the review of peer instruction by Vickrey et al. (2015), PI (a) can increase students' conceptual understanding, (b) can improve their problem-solving skills, (c) is effective in multiple disciplines, and (d) is effective in courses at different levels. In their meta-analysis, Balta et al. (2017) showed that PI has a positive effect on learning and that it has been used in a wide range of courses and countries to create interactivity during lectures.
Research has frequently shown that PI increases students' conceptual reasoning. Student attitudes towards peer instruction are commonly positive; students report that the method helps them learn course material and that the instant feedback it provides is valuable (KNIGHT; BRAME, 2018). A common problem with PI is that one or two students may control the argument, leaving the rest of the group inactive. Further, when PI is implemented, the lecture time is reduced, so students may have to spend extra time reading their textbooks to understand course material (LUCAS, 2009). Another significant concern in the implementation of PI is whether displaying the class responses to a question biases students' subsequent answers.
The findings of Perez et al. (2010) suggested that seeing the most common answer can bias a student's second choice on a question.

Peer Instruction in Math Classes
Although the initial implementation of PI in classrooms took place in 1991 (MAZUR, 1997), its use in mathematics classes came later. Akay (2011) examined the effect of the PI method on eighth-grade students' mathematics achievement and mathematics attitudes in transformation geometry.
Her results indicated that PI positively affected the students' mathematics achievement in transformation geometry and their attitudes towards mathematics. Another study investigated PI's effect on the academic achievement in mathematics of undergraduate students in Oman. The result of the study indicated that the PI strategy is an active tool to increase mathematics achievement. Further, Demirel (2013) investigated the effect of using PI in mathematics courses on students' attitude, achievement, and retention of knowledge. Survey results showed a significant improvement in students' academic success in mathematics lessons; however, no significant difference in their attitudes towards mathematics lessons was observed.
Several points related to the usage of PI in math classrooms are significant. First, the late usage of PI in math topics can be attributed to the fact that there are few conceptual questions in math topics (SOMASUNDRAM; SYED ZAMRI; LEONG, 2018) when compared to science. Second, most publications on PI in mathematics relate to calculus because calculus includes sufficient conceptual questions when compared to other math topics. Third, many of the PI studies in math have been conducted in the USA and Turkey because PI was initially developed in the USA, and researchers in Turkey frequently replicate studies published in the USA.

The Issue
Existing educational literature has not yet sufficiently addressed the usage of PI in teaching mathematics topics. Previous studies have covered several topics: systems of equations and inequalities (ALLISON, 2012), functions (ABDELKARIM; ABUIYADA, 2016), and statistics topics (OLPAK; BALTACI; ARICAN, 2018). Yet, the effect of PI on trigonometry has not yet been studied.
ISSN 1980-4415 DOI: http://dx.doi.org/10.1590/1980
Unfortunately, research-based educational methods have not yet been widely used in post-Soviet countries, including Kazakhstan. Researchers need to focus on these countries (1) to test the effect of educational methods in new populations, (2) to help school development in these countries, and (3) to help the improvement of science in these countries. Thus, this study investigates the effect of PI on ninth-grade students' trigonometry knowledge in a population from Kazakhstan. To our knowledge, this is the first study investigating the effect of PI on any subject at any level in Kazakhstan. The following research question frames this study:
• How does peer instruction affect ninth-grade students' trigonometry learning?

Methodology
This quantitative experimental study examined the effect of PI with a sample of students from a high school in Kazakhstan. An instrument comprising 25 items from the trigonometry unit was developed to gather data, which were collected at the end of the fall semester of the 2018-2019 academic year.

Participants and Context
The participants in this study were 89 ninth-grade students from a population of private high schools in a large city in Kazakhstan. All students were male, and the student population was approximately 90% Kazakh, 4% Turkish, 2% Russian, and 4% other nationalities such as Tajik, Uyghur, and Afghan.
The school employs an educational system known as a gymnasium. While some of the courses such as physics, mathematics, biology, chemistry, and computer science are taught in English, the rest of the courses are taught in Kazakh. The classes from this school were selected for convenience due to the first author's role as a mathematics teacher in previous years. All students from this school speak Kazakh and Russian, and they also speak Turkish and English at the upper intermediate level. Students were informed of this study at the start of the autumn semester. The explanation covered the study's aim and scope, including a summary of what students could expect in terms of the curriculum that would be covered. All the participants volunteered to join this study.

Instrument
In order to measure students' academic achievement in trigonometry, we developed a Mathematics Achievement Test (MAT) consisting of 39 items. The initial version of the test was prepared by considering the table of specification for the ninth-grade trigonometry unit using Bloom's revised taxonomy (KRATHWOHL, 2002). The initial version of the MAT was checked by two experts who suggested minor changes. The test was administered to 68 tenth-year students as a pilot study. The pilot was run with tenth-year students because they had already learned the trigonometry unit. Item difficulty, item discrimination, point biserial correlation, and KR20 analyses were performed on the data collected from the pilot group. Out of 39 items, 14 were eliminated because of improper statistics, and 25 were retained.
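As an illustration of two of the pilot statistics named above, the following is a minimal sketch in plain Python, not the authors' actual procedure. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the population variance of total scores in KR-20; the function names are hypothetical.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Proportion of students who answered each item correctly.

    `responses` holds one row per student and one 0/1 entry per item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]


def kr20(responses: List[List[int]]) -> float:
    """Kuder-Richardson 20 reliability for dichotomous items.

    Assumption: the variance of total scores is the population variance
    (divided by n); some software divides by n - 1 instead.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
```

Items with extreme difficulty values, or items whose removal raises KR-20, are typical candidates for elimination in a pilot of this kind.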
Many items were eliminated not because the instrument was of low quality but because we applied several stringent item statistics: item discrimination, point biserial correlation, and the reliability coefficient (KR20) if the item is deleted. If, for example, we had used only item difficulty and item discrimination, many more items (32) would have remained. We deliberately included many questions (39) in the initial test so that all improper items could be safely removed.

Data Collection and Data Analysis
This study was conducted in the third period of the 2018-2019 academic year (in Kazakhstan, there are four periods in one academic year) with 89 students in Almaty. The implementation of PI lasted 12 weeks, and there were 33 and 34 students in the experimental and control groups, respectively. Ninth-grade students have six periods of mathematics each week, and each lesson period lasts 40 minutes.
The pre-test was administered at the beginning of the semester, and the post-test was administered after the treatment. All tests were administered during math lessons by the course teachers. Students were given 40 minutes to complete the test.
A one-way analysis of covariance (ANCOVA) was run to examine the difference between the post-test scores of the peer instruction and regular instruction groups, controlling for their initial differences using pre-test scores and end-of-semester GPA. ANCOVA is a popular procedure for removing extraneous error variance and for adjusting pre-existing differences among groups (HARWELL, 2003). Before conducting ANCOVA, we checked whether the assumptions were met. All analyses were conducted using SPSS 21.
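The logic of a one-way ANCOVA with a single covariate can be sketched as a comparison of two least-squares models: one that predicts the post-test from the covariate alone, and one that adds a group indicator. The code below is a minimal pure-Python sketch under that assumption; it is not the SPSS procedure used in the study, and the names `solve_linear`, `rss`, and `ancova_f` are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(design: List[List[float]], y: Sequence[float]) -> float:
    """Residual sum of squares of the least-squares fit of y on `design`."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(design, y)
    )


def ancova_f(
    post: Sequence[float], covariate: Sequence[float], group: Sequence[int]
) -> Tuple[float, int]:
    """F statistic (1 numerator df) for a two-group effect, adjusting for one covariate.

    `group` is coded 0/1. Returns (F, error degrees of freedom).
    """
    reduced = [[1.0, float(c)] for c in covariate]
    full = [[1.0, float(c), float(g)] for c, g in zip(covariate, group)]
    rss_reduced, rss_full = rss(reduced, post), rss(full, post)
    df_error = len(post) - 3  # intercept, covariate slope, group effect
    return (rss_reduced - rss_full) / (rss_full / df_error), df_error
```

The p-value would then be read from the F distribution with 1 and `df_error` degrees of freedom, which a statistics package provides.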

Procedures
Initially, training was provided to the treatment group teacher on how to use PI in the classroom. Prior to the implementation of PI, two regular lessons of this teacher were observed by the first author. Except for minor mistakes (such as long student discussions after the first response), the successful application of PI was observed, and we proceeded to the treatment phase. During the treatment stage, an expert (last author) on PI strategy attended one of the lessons and found the procedure's execution successful.
The implementation of PI was carried out as described by Mazur (1997). Each lesson consisted of two parts: the first 10-15 minutes of the lesson were a lecture, and PI was carried out in the remaining part of the lesson. For the PI, initially, two minutes were given to students to think and solve the questions by themselves. Then, students voted for their answer, and the result was displayed on the screen as a histogram. After examining the histogram, students were given two minutes to discuss their answers with peers. Students were instructed to provide reasons for their answers and to convince their peers that their answers were correct.
In this format, the students had two roles: as a teacher, explaining the rationale for their answer; and as a student, listening to the reasoning of their peers' answers. At the end of the discussion, students were given a second chance to submit another response if desired (1 minute). The second histogram of the answers was also displayed and observed by the students. Finally, the teacher explained (2 minutes) the question along with student discussions, if needed. During voting, students were urged to submit their own responses, which they thought were correct. In PI, students answer each question twice, and PI is not used for all posed conceptual questions. Depending on the percentage of correct initial responses, one of three courses of action is taken: if fewer than 30% of the answers are correct, the teacher repeats the related topic; if between 30% and 70% are correct, the peer teaching method is applied; and if more than 70% are correct, the class moves on to the next question (LASRY; WATKINS, 2008).
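The three-way decision rule attributed to Lasry and Watkins (2008) can be written as a small function. This is an illustrative sketch only: the function name and the returned labels are hypothetical, and treating exactly 30% and exactly 70% as falling in the peer-discussion band is an assumption, since the text does not say how the boundaries are handled.

```python
def next_step(correct: int, total: int) -> str:
    """Choose the teacher's next action from the first-vote results.

    Assumption: the 30% and 70% boundaries belong to the peer-discussion band.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    share = 100 * correct / total
    if share < 30:
        return "reteach_topic"    # too few correct answers: repeat the topic
    if share <= 70:
        return "peer_discussion"  # mixed answers: run the PI discussion
    return "next_question"        # most answers correct: move on
```

For example, with hypothetical counts, `next_step(5, 10)` selects the peer discussion, while `next_step(8, 10)` moves on to the next question.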
An example question, with its initial and final histograms, is as follows: What is the maximum value of (3cosx-5)/2?
Figure 1 - A typical histogram for students' initial and second responses. The correct answer is B.
Source: Prepared by the authors
As seen in Figure 1, the frequency of the correct response rose from 8 to 18, while the frequency of the distractors decreased considerably.
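For reference, the example question can be worked out from the range of the cosine function. The derivation below is an added illustration; the answer options shown to students are not reproduced in the text, so it gives only the value itself.

```latex
\begin{align*}
-1 \le \cos x \le 1
  &\;\Longrightarrow\; -3 \le 3\cos x \le 3 \\
  &\;\Longrightarrow\; -8 \le 3\cos x - 5 \le -2 \\
  &\;\Longrightarrow\; -4 \le \frac{3\cos x - 5}{2} \le -1 .
\end{align*}
% The maximum value is therefore -1, attained when \cos x = 1,
% that is, at x = 2k\pi for any integer k.
```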

Results
Pre- and post-test means and students' end-of-year GPA for the peer and regular groups are given in Table 2. Table 2 indicates that the regular group (that is, the control group) scores are higher than those of the peer instruction group (the treatment group) for all three variables. Because the groups were not initially equivalent, we carried out ANCOVA to adjust for pre-existing differences.

Student achievement data were initially analyzed using percentage change. The pre-test mean was subtracted from the post-test mean, and the difference was divided by the pre-test mean for each group. Multiplying the resulting decimal by 100 yields the percentage change (BRULLES; SAUNDERS; COHN, 2010). The trend of the percentage change is displayed for each group in Figure 2. As seen in Figure 2, the percentage change in the scores of the peer instruction group (67.46) is greater than that of the regular group (36.23). This suggests an advantage for peer instruction; however, descriptive statistics can be misleading. Therefore, we conducted inferential statistics (that is, ANCOVA) to reveal any statistically significant differences.
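The percentage-change calculation can be sketched as follows, taking the gain as the post-test mean minus the pre-test mean so that an improvement is positive, which matches the positive values reported. The function name is hypothetical, and the numbers in the usage note are made up for illustration, since the group means are not reproduced in the text.

```python
def percentage_change(pre_mean: float, post_mean: float) -> float:
    """Percentage change from the pre-test mean to the post-test mean."""
    if pre_mean == 0:
        raise ValueError("pre-test mean must be non-zero")
    return (post_mean - pre_mean) / pre_mean * 100
```

With a hypothetical pre-test mean of 10 and post-test mean of 15, `percentage_change(10, 15)` gives 50.0. Because the denominator is the pre-test mean, a group that starts lower shows a larger percentage change for the same absolute gain, which is one reason this descriptive measure can mislead.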
Prior to conducting the ANCOVA, we tested the homogeneity of regression slopes assumption to determine the similarity of slopes. Inequality of slopes indicates an interaction between the covariate and the treatment (TABACHNICK; FIDELL, 2007). The analysis assessing the homogeneity of regression slopes assumption showed a significant interaction between the pre-test and the post-test but no significant interaction between the GPA and the post-test (see Figure 3). Thus, we continued with the ANCOVA using GPA as the only covariate. The results are given in Table 3. There was no significant difference in students' achievement [F(1,63)=0.688, p=0.410] between the groups while adjusting for GPA. The η² value indicates the effect size. For peer instruction, the effect size is nearly zero (0.011). This value also describes how much of the variance in the dependent variable is explained by the independent variable (1.1%), which is fairly small.
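The reported effect size can be cross-checked against the reported F statistic. The sketch below assumes the value is partial eta squared, as SPSS reports for this kind of model, and uses the standard identity relating it to F and its degrees of freedom; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic.

    Uses SS_effect / (SS_effect + SS_error) = F * df1 / (F * df1 + df2).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

Applied to F(1,63)=0.688, this gives a value that rounds to 0.011, consistent with the effect size stated in the text.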
The estimated marginal means section of the output in SPSS gives the adjusted means (controlling for the covariate GPA) for each instruction group. This simply means that the effects of GPA have been statistically removed. From these adjusted means, initial score differences are reduced, and adjusted scores are remarkably close to each other, which gave no statistical difference between the groups in the ANCOVA analysis.

Discussions and Conclusions
This study contributes to the field of existing research by adding an experimental study on the effect of a PI approach on student achievement in mathematics. This study is distinctive from other similar studies on this topic because it is the first study conducted in Kazakhstan, or perhaps the first study in the Commonwealth of Independent States.
In the study, we examined the effect of PI on ninth-grade students' performance in trigonometry at the high school level. The key outcome was that PI did not significantly affect student achievement in trigonometry. Specifically, the percentage change in correct responses was about 31 percentage points higher with PI than with regular instruction. However, when the groups' pre-existing differences were considered, this difference was found to be statistically insignificant.
The outcomes of this study do not support other research which finds that PI improves student performance and learning. For instance, Crouch and Mazur (2001) found significant rises in conceptual problem-solving skills over 10 years of experience with peer instruction in physics classes. Likewise, Rao and DiCarlo (2000) reported that PI improved medical student success on quizzes. Similarly, Lucas (2009) stated that PI improved student participation and comprehension. As a final example, Cortright, Collins, and DiCarlo (2005) found that a student's ability to solve novel problems was significantly improved after PI.
In contrast, this study found that PI did not improve student achievement in trigonometry. While PI clearly increases students' use of reasoning and discussion skills (KNIGHT; WISE; SIEKE, 2016), it does not consistently raise students' course scores (KNIGHT; BRAME, 2018). Our result about the effect of PI on students' achievement in trigonometry is consistent with this observation. Further, it is worth mentioning that research in computer science reported less significant increases in final examination marks in a course comparing PI and regular lectures (ZINGARO, 2014).

Group       Mean     SD
Treatment   13.850   .520
Control     15.274   .511

A meta-analysis of the studies published between 2000 and 2015 showed that, out of the 35 studies, 34 had positive effect size values and thus found in favor of the experimental groups using the PI technique; only one study had a negative effect size (CENTER, 2004), indicating that the control group using the traditional lecture-based method was more effective. Our result will contribute to future meta-analyses as a case in which PI was ineffective.
The finding by Perez et al. (2010) may help explain our result. They suggested that students choose the most common response (during the second vote) because more of their classmates had initially selected this response, and that students change their views based on the agreement of neighboring students rather than by learning through PI. In our PI implementation, in all lessons, we strongly recommended that students vote independently. As Knight, Wise, and Sieke (2016) argued, we also observed an increase in students' reasoning and discussion during PI. However, it did not raise our sample's trigonometry test scores. Another possible reason for the ineffectiveness of PI in our study is that we cannot assume that every student peer is also a good teacher (BÜSCHER et al., 2013).
A problem we consider important in PI implementation is the duration of the discussion. Usually, 2-3 minutes are given to students to convince their peers. We think that this is too short a duration for students, who are not teachers, to change their peers' minds.
Thus, we agree with Lucas (2009) and think that smart students control the argument, which subsequently affects independent voting.
A limitation in the current study is that the classes in treatment and control groups were not equivalent in terms of trigonometry achievement. Starting with two equivalent classes could have yielded different results. Another limitation of our study is that we intended to equate students' pre-existing differences with two covariates (pre-test scores and students' GPA).
However, assessing the homogeneity of regression slopes assumption showed a significant interaction between the pre-test and the dependent variable. Thus, we used the GPA as the only covariate, which we think is not as strong a covariate as the pre-test scores. If the homogeneity of regression slopes assumption had also been met for the pre-test, the effect of PI might have been statistically significant. Finally, all students in this study were male; had female students been included, the results might have been different.
Future research with PI can be conducted without showing the initial histograms in order to control for possible bias from class responses. We recommend that instructors use histograms judiciously during PI. Future research is also needed to see whether PI is ineffective for other student populations in Kazakhstan. Further research could also be conducted on the influence of PI on student motivation, discussion, problem-solving skills, and retention in