Introduction
In the spring of 2020, most medical schools in Korea switched from face-to-face classes to online classes due to the coronavirus disease 2019 (COVID-19) pandemic. Within a short period, full-scale online classes became the norm: professors either teach in real time through online-conference platforms such as Zoom or Google Meet, or record a video of the class and distribute it to the students.
Real-time online classes have the advantage of enabling interaction between professor and students; however, some teachers complained that it was difficult to check whether students were participating in the class or understanding the content, because students turned off their cameras or the small video frames made it hard to see the students’ facial expressions or gestures clearly. Serhan [1] in 2020 found that 48.4% of the students displayed a negative attitude toward Zoom classes compared to face-to-face lectures, while 61.3% of the students answered that Zoom classes did not help improve their learning. The negative answers regarding Zoom classes deserve particular attention, such as “difficulty focusing on the lecture”, “dissatisfaction over the quality of interaction and feedback”, and “poor class quality” [1]. Against this background, Bao [2] in 2020 proposed the following five high-impact teaching practice principles to increase the quality of online learning: (1) maintain appropriate relevance between the amount, level, and length of class content and the students’ level; (2) control the pace of classes while effectively delivering content; (3) provide sufficient support through timely feedback, including e-mail guidance, from professors and teaching assistants to students after class; (4) heighten the level and depth of student participation during class; and (5) prepare a contingency plan to deal with unexpected problems that may occur on the online education platform. Thus, the studies of Serhan [1] and Bao [2] indicate that the sudden shift to online classes due to the COVID-19 pandemic requires strengthening class design, operation, and student feedback, in addition to various efforts to improve the quality of learning.
Recent assessment trends have moved beyond summative assessment, in which students’ achievements are checked, to emphasize formative assessment, in which the assessment itself is used for and applied in learning. Formative assessment refers to assessment that gives feedback to students while teaching and learning are in progress and that is used to improve the curriculum and teaching methods. Black and Wiliam [3] in 1998 regarded all feedback activities undertaken by the teacher or students during class to improve teaching and learning as formative assessment, and claimed that “assessment for learning” should be emphasized because formative assessments have a significant impact on student achievement. Formative assessment is now implemented in many classes and subjects, and many studies are being conducted on its theories, effects, and methods [3-9]. However, few studies have examined the “items” used in formative assessment; those that exist used the cognitive domain of Bloom’s taxonomy (BT) [9] or applied cognitive diagnostic models [10,11]. Kim et al. [12] in 2017 argued that when developing formative assessment items, all important elements of the learning unit and each stage of the taxonomy of educational objectives should be included. In addition, formative assessment should present items of difficulty equivalent to the minimum criterion in order to ascertain whether a student has achieved that criterion [12]. However, studies on this matter are insufficient.
Based on the literature review of Bao [2] and Kim et al. [12], this study aims to explore (1) the use of formative assessment as an instruction strategy to increase students’ participation and enhance their understanding of content in real-time online classes, and (2) the application of BT to item development, and its implications, in order to achieve the main functions of formative assessment, namely improving student learning and improving classes.
Methods
1. Research design
This study developed formative assessment items, analyzed the items statistically, and investigated students’ perceptions of formative assessment through a survey.
2. Design for online classes
The Function of Human Body (FHB), Basic Science of Circulatory and Respiratory System (BSCRS), and Basic Science of Urinary and Reproductive System (BSURS) subjects for first-year students at the CHA University School of Medicine are integrated subjects comprising histology, physiology, internal medicine, and emergency medicine classes. One physiology professor taught all three subjects to 43 registered students. In 2020, all classes were conducted online in real-time through Zoom because of the COVID-19 pandemic.
To increase the quality of online learning, Bao [2] in 2020 proposed adjusting the level and length of lessons appropriately and adjusting the pace of lessons in consideration of the students’ level and the lesson content. Accordingly, the existing learning outcomes were adjusted to 2–3 per hour, which left time for online formative assessment. To enhance student participation and the interaction between the professor and students, the class was designed to include a formative assessment on the content presented as each learning outcome, as illustrated in Fig. 1.
Students took 10–15-minute real-time lectures per learning outcome through Zoom, and then solved 1–2 formative assessment items on the relevant learning outcome after class through Google Classroom. The professor could instantly check the results after the completion of the assessment and explain the correct answers for all the questions. The formative assessment score was not reflected in their grades.
3. Development of formative assessment items
Kim et al. [12] in 2017 stated that a taxonomy of educational objectives is important when developing formative assessment items. In Korea, primary and secondary school teachers use a taxonomy of educational objectives when developing test or exam items; it is composed of a two-dimensional matrix of assessment contents and behavioral elements [13-15]. The sub-elements of the behavioral elements may differ for each teacher, but the most commonly used are the sub-elements of Bloom’s cognitive domain [12]. BT (1956) consists of six major categories [16]: knowledge, comprehension, application, analysis, synthesis, and evaluation, which form a hierarchical structure within the cognitive domain. “Knowledge” requires the lowest level of cognitive ability and “evaluation” the highest. In this study, only “knowledge, comprehension, and application” were applied as behavioral elements. Behavioral elements beyond “application” were excluded because a previous study pointed out that teachers had difficulty distinguishing between knowledge–comprehension, application–analysis, and analysis–synthesis [13]. We also considered that the higher-level categories of “analysis, synthesis, and evaluation” require time for students to internalize the content of the lesson into their own learning.
When developing the items, each item was matched to a learning outcome as its content element and to “knowledge”, “comprehension”, or “application” as its behavioral element, and all items were developed as multiple-choice items. Application items that used physiology knowledge in a clinical context were presented as formative assessment items, so that by solving them the students learned how basic medical knowledge could be applied in a clinical context.
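As an illustration only, the two-dimensional matrix of content elements (learning outcomes) and behavioral elements (BT categories) could be kept as a simple tally of items per cell. The Python sketch below is not the procedure used in this study; the function names, the `LO1`/`LO2` labels, and the example item bank are hypothetical and chosen for demonstration.

```python
from collections import Counter

# The three BT categories used as behavioral elements in this study.
BLOOM_CATEGORIES = ("knowledge", "comprehension", "application")


def build_blueprint(items):
    """Count items in each (learning outcome, BT category) cell of the matrix."""
    for _, category in items:
        if category not in BLOOM_CATEGORIES:
            raise ValueError(f"unsupported category: {category}")
    return Counter(items)


def category_share(items):
    """Return the fraction of all items that fall in each BT category."""
    if not items:
        raise ValueError("item bank is empty")
    totals = Counter(category for _, category in items)
    return {c: totals.get(c, 0) / len(items) for c in BLOOM_CATEGORIES}


# Hypothetical item bank (not data from this study):
# each entry is (learning outcome label, BT category).
example_bank = [
    ("LO1", "knowledge"),
    ("LO1", "application"),
    ("LO2", "comprehension"),
    ("LO2", "application"),
]

blueprint = build_blueprint(example_bank)
shares = category_share(example_bank)
```

A tally of this kind makes it easy to see at a glance which learning outcomes lack an item in a given category, and what share of the item bank each category occupies.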
4. Data collection
The correct answers for each item used in the formative assessments for the FHB, BSCRS, and BSURS subjects in 2020 were collected and matched to the BT cognitive domain category of the item, in order to calculate item difficulty based on the classical test theory.
After the semester was completed, students’ opinions on the class and formative assessments were collected through a survey developed by the authors. The survey was anonymous and the students’ free response was guaranteed. The survey consisted of 5-point Likert scale questions, multiple-choice questions, and open-ended questions that they could answer freely. This study analyzed only the response data about formative assessment.
5. Data analysis
The number of formative assessments carried out in each physiology class across the three subjects, the frequency of items, and the item difficulty were analyzed by BT category. For student responses to the survey questions related to formative assessment, descriptive statistics and frequency analysis were conducted using JAMOVI ver. 1.6.15 (JAMOVI, Sydney, Australia; https://www.jamovi.org), and the answers to the open-ended questions were summarized accordingly.
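The descriptive statistics and frequency analysis in this study were run in JAMOVI. For readers who prefer a script, the following is a minimal Python sketch of a comparable summary for a single 5-point Likert question. It is an assumption-laden illustration rather than the analysis actually performed: the function name `likert_summary` and the example responses are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def likert_summary(scores, scale=(1, 2, 3, 4, 5)):
    """Return frequency, percentage, mean, and sample SD for one Likert question."""
    if not scores:
        raise ValueError("no responses")
    if any(s not in scale for s in scores):
        raise ValueError("response outside the Likert scale")
    counts = Counter(scores)
    n = len(scores)
    return {
        "n": n,
        # Report every scale point, including those nobody chose.
        "frequency": {point: counts.get(point, 0) for point in scale},
        "percent": {point: 100 * counts.get(point, 0) / n for point in scale},
        "mean": mean(scores),
        # The sample standard deviation is undefined for a single response.
        "sd": stdev(scores) if n > 1 else 0.0,
    }


# Hypothetical responses (not data from this study).
example_scores = [5, 4, 4, 5, 3, 4, 5, 5]
summary = likert_summary(example_scores)
```

Listing every scale point, including unused ones, keeps the frequency table comparable across questions.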
6. Ethical considerations
While this study collected formative assessment data from the results of the previous year’s class operation and survey data for the improvement of classes, it did not collect personal identification data of the study subjects. Thus, the institutional review board approved this study to be exempt from deliberation (1044308-202105-HR-032-01).
Results
1. The results of the online classes and formative assessments
Table 1 presents the number of classes, the number of formative assessments, and the number of formative assessment items per BT category for the FHB, BSCRS, and BSURS subjects taken by first-year students in 2020. Each subject gave students 2–4 formative assessment items after each class, which were classified as “knowledge, comprehension, and application” items. The instructor maintained a 45%–55% inclusion rate for “application” items, so that the basic medical knowledge learned in class could be used in a clinical context through the formative assessment items.
2. The analysis results for the item difficulty
As mentioned earlier, formative assessment should present items of difficulty equivalent to the minimum criterion in order to ascertain whether a student has achieved that criterion. The instructor expected a correct answer rate of 80% for each item. Some “application” items were expected to be difficult for students, but they were included in the formative assessment because the instructor considered them necessary for learning.
The item difficulty (the higher the number, the easier the item) of the formative assessment items of the FHB, BSCRS, and BSURS subjects was analyzed with the classical test theory. The results are presented in Fig. 2. The average difficulty was 0.70 for FHB, 0.64 for BSCRS, and 0.55 for BSURS. Students were able to easily solve “knowledge” items; however, the higher the level of cognitive ability required to solve an item, the lower its average difficulty value, that is, the harder the item. Application items are considered difficult because students are required to apply what they have learned on the basis of a full understanding of the content.
Table 1 shows the number of items with a correct answer rate of 80% or higher. The proportion of such items was 58.7% for FHB, 12.8% for BSCRS, and 0% for BSURS. FHB had the most items with a correct answer rate of 80% or more, and BSURS had none. The difference between the instructor’s prediction and the actual item difficulty is large, so it seems necessary to adjust the item difficulty in the future. However, item difficulty values obtained with the classical test theory depend on the learner group, which should be considered when interpreting them. That is, even for the same item, the calculated difficulty value is high (the item appears easier) in a group of excellent learners, while a lower value is derived for a group of learners with low achievement.
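For reference, under the classical test theory the difficulty of an item is commonly computed as the proportion of respondents who answered it correctly. The Python sketch below illustrates that calculation, together with the mean difficulty per BT category and the share of items meeting an 80% criterion. It is a minimal sketch under stated assumptions, not the analysis script of this study: the function names and the example responses are hypothetical, and responses are assumed to be coded 1 (correct) or 0 (incorrect).

```python
from collections import defaultdict
from statistics import mean


def item_difficulty(responses):
    """Proportion of correct responses (1 = correct, 0 = incorrect)."""
    if not responses:
        raise ValueError("no responses for this item")
    return sum(responses) / len(responses)


def summarize(items, criterion=0.80):
    """Summarize difficulty by BT category and against the criterion.

    items: list of dicts with keys 'category' and 'responses'.
    """
    by_category = defaultdict(list)
    for item in items:
        by_category[item["category"]].append(item_difficulty(item["responses"]))
    all_p = [p for values in by_category.values() for p in values]
    return {
        "mean_by_category": {c: mean(v) for c, v in by_category.items()},
        "overall_mean": mean(all_p),
        # Share of items whose correct answer rate reaches the criterion.
        "share_meeting_criterion": sum(p >= criterion for p in all_p) / len(all_p),
    }


# Hypothetical responses from five students (not data from this study).
example_items = [
    {"category": "knowledge", "responses": [1, 1, 1, 1, 0]},
    {"category": "comprehension", "responses": [1, 1, 1, 0, 0]},
    {"category": "application", "responses": [1, 0, 0, 0, 1]},
    {"category": "application", "responses": [1, 1, 0, 0, 0]},
]

result = summarize(example_items)
```

Because the value is a proportion within one group of respondents, running the same items with a different group would generally give different values, which is the group dependence noted above.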
3. Students’ perception on formative assessment
Of the total 43 students, 26 answered the survey for class improvement; of these, 14 were male (53.8%) and 12 were female (46.2%). Their answers to the 5-point Likert scale questions evaluating whether formative assessments helped their learning are presented in Table 2. According to the students, the formative assessment helped them focus on class, understand the learning content, and achieve learning outcomes. In addition, by solving the “application” items and listening to the professor’s explanation of the correct answer, they were able to apply basic medicine to a clinical context. For all four questions, no student answered 1 (not at all) or 2 (not really).
In detail, Table 3 presents students’ opinions on how the formative assessment helped their learning; whether the formative assessment items for each category of BT helped their learning; their experience of taking a formative assessment after each learning outcome; and their answers to the open-ended questions. Most students answered that they learned what content was most important through formative assessment. They indicated that “comprehension” items helped the most in understanding the class content and that “application” items helped the most in achieving learning outcomes. As for the adequate timing of formative assessments, most students answered that it would be best to conduct formative assessments once at the end of each class. This seems to reflect the burden they felt from undergoing an assessment once or twice in every class. Other opinions included increasing the number of items and modifying the difficulty.
Discussion
Online classes have become common in educational institutions around the world due to the COVID-19 pandemic. Agarwal and Kaushik [17] in 2020 reported that most learners who took online classes via Zoom believed that such classes will be a part of medical education and of the postgraduate curriculum even after the end of the COVID-19 pandemic. Due to such changes in teaching methods and the learning environment, many schools and professors are seeking ways to design effective online classes and increase student participation.
Against this background, we propose the utilization of formative assessments to increase students’ concentration during online classes, to allow instructors to immediately check how well the students understood the learning content, and to allow interaction between instructors and students. This study presented 1–2 formative assessment items after each learning outcome lecture to utilize formative assessments as an online instruction strategy. When developing the items, the BT categories of “knowledge, comprehension, and application” were matched to each item. The results are as follows:
First, the students focused during class because they had to take a formative assessment immediately after the learning outcome lecture, and were thus able to utilize the knowledge acquired during the class. Second, the instructor was able to immediately check the students’ answers in Google Classroom and thus provide instant feedback. In addition, the instructor was able to immediately improve the classes, because the students’ level of understanding could be assessed. Third, “integration of lesson and assessments” was maximized through solving the assessment items as well as through the instructor’s immediate explanation of the answers. Students were able to learn through the problem-solving process. This also means that learning, previously organized by class unit, was further subdivided into learning outcome units, each consisting of a learning process of lecture delivery, formative assessment, and explanation of the correct and incorrect answers and distractors. Fourth, through formative assessment, the students were able to utilize metacognition by learning what content was important and what content they understood or did not understand. Fifth, the formative assessment items of diverse levels of cognitive ability allowed students to understand the content of the class and apply it to a clinical context. Application items themselves became an example of how the knowledge learned was applied; thus, simply solving the question became key learning content.
This study concerns the timing of formative assessment during class and the content of item development. On this basis, we consider the method of formative assessment, the cognitive level to be measured and the appropriate difficulty in item development, and the effect of formative assessment as perceived by students.
First, although formative assessment was not included in grades, it was found that students were stressed just by being frequently exposed to assessment situations. Studies on formative assessment through online platforms or mobile apps like Kahoot! existed before the COVID-19 outbreak [4-6]. These previous studies suggested several methods to make participation in assessment fun and interesting for students. With a mobile app, the assessment can be made entertaining, like a game, and instructors can reward a student who answers more questions or answers faster than other students. This goes beyond assessing the student’s understanding of the lesson: evaluation itself becomes a part of learning, that is, lecture and evaluation are integrated.
Second, the feedback provided after formative assessment enables students’ self-reflection and self-assessment, which makes “assessment as learning” possible. The three most common answers by students about why formative assessment helped their learning were as follows: (1) I learned what the important content was. (2) Important content was repeated. (3) I found out what I knew and what I did not. Of these, “(3) I found out what I knew and what I did not” is a response related to students’ metacognition; students acquire metacognition by self-reflection and self-assessment through commentary and feedback on the items.
Earl [18] in 2013 presented “assessment of learning”, “assessment for learning”, and “assessment as learning” as the paradigms of student assessment. According to Earl [18], “assessment as learning” is a subset of assessment for learning that emphasizes the role of the students. Learning is the process of combining new knowledge with the structure of the student’s existing knowledge. What is important in this process is the student’s own role. In other words, it is important for students to think about what they already knew, what the new knowledge is, and how to internalize it by organizing it with their existing knowledge. Assessment helps with this. Earl [18] stated that “assessment as learning” must be most widely used, where students become the subjects of assessment and check and adjust their own knowledge for further learning. Next, “assessment for learning” must be used to enable instructors to obtain information for their instructional judgement in teaching situations and to provide effective feedback to students. What these two assessment paradigms have in common is that assessment is done to support learning rather than to provide information about the results, and that formative assessment is the preferred assessment method [19].
Third, items that can maximize the function of formative assessment should be developed. According to Seong [10] in 2018, the general characteristics of test tools for formative assessment are as follows: (1) It is conducted by teachers, although recently learners can also participate in test production. (2) It has the characteristics of a criterion-referenced test because it aims to analyze how much students understand the contents of teaching-learning. (3) Because it is criterion-referenced, the test tool should be composed of items whose difficulty corresponds to the criterion that distinguishes success and failure of learning, rather than items of varying difficulty. (4) Its items should continuously arouse learning motivation and interest. (5) Its items must contain distractors that reflect misconceptions that may cause underachievement [10].
However, although formative assessment is expected to have positive functions, it is not easy to devote as much effort to developing its items as is devoted to summative assessment. In this study, BT was applied when developing the items, and 80% of the students were expected to answer each item correctly. Analysis of the actual percentage of correct answers showed a large difference between subjects in the number of items that at least 80% of students answered correctly. It is not easy to develop items with a difficulty of 80%, so more related research is needed. In addition, to accurately diagnose a student’s current understanding of class content through formative assessment, it is important to develop distractors. If each distractor is developed based on a cognitive factor that can lead students to make mistakes, the distractor a student selects can provide accurate diagnostic data. As formative assessment is being emphasized, continuous research is needed to develop items.
This study has a limitation. As a case study on the design, item development, item analysis, and resulting student perception of formative assessment carried out in a real-time online physiology class at one school, its results are difficult to generalize. However, it is based on data continuously accumulated through 48 hours of classes taught by one professor in one semester. This study is significant in that it suggests an appropriate timing for formative assessment in online classes and addresses the application of BT when developing formative assessment items.