Korean J Med Educ > Volume 36(1); 2024 > Article
Chang, Kim, and Park: Examination of medical students’ opinions on multimedia learning materials according to social cues: focusing on sound principles



Although interest in various forms of learning media is increasing due to the coronavirus disease 2019 (COVID-19) pandemic, there is relatively little research on influencing student motivation by intervening in cognitive processing. The purpose of this study was to present the optimal form of learning materials provided to medical students.


Learning materials were provided in class in three formats that differed in social cues (script, video [artificial intelligence (AI) voice], video [professor voice]). The formats were based on the principle of voices, one of the four social cue principles of multimedia learning (personalization, voices, image, and embodiment), and students' opinions were surveyed.


There was no statistically significant difference in satisfaction or learning help according to social cues, but both were ranked in the following order: videos containing the professor's voice, silent scripts, and videos containing the AI voice.


This study is significant in that there has been no research on the impact on student motivation of providing learning materials for medical school education in Korea, and we hope that it will help in providing learning materials for the self-directed learning of medical students in the post-COVID-19 era.

Introduction


Researchers who study learning using multimedia argue that, in addition to what learning materials should be included, how these learning materials are presented is crucial [1]. Multimedia, which assists in conveying learning materials to students, has diversified and developed as technology has advanced. As technology has made online education possible, educational research on videos that combine various types of multimedia such as text, pictures, animation, and music has been conducted for several years. However, because learning using these new technologies does not necessarily lead to academic achievements or enhanced motivation, educators must continuously consider what types of videos are effective for students’ learning [2].
Due to the coronavirus disease 2019 (COVID-19) pandemic that began at the end of 2019, Korea's elementary, middle, and high schools, as well as universities, postponed the start of the semester in 2020 and implemented online classes to prevent its spread [3]. To help students learn, schools and teachers had no choice but to use new communication platforms to provide learning materials in ways that were not previously used offline [4]. Similar online learning trends were already occurring around the world before the onset of the pandemic, and specific groups (e.g., MOOCs [Massive Open Online Courses], Khan Academy, some online courses) may have been interested in this technology before the pandemic started. Afterward, however, online learning became a hot topic worldwide.
Although there are many studies available on how cognitive processing is stimulated by multimedia learning, there are relatively few studies on how to stimulate students’ motivation by utilizing social cues to enable cognitive processing [2]. Because the use of social cues is a key factor in the schema that determines whether a student perceives a video as a simple information carrier or a social conversation partner [5], it is also a factor that affects a student’s motivation to learn. As social cues motivate students to learn effectively during cognitive processing [6], the instructor’s voice and message delivery style have a significant impact on the student’s knowledge acquisition process [2]. In other words, the presence of social cues helps learners learn effectively by eliciting social responses.
Generally, social cues used in multimedia learning include the principles of personalization, voices, imagery, and embodiment [6]. The principle of personalization dictates that a conversational and polite method of communication is more helpful for learning than one that is formal or directive, while the principle of voices dictates that a human voice is more helpful for learning than a voice reproduced by a machine. Additionally, the image principle dictates that displaying the instructor (or pedagogical agent) on a screen is not necessarily helpful for learning. Finally, the embodiment principle dictates that the instructor (or pedagogical agent) using personable actions on the screen, such as gestures, eye contact, and facial expressions, is helpful for learning.
This study, the first of its kind in Korean medical education, focused on the principle of voices. It examined how learning content presented in different formats (text, artificial intelligence [AI]-generated voice, human voice) affected medical students' learning satisfaction and academic achievement. Accordingly, the purpose of this study was to suggest an optimal form of learning material provided to medical students.


Methods

1. Participants

This study was conducted in 2023 on 50 first-year medical students who received non-face-to-face education starting from when they were freshmen due to the COVID-19 pandemic. All participants were informed about the study and informed consent was obtained. All surveys were conducted online using Google Survey (Google LLC, Mountain View, USA). Excluding the four questionnaires that were answered insincerely, 46 questionnaires were analyzed.

2. Design

This study provided learning materials in the "Structure and Function of the Human Body" class in formats that differed in social cues (script, video [AI voice], video [professor voice]), based on the principle of voices in social cues. The classes were held about 5 days apart over 2 weeks, and each class lasted about 3 hours. So that students could evaluate the three formats as objectively as possible, a survey was conducted immediately after the last class.

3. Development of the survey

Two medical education experts and one medical expert created the survey for the study. In the "Structure and Function of the Human Body" class, learning materials were provided in different formats based on social cues (script without voice, video accompanied by an AI-generated voice, video accompanied by an instructor's voice), and students' opinions were queried using a survey. The survey consisted of questions involving academic achievement, satisfaction, and personal opinions. Regarding the level of academic achievement, students were asked to select between a high group and a low group. For each format, satisfaction and learning help were rated on a 5-point Likert scale, and students were asked to describe in their own words the strengths of the format and the improvements it needed.

4. Data analysis

In this study, the following analysis method was used to examine medical students’ opinions about learning materials according to the format of social cues in consideration of how to best provide optimal learning materials. First, a repeated measures analysis of variance was conducted to determine whether there was a difference in the level of satisfaction with scripts, videos accompanied by an AI-generated voice, and videos accompanied by an instructor’s voice and the degree of learning help that each had. Second, a K-independent sample test of non-parametric statistics was conducted to determine whether there was a difference in the level of satisfaction and learning help for each format of social cues depending on academic achievement. Third, multiple regression analysis was conducted to determine the impact of satisfaction and learning help according to the format of social cues on academic achievement. Lastly, content analysis was conducted to confirm the characteristics of the conveyed information and the advantages and areas for improvement regarding the given format of social cues which were queried about in a subjective manner [7].

5. Ethics statement

This study was confirmed to be exempt from review by the Institutional Review Board of Eulji University (management no., EUIRB2023-060).


Results

1. Academic achievement and the level of satisfaction and learning help according to format of social cues

No significant differences were found in satisfaction (p=0.19) or learning help (p=0.29) depending on the type of social cue. In addition, it was confirmed that there was no interaction effect for applying different formats of social cues over time (satisfaction: p=0.52, learning help: p=0.55) (Table 1). The average levels of both satisfaction and learning help according to the format of social cues were in the following order: video accompanied by a professor's voice, silent video accompanied by a script, and video accompanied by an AI-generated voice (Table 2).
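As a rough arithmetic check on Table 1, the F statistics and p-values can be recomputed from the reported sums of squares. The sketch below is not the authors' analysis code. It assumes that no sphericity correction was applied, so the degrees of freedom are exactly 2 and 88 as printed, and it relies on a closed form for the upper tail of the F distribution that holds only when the numerator has 2 degrees of freedom. The function and variable names are illustrative.

```python
def f_test_two_numerator_df(ss_effect, ss_error, df_error):
    """Return (F, p) for an effect with exactly 2 numerator degrees of freedom."""
    ms_effect = ss_effect / 2.0          # mean square of the effect
    ms_error = ss_error / df_error       # mean square of the error term
    f_value = ms_effect / ms_error
    # Upper-tail probability of F(2, df_error); this closed form is valid
    # only because the numerator degrees of freedom equal 2.
    p_value = (df_error / (df_error + 2.0 * f_value)) ** (df_error / 2.0)
    return f_value, p_value


DF_ERROR = 88  # error degrees of freedom as reported in Table 1

# (sum of squares of the effect, sum of squares of the error) from Table 1
TABLE_1_ROWS = {
    "satisfaction, level": (4.88, 127.86),
    "satisfaction, level x achievement group": (1.93, 127.86),
    "learning help, level": (3.01, 104.20),
    "learning help, level x achievement group": (1.45, 104.20),
}

recomputed = {
    name: f_test_two_numerator_df(ss_effect, ss_error, DF_ERROR)
    for name, (ss_effect, ss_error) in TABLE_1_ROWS.items()
}

for name, (f_value, p_value) in recomputed.items():
    print(f"{name}: F = {f_value:.2f}, p = {p_value:.2f}")
```

Because the inputs are already rounded to two decimals, the recomputed values can differ from the table in the last digit.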

2. Differences in satisfaction and learning help according to the format of social cues in relation to academic achievement

It was confirmed that there was no difference in the level of satisfaction and learning help according to the format of social cues depending on academic achievement (Table 2).
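The rank statistics in Table 2 can be sanity-checked with the standard rank-sum arithmetic. The sketch below is illustrative only and is not the authors' procedure. It assumes two groups of 23 students each, which is inferred from the table (each average rank equals its rank sum divided by 23) rather than stated in the text, and it uses the normal approximation without a tie correction. Since 5-point Likert responses contain many ties, the uncorrected z is expected to be slightly smaller in magnitude than the reported Z. The sign depends on which group is taken as the reference, so only magnitudes are compared.

```python
import math

N_TOP = 23    # assumed size of the top group (inferred from Table 2)
N_OTHER = 23  # assumed size of the other achievement group (inferred)

# (rank sum of top group, rank sum of other group, reported Z) from Table 2
TABLE_2_ROWS = {
    "satisfaction, script": (532.00, 549.00, -0.19),
    "satisfaction, video (AI voice)": (570.00, 511.00, -0.67),
    "satisfaction, video (professor's voice)": (506.50, 574.50, -0.78),
    "learning help, script": (511.50, 569.50, -0.66),
    "learning help, video (AI voice)": (559.50, 521.50, -0.43),
    "learning help, video (professor's voice)": (509.50, 571.50, -0.71),
}


def rank_sum_check(rank_sum_top, rank_sum_other, n1=N_TOP, n2=N_OTHER):
    """Check rank-sum consistency and compute an uncorrected z statistic."""
    n = n1 + n2
    expected_total = n * (n + 1) / 2               # all ranks 1..n must add up to this
    u_top = rank_sum_top - n1 * (n1 + 1) / 2       # Mann-Whitney U for the top group
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n + 1) / 12)       # no tie correction applied
    return {
        "sums_consistent": math.isclose(rank_sum_top + rank_sum_other, expected_total),
        "average_rank_top": rank_sum_top / n1,
        "u_top": u_top,
        "z_uncorrected": (u_top - mean_u) / sd_u,
    }


checks = {name: rank_sum_check(top, other) for name, (top, other, _) in TABLE_2_ROWS.items()}

for name, result in checks.items():
    reported_z = TABLE_2_ROWS[name][2]
    print(f"{name}: sums consistent = {result['sums_consistent']}, "
          f"|z| uncorrected = {abs(result['z_uncorrected']):.2f}, "
          f"|Z| reported = {abs(reported_z):.2f}")
```

Under these assumptions, every pair of rank sums adds up to 46 × 47 / 2 = 1081, which supports reading the table as two groups of 23.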

3. Advantages and improvements to social cue formats

Each time a social cue format was experienced, questions were asked about the strengths of and improvements needed for each social cue format. First, although the script was easy to read and review repeatedly, there were complaints of difficulty in understanding the important parts, diagrams, and pictures. Second, the video accompanied by an AI-generated voice was effective, with accurate pronunciation and speech speed, but it was still perceived that the delivery of information was poor due to the unnatural speaking style. Third, the video accompanied by a professor’s voice was found to be similar to a face-to-face class. In addition, these classes were felt to be immersive, and important parts of the lessons were understandable to the participants. However, there were complaints of some issues such as speaking speed and inaccurate pronunciation (Table 3).

Discussion


This study found that among the four principles (personalization, voices, imagery, and embodiment) of social cues, no statistically significant difference in terms of efficacy could be identified for the principle of voices. Additionally, no statistically significant difference could be identified in relation to participants' level of academic achievement. No statistically significant difference could be found in the level of satisfaction and learning help for medical students, regardless of what type of learning materials were provided. However, the video accompanied by a professor's voice showed a numerically higher level of satisfaction, and the format containing a script was deemed to have a higher level of learning help. In fact, evidence for the voice principle in multimedia learning research is limited [6], but the present results are similar to those of a study in which learners favored a human voice and performed better on subsequent transfer tests [8] and a study that confirmed better learning by including machine voices and human voices in an animated pedagogical agent [9]. Based on these results and the opinions collected from students, it is recommended that learning materials based on social cues include scripts as well as video accompanied by a professor's voice.
First, the AI-generated voice used in the study was found to have accurate pronunciation and a speed that students can adjust. However, students may feel uncomfortable with the artificial-sounding voice. In addition, silent scripts and AI-generated voices are limited in how they can guide students to the important parts unless these are explicitly written down or spoken. The ability of these formats to convey information is therefore poor, which makes the content easier to misunderstand.
Second, presenting learning materials in a variety of formats can help medical students absorb the vast amount of education given to them because these materials can be tailored to each student’s comprehension speed and learning style. For example, after learning with a video accompanied by a human voice, each student can quickly and appropriately re-learn the parts they do not know or are confused about using videos or scripts according to their own situation or learning tendency, utilizing this material for reviewing concepts and clearing misconceptions.
From the instructor's perspective, providing both a script and a video is not difficult if the script is written before the video is produced, whereas producing a video without a script requires considerable additional time and effort to write the script afterward. However, applications that convert speech into text are available, so providing both video learning materials and scripts to students should not be very difficult.
This study has the following limitations. First, it is difficult to generalize the findings because the study was conducted at one medical school. Second, several variables were administered to the same group, although it was statistically confirmed that there was no interaction effect depending on time. Third, the survey was not conducted immediately after class. Lastly, other subjects were also being taught during the period in which the study was conducted. However, it was statistically confirmed that there was no interaction effect over time.
Despite these limitations, this study is significant because there has been no research on the impact on student motivation of providing learning materials for medical education in Korea. We hope that the results of this study will be helpful in providing learning materials for medical students' self-directed learning in the post-COVID-19 era and in providing flipped learning introductory materials for medical education. We also hope that further research in this field will be actively conducted in the future.




Funding
No financial support was received for this study.
Conflicts of interest
No potential conflict of interest relevant to this article was reported.
Author contributions
Conceptualization: WSC, HJP; data curation: WSC; formal analysis: HJP; funding acquisition: YRK; methodology: YRK, HJP; project administration: YRK; visualization: YRK, HJP; Writing–original draft: WSC, HJP; Writing–review & editing: WSC, HJP; investigation: WSC; resources: WSC; and validation: WSC, HJP, YRK.

Table 1.
Academic Achievement and the Level of Satisfaction and Learning Help at the Level of Social Cues
| Variable | Sum of squares | Degrees of freedom | Mean square | F | p-value |
| --- | --- | --- | --- | --- | --- |
| Level of satisfaction | | | | | |
| Level | 4.88 | 2.00 | 2.44 | 1.68 | 0.19 |
| Level*academic achievement group | 1.93 | 2.00 | 0.96 | 0.66 | 0.52 |
| Error | 127.86 | 88.00 | 1.45 | | |
| Level of learning help | | | | | |
| Level | 3.01 | 2.00 | 1.51 | 1.27 | 0.29 |
| Level*academic achievement group | 1.45 | 2.00 | 0.73 | 0.61 | 0.55 |
| Error | 104.20 | 88.00 | 1.18 | | |

Table 2.
Comparison of the Level of Satisfaction and Learning Help at the Level of Social Cues according to Academic Achievement
| Variable | Total: Mean±SD | Top group: Mean±SD | Top group: Average rank | Top group: Rank sum | Low group: Mean±SD | Low group: Average rank | Low group: Rank sum | Z | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Level of satisfaction | | | | | | | | | |
| Script | 2.83±1.47 | 2.87±1.58 | 23.13 | 532.00 | 2.78±1.38 | 23.87 | 549.00 | -0.19 | 0.85 |
| Video (AI voice) | 2.54±1.41 | 2.39±1.34 | 24.78 | 570.00 | 2.70±1.49 | 22.22 | 511.00 | -0.67 | 0.51 |
| Video (professor's voice) | 3.00±1.01 | 3.13±1.10 | 22.02 | 506.50 | 2.87±0.92 | 24.98 | 574.50 | -0.78 | 0.44 |
| Level of learning help | | | | | | | | | |
| Script | 2.96±1.35 | 3.09±1.38 | 22.24 | 511.50 | 2.83±1.34 | 24.76 | 569.50 | -0.66 | 0.51 |
| Video (AI voice) | 2.70±1.31 | 2.61±1.37 | 24.33 | 559.50 | 2.78±1.28 | 22.67 | 521.50 | -0.43 | 0.67 |
| Video (professor's voice) | 2.90±1.25 | 3.17±1.15 | 22.15 | 509.50 | 2.91±0.95 | 24.85 | 571.50 | -0.71 | 0.48 |

SD: Standard deviation, AI: Artificial intelligence.

Table 3.
Advantages and Improvements to Social Cue Levels
| How to provide | Advantages | Improvements |
| --- | --- | --- |
| Script | Repeatable reading possible | It is difficult to understand complex concepts using only scripts. |
| | Easy to understand and review (easier to check unfamiliar areas in text than in video) | Difficulty understanding diagrams and pictures |
| | Ability to learn at your own pace | Difficulty identifying important parts |
| | | Easy to have misconceptions |
| | | Similar to reading a textbook (reading text is difficult to understand) |
| Video (AI voice) | Correct pronunciation | No pitch of speech |
| | Consistent speech speed and intensity | Unnatural way of speaking |
| | Can be heard slowly | Awkward pronunciation due to mixing English and Korean |
| | Speech speed and intensity can be adjusted | Resistance to artificial voices |
| | | Transmission power is poor. |
| Video (professor's voice) | Similar to lecture class | Speech speed is fast |
| | Can be relearned | Small voice, unclear pronunciation |
| | Immersive and easier to understand than mechanical sounds | Similar to the lecture format, but the length of the lecture is short and difficult to understand |
| | You can check important parts | |
| | Easy to understand pictures and graphs | |

AI: Artificial intelligence.

References


1. Töpper J, Glaser M, Schwan S. Extending social cue based principles of multimedia learning beyond their immediate effects. Learn Instr 2014;29:10-20.
2. Um HS, Park IW. Effects of showing the video instructor and segmenting the video lectures on learning outcomes. Korean J Educ Methodol Stud 2016;28(2):369-393.

3. Park HJ, Kim BH, Kim YR. Learning contexts of medical students in offline education and online education. J Humanit Soc Sci 2021;12:2179-2190.
4. Park HJ, Woo RS, Song DY, Yoo HI. Exploring medical students’ perception of non-face-to-face theory and face-to-face laboratory classes during COVID-19 pandemic: focusing on anatomy course. Korean J Med Educ 2022;34(3):223-229.
5. Um HS, Park IW. The effect of social cues in multimedia learning on learning performance. J Educ Inf Media 2016;22(3):583-603.
6. Mayer RE, Fiorella L. The Cambridge handbook of multimedia learning. Cambridge, UK: Cambridge University Press; 2022.

7. Park DS, Kwon JG, Kim JM, Nam HW, Yang GS, Won HH, et al. Research methodology in education. Paju, Korea: Kyoyookkwahaksa; 2020.

8. Atkinson RK, Mayer RE, Merrill MM. Fostering social agency in multimedia learning: examining the impact of an animated agent’s voice. Contemp Educ Psychol 2005;30(1):117-139.
9. Mayer RE, DaPra CS. An embodiment effect in computer-based learning with animated pedagogical agents. J Exp Psychol Appl 2012;18(3):239-252.
Copyright © 2024 by Korean Society of Medical Education.