A Single-Blinded, Direct Observational Study of PGY-1 Interns and PGY-2 Residents in Evaluating their History Taking and Physical Examination Skills


Sandeep Sharma, MD, DrPH(c)

Fall 2011 - Volume 15 Number 4



Background: Internal Medicine residents and interns are often the first contact for newly admitted patients in a teaching hospital. The proper evaluation, diagnosis, and treatment may depend on this initial encounter.
Objectives: To evaluate the history-taking and physical-examination skills of interns/residents on new admissions to the medical floors; to compare data from the patient encounter to the chart for evidence of accuracy; to measure the time spent on the initial encounter.
Methods: An independent medical observer used a yes/no checklist with 60 variables in a single-blinded observational study. Frequency tables were generated and results were based on descriptive statistics.
Results: In 7 categories specifically aimed at chart review for accuracy, discrepancies were found between what medical post-graduate year (PGY)-1 interns and PGY-2 residents (interns/residents) recorded in the patient's chart and the actions observed during the patient encounter. There were 25 encounters observed. In 64%, the time spent on history taking was ≤7 minutes. In 68%, the time spent on the physical examination was ≤5 minutes. In 72%, patients were not asked about family medical history. None of the observed interns/residents took their own measurements of the patient's blood pressure. No intern/resident asked about recent weight loss, weight gain, or level of salt intake, even though some patients had a history of hypertension; nor did any examine the fundi, accommodation, thyroid, carotids, or hearing. The majority of patients were asked about chest pain, cough, nausea, vomiting, chief complaint, and the onset of symptoms.
Conclusions: This study documents the poor overall performance in the quality of history-taking and physical-examination skills on newly admitted patients.


Proper medical care depends not only on the knowledge base of clinicians, but also on their compulsiveness and their integrity. Several published studies have evaluated the skills of interns/residents. Evaluation methods used in previously published studies have included direct observation,1-4 mini-clinical evaluation exercises (CEX),5-11 objective structured clinical evaluation (OSCE),12-14 chart review,15 standardized patients and checklists,16-21 a 360-degree evaluation instrument,22 and use of a standardized patient satisfaction questionnaire.23 Each of these evaluation tools is imperfect. Some tools use artificial situations whereas others suffer from the Hawthorne effect, in which clinical performance of the physician is greatly enhanced by knowledge that they are being evaluated. Moreover, none of these techniques has been designed to assess what the physician actually asked and examined compared with the actual work product. A review of these published articles found no single-blinded, direct observation of histories and physical examinations conducted during the actual encounter with the patient. Because of a concern that the usual evaluation tools seriously overestimate physician performance, I undertook a single-blinded, direct observational study of Internal Medicine post-graduate year (PGY)-1 interns and PGY-2 residents (interns/residents) to evaluate their history-taking and physical-examination skills as well as to compare the observed data collection with what they actually reported.


Methods

Direct Observation and Chart Review

A health policy doctoral candidate with an Educational Commission for Foreign Medical Graduates (ECFMG) certified medical degree with US clinical experience was recruited to directly observe the initial history taking and physical examinations performed by interns/residents of a New York City teaching hospital. It was imperative to this study that an independent (not affiliated with the study institution) observer was used who was not known to the interns/residents. The observer introduced himself to both the intern/resident conducting the patient encounter and to the patient as a medical researcher who wanted to learn about taking a proper history and physical examination. With the oral consent of both the intern/resident and the patient, the observer was present in the room and did not interfere with the history-taking and physical-examination process. Among the papers in the observer's hand was a thorough checklist with 60 variables that consisted of yes/no answers regarding the history and physical examination. During the actual patient encounter, the observer discreetly marked on the checklist to avoid relying on his memory to complete the checklist afterwards. The intern/resident was completely unaware that s/he was being evaluated by the observer during the patient encounter. The intern/resident had no prior knowledge from colleagues or the Residency Program Director about an evaluation. Hence, this direct observational study was single-blinded. The observer also recorded the length of time used in both the history-taking and physical-examination portions of the examination as an indication of completeness. Another important element of the checklist was the chart evidence section. After the intern/resident note was written from the encounter, the observer reviewed the results of several variables in the patient's chart to determine the degree of accuracy of the recorded information compared with what was actually performed during the encounter. 
The 7 variables used for chart review were: eye movements, PERRLA (pupils equal, round, reactive to light and accommodation), blood pressure, pulses, reflexes, muscle strength, and rectal examination. These 7 variables were chosen in particular because comments such as EOMI (extraocular movements intact), PERRLA, and guaiac are regularly seen in interns/residents' notes.

During the two-week period of the study, 15 interns/residents were evaluated in 25 patient encounters (1 to 3 patients per intern/resident). Of the 25 patients, 14 were female and 11 were male. The 25 encounters consisted of abdominal pain (5), chest pain (3), respiratory disorder (6), neurological conditions (4), and "other" (7), consisting of hypokalemia, fever, sepsis, extremity pain, penile pain, and cellulitis.


After the observational part of the study was completed, a questionnaire was distributed to all interns/residents (PGY-1 and PGY-2), which asked them to estimate the average time they spent on history taking and physical examination of a new admission to the medical service. They were also asked to estimate how often (percentage of time) they personally completed 34 separate elements of the medical history and how often (percentage of time) they personally performed 26 elements of a physical examination. These elements were identical to the 60 elements the observer evaluated during the observed history taking and physical examination. The interns/residents were told the survey was anonymous and were encouraged to answer the questions honestly. No identifying information such as name, PGY, or sex was asked on the questionnaire to help ensure anonymity. Of 50 questionnaires distributed 43 were completed. Participation in the survey was voluntary.


The yes/no answers on the checklist were converted into codes (0 = no/not done, 1 = yes/done, 9 = not applicable). The sex of the patient was also coded (0 = female, 1 = male). In the chart evidence section of the checklist, the following codes were designated (1 = completed during encounter, recorded completed in the chart; 2 = completed during encounter, did not record in chart; 3 = did not complete during encounter, did not record in chart; 4 = did not complete during encounter, but recorded completed in chart). The codes were entered into SPSS Version 11 (SPSS Inc, Chicago, IL), and statistical analysis used the χ² test. The identities of the interns/residents, the patients, and the hospital were all kept anonymous.
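The coding scheme described above can be illustrated with a short sketch. This is not the study's SPSS procedure: the Python function names and the sample records are hypothetical, and the choice to exclude "not applicable" (code 9) from the denominator is an assumption that the text does not confirm.

```python
from collections import Counter

# Checklist codes as described in the text.
CHECKLIST_CODES = {0: "no/not done", 1: "yes/done", 9: "not applicable"}

# Chart-evidence codes as described in the text.
CHART_CODES = {
    1: "completed during encounter, recorded in chart",
    2: "completed during encounter, not recorded in chart",
    3: "not completed during encounter, not recorded in chart",
    4: "not completed during encounter, but recorded in chart",
}


def frequency_table(codes, labels):
    """Return {label: (count, percent)} for a list of coded observations."""
    total = len(codes)
    if total == 0:
        return {}
    counts = Counter(codes)
    return {
        label: (counts.get(code, 0), round(100 * counts.get(code, 0) / total, 1))
        for code, label in labels.items()
    }


def percent_done(codes):
    """Percentage of applicable encounters coded 1.

    Assumption: encounters coded 9 (not applicable) are left out of the
    denominator. The article does not say how these were handled.
    """
    applicable = [c for c in codes if c != 9]
    if not applicable:
        return None
    return round(100 * sum(1 for c in applicable if c == 1) / len(applicable), 1)


# Hypothetical codes for one chart-review variable across 5 encounters
# (illustrative only; these are not the study's observations).
example_chart_codes = [1, 4, 4, 3, 4]
table = frequency_table(example_chart_codes, CHART_CODES)
for label, (count, pct) in table.items():
    print(f"{label}: {count} ({pct}%)")
```

In this sketch, code 4 is the category of interest for documentation accuracy, since it marks an item recorded in the chart that was not observed during the encounter.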

This research was approved by the institutional review board of the hospital where the study took place.


Results

Direct Observation and Chart Review

History—There were 25 patient encounters. In 36%, interns/residents did not introduce themselves to the patient. In 72%, the intern/resident did not explain what s/he was there to do.

The average, minimum, and maximum lengths of time for both the history-taking and physical-examination portions are shown in Table 1. In 64%, the amount of time spent during history taking was ≤7 minutes. In 32%, the time spent for history taking was ≤5 minutes, and in one case, 2 minutes. However, in 16%, the time spent for history taking was 12 to 15 minutes. Table 2 shows the frequency of occurrence, by percentage, for the 36 variables that were asked by the intern/resident during history taking. All patients (100%) were asked about their current medications; however, in 96% of the cases, the patients were not asked if they were taking those medications regularly, as prescribed. Nearly all patients were asked about their chief complaint (96%) and when their symptoms started (96%). A majority were asked about symptoms such as chest pain (88%), cough (80%), and nausea or vomiting (80%), whereas questions about other symptoms were asked in only a minority of the encounters (eg, urinary problems [36%], visual problems [24%], and joint pain [20%]).

There were 5 variables (level of education, salt intake, weight loss/gain, sexually transmitted infections, and erectile problems) that were not addressed in any of the 25 encounters. Other important historic questions that were asked in <50% of the encounters included: allergies (44%), prior surgeries (32%), family history (28%), dietary history (12%), and occupation history (4%).

Physical Examination—For the physical examination in the 25 encounters, 68% took ≤5 minutes, 84% took ≤6 minutes. In 8%, the physical examination took 3 minutes. On the other hand, in one case, one examiner took 20 minutes and performed a thorough physical examination.

Table 3 shows the number and percentage of cases correctly performed during the physical portion of the examination. No patients were unnecessarily exposed and all patients had cardiac, abdominal, and pulmonary examinations to some extent. In 84% of cases, breath sounds were examined over the gown, and in 76%, cardiac auscultations were performed over the gown. No intern/resident independently took the patient's blood pressure. In 92% of the encounters, pulse was not measured. No intern/resident examined the patient's fundi, felt the carotids, checked the thyroid, or performed a pelvic examination. In a minority of cases, the examiner tested eye movements (4%), tested reflexes (12%), and observed the patient walk (8%). A rectal exam was asked for or performed in only 1 patient (4%) despite 5 patients (20%) presenting with abdominal pain. No pelvic exams were requested despite 2 women patients presenting with abdominal pain. Of the 24 physical exam variables evaluated, 12 were performed <10% of the time and 7 of those variables were never performed during the 25 witnessed examinations.

Chart Review—Figure 1 demonstrates discrepancies in patient chart documentation by the intern/resident between what s/he tested on physical examination and what s/he documented in the written history and physical for each of 7 evaluated variables. A significant number of training physicians misrepresented that they performed tests, when in fact they had not (eye movements 60%, pupils and accommodation 80%; blood pressure 100%, pulses 44%, reflexes 20%, muscle strength 44%, and rectal examination 24%). In no cases did a physician examine a variable and fail to document it. Table 4 shows frequency tables on accuracy of documentation in patients' charts.
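To make the percentages above concrete, the following sketch converts them into implied numbers of encounters. It assumes that every percentage uses all 25 observed encounters as its denominator, which the text does not state explicitly; the variable and function names are illustrative only.

```python
# Percentages reported in the text for items documented in the chart
# but not observed during the encounter.
N_ENCOUNTERS = 25  # assumption: same denominator for every variable

reported_pct = {
    "eye movements": 60,
    "pupils and accommodation": 80,
    "blood pressure": 100,
    "pulses": 44,
    "reflexes": 20,
    "muscle strength": 44,
    "rectal examination": 24,
}


def implied_count(pct, n=N_ENCOUNTERS):
    """Number of encounters implied by a percentage of n encounters."""
    return round(pct * n / 100)


implied = {name: implied_count(pct) for name, pct in reported_pct.items()}

for name, count in implied.items():
    print(f"{name}: {count} of {N_ENCOUNTERS} encounters")
```

With 25 encounters, each encounter contributes 4 percentage points, so every reported figure corresponds to a whole number of encounters under this assumption.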

Intern/Resident Survey

History—On the survey, interns/residents were asked about the 36 historic variables. In all variables except one (current medications), the estimate of tasks completed by the interns/residents was greater, and sometimes significantly greater than the observed frequency. Questions about 8 variables (hearing, depression, occupational history, weight gain/loss, level of education, salt intake, erectile problems, and sexually transmitted infections) were asked <10% of the time although it was estimated each was asked more often, ranging from erectile problems (46%) to occupational history (74%).
These results contrast with the estimated length of time interns/residents reported on the survey that they spend. The mean amount of time they estimated spending on history taking was 28 minutes (minimum 8 minutes; maximum 90 minutes) vs actual time of 7 minutes (p < 0.001); whereas the mean time they estimated performing a physical examination was 15 minutes (minimum 5 minutes; maximum 45 minutes) vs actual time of 5 minutes (p < 0.001).

Physical Exam—On the survey completed by the interns/residents, in 22 out of 24 physical examination variables, estimated compliance was statistically higher than actual compliance. Six elements of the physical examination were never observed although they were reported to have been performed, with estimates ranging from fundus examination (9%) to testing of pupillary accommodation (69%).



The results obtained during this study demonstrated widespread deficiencies in both completeness of history taking and physical examination, and in the integrity of the written report. The study conducted at this institution was extremely important to elucidate intern/resident practices and the single-blinded nature allowed a level of objectivity in assessing medical care for newly admitted patients.

Although it is expected that interns/residents will read notes written in the Emergency Department before commencing the patient encounter on the medical floor, they are taught to complete a thorough history and physical examination. It is unacceptable that in 36% of patient encounters, the interns/residents did not introduce themselves to the patient, but instead immediately began questioning upon entering the room. Patients are reassured if interns/residents explain what they are there to do. Such improvements in communication and bedside manner would be expected to benefit patients.
As shown in Table 1, the amount of time spent on each portion of the examination appears greatly inadequate. Of note, on the survey, interns/residents estimated that they spend an average of 28 minutes for history taking and 15 minutes for physical examination.

The extent of the inadequacies of the interns/residents in performing basic skills in obtaining histories and physical exams is remarkable. Many of the interns/residents omitted a number of questions, which contributed to less time being spent on history taking of the newly admitted patient. Even though there were some patients with chest pain and shortness of breath, no intern/resident asked about weight gain or salt use. Less than half of the interns/residents inquired into such basic areas as allergies, prior surgeries, or family history of medical problems. About 25% of patients were not asked about basic health issues such as smoking, alcohol use, or illicit drug use. Although nearly all interns/residents asked the patient's chief complaint, when the symptoms started, and what the current medications were, these questions were rarely followed up with detailed questions to fully develop the nature of the patient's present illness. In one case in which the patient complained of penile pain, the intern/resident did not ask any questions concerning sexual activity, sexually transmitted infections, erectile problems, or urinary symptoms.

The deficiencies seen on the physical examinations were even more pronounced. Although every patient had, to some extent, an examination of the abdomen, chest, and heart, those examinations were performed over the gown approximately 80% of the time. No other element of the physical examination (with the exception of listening to bowel sounds) was performed more than 40% of the time. Only 1 of the 6 women patients <55 years of age was asked about last menstrual period. Only 1 of the 14 women patients was asked about most recent mammogram. In the 4 neurologic cases, mini-mental status exams were not performed. Twelve of the 26 elements evaluated occurred <10% of the time and no intern/resident measured a patient's blood pressure, examined the carotids or thyroid, or performed a pelvic exam. In fairness, one would only expect a rectal or pelvic exam to be requested under appropriate medical conditions, eg, abdominal pain.

These results were in stark contrast to the data obtained in the survey. Although there was direct observation of 30% of the interns/residents, the returned surveys sampled 86% of the interns/residents. Although direct correlation is not possible, agreement was reached between the Department Chairman, the Program Director, and the Medical Researcher that, because of their extensive experience training and managing interns/residents in this program, associations could be suggested and potential explanations offered.
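The two sampling percentages above follow from figures given in the Methods. The sketch below reproduces the arithmetic, assuming the 50 questionnaires distributed to all interns/residents represent the full trainee population; the constant and function names are illustrative only.

```python
# Figures taken from the text.
TOTAL_TRAINEES = 50    # assumption: one questionnaire per intern/resident
OBSERVED = 15          # interns/residents directly observed
SURVEYS_RETURNED = 43  # completed questionnaires


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


observed_pct = percent(OBSERVED, TOTAL_TRAINEES)
returned_pct = percent(SURVEYS_RETURNED, TOTAL_TRAINEES)

print(f"Directly observed: {observed_pct}%")  # 15 of 50
print(f"Surveys returned: {returned_pct}%")   # 43 of 50
```

Because the survey was anonymous, the observed and surveyed groups cannot be linked at the individual level, which is why only group-level comparisons are possible.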

Of the 5 questions in the history that were not asked, the survey reported that the questions are usually asked 46% to 72% of the time. In the physical examination, 7 items were never examined yet the survey reported that they routinely test these items an average of 43% of the time. No intern/resident personally took a blood pressure, and only 8% actually measured the pulse; yet the survey reported that they personally took the patient's blood pressure 49% of the time and measured the pulse 59% of the time. Interns/residents listened to the lungs under the gown only 16% of the time, although they estimated doing so 89% of the time.

Unfortunately, the study also confirmed faculty concerns that there are multiple discrepancies in the charts. For the 7 variables in Figure 1, the percentage of discrepancies in documented examinations ranges from 20% to 100%. It may be common practice to record a blood pressure even if you didn't personally measure it; however, the practice could be considered a subtle form of intellectual deception. This misrepresentation can be minimized by documenting the source of the result or finding in the intern/resident note. The interns/residents also miss the opportunity to see if important vital signs change during the hospital course.

Completeness of history taking and physical examination practiced during patient encounters is encouraged so that interns/residents may make their own proper assessment and treatment plan. As well, a more thorough history taking and physical examination would make the intern/resident aware of other significant health issues warranting attention.

What are possible explanations for the performance demonstrated in this study? It is difficult to explain these large differences by perception alone. Other operative factors for the inflated estimates in the survey could include fear of discovery, fear of being subjected to more control and/or scrutiny, fear of affecting the program's reputation, a sense of shame about actual performance, and fear of offending the Program Director. To what extent any of these factors (or any other factors) is operative is impossible to determine. In terms of medical knowledge, the interns/residents at the institution have been tested in several ways. The average in-training score is several points above the national average (58th percentile vs 55th percentile). The pass rate on the boards is consistently near 100%.

At orientation, the medical leadership emphasizes the importance of compulsiveness; and more importantly, they emphasize the necessity for integrity in every aspect of medical practice. It has been stated multiple times to interns/residents that medical mistakes, although regrettable, will be tolerated, but there is no tolerance for dishonesty. Each week the Department Director conducts chief-of-service rounds in which the major emphasis is on proper history-taking and physical-examination techniques. All interns/residents are observed performing CEX examinations with largely satisfactory results. Frequent departmental chart review and morbidity/mortality reviews have not revealed anything to suggest the problems seen in this study.

Many interns/residents wrote in their surveys that they do things when they feel they are appropriate and perform focused histories and examinations; however, they have been trained to perform a complete history and physical examination. The ability to persuade new interns/residents of the value of completeness is somewhat diluted by the historic insistence on a complete multifaceted history and physical examination that includes some elements that rarely affect patient care. Furthermore, the argument for a focused approach was less persuasive when it was observed that, in patients with congestive heart failure, the intern/resident still didn't ask about salt intake or weight gain; that in patients with abdominal pain, no rectal examination or inquiry concerning last menstrual period was entertained; and that in the patient with penile pain, no sexual history was taken. It is unclear what the most effective approach would be to change these behaviors.


Single-blinded studies of interns/residents are difficult to conduct because direct observation of too many encounters over an extended period of time could alert them to a study and be communicated to colleagues, perhaps even jeopardizing future single-blinded studies. It was also important to minimize the Hawthorne effect by using an observer unknown to the interns/residents, and to prevent the examiners from noticing the evaluator's notetaking during the patient encounter. Given this, before this study began, it was determined that 25 patient encounters using 15 different interns/residents would be sufficient to reach valid and reliable conclusions about the attention given to newly admitted patients on the medical floors. The study took place in August of the academic year.


This single-blinded, direct observational study delineated systematic deficiencies in the thoroughness of history taking and physical examinations conducted by interns/residents. The chart review portion provided an accuracy comparison of the observed physical examination to the intern/resident's documentation. The study also demonstrated what real patients, newly admitted to the medical units, faced when encountering interns/residents under everyday, nontesting circumstances. The Hawthorne effect may play a key role in the performances of interns/residents in previously published studies in which they were not blinded to the examiner. More studies with a single-blinded approach are needed to obtain an accurate picture of everyday clinical performance.

Disclosure Statement

The author(s) have no conflicts of interest to disclose.

1.    Holmboe ES. Faculty and the observation of trainees' clinical skills: problems and opportunities. Acad Med 2004 Jan;79(1):16-22.
2.    Holmboe ES, Hawkins RE, Huot SJ. Effects of training in direct observation of medical residents' clinical competence: a randomized trial. Ann Intern Med 2004 Jun 1;140(11):874-81.
3.    Li JT. Assessment of basic physical examination skills of internal medicine residents. Acad Med 1994 Apr;69(4):296-9.
4.    Jouriles NJ, Emerman CL, Cydulka RK. Direct observation for assessing emergency medicine core competencies: interpersonal skills. Acad Emerg Med 2002 Nov;9(11):1338-41.
5.    Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med 1995 Nov 15;123(10):795-9.
6.    Norcini JJ, Blank LL, Duffy FD, Fortna GS. The mini-CEX: A method for assessing clinical skills. Ann Intern Med 2003 Mar 18;138(6):476-81.
7.    Hatala R, Ainslie M, Kassen BO, Mackie I, Roberts JM. Assessing the mini-Clinical Evaluation Exercise in comparison to a national specialty examination. Med Educ 2006 Oct;40(10):950-6.
8.    Malhotra S, Hatala R, Courneya CA. Internal medicine residents' perceptions of the Mini-Clinical Evaluation Exercise. Med Teach 2008;30(4):414-9.
9.    Blank LL, Grosso LJ, Benson JA Jr. A survey of clinical skills evaluation practices in internal medicine residency programs. J Med Educ 1984 May;59(5):401-6.
10.    Nair BR, Alexander HG, McGrath BP, et al. The mini clinical evaluation exercise (mini-CEX) for assessing clinical performance of international medical graduates. Med J Aust 2008 Aug 4;189(3):159-61.
11.    Durning SJ, Cation LJ, Markert RJ, Pangaro LN. Assessing the reliability and validity of the mini-clinical evaluation exercise for internal medicine residency training. Acad Med 2002 Sep;77(9):900-4.
12.    Sloan DA, Donnelly MB, Johnson SB, Schwartz RW, Strodel WE. Assessing surgical residents' and medical students' interpersonal skills. J Surg Res 1994 Nov;57(5):613-8.
13.    McIlroy JH, Hodges B, McNaughton N, Regehr G. The effect of candidates' perceptions of the evaluation method on reliability of checklist and global rating scores in an objective structured clinical examination. Acad Med 2002 Jul;77(7):725-8.
14.    Wass V, Jolly B. Does observation add to the validity of the long case? Med Educ 2001 Aug;35(8):729-34.
15.    Ognibene AJ, Jarjoura DG, Illera VA, Blend DA, Cugino AE, Whittier FC. Using chart reviews to assess residents' performances of components of physical examinations: a pilot study. Acad Med 1994 Jul;69(7):583-7.
16.    Cohen DS, Colliver JA, Marcy MS, Fried ED, Swartz MH. Psychometric properties of a standardized-patient checklist and rating-scale form used to assess interpersonal and communication skills. Acad Med 1996 Jan;71(1 Suppl):S87-9.
17.    Han JJ, Kreiter CD, Park H, Ferguson KJ. An experimental comparison of rater performance on an SP-based clinical skills exam. Teach Learn Med 2006 Fall;18(4):304-9.
18.    Day RP, Hewson MG, Kindy P Jr, Van Kirk J. Evaluation of resident performance in an outpatient internal medicine clinic using standardized patients. J Gen Intern Med 1993 Apr;8(4):193-8.
19.    Boulet JR, McKinley DW, Norcini JJ, Whelan GP. Assessing the comparability of standardized patient and physician evaluations of clinical skills. Adv Health Sci Educ Theory Pract 2002;7(2):85-97.
20.    Hassett JM, Zinnerstrom K, Nawotniak RH, Schimpfhauser F, Dayton MT. Utilization of standardized patients to evaluate clinical and interpersonal skills of surgical residents. Surgery 2006 Oct;140(4):633-8.
21.    MacRae HM, Vu NV, Graham B, Word-Sims M, Colliver JA, Robbs RS. Comparing checklists and databases with physicians' ratings as measures of students' history and physical-examination skills. Acad Med 1995 Apr;70(4):313-7.
22.    Joshi R, Ling FW, Jaeger J. Assessment of a 360-degree instrument to evaluate residents' competency in interpersonal and communication skills. Acad Med 2004 May;79(5):458-63.
23.    Jagadeesan R, Kaylen DN, Lee P, Stinnett S, Challa P. Use of a standardized patient satisfaction questionnaire to assess the quality of care provided by ophthalmology residents. Ophthalmology 2008 Apr;115(4):738-43.



Copyright 2017 Kaiser Permanente - The Permanente Journal. All Rights Reserved.