Journal of Cardiovascular Medicine and Cardiology
1Department of Internal Medicine, Hôpitaux Universitaires de Strasbourg, EA 3072, Université de Strasbourg, Strasbourg, France
2Laboratory of Nanomedicine and Health, UTBM Belfort-Montbéliard, France
Cite this as
Andrès E, Lorenzo-Villalba N, El Hassani AH. Environmental and External Determinants of Human Auscultation and Implications for Digital Stethoscope Development. J Cardiovasc Med Cardiol. 2026;13(3):34-41. Available from: 10.17352/2455-2976.000241
Copyright License
© 2026 Andrès E, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: Cardiac and pulmonary auscultation, a cornerstone of clinical examination since Laennec’s invention of the stethoscope in 1816, is profoundly influenced by a constellation of environmental and external determinants that are rarely systematically addressed. Understanding these factors is essential both for optimizing human auscultatory performance and for guiding the rational design of digital stethoscopes incorporating artificial intelligence (AI).
Methods: We conducted a comprehensive narrative review of the PubMed/MEDLINE literature, integrating evidence from acoustic physics, clinical medicine, signal processing, and biomedical engineering to characterize the key environmental, patient-related, operator-related, and technological determinants of auscultatory quality and their implications for digital stethoscope development.
Results: Ambient acoustic noise in clinical environments—routinely exceeding WHO recommendations by 10–50 dB—constitutes the primary extrinsic impediment to reliable auscultation. Patient-related factors (body habitus, chest wall composition, clothing) and operator-related factors (hearing capacity, training level, diagnostic fatigue) further degrade performance. The design of digital stethoscopes must integrate adaptive noise cancellation, broadband frequency capture (20–2000 Hz), and AI-powered signal classification to overcome these limitations. Telemedicine and wearable applications add further constraints including Bluetooth transmission fidelity, sensor-skin coupling, and motion artifact management.
Conclusions: A systematic understanding of environmental and external determinants of auscultation is essential to harness the full diagnostic potential of AI-integrated digital stethoscopes and to standardize their development, validation, and deployment across diverse clinical settings.
Since René Théophile Hyacinthe Laennec fashioned the first stethoscope in 1816, cardiac and pulmonary auscultation has remained a central pillar of clinical medicine [1]. For two centuries, the stethoscope has served as both a diagnostic instrument and a symbolic bond between physician and patient. Yet despite its enduring legacy, the reliability of auscultation is shaped by a complex web of determinants that extend far beyond the clinician’s knowledge base [2]. Environmental noise levels, patient morphology, operator hearing capacity, instrument acoustic properties, and increasingly, the digital infrastructure through which sounds are captured and analyzed, collectively define what is—and what is not—audible at the bedside [3].
The clinical consequences of suboptimal auscultation are substantial. Studies conducted across the past three decades have documented a progressive decline in cardiac auscultatory skills among medical trainees and practicing physicians [3]. In a landmark multicenter study, Vukanovic-Criley et al. found that fewer than half of medical students, residents, and attending physicians could reliably identify common cardiac murmurs under controlled conditions [4]. This erosion of a foundational clinical skill coincides with the increasing medicalization of hospitals—environments that are acoustically hostile, with ambient noise levels routinely exceeding the thresholds at which diagnostic heart sounds can be reliably perceived.
In parallel, the emergence of digital stethoscopes equipped with electronic amplification, active noise cancellation, and AI-based sound classification offers the prospect of overcoming many of these human and environmental limitations [2-4]. However, the rational design of such instruments requires a systematic characterization of the very determinants that degrade conventional auscultation. Without this foundation, digital solutions risk replicating or even amplifying the biases and artifacts that undermine acoustic examination in clinical practice.
This review synthesizes evidence from acoustic physics, clinical medicine, signal processing, and biomedical engineering to provide a comprehensive analysis of the environmental and external factors that determine auscultatory performance, and to delineate their implications for the development and validation of next-generation digital stethoscopes. We draw upon our own experience with the ASAP digital stethoscope project and the eStetho platform evaluated in an 857-patient concordance study at the Hôpitaux Universitaires de Strasbourg (Strasbourg, France).
The clinical environment is among the noisiest non-industrial settings that humans routinely inhabit. As early as 1972, Shapiro and Berland reported that noise levels in operating rooms regularly exceeded 80 dBA, at least 40 dB above the 35–40 dBA threshold recommended by the U.S. Environmental Protection Agency and the World Health Organization for healthcare facilities [5,6]. Falk and Woods subsequently documented similar levels in intensive care units (ICUs), with median noise levels of 60–70 dBA and frequent transient peaks above 85 dBA [7].
A 2021 systematic review by de Lima Andrade et al., encompassing 39 studies across diverse hospital settings, confirmed that environmental noise in hospitals consistently exceeds international guidelines across all clinical areas—including wards, emergency departments, operating theaters, and ICUs [8]. Qutub and El-Said measured noise levels of 52–68 dBA in a university hospital ICU throughout the day, with nocturnal peaks reaching 80 dBA driven largely by medical device alarms and ventilator noise [9]. A more recent analysis by Koomen et al. demonstrated that alarm-related noise in ICUs creates an acoustic environment in which the detection of subtle heart sounds—particularly the third and fourth heart sounds (S3 and S4)—is effectively impossible without electronic amplification [10].
The spectral composition of clinical noise is as important as its intensity. Mechanically ventilated patients generate continuous broadband noise spanning 60–600 Hz—precisely the frequency range in which the diagnostically critical components of heart sounds (20–500 Hz) and lung sounds (70–2000 Hz) are concentrated. HVAC ventilation systems contribute a persistent low-frequency hum (50–120 Hz) that is particularly insidious because it resides in the same band as S3 and S4 gallop sounds and low-grade diastolic murmurs. Intravenous infusion pumps, electrocardiographic monitors, and pulse oximeters each contribute characteristic spectral signatures that digital stethoscopes must be trained to recognize and suppress [9,10].
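The overlap between these noise sources and the diagnostic bands can be made explicit with simple interval arithmetic. The following is a minimal sketch rather than a measurement tool: the band edges are those quoted in the paragraph above, while the function names `band_overlap` and `overlap_table` and the dictionary layout are illustrative assumptions.

```python
def band_overlap(band_a, band_b):
    """Return the shared (low, high) range in Hz of two bands, or None."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return (low, high) if low < high else None


# Band edges in Hz, taken from the text; the labels are illustrative.
NOISE_BANDS = {"ventilator": (60, 600), "hvac_hum": (50, 120)}
TARGET_BANDS = {"heart_sounds": (20, 500), "lung_sounds": (70, 2000)}


def overlap_table(noise_bands, target_bands):
    """Map each (noise source, target sound) pair to its shared band."""
    return {
        (noise, target): band_overlap(n_band, t_band)
        for noise, n_band in noise_bands.items()
        for target, t_band in target_bands.items()
    }


if __name__ == "__main__":
    for pair, shared in overlap_table(NOISE_BANDS, TARGET_BANDS).items():
        print(pair, shared)
```

With these figures, ventilator noise shares the 60–500 Hz range with heart sounds and the 70–600 Hz range with lung sounds, so simple band-pass filtering alone cannot separate the two.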
Perhaps most underappreciated is the noise introduced directly at the chest-piece interface. Clothing friction, hair resistance, and skin moisture can generate artifact amplitudes that exceed those of the underlying cardiac signals, particularly in patients with soft or muffled heart sounds [10]. This mandates direct skin contact as a non-negotiable protocol requirement for reliable auscultation—a constraint that must be explicitly encoded in telemedicine and remote monitoring workflows (Table 1).
The human thorax constitutes an acoustic transmission medium whose properties vary substantially across individuals. In its simplest formulation, the transmission of cardiac vibrations from the heart to the stethoscope chest-piece obeys the principles of acoustic wave propagation through a multilayered medium: each tissue interface (epicardium, pericardium, lung parenchyma, chest wall) represents a boundary at which partial reflection and absorption occur [3, 9, 10]. Adipose tissue, in particular, has a high acoustic impedance and strong frequency-dependent attenuation, effectively functioning as a low-pass filter that preferentially suppresses the higher-frequency components of murmurs.
Kalinauskienė et al. demonstrated in a controlled study that electronic stethoscopes significantly outperform conventional acoustic devices in the detection of cardiac murmurs in obese patients (BMI >30 kg/m²), with sensitivity improvements of up to 18 percentage points [11]. Atkinson et al. subsequently used computational hemoacoustic modeling to quantify the effect of body habitus on cardiac auscultation, showing that a doubling of subcutaneous adipose tissue thickness reduces the surface intensity of a simulated aortic stenosis murmur by up to 12 dB, equivalent to roughly a four-fold reduction in sound pressure amplitude [12]. These findings are directly relevant to digital stethoscope design: algorithms trained primarily on lean individuals from reference datasets risk systematic underperformance in obese populations that are increasingly prevalent in clinical practice.
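Because decibel figures are logarithmic, it can help to convert an attenuation such as 12 dB into linear ratios. The sketch below uses the standard definitions for amplitude and power ratios; the loudness conversion relies on the common rule of thumb that perceived loudness roughly doubles per 10 dB, which is an assumption introduced here and not a result from the cited studies. All function names are illustrative.

```python
def db_to_amplitude_ratio(db):
    """Sound-pressure (amplitude) ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Acoustic power (intensity) ratio for a level difference in dB."""
    return 10 ** (db / 10)


def db_to_loudness_ratio(db, db_per_doubling=10.0):
    """Perceived-loudness ratio, ASSUMING loudness doubles every
    `db_per_doubling` dB (a rule of thumb, not a measured value)."""
    return 2 ** (db / db_per_doubling)


if __name__ == "__main__":
    attenuation_db = 12.0
    print("amplitude ratio:", round(db_to_amplitude_ratio(attenuation_db), 2))
    print("power ratio:", round(db_to_power_ratio(attenuation_db), 2))
    print("loudness ratio:", round(db_to_loudness_ratio(attenuation_db), 2))
```

For a 12 dB attenuation, the amplitude ratio is close to 4 and the power ratio close to 16.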
Beyond baseline morphology, acute and chronic cardiopulmonary pathologies dramatically alter the acoustic properties of the chest. Pericardial and pleural effusions introduce fluid layers that muffle cardiac vibrations, while pneumothorax and subcutaneous emphysema cause destructive acoustic interference that renders auscultation unreliable [3]. These conditions are precisely those in which accurate auscultatory diagnosis is most urgently needed, yet the physics of sound transmission are most unfavorable. AI-integrated digital stethoscopes must incorporate adaptive models that recognize these pathology-specific acoustic signatures rather than defaulting to population-mean thresholds.
Tachycardia and arrhythmias present a distinct category of challenge: when the cardiac cycle falls below approximately 400 ms (heart rate >150 bpm), the temporal overlap between systolic and diastolic events prevents reliable segmentation of individual heart sounds. Machine learning segmentation algorithms trained on normal sinus rhythm datasets—which represent the majority of publicly available phonocardiogram repositories such as PhysioNet and the CirCor DigiScope dataset—perform substantially worse in arrhythmic patients [13,14] (Table 2).
The human ear is not a passive receiver of cardiac sound: its sensitivity, frequency response, and susceptibility to fatigue all determine what the clinician ultimately perceives through the stethoscope [3-5,18]. Age-related sensorineural hearing loss (presbycusis) affects the cochlear hair cells responsible for low-frequency transduction, precisely the frequency band most critical for cardiac auscultation. The fundamental components of S1 and S2 span approximately 10–140 Hz; aortic regurgitation produces characteristic low-pitched diastolic murmurs below 250 Hz; and third and fourth heart sounds are predominantly sub-100 Hz phenomena. A physician with moderate presbycusis may therefore be functionally deaf to an entire category of pathological sounds, irrespective of technique or stethoscope quality [3, 20-31].
This problem is compounded by the prevalent use of hearing aids, which are optimized for speech frequencies (500–4000 Hz) and may amplify ambient noise while failing to compensate for the low-frequency auscultatory range. Electronic stethoscopes with programmable frequency enhancement offer a partial solution, allowing individualized amplification profiles calibrated to the operator’s audiogram. The development of AI-mediated visual phonocardiographic displays—translating acoustic events into real-time waveforms visible on a tablet or smartphone screen—represents a more fundamental solution by decoupling diagnostic interpretation from the operator’s auditory capacity entirely [15-32].
Beyond audiological limitations, the cognitive dimension of auscultation—pattern recognition, attention, and interpretive expertise—constitutes a critical and increasingly precarious determinant of diagnostic performance. The progressive decline of cardiac auscultatory skills among medical trainees is now well-documented. Mangione and Nieman’s foundational 1997 study revealed that internal medicine and family practice residents correctly identified only 20% of cardiac sounds in a simulated examination [3]. Gaskin et al. subsequently documented that pediatric residents achieved similar rates of misidentification for common pathological murmurs [23].
A 2025 repeated cross-sectional analysis found that overall cardiac auscultation test scores among medical students declined by 0.15 points per year over an 11-year observation period—a trajectory attributed to the increasing burden of medical knowledge, reduced direct patient contact, and the displacement of bedside examination by imaging. Loong demonstrated that auscultatory proficiency deteriorated significantly after 8 years of post-graduation practice, underscoring the absence of continuous reinforcement mechanisms in postgraduate medical education [22].
Digital stethoscopes with integrated AI decision support represent a structural response to this skills crisis: by providing real-time classification of detected sounds with probability scores and educational annotations, they transform the auscultatory encounter into a self-directed learning opportunity rather than a unidirectional diagnostic event. Nielsen et al. demonstrated that simulation-based auscultation training using electronic platforms significantly improved both screening sensitivity and diagnostic specificity among trainees [31] (Table 3).
The conventional acoustic stethoscope is a coupled mechanical system in which the clinician’s hand, the chest-piece, the tubing, and the ear-canals function as interdependent acoustic elements. The physics of this system—notably the pressure-dependent frequency response of the diaphragm, the resonance characteristics of the air column in the tubing, and the sealing properties of the ear-tips—determine the frequency range and signal-to-noise ratio of the transmitted sound. Abella et al. demonstrated that low-frequency sounds (37.5–112.5 Hz) are amplified by the bell but attenuated by most diaphragms—a finding with direct implications for the detection of S3 gallops in heart failure patients [15].
The coupling between the chest-piece and skin is critically dependent on applied pressure: insufficient pressure allows air leaks that attenuate all frequency bands, while excessive diaphragm pressure converts the chest-piece to a bell-like mode that filters out higher frequencies. Acoustic characterization studies using synchronized phonocardiography and electrocardiography have confirmed that the coupling variability introduced by inconsistent operator technique represents a major source of between-examiner diagnostic discordance [22-33].
The Leng et al. review of electronic stethoscope technology highlighted that ambient noise corruption of heart sounds is the primary technical limitation of conventional devices, and that dual-microphone configurations—with one microphone measuring ambient noise for real-time subtraction—represent the most effective engineering solution currently available [3,16]. This dual-transducer approach, combined with digital adaptive filtering based on the least-mean-square (LMS) or normalized LMS (NLMS) algorithm, achieves ambient noise reduction ratios of 20–35 dB across the clinically relevant frequency spectrum.
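The dual-microphone principle can be illustrated with a short normalized LMS (NLMS) canceller. This is a minimal sketch under stated assumptions, not the implementation of any device discussed here: the chest-piece microphone is treated as the primary input, the ambient microphone as the reference, the room-to-chest acoustic path is a hypothetical two-tap filter, and the "heart sound" is a stand-in 40 Hz tone. The tap count, step size, and all names are illustrative.

```python
import math
import random


def nlms_cancel(primary, reference, taps=8, mu=0.1, eps=1e-8):
    """Return e[n] = primary[n] - w.x[n], the NLMS-cleaned signal."""
    w = [0.0] * taps          # adaptive filter weights
    buf = [0.0] * taps        # most recent reference samples, newest first
    cleaned = []
    for d, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))   # noise estimate
        e = d - y                                    # cleaned sample
        norm = eps + sum(xi * xi for xi in buf)      # input power normalizer
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        cleaned.append(e)
    return cleaned


def mean_power(x):
    return sum(v * v for v in x) / len(x)


def demo(n=4000, fs=2000.0, seed=0):
    """Noise reduction in dB on synthetic data (second half, after adaptation)."""
    rng = random.Random(seed)
    heart = [0.2 * math.sin(2 * math.pi * 40.0 * i / fs) for i in range(n)]
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical acoustic path from the room to the chest-piece.
    leak = [0.8 * reference[i] + (0.3 * reference[i - 1] if i else 0.0)
            for i in range(n)]
    primary = [h + v for h, v in zip(heart, leak)]
    cleaned = nlms_cancel(primary, reference)
    half = n // 2
    before = mean_power(leak[half:])
    after = mean_power([c - h for c, h in zip(cleaned[half:], heart[half:])])
    return 10 * math.log10(before / after)


if __name__ == "__main__":
    print("noise reduction (dB):", round(demo(), 1))
```

The normalization by the reference power is what distinguishes NLMS from plain LMS: it keeps the effective step size stable when the ambient noise level fluctuates, which matters in alarm-heavy environments. The figure printed by this toy example depends entirely on the synthetic setup and should not be read as a device performance estimate.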
A fundamental limitation of both conventional and many early electronic stethoscopes is their inadequate frequency response below 50 Hz. This is precisely the domain in which S3 and S4 sounds—clinically significant markers of ventricular dysfunction—reside. Reichert et al.’s comprehensive analysis of respiratory sound analysis methods demonstrated that the diagnostically critical components of crackles, wheezes, and rhonchi span 100–2000 Hz, while the low-pitched wheeze of fixed airway obstruction occupies the 100–200 Hz band [17]. Digital stethoscopes designed for pulmonary medicine must therefore achieve a flat frequency response across a substantially wider bandwidth than cardiac-optimized devices [3-9].
The integration of MEMS (micro-electromechanical system) microphones—with flat responses from 20 to 20,000 Hz—into digital chest-pieces eliminates the frequency-selective distortion inherent to conventional diaphragms. Park et al. demonstrated that a multimodal smart stethoscope incorporating MEMS sensors with simultaneous photoplethysmographic (PPG) recording achieved reliable detection of S1/S2 across a BMI range of 18–38 kg/m² with intraclass correlation coefficients exceeding 0.92 [34].
The denoising of phonocardiographic (PCG) signals is the foundational pre-processing step upon which all subsequent AI-based classification depends. Two broad classes of algorithm are in current use: transform-based methods (wavelet decomposition, empirical mode decomposition) and adaptive filtering methods (LMS, NLMS, sign-error LMS). Kuresan et al. demonstrated that a multistage cascaded Sign Error LMS adaptive filter achieved SNR improvements of 8–10 dB in Gaussian noise environments and 2–3 dB in pink noise—performance sufficient to render previously inaudible S3 sounds detectable in ICU-equivalent noise conditions [26]. For pediatric auscultation in resource-limited settings, Emmanouilidou et al.’s adaptive noise suppression framework achieved real-time PCG denoising on portable devices without perceptible processing latency [19].
The challenge of separating simultaneously recorded heart and lung sounds—an inevitable feature of thoracic auscultation—requires source separation algorithms based on non-negative matrix factorization or independent component analysis. Fraiwan et al.’s curated dataset of simultaneously recorded cardiac and pulmonary sounds, obtained with an electronic stethoscope from the chest wall, has become a standard benchmark for validating separation algorithms and has demonstrated that blind source separation approaches can achieve mean square errors below 0.01 in controlled conditions [27].
The past five years have witnessed a rapid expansion in deep learning applications to phonocardiographic classification. Chorba et al. developed and validated a deep learning algorithm deployed on a digital stethoscope platform for automated murmur detection, achieving a sensitivity of 76.3% and specificity of 91.4% for clinically significant murmurs across a diverse patient cohort—performance approaching that of expert cardiologists [13]. Zhang et al. described a low-cost AI-empowered stethoscope integrating a lightweight convolutional neural network trained on combined cardiac and pulmonary sound datasets, achieving classification accuracy of 89.2% for abnormal heart sounds and 87.4% for pathological lung sounds in real clinical settings [33].
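Sensitivity and specificity figures such as those above are derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical values chosen for easy arithmetic and are not taken from any of the cited studies.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly abnormal recordings flagged as abnormal."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly normal recordings classified as normal."""
    return true_neg / (true_neg + false_pos)


def accuracy(true_pos, true_neg, false_pos, false_neg):
    """Fraction of all recordings classified correctly."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    tp, fn, tn, fp = 80, 20, 180, 20
    print("sensitivity:", sensitivity(tp, fn))
    print("specificity:", specificity(tn, fp))
    print("accuracy:", accuracy(tp, tn, fp, fn))
```

Accuracy alone can be misleading when normal recordings dominate a dataset, which is one reason to report sensitivity and specificity separately.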
Huang et al.’s authoritative 2023 review of deep learning-based lung sound analysis emphasized that environmental noise adaptation remains the primary performance bottleneck separating laboratory accuracy from real-world deployment [18]. Their analysis of publicly available datasets—PhysioNet, CirCor DigiScope, ICBHI Challenge—revealed that most published algorithms were trained on signals recorded in controlled, low-noise conditions, with an accuracy drop of 15–25 percentage points when evaluated in simulated ICU noise. The CirCor DigiScope dataset, comprising over 5000 pediatric cardiac recordings from community health screening campaigns in Brazil, offers one of the few large-scale repositories that approximates real-world noise conditions.
The reproducibility and generalizability of AI-based auscultation algorithms depend critically on the quality and diversity of training data. Liu et al. established the gold standard for open-access PCG databases with the PhysioNet Challenge 2016 dataset, which provided annotated recordings from multiple stethoscope types across geographically diverse populations. Goldberger et al.’s PhysioNet infrastructure—now hosting over 80 curated cardiopulmonary signal repositories—has become the de facto benchmark ecosystem for algorithm development and validation. However, the representation of adverse acoustic conditions (high ambient noise, obese patients, pediatric populations, and non-native speaker interactions with AI-guided examination prompts) remains insufficient and constitutes a priority for future dataset curation.
The COVID-19 pandemic catalyzed a paradigm shift in the delivery of clinical examination, with telemedicine transitioning from a supplementary modality to a primary care infrastructure within weeks of the initial outbreak. The simultaneous imperatives of infection control and continuity of cardiopulmonary monitoring created urgent demand for validated remote auscultation solutions. Hirosawa et al. conducted the first randomized controlled pilot trial of real-time remote auscultation using a Bluetooth-connected electronic stethoscope in 2021, demonstrating concordance rates of 82% with in-person examination for pathological lung sounds—sufficient for screening but insufficient for definitive diagnosis without supplementary imaging [20].
Sethi et al.’s comprehensive 2022 review of digital stethoscopes in modern clinical practice documented the rapid proliferation of commercially available remote auscultation platforms—including Eko, Thinklabs, eKuore, and Stemoscope—and highlighted the absence of standardized performance benchmarks across devices and clinical contexts [29]. Seah et al.’s 2023 systematic review confirmed that the acoustic fidelity of Bluetooth-transmitted sounds is adequate for detection of grade II/VI murmurs and coarse crackles but falls below detection threshold for grade I/VI murmurs and fine crackles in the presence of ambient noise above 60 dBA [21].
The feasibility of digital stethoscopes in specialized telecardiology visits was demonstrated by Bhattacharyya et al. in a 2023 study of interstage monitoring in infants with palliated congenital heart disease [28]. Using the Eko CORE digital attachment with a pediatric Littmann stethoscope, the authors conducted 52 telecardiology visits in 16 patients over one year, demonstrating caregiver-acquired sound quality rated as “good” or “excellent” in 81% of transmissions. These results underscore both the potential and the current limitations of home-based auscultation: caregiver training, device placement variability, and home acoustic environments introduce noise sources not encountered in clinical settings.
The miniaturization of acoustic sensors has enabled the development of wearable chest patch stethoscopes capable of continuous, automated cardiopulmonary monitoring without clinical supervision [33-35]. Mohamed et al. described a wearable patch system combining a MEMS microphone array with an adaptive noise cancellation algorithm, achieving continuous recording of heart and lung sounds in postoperative intensive care patients with a signal quality index exceeding 0.85 [35]. Park et al.’s multimodal smart stethoscope integrated PCG, photoplethysmography, and inertial measurement unit data streams to enable simultaneous cardiac sound classification and motion artifact compensation [34].
The Internet of Medical Things (IoMT) framework connects wearable stethoscopes with cloud-based AI engines via low-power wireless protocols (Bluetooth Low Energy, Zigbee, LoRaWAN), enabling real-time arrhythmia and murmur alerts. However, the energy constraints of wearable devices impose computational limits on on-device AI processing, requiring careful architectural trade-offs between model complexity, inference latency, and battery life—trade-offs that are directly influenced by the ambient noise rejection requirements of the deployment environment.
In ambulatory and wearable auscultation, motion artifact supersedes ambient acoustic noise as the dominant extrinsic signal corruption mechanism. Patient movement generates mechanical vibrations transmitted directly through the chest wall to the sensor at intensities that can exceed cardiac sounds by 20–30 dB. Accelerometer-based motion artifact detection, with adaptive gating of acoustic recording windows to quiescent phases of the respiratory and movement cycle, is the standard engineering response, but introduces gaps in coverage that may miss paroxysmal events. Advanced deep learning models trained on motion-corrupted datasets—a category conspicuously absent from most public benchmarks—are required to address this limitation for continuous ambulatory monitoring.
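Accelerometer-based gating can be sketched as a search for quiescent windows in the motion signal. The example below is illustrative only: it assumes the accelerometer magnitude and the audio are sampled at the same rate and already aligned, and the threshold, minimum window length, and function names are hypothetical choices rather than values from the literature.

```python
def quiescent_windows(accel_magnitude, threshold, min_length):
    """Return (start, end) index pairs, end exclusive, where motion stays
    below `threshold` for at least `min_length` consecutive samples."""
    windows = []
    start = None
    for i, a in enumerate(accel_magnitude):
        if a < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                windows.append((start, i))
            start = None
    if start is not None and len(accel_magnitude) - start >= min_length:
        windows.append((start, len(accel_magnitude)))
    return windows


def gate_audio(audio, windows):
    """Keep only the audio segments recorded during quiescent windows."""
    return [audio[s:e] for s, e in windows]


def coverage(windows, total_samples):
    """Fraction of the recording retained after gating."""
    if total_samples == 0:
        return 0.0
    return sum(e - s for s, e in windows) / total_samples


if __name__ == "__main__":
    accel = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1]
    wins = quiescent_windows(accel, threshold=0.5, min_length=3)
    print(wins, coverage(wins, len(accel)))
```

The `coverage` value makes the trade-off explicit: a stricter threshold yields cleaner segments but discards more of the recording, which is how paroxysmal events can be missed.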
The determinants of auscultatory quality form a hierarchical structure: at the base lie the immutable physical laws of acoustic transmission through biological tissues; above them, the modifiable extrinsic factors of environmental noise and stethoscope design; and at the apex, the cognitive and perceptual capacities of the human observer [36-40]. Digital stethoscopes with AI integration address each of these levels simultaneously—replacing the frequency-limited acoustic channel with a broadband electronic sensor, substituting adaptive algorithms for human noise tolerance, and augmenting pattern recognition with machine learning classifiers.
Four critical areas require prioritized research investment. First, the development of environmental-noise-robust AI training datasets that systematically sample the acoustic conditions encountered in ICUs, emergency departments, ambulances, and community settings is imperative [13,39,40]. Current benchmarks—PhysioNet, CirCor DigiScope, ICBHI—are overwhelmingly derived from controlled, low-noise conditions that misrepresent the operational reality. Second, regulatory harmonization around performance standards for digital stethoscopes—specifying minimum frequency response, noise rejection, and AI classification performance thresholds as functions of the acoustic environment—is urgently required.
Third, the integration of digital stethoscopes into telemedicine workflows must be accompanied by structured training programs for clinicians and, in home monitoring applications, for patients and caregivers [13]. The 18% decline in murmur detection reported in simulated remote auscultation protocols compared to in-person examination reflects not only technical limitations but also the absence of standardized placement protocols and quality feedback mechanisms. Fourth, the particular vulnerability of obese patients, elderly patients with presbycusis, and those with complex thoracic anatomy to diagnostic failure by conventional auscultation must be explicitly addressed in device validation studies, since none of these patient populations is adequately represented in published clinical trials.
The ASAP digital stethoscope project and the eStetho concordance study at Hôpitaux Universitaires de Strasbourg represent one institutional response to these imperatives [39,40]. In an 857-patient real-world study comparing eStetho to the Littmann Classic III, an overall concordance of approximately 94% and a Cohen’s kappa of 0.87 were achieved under standard ward conditions. Ongoing work is extending this validation to ICU environments, remote home monitoring settings, and populations with high BMI, precisely the conditions in which the environmental determinants reviewed here are most consequential.
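Concordance and Cohen's kappa measure different things: the former is raw agreement, the latter corrects for agreement expected by chance. The sketch below computes both from an agreement table. The counts are hypothetical and are not the eStetho study data; the function names are illustrative.

```python
def observed_agreement(table):
    """Raw concordance: share of cases on the diagonal of the table."""
    n = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / n


def cohens_kappa(table):
    """Cohen's kappa for a square table where table[i][j] counts cases
    rated category i by device A and category j by device B."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_observed = observed_agreement(table)
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical 2x2 table (normal / abnormal), for illustration only.
    table = [[40, 5],
             [5, 50]]
    print("concordance:", observed_agreement(table))
    print("kappa:", round(cohens_kappa(table), 3))
```

Kappa is always lower than raw concordance when some agreement would occur by chance, which is why both figures are usually reported together.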
Auscultation occupies the intersection of physics, human biology, and clinical judgment. Its reliability is governed not merely by physician expertise, but by the acoustic properties of the environment, the body, and the instrument. The principal findings of this review are threefold. First, ambient noise in clinical environments—particularly ICUs and emergency departments—constitutes a fundamental barrier to reliable cardiac and pulmonary auscultation that cannot be overcome by training or expertise alone. Second, patient morphology, especially obesity and pathological modifications of the thoracic acoustic medium, systematically degrades sound transmission in precisely those patients most likely to harbor occult cardiopulmonary pathology. Third, the progressive erosion of auscultatory skills among clinicians, driven by curriculum pressures and technology displacement, underscores the urgency of AI-assisted solutions.
Digital stethoscopes integrating broadband MEMS sensors, adaptive noise cancellation, and deep learning classification represent a technologically mature response to each of these determinants. Their rational development, however, requires a foundational understanding of the environmental and external factors reviewed here—without which AI algorithms risk performing optimally in controlled laboratory conditions while failing in the noisy, morphologically diverse, and logistically complex settings of real clinical practice (Table 4).
Noël Lorenzo-Villalba (N.L.V.) serves on the Scientific Committee of La Revue de Médecine Interne; this role is disclosed in accordance with COPE guidelines. Emmanuel Andrès (E.A.) serves as Chief Editor of the Journal of Clinical Medicine; this role is disclosed in accordance with COPE guidelines. E.A., N.L.V., and Amir Hajjam (A.H.) declare no conflicts of interest relevant to this article.
E.A. conceived the review structure, performed literature synthesis, and drafted the manuscript. N.L.V. and A.H. critically revised the manuscript. All authors approved the final version and agree to be accountable for all aspects of the work. Generative artificial intelligence tools, including ChatGPT and Claude AI, were used during the preparation of this manuscript for language editing, stylistic improvement, text structuring, and literature synthesis assistance. All AI-assisted content was critically reviewed, revised, and validated by the authors, who take full responsibility for the accuracy, integrity, and originality of the final manuscript. AI tools were not used for data generation, data analysis, or independent scientific decision-making.
