Enhancing Obstructive Sleep Apnea Diagnosis With Screening Through Disease Phenotypes: Algorithm Development and Validation

Background The American Academy of Sleep Medicine guidelines suggest that clinical prediction algorithms can be used in patients with obstructive sleep apnea (OSA) without replacing polysomnography, which is the gold standard.
Objective This study aims to develop a clinical decision support system for OSA diagnosis according to its standard definition (apnea-hypopnea index plus symptoms), identifying individuals with high pretest probability based on risk and diagnostic factors.
Methods A total of 47 predictive variables were extracted from a cohort of patients who underwent polysomnography. A total of 14 variables that were univariately significant were then used to compute the distance between patients with OSA, defining a hierarchical clustering structure from which patient phenotypes were derived and described. Affinity from individuals at risk of OSA phenotypes was later computed, and cluster membership was used as an additional predictor in a Bayesian network classifier (model B).
Results A total of 318 patients at risk were included, of whom 207 (65.1%) individuals were diagnosed with OSA (111, 53.6% with mild; 50, 24.2% with moderate; and 46, 22.2% with severe). On the basis of predictive variables, 3 phenotypes were defined (74/207, 35.7% low; 104/207, 50.2% medium; and 29/207, 14.1% high), with an increasing prevalence of symptoms and comorbidities, the latter describing older and obese patients, and a substantial increase in some comorbidities, suggesting their beneficial use as combined predictors (median apnea-hypopnea indices of 10, 14, and 31, respectively). Cross-validation results demonstrated that the inclusion of OSA phenotypes as an adjusting predictor in a Bayesian classifier improved screening specificity (26%, 95% CI 24-29, to 38%, 95% CI 35-40) while maintaining a high sensitivity (93%, 95% CI 91-95), with model B doubling the diagnostic model effectiveness (diagnostic odds ratio of 8.14).
Conclusions Defined OSA phenotypes are a sensitive tool that enhances our understanding of the disease and allows the derivation of a predictive algorithm that can clearly outperform symptom-based guideline recommendations as a rule-out approach for screening.


Background
Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder characterized by clinical symptoms (eg, daytime sleepiness) and at least 5 events per hour of narrowing (apnea or hypopnea) of the upper airway that impairs normal ventilation during sleep [1]. An apnea is a reduction in airflow of at least 90% from baseline; a hypopnea is a reduction in airflow accompanied by an oxygen desaturation of at least 3% from the pre-event baseline and/or an arousal; and the apnea-hypopnea index (AHI) is the number of such events per hour of sleep. OSA prevalence has been underestimated, with studies varying significantly, both in the population being studied and in OSA definition. A study using a simpler hypopnea definition (4% desaturation) estimated a prevalence of 14% in men and 5% in women [2]. In 2 other studies, the prevalence was substantially higher but was estimated for specific populations, such as patients being evaluated for bariatric surgery [3] or patients who have had a transient ischemic attack or stroke [4], reaching values of 70% and 72%, respectively. The latest study by Benjafield et al [5] estimated that 936 million adults have OSA; in Portugal, it represents 17%, and approximately 74% have moderate to severe OSA. Overall, this disease is largely unrecognized and undiagnosed, representing a significant burden to the health care system [6], especially for patients who remain untreated or at an increased risk of developing cardiovascular disease, metabolic dysregulation, or diabetes [1,7-11]. The failure to clinically recognize OSA leads to significant morbidity and mortality, making it essential to anticipate its recognition, diagnosis, and treatment [1].
OSA diagnosis, for which a comprehensive sleep evaluation (sleep history and physical examination) plus polysomnography (PSG) is the gold standard [1], can effectively decrease health care utilization and costs, whereas timely treatment can improve quality of life, lower the rates of motor vehicle crashes, and reduce the risk of chronic health consequences [12].
In 2017, a new clinical practice guideline for diagnostic testing for adults with OSA was issued by the American Academy of Sleep Medicine (AASM) [1], updating 2 previous AASM guidelines from 2005 [8] and 2007 [13]. Of the 9 PICO (patient, population or problem, intervention, comparison, and outcome) questions raised in this new guideline, the task force reported insufficient evidence to directly address the first one: "In adult patients with suspected OSA, do clinical prediction algorithms accurately identify patients with a high pretest probability for OSA compared to history and physical exam?," as no studies comparing the efficacy of clinical prediction algorithms with clinical history and physical examination were identified. Therefore, they compared the efficacy of clinical prediction algorithms with PSG, crafting recommendation 1: "We recommend that clinical tools, questionnaires and prediction algorithms not to be used to diagnose OSA in adults, in the absence of PSG." The task force affirmed, however, that clinical prediction algorithms can be used in patients with suspected OSA, provided that they are not used to establish the need for PSG or as a substitute for PSG. Rather, these tools can be more helpful in specialties other than sleep-oriented ones, to identify patients with an increased risk for OSA.

Objective
In this study, we aim to establish a new clinical prediction algorithm to allow OSA screening (high pretest probability for OSA) based on demographics, physical examination, clinical history, and comorbidities, using standard OSA definition (AHI ≥5 plus symptoms), extending traditional approaches that assess only preestablished symptoms, such as snoring, witnessed apneas, and excessive daytime sleepiness.

Overview
Using retrospective data from a cohort of patients who underwent PSG, after proper referral by a physician, significant predictive variables were selected and used to compute distances among patients with OSA, which supported a clustering algorithm to derive patient phenotypes from resulting clusters, with missing data being analyzed and imputed as needed. To assess the consistency of our phenotypes, each healthy individual was also tested against the clustering structure, and the resulting phenotyping was analyzed. Then, to assess the benefit of this phenotyping strategy, cluster membership was used as an additional predictive variable and included in a Bayesian network classifier, with validity compared with an equivalent classifier without phenotype information, following the 2015 STARD (Standards for Reporting Diagnostic Accuracy Studies) guideline.

Patients
Data from patients referred to undergo PSG at Vila Nova de Gaia and Espinho Hospital Center Sleep Laboratory were retrospectively collected. Patients who underwent PSG between January and May 2015 were included if they were aged >18 years and were suspected of having OSA. Patients were excluded if they were already diagnosed (and receiving positive airway pressure therapy), were suspected of having another sleep disease, had severe lung or neurological conditions, or were pregnant. In case of multiple examinations of the same patient, the one with the best sleep efficiency was selected. This study was approved by the Ethics Commission of Vila Nova de Gaia and Espinho Hospital Center, in accordance with the Declaration of Helsinki.

Predictive Variables
An author-performed literature review on PubMed (April 19, 2015) supported the definition of the relevant variables to be collected from medical and/or sleep laboratory records, in which the presence or absence of each item of information was assessed by a physician, resulting in a total of 47 predictive variables, all in accordance with the current and previous OSA guidelines. The search contained "risk factors," "sleep apnea, obstructive," and "diagnosis" as MeSH terms, obtaining 1397 articles, of which 47 were used for variable definition (full review description and references used in this phase are not shown for space purposes but can be provided on request). Selected variables included basic demographic data (gender and age), physical examination (BMI, neck and abdominal circumferences, modified Mallampati classification, and craniofacial and upper airway abnormalities), clinical history (daytime sleepiness, snoring, witnessed apneas, gasping and/or choking, sleep fragmentation, nonrepairing sleep, behavior changes, decreased concentration, morning headaches, decreased libido, sleeping body position, sleep efficiency, participation in vehicle crashes, truck driver activity, driving sleepiness, nocturia, alcohol consumption, smoking, coffee intake, use of sedatives before sleep, family history or genetic evidence, and Epworth Sleepiness Scale), and comorbidities and cointerventions (stroke, myocardial infarction, pulmonary infarction, arterial or pulmonary hypertension, congestive heart failure, arrhythmias, respiratory changes, diabetes, dyslipidemia, renal failure, hypothyroidism, gastroesophageal reflux, anxiety and/or depression, insomnia, glaucoma, pacemaker or implantable cardioverter-defibrillator, and bariatric surgery).

Data Set Description
Clinical data from each patient (47 predictive variables plus the outcome) were extracted from the central clinical data registry (all records were completed by a physician) along with sleep laboratory data and adequately anonymized to ensure patient privacy. Original files included structured demographic data, structured PSG reports, and unstructured textual annotations from the medical records, with many abbreviations and short-form text. The outcome measure was obtained from the AHI, categorized as mild (AHI between 5 and 14), moderate (AHI between 15 and 29), and severe (AHI ≥30). Given the categorical characteristic of our modeling strategies, all continuous variables were discretized, and the following common cutoffs were extracted from the literature: (1) age (20-44 years, 45-64 years, and 65-90 years), (2) BMI (<25 kg/m² as normal weight, 25-30 kg/m² as overweight, and ≥30 kg/m² as obese), (3) female neck circumference (≤37 cm as normal and >38 cm as increased), (4) male neck circumference (≤41 cm as normal and >42 cm as increased), (5) female abdominal circumference (≤80 cm as normal and >81 cm as increased), (6) male abdominal circumference (≤94 cm as normal and >95 cm as increased), (7) Epworth Sleepiness Scale (0-10 as normal and 11-24 as excessive daytime sleepiness), and (8) AHI (0-4 as normal, 5-14 as mild, 15-29 as moderate, and ≥30 as severe).
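The cutoffs above can be expressed as a small lookup. The following is a minimal Python sketch, not the authors' R code (they report using the car package); the function names are hypothetical, and it assumes that a BMI of exactly 30 kg/m² is classed as obese, since the stated overweight and obese ranges overlap at that value.

```python
from bisect import bisect_right

def discretize(value, lower_bounds, labels):
    """Return the label of the bin that contains value.

    lower_bounds holds the inclusive lower bound of every bin except the first,
    so len(labels) must equal len(lower_bounds) + 1.
    """
    return labels[bisect_right(lower_bounds, value)]

def bmi_category(bmi):
    # <25 normal weight, 25 to <30 overweight, >=30 obese (assumption at exactly 30)
    return discretize(bmi, [25, 30], ["normal weight", "overweight", "obese"])

def ahi_category(ahi):
    # 0-4 normal, 5-14 mild, 15-29 moderate, >=30 severe
    return discretize(ahi, [5, 15, 30], ["normal", "mild", "moderate", "severe"])

def epworth_category(score):
    # 0-10 normal, 11-24 excessive daytime sleepiness
    return discretize(score, [11], ["normal", "excessive daytime sleepiness"])
```

Keeping every cutoff in one helper makes the bin edges easy to audit against the list in the text.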

Missing Data Imputation
Although we had all the electronic clinical records from the included patients, after screening all unstructured text reports, some predictive variables were not fully present or described, as physicians normally do not mention the absence of a disease or it could only be noted in paper records (missing data proportions ranged from 0% for gender to 97% for bariatric surgery). In our previous study [14], we studied the impact of missing data imputation, using nearest neighbor (NN) strategies, on the structure learning of Bayesian network classifiers for OSA diagnosis, concluding that it can expand the body of evidence for modeling without compromising validity. In this study, we followed the same strategy: (1) variables with more than 80% missing values were removed from the analysis (ie, behavior changes, decreased libido, decreased concentration, pulmonary infarction, glaucoma, and bariatric surgery); (2) remaining variables were ranked by the proportion of missing values; (3) data imputation started using only complete and outcome-wise statistically significant variables (P<.20), imputing incomplete likewise significant variables; and (4) remaining incomplete variables were then imputed stepwise by increasing the proportion of missing values per variable. All imputations were performed using majority voting from the 10 NNs/patients.
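The majority-voting step can be illustrated as follows. This is a simplified Python sketch and not the DMwR routine the authors used: it assumes categorical records stored as dictionaries with `None` for missing values, and it measures similarity by a plain mismatch count over the complete variables. The names `mismatch_distance` and `impute_by_neighbors` are hypothetical.

```python
from collections import Counter

def mismatch_distance(a, b, variables):
    """Count the variables on which two patients differ."""
    return sum(a[v] != b[v] for v in variables)

def impute_by_neighbors(patients, target, complete_vars, k=10):
    """Fill missing `target` values by majority vote among the k nearest donors.

    Donors are patients whose `target` value is observed. The input list is
    left unchanged and a new list of records is returned.
    """
    donors = [p for p in patients if p[target] is not None]
    if not donors:
        raise ValueError("no patient has an observed value for " + target)
    imputed = []
    for p in patients:
        if p[target] is not None:
            imputed.append(dict(p))
            continue
        # Python's sort is stable, so ties keep the original donor order.
        nearest = sorted(
            donors, key=lambda d: mismatch_distance(p, d, complete_vars)
        )[:k]
        vote = Counter(d[target] for d in nearest).most_common(1)[0][0]
        imputed.append({**p, target: vote})
    return imputed
```

Applying the function variable by variable, in order of increasing missingness and adding each newly completed variable to `complete_vars`, would mirror the stepwise procedure in steps 3 and 4.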

Clinical Prediction Algorithm
Aspiring to a more personalized approach to evaluate patients with OSA and targeting to recognize high pretest probability for OSA, cluster analysis (a statistical approach for studying the relationship present among groups of patients or variables [7]) was applied to distinguish whether there are different subgroups of patients with different clinical presentations, that is phenotypes. Clustering has been widely used in health research, particularly in the analysis of gene expression [15], asthma [16], chronic obstructive pulmonary disease [17], fibromyalgia [18], Parkinson disease [19], and sleep apnea [20][21][22]. The aim is to identify clusters of patients who are similar among themselves, although significantly different from patients of other clusters [7]. As expected, different clusters created from predictive variables express different disease risks, hence defining risk-aligned phenotypes.

Connectivity-Based Clustering
In this study, we applied a hierarchical clustering algorithm to obtain a hierarchy of possible solutions, ranging from one single group with all patients to having every single patient separated from each other. This process, where a cluster hierarchy is created, is based on the distance between data observations (ie, patients), giving as output a dendrogram (a tree diagram that presents different clustering definitions for all possible numbers of clusters, from which the user might choose the desired number of clusters after inspecting the intracluster and intercluster distances of each possible cut point). Therefore, the definition of the distance function is a crucial step in the application of this technique, especially in categorical data, as an incorrect distance can easily lead to biased results with potentially serious consequences to the conclusions drawn.
In this study, we computed the distance measure between 2 patients, a and b, based only on significant variables (univariate significant association with the outcome, for a 20% significance level in both the original and imputed data sets, using chi-square and Fisher exact tests), and each variable was weighted according to the corresponding crude odds ratio for the severe level, as follows: This distance encoded the similarity between patients weighted by the contribution of each variable toward the outcome, regularized for significant variables only, and was subsequently used in hierarchical clustering with Ward linkage, leading to a complete dendrogram. Afterward, the obtained OSA clusters were defined by inspecting the outcome proportion by cluster and the corresponding 95% CIs.
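The distance formula itself is not reproduced in this text. As a minimal Python sketch, one form consistent with the description is a weighted mismatch count, in which each significant variable on which 2 patients differ contributes its weight (here standing in for the crude odds ratio for the severe level). Both this functional form and the function names are assumptions made for illustration.

```python
def weighted_distance(a, b, weights):
    """Sum the weights of the variables on which patients a and b differ.

    `weights` maps each significant variable to its weight; variables that are
    not in `weights` are ignored.
    """
    return sum(w for var, w in weights.items() if a[var] != b[var])

def distance_matrix(patients, weights):
    """Build the symmetric pairwise distance matrix used as clustering input."""
    n = len(patients)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = weighted_distance(patients[i], patients[j], weights)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix of this kind is the usual input to an agglomerative clustering routine with Ward linkage, such as `hclust` in the R stats package that the authors cite.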

Phenotypes Consistency
To assess whether predetermined phenotypes would also help in segmenting healthy patients, each healthy patient was assigned to the closest phenotype using the aforementioned distance measure and the same significant variables, determining the distance between each healthy patient and obtained OSA cluster. The resulting clustering definition was then described and analyzed, as was done for the cohort of patients with OSA.
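The assignment step can be sketched in Python as below. The text does not state how the distance between a patient and a whole cluster is aggregated, so this sketch assumes the mean distance to the cluster members, together with a weighted mismatch distance; both choices and all names are illustrative assumptions.

```python
def weighted_distance(a, b, weights):
    """Sum the weights of the variables on which patients a and b differ."""
    return sum(w for var, w in weights.items() if a[var] != b[var])

def closest_phenotype(patient, clusters, weights):
    """Return the label of the cluster with the smallest mean distance.

    `clusters` maps a phenotype label to a non-empty list of member patients.
    """
    def mean_distance(members):
        total = sum(weighted_distance(patient, m, weights) for m in members)
        return total / len(members)

    return min(clusters, key=lambda label: mean_distance(clusters[label]))
```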

Phenotypes Predictive Value
To assess whether the phenotypes could encode any predictive value, Bayesian network classifiers were built with and without cluster information as a predictive variable. First, a naïve Bayesian network classifier was induced using the selected variables. Then, the assigned cluster was also included in the model as a parent node of all independent variables. Validity was then assessed and compared using leave-one-out and 10 times twofold cross-validation strategies, comparing validity measures, such as sensitivity, specificity, accuracy, predictive values, area under the receiver operating characteristic (ROC) curve, likelihood ratios, posttest odds and posttest probabilities, and diagnostic odds ratio. R software [23] was used in every statistical step of this work: discretization of continuous variables (package car [24]), descriptive and comparative analyses (packages gmodels [25] and epitools [26]), missing data analysis (package summarytools [27]), missing data imputation (package DMwR [28]), hierarchical clustering (package stats [23]), Bayesian network inference (packages bnlearn [29] and gRain [30]), and ROC curve analysis (package pROC [31]). Bayesian networks were visually inspected using SamIam software (developed by the University of California, Los Angeles) [32].
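Most of the validity measures listed above follow directly from a 2x2 confusion matrix. The Python sketch below shows the standard definitions; the function name is hypothetical, and the code assumes that no cell combination produces a zero denominator.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute common validity measures from confusion-matrix counts.

    tp, fp, fn, tn are true positives, false positives, false negatives, and
    true negatives. A zero denominator raises ZeroDivisionError.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        # Equivalent to (tp * tn) / (fp * fn)
        "diagnostic_odds_ratio": lr_positive / lr_negative,
    }
```

The diagnostic odds ratio combines both likelihood ratios in a single number, which is why it is convenient for comparing 2 classifiers evaluated on the same sample.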

Baseline Characteristics
Of the 318 patients included, 207 (65.1%) had OSA. Of these 207 patients, 111 (53.6%) were classified as mild, 50 (24.2%) as moderate, and 46 (22.2%) as severe. Baseline characteristics of patients with OSA and the proportion of missing values for each predictive variable are described below in Table 1 (original data) and in Multimedia Appendix 1 (for the curated data, after missing data imputation).
Patients with OSA had a mean age of 61 (SD 11) years, being slightly older in the moderate subgroup (24/50, 48%; aged >65 years), whereas the proportion of males was higher in the moderate (40/50, 80%) and severe (35/46, 76%) subgroups. Beyond these 2 variables, only sleep efficiency was found to be complete (no missing data), and no differences were found across OSA levels (P=.65). For the remaining variables, distributions were computed before and after data imputation.

OSA Clusters
Using the 14 variables significantly associated with the outcome, a hierarchical clustering structure was derived, where, given the resulting clustering structure, a 10-cluster cutoff point was chosen (following the hierarchical structure of the clustering in the dendrogram). The resulting clusters had median AHI values of 8, 10 (4 clusters), 12, 13, 14, 31, and 34. As 10 clusters are difficult to interpret in a medical context, we chose to aggregate the 10 created clusters into 3 clusters according to their median values: (1) clusters with median 8 and 10, (2) clusters with median 12, 13, and 14, and (3) clusters with median 31 and 34.
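The aggregation of the 10 clusters into 3 can be sketched in Python as follows. The text gives only the observed medians in each group, so the cut points used here (median ≤10 for group 1, ≤14 for group 2, and higher for group 3) are assumptions chosen to reproduce that grouping, and the function name is hypothetical.

```python
from statistics import median

def aggregate_clusters(ahi_by_cluster):
    """Map each fine-grained cluster to group 1, 2, or 3 by its median AHI.

    `ahi_by_cluster` maps a cluster identifier to the AHI values of its members.
    """
    groups = {}
    for cluster_id, ahi_values in ahi_by_cluster.items():
        m = median(ahi_values)
        if m <= 10:        # observed medians 8 and 10
            groups[cluster_id] = 1
        elif m <= 14:      # observed medians 12, 13, and 14
            groups[cluster_id] = 2
        else:              # observed medians 31 and 34
            groups[cluster_id] = 3
    return groups
```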
The OSA cluster characteristics of the 14 predictive variables are described below and listed in Table 2. The witnessed apneas variable was also statistically significant in both the original and the curated data but was not considered for the cluster hierarchy, as it depends on third-party reporting, which might create a strong bias in the analysis. In contrast to cluster 1, but in concordance with cluster 2, nocturia was described in all patients in cluster 3. In addition, arterial hypertension, diabetes, and dyslipidemia were observed in all the patients in this cluster. The median AHI was 31 (range 21-60), the highest of the 3 clusters. Witnessed apneas were also reported in the highest proportion of all clusters (13/16, 81.2%).
Age strata and BMI were found to be different among clusters (P<.001). Comorbidities, such as stroke, arterial hypertension, diabetes (P<.001), and dyslipidemia (P=.003), were increasingly more prevalent from cluster 1 to clusters 2 and 3. Only male sex (P=.32) and nonrepairing sleep (P=.13) were not found to be significantly different.
On the basis of the description of clusters mentioned earlier, the OSA phenotypes can be defined. We classified patients into low (cluster 1), medium (cluster 2), and high (cluster 3) severity phenotypes, as their median AHI corresponded to mild, moderate, and severe levels, respectively, defined in PSG for OSA diagnosis. The low severity phenotype includes age >45 years, a fair distribution in normal and overweight patients, accentuating obesity, and low prevalence of symptoms and comorbidities, except for dyslipidemia and arterial hypertension. The medium severity phenotype has almost the same age distribution as the low severity phenotype but fewer normal-weight patients and more overweight patients. Symptoms and comorbidities were more prevalent, with stroke, arterial hypertension, dyslipidemia, and nocturia appearing in more than 85% of the patients with this phenotype. The high severity phenotype presents older and obese patients, with additional comorbidities (congestive heart failure and diabetes) beyond those present in the medium severity phenotype. The foremost difference between our phenotypes and AHI alone is that we considered the risk and diagnostic factors associated with the patient and not only a single value or a count of events.

Affinity Between Healthy Patients and OSA Phenotypes
Given that our data set included patients who are healthy and with OSA (a total of 318 individuals), we focused our attention on exploring whether the determined OSA phenotypes could also help to segment healthy patients. To do so, we computed the aforementioned distance measure between 2 individuals using the same 14 significant variables. Table 3 describes the baseline characteristics of healthy patients for each OSA phenotype.
As expected, a high severity phenotype was less common in healthy patients (7/111, 6.3%), including older (P<.001), females (P=.49), and obese individuals (P=.50), with a lower proportion of individuals reporting nonrepairing sleep (P=.36). This phenotype also presented the highest proportion of reported nocturia, stroke, arterial hypertension, congestive heart failure, and diabetes (P<.001); pulmonary hypertension and arrhythmias (P=.01); and respiratory changes (P=.11). The medium severity phenotype had the highest proportion of overweight males aged between 45 and 64 years. Although comorbidities such as pulmonary hypertension, congestive heart failure, arrhythmias, and pacemaker or implantable cardioverter-defibrillator do not reach proportions higher than 1%, others such as stroke, arterial hypertension, diabetes, and dyslipidemia present proportions higher than 70%. The low severity phenotype is similar to the medium severity phenotype in terms of the proportion of overweight males, but individuals are younger. Nocturia, pulmonary hypertension, congestive heart failure, arrhythmias, pacemaker or implantable cardioverter-defibrillator, and diabetes have not been reported in this phenotype. Dyslipidemia was the most common comorbidity (16/25, 64%), followed by arterial hypertension (14/25, 56%) and respiratory changes (7/25, 28%).

Beyond OSA Phenotypes
OSA is a systemic disorder that remains underdiagnosed. Physicians, particularly nonspecialists in sleep disorders, urgently need a simple yet complete tool that allows them to identify a high pretest probability for OSA. This ability, which could enhance current screening, could lead to personalized treatment by additionally improving the understanding of OSA mechanisms and the risk for adverse events.
Our clinical prediction algorithm, that is, previously described OSA phenotypes, is a new way to screen patients, extending traditional approaches. To implement this new strategy, we need a simple, understandable, and updatable tool that can be used daily and that takes into account the knowledge of experts, the literature evidence, and the clinical data.
Belief or Bayesian networks [33] are probabilistic graphical models used to represent knowledge about an uncertain domain; each node represents a random variable, whereas directed edges between the nodes represent probabilistic dependencies among the corresponding variables. Bayesian networks are both mathematically rigorous and intuitively understandable, as they reflect a simple conditional independence statement, that is, each variable is independent of its nondescendants in the graph, given the state of its parents. The Bayesian network thus consists of both a qualitative model (which shows the relationship among variables) and a quantitative model (the joint probability distribution is expressed as conditional probabilities).
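The conditional independence statement above is equivalent to a factorization of the joint distribution. Written out in standard notation (the symbols are introduced here for illustration and do not appear in the original text), with $\mathrm{Pa}(X_i)$ denoting the parents of node $X_i$, and with the naïve Bayes classifier shown as the special case in which the outcome $O$ is the only parent of every predictor:

```latex
% General Bayesian network factorization over variables X_1, ..., X_n
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Naive Bayes special case: outcome O is the sole parent of each predictor X_i
P(O \mid x_1, \dots, x_n) \;\propto\; P(O) \prod_{i=1}^{n} P(x_i \mid O)
```

The first line is the quantitative model expressed through the qualitative one: the graph determines which conditional probability tables are needed.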
Initially, we created the simplest Bayesian classifier (naïve Bayes; Figure 1, Model A), which assumes that the predictive variables are conditionally independent, given the outcome. Subsequently, we extended the model (Figure 2, Model B), adding the defined phenotypes as a parent node of all predictors, thereby adjusting the model by capturing possible interactions among them, expressed by the corresponding phenotype associated with the tested individual. To evaluate the benefits of including OSA phenotypes in the clinical risk assessment tool, it was necessary to estimate the overall performance of each model. The ROC curves of each model (for both leave-one-out and cross-validation estimates) are presented in Figure 3, assessing the discriminative power of both models. As shown in Table 4, the area under the curve (AUC) in the derivation sample improved from 72% (95% CI 66-78) for model A to 84% (95% CI 80-89) for model B. The validity assessment confirmed the improvement achieved by the inclusion of OSA phenotypes, with leave-one-out estimates of 68% and 78% for models A and B, respectively, and with 10 times twofold cross-validation averaging 67% and 77%, respectively. In addition, the diagnostic odds ratio, as a measure of the effectiveness of a diagnostic test, was 3.55 for model A and more than twice that for model B (8.14).
Figure 1. Naïve Bayesian network representation of the relationships between the outcome (obstructive sleep apnea) and each of the 14 significant predictive variables. The bars within each variable represent the prior marginal probabilities for the category of each variable. CDI: implantable cardioverter-defibrillator; CHF: congestive heart failure; OSA: obstructive sleep apnea.
Aiming at a 95% sensitivity target (screening strategies look for rule-out approaches), cutoff points were defined based on the derivation sample ROC curve, and the corresponding validity assessment results for cross-validation are displayed in Table 4, presenting an increase in specificity (26%-38%) for the desired level of sensitivity and presenting a posttest odds of 3 to 1 for the positive result and almost 1 to 5 for the negative result.
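Choosing a cutoff for a sensitivity target can be sketched in Python as below. This is an illustration rather than the authors' pROC-based procedure: it assumes that a predicted probability at or above the cutoff counts as a positive result, and the function names are hypothetical.

```python
import math

def cutoff_for_sensitivity(probabilities, has_osa, target=0.95):
    """Return the highest observed cutoff whose sensitivity meets the target.

    `probabilities` are predicted OSA probabilities and `has_osa` the matching
    true labels. A prediction is positive when probability >= cutoff.
    """
    positives = sorted(
        (p for p, y in zip(probabilities, has_osa) if y), reverse=True
    )
    if not positives:
        raise ValueError("at least one patient with OSA is required")
    # Number of true positives needed to reach the target sensitivity.
    needed = math.ceil(target * len(positives))
    return positives[needed - 1]

def specificity_at(probabilities, has_osa, cutoff):
    """Share of patients without OSA whose probability falls below the cutoff."""
    negatives = [p for p, y in zip(probabilities, has_osa) if not y]
    return sum(p < cutoff for p in negatives) / len(negatives)
```

Fixing sensitivity first and then reading off specificity reflects the rule-out logic of screening: few patients with the disease should be missed, and the remaining question is how many healthy individuals can be spared further testing.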
On the basis of the model with OSA phenotypes, OSA probabilities >22% were considered a positive result. The application of this cutoff resulted in a sensitivity value of 93% (95% CI 91-95) and 73% (95% CI 73-74) of positive predictive value, managing to provide a sensitive tool that prevents 1 out of 5 healthy individuals from unnecessarily undergoing PSG.
In our sample, the pretest probability was 65%, whereas the posttest probability increased to 75% using model B, with a posttest negative probability of 18%, as shown in Figure 4. These results highlight the value of using defined OSA phenotypes as predictors of OSA risk in referred individuals.
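The relationship between the odds and probabilities quoted above can be checked with a short Python sketch (function names are hypothetical). Odds of 3 to 1 correspond to a probability of 75%, and a probability of 18% corresponds to odds of about 0.22, that is, a little under 1 to 4.6, which the text rounds to "almost 1 to 5."

```python
def probability_to_odds(probability):
    """Convert a probability in [0, 1) to odds in favor."""
    return probability / (1 - probability)

def odds_to_probability(odds):
    """Convert odds in favor to a probability."""
    return odds / (1 + odds)

def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply Bayes' rule in odds form: posttest odds = pretest odds x LR."""
    posttest_odds = probability_to_odds(pretest_probability) * likelihood_ratio
    return odds_to_probability(posttest_odds)

positive_result = odds_to_probability(3.0)      # 0.75
negative_result = probability_to_odds(0.18)     # about 0.22
```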

Principal Findings
Understanding OSA patterns is important, particularly in the diagnosis of OSA. The AASM task force affirmed that the evaluation with clinical tools, such as clinical prediction algorithms, was less burdensome to the patient and physicians when compared with PSG. However, their low levels of accuracy and the likelihood of misdiagnosis must be weighed. Therefore, they proposed a clinical algorithm for the implementation of clinical practice guidelines for OSA. In the second step of this algorithm, the increased risk of moderate to severe OSA is measured by the presence of excessive daytime sleepiness and at least 2 of the following 3 criteria: habitual loud snoring, witnessed apnea or gasping or choking, or diagnosed hypertension. When we applied this moderate to severe risk in our data set (n=318), we found a sensitivity of 29%, a specificity of 68%, a positive predictive value of 50%, and a positive likelihood ratio of 0.875, showing possible benefits for a rule-out approach. However, considering the target of moderate to severe OSA identification, this approach revealed a very low level of sensitivity for a rule-in approach, which would be expected in this case.
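The second-step rule described above can be written as a short Python predicate. This is a sketch of the rule as paraphrased in this text, with hypothetical parameter names, and not a reproduction of the full AASM algorithm.

```python
def increased_risk_moderate_to_severe(
    excessive_daytime_sleepiness,
    habitual_loud_snoring,
    witnessed_apnea_or_gasping_or_choking,
    diagnosed_hypertension,
):
    """Return True when sleepiness is present with at least 2 of the 3 criteria."""
    criteria_met = sum(
        bool(c)
        for c in (
            habitual_loud_snoring,
            witnessed_apnea_or_gasping_or_choking,
            diagnosed_hypertension,
        )
    )
    return bool(excessive_daytime_sleepiness) and criteria_met >= 2
```

Because excessive daytime sleepiness is mandatory, a patient with all 3 additional criteria but no reported sleepiness is still classed as not at increased risk, which is one way such a rule can lose sensitivity.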
To the best of our knowledge, this study is the first attempt to explore different clinical phenotypes of patients with OSA using categorical cluster analysis combined with Bayesian networks. We applied a hierarchical clustering procedure using Ward linkage on 14 significant predictive variables (out of the tested 47) that were grouped into 3 clusters: low, medium, and high severity phenotypes. These phenotypes were then used to expand a clinical prediction algorithm based on Bayesian networks, creating a simple but complete and updatable tool for OSA screening that can deal with missing information, based only on clinical and demographic variables, which have the main advantage of being easily available and quickly acquired by physicians.
Cluster analysis has been used in many medical conditions aiming to identify clinical phenotypes, as in the case of patients with asthma [16], where 5 clinical phenotypes illustrated the heterogeneity of the disease and relevant differences in treatment. Regarding OSA, clustering had been discussed as a possible helpful tool back in 1992, where the work of Tsuchiya et al [34] tried to apply cluster analysis in patients with OSA to overcome the stated overemphasis regarding obesity, which may have caused some physicians to overlook other potential factors that predispose this condition. They considered the apnea index (the standard at the time) and applied hierarchical clustering with average linkage, resulting in 2 clusters. The authors highlighted the controversy on the number of clusters, stating that "it should be essential to determine the number of clusters in a realistic way, and also to interpret the structures of clusters from a biologic standpoint." Ye et al [35] collected demographic and survey data about sleep-related health issues (using numeric predictive variables) identifying 3 clusters: cluster 1 as disturbed sleep group, cluster 2 as minimally symptomatic group, and cluster 3 as excessive daytime sleepiness group. Although we have studied predictive variables related to daytime sleepiness, none were considered statistically significant; therefore, it is difficult to compare the results of the study by Ye et al [35] with the results of this study. Lacedonia et al [7] developed the work of Ye et al [35], enhancing the results using instrumental data, such as blood gas analysis and spirometry parameters (unavailable to us), to identify clinical presentations of patients with OSA. The authors used 2 approaches: a first one with hierarchical clustering revealing 3 clusters and the second one expanding it to 8 clusters with local optimization through principal component analysis.
Other studies have recently been developed, namely, the broad one in sleep apnea from the Sleep Apnea Network or European Sleep Apnea Database (ESADA) group. In 2016, Saaresranta et al [22] hypothesized that distinct OSA phenotypes should be present when discussing comorbidities and adherence to nasal continuous positive airway pressure (CPAP) therapy. This study differs from ours in 3 main ways: (1) the ESADA database accepted PSG and cardiorespiratory polygraphy, whereas we only accepted PSG results; (2) they accepted CPAP therapy; and (3) they divided their patients into categories based only on subjective daytime sleepiness and nocturnal complaints. Regarding this last aspect, in our study, neither subjective excessive daytime sleepiness nor the Epworth Sleepiness Scale was considered in the cluster analysis. In 2020, a study by Bailly et al [21] applied latent class analysis to identify OSA phenotypes while reflecting geographical variations, resulting in 8 distinct clusters that were divided into 2 main categories: gender-based phenotypes (clusters 2 and 6 with only men and clusters 7 and 8 with only women) and men with various combinations (clusters 1, 3, 4, and 5), with which we can compare results. Cluster 3 of the study by Bailly et al [21] is described as obese comorbid patients, being the most similar to our low severity OSA cluster, presenting almost the same percentage of males (69% vs 73%) and higher levels of metabolic comorbidities.
Our results suggest 3 OSA phenotypes that can help in the screening, diagnosis, and later treatment of patients with OSA. These phenotypes capture the full OSA spectrum, as we focused on a detailed description of patients with OSA rather than a stereotypical one in which only a few typical symptoms, such as snoring or daytime sleepiness, are analyzed. To increase awareness of this prevalent disease, we also analyzed healthy patients to determine whether the created phenotypes could be used as identifiers of precursors of OSA.

Strengths and Limitations
This study had a modest number of patients, mainly because of the short period for data collection, which was performed in a small district hospital. Nevertheless, we believe that the procedure and the results are relevant. We also acknowledge that our phenotypes are not fully in accordance with clinical phenotyping experience, particularly regarding upper airway morphology. We expect that the inclusion of other relevant outcome data would allow a more robust analysis of the determined phenotypes. Including more patients and additional dissociating variables, such as craniofacial upper airway abnormalities, could benefit future research.
The major strengths of this study are the use of a clinical cohort representing patients with OSA at all levels of severity and the inclusion of a comprehensive number of risk and diagnostic factors that enhance our understanding of OSA diagnosis. The overall cross-validated discriminative power (AUC) was 77%, and the specificity of a (designed) 95% sensitivity rule-out clinical prediction algorithm improved (3 to 1 odds for a positive result and 1 to 5 odds for a negative result). In addition, a diagnostic odds ratio higher than 1 was observed for models A and B, supporting the effectiveness of both models, with model B (inclusion of the disease phenotypes) doubling the diagnostic model performance. To assess the validity of our approach, we evaluated a logistic regression model in the derivation cohort, with and without predefined clusters, which highlighted the added discrimination value of using OSA phenotypes as a predictive variable (81% vs 83%). Moreover, we are aware that several clinical questionnaires (Berlin, STOP-BANG [snoring, tiredness, observed apnea, blood pressure, body mass index, age, neck circumference and gender], and NoSAS [neck, obesity, snoring, age, sex]) are helpful in identifying patients who are at risk of OSA. The Berlin questionnaire, when applied to the general population, reaches a sensitivity of 37% and a specificity of 84%, whereas when applied to primary care patients, the values are 86% and 77%, respectively [36]. The STOP-BANG questionnaire was validated in preoperative patients, with sensitivity and specificity values of 84% and 39%, respectively, for OSA diagnosis [37]. Finally, the NoSAS score was validated for the general population; the sensitivity varied between 79% and 85%, the specificity varied between 69% and 77%, and the AUC varied between 74% and 81% [38].
Comparing these results with ours, our sensitivity is the highest, which is consistent with our aim of establishing a rule-out approach. On the other hand, our values for specificity and AUC were lower, comparable only with the value obtained for STOP-BANG.
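The relationship between sensitivity, specificity, and the diagnostic odds ratio can be illustrated with a short calculation. The sketch below assumes the standard definitions (positive likelihood ratio = sensitivity / (1 - specificity), negative likelihood ratio = (1 - sensitivity) / specificity, and diagnostic odds ratio as their quotient); the function names are hypothetical, and the inputs are the point estimates reported for model B (93% sensitivity, 38% specificity) and for the questionnaires cited above.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test, using the standard definitions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity, specificity):
    """Diagnostic odds ratio (DOR) = LR+ / LR-."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


# Point estimates (sensitivity, specificity) as reported in the text.
reported = {
    "Model B (this study)": (0.93, 0.38),
    "Berlin, general population [36]": (0.37, 0.84),
    "Berlin, primary care [36]": (0.86, 0.77),
    "STOP-BANG, preoperative [37]": (0.84, 0.39),
}

for name, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    dor = diagnostic_odds_ratio(sens, spec)
    print(f"{name}: LR+={lr_pos:.2f}, LR-={lr_neg:.2f}, DOR={dor:.2f}")

# For model B this gives a DOR of about 8.14, matching the reported value.
```

These figures were obtained in different populations, so the calculation is meant only to show how the quantities relate and not to rank the tools.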

Conclusions
We can affirm that using OSA phenotypes as predictors allows the creation of sensitive tools, with the defined phenotypes reflecting the early expression and the natural history of OSA. Nevertheless, OSA and individual responses are not static and evolve with time, creating the need for further studies to evaluate fluctuations in phenotypes and determine their long-term diagnostic implications.