Background: Detecting fluid biomarkers at an early stage of disease is essential for timely preventive treatment and better patient outcomes. Saliva is a non-invasive biofluid that contains multiple biomarkers reflecting systemic health conditions. Advances in artificial intelligence (AI) have generated substantial interest in using salivary biomarkers for predictive diagnostics. This study aimed to develop and validate an AI-driven diagnostic tool based on salivary biomarkers for identifying systemic conditions, namely diabetes mellitus, cardiovascular disease, and chronic kidney disease. Materials and Methods: In this cross-sectional study, 300 participants were distributed equally among the systemic disease groups and a healthy control group. Unstimulated saliva samples were collected and analyzed with ELISA kits for glucose, cortisol, CRP, creatinine, and IL-6. After preprocessing and normalization, the data were used to train Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN) machine learning models. Model performance was evaluated using accuracy, precision, recall, and AUC-ROC. Results: The ANN performed best among the tested models, with 91.3% diagnostic accuracy, 89.8% precision, 92.5% recall, and an AUC-ROC of 0.95. The RF model reached 88.7% accuracy with an AUC-ROC of 0.91, while the SVM model attained 85.4% accuracy. Salivary IL-6 and CRP were strong indicators of conditions linked to systemic inflammation, and salivary glucose correlated strongly with diabetic status (correlation coefficient 0.78, p < 0.001).
Conclusion: The AI-based diagnostic model uses saliva samples to identify systemic diseases at an early stage through non-invasive testing. Integrating such tools into routine clinical practice has the potential to substantially strengthen preventive healthcare. Further long-term studies in diverse populations are needed to validate these findings.
Early detection of systemic disease is a vital element of healthcare that leads to better treatment outcomes and improved disease control. Current diagnostic procedures are often invasive, costly, and of limited accessibility, particularly in resource-limited settings. Saliva is an attractive diagnostic fluid because it can be collected easily and non-invasively and contains multiple biomarkers that reflect both oral and systemic health (1,2). It contains hormones, enzymes, antibodies, cytokines, and metabolic by-products that reveal the condition of organs outside the oral cavity (3,4).
Multiple studies have shown that salivary biomarker profiles correlate with the progression and severity of diabetes mellitus, cardiovascular disease, and chronic kidney disease (5,6). For example, elevated salivary IL-6 and glucose concentrations have been detected in inflammatory and diabetic conditions, which supports the use of saliva for non-invasive monitoring (7,8). However, the clinical use of saliva diagnostics remains limited because interpreting multiple biomarkers together is complex.
Artificial intelligence, particularly machine learning, offers a powerful way to analyze complex biological data and uncover hidden patterns that traditional statistics may miss (9,10). AI systems have demonstrated high diagnostic accuracy in radiology, pathology, and genomics and are beginning to gain traction in salivary diagnostics (11). Machine learning algorithms such as random forests, support vector machines, and neural networks can analyze multidimensional datasets for disease prediction.
This study was designed to develop and validate an AI-based diagnostic system for the early detection of systemic diseases using salivary biomarkers. By combining the analytical power of AI with the diagnostic potential of saliva, it aims to provide an accessible, non-invasive approach to early-stage disease screening.
A cross-sectional study was conducted to develop an AI-based diagnostic system that uses salivary biomarkers for the early detection of systemic diseases. The study enrolled 300 participants aged 25 to 65 years, divided into four groups of 75 each: healthy controls and patients with diabetes mellitus, cardiovascular disease, or chronic kidney disease. Systemic conditions were diagnosed on the basis of clinical criteria, documented medical history, and laboratory findings. Individuals with active oral infections or autoimmune diseases, and those receiving immunosuppressive therapy, were excluded to avoid confounding.
To limit circadian variation, unstimulated whole saliva samples were collected between 9:00 AM and 11:00 AM. Participants were instructed to refrain from eating, drinking, smoking, and oral hygiene procedures for at least 90 minutes before sample collection. Saliva was collected in sterile polypropylene tubes and immediately placed on ice. Samples were centrifuged at 3,000 rpm for 15 minutes at 4°C, and the supernatant was stored at −80°C until analysis.
Salivary glucose, cortisol, C-reactive protein (CRP), interleukin-6 (IL-6), and creatinine were measured with commercially available enzyme-linked immunosorbent assay (ELISA) kits according to the kit protocols. Each measurement was performed in duplicate to ensure accuracy and reproducibility.
The data were preprocessed to handle missing values and inconsistent data points. All biomarker levels were normalized by conversion to z-scores. Mutual information and recursive feature elimination were used to identify the most informative biomarkers for model building.
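The preprocessing steps described here can be illustrated with a minimal scikit-learn sketch. It is not the study's code: the median imputation strategy, the logistic-regression estimator inside recursive feature elimination, the number of retained features, and the synthetic placeholder data are all assumptions made only so the example runs.

```python
import numpy as np
from sklearn.feature_selection import RFE, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Column order is hypothetical; the five biomarkers are those named in the study.
BIOMARKERS = ["glucose", "cortisol", "crp", "creatinine", "il6"]


def preprocess(X):
    """Fill missing values (median imputation is an assumption) and z-score each column."""
    X_imputed = SimpleImputer(strategy="median").fit_transform(X)
    return StandardScaler().fit_transform(X_imputed)


def rank_features(X_z, y, n_keep=4, seed=0):
    """Return mutual-information scores and the boolean mask of features kept by RFE."""
    mi = mutual_info_classif(X_z, y, random_state=seed)
    # The estimator used inside RFE is an assumption; the source does not name one.
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_keep)
    rfe.fit(X_z, y)
    return mi, rfe.support_


# Synthetic placeholder data: 4 groups of 75, NOT the study's measurements.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 75)
X = rng.normal(size=(300, 5)) + y[:, None] * np.array([1.0, 0.2, 0.8, 0.6, 0.9])
X[rng.random(X.shape) < 0.02] = np.nan  # simulate a few missing readings

X_z = preprocess(X)
mi_scores, keep_mask = rank_features(X_z, y)
for name, score, kept in zip(BIOMARKERS, mi_scores, keep_mask):
    print(f"{name:10s} MI={score:.3f} kept={kept}")
```

In a real analysis the imputer and scaler would normally be fitted on the training portion only, so that no information from the test set leaks into the normalization.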
Three supervised machine learning models were trained: Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Stratified sampling was used to split the data into training (80%) and testing (20%) sets while preserving class proportions. Model training and evaluation were implemented in Python 3.9 using the Scikit-learn and TensorFlow libraries. Hyperparameters were optimized by grid search with five-fold cross-validation. The models were compared on accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUC-ROC). Confusion matrices were used to visualize classification performance across disease categories.
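The workflow in this paragraph (class-balanced 80/20 split, grid search with five-fold cross-validation, and evaluation on the held-out set) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: scikit-learn's `MLPClassifier` stands in for the TensorFlow network, the hyperparameter grids are arbitrary, macro averaging of the multiclass metrics is assumed because the source does not state the averaging scheme, and the data are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data: 4 classes x 75 samples, 5 features. NOT study data.
rng = np.random.default_rng(42)
y = np.repeat(np.arange(4), 75)
X = rng.normal(size=(300, 5)) + 1.5 * np.eye(4, 5)[y]

# Stratified 80/20 split keeps the four classes balanced in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Hypothetical, deliberately small grids; the source does not list the values searched.
candidates = {
    "RF": (RandomForestClassifier(random_state=42),
           {"randomforestclassifier__n_estimators": [50, 100]}),
    "SVM": (SVC(probability=True, random_state=42),
            {"svc__C": [0.1, 1, 10]}),
    # MLPClassifier is a stand-in for the TensorFlow ANN used in the study.
    "ANN": (MLPClassifier(max_iter=2000, random_state=42),
            {"mlpclassifier__hidden_layer_sizes": [(8,), (16,)]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling inside the pipeline is refitted on each training fold only.
    pipe = make_pipeline(StandardScaler(), estimator)
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy").fit(X_train, y_train)
    y_pred = search.predict(X_test)
    y_prob = search.predict_proba(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_test, y_pred, average="macro", zero_division=0),
        "auc": roc_auc_score(y_test, y_prob, multi_class="ovr"),  # one-vs-rest AUC
        "confusion": confusion_matrix(y_test, y_pred),
    }

for name, m in results.items():
    print(name, {k: round(v, 3) for k, v in m.items() if k != "confusion"})
```

Placing the scaler inside the pipeline is a design choice that keeps the cross-validation estimate honest, since each fold is normalized using only its own training data.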
A total of 300 participants were included in the analysis, equally divided into four groups: healthy controls, diabetes mellitus, cardiovascular disease, and chronic kidney disease (n = 75 per group). The mean age of participants was 47.6 ± 10.2 years, with no significant difference in age distribution among the groups (p = 0.12). The gender distribution was approximately balanced, with 154 males (51.3%) and 146 females (48.7%).
Salivary Biomarker Levels
The mean salivary concentrations of key biomarkers across groups are presented in Table 1. Diabetic patients had significantly elevated salivary glucose (mean 10.4 ± 2.8 mg/dL), while cardiovascular disease patients showed higher levels of CRP (6.7 ± 1.9 mg/L) and IL-6 (18.3 ± 3.5 pg/mL). Chronic kidney disease patients exhibited elevated creatinine (0.85 ± 0.21 mg/dL) compared to controls (0.32 ± 0.11 mg/dL). All biomarker differences between control and disease groups were statistically significant (p < 0.001).
Table 1: Mean Salivary Biomarker Concentrations Across Groups
| Biomarker | Healthy Controls | Diabetes Mellitus | Cardiovascular Disease | Chronic Kidney Disease |
| --- | --- | --- | --- | --- |
| Glucose (mg/dL) | 3.1 ± 1.2 | 10.4 ± 2.8 | 4.2 ± 1.5 | 4.5 ± 1.3 |
| CRP (mg/L) | 1.2 ± 0.6 | 3.9 ± 1.0 | 6.7 ± 1.9 | 5.1 ± 1.7 |
| IL-6 (pg/mL) | 4.8 ± 1.9 | 12.5 ± 3.2 | 18.3 ± 3.5 | 15.7 ± 2.8 |
| Creatinine (mg/dL) | 0.32 ± 0.11 | 0.54 ± 0.14 | 0.60 ± 0.17 | 0.85 ± 0.21 |
| Cortisol (ng/mL) | 4.1 ± 1.5 | 6.9 ± 2.0 | 6.1 ± 1.8 | 5.8 ± 1.6 |
Machine Learning Model Performance
Among the three machine learning algorithms evaluated, the Artificial Neural Network (ANN) outperformed the others in terms of diagnostic accuracy and generalizability. Table 2 summarizes the comparative performance metrics for each model. The ANN model achieved the highest accuracy (91.3%), precision (89.8%), and area under the ROC curve (AUC = 0.95). The Random Forest (RF) model also performed well, with an accuracy of 88.7% and AUC of 0.91. The Support Vector Machine (SVM) model recorded the lowest performance with an accuracy of 85.4% and AUC of 0.88.
Table 2: Performance Metrics of Machine Learning Models for Disease Classification
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) | AUC-ROC |
| --- | --- | --- | --- | --- | --- |
| ANN | 91.3 | 89.8 | 92.5 | 91.1 | 0.95 |
| Random Forest | 88.7 | 86.4 | 89.3 | 87.8 | 0.91 |
| SVM | 85.4 | 83.1 | 86.2 | 84.6 | 0.88 |
The ANN model also showed the best classification performance in the confusion matrix, with most misclassifications occurring between cardiovascular disease and chronic kidney disease cases. This confusion was attributed primarily to the overlapping CRP and IL-6 levels in these groups (Table 1).
These findings suggest that AI algorithms, especially ANN, are effective in utilizing salivary biomarker data to distinguish among multiple systemic diseases with high accuracy (Table 2).
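As a quick consistency check on Table 2, the F1 scores can be recomputed from the reported precision and recall. The sketch below assumes each reported F1 is the harmonic mean of the reported precision and recall; with class-wise averaging in a multiclass setting that relationship is only approximate, so agreement here is a sanity check rather than proof of how the values were derived.

```python
# Values copied from Table 2: (precision %, recall %, reported F1 %).
reported = {
    "ANN": (89.8, 92.5, 91.1),
    "Random Forest": (86.4, 89.3, 87.8),
    "SVM": (83.1, 86.2, 84.6),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {
    model: round(f1_from(p, r), 1) for model, (p, r, _) in reported.items()
}

for model, (_, _, f1_reported) in reported.items():
    print(f"{model}: reported {f1_reported}, recomputed {recomputed[model]}")
```

Under this assumption the recomputed values agree with the reported F1 scores to one decimal place for all three models.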
This study demonstrates that AI analysis of saliva can detect systemic diseases at an early stage. Combining non-invasive salivary analysis with machine learning effectively discriminated among patients with diabetes, cardiovascular disease, and chronic kidney disease and healthy participants. The artificial neural network (ANN) achieved the highest diagnostic accuracy in this evaluation, underscoring the value of deep learning in medical diagnostics.
Saliva has gained acceptance as a diagnostic fluid because it is simple and inexpensive to collect and contains diagnostically relevant proteins, metabolites, and nucleic acids (1,2). Previous research supports the diagnostic significance of elevated salivary glucose and IL-6 levels in patients with diabetes and cardiovascular disease (3,4). The increased CRP levels observed in cardiovascular and renal disease patients are consistent with the known relationship between systemic inflammation and organ damage (5,6).
The strong performance of the ANN in this study agrees with numerous reports that neural networks excel at pattern recognition and adapt well to high-dimensional datasets (7,8). Neural networks have been applied successfully to cancer identification, diabetes prediction, and sepsis detection using heterogeneous biological data (9-11). Support vector machines remain useful but are less able to capture the non-linear classification patterns typical of biological systems (12).
Feature selection substantially improved model performance by removing irrelevant or redundant variables. IL-6, CRP, glucose, and creatinine have been reported previously as leading biomarkers for disease diagnosis (13). Z-score normalization combined with cross-validation helped reduce overfitting and improve generalization to new data, which is essential for real-world clinical use (14).
These findings support the deployment of AI-based point-of-care diagnostic systems for population screening in low-resource settings. Saliva collection is a more comfortable alternative to blood draws, which may increase patient participation, especially among pediatric, geriatric, and medically compromised patients (15). This study has several limitations, including its cross-sectional design, limited sample size, and lack of information on other systemic medical conditions.
Future research should add new biomarkers to the model, follow patients longitudinally, and test the model in different demographic groups, including different ethnicities. Integrating AI-based saliva diagnostics with wearable technology and mobile applications could enable real-time monitoring and personalized disease prediction.
This study indicates that AI models analyzing salivary biomarkers have strong potential for the non-invasive detection of systemic diseases at an early stage. As these technologies continue to develop and undergo validation, they hold promise for the future of preventive healthcare delivery.