Computational machine learning estimation of digitoxin solubility in supercritical solvent at different temperatures utilizing ensemble methods

2025-08-10 07:14:35 English original

Author: Garg, Gunjan

Scientific Reports volume 15, Article number: 29248 (2025)

Abstract

The solubility of medications in a supercritical solvent is a key process parameter that can be estimated with appropriate computational tools. This work models, as a case study, the solubility of digitoxin in supercritical CO2 together with the solvent density, using ensemble methods. Temperature and pressure are the input parameters, while solvent density and digitoxin solubility are the output parameters. Several machine learning models, coupled with an optimizer, were used to correlate the dataset. AdaBoost was employed as the ensemble method with three base models: Bayesian Ridge Regression (BRR), Gaussian process regression (GPR), and K-nearest neighbors (KNN). The Sailfish Optimizer (SFO) was used for hyper-parameter tuning to enhance model performance. The AdaBoost-boosted GPR model (ADA-GPR) exhibits the lowest Average Absolute Relative Deviation (AARD%) values: 7.74 for solubility and 2.76 for solvent density. This underscores the efficacy of ensemble methods and hyper-parameter tuning in accurately predicting complex chemical properties in supercritical CO2 systems.

Introduction

Size reduction is the most commonly used method for enhancing the solubility of medicines and can be implemented for various solid-dosage formulations. According to the Biopharmaceutical Classification System (BCS), drugs with low solubility and high permeability are classified as BCS class II1. Size reduction enhances the solubility, and consequently the bioavailability, owing to the higher surface energy of drug powder at the nanoscale2. This approach is therefore suitable in the pharmaceutical industry for improving the properties of poorly soluble medications. Another approach for enhancing solubility is amorphization, in which a drug is transformed from the crystalline state to an amorphous state with higher solubility. However, the amorphous state has poor stability: it recrystallizes over time and loses its advantageous properties. It must therefore be stabilized, which can be done by dispersing the drug inside a polymeric matrix to restrict mobility and prevent recrystallization3,4.

For size reduction, supercritical processing can be employed. This recently developed pharmaceutical processing method can reduce particle size in a continuous mode of operation. It is also referred to as green processing because no organic solvent is used in the size reduction operation, thereby offering sustainable processing5,6,7. For size reduction by this method, the drug must dissolve in the solvent to an acceptable extent, so the solubility of the drug in the supercritical solvent is the limiting step in developing this green process. The solubility of the medicine in the supercritical solvent should therefore be estimated with robust methods before operation. Computational techniques can serve as screening tools for assessing drug nanoparticle production in this process. The thermodynamic approach, which is built on phase equilibria such as solid-liquid equilibrium, is a reliable tool for estimating drug solubility in supercritical solvents (e.g., CO2)8,9.

Despite their advantages for correlating drug solubility in supercritical solvents, thermodynamic models are not easy to apply, and implementing them can be complex. Models with less complexity and greater generality would therefore be preferred for estimating drug solubility in supercritical solvents. The data-driven approach is an alternative for correlating drug solubility that relies on the availability of measured datasets. Several machine learning and optimization algorithms have been implemented for estimating drug solubility in supercritical carbon dioxide, and reports have shown high accuracy for these models10,11,12,13. Machine learning (ML) is one such data-driven approach; the field is growing rapidly and has been widely employed in applications including the analysis of measured data for optimization, regression, and prediction14. Artificial intelligence (AI) based models and deep learning (DL) have recently been studied for pharmaceutical and energy applications15,16,17 and can be further explored for solubility analysis of medications.

Machine learning techniques facilitate the creation of accurate and reliable models that can identify patterns within complex datasets and produce predictions based on those patterns18,19,20. This study utilizes AdaBoost as an ensemble method to incorporate predictions from Gaussian process regression (GPR), Bayesian Ridge Regression (BRR), and K-nearest neighbors (KNN). The Sailfish Optimizer (SFO) is employed for hyper-parameter tuning to enhance the performance of the models.

GPR, rooted in Bayesian theory and statistical learning theory, has emerged as a useful machine learning technique. It provides a flexible framework for probabilistic regression and is widely used for regression problems marked by high dimensionality, limited sample sizes, or nonlinearity21. Bayesian Ridge Regression combines Bayesian inference with linear regression and provides more reliable and efficient estimation of the coefficients of linear models than least squares regression22. KNN regression is a data-driven technique that identifies the K training instances most similar to a given input and takes the mean of their target values as the prediction23.
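The KNN prediction rule described above (find the K closest training instances, then average their targets) can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: the function name `knn_predict`, the Euclidean distance, and the numeric values are all assumptions made for the example.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Euclidean distance from the query to every training point
    distances = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Made-up (temperature, pressure) inputs and targets, for illustration only
train_X = [(308.0, 120.0), (308.0, 200.0), (328.0, 120.0), (328.0, 200.0)]
train_y = [1.0, 3.0, 2.0, 6.0]
prediction = knn_predict(train_X, train_y, query=(310.0, 125.0), k=2)
```

In practice the two inputs would usually be scaled before computing distances, since temperature and pressure have different numeric ranges; the sketch omits this step for brevity.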

The current study therefore develops an ML-based methodology for modeling drug solubility in supercritical CO2 and correlating it with pressure and temperature. The models, GPR, BRR, and KNN, are optimized using the Sailfish Optimizer. Digitoxin is selected as the case study for implementing the methodology and assessing the reliability of the models and the optimizer.

Materials and methods

Data of digitoxin solubility

The measured data used for developing and testing the models were collected from reference24; a statistical analysis of the raw data is presented in Table 1. The drug analyzed is digitoxin, and its measured solubility in supercritical CO2 was used in this work for building the ML models and tuning them with the optimizer. The same data were previously used by Li et al.25 to build several ML models for estimating digitoxin solubility. Temperature and pressure are the two input features in this dataset owing to their significant impact on the solubility of digitoxin. Because the solvent is compressible, the influence of pressure should be evaluated to understand its effect on digitoxin solubility. Figure 1 shows histograms of all parameters of the dataset. The machine learning models were thus developed with two inputs and two outputs, the outputs being the solubility of the drug and the density of the solvent.

Table 1 Statistical overview of digitoxin solubility data used in this study.
Fig. 1

Histograms of all columns for digitoxin solubility.

Sailfish optimizer (SFO)

Optimizing hyper-parameters is crucial to the effectiveness of machine learning algorithms. The choice of appropriate hyper-parameters profoundly influences both the accuracy and the generalization capabilities of these algorithms26. In recent years, interest in nature-inspired optimization methods has grown owing to their efficacy in fine-tuning hyper-parameters27. The Sailfish Optimizer (SFO) is an algorithm inspired by the synchronized swimming behavior observed in sailfish28.

The SFO algorithm is a metaheuristic method inspired by the hunting and cooperative behavior of marine sailfish29. Mathematically, the Sailfish Optimizer iteratively adjusts the hyper-parameters through a series of equations that mimic the motion of the sailfish. The position of the i-th sailfish in D-dimensional space is updated as follows28:

$$X_{i}^{t+1} = X_{i}^{t} + V_{i}^{t+1}$$

where \(X_{i}^{t}\) is the position of the i-th sailfish at the t-th iteration, and \(V_{i}^{t+1}\) is the velocity vector of the i-th sailfish at the next iteration28.
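As a small worked instance of the position update, the sketch below applies the rule component-wise to one candidate hyper-parameter vector. The function name `update_position` and the numbers are hypothetical; how the velocity itself is computed is not given in this text and is therefore left as an input.

```python
def update_position(position, velocity):
    """Apply X_i^{t+1} = X_i^t + V_i^{t+1} component-wise."""
    return [x + v for x, v in zip(position, velocity)]

# Made-up 2-D hyper-parameter vector and velocity, for illustration only
x_t = [0.50, 3.0]
v_next = [0.10, -1.0]
x_next = update_position(x_t, v_next)
```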

AdaBoost

AdaBoost, an abbreviation for “adaptive boosting,” is a widely used ensemble approach that typically achieves higher predictive accuracy than its individual base models. The method builds a weighted combination of base models sequentially. At each step, a base model is fitted to the re-weighted dataset and its error is assessed; the sample weights are then adjusted so that poorly predicted samples receive more emphasis. Consequently, the error from one iteration influences the model fitted in the next. The final prediction is obtained by weighting each base model according to its error and combining their contributions30,31.
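To make the re-weighting idea concrete, the sketch below performs one boosting round for regression. It assumes the AdaBoost.R2 variant with a linear loss, which is one common choice for regression; the text does not state which variant was used, so this is an illustration rather than the study's procedure. The function name and the sample values are hypothetical, and the sample weights are assumed to be positive and to sum to one.

```python
import math

def adaboost_r2_reweight(weights, y_true, y_pred):
    """One AdaBoost.R2 round: return (new_weights, learner_weight)."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    max_err = max(abs_err)
    if max_err == 0.0:
        # Perfect fit: nothing to re-weight
        return list(weights), float("inf")
    # Linear loss scaled to [0, 1]
    loss = [e / max_err for e in abs_err]
    avg_loss = sum(w * l for w, l in zip(weights, loss))
    if avg_loss >= 0.5:
        raise ValueError("base learner too weak (average loss >= 0.5)")
    beta = avg_loss / (1.0 - avg_loss)
    # Well-predicted samples (small loss) are down-weighted the most
    raw = [w * beta ** (1.0 - l) for w, l in zip(weights, loss)]
    total = sum(raw)
    new_weights = [r / total for r in raw]
    # Weight of this base model in the final combination
    learner_weight = math.log(1.0 / beta)
    return new_weights, learner_weight

# Made-up targets and predictions: only the last sample is mispredicted
w0 = [0.25, 0.25, 0.25, 0.25]
new_w, alpha_m = adaboost_r2_reweight(w0, [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 3.0])
```

After this round the mispredicted sample carries half of the total weight, so the next base model concentrates on it.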

Base models

We utilized Gaussian Process Regression (GPR) as one of the base models. A Gaussian process (GP) is a collection of random variables, any finite number of which have a joint Gaussian distribution (GD)32,33. A GP is fully specified by its mean and covariance functions. In the GPR framework, the GD is thus extended to functions: over any finite set of inputs, the mean is represented as a vector and the covariance as a matrix34,35.

What distinguishes GPR from other regression models is that it does not require a fitting function to be specified precisely. Instead, it treats the field data as a random sample from a multidimensional GD34.
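The following sketch shows how a GP prediction follows from the mean and covariance alone, without a fitted functional form. It computes the standard GP posterior mean, k*' (K + noise I)^(-1) y, under several assumptions that are not taken from the study: a zero prior mean, a squared-exponential (RBF) kernel with a fixed length scale, a single input variable, and made-up data. All function names are hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior_mean(train_x, train_y, query, noise=1e-6, length_scale=1.0):
    """Posterior mean of a zero-mean GP at a single query point."""
    n = len(train_x)
    # Covariance matrix of the training inputs, with noise on the diagonal
    K = [[rbf(train_x[i], train_x[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    weights = solve(K, train_y)
    # Covariance between the query and each training input
    k_star = [rbf(x, query, length_scale) for x in train_x]
    return sum(k * w for k, w in zip(k_star, weights))

# Made-up one-dimensional data, for illustration only
train_x = [0.0, 1.0, 2.0]
train_y = [0.0, 1.0, 0.5]
mean_at_1 = gp_posterior_mean(train_x, train_y, 1.0)
```

With near-zero noise the posterior mean passes almost exactly through the training points, and far from the data it falls back to the prior mean of zero.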

The second base model is Bayesian Ridge Regression (BRR), which combines Bayesian inference with linear regression. This method estimates the coefficients of linear models more robustly and efficiently than least squares regression. Prior beliefs about the coefficients are encoded in a prior distribution, which is combined with the likelihood function to give the posterior distribution; the coefficients are then estimated from this posterior. We assume a normal prior on the regression coefficients with zero mean and a precision (inverse variance) determined by the hyper-parameter alpha.

The likelihood function follows a normal distribution whose mean is predicted by the linear regression model and whose precision (inverse variance) is determined by an additional hyper-parameter, lambda. The aim is to infer the probable values of the regression coefficients β from the data and the prior information. The posterior distribution of β is given by36:

$$p(\beta \mid X, y, \alpha, \lambda) = \text{N}(\beta \mid \mu, \varSigma)$$

The mean vector \(\mu\) and the covariance matrix \(\varSigma\) characterize the posterior distribution. They are obtained analytically from Bayes' formula37:

$$\mu = \lambda {(\lambda X' X + \alpha I)}^{-1} X' y$$

$$\varSigma = {(\lambda X' X + \alpha I)}^{-1}$$

Here, \(X'X\) is the product of the transposed matrix of independent variables (inputs) with itself, \(X'y\) is the product of the transposed input matrix with the vector of dependent variables (output), and \(I\) is the identity matrix. The third base model is KNN regression; more details about this model are reported elsewhere38,39,40.
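The BRR posterior can be evaluated in closed form, as the sketch below illustrates for a model with two coefficients. It assumes the standard conjugate-Gaussian result in which the posterior covariance is the inverse of lambda X'X + alpha I and the posterior mean is lambda times that covariance times X'y (the factor lambda equals 1 when the noise precision is 1). The function name `brr_posterior` and the data are hypothetical, and no intercept is included.

```python
def brr_posterior(X, y, alpha, lam):
    """Posterior mean and covariance for a 2-coefficient Bayesian linear model."""
    # Entries of X'X (symmetric 2x2) and of X'y (length 2)
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    xty = [sum(r[0] * t for r, t in zip(X, y)),
           sum(r[1] * t for r, t in zip(X, y))]
    # Posterior precision matrix: lam * X'X + alpha * I
    p00, p01, p11 = lam * a + alpha, lam * b, lam * d + alpha
    det = p00 * p11 - p01 * p01
    # Posterior covariance: inverse of the 2x2 precision matrix
    sigma = [[p11 / det, -p01 / det], [-p01 / det, p00 / det]]
    # Posterior mean: lam * Sigma * X'y
    mu = [lam * (sigma[0][0] * xty[0] + sigma[0][1] * xty[1]),
          lam * (sigma[1][0] * xty[0] + sigma[1][1] * xty[1])]
    return mu, sigma

# Made-up data generated exactly by y = 1*x1 + 2*x2, for illustration only
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
mu_weak_prior, _ = brr_posterior(X, y, alpha=1e-9, lam=1.0)
mu_strong_prior, _ = brr_posterior(X, y, alpha=100.0, lam=1.0)
```

A very small alpha gives an almost flat prior, so the posterior mean approaches the least squares solution; a large alpha shrinks the coefficients toward zero, which is the ridge effect.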

Results and discussion

The models were implemented in Python 3.8 (https://www.python.org). The hyperparameters of the proposed models were optimized via the SFO algorithm as outlined above. The optimized models were then used to evaluate drug solubility and to identify which model best predicts digitoxin solubility in the supercritical solvent. The results for each regression model in terms of Average Absolute Relative Deviation (AARD%), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) are summarized in Tables 2 and 3 for solubility and density, respectively. The errors between the estimated and measured values were used as the main metrics for evaluating the models’ precision.
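The three error metrics can be computed as sketched below. The formulas are the commonly used definitions (AARD% as 100 times the mean of the absolute relative errors); the text does not spell them out, so treating them as the definitions used in the study is an assumption. The sample values are made up.

```python
import math

def aard_percent(measured, predicted):
    """Average Absolute Relative Deviation, in percent."""
    n = len(measured)
    return 100.0 / n * sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))

def rmse(measured, predicted):
    """Root Mean Square Error."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)

def mae(measured, predicted):
    """Mean Absolute Error."""
    n = len(measured)
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / n

# Made-up measured and predicted values, for illustration only
measured = [2.0, 4.0]
predicted = [1.0, 5.0]
scores = (aard_percent(measured, predicted),
          rmse(measured, predicted),
          mae(measured, predicted))
```

Because AARD% divides by the measured value, it weights errors on small solubility values more heavily than RMSE or MAE do, which is why the three metrics are reported together.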

The findings reveal that the ADA-GPR model outperforms the other models in predicting both solubility and solvent density. Figures 2 and 3 compare the experimental and predicted values of both outputs for this model. The ADA-GPR model achieved the lowest AARD% values for both predictions. Specifically, ADA-GPR achieved an AARD% of 7.74721 for solubility prediction and 2.76323 for solvent density prediction. Additionally, the ADA-GPR model demonstrated the lowest RMSE and MAE values, further highlighting its robust performance for estimation of digitoxin solubility.

A comparison of the best model in this work with a previous study is given in Table 4. The ADA-GPR model optimized in this study has an AARD% of 7.74, which is lower than that of the traditional thermodynamic models based on equations of state (EoS) developed by Sheikhi-Kouhsar et al.24.

Table 2 Solubility predictions.
Table 3 Density of solvent predictions.
Table 4 Comparison between this work and previous study for solubility estimation.
Fig. 2

Experimental–predicted density values comparison (ADA-GPR model).

Fig. 3

Experimental–predicted digitoxin solubility values comparison (ADA-GPR model).

The strong performance of the ADA-GPR model can be ascribed to its ability to capture intricate correlations within the dataset and its flexibility in representing nonlinear patterns. The ADA-BRR model also performed reasonably well, particularly for solubility prediction, but falls behind the ADA-GPR model in fitting accuracy. The ADA-KNN model, on the other hand, exhibited the highest AARD% values and the poorest performance among the three models. The final analysis was therefore conducted using the ADA-GPR model. The final prediction surfaces are shown in Figs. 4 and 5, and Figs. 6, 7, 8 and 9 show the partial dependencies between inputs and outputs. The results of this research are consistent with previous studies on the computation of pharmaceutical solubility, which show similar variations25. The pressure and temperature effects on solubility and density found in this work agree with previous works that used machine learning to correlate drug solubility25,41,42,43.

The results support the validity of the methodology designed in this work, as the observations match the experimental trends. For drug solubility versus temperature, a decreasing trend is seen in Fig. 9 at low pressure, while at higher pressure the solubility increases with temperature. This behavior is related to the cross-over pressure, across which the direction of the solubility change with increasing temperature reverses. As for density, the trend is as expected: density increases with rising pressure and decreases with increasing temperature. The pressure trend is due to the compressibility of the solvent, supercritical CO225,44. The models in this study thus clearly show that digitoxin solubility in the solvent can be varied significantly by adjusting pressure as well as temperature to achieve a target solubility value42. A function can then be defined to inversely estimate the pressure and temperature for a given digitoxin solubility with the aid of the optimized ML model. Furthermore, the combined effect of density variations on solubility can be analyzed, since the models can determine density values as a function of temperature and pressure.
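One simple way to realize the inverse estimation mentioned above is a grid search: evaluate the trained model over candidate temperature and pressure values and keep the pair whose predicted solubility is closest to the target. The sketch below is only an illustration of that idea. The function `invert_solubility` is hypothetical, and `toy_model` is a made-up surface standing in for the fitted ADA-GPR model, with made-up grid values.

```python
def invert_solubility(model, target, temperatures, pressures):
    """Return the (T, P) grid point whose predicted solubility is closest to target."""
    return min(
        ((t, p) for t in temperatures for p in pressures),
        key=lambda tp: abs(model(*tp) - target),
    )

def toy_model(t, p):
    # Made-up surface for illustration, NOT the fitted ADA-GPR model
    return 0.01 * p + 0.002 * (t - 300.0)

# Made-up candidate operating conditions
temps = [308.0, 318.0, 328.0, 338.0]
press = [120.0, 160.0, 200.0, 240.0]
best_t, best_p = invert_solubility(toy_model, 2.0, temps, press)
```

Since two inputs map to one solubility value, several temperature and pressure pairs can match the same target; the grid search returns only the first closest point, and a practical version might add a secondary criterion such as the lowest pressure.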

Fig. 4

Simulated 3D plot of solvent density variations using ADA-GPR model. Created with Python 3.8 (https://www.python.org).

Fig. 5

Simulated 3D plot of digitoxin solubility variations using ADA-GPR model. Created with Python 3.8 (https://www.python.org).

Fig. 6

Change of density of solvent with P estimated by ADA-GPR model.

Fig. 7

Change of density of solvent with T estimated by ADA-GPR model.

Fig. 8

Change of digitoxin solubility with P estimated by ADA-GPR model.

Fig. 9

Change of digitoxin solubility with T estimated by ADA-GPR model.

Conclusion

In this work, we investigated the prediction of digitoxin solubility in supercritical CO2 and of the solvent density using ensemble methods and regression models. We employed the AdaBoost ensemble method with GPR, BRR, and KNN regression models as base models. Hyper-parameter optimization was conducted with the Sailfish Optimizer (SFO) to improve model performance. Our findings reveal that AdaBoost combined with GPR yielded the most accurate predictions for both solvent density and solubility, as evidenced by the lowest Average Absolute Relative Deviation (AARD%) values: 7.74 for solubility and 2.76 for solvent density. These results underscore the effectiveness of ensemble methods and hyper-parameter tuning in accurately predicting complex chemical properties in supercritical CO2 systems. The findings may contribute to future research on hyper-parameter optimization and on refining predictive modeling methods for analogous chemical systems.

Data availability

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

References

  1. Csicsák, D. et al. The effect of the particle size reduction on the biorelevant solubility and dissolution of poorly soluble drugs with different acid-base character. Pharmaceutics 15(1), 278 (2023).

  2. Kumar, R. et al. Particle size reduction techniques of pharmaceutical compounds for the enhancement of their dissolution rate and bioavailability. J. Pharm. Innov. 17(2), 333–352 (2022).

  3. Rams-Baron, M. et al. Amorphous Drug Solubility and Absorption Enhancement, in Amorphous Drugs: Benefits and Challenges 41–68 (Springer International Publishing, 2018).

  4. Zhuo, X. et al. Mechanisms of drug solubility enhancement induced by β-lactoglobulin-based amorphous solid dispersions. Mol. Pharm. 20(10), 5206–5213 (2023).

  5. Abourehab, M. A. S. et al. Theoretical investigations on the manufacture of drug nanoparticles using green supercritical processing: Estimation and prediction of drug solubility in the solvent using advanced methods. J. Mol. Liq. 120559 (2022).

  6. Faris Alotaibi, H. et al. Pharmaceutical nanonization by green supercritical processing: investigation of exemestane anti-estrogenic medicine solubility using machine learning. J. Mol. Liq. 392, 123353 (2023).

  7. Zhang, Y. Analysis of nanonization and purification of organic compounds via green supercritical processing: model development using advanced hybrid techniques. Case Stud. Therm. Eng. 55, 104159 (2024).

  8. Faraz, O. et al. Thermodynamic modeling of pharmaceuticals solubility in pure, mixed and supercritical solvents. J. Mol. Liq. 353, 118809 (2022).

  9. Zarei, A., Haghbakhsh, R. & Raeissi, S. Overview and thermodynamic modelling of deep eutectic solvents as co-solvents to enhance drug solubilities in water. Eur. J. Pharm. Biopharm. 193, 1–15 (2023).

  10. Alanazi, M. et al. Development of a novel machine learning approach to optimize important parameters for improving the solubility of an anti-cancer drug within green chemistry solvent. Case Stud. Therm. Eng. 49, 103273 (2023).

  11. Cenci, F. et al. Predicting drug solubility in organic solvents mixtures: A machine-learning approach supported by high-throughput experimentation. Int. J. Pharm. 660, 124233 (2024).

  12. Ghazwani, M. et al. Development of advanced model for understanding the behavior of drug solubility in green solvents: machine learning modeling for small-molecule API solubility prediction. J. Mol. Liq. 386, 122446 (2023).

  13. Wang, C. et al. Prediction of enhanced drug solubility related to clathrate compositions and operating conditions: machine learning study. Int. J. Pharm. 646, 123458 (2023).

  14. Rabbani, Y. et al. Application of artificial neural networks and support vector regression modeling in prediction of magnetorheological fluid rheometery. Colloids Surf., A. 520, 268–278 (2017).

  15. Togun, H. et al. Advancing organic photovoltaic cells for a sustainable future: the role of artificial intelligence (AI) and deep learning (DL) in enhancing performance and innovation. Sol. Energy. 291, 113378 (2025).

  16. Halawani, R. F. et al. An advanced heat design-CO2 capture network for an oxy-biogas fuel combustion cycle combined with a CAES-based method for peak shaving: An artificial intelligent-driven optimization. Renew. Energy. 242, 122474 (2025).

  17. Saini, J. P. S., Thakur, A. & Yadav, D. AI-driven innovations in pharmaceuticals: optimizing drug discovery and industry operations. RSC Pharm. 2(3), 437–454 (2025).

  18. Alpaydin, E. Introduction To Machine Learning (MIT Press, 2020).

  19. Graish, M. S. et al. Prediction of the viscosity of iron-CuO/water-ethylene glycol non-Newtonian hybrid nanofluids using different machine learning algorithms. Case Stud. Chem. Environ. Eng. 11, 101180 (2025).

  20. Hajinajaf, N. et al. Integrated CO2 capture and nutrient removal by microalgae chlorella vulgaris and optimization using neural network and support vector regression. Waste Biomass Valoriz. 13(12), 4749–4770 (2022).

  21. Zhikun, H. et al. Overview of Gaussian process regression. Control Decis. 28(8), 1121–1129 (2013).

  22. Shi, Q., Abdel-Aty, M. & Lee, J. A Bayesian ridge regression analysis of congestion’s impact on urban expressway safety. Accid. Anal. Prev. 88, 124–137 (2016).

  23. Kang, S. K-nearest neighbor learning with graph neural networks. Mathematics 9(8), 830 (2021).

  24. Sheikhi-Kouhsar, M. et al. Solubility of digitoxin in supercritical CO2: Experimental study and modeling. Eur. J. Pharm. Sci. 106731 (2024).

  25. Li, M. et al. Employment of artificial intelligence approach for optimizing the solubility of drug in the supercritical CO2 system. Case Stud. Therm. Eng. 57, 104326 (2024).

  26. Shang, Y. et al. Artificial neural network hyperparameters optimization for predicting the thermal conductivity of mxene/graphene nanofluids. J. Taiwan Inst. Chem. Eng. 164, 105673 (2024).

  27. Zhou, H. et al. Combination of group method of data handling neural network with multi-objective Gray Wolf optimizer to predict the viscosity of MWCNT-TiO2-oil SAE50 nanofluid. Case Stud. Therm. Eng. 64, 105541 (2024).

  28. Shadravan, S., Naji, H. R. & Bardsiri, V. K. The sailfish optimizer: A novel nature-inspired metaheuristic algorithm for solving constrained engineering optimization problems. Eng. Appl. Artif. Intell. 80, 20–34 (2019).

  29. Srivastava, A. & Das, D. K. A sailfish optimization technique to solve combined heat and power economic dispatch problem. In 2020 IEEE Students Conference on Engineering & Systems (SCES) (IEEE, 2020).

  30. Schapire, R. E. The strength of weak learnability. Mach. Learn. 5(2), 197–227 (1990).

  31. Freund, Y. & Schapire, R. E. Experiments with a new boosting algorithm. In ICML (Citeseer, 1996).

  32. Grbić, R., Kurtagić, D. & Slišković, D. Stream water temperature prediction based on Gaussian process regression. Expert Syst. Appl. 40(18), 7407–7414 (2013).

  33. Ma, X., Xu, F. & Chen, B. Interpolation of wind pressures using Gaussian process regression. J. Wind Eng. Ind. Aerodyn. 188, 30–42 (2019).

  34. Rasmussen, C. E. Gaussian processes in machine learning. In Summer School on Machine Learning (Springer, 2003).

  35. Song, H. et al. Advancing nanomedicine production via green method: modeling and simulation of pharmaceutical solubility at different temperatures and pressures. J. Mol. Liq. 411, 125806 (2024).

  36. Williams, P. M. Bayesian regularization and pruning using a Laplace prior. Neural Comput. 7(1), 117–143 (1995).

  37. Kruschke, J. K. Bayesian data analysis. Wiley Interdisciplinary Reviews: Cogn. Sci. 1(5), 658–676 (2010).

  38. Kudraszow, N. L. & Vieu, P. Uniform consistency of kNN regressors for functional variables. Stat. Probab. Lett. 83(8), 1863–1870 (2013).

  39. Cover, T. Estimation by the nearest neighbor rule. IEEE Trans. Inf. Theory. 14(1), 50–55 (1968).

  40. Chen, C. R. & Three Kartini, U. K-nearest neighbor neural network models for very short-term global solar irradiance forecasting based on meteorological data. Energies. 10(2), 186 (2017).

  41. Li, M. et al. Optimization of drug solubility inside the supercritical CO2 system via numerical simulation based on artificial intelligence approach. Sci. Rep. 14(1), 22779 (2024).

  42. Almehizia, A. A. et al. Numerical optimization of drug solubility inside the supercritical carbon dioxide system using different machine learning models. J. Mol. Liq. 392, 123466 (2023).

  43. Meng, D. & Liu, Z. Machine learning aided pharmaceutical engineering: model development and validation for estimation of drug solubility in green solvent. J. Mol. Liq. 392, 123286 (2023).

  44. Aldawsari, M. F., Mahdi, W. A. & Alamoudi, J. A. Data-driven models and comparison for correlation of pharmaceutical solubility in supercritical solvent based on pressure and temperature as inputs. Case Stud. Therm. Eng. 49, 103236 (2023).

Acknowledgements

The authors extend their appreciation to Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia for funding this work under researcher supporting project number (PNURSP2025R205).

Funding

This work was supported by Princess Nourah bint Abdulrahman University researchers supporting project number (PNURSP2025R205), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Author information

Authors and Affiliations

  1. Department of Pharmaceutical Sciences, College of Pharmacy, Princess Nourah Bint AbdulRahman University, 11671, Riyadh, Saudi Arabia

    Hadil Faris Alotaibi

  2. University of Warith Al-Anbiyaa, Kerbala, 56001, Iraq

    Waqed H. Hassan

  3. Department of Civil Engineering, College of Engineering, University of Kerbala, Kerbala, 56001, Iraq

    Waqed H. Hassan

  4. Al-Manara College for Medical Sciences, Amarah, Maysan, Iraq

    Ahmed Kateb Jumaah Al-Nussairi

  5. Faculty of Data Science and Information Technology, INTI International University, Persiaran Perdana BBN, Putra Nilai, 71800, Nilai, Malaysia

    Narinderjit Singh Sawaran Singh

  6. Department of Chemical Engineering and Petroleum Refining, Kut University College, Kut, Wasit, 52001, Iraq

    Ahmed Salah Al-Shati

  7. Department of Chemistry and Biochemistry, School of Sciences, JAIN (Deemed to be University), Bangalore, Karnataka, India

    M. M. Rekha

  8. Department of Biochemistry, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, 751003, India

    Subhashree Ray

  9. School of Applied and Life Sciences, Division of Research and Innovation, Uttaranchal University, Dehradun, Uttarakhand, India

    Aashna Sinha

  10. Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, 140401, India

    Gunjan Garg

Authors

  1. Hadil Faris Alotaibi
  2. Waqed H. Hassan
  3. Ahmed Kateb Jumaah Al-Nussairi
  4. Narinderjit Singh Sawaran Singh
  5. Ahmed Salah Al-Shati
  6. M. M. Rekha
  7. Subhashree Ray
  8. Aashna Sinha
  9. Gunjan Garg

Contributions

Hadil Faris Alotaibi: Writing, Methodology, Investigation, Validation, Supervision, Funding acquisition. Waqed H. Hassan: Writing, Modeling, Investigation, Validation, Software. Ahmed Kateb Jumaah Al-Nussairi: Writing, Methodology, Resources, Software. Narinderjit Singh Sawaran Singh: Writing, Resources, Investigation, Software. Ahmed Salah Al-Shati: Writing, Methodology, Formal analysis, Software. Rekha M. M.: Writing, Methodology, Investigation, Validation, Conceptualization. Subhashree Ray: Writing, Conceptualization, Investigation, Validation, Software. Aashna Sinha: Writing, Methodology, Investigation, Validation, Resources. Gunjan Garg: Writing, Resources, Validation, Conceptualization. All authors reviewed the manuscript.

Corresponding author

Correspondence to Hadil Faris Alotaibi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alotaibi, H.F., Hassan, W.H., Al-Nussairi, A.K.J. et al. Computational machine learning estimation of digitoxin solubility in supercritical solvent at different temperatures utilizing ensemble methods. Sci Rep 15, 29248 (2025). https://doi.org/10.1038/s41598-025-15049-x
