Authors: Justin Khasentino, Anastasiya Belyaeva, Cory Y. McLean
We created and curated three benchmark datasets to assess large language model (LLM) performance on sleep and fitness tasks ranging from answering expert questions to real-world coaching scenarios. Fine-tuning the Gemini LLM on real-world coaching tasks and self-reported sleep-quality outcomes improved its performance and provided a benchmark for further development.
Zheng, N. S. et al. Sleep patterns and risk of chronic disease as measured by long-term monitoring with commercial wearable devices in the All of Us Research Program. Nat. Med. 30, 2648–2656 (2024). A research article showing that sleep patterns captured by wearable devices are associated with chronic disease incidence.
Gemini Team, Google. Gemini: A family of highly capable multimodal models. Preprint at https://doi.org/10.48550/arXiv.2312.11805 (2023). A technical report introducing the Gemini 1.0 model family.
McDuff, D. et al. The Google Health digital well-being study: protocol for a digital device use and well-being study. JMIR Res. Protoc. 13, e49189 (2024). A research protocol detailing the patient-reported outcomes dataset comprising self-reported sleep disturbance outcomes paired with daily-resolution numerical sensor data.
Cosentino, J. et al. Inference of chronic obstructive pulmonary disease with deep learning on raw spirograms identifies new genetic loci and improves risk models. Nat. Genet. 55, 787–795 (2023). A research article demonstrating that predicting health outcomes from raw high-dimensional data outperforms phenotyping that relies on a smaller set of hand-crafted features.
Belyaeva, A. et al. Multimodal LLMs for health grounded in individual-specific data. ML4MHD 14315, 86–102 (2024). A research article demonstrating that multimodal LLMs can estimate personal disease risk from high-dimensional clinical modalities.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This is a summary of: Khasentino, J. et al. A personal health large language model for sleep and fitness coaching. Nat. Med. https://doi.org/10.1038/s41591-025-03888-0 (2025).
Improving AI coaching with Gemini using real-world Fitbit data. Nat. Med. (2025). https://doi.org/10.1038/s41591-025-03988-x