Improving AI coaching with Gemini using real-world Fitbit data

2025-09-17 12:57:31

Authors: Justin Khasentino, Anastasiya Belyaeva, Cory Y. McLean

We created and curated three benchmark datasets to assess large language model (LLM) performance on sleep and fitness tasks ranging from answering expert questions to real-world coaching scenarios. Fine-tuning the Gemini LLM on real-world coaching tasks and self-reported sleep-quality outcomes improved its performance and provided a benchmark for further development.

Additional information

This is a summary of: Khasentino, J. et al. A personal health large language model for sleep and fitness coaching. Nat. Med. https://doi.org/10.1038/s41591-025-03888-0 (2025).

About this article

Cite this article

Improving AI coaching with Gemini using real-world Fitbit data. Nat Med (2025). https://doi.org/10.1038/s41591-025-03988-x

Published: 17 September 2025
DOI: https://doi.org/10.1038/s41591-025-03988-x

Summary

Three benchmark datasets were created and curated to evaluate large language models (LLMs) on sleep and fitness tasks, ranging from expert question answering to real-world coaching scenarios. Fine-tuning the Gemini LLM on real-world coaching tasks and self-reported sleep-quality outcomes improved its performance and provided a benchmark for further development.