MIRA: Medical Time Series Foundation Model for Real-World Health Data
Source
- Raw Markdown: paper_mira-2025.md
- PDF: paper_mira-2025.pdf
- Preprint: arXiv 2506.07584
- Official code: microsoft/MIRA
- Official Microsoft Research publication page: https://www.microsoft.com/en-us/research/publication/mira-medical-time-series-foundation-model-for-real-world-health-data/
- Local repository snapshot retained in papers/mira-2025/source_github_readme.md
- Local repository metadata retained in papers/mira-2025/source_repo_metadata.json
Status And Credibility
MIRA was first posted to arXiv on 2025-06-09, and the inspected arXiv metadata reports version 7, updated 2025-12-16. The arXiv comment states "NeurIPS 2025 Main Conference", and the official Microsoft Research publication page lists the paper as a NeurIPS 2025 publication dated June 2025. The official repository sits under the Microsoft GitHub organization and is MIT-licensed; the inspected default-branch commit is 4648346290311227208162e0813c1a38624d8a73, committed on 2026-01-30.
Authenticated X API searches for the exact title and arXiv id on 2026-06-15 did not return a verified official X announcement, so no X thread is cited here.
Core Claim
MIRA is a medical time-series foundation model for irregular clinical data. It combines continuous-time positional encoding, frequency-specific sparse expert routing, and Neural ODE extrapolation so a pretrained model can forecast heterogeneous medical observations at arbitrary target timestamps.
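As a rough illustration of the extrapolation idea, the sketch below evolves a latent state from the last observed time to an arbitrary target timestamp with fixed-step Euler integration. The `toy_decay` dynamics, the step count, and every name here are assumptions for illustration only; a Neural ODE would use a learned network as the dynamics function, and the solver used in the paper may differ.

```python
from typing import Callable, List

Vector = List[float]


def euler_extrapolate(
    h0: Vector,
    t0: float,
    t_target: float,
    dynamics: Callable[[Vector, float], Vector],
    n_steps: int = 100,
) -> Vector:
    """Integrate dh/dt = dynamics(h, t) from t0 to t_target with fixed-step Euler."""
    h = list(h0)
    dt = (t_target - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        dh = dynamics(h, t)
        h = [hi + dt * di for hi, di in zip(h, dh)]
        t += dt
    return h


def toy_decay(h: Vector, t: float) -> Vector:
    # Hypothetical stand-in for a learned network: each latent dimension decays toward 0.
    return [-hi for hi in h]


# Last observation at t = 3.0; the forecast is requested at the real-valued time t = 4.25.
h_last = [1.0, -2.0]
h_at_target = euler_extrapolate(h_last, t0=3.0, t_target=4.25, dynamics=toy_decay, n_steps=1000)
```

Because the target time is a plain float, nothing in this loop depends on a fixed sampling grid, which is the property that matters for irregular clinical timestamps.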
Key Contributions
- Introduces Continuous-Time RoPE for real-valued timestamps rather than fixed token positions.
- Adds a frequency-specific mixture-of-experts block so different experts can specialize to temporal regimes.
- Uses a Continuous Dynamics Extrapolation Block based on Neural ODEs to evolve latent state to target timestamps.
- Pretrains on a large medical corpus reported as more than 454 billion time points from public and ethics-approved clinical sources.
- Evaluates on out-of-distribution and in-distribution clinical forecasting tasks and reports average forecasting-error reductions relative to zero-shot and fine-tuned baselines.
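The first contribution above can be pictured with a minimal sketch of rotary encoding driven by real-valued timestamps instead of integer positions. This follows the standard RoPE construction with the position index replaced by a float time `t`; the frequency schedule, `base` value, and function names are assumptions, not the paper's exact formulation.

```python
import math
from typing import List


def rotary_frequencies(dim: int, base: float = 10000.0) -> List[float]:
    """One angular frequency per 2-D pair, as in standard RoPE."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def rotate_at_time(x: List[float], t: float, freqs: List[float]) -> List[float]:
    """Rotate each consecutive pair (x[2i], x[2i+1]) by the angle t * freqs[i]."""
    out: List[float] = []
    for i, w in enumerate(freqs):
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(t * w), math.sin(t * w)
        out.extend([a * c - b * s, a * s + b * c])
    return out


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25]   # toy query vector
k = [1.5, 0.3, -0.7, 1.0]    # toy key vector
freqs = rotary_frequencies(len(q))

# Two query/key pairs with the same time gap (1.55) but different absolute times.
score_a = dot(rotate_at_time(q, 0.40, freqs), rotate_at_time(k, 1.95, freqs))
score_b = dot(rotate_at_time(q, 10.40, freqs), rotate_at_time(k, 11.95, freqs))
```

The two scores agree up to floating-point error, since the attention score after rotation depends only on the time difference between query and key. That relative-time property is what makes the construction usable with irregular gaps.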
Evidence And Results
The paper evaluates MIRA both on irregularly sampled clinical data and on originally regular datasets with simulated missingness. Downstream datasets include CinC 2012, MIT-BIH, Johns Hopkins COVID-19, CDC Influenza Hospitalizations Admissions, heart-rate, and illness data, while in-distribution tests include pretraining-source families such as MIMIC and PTB-XL.
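One simple way to turn a regular series into an irregular one is to drop points independently at random and keep the surviving timestamps. The sketch below does exactly that; the drop probability, sampling period, and function name are hypothetical, and this page does not record the paper's actual masking protocol, so treat it as one possible reading of "simulated missingness".

```python
import random
from typing import List, Tuple


def simulate_missingness(
    values: List[float],
    sample_period: float,
    drop_prob: float,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Drop each point independently with probability drop_prob.

    Returns the surviving (timestamp, value) pairs, where the timestamp of
    the i-th original sample is i * sample_period.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    kept: List[Tuple[float, float]] = []
    for i, v in enumerate(values):
        if rng.random() >= drop_prob:
            kept.append((i * sample_period, v))
    return kept


regular = [float(i) for i in range(20)]  # toy regular series
irregular = simulate_missingness(regular, sample_period=0.5, drop_prob=0.3, seed=42)
```

Independent random dropping is the easiest mechanism to reason about; clinical missingness is often informative (for example, sicker patients are measured more often), which this sketch does not capture.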
The paper reports that MIRA is strongest on out-of-distribution forecasting and that domain-specific medical pretraining beats larger general-domain time-series foundation models in this setting. Ablations identify the Continuous Dynamics Extrapolation Block as the largest single component contributor, with CT-RoPE and MoE also improving performance.
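For readers unfamiliar with the MoE component named in the ablation, the sketch below shows generic top-k sparse routing: a router scores all experts, only the k best are evaluated, and their outputs are mixed with renormalized weights. This is the common textbook form, not the paper's frequency-specific router; the logits, the toy experts, and the choice of k are all assumptions.

```python
import math
from typing import Callable, Dict, List


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(logits: List[float], k: int = 2) -> Dict[int, float]:
    """Return {expert_index: weight} for the k highest-scoring experts."""
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(
    x: float,
    router_logits: List[float],
    experts: List[Callable[[float], float]],
    k: int = 2,
) -> float:
    """Evaluate only the selected experts and mix their outputs."""
    weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Hypothetical scalar experts and router scores for a single token.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: 0.0]
weights = top_k_route([2.0, 0.1, 1.0, -1.0], k=2)
out = moe_forward(3.0, [2.0, 0.1, 1.0, -1.0], experts, k=2)
```

In this toy case only experts 0 and 2 run, so compute scales with k rather than with the total number of experts.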
Limitations
- MIRA is a passive forecasting model; it does not model treatments, interventions, medications, procedures, or clinician actions as action-conditioned next-state dynamics.
- The paper is healthcare-specific. Its pretraining corpus is a strength for clinical forecasting but not evidence that the architecture transfers unchanged to observability, finance, energy, or robotics.
- The benchmark focuses on RMSE and MAE forecasting, not calibrated decision utility, counterfactual prediction, or treatment-effect modeling.
- The medical data sources are public or ethics-approved, but this page should still treat privacy, cohort shift, missingness mechanisms, and institutional deployment as unresolved risks.
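For reference, the two point-forecast metrics named in the limitations above can be computed as in the following sketch, which skips targets that are missing. Using `None` to mark a missing target and the function name `rmse_mae` are assumptions made here; the paper's evaluation code may handle masking differently.

```python
import math
from typing import List, Optional, Tuple


def rmse_mae(
    y_true: List[Optional[float]],
    y_pred: List[float],
) -> Tuple[float, float]:
    """RMSE and MAE over positions where the target is observed (not None)."""
    errors = [p - t for t, p in zip(y_true, y_pred) if t is not None]
    if not errors:
        raise ValueError("no observed targets to score")
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


# The second target is missing, so its prediction is ignored.
rmse, mae = rmse_mae([1.0, None, 3.0, 5.0], [2.0, 9.0, 3.0, 2.0])
```

Both numbers summarize average point error only. They say nothing about whether predictive uncertainty is calibrated or whether a forecast would change a clinical decision, which is the gap the limitation describes.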
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Irregular time and continuous state | partially closes | CT-RoPE encodes irregular timestamps and Neural ODE extrapolation predicts at arbitrary target times. | Needs streaming state-update tests and non-healthcare transfer. |
| Heterogeneous medical corpus and frequency-aware routing | partially closes | MIRA trains over ICU, waveform, EHR, sleep, and public-health time series with frequency-specific expert routing. | Does not natively model cross-channel dynamics; the paper uses channel-independent univariate forecasting. |
| Scaling and domain-specific pretraining | partially closes | The paper reports 454B medical time points and better zero-shot clinical forecasting than general TSFMs. | Needs matched leakage audits and broader benchmark harnesses. |
| Control and counterfactuals | insufficient evidence | No treatment, intervention, or clinician-action channel is evaluated as a controllable input. | Needs action-conditioned medical trajectories and confounding-aware evaluation. |
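The channel-independent setup noted in the table can be pictured as splitting a multivariate irregular record into one univariate series per channel, each with its own timestamps. The record layout and the channel names (`hr`, `spo2`) below are hypothetical and serve only to show why cross-channel dynamics are not modeled in this arrangement.

```python
from typing import Dict, List, Optional, Tuple

Record = List[Tuple[float, Dict[str, Optional[float]]]]


def split_channels(record: Record) -> Dict[str, List[Tuple[float, float]]]:
    """Split a multivariate record into per-channel (timestamp, value) series.

    Missing values (None) are dropped, so each channel keeps only the
    timestamps at which it was actually observed.
    """
    series: Dict[str, List[Tuple[float, float]]] = {}
    for t, observations in record:
        for channel, value in observations.items():
            if value is not None:
                series.setdefault(channel, []).append((t, value))
    return series


record: Record = [
    (0.0, {"hr": 80.0, "spo2": None}),
    (0.7, {"hr": 82.0, "spo2": 97.0}),
    (2.1, {"hr": None, "spo2": 96.0}),
]
per_channel = split_channels(record)
```

Each resulting series would be forecast on its own, so any coupling between channels has to be captured implicitly, if at all.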
Links Into The Wiki
- MIRA
- Time-Series Foundation Models
- Latent-State Time-Series Modeling
- Streaming Latent-State Updates
- Mixture Of Experts
- Time-Series Benchmark Hygiene
- Foundation Time-Series Model Research Agenda
Open Questions
- Can MIRA’s continuous-time latent extrapolation be reused for non-medical irregular time series?
- How would MIRA change if treatments, medication doses, procedures, or care-team decisions were explicit action or intervention channels?
- Does frequency-specific expert routing preserve rare clinical regimes, or mostly separate sampling-rate regimes?