Marchuk: Efficient Global Weather Forecasting from Mid-Range to Sub-Seasonal Scales via Flow Matching

Arsen Kuzhamuratov^1,2,*, Mikhail Zhirnov^1,2, Andrey Kuznetsov^1,2, Ivan Oseledets^1, Konstantin Sobolev^1,2

AXXX^1, FusionBrain Lab^2
^*kuzhamuratov@fusionbrainlab.com


We present Marchuk, a generative latent flow-matching model for global weather forecasting spanning mid-range to subseasonal timescales, with prediction horizons of up to 30 days.

6h forecasts of q850 (specific humidity at 850 hPa), z850 (geopotential at 850 hPa), 10u (10-meter u-component of wind), and 2t (2-meter temperature), initialized at 2021-06-20T12.

Method Overview

General Scheme of Marchuk.

Marchuk operates in the latent space learned by the DC-AE model introduced in LaDCast.
Within this latent space, we train a flow-matching Diffusion Transformer (DiT) to model the conditional distribution of future weather fields given the current state.
Marchuk is conditioned on weather maps from the previous K days and receives noised weather maps for the subsequent N days as input.
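
The training and sampling loop implied by the description above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear noise-to-data path (the page does not state which path is used), flattens latents to plain lists, and uses hypothetical names (`make_training_pair`, `fm_loss`, `euler_sample`, `velocity_fn`, `context`), where `context` stands in for the encoded maps of the previous K days.

```python
import random

def make_training_pair(x1, rng):
    """Build one flow-matching training example from a clean latent x1.

    Assumption: linear path x_t = (1 - t) * x0 + t * x1, whose target
    velocity is x1 - x0. The source does not specify the path.
    """
    t = rng.random()                          # time drawn uniformly from [0, 1)
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]    # Gaussian noise sample
    x_t = [(1.0 - t) * n + t * d for n, d in zip(x0, x1)]   # noised latent
    v_target = [d - n for n, d in zip(x0, x1)]              # regression target
    return t, x_t, v_target

def fm_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(v_target)

def euler_sample(velocity_fn, x0, context, steps=10):
    """Integrate dx/dt = velocity_fn(x, t, context) from t=0 (noise) to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t, context)        # hypothetical model call
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In this sketch the network would be trained to minimize `fm_loss(velocity_fn(x_t, t, context), v_target)`, and an ensemble would be obtained by calling `euler_sample` with different noise draws `x0` for the same `context`.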

Architecture of Marchuk DiT.

Key components of Marchuk DiT:

Cross-DiT architecture;
Timestamp-embedding conditioning within the cross-attention blocks;
LoRA-based timestep modulation;
A new positional-embedding strategy: learnable 2-D spatial positional embeddings combined with 1-D rotary positional embeddings (RoPE) applied along the temporal dimension.
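
To make the temporal part of the positional-embedding strategy concrete, here is a minimal sketch of 1-D RoPE applied to a single query or key vector. It covers only the rotary component; the learnable 2-D spatial embeddings are not shown. The function names and the frequency base of 10000 (the common default) are assumptions, not details taken from the Marchuk code.

```python
import math

def rope_1d(vec, position, base=10000.0):
    """Rotate consecutive channel pairs of `vec` by position-dependent angles.

    `position` is the temporal index of the token (for example, the forecast
    step). `base` is the usual RoPE frequency base, assumed here.
    """
    dim = len(vec)
    assert dim % 2 == 0, "RoPE needs an even number of channels"
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)   # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[i], vec[i + 1]
        out.extend([a * c - b * s, a * s + b * c])  # 2-D rotation of the pair
    return out

def dot(u, v):
    """Plain dot product, as used for attention scores."""
    return sum(x * y for x, y in zip(u, v))
```

Because each channel pair is rotated by an angle proportional to its time index, the attention score `dot(rope_1d(q, m), rope_1d(k, n))` depends only on the offset between `m` and `n`, which is what makes the encoding relative along the temporal axis.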

WeatherBench-2 Metrics

The 276M-parameter Marchuk model consistently outperforms the LaDCast 375M baseline across the evaluated metrics and attains performance comparable to the much larger LaDCast 1.6B model. It also runs approximately 6x faster than LaDCast 1.6B while maintaining similar quantitative accuracy.

Inference speed. Predictions are computed entirely in the latent space for a 30-day forecast horizon with an ensemble size of 50 on an H100 GPU.

RMSE comparison. We evaluate LaDCast and Marchuk on the WeatherBench-2 benchmark over a 30-day prediction horizon.
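
For readers unfamiliar with the metric, the sketch below shows a latitude-weighted RMSE of the kind commonly used in WeatherBench-style evaluation. It is an illustration under stated assumptions: the weights use the cos(latitude) approximation of grid-cell area, normalized to mean 1, and the function names are hypothetical. The exact weighting used in the reported numbers is not stated here.

```python
import math

def lat_weights(lats_deg):
    """cos(latitude) weights normalized to mean 1 (assumed area approximation)."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_w = sum(w) / len(w)
    return [x / mean_w for x in w]

def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE for fields stored as nested lists [lat][lon]."""
    weights = lat_weights(lats_deg)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2   # squared error scaled by cell weight
            count += 1
    return math.sqrt(total / count)
```

The weighting keeps the densely sampled polar rows of a regular latitude-longitude grid from dominating the score.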

CRPS ensemble metrics comparison. The figure shows the evolution of CRPS over a 30-day forecast horizon.
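
As a reminder of what CRPS measures for an ensemble, the following sketch computes it at a single grid point from the standard sample estimator, CRPS = E|X - y| - 0.5 E|X - X'|. The function name and the `fair` switch (which selects the unbiased normalization) are illustrative assumptions; which estimator underlies the reported numbers is not stated here.

```python
def crps_ensemble(members, obs, fair=False):
    """Sample CRPS of a scalar ensemble forecast against one observation.

    members: list of ensemble values at one grid point and lead time.
    fair:    if True, normalize the spread term by M * (M - 1) instead of
             M * M (requires at least two members).
    """
    m = len(members)
    skill = sum(abs(x - obs) for x in members) / m          # E|X - y|
    pair_sum = sum(abs(a - b) for a in members for b in members)
    denom = 2 * m * (m - 1) if fair else 2 * m * m
    return skill - pair_sum / denom                          # minus 0.5 E|X - X'|

# Example: a two-member ensemble that brackets the observation.
print(crps_ensemble([0.0, 2.0], 1.0))             # 0.5
print(crps_ensemble([0.0, 2.0], 1.0, fair=True))  # 0.0
```

Lower values are better; for an ensemble whose members all coincide, the score reduces to the absolute error of that shared value.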

RMSE metrics at the 30-day forecast horizon. The evaluated variables include atmospheric fields – UW500 (u-component of wind at 500 hPa), T500 (temperature at 500 hPa), G500 (geopotential at 500 hPa), and SH500 (specific humidity at 500 hPa) – as well as surface fields – SLP (sea level pressure), 10m-UW and 10m-VW (u- and v-components of wind at 10 meters), and T2M (temperature at 2 meters).

CRPS metrics at the 30-day forecast horizon. The evaluated variables include atmospheric fields – UW500 (u-component of wind at 500 hPa), T500 (temperature at 500 hPa), G500 (geopotential at 500 hPa), and SH500 (specific humidity at 500 hPa) – as well as surface fields – SLP (sea level pressure), 10m-UW (u-component of wind at 10 meters), T2M (temperature at 2 meters), and TP-6h (total accumulated precipitation over the last 6 hours).

BibTeX

@misc{kuzhamuratov2026marchukefficientglobalweather,
      title={Marchuk: Efficient Global Weather Forecasting from Mid-Range to Sub-Seasonal Scales via Flow Matching}, 
      author={Arsen Kuzhamuratov and Mikhail Zhirnov and Andrey Kuznetsov and Ivan Oseledets and Konstantin Sobolev},
      year={2026},
      eprint={2603.24428},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2603.24428}, 
}