RUDAR: WEATHER RADAR DATASET FOR PRECIPITATION NOWCASTING WITH GEOGRAPHICAL AND SEASONAL VARIABILITY

Abstract

Precipitation nowcasting, the short-term (up to six hours) prediction of rain, is arguably one of the most demanding weather forecasting tasks. To achieve accurate predictions, a forecasting model should consider miscellaneous meteorological and geographical data sources. Currently available datasets provide information only about precipitation intensity, vertically integrated liquid (VIL), or maximum reflectivity over the vertical section. Such single-level or aggregated data lack a description of how reflectivity changes along the vertical dimension, which simplifies or distorts the corresponding models. To fill this gap, we introduce an additional dimension of precipitation measurements in the RuDar dataset, which incorporates 3D radar echo observations. Measurements are collected from 30 weather radars located mostly in the European part of Russia, covering multiple climate zones. The radar product is updated every 10 minutes with a 2 km spatial resolution. The measurements include precipitation intensity (mm/h) at an altitude of 600 m, and reflectivity (dBZ) and radial velocity (m/s) at 10 altitude levels from 1 km to 10 km with a 1 km step. We also add orography information, as it affects the intensity and distribution of precipitation. The dataset includes over 50 000 timestamps over a two-year period from 2019 to 2021, totalling roughly 100 GB of data. We evaluate several baselines, including optical flow and neural network models, for precipitation nowcasting on the proposed data. We also evaluate uncertainty quantification for the ensemble scenario and show that the corresponding estimates correlate with the ensemble errors on different sections of the data. We believe that the RuDar dataset will become a reliable benchmark for precipitation nowcasting models and will also be used in other machine learning tasks, e.g., data shift studies, anomaly detection, or uncertainty estimation. Both the dataset and the code for data processing and model preparation are publicly available 1.

1. INTRODUCTION

Precipitation nowcasting is the task of forecasting the rainfall situation (precipitation location and strength) over a short period of time, usually up to six hours. Due to climate change, the frequency and magnitude of extreme weather events, e.g. sudden downpours, are increasing, and techniques for forecasting such events are needed. Precipitation nowcasting can provide information about such events with a high spatiotemporal resolution. This kind of weather forecasting plays an essential role in resource planning in agriculture, aviation, sailing, etc., as well as in daily life. Incorrect precipitation forecasts can have a negative impact on human activity, and data with diverse meteorological and geographical characteristics are needed to improve precipitation nowcasting models. Using different benchmark datasets could improve the quality of precipitation nowcasting models and reduce the risk of forecasting errors.

Previously published benchmarks Holleman (2007); Shi et al. (2017); Ansari et al. (2018); Ramsauer et al. (2018); Franch et al. (2020); Veillette et al. (2020) provide data collected with one or several weather radars. Some of those datasets only contain information about precipitation intensity; others provide the vertically integrated liquid (VIL) value or the maximum reflectivity over the vertical section. However, a single measurement type is often not enough for forecasting extreme weather events. We therefore propose the RuDar dataset, which contains several measurement products: reflectivity (dBZ) and radial velocity (m/s) at ten altitude levels from 1 km to 10 km with a 1 km step, and intensity (mm/h) at an altitude of 600 m. Each measurement was taken with a 2 km spatial resolution and a 10-minute temporal resolution. The data from 30 dual-pol Doppler weather radars were collected and processed at the Radar Center of the Central Aerological Observatory (CAO) of the Russian Federal Service for Hydrometeorology and Environmental Monitoring (ROSGIDROMET) and are used by our team under the terms of a commercial contract. For each radar, we additionally provide information about the surrounding orography Becker et al. (2009).

The radars are located mostly in the European part of Russia, as shown in Figure 1; therefore, a wide range of geographical and climatic conditions is covered. The proposed dataset includes more than 50 000 timestamps over a two-year period from 2019 to 2021, making it possible to investigate the effect of seasonality on rainfall forecasts. We illustrate the applicability of our dataset to the nowcasting task by benchmarking the current state-of-the-art optical flow approach Ayzel et al. (2019) and neural network models Shi et al. (2015); Veillette et al. (2020); Ravuri et al. (2021) on it. Experiments show that algorithm performance depends on the season, owing to different precipitation intensity rates and differences between adjacent timestamps in different months.

The main contributions of the paper are (i) a published weather radar dataset covering different geographical and climatic conditions (provided under the CC BY NC SA 4.0 license), together with the accompanying exploratory data analysis, (ii) evaluations of common simple precipitation nowcasting models and their extension to support additional data, (iii) uncertainty estimation and its connection to the error in the nowcasting ensemble case, and (iv) accompanying source code for data processing and experiments. The structure of the paper is as follows: Section 2 covers previously published datasets for the precipitation nowcasting task, Section 3 describes the proposed dataset, Section 4 introduces the evaluated nowcasting benchmarks, Section 5 explores the uncertainty estimation scenario for an ensemble of models, and Section 6 concludes the paper.

Figure 1: The geographical area covered by the proposed weather radar dataset (light areas). The covered area has a variety of geographic and climatic characteristics. The color indicates the height above sea level.

1 URL is hidden for the blind review.

2. RELATED WORK

Doppler weather radar is the most effective tool for detecting precipitation. The radar measures the reflectivity of radio waves from precipitation drops, which can then be converted into precipitation intensity using the Z-R relation Marshall & Palmer (1948). The standard ways of obtaining a single reflectivity measure from different heights are either taking measurements only from the lowest level (base reflectivity) or aggregating the measurements by their maximum value (composite reflectivity). In addition, a Doppler radar can detect movement towards or away from itself, which allows measuring the speed at which precipitation moves towards or away from the radar. This type of measurement is called radial velocity.
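The reflectivity conventions discussed in Section 2 can be illustrated with a minimal sketch. It assumes a reflectivity volume stored as a NumPy array of shape (altitude, height, width) with the lowest altitude level first, and it uses the classic Marshall-Palmer coefficients a = 200, b = 1.6 in Z = a R^b. The array layout, the function names, and the choice of coefficients are assumptions made for illustration only and are not taken from the RuDar processing code.

```python
import numpy as np

# Classic Marshall-Palmer coefficients for Z = a * R**b,
# with Z in mm^6/m^3 and R in mm/h. Whether the dataset's intensity
# product uses exactly these values is an assumption.
MP_A = 200.0
MP_B = 1.6


def dbz_to_rain_rate(dbz, a=MP_A, b=MP_B):
    """Convert reflectivity in dBZ to rain rate in mm/h via a Z-R relation."""
    # dBZ is a logarithmic unit: dBZ = 10 * log10(Z).
    z_linear = 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)
    # Invert Z = a * R**b to obtain R.
    return (z_linear / a) ** (1.0 / b)


def base_reflectivity(volume):
    """Take the lowest altitude level (assumed to be index 0 on axis 0)."""
    return volume[0]


def composite_reflectivity(volume):
    """Aggregate over altitude by taking the maximum in each column."""
    # Missing values are not handled here; a real pipeline would need to.
    return np.max(volume, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical volume: 10 altitude levels on a small 4 x 5 grid, in dBZ.
    volume = rng.uniform(0.0, 50.0, size=(10, 4, 5))

    base = base_reflectivity(volume)
    composite = composite_reflectivity(volume)

    # The composite is never smaller than the base level by construction.
    print("base shape:", base.shape, "composite shape:", composite.shape)
    print("rain rate from composite (mm/h), first row:")
    print(dbz_to_rain_rate(composite)[0])
```

Both aggregations collapse the altitude axis to a single 2D field, which is the information that a dataset with per-level reflectivity keeps available to the model.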

