Gas Storage Data: Introducing the {giedata} Package for R

With gas storage data gaining increased attention due to the consequences of Russia’s war against Ukraine, I have written a package to access the API of Gas Infrastructure Europe (GIE).

Yannik yannikbuhl.de
2022-07-30

It was not until Russia’s brutal attack on Ukraine in February 2022 that daily data on gas storage facilities in Europe became a topic of interest to a broader public. In response to the sanctions imposed by the European Union, Russia has reduced the flow of natural gas to EU countries (thereby exploiting their huge dependency on Russian gas, probably for political reasons). In many countries, it is now uncertain whether the gas storage facilities will hold enough gas to last through the winter.

This is why politicians, journalists, managers and statisticians now need high-frequency, reliable data on gas storage facilities. In Europe, this data is provided by Gas Infrastructure Europe (GIE). For their AGSI+/ALSI+ transparency platform they offer an API that provides all available data on gas storage, the most important being the filling level as well as inflow and outflow.

Shortly after the start of the war, when people began searching for gas storage data, GIE fundamentally changed the architecture of their API, introducing pagination and updating most endpoints and parameters. Since there was no pre-existing R package that could work with the new architecture, I decided to write one myself. As a result, {giedata} was published on CRAN in July 2022. Here, I want to give a short overview of what it offers:

The first step in a possible workflow would be to fetch the metadata of the available gas storage companies and their facilities in a given region or country. In this example, I will fetch all gas storage companies in Germany including their facilities, using the first of three main functions of the package, get_gielisting():

storage <- giedata::get_gielisting(region = "Europe",
                                   country = "Germany",
                                   facilities = TRUE)

head(storage, n = 5)
# A tibble: 5 × 9
  facility_eic     facility_name     facility_type company_eic country
  <chr>            <chr>             <chr>         <chr>       <chr>  
1 21Z000000000271O UGS Rehden        Storage Faci… 21X0000000… Germany
2 21W0000000001148 UGS Jemgum H (as… Storage Faci… 21X0000000… Germany
3 21W0000000001261 VSP NORD (Rehden… Storage Group 21X0000000… Germany
4 21W0000000000184 UGS Wolfersberg   Storage Faci… 37X0000000… Germany
5 21W0000000001083 UGS Berlin        Storage Faci… 37X0000000… Germany
# … with 4 more variables: country_code <chr>,
#   company_shortname <chr>, company_name <chr>, company_url <chr>

Based on this information, we can then use the function get_giedata() to download the data from AGSI+/ALSI+ (ALSI+ for liquefied natural gas is not yet supported for the new API architecture). From the data set above we see that the unique ID (EIC) of the company “astora” is 21X000000001160J, and the same data set also gives us the EIC of the storage unit in Rehden (21Z000000000271O). We can now use both:

data <- giedata::get_giedata(country = "DE",
                             company = "21X000000001160J",
                             facility = "21Z000000000271O", 
                             from = "2022-01-01",
                             to = "2022-01-05")

head(data)
# A tibble: 5 × 13
  name       code  url   gasDayStart gasInStorage injection withdrawal
  <chr>      <chr> <chr> <date>             <dbl>     <dbl>      <dbl>
1 UGS Rehden 21Z0… 21Z0… 2022-01-01          2.86     229.         0  
2 UGS Rehden 21Z0… 21Z0… 2022-01-02          3.00     138.         0  
3 UGS Rehden 21Z0… 21Z0… 2022-01-03          3.08      79.7        0  
4 UGS Rehden 21Z0… 21Z0… 2022-01-04          3.16      86.1        0  
5 UGS Rehden 21Z0… 21Z0… 2022-01-05          3.15       0         12.6
# … with 6 more variables: workingGasVolume <dbl>,
#   injectionCapacity <dbl>, withdrawalCapacity <dbl>, status <chr>,
#   trend <dbl>, full <dbl>

The last of the three main functions, get_giedata2(), is a generalised version of get_giedata() that lets you download data for multiple countries, companies or facilities at once, so you do not have to loop over get_giedata() yourself:

data2 <- giedata::get_giedata2(countries = c("DE", "NL", "AT"),
                               date = "2022-07-01")

head(data2)
# A tibble: 3 × 15
  name        code  url   gasDayStart gasInStorage consumption
  <chr>       <chr> <chr> <date>             <dbl>       <dbl>
1 Germany     DE    DE    2022-07-01         149.        995. 
2 Netherlands NL    NL    2022-07-01          75.6       420. 
3 Austria     AT    AT    2022-07-01          43.2        98.1
# … with 9 more variables: consumptionFull <dbl>, injection <dbl>,
#   withdrawal <dbl>, workingGasVolume <dbl>,
#   injectionCapacity <dbl>, withdrawalCapacity <dbl>, status <chr>,
#   trend <dbl>, full <dbl>
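
For comparison, the manual loop that get_giedata2() saves you might look roughly like the sketch below. The helper fetch_countries() is hypothetical and not part of {giedata}; it takes the fetching function as an argument and simply stacks the results. It also assumes that get_giedata() can be called with a country alone plus from and to, which the example above does not show, and that every call returns the same columns.

```r
# Hypothetical helper, not part of {giedata}: call `fetch` once per country
# and stack the resulting data frames into a single one.
# Assumes every call returns a data frame with identical columns.
fetch_countries <- function(countries, fetch, ...) {
  results <- lapply(countries, function(cc) fetch(country = cc, ...))
  do.call(rbind, results)
}

# Possible use with the package (needs network access, not run here;
# calling get_giedata() with a country only is an assumption):
# fetch_countries(c("DE", "NL", "AT"), giedata::get_giedata,
#                 from = "2022-07-01", to = "2022-07-01")
```

Passing the fetching function in as an argument keeps the loop itself independent of the API, so it can be tried out with a stand-in function first.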

Note that, due to the design of the API, the functionality of get_giedata2() is, for now, slightly complicated: You can specify multiple countries and get the data on the country level. Once you want to get data on companies, you can only specify one country, and all EIC codes provided must belong to that country. The same holds for facilities: You can get data for multiple facilities at once, but the country and company EIC must each be of length one.

Case 1: Country + companies

data3 <- giedata::get_giedata2(countries = "DE",
                               companies = c("21X000000001160J", 
                                             "37X0000000000151"),
                               date = "2022-01-01")

head(data3)
# A tibble: 2 × 13
  name       code  url   gasDayStart gasInStorage injection withdrawal
  <chr>      <chr> <chr> <date>             <dbl>     <dbl>      <dbl>
1 astora (G… 21X0… 21X0… 2022-01-01          8.28      241.        0.5
2 bayernugs  37X0… 37X0… 2022-01-01          1.82        0        10.5
# … with 6 more variables: workingGasVolume <dbl>,
#   injectionCapacity <dbl>, withdrawalCapacity <dbl>, status <chr>,
#   trend <dbl>, full <dbl>

Case 2: Company + facilities

data4 <- giedata::get_giedata2(countries = "DE",
                               companies = "21X000000001160J",
                               facilities = c("21Z000000000271O", 
                                              "21W0000000001148"),
                               date = "2022-01-01")

head(data4)
# A tibble: 2 × 13
  name       code  url   gasDayStart gasInStorage injection withdrawal
  <chr>      <chr> <chr> <date>             <dbl>     <dbl>      <dbl>
1 UGS Rehden 21Z0… 21Z0… 2022-01-01          2.86     229.         0  
2 UGS Jemgu… 21W0… 21W0… 2022-01-01          5.42      11.5        0.5
# … with 6 more variables: workingGasVolume <dbl>,
#   injectionCapacity <dbl>, withdrawalCapacity <dbl>, status <chr>,
#   trend <dbl>, full <dbl>
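
The rules for combining arguments described above can be summarised in a small check. This is only a sketch: check_giedata2_args() is a hypothetical helper, not a function of {giedata}, and it only looks at the lengths of the arguments. It cannot tell whether an EIC actually belongs to the given country or company.

```r
# Hypothetical helper, not part of {giedata}: checks an argument combination
# against the rules described above and returns the level of the request.
check_giedata2_args <- function(countries, companies = NULL, facilities = NULL) {
  if (!is.null(facilities)) {
    # Facility level: exactly one country and exactly one company EIC
    if (length(countries) != 1 || length(companies) != 1) {
      stop("Facilities require exactly one country and one company EIC.")
    }
    return("facility")
  }
  if (!is.null(companies)) {
    # Company level: exactly one country, any number of company EICs
    if (length(countries) != 1) {
      stop("Companies require exactly one country.")
    }
    return("company")
  }
  # Country level: any number of countries is fine
  "country"
}

check_giedata2_args(c("DE", "NL", "AT"))
check_giedata2_args("DE", companies = c("21X000000001160J", "37X0000000000151"))
```

The two calls at the end return "country" and "company"; combining several countries with company EICs stops with an error before any request would be sent.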

Lastly, I hope this package is of use to you. If you have any questions or suggestions, do not hesitate to hop over to GitHub and create an issue.