

Author: Diego Valle-Jones
License: MIT + file LICENSE
Website: https://hoyodesmog.diegovalle.net/rsinaica/
Help: https://github.com/diegovalle/rsinaica/discussions

Easy-to-use functions for downloading air quality data from the Mexican National Air Quality Information System (SINAICA). This R package allows you to download pollution and meteorological data from more than one hundred monitoring stations across Mexico. You can query crude real-time data, validated data, or manually collected data.

Installation

To install the most recent package version from CRAN, run:

install.packages("rsinaica")

You can always install the development version from GitHub:

if (!require(devtools)) {
    install.packages("devtools")
}
devtools::install_github("diegovalle/rsinaica")
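
Note that require() attaches a package as a side effect. If you only want to know whether a package is installed, base R's requireNamespace() checks without attaching it. The sketch below shows that approach; missing_pkgs() is a hypothetical helper written for this example and is not part of rsinaica.

```r
# Hypothetical helper (not part of rsinaica): return the packages in `pkgs`
# that cannot be loaded, without attaching any of them to the search path.
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Install only what is missing, for example:
# to_install <- missing_pkgs(c("devtools", "ggplot2"))
# if (length(to_install) > 0) install.packages(to_install)
```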

Example

Suppose you want to download pollution data from the Centro station in Guadalajara. First, load the required packages and look up the station’s numeric code in the stations_sinaica data.frame:

## Auto-install required R packages
packs <- c("ggplot2", "maps", "mapproj", "rsinaica")
success <- suppressWarnings(sapply(packs, require, character.only = TRUE))
if (length(names(success)[!success])) {
  install.packages(names(success)[!success])
  sapply(names(success)[!success], require, character.only = TRUE)
}

knitr::kable(stations_sinaica[
  which(stations_sinaica$station_name == "Centro"),  1:6
  ])
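|    | station_id | station_name | station_code | network_id | network_name                    | network_code |
|:---|-----------:|:-------------|:-------------|-----------:|:--------------------------------|:-------------|
| 28 |         33 | Centro       | CEN          |         30 | Aguascalientes                  | AGS          |
| 29 |         54 | Centro       | CEN          |         38 | Chihuahua                       | CHIH1        |
| 30 |        102 | Centro       | CEN          |         63 | Guadalajara                     | GDL          |
| 31 |        170 | Centro       | CEN          |         78 | Zona Metropolitana de Querétaro | ZMQ          |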
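
Because the station name alone is ambiguous, it can help to filter on the network as well. The following is a minimal sketch that uses a toy data.frame copied from the table above instead of the full stations_sinaica data, so it runs without the package installed. The same condition should apply to stations_sinaica, assuming the columns are named as shown.

```r
# Toy copy of a few columns from the table above
centro <- data.frame(
  station_id   = c(33, 54, 102, 170),
  station_name = "Centro",
  network_code = c("AGS", "CHIH1", "GDL", "ZMQ"),
  stringsAsFactors = FALSE
)

# Combine the station name with the network code to get a single match
gdl_id <- centro$station_id[
  centro$station_name == "Centro" & centro$network_code == "GDL"
]
gdl_id
```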

There are four stations named Centro. The one we want is located in Guadalajara and has a numeric code (station_id) of 102. The stations_sinaica data.frame also includes the latitude and longitude of all monitoring stations in Mexico (including some that have never reported data).

mx <- map_data("world", "Mexico")
stations_sinaica$color <- "Others"
stations_sinaica$color[stations_sinaica$station_id == 102] <- "Centro (102)"
## Some stations appear to have been assigned positive longitudes by mistake
## (longitudes west of the prime meridian should be negative), so keep only
## the stations with a negative longitude
stations_sinaica <- subset(stations_sinaica, lon < 0)
ggplot(stations_sinaica[order(stations_sinaica$color, decreasing = TRUE),], 
       aes(lon, lat)) + 
  geom_polygon(data = mx, aes(x= long, y = lat, group = group)) +
  geom_point(alpha = .9, size = 3, aes(fill = color), shape = 21) + 
  scale_fill_discrete("station") +
  ggtitle("Air quality measuring stations in Mexico") +
  coord_map() + 
  theme_void()
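
The subset() call in the code above keeps only rows where lon < 0 is TRUE, so it drops stations with a positive longitude and also any station whose longitude is missing. The sketch below illustrates this on a toy data.frame. The coordinates are made up for illustration and are not taken from stations_sinaica.

```r
# Toy stand-in for stations_sinaica; the coordinates are made up
stations <- data.frame(
  station_id = c(1, 2, 3, 4),
  lat = c(20.7, 19.4, 25.7, NA),
  lon = c(-103.3, 99.1, -100.3, NA)
)

# Rows whose longitude has the wrong sign for Mexico
wrong_sign <- stations[!is.na(stations$lon) & stations$lon >= 0, ]

# Rows with no longitude at all
no_coords <- stations[is.na(stations$lon), ]

# subset() treats NA in the condition as FALSE, so both groups are dropped
kept <- subset(stations, lon < 0)
```

Inspecting wrong_sign and no_coords before filtering shows how many stations the map will leave out.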

Next, query the start and end dates for which SINAICA has received data from this station:

sinaica_station_dates(102)
#> [1] "1997-01-01" "2026-03-06"

The station is currently reporting data (this document was built on 2026-03-06) and has been active since 1997. You can also query which parameters (pollution, wind, solar radiation, etc.) the station has sensors for. The package also includes a parameters data.frame containing the complete set of supported parameters; however, not all stations report all of them.

cen_params <- sinaica_station_params(102)
knitr::kable(cen_params)
| param_code | param_name                       |
|:-----------|:---------------------------------|
| CN         | Carbono negro                    |
| SO2        | Dióxido de azufre                |
| NO2        | Dióxido de nitrógeno             |
| DV         | Dirección del viento             |
| HR         | Humedad relativa                 |
| IUV        | Índice de radiación ultravioleta |
| CO         | Monóxido de carbono              |
| NO         | Óxido nítrico                    |
| NOx        | Óxidos de nitrógeno              |
| O3         | Ozono                            |
| PM10       | Partículas menores a 10 micras   |
| PM2.5      | Partículas menores a 2.5 micras  |
| PP         | Precipitación pluvial            |
| PB         | Presión Barométrica              |
| RS         | Radiación solar                  |
| TMP        | Temperatura                      |
| TMPI       | Temperatura interior             |
| VV         | Velocidad del viento             |
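
Before requesting data, it can be worth confirming that the station reports the parameter you want and that your date range falls inside the station's reporting window. The sketch below hard-codes the dates and parameter codes shown above so that it runs offline. can_request() is a hypothetical helper written for this example, not a function from rsinaica.

```r
# Values copied from the sinaica_station_dates() and
# sinaica_station_params() output shown above
station_dates <- as.Date(c("1997-01-01", "2026-03-06"))
param_codes <- c("CN", "SO2", "NO2", "DV", "HR", "IUV", "CO", "NO", "NOx",
                 "O3", "PM10", "PM2.5", "PP", "PB", "RS", "TMP", "TMPI", "VV")

# Hypothetical helper: TRUE when the parameter is reported by the station
# and the requested range lies within the station's reporting window
can_request <- function(param, start, end, codes, dates) {
  start <- as.Date(start)
  end <- as.Date(end)
  param %in% codes && start <= end && start >= dates[1] && end <= dates[2]
}

ok_pm10 <- can_request("PM10", "2018-01-01", "2018-01-31",
                       param_codes, station_dates)
too_early <- can_request("PM10", "1990-01-01", "1990-01-31",
                         param_codes, station_dates)
```

Here ok_pm10 is TRUE and too_early is FALSE, because 1990 predates the station's first reported date.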

Finally, download and plot hourly concentrations of particulate matter with a diameter smaller than 10 micrometers (μm), known as PM10, for January 2018.

# Download all PM10 data for January 2018
df <-  sinaica_station_data(102, # station_id
                         "PM10", # parameter
                         "2018-01-01", # start_date
                         "2018-01-31", # end_date
                         "Crude" # Crude, Manual or Validated
                         )

ggplot(df, aes(hour, value, group = date)) +
  geom_line(alpha=.9) +
  ggtitle(expression(paste(PM[10],
                           " pollution during January 2018 at the",
                           " Centro station in Guadalajara"))) +
  labs(subtitle="Each line corresponds to a day") +
  xlab("hour") +
  ylab(expression(paste(mu,"g/", m^3))) +
  theme_bw()

The hours are shown in local Guadalajara time. Because January falls outside the summer-time period, the offset for this plot is UTC−6. The station's timezone field gives the details:

stations_sinaica$timezone[which(stations_sinaica$station_id == 102)]
#> [1] "Tiempo del centro, UTC-6 (UTC-5 en verano)"

Wikipedia has a handy map of Mexico’s time zones to help with any time conversions you might need.
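
As a rough illustration of such a conversion, the sketch below turns local date and hour values into UTC timestamps with base R. It makes two assumptions: the hour column runs from 0 to 23, and a fixed UTC−6 offset applies, which holds for January but not for the summer months listed in the timezone field. The toy data.frame only mimics the date and hour columns used in the plot above.

```r
# Toy rows with the date and hour columns used in the plot above
obs <- data.frame(
  date = c("2018-01-01", "2018-01-01"),
  hour = c(0L, 23L),
  stringsAsFactors = FALSE
)

# "Etc/GMT+6" is the tz database name for a fixed UTC-6 offset
# (the sign in these zone names is inverted by POSIX convention)
local_time <- as.POSIXct(
  sprintf("%s %02d:00", obs$date, obs$hour),
  format = "%Y-%m-%d %H:%M",
  tz = "Etc/GMT+6"
)

# Express the same instants in UTC
obs$utc <- format(local_time, format = "%Y-%m-%d %H:%M", tz = "UTC")
obs
```

Midnight local time on January 1 corresponds to 06:00 UTC the same day, and 23:00 local time to 05:00 UTC on January 2.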