The National Weather Service site https://w1.weather.gov/xml/current_obs/ provides access to current observed weather conditions for about 2,367 locations across 58 US states and territories.
With the XML link for an airport code (e.g., KAMW), users can access the current weather report for that airport location. The webpage shows several kinds of weather-related data.
The XML file contains more than 20 different types of weather data, as well as the location and time of the observations. Furthermore, a link to “2 Day History” gives access to an HTML page containing weather history data for the past 72 hours.
Thanks to the well-organized data structure of the National Weather Service website, users can easily explore both the spatial and temporal aspects of the weather at selected locations. The site also provides ZIP files containing all current XML files, updated hourly. However, users may be interested in only a few selected sites and may not want to download the whole data set. In addition, the XML and HTML formats need to be converted to data frames before downstream analysis. Web scraping is a natural way to handle this. The package `airportweather` is designed as an efficient tool to scrape, extract, integrate, and visualize the weather data of airport locations. Its advantage is that it scrapes real-time data from the web whenever its functions are called, so the most up-to-date data is obtained.
The primary key of the weather data is the 4-letter airport code of each airport location. By scraping the root page https://w1.weather.gov/xml/current_obs/, we collected all current airport codes together with the location name and the state/territory abbreviation that correspond to each code. The data set `all_code.rda` contains 2,367 US airport codes from 58 states/territories. It is used as the package example data to provide the currently valid codes of US airports. If new codes are added to the National Weather Service site, the `airportweather` package will still work as long as the webpage format and the site structure stay the same.
State | Code | Name |
---|---|---|
AK | PADK | Adak Island, Adak Airport |
AK | PAUT | Akun Airport |
AK | PAFM | Ambler, Ambler Airport |
AK | PAKP | Anaktuvuk Pass, Anaktuvuk Pass Airport |
AK | PAED | Anchorage, Elmendorf Air Force Base |
AK | PALH | Anchorage, Lake Hood Seaplane Base |
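Because every function is keyed on the airport code, it can help to check a code against the table before scraping. The sketch below is only an illustration: `codes` is a small stand-in built from three rows of the table above (not the full `all_code` data), and `is_known_code` is a hypothetical helper, not a function of the package.

```r
# Stand-in for the package's code table, using three rows shown above.
codes <- data.frame(
  State = c("AK", "AK", "AK"),
  Code  = c("PADK", "PAUT", "PAFM"),
  Name  = c("Adak Island, Adak Airport", "Akun Airport",
            "Ambler, Ambler Airport"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when `id` is a 4-character upper-case code
# that is listed in the table.
is_known_code <- function(id, table = codes) {
  grepl("^[A-Z0-9]{4}$", id) & id %in% table$Code
}

known <- is_known_code(c("PADK", "KAMW", "padk"))
```

Here "KAMW" is reported as unknown only because the stand-in table holds three Alaska rows; against the full code table it would be found.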
You can download the package `airportweather` from GitHub, then install (build) it with Ctrl+Shift+B in RStudio. Alternatively, you can install the package using `devtools`.
The `current_weather` function is the basic function for data retrieval. It reads the weather information of an airport from the XML file on the National Weather Service website (e.g., https://w1.weather.gov/xml/current_obs/KAMW.xml), and outputs a single-row data frame containing location, station_id, latitude, longitude, observation_time, and the selected weather elements.
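To make the XML-to-data-frame step concrete, here is a minimal sketch using the `xml2` package. It is not the package's actual implementation: `parse_current_obs` is a hypothetical helper, and `sample_xml` is a trimmed, hand-written document that borrows its values from the example output in this document, so the code runs without a network connection.

```r
library(xml2)

# Hypothetical helper (not the package's code): parse one current-observation
# XML document into a single-row data frame of character columns.
parse_current_obs <- function(xml_string, type) {
  doc <- read_xml(xml_string)
  fields <- c("location", "station_id", "latitude", "longitude",
              "observation_time", type)
  values <- vapply(fields, function(f) {
    node <- xml_find_first(doc, paste0("//", f))
    # Elements that are absent from the document become NA.
    if (inherits(node, "xml_missing")) NA_character_ else xml_text(node)
  }, character(1))
  as.data.frame(as.list(values), stringsAsFactors = FALSE)
}

# Trimmed sample document (assumed layout, for illustration only).
sample_xml <- '<current_observation>
  <location>Ames, Ames Municipal Airport, IA</location>
  <station_id>KAMW</station_id>
  <latitude>41.99056</latitude>
  <longitude>-93.61889</longitude>
  <observation_time>Last Updated on May 7 2019, 1:53 pm CDT</observation_time>
  <temp_f>60.0</temp_f>
  <wind_mph>11.5</wind_mph>
  <relative_humidity>60</relative_humidity>
</current_observation>'

obs <- parse_current_obs(sample_xml,
                         c("wind_mph", "temp_f", "relative_humidity"))
```

All columns come back as character strings in this sketch; converting the numeric elements with `as.numeric()` would be a natural next step.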
Description of input parameters is as follows:
Parameter | Description | Type |
---|---|---|
id | one 4-letter airport code | chr |
type | a vector of weather elements | chr vector |
The possible types of weather elements the function can deal with are
Below is an example of calling this function to retrieve the current “wind_mph”, “temp_f”, and “relative_humidity” at Ames, Ames Municipal Airport, IA (airport code “KAMW”).
location | station_id | latitude | longitude | observation_time | wind_mph | temp_f | relative_humidity |
---|---|---|---|---|---|---|---|
Ames, Ames Municipal Airport, IA | KAMW | 41.99056 | -93.61889 | Last Updated on May 7 2019, 1:53 pm CDT | 11.5 | 60.0 | 60 |
The `current_weather_more` function calls the `current_weather` function to retrieve weather elements from multiple airports, and outputs an integrated data frame.
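The "call once per airport, then integrate" pattern can be sketched in base R as follows. This is an assumption about the general approach, not the package's source: `fake_current_weather` is a hypothetical stand-in that returns fixed values (taken from the example output in this document) instead of scraping the web.

```r
# Hypothetical stand-in for a single-airport lookup; returns fixed values
# rather than querying the National Weather Service site.
fake_current_weather <- function(id) {
  temps <- c(KAMW = 60, KCID = 57, KDSM = 61)
  data.frame(station_id = id, temp_f = unname(temps[id]),
             stringsAsFactors = FALSE)
}

ids <- c("KAMW", "KCID", "KDSM")

# Call the single-airport function once per code, then stack the rows.
combined <- do.call(rbind, lapply(ids, fake_current_weather))
```

Stacking with `rbind` works here because every single-row data frame has the same columns.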
Description of input parameters is as follows:
Parameter | Description | Type |
---|---|---|
id | a vector of 4-letter airport codes | chr vector |
type | a vector of weather elements | chr vector |
Below is an example of calling this function to retrieve the current “wind_mph”, “temp_f”, and “relative_humidity” at three locations: Ames, Ames Municipal Airport, IA (airport code “KAMW”); Cedar Rapids Airport, IA (airport code “KCID”); and Des Moines International Airport, IA (airport code “KDSM”).
location | station_id | latitude | longitude | observation_time | wind_mph | temp_f | relative_humidity |
---|---|---|---|---|---|---|---|
Ames, Ames Municipal Airport, IA | KAMW | 41.99056 | -93.61889 | Last Updated on May 7 2019, 1:53 pm CDT | 11.5 | 60.0 | 60 |
Cedar Rapids Airport, IA | KCID | 41.88295 | -91.72456 | Last Updated on May 7 2019, 1:52 pm CDT | 13.8 | 57.0 | 64 |
Des Moines International Airport, IA | KDSM | 41.53399 | -93.65307 | Last Updated on May 7 2019, 1:54 pm CDT | 13.8 | 61.0 | 58 |
The `plot_weather_cat` function plots one categorical weather component for several airports on a map of the US.
Description of input parameters is as follows:
Parameter | Description | Type |
---|---|---|
id_vector | a vector of multiple airport codes | chr vector |
type | a weather element | chr |
label | label= TRUE: label airport code on the plot | logical |
Below is an example that plots a categorical weather component, “weather”.
The `plot_weather_num` function plots one numeric weather component for several airports on a map of the US.
Description of input parameters is as follows:
Parameter | Description | Type |
---|---|---|
id_vector | a vector of multiple airport codes | chr vector |
type | a weather element | chr |
label | label= TRUE: label airport code on the plot | logical |
number | number= TRUE: this element is numeric | logical |
Below is an example to plot a numerical/continuous weather component, such as “temp_f”.
The `plot_weather_us` function plots one component for all airports, together with the user’s location, on a map of the US with contour lines. The component has to be continuous, such as “temp_c”.
Description of input parameters is as follows:
Parameter | Description | Type |
---|---|---|
type | a weather element | chr |
you.long | user provided longitude | num |
you.lat | user provided latitude | num |
bin | the binwidth of the contour lines | num |
It takes a relatively long time to run this function. Below is a previous result as a demonstration.
The function `obhistory` retrieves the 3-day history of weather and temperature for one location, since the “weather condition” (Sunny, Fog, etc.) and the “air temperature” (in degrees Fahrenheit) are generally the two most important elements in weather data. The input of this function is `id`, one 4-letter airport code; the output is a data frame with 9 columns.
code | loc_name | date | time | localtime | time_UTC | Weather | Temperature | hday |
---|---|---|---|---|---|---|---|---|
KAMW | Ames, Ames Municipal Airport, IA | 04 | 14:53 | 2019-05-04 14:53:00 | 2019-05-04 19:53:00 | A Few Clouds | 67 | first24hours |
KAMW | Ames, Ames Municipal Airport, IA | 04 | 15:53 | 2019-05-04 15:53:00 | 2019-05-04 20:53:00 | Partly Cloudy | 68 | first24hours |
KAMW | Ames, Ames Municipal Airport, IA | 04 | 16:53 | 2019-05-04 16:53:00 | 2019-05-04 21:53:00 | A Few Clouds | 69 | first24hours |
KAMW | Ames, Ames Municipal Airport, IA | 04 | 17:53 | 2019-05-04 17:53:00 | 2019-05-04 22:53:00 | Partly Cloudy | 69 | first24hours |
KAMW | Ames, Ames Municipal Airport, IA | 04 | 18:53 | 2019-05-04 18:53:00 | 2019-05-04 23:53:00 | Fair | 67 | first24hours |
KAMW | Ames, Ames Municipal Airport, IA | 04 | 19:53 | 2019-05-04 19:53:00 | 2019-05-05 00:53:00 | Fair | 61 | first24hours |
The description of each column is as follows:
Variable | Description | Type |
---|---|---|
code | one 4-letter airport code | chr |
loc_name | name of the airport location | chr |
date | the same as in the original html table | chr |
time | the same as in the original html table | chr |
localtime | the local time | chr |
time_UTC | UTC time | POSIXct |
Weather | the Weather | factor |
Temperature | air temperature | num |
hday | define the 1st, 2nd and 3rd 24 hours | factor |
Note that the returned data frame has a variable number of rows, because the observation frequency is not fixed at each airport site and the observation time points are not evenly spaced. It is therefore important that the table contains detailed time information. A POSIXct time variable (in UTC) is also generated from the data. If the user needs to compare data from different time zones, the UTC time makes the comparison tidier.
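As a small illustration of why the UTC column helps, the base R sketch below converts two local timestamps to UTC. The time zone name is an assumption made for this example (Ames, IA observes US Central time); how the package determines the zone is not described here.

```r
# Local observation times as character strings (values from the table above).
local_chr <- c("2019-05-04 14:53:00", "2019-05-04 19:53:00")

# Assumption: the airport's time zone is known in advance.
local_time <- as.POSIXct(local_chr, tz = "America/Chicago")

# The same instants rendered in UTC.
time_utc <- format(local_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
# "2019-05-04 19:53:00" "2019-05-05 00:53:00"
```

Both results agree with the time_UTC values in the sample rows, including the date rollover for the 19:53 local observation.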
The function `plot_temp_history` generates a plot of air temperature for the 3 days up to the current time. The resulting plot has the airport location name in the title, local hours labeled on the x-axis, and three lines, one for each 24-hour period.
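One way to split observations into three 24-hour groups, one per plotted line, is sketched below. This is an assumption about the grouping logic rather than the package's code: only the label “first24hours” appears in the sample output, so the other two labels are hypothetical, and the latest observation is treated as the current time.

```r
# Observation times in UTC; the last one is treated as "now" (an assumption).
obs_time <- as.POSIXct(c("2019-05-04 19:53:00", "2019-05-05 21:53:00",
                         "2019-05-06 23:53:00", "2019-05-07 18:53:00"),
                       tz = "UTC")

# Hours elapsed between each observation and the most recent one.
hours_ago <- as.numeric(difftime(max(obs_time), obs_time, units = "hours"))

# The oldest window is labeled first, matching "first24hours" in the sample rows.
hday <- cut(hours_ago, breaks = c(-Inf, 24, 48, Inf), right = FALSE,
            labels = c("third24hours", "second24hours", "first24hours"))
```

With a factor like `hday`, a plotting function can draw one line per level against the local hour of day.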
A Shiny app was designed to demonstrate selected functions in the `airportweather` package. The spatial visualization tab, “Weather information on map”, calls `current_weather_more` to show selected weather variables for one or more airports on interactive maps built with `leaflet`, and also generates an integrated table for multiple locations.
Launch the Shiny app, and you should see the panel below:
Choose the airport code by typing letters or by picking from the menu. As you type, the menu shows all choices containing those letters. You can choose multiple airports.
Choose the weather feature by clicking on its box. You can also choose several weather features on this panel.
After you select the airport codes and weather features, a map with the airport positions is shown below the tab “Weather information on map”. The map shows the airports you selected, with the selected weather information on the points. You can also drag the map and zoom in or out.
Below the map, you can also see a table with the selected weather information as well as the location information for the selected airports.
When you switch to the tab “History weather at one location”, you will see a reminder if you have chosen more than one location. The tab shows the latest 72 hours of temperature for an airport, so you can choose only one location.
The temperature history plot shows 3 days of temperature records, which are presented in three different colors.
A detailed temperature history table is shown below the temperature plot. It includes all temperature records for every hour. You can also search for a certain time, weather condition, or temperature in the search box.
Combined with flight information, our package can be used to evaluate the performance of airports and airlines. By analyzing airport weather data together with flight data, we can find out which airports and airlines have the most weather-related delays, which weather components cause the most flight delays, and so on.