InForMID
Tufts Initiative for the Forecasting and Modeling of Infectious Diseases

An analecta of visualizations for foodborne illness trends and seasonality

Data Sources

The FoodNet Fast platform provides publicly available data for laboratory-confirmed diseases caused by seven bacteria (Campylobacter, Listeria, Salmonella, Shigella, Shiga toxin-producing Escherichia coli O157 and non-O157 (STEC), Vibrio, and Yersinia enterocolitica) and two protozoa (Cryptosporidium and Cyclospora) in counties from 10 states: California (CA), Connecticut (CT), Georgia (GA), Minnesota (MN), and Oregon (OR) since 1996; Maryland (MD) and New York (NY) since 1998; Tennessee (TN) since 2000; Colorado (CO) since 2001; and New Mexico (NM) since 2004. National estimates are generated by summing data across these 10 states, which represent approximately 15% of the US population [1-2].

FoodNet Fast allows users to download data and create visualizations of these diseases for a user-specified time period. For multi-year periods, the portal aggregates totals and monthly percentages into single statistics for the full time period selected rather than showing individual years. The portal also restricts the user to one state at a time. To obtain monthly percentages of confirmed cases for all diseases in one year and one location, each state-year combination had to be downloaded individually, for a total of 221 files in MS Excel format. To create a time series of total monthly cases by pathogen and location, we multiplied the monthly percentages of confirmed cases by the annual counts of confirmed cases. Since the provided monthly percentages are rounded to 1 digit in the data download, the calculated counts slightly under- or over-estimate annual totals.
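The reconstruction of monthly counts can be illustrated with a minimal Python sketch (the analysis code linked on this page is written in Stata and R; this is only an illustration). The function name and the example values are hypothetical, and the percentages are assumed to be rounded to one decimal place.

```python
def reconstruct_monthly_counts(annual_count, monthly_percentages):
    """Approximate monthly case counts from an annual total and 12 monthly percentages."""
    if len(monthly_percentages) != 12:
        raise ValueError("expected one percentage per calendar month")
    return [annual_count * pct / 100.0 for pct in monthly_percentages]


# Hypothetical values: 1,000 annual cases and percentages rounded to one decimal place.
example_percentages = [5.1, 4.9, 6.0, 7.2, 8.8, 11.4, 13.0, 12.1, 10.3, 8.6, 6.9, 5.6]
example_counts = reconstruct_monthly_counts(1000, example_percentages)

# The rounded percentages sum to 99.9 rather than 100, so the reconstructed
# total falls slightly short of the reported annual count.
reconstructed_total = sum(example_counts)
```

In this example the reconstructed total is about 999 cases against a reported 1,000, which is the kind of small discrepancy described above.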

We downloaded county-level population estimates from the 1990, 2000, and 2010 US Census Bureau interannual census reports [3-5]. We then estimated the state-level FoodNet population catchment area by adding the mid-year (July 1st) populations of all counties surveyed in each year. Next, we calculated the United States population catchment area by summing the state-level estimates for all surveyed counties in each year. Finally, we developed a time series of monthly rates per 1,000,000 persons for each pathogen and location by dividing monthly counts by annual population estimates and multiplying this quotient by 1,000,000. In addition to monthly rates, we calculated yearly rates by summing all monthly counts in a given year, dividing by the annual population, and multiplying this quotient by 1,000,000.
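The rate calculation can be sketched in a few lines of Python. The county names, populations, and case counts below are hypothetical placeholders, not FoodNet or Census values, and the function names are illustrative only.

```python
def catchment_population(county_populations):
    """Sum mid-year populations over all surveyed counties for one year."""
    return sum(county_populations.values())


def rate_per_million(cases, population):
    """Cases per 1,000,000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 1_000_000


# Hypothetical surveyed counties and their mid-year populations for one year.
counties = {"County A": 250_000, "County B": 750_000}
population = catchment_population(counties)

# Hypothetical monthly counts for the same year (January to December).
monthly_cases = [40, 35, 50, 60, 80, 110, 130, 120, 100, 85, 65, 45]
monthly_rates = [rate_per_million(c, population) for c in monthly_cases]
yearly_rate = rate_per_million(sum(monthly_cases), population)
```

Because every month of a year is divided by the same annual population estimate, the twelve monthly rates add up to the yearly rate.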

The dataset including counts, rates, and population estimates can be downloaded here.

Methods for Calculating Trend and Seasonality

We calculated trend and seasonality using Negative Binomial Harmonic Regression (NBHR) models. These models allow us to calculate important characteristics of seasonality: when the maximum rate occurs (peak timing) and the magnitude of that peak (amplitude). Using the δ-method, we can calculate peak timing, amplitude, and their confidence intervals from NBHR model coefficients [6-8]. We fit a NBHR model for each study year and location with the length of the time series set to 12 to represent the months of the year [8]. To examine average trends across the entire 22-year period, we fit a NBHR model with three trend terms (linear, quadratic, and cubic) where the length of the time series varied according to when FoodNet began surveying that location (168-264 months).
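As an illustration of how peak timing and amplitude follow from harmonic coefficients, the Python sketch below converts a sine and a cosine coefficient into a phase and an amplitude. It assumes a single harmonic pair of the form beta_sin*sin(2*pi*t/12) + beta_cos*cos(2*pi*t/12) on the model's linear-predictor scale, with t measured in months from the start of the cycle. The function name and coefficient values are hypothetical; the sketch does not fit a negative binomial model, does not compute δ-method confidence intervals, and may use a different time-indexing convention than the models described above.

```python
import math


def harmonic_peak(beta_sin, beta_cos, period=12.0):
    """Return (peak_time, amplitude) for beta_sin*sin(w*t) + beta_cos*cos(w*t), w = 2*pi/period.

    The sum equals amplitude * cos(w*t - phase), which is maximal when w*t = phase.
    """
    amplitude = math.hypot(beta_sin, beta_cos)
    # atan2 resolves the quadrant; the modulo maps the phase into [0, 2*pi).
    phase = math.atan2(beta_sin, beta_cos) % (2.0 * math.pi)
    peak_time = phase / (2.0 * math.pi) * period
    return peak_time, amplitude


# Hypothetical coefficients, not estimates from the FoodNet data.
peak_time, amplitude = harmonic_peak(beta_sin=-0.3, beta_cos=-0.4)
```

Using atan2 rather than a plain arctangent of the coefficient ratio avoids handling each sign combination of the two coefficients separately.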

Multi-Panel, Multi-Axis Plot Terminology

A multi-panel plot, as defined in our previous work, “involves the strategic positioning of two or more graphs sharing at least one common axis on a single canvas” [9]. These plots can effectively illustrate multiple dimensions of information including different time units (e.g. yearly, monthly), disease statistics (e.g. rates, counts), seasonality characteristics (e.g. peak timing, amplitude), and locations (e.g. state-level, national). We use common, standardized terminology across visualizations to ensure comprehension:

  • Disease – each of the nine reported FoodNet infections, including campylobacteriosis (Camp), listeriosis (List), salmonellosis (Salm), shigellosis (Shig), infection due to Shiga toxin-producing Escherichia coli O157 and non-O157 (Ecol), vibriosis (Vibr), infection due to Yersinia enterocolitica (Yers), cryptosporidiosis (Cryp) and cyclosporiasis (Cycl)
  • Monthly Rate – monthly confirmed cases per 1,000,000 persons
  • Yearly Rate – total confirmed cases in a year divided by the mid-year population of all surveyed counties in that location (cases per 1,000,000 persons)
  • Frequency – the number of months reporting the same monthly rate
  • Peak Timing – the time of year according to the Gregorian calendar that a disease reaches its maximal rate; for monthly time series, peak timing ranges from 1.0 (beginning of January) to 12.9 (end of December)
  • Absolute Peak Intensity – the difference between the disease maximum rate at peak and disease minimum rate at nadir
  • Relative Peak Intensity – the ratio between the disease maximum rate at peak and disease minimum rate at nadir
  • Amplitude – the mathematical amplitude, or the midpoint of disease rate between the peak (maximum rate) and nadir (minimum rate)
  • FoodNet Surveyed County – the counties under FoodNet surveillance as of 2017
  • Non-Surveyed County – all remaining counties within a surveillance state as of 2017
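The two peak intensity measures defined above can be illustrated with a short Python sketch applied to a hypothetical set of twelve monthly rates. The function name and values are illustrative only; the sketch works on raw rates rather than model-based estimates, and amplitude is left out because it is derived from the model.

```python
def peak_intensities(rates):
    """Return (absolute, relative) peak intensity for a sequence of rates."""
    peak = max(rates)
    nadir = min(rates)
    absolute = peak - nadir
    # The ratio is undefined when the nadir is zero, so None is returned instead.
    relative = peak / nadir if nadir > 0 else None
    return absolute, relative


# Hypothetical monthly rates per 1,000,000 persons (January to December).
example_rates = [40, 35, 50, 60, 80, 110, 130, 120, 100, 85, 65, 45]
absolute_intensity, relative_intensity = peak_intensities(example_rates)
```

Here the peak is 130 and the nadir is 35, so the absolute peak intensity is 95 and the relative peak intensity is 130/35, or roughly 3.7.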

Visualizations of Trend

These multi-panel plots combine information on monthly rates, inter-annual trends, and the frequency distribution of rates by using the shared axes of individual plots. The right panel provides a time series of monthly rates to which a NBHR model with seasonal oscillators and three trend terms (linear, quadratic, and cubic) is fit. The predicted trend line is shown in blue and its 95% confidence interval is shaded in grey. The estimated median monthly rate is shown in red. The left panel depicts a rotated histogram of rate frequencies, indicating the right-skewness of the monthly rate distribution. The histogram shares the vertical monthly rate axis with the time series plot. Two pictograms indicate the selected pathogen and location.

Location       Campylobacter  Listeria  Salmonella  Shigella  STEC     Vibrio   Yersinia  Cryptosporidium  Cyclospora
California     CA_Camp        CA_List   CA_Salm     CA_Shig   CA_STEC  CA_Vibr  CA_Yers   CA_Cryp          CA_Cycl
Colorado       CO_Camp        CO_List   CO_Salm     CO_Shig   CO_STEC  CO_Vibr  CO_Yers   CO_Cryp          CO_Cycl
Connecticut    CT_Camp        CT_List   CT_Salm     CT_Shig   CT_STEC  CT_Vibr  CT_Yers   CT_Cryp          CT_Cycl
Georgia        GA_Camp        GA_List   GA_Salm     GA_Shig   GA_STEC  GA_Vibr  GA_Yers   GA_Cryp          GA_Cycl
Maryland       MD_Camp        MD_List   MD_Salm     MD_Shig   MD_STEC  MD_Vibr  MD_Yers   MD_Cryp          MD_Cycl
Minnesota      MN_Camp        MN_List   MN_Salm     MN_Shig   MN_STEC  MN_Vibr  MN_Yers   MN_Cryp          MN_Cycl
New Mexico     NM_Camp        NM_List   NM_Salm     NM_Shig   NM_STEC  NM_Vibr  NM_Yers   NM_Cryp          NM_Cycl
New York       NY_Camp        NY_List   NY_Salm     NY_Shig   NY_STEC  NY_Vibr  NY_Yers   NY_Cryp          NY_Cycl
Oregon         OR_Camp        OR_List   OR_Salm     OR_Shig   OR_STEC  OR_Vibr  OR_Yers   OR_Cryp          OR_Cycl
Tennessee      TN_Camp        TN_List   TN_Salm     TN_Shig   TN_STEC  TN_Vibr  TN_Yers   TN_Cryp          TN_Cycl
United States  US_Camp        US_List   US_Salm     US_Shig   US_STEC  US_Vibr  US_Yers   US_Cryp          US_Cycl

To download all images, please see the attached zip file here.

Visualizations of Seasonal Signatures

These multi-panel plots incorporate annual seasonal signatures, summary statistics of monthly rates, and radar plots. The top-left panel provides an overlay of all annual seasonal signatures, a set of curves depicting characteristic variations in disease incidence over the course of one year, where line hues become increasingly darker with more recent data and a red line indicates median monthly rates. The bottom-left panel provides a set of box plots, one for each month, that aggregate information over the study period and provide essential summary statistics, including the median rate values and the measures of spread. The shared horizontal axis allows the two plots to be compared across the years using identical scales. The background colours illustrate the four seasons: winter solstice to vernal equinox (blue), vernal equinox to summer solstice (green), summer solstice to autumnal equinox (yellow), and autumnal equinox to winter solstice (orange). The right panel overlays monthly rates using a radar plot where time is indicated on the rotational axis and rates are indicated on the radial axis. The radar plot reminds the reader of the periodic nature of seasonal variations and connects the oscillations in one continuous line with graduating colours. The colour hue of the lines, background colour, median line colour and the axis scales are uniform across all three panels. We also repeat the pictograms to indicate the selected pathogen and location.
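The radar plot geometry can be sketched in Python by mapping each month to an angle and each rate to a radius. The function name is hypothetical, and the orientation (January at the top, months advancing clockwise) is an assumption for illustration rather than a description of the published figures.

```python
import math


def radar_points(monthly_rates, months_per_year=12):
    """Map a monthly series to (x, y) points: month -> angle, rate -> radius."""
    points = []
    for index, rate in enumerate(monthly_rates):
        # Angle measured clockwise from the top; it repeats every 12 months.
        angle = 2.0 * math.pi * (index % months_per_year) / months_per_year
        points.append((rate * math.sin(angle), rate * math.cos(angle)))
    return points


# Hypothetical two-year series with a constant rate of 10 per 1,000,000 persons.
example_points = radar_points([10.0] * 24)
```

Joining consecutive points in order traces one continuous line across years, with the same calendar month always falling on the same spoke.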

Location       Campylobacter  Listeria  Salmonella  Shigella  STEC     Vibrio   Yersinia  Cryptosporidium  Cyclospora
California     CA_Camp        CA_List   CA_Salm     CA_Shig   CA_STEC  CA_Vibr  CA_Yers   CA_Cryp          CA_Cycl
Colorado       CO_Camp        CO_List   CO_Salm     CO_Shig   CO_STEC  CO_Vibr  CO_Yers   CO_Cryp          CO_Cycl
Connecticut    CT_Camp        CT_List   CT_Salm     CT_Shig   CT_STEC  CT_Vibr  CT_Yers   CT_Cryp          CT_Cycl
Georgia        GA_Camp        GA_List   GA_Salm     GA_Shig   GA_STEC  GA_Vibr  GA_Yers   GA_Cryp          GA_Cycl
Maryland       MD_Camp        MD_List   MD_Salm     MD_Shig   MD_STEC  MD_Vibr  MD_Yers   MD_Cryp          MD_Cycl
Minnesota      MN_Camp        MN_List   MN_Salm     MN_Shig   MN_STEC  MN_Vibr  MN_Yers   MN_Cryp          MN_Cycl
New Mexico     NM_Camp        NM_List   NM_Salm     NM_Shig   NM_STEC  NM_Vibr  NM_Yers   NM_Cryp          NM_Cycl
New York       NY_Camp        NY_List   NY_Salm     NY_Shig   NY_STEC  NY_Vibr  NY_Yers   NY_Cryp          NY_Cycl
Oregon         OR_Camp        OR_List   OR_Salm     OR_Shig   OR_STEC  OR_Vibr  OR_Yers   OR_Cryp          OR_Cycl
Tennessee      TN_Camp        TN_List   TN_Salm     TN_Shig   TN_STEC  TN_Vibr  TN_Yers   TN_Cryp          TN_Cycl
United States  US_Camp        US_List   US_Salm     US_Shig   US_STEC  US_Vibr  US_Yers   US_Cryp          US_Cycl

To download all images, please see the attached zip file here.

Heat Maps of Annual Time Series

To capture the advantage of a multi-panel plot (Figure 6), we incorporate the boxplot from the lower left panel of the figures above with a calendar heatmap containing 264 monthly rate values. In the heatmap, information for each individual year is shown as stacked rows of width 12 (for each month of the year) where cell colour intensity represents the magnitude of monthly rates. Compared to stacked line plots, these figures provide an individual row for each year of the time series, allowing for greater decomposition, differentiation, and comparison of seasonal signatures across years. Seasonal changes are shown horizontally from left to right while yearly trend transition can be observed in a vertical view from bottom to top. Yearly rates provide a bar graph for comparing fluctuations in inter-annual rates while the adjacent heatmap indicates the month(s) driving these fluctuations. In doing so, the calendar heatmap identifies whether inter-annual changes are driven by sporadic outbreaks or increased seasonal magnitude of rates. At the same time, the shared axis box plot provides an overview of the average seasonal signature for the entire time series.
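The data layout behind the calendar heatmap can be sketched in Python: a 264-month series is reshaped into 22 rows of 12 monthly values, one row per year. The function name and the placeholder series are hypothetical, and the 1996 start year is taken from the study period given on this page.

```python
def calendar_matrix(monthly_rates, first_year):
    """Reshape a monthly series into {year: [12 monthly rates]} rows."""
    if len(monthly_rates) % 12 != 0:
        raise ValueError("series length must be a multiple of 12")
    n_years = len(monthly_rates) // 12
    return {
        first_year + i: monthly_rates[12 * i : 12 * (i + 1)]
        for i in range(n_years)
    }


# Hypothetical placeholder series: 264 monthly values starting in January 1996.
series = [float(i % 12 + 1) for i in range(264)]
matrix = calendar_matrix(series, first_year=1996)

# Row sums correspond to the yearly-rate bars shown beside the heatmap,
# because each year's monthly rates share one annual population denominator.
yearly_totals = {year: sum(row) for year, row in matrix.items()}
```

Each row then maps to one horizontal strip of the heatmap, and each column to one calendar month shared with the box plot.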

Location       Campylobacter  Listeria  Salmonella  Shigella  STEC     Vibrio   Yersinia  Cryptosporidium  Cyclospora
California     CA_Camp        CA_List   CA_Salm     CA_Shig   CA_STEC  CA_Vibr  CA_Yers   CA_Cryp          CA_Cycl
Colorado       CO_Camp        CO_List   CO_Salm     CO_Shig   CO_STEC  CO_Vibr  CO_Yers   CO_Cryp          CO_Cycl
Connecticut    CT_Camp        CT_List   CT_Salm     CT_Shig   CT_STEC  CT_Vibr  CT_Yers   CT_Cryp          CT_Cycl
Georgia        GA_Camp        GA_List   GA_Salm     GA_Shig   GA_STEC  GA_Vibr  GA_Yers   GA_Cryp          GA_Cycl
Maryland       MD_Camp        MD_List   MD_Salm     MD_Shig   MD_STEC  MD_Vibr  MD_Yers   MD_Cryp          MD_Cycl
Minnesota      MN_Camp        MN_List   MN_Salm     MN_Shig   MN_STEC  MN_Vibr  MN_Yers   MN_Cryp          MN_Cycl
New Mexico     NM_Camp        NM_List   NM_Salm     NM_Shig   NM_STEC  NM_Vibr  NM_Yers   NM_Cryp          NM_Cycl
New York       NY_Camp        NY_List   NY_Salm     NY_Shig   NY_STEC  NY_Vibr  NY_Yers   NY_Cryp          NY_Cycl
Oregon         OR_Camp        OR_List   OR_Salm     OR_Shig   OR_STEC  OR_Vibr  OR_Yers   OR_Cryp          OR_Cycl
Tennessee      TN_Camp        TN_List   TN_Salm     TN_Shig   TN_STEC  TN_Vibr  TN_Yers   TN_Cryp          TN_Cycl
United States  US_Camp        US_List   US_Salm     US_Shig   US_STEC  US_Vibr  US_Yers   US_Cryp          US_Cycl

To download all images, please see the attached zip file here.

Forest Plots of Seasonal Features

These are multi-panel plots that incorporate two forest plots (one each for annual peak timing and amplitude estimates) and one scatterplot (for peak timing and amplitude) to describe seasonality features. The top-left panel shows peak timing estimates (as month of the year, ranging from 1.0 (beginning of January) to 12.9 (end of December) – horizontal axis) for each study year (vertical axis). The bottom-right panel shows amplitude estimates where the horizontal axis indicates the study year and the vertical axis shows the amplitude (as monthly rates per 1,000,000 persons). The bottom-left corner shows the scatterplot of peak timing (horizontal axis) and amplitude (vertical axis) with markers representing each pair of annual estimates. Measures of uncertainty (95% confidence intervals) are reflected in error bars of each marker; dashed red lines show median peak timing and amplitude estimates.

Location       Campylobacter  Listeria  Salmonella  Shigella  STEC     Vibrio   Yersinia  Cryptosporidium  Cyclospora
California     CA_Camp        CA_List   CA_Salm     CA_Shig   CA_STEC  CA_Vibr  CA_Yers   CA_Cryp          CA_Cycl
Colorado       CO_Camp        CO_List   CO_Salm     CO_Shig   CO_STEC  CO_Vibr  CO_Yers   CO_Cryp          CO_Cycl
Connecticut    CT_Camp        CT_List   CT_Salm     CT_Shig   CT_STEC  CT_Vibr  CT_Yers   CT_Cryp          CT_Cycl
Georgia        GA_Camp        GA_List   GA_Salm     GA_Shig   GA_STEC  GA_Vibr  GA_Yers   GA_Cryp          GA_Cycl
Maryland       MD_Camp        MD_List   MD_Salm     MD_Shig   MD_STEC  MD_Vibr  MD_Yers   MD_Cryp          MD_Cycl
Minnesota      MN_Camp        MN_List   MN_Salm     MN_Shig   MN_STEC  MN_Vibr  MN_Yers   MN_Cryp          MN_Cycl
New Mexico     NM_Camp        NM_List   NM_Salm     NM_Shig   NM_STEC  NM_Vibr  NM_Yers   NM_Cryp          NM_Cycl
New York       NY_Camp        NY_List   NY_Salm     NY_Shig   NY_STEC  NY_Vibr  NY_Yers   NY_Cryp          NY_Cycl
Oregon         OR_Camp        OR_List   OR_Salm     OR_Shig   OR_STEC  OR_Vibr  OR_Yers   OR_Cryp          OR_Cycl
Tennessee      TN_Camp        TN_List   TN_Salm     TN_Shig   TN_STEC  TN_Vibr  TN_Yers   TN_Cryp          TN_Cycl
United States  US_Camp        US_List   US_Salm     US_Shig   US_STEC  US_Vibr  US_Yers   US_Cryp          US_Cycl

To download all images, please see the attached zip file here.

Comparisons of Seasonality Features

These are multi-panel plots for visualizing the annual peak timing and amplitude of each FoodNet reported infection in ten FoodNet-reporting states and the US from 1996-2017 (Across Locations) and for visualizing all reported infections within each state and the US (Across Diseases). The top-left panel shows the average peak timing per infection/location while the bottom-right panel shows the average amplitude per infection/location. The bottom-left panel shows a combined scatterplot between peak timing and amplitude estimates. Background colours indicate the four seasons defined by solar solstices and equinoxes: winter (blue), spring (green), summer (yellow), and autumn (orange).

Comparison        Campylobacter  Listeria  Salmonella  Shigella  STEC      Vibrio    Yersinia  Cryptosporidium  Cyclospora
Across Locations  LOC_Camp       LOC_List  LOC_Salm    LOC_Shig  LOC_Stec  LOC_Vibr  LOC_Yers  LOC_Cryp         LOC_Cycl

To download all images, please see the attached zip file here.

Comparison       CA      CO      CT      GA      MD      MN      NM      NY      OR      TN      US
Across Diseases  DIS_CA  DIS_CO  DIS_CT  DIS_GA  DIS_MD  DIS_MN  DIS_NM  DIS_NY  DIS_OR  DIS_TN  DIS_US

To download all images, please see the attached zip file here.

Comparisons of Trends and Seasonal Signatures

These are multi-panel plots for comparing seasonal signatures and yearly rates of each FoodNet reported infection in ten FoodNet-reporting states and the US from 1996-2017 (Across Locations) and for visualizing all reported infections within each state and the US (Across Diseases). The top panel provides a box plot of monthly rates for each month of the year for the location/infection. The calendar heatmap uses shared horizontal axes to show the distribution of monthly rates for each year and each location/infection. Darker hues indicate higher rates, while empty cells with blue borders indicate years when FoodNet surveillance was not conducted. The right panel provides a rotated bar graph of yearly rates.

Comparison        Campylobacter  Listeria  Salmonella  Shigella  STEC      Vibrio    Yersinia  Cryptosporidium  Cyclospora
Across Locations  LOC_Camp       LOC_List  LOC_Salm    LOC_Shig  LOC_Stec  LOC_Vibr  LOC_Yers  LOC_Cryp         LOC_Cycl

To download all images, please see the attached zip file here.

Comparison       CA      CO      CT      GA      MD      MN      NM      NY      OR      TN      US
Across Diseases  DIS_CA  DIS_CO  DIS_CT  DIS_GA  DIS_MD  DIS_MN  DIS_NM  DIS_NY  DIS_OR  DIS_TN  DIS_US

To download all images, please see the attached zip file here.

Code for Data Analysis and Visualization Creation

Stata code for calculating rates, trends, and seasonality features is available here.
R code for generating visualizations is available here.

References

1. Centers for Disease Control and Prevention (CDC). FoodNet Fast: Pathogen Surveillance Tool. Atlanta, Georgia: U.S. Department of Health and Human Services http://wwwn.cdc.gov/foodnetfast (2020).

2. Centers for Disease Control and Prevention (CDC). FoodNet Surveillance. U.S. Department of Health and Human Services https://www.cdc.gov/foodnet/surveillance.html (2020).

3. United States Census Bureau. 1990s: County Tables. United States Department of Commerce https://www.census.gov/data/tables/time-series/demo/popest/1990s-county.html#statelist_6 (2016).

4. United States Census Bureau. County Intercensal Tables: 2000-2010. United States Department of Commerce https://www.census.gov/content/census/en/data/tables/time-series/demo/popest/intercensal-2000-2010-counties.html (2017).

5. United States Census Bureau. Annual Estimates of the Resident Population: April 1, 2010 to July 1, 2017. United States Department of Commerce https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=PEP_2017_PEPANNRES&prodType=table (2018).

6. Naumova, E. N. & MacNeill, I.B. Seasonality assessment for biosurveillance systems in Advances in Statistical Methods for the Health Sciences (eds. Mesbah, M., Molenberghs, G., Balakrishnan, N.) 437-450 (Birkhäuser, 2007).

7. Falconi, T. A., Cruz, M. S., & Naumova, E. N. The shift in seasonality of legionellosis in the USA. Epidemiology & Infection 146, 1824-1833 (2018).

8. Simpson, R.B., Zhou, B., & Naumova, E.N. Seasonal Synchronization of Foodborne Outbreaks. Nature Scientific Reports. In Submission.

9. Chui, K. K., Wenger, J. B., Cohen, S. A., & Naumova, E. N. Visual analytics for epidemiologists: understanding the interactions between age, time, and disease with multi-panel graphs. PLoS ONE 6(2), e14683 (2011).