Technical details

Exploratory analysis, data cleaning, plots

We explored the data to assess the completeness and consistency of the dataset across its variables. The EU/non-EU columns were validated and corrected using the latest EU-membership information. Entries for hazard types, allergens, and foreign bodies were cleaned and collapsed into concise groups so that the categories could be compared. Data cleaning was done in R (tidyverse) and Python (pandas/numpy). The final plots were produced with the matplotlib and seaborn libraries.
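The cleaning steps described above can be illustrated with a minimal pandas sketch. This is not the project's actual code: the column names (`country`, `hazard`, `is_eu`), the `HAZARD_GROUPS` mapping, and the `clean_notifications` helper are hypothetical, and the membership list assumes the 27 member states as of 2020.

```python
import pandas as pd

# Assumption: EU-27 membership as of 2020 (the text only says "latest").
EU_MEMBERS = {
    "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia",
    "Denmark", "Estonia", "Finland", "France", "Germany", "Greece",
    "Hungary", "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg",
    "Malta", "Netherlands", "Poland", "Portugal", "Romania", "Slovakia",
    "Slovenia", "Spain", "Sweden",
}

# Hypothetical mapping from raw hazard labels to concise comparison groups.
HAZARD_GROUPS = {
    "salmonella": "pathogenic micro-organisms",
    "listeria monocytogenes": "pathogenic micro-organisms",
    "glass fragments": "foreign bodies",
    "metal pieces": "foreign bodies",
    "undeclared milk": "allergens",
}


def clean_notifications(df: pd.DataFrame) -> pd.DataFrame:
    """Recompute the EU flag and collapse raw hazard labels into groups."""
    out = df.copy()
    # Overwrite the EU flag from the membership list instead of trusting the input.
    out["is_eu"] = out["country"].str.strip().isin(EU_MEMBERS)
    # Normalise whitespace and case before mapping; unmapped labels become "other".
    key = out["hazard"].str.strip().str.lower()
    out["hazard_group"] = key.map(HAZARD_GROUPS).fillna("other")
    return out


# Illustrative rows only, not taken from the real dataset.
raw = pd.DataFrame({
    "country": ["Germany", "United Kingdom", " Poland "],
    "hazard": ["Salmonella", "Glass fragments", "unknown residue"],
    "is_eu": [True, True, False],  # deliberately inconsistent flags
})
cleaned = clean_notifications(raw)
```

Recomputing the flag from a single membership set, rather than patching individual rows, keeps the correction reproducible if the membership list changes.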

Maps

To produce the maps, we used KeplerGL, driven from Python in Jupyter notebooks. The map coordinates were generated in two ways: 1) for the map arcs, each country is represented by a single location, its centroid; 2) for the density map, random coordinates were sampled from within the polygon of each country's shape. Because the polygons describe only country outlines and do not account for population density, some sampled coordinates fall in uninhabited areas (e.g. northern Canada) and should not be interpreted as literal locations. Data points without a clear origin or notifying country were excluded. The maps are available as GIFs, interactive maps (see 3D view), and static images. The following R packages were used: ‘sp’, ‘rworldmap’, ‘maptools’.
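The idea of sampling random coordinates from a country polygon can be sketched with simple rejection sampling: draw points uniformly in the polygon's bounding box and keep those that fall inside. This is an illustration in pure Python, not the method used for the maps (which relied on the R packages listed above); the function names and the L-shaped test polygon are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_points(polygon, n_points, seed=None):
    """Rejection-sample n_points uniformly (in coordinate space) inside polygon."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    points = []
    while len(points) < n_points:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Hypothetical L-shaped "country" given as (longitude, latitude) vertices.
country = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
points = sample_points(country, 100, seed=42)
```

Note that sampling uniformly in longitude/latitude is not uniform by ground area, since a degree of longitude shrinks toward the poles. For a purely visual density layer this may be acceptable, but it is one more reason not to read individual points literally.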

