Summary
Whole genome sequencing (WGS) is a powerful “genetic fingerprinting” tool for foodborne pathogens. Routine use of WGS to “fingerprint” Listeria monocytogenes from humans and foods has considerably increased the number of disease outbreaks detected and traced back to specific foods, including produce. WGS is also used to identify instances where a specific type of bacteria appears to survive (“persist”) in a given food processing facility, indicating a particular food safety risk. However, our ability to interpret WGS data is hampered by (i) a lack of WGS data for bacteria from sources other than humans and foods and (ii) the need to better define how likely it is that closely related bacteria are found in different locations. To address these challenges, we will collect bacteria representing L. monocytogenes and other Listeria spp. from environmental sources throughout the US and perform whole genome sequencing on these bacteria. Comprehensive comparisons among these bacterial isolates, along with isolates from produce-associated environments and human cases globally, will be used to define similarity cut-offs that identify closely related bacteria and to estimate the likelihood of closely related bacteria occurring in different locations. This will facilitate more accurate use of these tools to address produce food safety issues.
Technical Abstract
Whole genome sequencing (WGS) of Listeria monocytogenes (LM) has been used for routine human foodborne disease surveillance in the US since 2013. Regulatory agencies also routinely use WGS to characterize LM isolates obtained from foods, food processing facilities, and food-associated environments. Despite considerable WGS work on human isolates, there are currently limited data on the distribution and diversity of LM and Listeria WGS-based subtypes in non-food-associated environments. Interpretation of WGS data hence does not have the benefit of comparison data that could be used to assess the likelihood of closely related LM and Listeria spp. being isolated from different sources. Consequently, we propose that an improved understanding of the distribution and ecology of LM and Listeria spp. WGS-based subtypes across the US is needed to optimize the use of WGS for source tracking and assessment of LM and Listeria spp. persistence. This is of particular importance for the produce industry, where pathogen contamination can occur from a diversity of sources, including surface water as well as natural and agricultural environments. We thus propose the following objectives: Obj. 1: Develop a sampling plan for collection, across the US, of at least 1,500 soil samples focusing on non-agricultural and natural environments, followed by testing of samples for L. monocytogenes and Listeria spp. Obj. 2: Perform whole genome sequencing (WGS) of the L. monocytogenes and Listeria spp. isolates obtained through Obj. 1 and analyze data to assess associations between WGS sequence type and geographical origin. Obj. 3: Perform WGS of Listeria spp. isolated from throughout the produce chain (for example from irrigation water, packinghouses, processing facilities, and produce environments in retail stores); isolates will be obtained from pre-existing isolate collections, and through concurrent sampling efforts that are part of ongoing, funded studies. Obj. 4: Perform a comprehensive analysis of LM and Listeria spp. WGS data to provide information on the number of SNP or allelic differences that provide an appropriate cut-off to identify isolates with a likely epidemiological link. The proposed work will provide (i) baseline data on the frequency of LM and Listeria spp. detection across environmental sources in the US (providing critically needed baseline data that will allow growers to interpret Listeria detection events), (ii) initial US-wide data on the effects of geo-spatial, soil, and meteorological parameters on the likelihood of LM and Listeria spp. detection, (iii) data on the distribution of identical or similar LM and Listeria spp. WGS sequence types in different locations across the US, and (iv) produce-relevant data on the number of SNP and allelic differences that likely indicate a recent common ancestor and a likely epidemiological relationship between isolates. Importantly, outcomes (iii) and (iv) will provide critical information that will help the produce industry interpret WGS data; for example, our data will help industry assess how likely it is that isolation of Listeria differing by a given small number of SNPs represents persistence in a given environment versus re-introduction or a chance event. Our findings also will inform future similar work on other pathogens (e.g., Salmonella, STECs).
Research Objectives
1. Develop a sampling plan for collection, across the US, of at least 1,500* soil samples focusing on non-agricultural and natural environments, followed by testing of samples for Listeria monocytogenes and Listeria spp.
2. Perform whole genome sequencing (WGS) of the L. monocytogenes and Listeria spp. isolates obtained through Obj. 1, and analyze data to assess associations between WGS sequence type and geographical origin.
3. Perform WGS of Listeria spp. isolated from throughout the produce chain (for example from irrigation water, packinghouses, processing facilities, and produce environments in retail stores); isolates will be obtained from pre-existing isolate collections, and through concurrent sampling efforts that are part of ongoing, funded studies.
4. Perform a comprehensive analysis of Listeria spp. and L. monocytogenes WGS data to provide information on the number of single nucleotide polymorphism (SNP) or allelic differences that provide an appropriate cut-off to identify isolates with a likely epidemiological link.
(* Revised to 1,000 composite soil samples.)
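To make the idea of a SNP cut-off in Objective 4 concrete, the following is a minimal sketch, not the project's actual pipeline. It counts pairwise SNP differences between aligned sequences and reports pairs at or below a cut-off. The isolate names, the toy sequences, and the cut-off value of 2 are hypothetical and chosen for illustration only; the report does not state a specific cut-off here.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences have an unambiguous,
    differing base. Gaps and ambiguous bases (e.g. '-', 'N') are ignored."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def closely_related_pairs(alignment: dict[str, str], cutoff: int):
    """Return (name_a, name_b, distance) for every isolate pair whose
    SNP distance is at or below the cut-off, in sorted name order."""
    pairs = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(alignment.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= cutoff:
            pairs.append((name_a, name_b, distance))
    return pairs


# Hypothetical toy alignment; real core-genome alignments span millions of sites.
toy_alignment = {
    "soil_A": "ACGTACGTAC",
    "soil_B": "ACGTACGTAT",
    "produce_C": "TCGAACGGAT",
}

# Hypothetical cut-off of 2 SNPs, for illustration only.
print(closely_related_pairs(toy_alignment, cutoff=2))
```

In practice the same comparison could be run on allelic (cgMLST) profiles instead of SNPs by replacing `snp_distance` with a count of differing loci.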
Findings & Recommendations
Key findings
• The diversity of Listeria in soil was very high; 20 species were identified, including all 6 known sensu stricto species, 8 known sensu lato species, 5 new sensu stricto species, and 1 new sensu lato species.
• L. monocytogenes was the most prevalent species, with a prevalence of 11.75%, followed by L. welshimeri and L. seeligeri. L. booriae was detected in soil for the first time and ranked fourth in prevalence. Three LM lineages (I, II, and III) were detected in soil, with lineage III being the most prevalent.
• Listeria in soil across the contiguous US showed a distribution significantly delineated by longitude and elevation; this distribution was driven by a combination of soil, climate, and land-use variables; sodium, moisture, and molybdenum were identified as the top three important drivers.
• The ecological niche, as defined by soil properties, climate, and land use, varied among LM lineages and Listeria spp.
• LM in soil were genetically very diverse; closely related LM were more likely to be found at nearby locations.
• A few LM lineage I and II isolates were closely related to clinical isolates and, based on the cgMLST results, may be epidemiologically linked.
• Based on core SNP and hqSNP analyses, Listeria spp. from non-food-associated and food-associated environments were not very closely related, with the exception of L. seeligeri. Soil in non-food-associated environments may not be a common source of Listeria spp. contamination in the produce industry, except for L. seeligeri.
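The finding that closely related LM were more likely to be found at nearby locations can be illustrated with a simple comparison of geographic and genetic distance. The sketch below is an assumption-laden illustration, not the analysis used in the project. It computes great-circle distance with the haversine formula and compares the fraction of closely related isolate pairs among near and far pairs. The 50 km threshold, the cut-off of 10 allelic differences, and all pair values are hypothetical.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def related_fraction_by_distance(pairs, km_threshold: float, allele_cutoff: int):
    """Given (km_apart, allele_differences) tuples, return the fraction of
    closely related pairs among near pairs and among far pairs."""
    near = [diff for km, diff in pairs if km <= km_threshold]
    far = [diff for km, diff in pairs if km > km_threshold]

    def fraction(diffs):
        # NaN signals that the group contained no pairs.
        return sum(d <= allele_cutoff for d in diffs) / len(diffs) if diffs else float("nan")

    return fraction(near), fraction(far)


# Hypothetical isolate pairs: (distance between sampling sites in km, allelic differences).
toy_pairs = [(5.0, 3), (12.0, 8), (40.0, 150), (900.0, 420), (1500.0, 6), (2500.0, 800)]

near_fraction, far_fraction = related_fraction_by_distance(
    toy_pairs, km_threshold=50.0, allele_cutoff=10
)
print(f"near: {near_fraction:.2f}, far: {far_fraction:.2f}")
```

A higher fraction among near pairs than among far pairs would be consistent with the pattern described above; a formal analysis would also need to account for sampling density and population structure.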