Hidden in Plain Sight: How Scientists at Boston University are Using Math to Forecast Tick Populations

Every year as the tail end of summer approaches, an invisible army replaces its fallen warriors. Plump with the blood of their foes, these weary guerrillas achieve a victory at once decisive and Pyrrhic—having succeeded in intercepting enemy logistics, they now find themselves frozen in place by the weight of their spoils. With all other options exhausted, only one remains: accept their fate with grit, and convert their gains into the next generation of recruits. Wrought from this final sacrifice, this new cohort in turn assumes the same positions its predecessors once did—lying in wait for a passing convoy, ready to ambush at the first sign of exposure.

Though this army’s troops have no name for us, we have one for them: ticks. Indeed, far from hidden in impact, these stealthy arthropods—and, by extension, the diseases they carry—are an endemic feature of American life, with the share of counties nationwide reporting persistent contact with them estimated to approach 50%. What’s more, for many species this figure isn’t a steady state, but simply one point along a robust—and accelerating—upward trajectory. In the case of the black-legged tick (or “deer tick,” so named after its host organism), for instance, its present range spans more than twice as many counties as it did 25 years ago. Pathogens have followed suit, with annual estimated diagnoses of Lyme disease having likewise more than doubled since 2004 to nearly 500,000 today.

What accounts for the explosive rise of these otherwise diminutive parasites? Theories abound. From urban sprawl and climate change to global air travel and the overregulation of deer hunting, there are a truly dizzying number of factors one could cite, each differing in terms of scale, species, and explanatory value. Yet perhaps more directly instructive than the long-term causes of the ongoing tick invasion are its short-term contours—when and where are ticks most likely to emerge from dormancy? How long do they remain active after they do? And how many can we expect to encounter in a given week or area?

Enter the tick map. Once regarded as little more than a pipe dream, this forecasting tool—in essence, a user-friendly atlas of predicted tick exposure for each week of the year—is now the subject of an ongoing NASA-funded research project at the BU Department of Earth & Environment. Led by Ecological Forecasting Laboratory director Michael Dietze, the project’s logic is simple enough to grasp in principle. First, using tick population data from the past, try to understand how their survival and reproduction are affected by a series of environmental inputs—temperature, humidity, precipitation, host availability, etc. Next, use these relationships to build a model that predicts (or, in jargonese, retrodicts) how tick populations have changed over time and compare that estimate to actual tick numbers as recorded by people in the field. Finally, study the gap between estimate and reality to make a newer, better prediction of past tick populations. Rinse and repeat until your model performs satisfactorily, and you can begin to (attempt to) retool it for forecasting tick abundance a year from now.
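The fit, compare, and refine loop described above can be sketched in a few lines of Python. This is a toy under loud assumptions: the weekly counts are invented, the "model" is a single constant survival rate, and the names `hindcast` and `rmse` are hypothetical helpers rather than anything from the project's own code.

```python
import math

def hindcast(initial_count, survival, n_weeks):
    """Project a toy tick population forward: each week a fixed fraction survives."""
    counts = [initial_count]
    for _ in range(n_weeks - 1):
        counts.append(counts[-1] * survival)
    return counts

def rmse(predicted, observed):
    """Root-mean-square gap between the hindcast and the field record."""
    squared_gaps = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_gaps) / len(observed))

# Invented weekly counts for illustration only (not real field data).
observed = [100.0, 81.0, 63.0, 52.0, 40.0]

# "Rinse and repeat": try candidate survival rates and keep the one with the smallest gap.
candidates = [s / 100 for s in range(50, 100)]
best_survival = min(
    candidates,
    key=lambda s: rmse(hindcast(observed[0], s, len(observed)), observed),
)
best_error = rmse(hindcast(observed[0], best_survival, len(observed)), observed)
```

A real forecast would swap the brute-force search for a statistical fitting procedure and the constant survival rate for one that responds to the environment, but the logic of scoring a retrodiction against the field record is the same.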

In principle, of course, this isn’t terribly hard to grasp. In practice, however, complications begin to appear, and not simply appear, but multiply. The principal reason for this is uncertainty: we can’t be assured that the input data or model parameters we have are 100% accurate, so any predictions based on them are bound to capture reality imperfectly. Crucially, however, not all uncertainties are equally uncertain. Depending both on the quality of the data behind it and on how the model transforms it (e.g. temperature², log[survival], etc.), a given variable or parameter can see its contribution to the model’s uncertainty either amplified or dampened.
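To make the amplify-or-dampen point concrete, here is a minimal simulation, assuming a hypothetical temperature reading of 20 degrees with a 2-degree standard error. It pushes the same noisy input through a square and through a logarithm (applied to temperature here purely so the two results can be compared) and measures how the relative spread changes.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

def relative_spread(values):
    """Standard deviation divided by the absolute mean (coefficient of variation)."""
    return statistics.stdev(values) / abs(statistics.mean(values))

# Hypothetical input: 20 degrees with a 2-degree standard error (about 10% relative error).
temps = [random.gauss(20.0, 2.0) for _ in range(20_000)]

squared = [t ** 2 for t in temps]      # the model uses temperature squared
logged = [math.log(t) for t in temps]  # the model uses the log of the input

cv_input = relative_spread(temps)
cv_squared = relative_spread(squared)  # larger than cv_input: squaring amplifies
cv_logged = relative_spread(logged)    # smaller than cv_input: the log dampens
```

Squaring roughly doubles the relative error of the input, while the logarithm compresses it, which is why the same measurement can matter a great deal in one part of a model and hardly at all in another.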

An important tradeoff this implies is that while fine-tuning a model can improve its performance, it can also—indeed, must—reallocate its uncertainty across sources. This fact was recently confirmed by Dr. John Foster, a former PhD student of Dietze’s who built six models predicting tick abundance in the northeastern U.S.—among them, a control that assumes tick survival and reproduction stay constant, and another that assumes they vary with the weather and mouse populations. Although model performance improved significantly during the switch from control to variant, Foster also found that the proportions of model uncertainty contributed by initial tick counts and the parameters rose markedly (2% → 64% and 1.4% → 16%, respectively) while cratering for process uncertainty—in essence, variation associated with changes the model isn’t modeling (97% → 17%). Intuitive though such a dynamic might seem in this case, it can appear in less obvious ways as well. As Dietze notes of one, “The tick forecast is robust to uncertainty about mice if you use the actual observed weather [i.e. treat it as certain], but becomes sensitive to uncertainties about mice when you introduce the uncertainty from the weather forecast.”
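The reallocation can be illustrated with a toy one-step forecast, N1 = Φ·N0 + noise. Everything here is an assumption made for the sketch: the numbers are invented, the function names are hypothetical, and switching sources on one at a time is only one simple way of partitioning variance, not necessarily the method Foster used.

```python
import random
import statistics

def forecast_variance(n0_sd, phi_sd, process_sd, draws=20_000, seed=1):
    """Variance of a one-step toy forecast N1 = phi * N0 + process noise."""
    rng = random.Random(seed)
    forecasts = []
    for _ in range(draws):
        n0 = rng.gauss(100.0, n0_sd)        # initial tick count (hypothetical)
        phi = rng.gauss(0.8, phi_sd)        # survival parameter (hypothetical)
        noise = rng.gauss(0.0, process_sd)  # everything the model leaves out
        forecasts.append(phi * n0 + noise)
    return statistics.variance(forecasts)

def uncertainty_shares(n0_sd, phi_sd, process_sd):
    """Switch on one source at a time and report its share of the summed variance."""
    parts = {
        "initial conditions": forecast_variance(n0_sd, 0.0, 0.0),
        "parameters": forecast_variance(0.0, phi_sd, 0.0),
        "process": forecast_variance(0.0, 0.0, process_sd),
    }
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}

# A simple model leaves a lot unexplained; a richer one leaves much less.
shares_simple = uncertainty_shares(n0_sd=5.0, phi_sd=0.02, process_sd=30.0)
shares_richer = uncertainty_shares(n0_sd=5.0, phi_sd=0.02, process_sd=3.0)
```

Note that the uncertainty in the initial counts and the parameters is identical in both runs. Their shares rise in the richer model only because the process term shrinks and the shares must still sum to one.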

Besides the hazards of error propagation, moreover, there exists another, altogether far more straightforward barrier to forecasting tick populations: data scarcity. Indeed, it wasn’t until the early 2000s that tick populations started to be monitored systematically by the federal government, with the latter’s support in turn taking the form of grants to regional nonprofits rather than a nationwide surveillance program. Consequently, to the extent that records of American ticks do exist, they tend to be fragmented and therefore difficult to generalize beyond the sites they originate from. As before, this constraint was borne out in the first run of forecasts. When Foster attempted to extend his model from the area it was trained on (the Cary Institute of Ecosystem Studies, which is housed in a Northeastern deciduous forest) to several field sites run by the National Ecological Observatory Network (NEON), he found that its performance tended to improve with a site’s similarity to Cary. Thus while NEON’s highly deciduous HARV and TREE sites frequently vindicated the model framework (an achievement which should most certainly not be discounted), for most others the results were decidedly mixed, with the model either over- or underestimating tick populations depending on the site. That said, Dietze notes, “as the model was exposed to, and learned from, site-specific information, its forecast uncertainty fell rapidly over time.”

Needless to say, the above two considerations (and much else besides) pose formidable challenges to any would-be tick cartographer. Yet where there’s a will, there’s a way. In particular, to mitigate arguably the most troublesome thorn in every forecaster’s side—limited data—Dietze’s team have landed on a naturally unnatural remedy: satellites. In contrast to humans, these machines possess the unique capability of inspecting every corner of the Earth’s surface on a more or less continuous basis. Where this becomes helpful isn’t so much the raw tick data—whose primary upgrade has been extension from Cary to 10 of NEON’s 46 sites, with further coverage still to come—as their environmental context: if a legitimate relationship can be found to obtain between tick counts (which are dispersed) and some satellite-derived data product (most of which are universal), then it suddenly becomes possible to scale relationships otherwise constrained by the supply of high-quality tick data to the rest of the country.

Although this phase of the project remains in its infancy—with a genuinely usable forecast likely still years away—steps are already being taken to turn its vision into a reality. In my own work as an undergrad RA, for example, I have downloaded satellite-derived vegetation indices and analyzed them for any potential association with tick abundance. Unlike Cary’s weather station data—which while precise is highly site-specific—these indices are generated in the exact same way for every pixel of land they cover. Though somewhat modest thus far, the results of this exercise could (in theory) lay the basis for a spatially exhaustive tick map, extending to the vast majority of counties in America that don’t possess thorough, up-to-date tick records in addition to the minority that do.

Another related way in which satellite data could prove useful is indexing heterogeneity. Being parasites, ticks tend to thrive in areas that contain a mix of habitats: those suited for themselves, and those suited for their host organisms. Though the nature of this relationship is far from invariant—most tick species rely on high levels of humidity so as not to die from desiccation—it isn’t arbitrary either. During one regression analysis of land cover diversity and nymph abundance, for example, it was found that variation in the former explained up to 20% of variation in the latter—a notable result for an ecological system, where even strong signals can be difficult to pick up on amid substantial statistical noise.
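For readers unfamiliar with what "explained" means here, the quantity in question is the R² of a straight-line fit. The sketch below computes it from scratch; the diversity index and nymph counts are invented purely for illustration and are not the values from the analysis described above.

```python
def r_squared(x, y):
    """Share of the variation in y that a fitted straight line in x accounts for."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line has done its best.
    residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Variation present before any line was fitted.
    total = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - residual / total

# Invented values: a land cover diversity index and nymph counts for six plots.
diversity = [0.2, 0.5, 0.8, 1.1, 1.4, 1.7]
nymphs = [12, 30, 9, 41, 18, 35]

explained = r_squared(diversity, nymphs)
```

A value of 1 would mean the line passes through every point, while a value near 0 would mean diversity tells us essentially nothing about nymph counts.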

Yet while necessary, more data alone isn’t sufficient. As noted earlier, it is only once relationships between these data are identified (in particular, between temperature, precipitation, and vegetation as inputs and tick counts as output) that having more of them starts to be useful. Unlike in my own work, moreover, these relationships have to be estimated as precisely as possible (or “parameterized,” in the modeler’s tongue) to become fit for purpose. Where to begin?

Dietze’s team have an answer. Rather than model tick abundance for Cary and the 10+ NEON sites individually, they propose to let each site learn from the others. What does this mean in practice? In the words of fellow project member and postdoctoral associate Dr. Emily Beasley, “we use math to tell the model that tick populations at each location behave similarly, but not identically.” To accomplish this, they split the model into two layers: one estimating a parameter (say, tick survival) for each site, and another estimating the across-site variation in that parameter. The aim of this procedure—known in the literature as “borrowing strength”—is to create a new series of estimates of tick survival, one that leans heavily on data-rich sites while using that parameter’s shared distribution to constrain estimates for data-poor ones.
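One common way to write these two layers down, offered here as an illustrative assumption rather than the team's actual specification, is to give every site s its own survival parameter and let those parameters share a single distribution. The logit transform, the normal distribution, and the symbols μ and σ are all choices made for this sketch:

```latex
\begin{aligned}
\text{Layer 1 (each site):}\quad & \operatorname{logit}(\Phi_s) \sim \mathcal{N}(\mu, \sigma^2), \qquad s = 1, \dots, S \\
\text{Layer 2 (across sites):}\quad & \mu \sim p(\mu), \qquad \sigma \sim p(\sigma)
\end{aligned}
```

Here μ is the typical survival across sites (on the logit scale) and σ is how far individual sites are allowed to stray from it. A small σ forces sites to behave nearly identically; a large σ lets each go its own way.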

Of course, it goes without saying that borrowing strength—like most everything else demanded by the tick map—is no simple feat. Executing it properly requires striking a balance between the preservation of variation across sites and the reduction of statistical noise—put simply, we want to let data-poor sites learn from data-rich sites without forgetting too much about themselves. To accomplish this, it’s worth understanding how the benefit of each ounce of strength borrowed—or each hike in the influence of the “prior,” in the language of Bayesian inference—compares to the cost it exacts on the data. As Beasley puts it:

“One can use uninformative priors, which basically just eliminate impossible scenarios (i.e. you can’t have negative tick densities), all the way to strongly informative priors that take a lot of data to overcome the assumptions within them (though it’s still possible). We’re using moderately informative priors that are narrow enough to exclude extremely improbable outcomes while still allowing sites to vary.”

Consider tick survival (Φ) again. Assuming we’d like our model not to violate common sense, a good first step would be to prevent it from attaining values that are biologically impossible (0 ≤ % of ticks that survive ≤ 100). But what next? Since the benefit we’re seeking in exchange for less across-site variation is the elimination of unlikely (as opposed to strictly impossible) outcomes, we need to turn to the data to learn what “unlikely” actually means. This is where the distribution comes in. By relating all possible values of Φ to how likely each one is to show up, it tells us how much across-site variation (σ) is supported by the data. That variation in turn regulates the strength of the prior, whose influence tends to rise for data-poor sites while waning for data-rich ones.
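A minimal numerical sketch shows how the prior's pull depends on how much data a site has. It assumes the simplest textbook case (normal data, normal prior, known variances, survival treated on its raw scale), and every number and name in it is hypothetical; the team's actual model is certainly richer.

```python
def pooled_estimate(site_mean, n_obs, shared_mean, across_site_sd, within_site_sd):
    """Blend a site's own average with the shared mean (normal-normal partial pooling)."""
    data_precision = n_obs / within_site_sd ** 2  # grows with the amount of site data
    prior_precision = 1.0 / across_site_sd ** 2   # grows as sigma shrinks
    weight_on_data = data_precision / (data_precision + prior_precision)
    estimate = weight_on_data * site_mean + (1.0 - weight_on_data) * shared_mean
    return estimate, weight_on_data

# Both hypothetical sites observe a survival of 0.60 against a shared mean of 0.80.
rich_estimate, rich_weight = pooled_estimate(
    0.60, n_obs=200, shared_mean=0.80, across_site_sd=0.05, within_site_sd=0.20
)
poor_estimate, poor_weight = pooled_estimate(
    0.60, n_obs=4, shared_mean=0.80, across_site_sd=0.05, within_site_sd=0.20
)
```

With 200 observations the site's estimate stays close to its own 0.60; with only 4, it is pulled most of the way toward the shared 0.80. Widening `across_site_sd` (a larger σ) weakens that pull for both.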

To see how this ties back into our earlier discussion of the merits of satellite data, we must first obtain a clearer picture of how the model estimates its parameters in the first place. Regarding a particular site’s Φ, we don’t simply allow its value to swing back and forth until it yields an adequate fit; instead, we constrain its movement by describing it in relation to our inputs:

Φ(site) = β₀ + β₁·Temp + β₂·humid + β₃·LAI…

Here LAI stands for leaf area index (a commonly used biophysical parameter which satellites can estimate), and each β represents another parameter used to estimate the one we’re interested in. These coefficients in turn don’t simply keep walking until they find home, but are instead led there by the sites, with the pull of each increasing with how much data it has to offer (i.e. how much strength it’s able to lend). Iterated liberally, this gives us a parameterization of Φ (β₀, β₁, β₂, …) that tells us, in effect: if an area (not just a site anymore) is this hot, humid, densely vegetated, etc., then what can we expect Φ to be?
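Once the coefficients are in hand, predicting Φ for a new area is a matter of plugging in its conditions. The sketch below assumes an inverse-logit link so that the result stays between 0 and 1 (the equation above is written without one, so this is an addition for illustration), and the coefficient values are placeholders chosen only so the example runs, not fitted estimates.

```python
import math

def predict_survival(temp_c, humidity_pct, lai, coefs):
    """Map environmental inputs to a survival probability via an inverse-logit link."""
    linear = (
        coefs["b0"]
        + coefs["b1"] * temp_c
        + coefs["b2"] * humidity_pct
        + coefs["b3"] * lai
    )
    return 1.0 / (1.0 + math.exp(-linear))  # squashes any real number into (0, 1)

# Placeholder coefficients, not fitted values.
coefs = {"b0": -4.0, "b1": 0.05, "b2": 0.04, "b3": 0.30}

# Two hypothetical areas at the same temperature but with different humidity and canopy.
humid_forest = predict_survival(temp_c=22.0, humidity_pct=85.0, lai=5.0, coefs=coefs)
dry_field = predict_survival(temp_c=22.0, humidity_pct=40.0, lai=1.0, coefs=coefs)
```

Because LAI is available for every pixel, the same function can be evaluated anywhere a satellite has looked, not just where ticks have been counted.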

As noted before, the value added of satellite data primarily comes down to its spatial continuity; by cluing us in to vegetation dynamics at the scale of an individual pixel (which, depending on the product in question, can range from 500 m × 500 m down to 80 cm × 80 cm), it can give us an idea of how ecosystems change over relatively short distances. For ticks this often matters quite a bit, owing both to their aforementioned fondness for habitat diversity and to the capacity of vegetation to locally regulate the climate in ways that help (or harm) their chances. Returning to our equation, what this implies is that although temperature and humidity play significant roles in determining tick abundance at large scales, the data we have to hand for quantifying them (Temp, humid) may not faithfully represent conditions on the ground, a blind spot which we can correct for using vegetation (LAI). This in turn enables a more spatially precise rendering of tick populations, with the upshot that we can now convey the risk of humans coming into contact with them on the scales that truly matter (e.g. census tract, city, county).

That’s the synoptic view, at least. In truth, a positively bewildering amount of technical know-how goes into building a tick map that hasn’t even been covered in these pages: actually specifying how sites share information in mathematical terms, relating each stage of a tick’s life (larva, nymph, adult) to the other two, and validating predictions in unsampled locations, to name a few. It is thus an undertaking with no small barrier to entry. Nonetheless, so long as its assembly remains in the capable hands of the BU Ecological Forecasting Laboratory, Americans can rest assured that it will arrive in due course, and just as surely as the pests whose advances it will one day track.

Works Referenced

Allen et al. “Evidence for the long-distance transport of ticks and tick-borne pathogens by human travellers to Texas, USA,” Journal of Travel Medicine, Vol. 32, Issue 4, 18 Apr. 2025, https://academic.oup.com/jtm/article/32/4/taaf032/8115886

Brown, J. “Tick-Borne Diseases Risk Increasing Due to Climate Change: Here’s What You Need to Know,” BU Today, 14 May 2025, https://www.bu.edu/articles/2025/tick-borne-diseases-risk-increase/

Beck, M. “Process and observation uncertainty explained with R,” R is my Friend, 31 Mar. 2014, https://beckmw.wordpress.com/2014/03/31/process-and-observation-uncertainty-explained-with-r/

Caldwell, J. and Vahidsafa, A. “Propagation of Error,” LibreTexts – Chemistry, https://chem.libretexts.org/Bookshelves/Analytical_Chemistry/Supplemental_Modules_%28Analytical_Chemistry%29/Quantifying_Nature/Significant_Digits/Propagation_of_Error

Eschner, K. “The Quest for a ‘Tick Map,’” Scientific American, 7 Jul. 2022, https://www.scientificamerican.com/article/the-quest-for-a-lsquo-tick-map-rsquo/

Foster et al. “A modified matrix model captures the population dynamics for the primary vector of Lyme disease in North America,” Ecosphere, Vol. 15, Issue 10, 15 Oct. 2024, https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecs2.70022

De Frenne et al. “Ten practical guidelines for microclimate research in terrestrial ecosystems,” Methods in Ecology and Evolution, Vol. 16, Issue 2, 16 Dec. 2024, https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14476

González et al. “Seasonal Dynamics of Tick Species in the Ecotone of Parks and Recreational Areas in Middlesex County (New Jersey, USA),” Insects, Vol. 14, Issue 3, 5 Mar. 2023, https://pmc.ncbi.nlm.nih.gov/articles/PMC10057079/

Guo et al. “Influence of urban expansion on Lyme disease risk: A case study in the U.S. I-95 Northeastern corridor,” Cities, Vol. 125, Jun. 2022, https://www.sciencedirect.com/science/article/pii/S0264275122000725

Jones, B. “Here’s where dangerous ticks are spreading across the US — and what to do about them,” Vox, 27 May 2022, https://www.vox.com/22567258/ticks-spreading-lyme-disease-deer-mice-reforestation-climate-change

Telford, S. “Deer Reduction Is a Cornerstone of Integrated Deer Tick Management,” Journal of Integrated Pest Management, Vol. 8, Issue 1, 27 Sep. 2017, https://academic.oup.com/jipm/article/8/1/25/4210016

Welch, A. “Lyme disease-carrying ticks spread to half of U.S. counties,” CBS News, 18 Jan. 2016, https://www.cbsnews.com/news/lyme-disease-carrying-ticks-spread-to-half-of-u-s-counties/

Winny, A. “Tickborne Diseases Are on the Rise—Here’s What To Know,” Bloomberg School of Public Health, 21 Jun. 2023, https://publichealth.jhu.edu/2023/lyme-disease-isnt-the-only-tickborne-disease-to-watch
