http://www.ij-healthgeographics.com/content/9/1/12

INTERNATIONAL JOURNAL OF HEALTH GEOGRAPHICS

RESEARCH Open Access

Developing GIS-based eastern equine encephalitis vector-host models in Tuskegee, Alabama

Benjamin G Jacob1*, Nathan D Burkett-Cadena2, Jeffrey C Luvall3, Sarah H Parcak4, Christopher JW McClure5, Laura K Estep5, Geoffrey E Hill5, Eddie W Cupp6, Robert J Novak1, Thomas R Unnasch7

Abstract

Background: A site near Tuskegee, Alabama was examined for vector-host activities of eastern equine encephalomyelitis virus (EEEV). Land cover maps of the study site were created in ArcInfo 9.2® from QuickBird data encompassing visible and near-infrared (NIR) band information (0.45 to 0.72 μm) acquired July 15, 2008. Georeferenced mosquito and bird sampling sites, and their associated land cover attributes from the study site, were overlaid onto the satellite data. SAS 9.1.4® was used to explore univariate statistics and to generate regression models using the field and remote-sampled mosquito and bird data. Regression models indicated that Culex erraticus and Northern Cardinals were the most abundant mosquito and bird species, respectively. Spatial linear prediction models were then generated in the Geostatistical Analyst Extension of ArcGIS 9.2®. Additionally, a model of the study site was generated, based on a Digital Elevation Model (DEM), using the ArcScene extension of ArcGIS 9.2®.

Results: For total mosquito count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.041 km, nugget of 6.325 km, lag size of 7.076 km, and range of 31.43 km, using 12 lags. For total adult Cx. erraticus count, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.764 km, nugget of 6.114 km, lag size of 7.472 km, and range of 32.62 km, using 12 lags. For the total bird count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 4.998 km, nugget of 5.413 km, lag size of 7.549 km, and range of 35.27 km, using 12 lags. For the Northern Cardinal count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 6.387 km, nugget of 5.935 km, lag size of 8.549 km, and range of 41.38 km, using 12 lags. Results of the DEM analyses indicated a statistically significant inverse linear relationship between total sampled mosquito data and elevation (R² = -.4262; p < .0001), with a standard deviation (SD) of 10.46, and between total sampled bird data and elevation (R² = -.5111; p < .0001), with a SD of 22.97. DEM statistics also indicated a significant inverse linear relationship between total sampled Cx. erraticus data and elevation (R² = -.4711; p < .0001), with a SD of 11.16, and between the total sampled Northern Cardinal data and elevation (R² = -.5831; p < .0001), with a SD of 11.42.

Conclusion: These data demonstrate that GIS/remote sensing models and spatial statistics can capture space-varying functional relationships between field-sampled mosquito and bird parameters for determining risk for EEEV transmission.

* Correspondence: bjacob@uab.edu
1 School of Medicine, Department of Infectious Diseases, University of Alabama at Birmingham, 845 19th Street South, Birmingham, Alabama, USA, 35294

© 2010 Jacob et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Jacob et al. International Journal of Health Geographics 2010, 9:12

Introduction

Eastern equine encephalitis virus (EEEV) is the most dangerous endemic arbovirus in the United States. Up to 70% of symptomatic cases in humans are fatal [1], and most survivors are permanently debilitated by neurologic sequelae [2]. Besides the endemic and economic burdens to humans, frequent equine cases and sporadic mass game bird die-offs are costly consequences of EEEV transmission [3-5]. Epornitics in wild birds are also dramatic consequences of EEEV [6], such as die-offs of the endangered whooping crane, Grus americana [7]. Except in Florida [8,9], the ecology of EEEV is less


understood in the southeastern United States than in other endemic locations in the region. This disease is endemic in Alabama, with viral activity varying between years. The summer of 2001 was a particularly active year for EEEV, with one human and over 30 veterinary cases in the central and southern regions of the state [10].

The mosquito species Culiseta melanura is generally believed to initiate EEEV transmission to wild birds [11,12]. Passerine birds are the major enzootic reservoirs, and early transmission among the local avifauna is believed to be initiated by ornithophilic species, such as Cs. melanura [11-13]. However, peaks in abundance of Cs. melanura do not correlate directly with peaks in EEEV transmission [14]. Differences in sampled abundance count data suggest that multiple mosquito species are necessary as vectors to account for large epizootics [11]. In addition to Cs. melanura, several other mosquito species are likely involved as bridge vectors for EEEV transmission. These species include Aedes vexans, Coquillettidia perturbans, and Culex erraticus, while Culex peccator, Culex territans and Uranotaenia sapphirina are suspected of circulating EEEV among reptiles and amphibians [15,16]. Of these previously listed species, it is suspected that Cx. erraticus is the most important EEEV bridge vector between birds and mammals in the mid-south, because of frequent virus isolations and its abundance in bottomland swamps, flood plains, permanent standing water, recreation areas near rivers or ponds, and water impoundments in Alabama and throughout the Tennessee Valley [10,17,18]. Understanding the spatial distribution of this habitat-restricted species is valuable for predicting risk of EEEV infection for nearby human populations.

Despite the misnomer "equine," EEEV transmission initiates in the avian cycle. Antibody prevalence in wild birds associated with freshwater swamps in Alabama ranges from 6-85% [19], which suggests that different bird species vary in attractiveness to mosquitoes and in defensive behaviors against mosquito bites [20]. In Macon County, Alabama, avian species overrepresented in mosquito bloodmeals included: Yellow-Crowned Night-Heron, Carolina Chickadee, Great Blue Heron, Northern Mockingbird, and Wild Turkey [21]. Therefore, determining the spatial distribution of common bloodmeal hosts of mosquito vectors is a critical step to predicting early cycles of EEEV transmission.

Predicting foci of EEEV-positive mosquitoes has been difficult, perhaps as a result of movement of human and horse populations and fluctuations in bird populations over the years [9]. The spatio-temporal distribution of arboviral vectors and hosts varies over short distances, based on differences in land cover and meteorological shifts. For example, human cases of West Nile Virus (WNV) and St. Louis Encephalitis (SLE) clustered in urban/suburban areas in Georgia and Alabama [22,23], whereas EEEV transmission was restricted to freshwater swamps in Florida [9]. Compared to other arboviral diseases, EEEV transmission tends to be more spatially isolated [8,9], with the notable exception of the 1989 Atlantic and Gulf coast outbreaks, which caused 196 equine cases and 9 human cases [3]. Evidence for spatial isolation of EEEV foci includes the lack of early warning of transmission with sentinel flocks and very low seroconversions of both flocks (2%) and human populations within EEEV foci (1.7%) [3,8,9,24], suggesting few asymptomatic cases. Therefore, untargeted or random interventions would be excessive and wasteful [25], as EEEV vectors and hosts are not randomly distributed.

Quantification of vector-host interactions, by incorporating high resolution remotely sensed data in GIS, can help predict arbovirus transmission cycles by identifying site-specific environmental predictors [25-32]. For example, in earlier research, Jacob et al. [31] found that land use land cover (LULC) change sites can aid in spatial prediction of human exposure to Culex mosquitoes using GIS-generated models. A LULC classification, based on Landsat-7 ETM+ data acquired in July 2003 and Landsat-5 TM data acquired in July 1991, was compared to the abundance of Culex restuans and Culex pipiens egg rafts in Urbana-Champaign, Illinois. Total LULC change, from 1991 to 2003 in the Urbana-Champaign study site, was relatively low (12.1%). The most frequent LULC category was maintained urban. The urban land cover was further subdivided by degree of tree canopy coverage using QuickBird visible and near infra-red (NIR) data, which revealed that 73.3% of the urban area was in the category classified as high canopy coverage, with 20% of the remotely stratified data categorized as moderate canopy coverage, and 6.7% as low coverage. The remote stratification of the urban land cover revealed that 83.3% of the egg raft distribution was in the high coverage areas [31].

Characteristics of drainage networks and basin physiographic parameters have also been used in hydrologic calculations and land cover modeling of flood and swamp water mosquito abundance, using satellite data [32-36]. The automated generation of drainage networks has become increasingly popular with the use of GIS and the availability of digital elevation models (DEMs). These models account for topographic variability and its control over soil moisture heterogeneity and runoff within a watershed by using flow distance-to-stream grid-based analyses. The advantage of using a flow distance-to-stream algorithm generated in a DEM is that landscape profiles can be evaluated and terrain covariates can be generated, which can estimate relationships between a response variable and other environmental-sampled


variables [35]. Topographic derivatives generated from a DEM can also be calculated at different scales, using the linear interpolation technique built into GIS, which can accurately yield several catchment hydrological variables, including percent surface saturation and total surface runoff, for identification of potential mosquito and avian sampling sites [33].

Vector-borne disease risk can also be modeled with high predictive accuracy by using geostatistical kriging algorithms in GIS. Kriging is equated with spatial optimal linear prediction, where the unknown random-process mean is estimated with the best linear unbiased estimator. Kriging field and remote-sampled mosquito and avian predictor variables requires the use of various geostatistical techniques to interpolate the parameters of a random field (e.g., the elevation, z, of the landscape as a function of the geographic location, at an unsampled location from data at nearby sampled locations) [34]. Stochastic kriging can also be used to generate predictions of abundance and distribution data, which can allow for numerical quantification of uncertainty estimates in arboviral explanatory covariates [31]. Additionally, predicting landscape classes in urban environments can reveal local spatial patterns of the physical and socio-economic factors hypothesized to be associated with arboviral transmission. For example, in northern California, kriging interpolation revealed that Culex tarsalis was the most abundant species in ovitraps near agricultural sites, whereas Cx. pipiens was clustered within residential areas [33].

The dynamics of transmission of any arthropod-borne infection are a complex function of many factors, which may include the intensity of infection in the vertebrate reservoir, the competence of the vector, and the degree of contact of the vector with the infected vertebrate host reservoir [37]. Thus, generating models of EEEV, using field and remote-sampled mosquito and avian data, is essential to understanding the ecology of EEEV and for developing effective means to control outbreaks. GIS/remote sensing and spatial statistics can map interactions between arthropod mosquito vectors and avian amplification host populations, which can aid in spatially targeting high density foci of mosquito and avian sampling sites [31]. Treatments or habitat perturbations should be based on the surveillance of the most productive areas of an ecosystem [25]. Therefore, the objectives of this research were: (a) to generate multiple regression models to determine predictors associated with the sampled mosquito and avian data; (b) to develop spatial linear prediction models of potential avian and mosquito sampled sites; and (c) to construct a DEM to identify terrain covariates associated with sampled mosquito and bird data in Tuskegee, Alabama.

Materials and methods

Study Site

The study site is located in the Tuskegee National Forest in Macon County, Alabama. Since the site was abandoned in the 1900s, it has undergone extensive re-encroachment of forest over depleted farmland and is characterized by forested bottomland wetlands [10]. The center of Tuskegee, AL, an urban center with a human population density of 3,700 persons/km² (http://factfinder.census.gov), is located approximately 3 km from the edge of the study site. The western edge of the sampling grid abutted the City Lake, east of the center of Tuskegee, an area with a human population density of 1,100-1,600 persons/km². The northwest portion of the sampling grid also overlapped with populated areas northeast of Tuskegee and north of highways US-29/AL-81, also with a human population density of 1,100-1,600 persons/km². The geographic coordinates of the centroid of the sampling grid were 85.644444 by 32.432494 decimal degrees. The central and southern portions of the sampling grid had a human population density of 80 persons/km², and the eastern edge had a human population density of 0-50 persons/km².

Collections

Mosquitoes were collected biweekly from May to September 2007, from natural and artificial resting sites, by vacuum collection with a portable backpack aspirator as previously described [38]. Briefly, light traps ran from dusk to dawn and were positioned approximately 2 m above ground. Vacuum collections were made twice a week from resting boxes and natural resting sites during this same time period. These collections complemented those from light traps and allowed sampling of mosquitoes in different physiological/behavioral conditions, i.e., nulliparous/parous host-seeking mosquitoes in light traps versus blood-engorged or gravid ones in resting boxes, or allowed the collection of species not attracted to light. Live material was returned to the laboratory, sorted, identified using a chill table and binocular microscope, and frozen at -70°C [39,40]. Point counts were used to estimate bird densities at the study site as previously described [21]. Point counts lasted three minutes, and all birds seen or heard within 100 m of the observed sites during the three-minute counts were recorded. Bird counts were conducted by trained, competent observers. Birds were surveyed in a grid of 110 points, separated by 250 m within a 1.7 km radius from the center of the study site. The grid points were selected systematically, based on sampling strategies generated from previous research [21]. Counts were conducted from June 30 through July 29, from 0500 until 1100 local time.


Remote sensing data

QuickBird data (http://www.digitalglobe.com) encompassing the visible and near infra-red (NIR) bands were acquired on July 15, 2008 for the study site. QuickBird multispectral products provided four discrete non-overlapping spectral bands covering a range from 0.45 to 0.72 μm, with an 11-bit collected information depth. The spatial resolution of the data was 0.61 m. The clearest, cloud-free imagery available of the contiguous sub-areas of the study site was used to identify mosquito and wild bird sampling sites.

Base mapping

Base maps of major roads and hydrological networks were created using ArcInfo 9.2® (Environmental Systems Research Institute, Redlands, California) from differentially corrected global positioning system (DGPS) ground coordinates. In this research, fixed surveillance sites were geocoded using a CSI-Wireless (DGPS) Max receiver with a real-time Omni Star L-Band satellite signal, which has a positional accuracy of 0.179 m (+/- 0.392 m) [31]. A 10 m × 10 m grid-based matrix was overlaid on the base maps of the study site in ArcInfo 9.2® to generate efficient spatial sampling units. A unique identifier was placed in each grid cell. For remote identification of arboviral mosquito and avian habitats, the first step is often to construct a discrete tessellation of the region [41-48].

Regression analyses

A linear regression, with statistical significance determined at a 95% confidence level, was used to ascertain whether the proportions of sampled mosquito data differed by grid cell. The linear regression model assumed a random sample of the regressand $Y_i$ (sampled mosquito habitat count data) and the regressors $X_{i1}, \ldots, X_{ip}$. A disturbance term $\varepsilon_i$, which was a random variable, was added to this assumed relationship to capture the influence of all habitat parameters sampled on $Y_i$ other than $X_{i1}, \ldots, X_{ip}$. The random error term, $\varepsilon$, in a regression analysis of a field and remote-sampled Culex aquatic habitat model is typically assumed to be normally distributed with mean zero and variance $\sigma^2$ [31]. Statistical characteristics of the sampled data were examined in PROC UNIVARIATE. The PLOT option in the PROC UNIVARIATE statement generated histograms and boxplots. The NORMAL option was used to test whether the field and remote-sampled parameters had a normal distribution. The regression analyses were performed using PROC REG. The multiple linear regression model was:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \varepsilon_i, \qquad i = 1, \ldots, n.$$

It was important to distinguish the model in terms of random variables and the observed values of the random variables. Thus, we determined $p + 1$ parameters $\beta_0, \ldots, \beta_p$. In order to estimate the sampled mosquito aquatic habitat parameters, it was useful to use the matrix notation $Y = X\beta + \varepsilon$, where $Y$ was a column vector of the mosquito count values $Y_1, \ldots, Y_n$, $\varepsilon$ was the column vector of unobserved stochastic components $\varepsilon_1, \ldots, \varepsilon_n$, and $X$ was the matrix of observed mosquito aquatic habitat parameter values of the regressors, expressed as:

$$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{pmatrix}.$$

In this research, $X$ included a column that did not vary across the sampled mosquito data, which was used to represent the intercept term $\beta_0$.

The ecological-sampled data were log-transformed before analyses to normalize the distribution and minimize standard error. Multicollinearity diagnostics from the COLLIN option in SAS® were estimated. Residual-based diagnostics for univariate and multivariate conditional heteroscedastic models, previously constructed from clustering field and remote-sampled mosquito habitat parameter estimates, have revealed that errors in variance uncertainty estimation due to multicollinearity can substantially alter numerical prediction models [31]. The SAS COLLIN option produced eigenvalues and condition indices, as well as proportions of variances with respect to individual-sampled predictor variables in the model. The condition index scores indicated no significant multicollinearity within the model. It was hypothesized, however, that serial correlation could be a major source of time-varying heterogeneity. In this research, the Durbin-Watson statistic was used to detect the presence of autocorrelation in the residuals from the regression analysis. The Durbin-Watson statistic can test for first-order serial correlation [49]. Usually, the Durbin-Watson statistic is used to test the null hypothesis $H_0: \varphi_1 = 0$ against $H_1: \varphi_1 > 0$ [49]. The generalized Durbin-Watson statistic is written as:

$$DW_j = \frac{\hat{u}' A_j' A_j \hat{u}}{\hat{u}' \hat{u}}$$

where $\hat{u}$ is a vector of OLS residuals and $A_j$ is a $(T - j) \times T$ matrix. In this research, the generalized Durbin-Watson statistic $DW_j$ was rewritten as:

$$DW_j = \frac{Y' M A_j' A_j M Y}{Y' M Y} = \frac{\eta' (Q_1' A_j' A_j Q_1) \eta}{\eta' \eta}$$

where $Q_1' Q_1 = I_{T-k}$, $Q_1' X = 0$, and $\eta = Q_1' u$.
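The regression setup and the generalized Durbin-Watson statistic described above can be illustrated with a minimal Python sketch. It is not the SAS PROC REG or PROC AUTOREG implementation used in the study. The covariates, coefficients, and sample size are hypothetical, the response is simulated directly on the log scale to stand in for log-transformed counts, and the helper names `ols_fit` and `generalized_durbin_watson` are introduced here for illustration only. The sketch assumes that $A_j$ is the lag-$j$ differencing matrix, so that $A_j \hat{u}$ stacks the differences $\hat{u}_t - \hat{u}_{t-j}$.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares estimate of b in y = X b + e; X must already hold the intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return beta, residuals


def generalized_durbin_watson(residuals, j=1):
    """DW_j = u' A_j' A_j u / u' u, where A_j u stacks the lag-j differences u_t - u_{t-j}."""
    diffs = residuals[j:] - residuals[:-j]
    return float(diffs @ diffs / (residuals @ residuals))


# Hypothetical data: two made-up habitat covariates and a log-scale response.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])  # first column represents the intercept term
log_count = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.2, size=n)

beta_hat, resid = ols_fit(X, log_count)
dw1 = generalized_durbin_watson(resid, j=1)
print("coefficients:", np.round(beta_hat, 3))
print("DW_1:", round(dw1, 3))
```

A value of $DW_1$ near 2 is consistent with no first-order serial correlation, while values well below 2 point toward positive serial correlation. The p-value computation by numerical inversion of the characteristic function is not attempted in this sketch.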


The marginal probability for the Durbin-Watson statistic was:

$$\Pr(DW_j < c) = \Pr(h < 0)$$

where $h = \eta' (Q_1' A_j' A_j Q_1 - cI) \eta$.

The p-value, or the marginal probability, for the generalized Durbin-Watson statistic was computed by numerical inversion of the characteristic function $\varphi(u)$ of the quadratic form $h = \eta' (Q_1' A_j' A_j Q_1 - cI) \eta$. The trapezoidal rule approximation to the marginal probability $\Pr(h < 0)$ was:

$$\Pr(h < 0) = \frac{1}{2} - \sum_{k=0}^{K} \frac{\operatorname{Im}\left[\varphi\left(\left(k + \tfrac{1}{2}\right)\Delta\right)\right]}{\pi\left(k + \tfrac{1}{2}\right)} + E_I(\Delta) + E_T(K)$$

where $\operatorname{Im}[\varphi(\cdot)]$ was the imaginary part of the characteristic function, and $E_I(\Delta)$ and $E_T(K)$ were integration and truncation errors, respectively. The trapezoidal rule is a way to approximate a definite integral [49]. A numerically efficient algorithm was used to quantify the autocorrelated components in the regression model, which required $O(N)$ operations for evaluation of the characteristic function $\varphi(u)$. The characteristic function was denoted as:

$$\varphi(u) = \left| I - 2iu \left( Q_1' A_j' A_j Q_1 - c I_{N-k} \right) \right|^{-1/2} = |V|^{-1/2} \left| X' V^{-1} X \right|^{-1/2} \left| X' X \right|^{1/2}$$

where

$$V = (1 + 2iuc) I - 2iu A_j' A_j \quad \text{and} \quad i = \sqrt{-1}.$$

By applying the Cholesky decomposition to the complex matrix $V$, we obtained the lower triangular matrix $G$ that satisfied $V = GG'$. Cholesky decomposition is a decomposition of a symmetric, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose [49]. The characteristic function was evaluated in $O(N)$ operations by using the following formula:

$$\varphi(u) = |G|^{-1} \left| X^{*\prime} X^{*} \right|^{-1/2} \left| X' X \right|^{1/2}$$

where $X^{*} = G^{-1} X$.

We tested for serial correlation with lagged dependent variables in the model (Appendix a). When regressors contain lagged dependent variables, the Durbin-Watson statistic ($d_1$) for the first-order autocorrelation is biased toward 2 and has reduced power [50]. If the Durbin-Watson statistic is substantially less than 2, there is evidence of positive serial correlation [49]. In AUTOREG, two asymptotically equivalent alternative statistics (Durbin h and t) can be used to test for time-varying residuals [50]. In this research, we used the h statistic, which was written as:

$$h = \hat{\rho} \sqrt{\frac{N}{1 - N\hat{V}}}$$

where

$$\hat{\rho} = \frac{\sum_{t=2}^{N} \hat{\nu}_t \hat{\nu}_{t-1}}{\sum_{t=1}^{N} \hat{\nu}_t^2}$$

and $\hat{V}$ was the least squares variance estimate for the coefficient of the lagged dependent variable.

In PROC AUTOREG, an autoregressive error model was estimated using the Yule-Walker (YW) method. The YW method can be considered as generalized least squares using the OLS residuals to estimate the covariances across observations [49]. In this research, we let $\varphi$ represent the vector of autoregressive parameters, $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_m)'$, and we let the variance matrix of the error vector $\nu = (\nu_1, \ldots, \nu_N)'$ be $\Sigma$, with $E(\nu\nu') = \Sigma = \sigma^2 V$. If the vector of autoregressive parameters is known, the matrix $V$ can be computed from the autoregressive parameters; $\Sigma$ is then $\sigma^2 V$ [49]. Given $\Sigma$, the efficient estimates of the regression parameters $\beta$ were computed using generalized least squares (GLS). The GLS estimates then yielded the unbiased estimate of the variance $\sigma^2$.

The YW method alternated estimation of $\beta$ using generalized least squares with estimation of $\varphi$ using the YW equations applied to the sample autocorrelation function. The YW method started by forming the OLS estimate of $\beta$. Next, $\varphi$ was estimated from the sample autocorrelation function of the OLS residuals by using the YW equations. Then $V$ was estimated from $\varphi$, and $\Sigma$ was generated from $V$ and the OLS estimate of $\sigma^2$. The autocorrelation-corrected estimates of the regression parameters, $\beta$, were then computed using GLS and the estimated $\Sigma$ matrix. The YW equations, solved to obtain $\hat{\varphi}$ and a preliminary estimate of $\sigma^2$, were $R\hat{\varphi} = -r$. In this research, we used the vector $r = (r_1, \ldots, r_m)'$, where $r_i$ was the lag $i$ sample autocorrelation. The matrix $R$ was the Toeplitz matrix whose $(i, j)$th element was $r_{|i-j|}$. A Toeplitz matrix is a matrix in which each descending diagonal from left to right is constant [49]. We specified a subset model. Only the rows and columns of $R$ and $r$ corresponding to the subset of lags specified were used. The BACKSTEP option was specified for purposes of significance testing. The matrix $[R \; r]$ was treated as a sum-of-squares-and-cross-products matrix arising from a simple regression with $N - k$ observations, where $k$ was the number of estimated Cx. erraticus habitat parameters in the model.

Digital elevation model

A three-dimensional model of the study area was constructed based on DEM statistics generated using


ArcScene extension of ArcGIS®. The DEM used in this research was a raster representation of a continuous surface, originating from the Shuttle Radar Topography Mission (SRTM), which had a spatial resolution of 92 m. The probability distribution of the soil moisture deficit, i.e., statistics of topography, was generated from the DEM data by using a multidirectional flow routing algorithm. The purpose of DEM construction was to extract topographic parameters that may have been associated with the field and remote-sampled EEEV mosquito and bird covariates. A flow apportioning algorithm can delineate a realistic channel network for quantifying hydrogeomorphic properties of simulated drainage patterns using DEMs for identifying floodwater mosquitoes [35].

Spatial analyses

Kriging models were generated using all sampled abundance count data in the Geostatistical Analyst Extension of ArcGIS 9.2®. However, because Cx. erraticus is likely the primary bridge vector of EEEV in Tuskegee [10] and was the most abundant species sampled in the study site, it was selected for the independent kriging analyses (Table 1). Also, kriging analyses were run for total abundance counts of the sampled bird data in Tuskegee, in 2007. From preliminary data analyses, it was determined that Northern Cardinals were the most abundant avian species in the study site (Table 2). Therefore, a kriged model was generated using the Northern Cardinal data sample points. All the models were created in the ArcGIS® 9.3 Geostatistical Analyst Extension.

Spatial linear prediction was performed using ordinary kriging. Geostatistical techniques were used to interpolate the value $Z(x_0)$ of the sampled mosquito or bird habitat variable $Z(x)$ at an unobserved sampling site $x_0$ from the observations $z_i = Z(x_i)$, $i = 1, \ldots, n$, sampled at nearby habitat locations $x_1, \ldots, x_n$. The kriging-based algorithm computed the best linear unbiased estimator, $\hat{Z}(x_0)$ of $Z(x_0)$, for the sampled habitat data, based on a stochastic model of the spatial dependence quantified by the variogram $\gamma(x, y)$, by the expectation $\mu(x) = E[Z(x)]$, and by the covariance function $c(x, y)$ of the random field. In this research, the kriging estimator was given by a linear combination:

$$\hat{Z}(x_0) = \sum_{i=1}^{n} w_i(x_0) Z(x_i) \qquad (2.1)$$

for analyzing the sampled data, where $z_i = Z(x_i)$ were the observed values and $w_i(x_0)$, $i = 1, \ldots, n$, were the weights, chosen to minimize the kriging variance subject to the unbiasedness condition [35]. The dependent variables were the sampled adult counts of mosquitoes or bird data, which were transformed to fulfill the diagnostic normality test prior to performing the kriging. The kriging weights were then used to fulfill the unbiasedness condition in the spatial interpolation of the ecological-dependent variables using:

$$\sum_{i=1}^{n} w_i(x_0) = 1 \qquad (2.2)$$

Table 1 Adult mosquito counts for the Tuskegee study site

Mosquito species        Adult counts
Cx. erraticus           1,848
An. crucians            808
Cx. territans           632
An. quadrimaculatus     444
Cx. peccator            199
Ae. vexans              193
Cq. perturbans          134
An. punctipennis        126
Ur. sapphirina          124
Cx. quinquefasciatus    82
Cx. restuans            51
Cx. salinarius          45
Cs. melanura            33
Oc. canadensis          26
Oc. spp                 14
Cx. nigripalpus         3
Oc. sollicitans         1
Oc. sticticus           1
Oc. triseriatus         1
Or. signifera           1
Ps. columbiae           1
Ps. ferox               1
An. barberi             1

Table 2 Bird counts for the Tuskegee study site

Species                   Abundance (% of count)
Northern cardinal         119 (37.6)
Carolina wren             77 (20.5)
Red-eyed vireo            51 (8.2)
Indigo bunting            50 (8.0)
Tufted titmouse           36 (5.8)
White-eyed vireo          35 (5.6)
Acadian flycatcher        31 (5.0)
Red-bellied woodpecker    14 (2.3)
American crow             14 (2.3)
Blue jay                  13 (2.1)
Carolina chickadee        10 (1.6)
Northern parula           9 (1.4)
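As a rough illustration of ordinary kriging, with weights that sum to one and a Lagrange multiplier enforcing unbiasedness, the following Python sketch solves the kriging system for a single prediction location. It is not the ArcGIS Geostatistical Analyst implementation used in the study. The spherical semivariogram form, the four sample coordinates, the counts, and the semivariogram parameters are all hypothetical assumptions, and the helper names `spherical_semivariogram` and `ordinary_kriging` are introduced here for illustration only.

```python
import numpy as np


def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Spherical model (an assumed form): 0 at h = 0, nugget + partial_sill beyond the range."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / range_, 1.0)
    gamma = nugget + partial_sill * (1.5 * s - 0.5 * s ** 3)
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(coords, values, target, gamma):
    """Solve the ordinary kriging system for one target location."""
    n = len(values)
    pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Bordered semivariance matrix: a row and column of ones, zero in the corner.
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = gamma(pair_dist)
    system[n, n] = 0.0
    # Right-hand side: semivariances between samples and target, then 1.
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(system, rhs)  # first n entries: weights; last: multiplier
    weights = solution[:n]
    estimate = float(weights @ values)
    variance = float(solution @ rhs)  # kriging variance: weights . gamma_0 + multiplier
    return estimate, variance, weights


# Hypothetical sample sites on a unit square with made-up (transformed) counts.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
counts = np.array([1.0, 2.0, 3.0, 4.0])
gamma = lambda h: spherical_semivariogram(h, nugget=0.1, partial_sill=1.0, range_=3.0)

est, var, w = ordinary_kriging(coords, counts, np.array([0.5, 0.5]), gamma)
print("weights:", np.round(w, 3), "estimate:", round(est, 3), "variance:", round(var, 3))
```

Because the target in this example is equidistant from all four sites, the weights come out equal and the estimate is the plain average of the counts. Predicting at one of the sampled sites returns the observed value with zero kriging variance, which is the exact-interpolation property of this formulation.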


The kriging weights $w_1, \ldots, w_n$ were given by the ordinary kriging equation system, where $x^{*}$ denotes the prediction location:

$$\begin{pmatrix} w_1 \\ \vdots \\ w_n \\ \mu \end{pmatrix} = \begin{pmatrix} \gamma(x_1, x_1) & \cdots & \gamma(x_1, x_n) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n, x_1) & \cdots & \gamma(x_n, x_n) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix} \gamma(x_1, x^{*}) \\ \vdots \\ \gamma(x_n, x^{*}) \\ 1 \end{pmatrix}$$

The additional parameter $\mu$ was a Lagrange multiplier used in the minimization of the kriging error $\sigma_k^2(x)$ to honor the unbiasedness condition in the ecological dataset [51]. The ordinary kriging error was given by:

$$\operatorname{var}\left(\hat{Z}(x^{*}) - Z(x^{*})\right) = \begin{pmatrix} \gamma(x_1, x^{*}) \\ \vdots \\ \gamma(x_n, x^{*}) \\ 1 \end{pmatrix}' \begin{pmatrix} \gamma(x_1, x_1) & \cdots & \gamma(x_1, x_n) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n, x_1) & \cdots & \gamma(x_n, x_n) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix} \gamma(x_1, x^{*}) \\ \vdots \\ \gamma(x_n, x^{*}) \\ 1 \end{pmatrix}$$

and the interpolation was given by:

$$\hat{Z}(x^{*}) = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}' \begin{pmatrix} Z(x_1) \\ \vdots \\ Z(x_n) \end{pmatrix}$$

with the error variance quantified using:

$$\operatorname{var}\left(\hat{Z}(x^{*}) - Z(x^{*})\right) = \begin{pmatrix} w_1 \\ \vdots \\ w_n \\ \mu \end{pmatrix}' \begin{pmatrix} \gamma(x_1, x^{*}) \\ \vdots \\ \gamma(x_n, x^{*}) \\ 1 \end{pmatrix}$$

The semivariogram generated described the spatial dependence between the sampled mosquito and bird parameters as a function of the distance between the sampling sites. The semivariogram allowed for mosquito or bird abundance estimations at any point in the study site. The value of prevalence, $Z$, at the coordinate $(x_0, y_0)$ was estimated from the $n$ nearest sampling values:

$$Z_{obs}(x_1, y_1), Z_{obs}(x_2, y_2), \ldots, Z_{obs}(x_n, y_n)$$

by the linear formula:

$$\hat{Z}(x_0, y_0) = \sum_{i=1}^{n} \alpha_i Z_{obs}(x_i, y_i) \qquad (2.3)$$

The weights $\alpha_i$ were found by introducing the Lagrange multiplier $\lambda$ and solving the system:

$$\sum_{i=1}^{n} \alpha_i \gamma(h_{i,j}) + \lambda = \gamma(h_{j,0}), \qquad j = 1, \ldots, n$$

under the constraint

$$\sum_{i=1}^{n} \alpha_i = 1$$

where $h_{i,j}$ denoted the distance between any two mosquito or bird sampled locations, located at $(x_i, y_i)$ and $(x_j, y_j)$, and $h_{j,0}$ was the distance between the sampled site $(x_j, y_j)$ and the estimation point $(x_0, y_0)$. The semivariance was defined as $\gamma(h)$ [50-53]. The magnitude of the semivariance in this research was dependent on the distance between sampled mosquito or bird sites. Semivariance of the deviance residuals of the mosquito and bird count data was calculated, and a variogram was constructed to determine if there was evidence of latent spatial autocorrelation in the sampled data. The plot of the semivariances as a function of distance from a point is referred to as a semivariogram [53]. The empirical semivariogram and covariance can provide information on autocorrelation components in ecological-sampled datasets [49].

In this research, fitting a mathematical function (i.e., the variogram model) included generating a range, a nugget and a sill. The range is the distance at which the curve levels off to a constant value of semivariance, and it can indicate the spatial scale of a pattern in an image [49]. The range, or active lag distance, is also the approximate distance at which spatial autocorrelation between sampled data point pairs ceases, or becomes much more variable [54,55]. The value that the model attains at the range (i.e., the value on the y-axis) is called the sill, while the nugget is usually assumed to be non-spatial variation due to measurement error and variations in the data that relate to shorter ranges than the minimum sampled data spacing [49]. In this research, the sill indicated the level at which the semivariance values had plateaued (i.e., the value of maximum variance was equivalent to the variance of the image pixel value), while a non-zero intercept value (i.e., nugget variance) of the variogram model was indicative of the variability of the field and remote-sampled Cx. erraticus and Northern Cardinal data quantified at a resolution smaller than the image resolution. A simple quantitative measure of the interpolation performance was determined by generating root-mean square error (RMSE) values for the models. Optimizing the RMSE by minimizing the spatial structure in a Culex aquatic habitat model can generate a pure nugget variogram, of which the level of nugget

http://www.ij-healthgeographics.com/content/9/1/12

variance can represent noise characteristics in field and remote-sampled explanatory variables [31]. Additionally, a neighborhood distance search radius provided the mean standard errors of the interpolated values. Interpolation accuracy can be measured by the natural logarithm of the mean squared interpolation error, which can reveal all main effects of parameter estimates in an autoregressive model while quantifying several covariate interaction terms [56-59].

Results

The regression models were able to classify sampled high and low abundance count habitats. Temperature had a significant association with Cx. erraticus adult abundance (p < 0.0002). The predictor variable precipitation also presented a significant relationship (p < 0.05). In this research, Durbin-Watson statistics were generated using the AUTOREG procedure in SAS® to estimate whether the OLS regression estimates indicated significant serial correlation with an estimated order of a lagged covariance of 1. The AUTOREG procedure corrected for serial correlation using the YW method. The Durbin-Watson statistic indicated that serial correlation was not significant in the YW corrected model. The YW estimates for the model indicated an R² = 0.632, an F statistic of 39.177, and a Durbin-Watson score of 1.935.

For total mosquito count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.041 km, nugget of 6.325 km, lag size of 7.076 km, and range of 31.43 km, using 12 lags. For total adult Cx. erraticus count, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.764 km, nugget of 6.114 km, lag size of 7.472 km, and range of 32.62 km, using 12 lags (Figure 1). For the total bird count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 4.998 km, nugget of 5.413 km, lag size of 7.549 km, and range of 35.27 km, using 12 lags. For the Northern Cardinal count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 6.387 km, nugget of 5.935 km, lag size of 8.549 km, and a range of 41.38 km, using 12 lags (Figure 2). To evaluate the accuracy of the models, predictive mean standard error distributions were generated, which revealed that all models were within normal statistical limitations (Table 3).

Figure 1 Predicted Culex erraticus abundance data using an ordinary kriged model overlaid on QuickBird visible and near infra-red (NIR) data of the Tuskegee study site.

A DEM of the study site was generated in ArcGIS® (Figure 3). The minimum and maximum range of the elevation in the DEM models were calculated. Pearson's correlation was used to evaluate the linear relationship between mosquito and bird count data and the sampled predictor variable elevation using the SRTM DEM. Results of the DEM analyses indicated a statistically significant inverse linear relationship between total sampled mosquito data and elevation in meters (m) (r = -.426; p < .0001), with a standard deviation (SD) of 104.6. The range of the elevation in the DEM had a

Figure 2 Predicted Northern Cardinal abundance count data in the Tuskegee study site using an ordinary kriging algorithm.
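The ordinary kriging estimate behind a surface such as the one in Figure 2 can be illustrated with a minimal sketch. It assumes a spherical semivariogram model (the model type is not stated here), omits the first-order trend removal, and uses hypothetical site coordinates and counts. Only the nugget, partial sill and range are the values reported for the Northern Cardinal fit. The function names are illustrative and do not represent the ArcGIS Geostatistical Analyst implementation.

```python
import math


def spherical_semivariogram(h, nugget, partial_sill, range_km):
    """Assumed spherical model: gamma(0) = 0, plateau at nugget + partial_sill."""
    if h <= 0.0:
        return 0.0
    ratio = min(h / range_km, 1.0)
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting; returns x with matrix x = rhs."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot (the diagonal starts at zero).
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def ordinary_kriging(sites, values, target, nugget, partial_sill, range_km):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(sites)

    def gamma(p, q):
        return spherical_semivariogram(math.dist(p, q), nugget, partial_sill, range_km)

    # Semivariances between sites, bordered by ones, with a zero in the corner.
    matrix = [[gamma(sites[i], sites[j]) for j in range(n)] + [1.0] for i in range(n)]
    matrix.append([1.0] * n + [0.0])
    rhs = [gamma(site, target) for site in sites] + [1.0]

    solution = solve_linear_system(matrix, rhs)
    weights, multiplier = solution[:n], solution[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, rhs[:n])) + multiplier
    return estimate, variance, weights


# Hypothetical sampling sites (km) and counts; not the study data.
sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
counts = [3.0, 5.0, 4.0, 8.0]
estimate, variance, weights = ordinary_kriging(
    sites, counts, (5.0, 5.0), nugget=5.935, partial_sill=6.387, range_km=41.38
)
print(estimate, variance, weights)
```

The weights sum to one because of the unbiasedness constraint enforced by the Lagrange multiplier, and the estimate at a sampled site reproduces the observed value since the semivariance at zero distance is taken as zero.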

Table 3 Residual model outputs from ordinary kriged models using mean error and root mean square error for the sampled mosquito and bird count data in the Tuskegee study site.

| Data | Ordinary kriging mean error | Ordinary kriging root mean square error |
|---|---|---|
| Total bird counts | 0.055 | 1.821 |
| Northern cardinal | 0.163 | 1.642 |
| Total mosquito counts | -0.132 | 4.664 |
| Cx. erraticus count | -4.814 | 8.535 |
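The two summary statistics in Table 3 can be computed from prediction residuals as sketched below. The sketch assumes the error is defined as predicted minus observed and uses made-up values. How the residuals were produced in the study (for example, by leave-one-out cross-validation) is not stated here, so that step is left out.

```python
import math


def mean_error(observed, predicted):
    """Average of (predicted - observed); values near zero suggest little bias."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared (predicted - observed) difference."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical observed and predicted counts; not the study data.
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 3.0, 8.0]
print(mean_error(observed, predicted), root_mean_square_error(observed, predicted))
```

A mean error close to zero with a large RMSE, as in the total mosquito count row, would indicate predictions that are roughly unbiased on average but individually imprecise.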


Figure 3 Digital Elevation Model (DEM) of the Tuskegee study site.
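Pearson's correlation between count data and DEM elevation follows the usual product-moment formula. A minimal sketch, using hypothetical elevation and count values rather than the study data and leaving out the significance test, might look like this:

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(spread_x * spread_y)


# Hypothetical elevations (m) and counts that fall as elevation rises.
elevation_m = [0, 100, 200, 300, 400]
counts = [50, 40, 30, 20, 10]
print(pearson_r(elevation_m, counts))
```

A negative coefficient corresponds to the inverse linear relationship described for the sampled counts and elevation.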

minimum value of 0 m, with a maximum value of 431 m. The results for the total sampled bird data and elevation were (r = -.511; p < .0001), with a SD of 22.97. The range of the elevation in the DEM had a minimum value of 0 m, with a maximum value of 439 m. DEM statistics also indicated a significant inverse linear relationship between total sampled Cx. erraticus data and elevation (r = -.471; p < .0001), with a SD = 111.6. The range of the elevation in the DEM had a minimum value of 0 m, with a maximum value of 487 m. The results for the total sampled Northern Cardinal data and elevation were (r = -.583; p < .0001), with a SD = 114.2. The range of the elevation in the DEM had a minimum value of 0 m, with a maximum value of 501 m (Table 4).

Table 4 Pearson correlation for mosquito and bird sampled data and the sampled predictor variable elevation in the Tuskegee study site.

| Predictor variables | Statistical tests | Significance | Elevation level (m) |
|---|---|---|---|
| Total mosquito count data | Pearson Correlation | 1 | -.426 |
| | Sig. (2-tailed) | <.0001 | <.0001 |
| | N | 141 | 118 |
| Total bird count data | Pearson Correlation | 1 | -.511 |
| | Sig. (2-tailed) | <.0001 | <.0001 |
| | N | 141 | 118 |
| Cx. erraticus data | Pearson Correlation | 1 | -.471 |
| | Sig. (2-tailed) | <.0001 | <.0001 |
| | N | 141 | 118 |
| Northern cardinal data | Pearson Correlation | 1 | -.583 |
| | Sig. (2-tailed) | <.0001 | <.0001 |
| | N | 141 | 118 |

Discussion

Culex erraticus was the most abundant mosquito species collected during this study in central Alabama bottomland freshwater wetlands, which were ~6 km from the center of Tuskegee and ~1.5 km east of a populated

area north of highways US-29/AL-81. This species previously yielded the highest number of EEEV-infected pools in Tuskegee [21]. Cx. erraticus requires shallow-water habitats [60-66], especially those overgrown with surface plants or with grassy margins, such as streams, lakes or impoundments [18]. This species may be collected in high numbers during hot weather, and even during drought periods, in July and August [10]. Blood-feeding hosts of Cx. erraticus include birds (27-70%), mammals (23-67%), and reptiles (2-20%), suggesting great host flexibility based on relative availability of hosts [16].

The bird communities present in the Tuskegee study site are typical of reforested areas of bottomland hardwood [10]. The level of vector contact with different bird species in a given area is essential in identifying