Visual Head Counts: A Promising Method for Efficient Monitoring of Diamondback Terrapins

Abstract: Determining the population status of the diamondback terrapin (Malaclemys terrapin spp.) is challenging due to their ecology and limitations associated with traditional sampling methods. Visual counting of emergent heads offers a promising, efficient, and non-invasive method for generating abundance estimates of terrapin populations across broader spatial scales than has been achieved using capture–recapture, and can be used to quantify determinants of spatial variation in abundance. We conducted repeated visual head count surveys along the shoreline of Wellfleet Bay in Wellfleet, Massachusetts, and analyzed the count data using a hierarchical modeling framework designed specifically for repeated count data: the N-mixture model. This approach allows for simultaneous modeling of imperfect detection to generate estimates of true terrapin abundance. Detection probability was lowest when temperatures were coldest and when wind speed was highest. Local abundance was on average higher in sheltered sites compared to exposed sites and declined over the course of the sampling season. We demonstrate the utility of pairing visual head counts and N-mixture models as an efficient method for estimating terrapin abundance and show how the approach can be used to identify environmental factors that influence detectability and distribution.


Introduction
The diamondback terrapin (DBT; Malaclemys terrapin spp.) is the only estuarine obligate turtle species in North America [1], and as a result, has a long but fragmented coastal range that extends from Cape Cod, Massachusetts, to the Texas Gulf Coast [2]. DBTs are currently listed as protected or regulated in every range state [3]. Despite their conservation status, population assessments have been limited to very local scales, and as a result, comparable and representative status assessments of this imperiled species are generally lacking [3,4].
As salt marsh specialists, DBTs have specific life history and behavioral traits that determine which monitoring techniques are suitable and how the resulting data can be used. For example, DBTs have highly seasonal phenology in their northern range [5], form mating aggregations [6], and have highly specialized terrestrial nesting habitat requirements. The species is also highly mobile [1,7], with movements that are linked with tide cycles [7], and surfaces regularly to breathe [3,8]. A variety of methods are used to monitor DBTs, including modified crab traps [9–11], hoop traps [12,13], trammel nets [4,12–14], fyke nets [15,16], seines [14,17,18], and dip nets [10,13]. While each method takes some aspects of DBT ecology into consideration, their success in producing consistent, reliable, and scalable population estimates has been variable. For example, in Wellfleet Bay (MA), almost four decades of monitoring using capture mark-recapture (CMR) has resulted in over 3000 marked individuals [19], but has failed to produce reliable estimates of population size due to low detection rates (see also: [11,13,18]), mature female-biased captures (see also: [9,20]), and variable and opportunistic search effort. These challenges often result in extensive sampling effort being concentrated in extremely localized study areas that are not representative of the landscape [18,20,21].
A promising, but vastly under-utilized, monitoring method for the DBT is visual counting of emergent heads (hereafter, visual head counts), which offers an efficient, non-invasive method for generating abundance estimates of local populations [22]. Because DBTs are the only turtle species to inhabit coastal estuarine habitats, must surface to breathe air, and perform seasonal staging behavior where both sexes congregate to initiate courtship and mating, their biology and behavior lend themselves naturally to methods that rely on detection but do not require individual recognition. For example, Isdell et al. [22] used the detection and nondetection of emergent heads to investigate factors that influence site occupancy. Although there is some support for extending the presence-absence surveys to include counts of heads to estimate relative abundance [4,9], the concept has yet to gain traction. This is despite the fact that visual (point) count surveying is a widely adopted monitoring methodology in ecology [23] with well-established statistical models for analyzing these data (N-mixture model [24]), and both can easily be modified for DBT.
In this study, we conducted visual head count surveys at 38 locations throughout Wellfleet Bay, Massachusetts. Using efficient spatially and temporally replicated visual head count surveys and well-established statistical models, we were able to produce estimates of local population size, including linking abundance to shoreline exposure and seasonality, and estimates of how environmental conditions (wind and air temperature) influence detectability. Our results suggest that visual head count surveys are a promising method for monitoring diamondback terrapins.

Study Area
This study is focused on approximately 50 km of shoreline around Wellfleet Bay (WB), a protected area located in the town of Wellfleet, within Cape Cod Bay, Massachusetts, USA (Figure 1). Wellfleet Bay is a marsh-dominated system comprised of many creeks and inlets with an extensive intertidal zone that can exceed 3 meters during spring high tides.
Diversity 2019, 11, x FOR PEER REVIEW 2 of 15
We conducted visual head count surveys (hereafter, visual surveys) at 38 locations along the shoreline of WB. Sites were selected using the following approach: First, points were generated every 500 meters along the entire shoreline of WB using the Generate Points Along Lines tool in ArcMap 10.6 (ESRI 2018). We used 500 m between points to ensure that on any given day, we would avoid double counting of individuals (i.e., to ensure independence among the sampling locations). Next, we removed points that were located in non-habitat, leaving 44 potential survey sites. We note that 'non-habitat' was defined as areas with no marsh habitat, and thus, our surveys were focused on areas that DBTs would, in theory, be expected to occupy at some point during the tide cycle. Upon initial visits to these 44 sites, six were deemed either inaccessible or unsuitable for surveying (e.g., not enough open water visible to detect surfaced heads), leaving the final 38 suitable sites (Figure 1).
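The site-selection steps described above (regularly spaced candidate points along the shoreline, followed by a habitat screen) can be illustrated with a minimal Python sketch. This is not the ArcMap tool used in the study: the function `points_along_line`, the toy shoreline coordinates, and the `is_marsh` screen are all hypothetical and serve only to show the logic of placing points at a fixed spacing and then discarding those in non-habitat.

```python
import math

def points_along_line(vertices, spacing):
    """Return points placed every `spacing` units along a polyline.

    `vertices` is a list of (x, y) tuples in a projected coordinate system
    (e.g., metres). The first vertex is always included.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [vertices[0]]
    next_mark = spacing   # distance along the line of the next point
    walked = 0.0          # distance along the line at the start of a segment
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        # Place every mark that falls on this segment (small tolerance for rounding).
        while seg_len > 0 and next_mark <= walked + seg_len + 1e-9:
            t = (next_mark - walked) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_mark += spacing
        walked += seg_len
    return points

# Hypothetical toy shoreline: a 2 km L-shaped line (coordinates in metres).
shoreline = [(0.0, 0.0), (1500.0, 0.0), (1500.0, 500.0)]
candidates = points_along_line(shoreline, spacing=500.0)

# Hypothetical habitat screen: pretend marsh only occurs where x > 250 m.
def is_marsh(point):
    return point[0] > 250.0

sites = [pt for pt in candidates if is_marsh(pt)]
print(len(candidates), "candidate points;", len(sites), "retained after habitat screen")
```

In practice the habitat screen would come from a marsh habitat layer and from field visits, as described in the text, rather than from a coordinate rule.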

Visual Head Count Surveys
We drove to the sampling locations and accessed each site from land using a handheld GPS unit (Garmin GPSMAP 78, Olathe, Kansas). Visual surveys were conducted by scanning the water from shoreline to shoreline using binoculars, and recording the number of DBT heads that were observed inside a 100 m radius from the survey point. During each site visit, we conducted five (5) such scans, each lasting no longer than 2 min, with a 1 min break between the end of one scan and the beginning of the next. Surveys were conducted at high tide, when DBTs are most active [7] and demonstrate regular emergence-submergence behavior [8]. Separating scans by one minute was assumed to be sufficient to allow for turnover, and thus variation, in which individuals emerged and were available for detection (Figure A1). Each site was visited at least once each month from May through August 2018 (median number of site visits: 4, range: 3-13). Thus, the data generated from each site visit are five (5) imperfect counts of a population assumed to be constant during the period of counting, but that can vary between site visits and between sites. In total, there were 184 five-scan head count surveys conducted at 38 sites (i.e., 184 unique site-visit combinations).
The area sampled was approximately a 100 m radius semi-circle around the sampling location from the shoreline, extending into the water. This area was identified using a rangefinder (Halo, XL450-7, Grand Prairie). Rangefinders cannot reliably or efficiently detect DBT heads, which are too small, and therefore, were not used to count heads. Instead, proficiency in observer distance estimation was achieved through extensive self-calibration prior to, and regularly during, the sampling season by comparing estimated distances with rangefinder distances of objects easily detected by rangefinders in the water (e.g., boats, buoys).

Statistical Analysis
A natural analytical framework for analyzing repeated counts of a closed population is the N-mixture model [24]. Formally, the counts $y_{ik}$, which are the number of heads observed in scan $k$, where $k = 1, \ldots, 5$, from site $i$, where $i = 1, \ldots, 184$ unique site-by-visit surveys, are assumed to be binomial random variables with a trial size of $N_i$, i.e., the true population size at site $i$, and success probability $p_{ik}$, which is the probability of detecting an individual in the population at site $i$ during scan $k$. This can be written as follows:

$$y_{ik} \sim \text{Binomial}(N_i, p_{ik})$$

The N-mixture model assumes that individuals are equally detectable, but does allow detectability to be modeled using scan- or site-specific covariates that are assumed to influence detectability. In our case, we considered four environmental covariates that we hypothesized would influence our ability to detect DBT heads: air temperature (Celsius, 'Temp'), cloud cover (clear, <50%, ~50%, >50%, or overcast, 'Cloud'), wind speed (miles per hour, 'Wind'), and exposure classification ('Expo', see below). Temperature, cloud cover, and wind speed were measured immediately prior to conducting the visual surveys, and the same covariate value was used for each scan at a site during a single visit. Detection probability can be modeled using a logit-linear model as follows:

$$\text{logit}(p_{ik}) = \beta_0 + \beta_{temp} \text{Temp}_i + \beta_{cloud} \text{Cloud}_i + \beta_{wind} \text{Wind}_i + \beta_{expo} \text{Expo}_i$$

where the intercept ($\beta_0$) and the coefficients for air temperature ($\beta_{temp}$), cloud cover categories ($\beta_{cloud}$), wind ($\beta_{wind}$), and exposure class ($\beta_{expo}$) are parameters to be estimated.
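To make the observation model concrete, the following minimal Python sketch simulates five scans of a closed population under a binomial detection process. The population size, the value of the linear predictor, and the function names `expit` and `simulate_scans` are hypothetical choices for illustration, not values or code from the study.

```python
import math
import random

def expit(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_scans(n_true, p, n_scans=5, rng=None):
    """Draw `n_scans` binomial head counts from a closed population of size `n_true`."""
    rng = rng or random.Random()
    # Each individual is detected independently with probability p in each scan.
    return [sum(rng.random() < p for _ in range(n_true)) for _ in range(n_scans)]

rng = random.Random(1)            # fixed seed so the sketch is repeatable
p = expit(-1.0)                   # hypothetical linear predictor on the logit scale
counts = simulate_scans(n_true=50, p=p, rng=rng)
print("scan counts:", counts, "| largest count:", max(counts), "| true N: 50")
```

Because every scan count is bounded above by the true population size, the largest raw count is at best a lower bound on abundance, which is why detection is modeled explicitly.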
The N-mixture model is a hierarchical model, which means that the detection process ($p_{ik}$ above) can be modeled conditionally on, and independent of, the true abundance at a site, $N_i$ [24,25]. This ability to explicitly account for imperfect detection while simultaneously estimating variation in true abundance is what makes these types of observation-state hierarchical models so appealing [26]. Formally, abundance at a site, $N_i$, is described as either a Poisson or negative binomially distributed random variable with expected value $\lambda_i$. Preliminary analysis comparing Poisson and negative binomial formulations of the N-mixture model determined that the negative binomial model was preferred (Table A1). Thus, abundance is described as follows:

$$N_i \sim \text{Negative Binomial}(\lambda_i, \theta)$$

where $\theta$ is the dispersion parameter of the negative binomial distribution. As with detection, variation in abundance can be modeled as a function of site-specific covariates using an appropriate generalized linear model (GLM) [25].

For this pilot study, we were primarily interested in demonstrating that head counts could produce data suitable for analysis using the N-mixture model, and that the model could be used to make biologically meaningful inferences. As such, rather than explore the range of hypothesized drivers of DBT abundance, we instead used a broadly defined 'exposure' site classification ('Exposed', Figure 1), where sites were classified as exposed (i.e., located on a stretch of shoreline that was exposed to the open water of the larger bay) or unexposed (i.e., located on a stretch of shoreline that was sheltered from the open water of the larger bay). We used this binary exposure category as a covariate on abundance that we assume broadly captures potential differences in habitat quality. For example, we observe larger areas of salt marsh habitat, and specifically Spartina alterniflora, in areas of the bay that are more sheltered and less affected by weather-related turbulence.
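The hierarchical structure described above (binomial scan counts conditional on a negative binomially distributed abundance) implies a marginal likelihood in which the unknown abundance is summed out. The sketch below, in plain Python, shows one way this could be computed for a single survey. It is an illustration under stated assumptions rather than the routine used by the 'unmarked' package: the mean/size parameterization of the negative binomial, the truncation point `n_max`, and all numeric values are hypothetical.

```python
import math

def nb_logpmf(n, mean, size):
    """Log pmf of a negative binomial with expected value `mean` and dispersion `size`."""
    return (math.lgamma(n + size) - math.lgamma(size) - math.lgamma(n + 1)
            + size * math.log(size / (size + mean))
            + n * math.log(mean / (size + mean)))

def binom_logpmf(y, n, p):
    """Log pmf of a binomial count y out of n trials with success probability p (0 < p < 1)."""
    return (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
            + y * math.log(p) + (n - y) * math.log1p(-p))

def site_loglik(counts, p, mean, size, n_max=500):
    """Log marginal likelihood of one survey's scan counts.

    Sums over every abundance N from the largest observed count up to `n_max`,
    an assumed truncation point that must be large enough to cover the
    plausible range of N.
    """
    terms = [
        nb_logpmf(n, mean, size) + sum(binom_logpmf(y, n, p) for y in counts)
        for n in range(max(counts), n_max + 1)
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Hypothetical five-scan survey and parameter values, for illustration only.
example_counts = [3, 5, 2, 4, 6]
for lam in (10.0, 40.0):
    print(lam, site_loglik(example_counts, p=0.2, mean=lam, size=1.5))
```

A full analysis would sum these log-likelihood contributions over all surveys and maximize over the detection and abundance parameters.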
In addition, we included an effect of seasonality (day since first survey, 'Day') to capture spatiotemporal dynamics that may result in seasonal changes in abundance. For the negative binomial regression, an appropriate GLM is a log-linear model:

$$\log(\lambda_i) = \alpha_0 + \alpha_{exposed} \text{Exposed}_i + \alpha_{day} \text{Day}_i$$

where the intercept ($\alpha_0$; the expected abundance, on the log scale, for exposed sites), the coefficient measuring the difference between the expected abundance at exposed and unexposed locations ($\alpha_{exposed}$), and the effect of relative day of the season ($\alpha_{day}$) are parameters to be estimated.

Because we were interested in exploring which covariate effects were most important in explaining both detection (i.e., Wind, Cloud, and Temp) and abundance (i.e., Exposed and Day), we fit all possible combinations of covariate effects. For detection, this included a null model (constant detection across all sites), univariate models for each of the three covariates, all possible pairs of covariates, and a model with all three covariates included (eight detection models, Table A1). For abundance, this included a null model (constant abundance across all sites), univariate models for each of the two covariates, an additive model with Exposed and Day, and an interactive Exposed-Day model (five abundance models, Table A1). Thus, in total for the negative binomial N-mixture models, we considered 40 models (Table A1). We treated each of the 184 unique site-visit samples as independent sites, acknowledging that because the system is highly dynamic, not doing so would violate the assumption of closure. We fit the models by maximum likelihood using the R package 'unmarked' [27] and ranked them according to AIC values, where lowest is best [28].
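The candidate model set described above, following the counts given in the text (eight detection structures crossed with five abundance structures), can be enumerated programmatically. The Python sketch below also includes a small helper for Akaike weights. The formula strings are shorthand labels rather than 'unmarked' syntax, and the AIC values passed to `akaike_weights` are hypothetical numbers chosen only to show the calculation.

```python
import math
from itertools import combinations

detection_covariates = ["Wind", "Cloud", "Temp"]

# All subsets of the detection covariates, with "." denoting the null model.
detection_models = [
    " + ".join(combo) if combo else "."
    for r in range(len(detection_covariates) + 1)
    for combo in combinations(detection_covariates, r)
]

# Null, univariate, additive, and interactive abundance structures.
abundance_models = [".", "Exposed", "Day", "Exposed + Day", "Exposed * Day"]

model_set = [(p, lam) for p in detection_models for lam in abundance_models]
print(len(detection_models), "x", len(abundance_models), "=", len(model_set), "models")

def akaike_weights(aic_values):
    """Convert AIC scores to Akaike weights (relative support; weights sum to 1)."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AIC scores for three models, for illustration only.
weights = akaike_weights([1000.00, 1000.12, 1004.00])
print([round(w, 3) for w in weights])
```

Ranking by AIC and reporting cumulative weights, as done in the model selection tables, then amounts to sorting the models by score and accumulating these weights.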

Results
A total of 184 head count surveys were conducted at 38 spatially distinct sites, each of which was visited at least once each month from May through August 2018 (median number of site visits: 4, range: 3-13). Of the 38 sites, 17 were categorized as exposed, and 21 were categorized as unexposed (Figure 1). Twenty-nine percent (29%) of surveys were conducted under clear skies, 28% in <50% cloud cover, 5% in ~50% cloud cover, 20% in >50% cloud cover, and 17% in overcast conditions. Surveys were conducted in wind speeds that ranged from 1 mph to 16 mph. The mean head count in a single scan was 2.65 (median = 0) and counts ranged from 0 to 91 individuals. DBTs were detected at 36 out of the 38 sites surveyed.
Based on model evaluation using Akaike's Information Criterion (AIC [28]), the best-supported model allowed detection to vary as a function of wind speed, air temperature, and exposure category, and abundance to vary by exposure category and day of the year (i.e., this model had the lowest AIC, Table 1). A model that also included cloud cover had some support based on AIC (∆AIC = 0.12, Table 1), but following recommendations in [29], this term can be considered non-informative because the additional covariate did not improve the support relative to the top model, which was simpler by one term. Therefore, below, we report our findings based on the top model: p(Temp + Wind + Exposed) λ(Day × Exposed).

Table 1. Here we show the subset of 10 models that accounted for all the AIC support (ωAIC ≥ 0.01). The models are ranked according to their AIC score (lowest is better). The Detection (p) and Abundance (λ) model formulations are provided, as is the number of parameters in the model (K), the AIC score, the difference in AIC relative to the top model (∆AIC), the AIC weight (ωAIC), which is a measure of relative model support, and the cumulative AIC weights (ΣωAIC).

Detection probability was higher in unexposed sites ($\beta_{expo}$ = 1.70, 95% CI: 1.01 to 2.39), and lowest when surveys were conducted in cold temperatures during high winds (Figure 2). Detection probability was negatively influenced by wind ($\beta_{wind}$ = −0.19, 95% CI: −0.27 to −0.11, Figure 2) and positively influenced by air temperature ($\beta_{temp}$ = 0.18, 95% CI: 0.12 to 0.25, Figure 2). For example, a survey conducted in the warmest observed temperature (32.2 °C) and lowest wind (1 mph) would have an expected detection probability of 0.22 at an exposed site (95% CI: 0.10 to 0.43, Figure 2), and 0.61 at an unexposed site (95% CI: 0.42 to 0.77, Figure 2).
Conversely, in the coldest temperatures (11.1 °C) and highest winds (16 mph), DBT would be practically undetectable at both exposed and unexposed sites (0.0003, 95% CI: 0.0001 to 0.001 and 0.002, 95% CI: 0.0007 to 0.005, respectively, Figure 2). Thus, maximum detection probability is achieved when sampling in warm conditions and low winds.

Based on AIC, the top abundance model included the exposure classification, day of the year, and an interaction between the two (Table 1). Unexposed sites had, on average, higher expected abundance than exposed sites at the beginning of the season ($\alpha_{unexposed}$ = 0.24, 95% CI: −1.05 to 1.54, estimates on the log scale). Abundance declined through the season, although this decline was faster in unexposed sites ($\alpha_{relday:unexposed}$ = −0.030, 95% CI: −0.039 to −0.018) than in exposed sites ($\alpha_{relday:exposed}$ = −0.010, 95% CI: −0.027 to 0.007; Figure 3). The expected abundance at exposed sites on the first day of the sampling season (day 0: 11 May 2018) was 125 terrapins (95% CI: 39 to 406) and on the last day (day 109: 28 August 2018) was 42 (95% CI: 12 to 144). In contrast, expected abundance at unexposed sites ranged from 160 (95% CI: 84 to 305) at the start of the season to 7 (95% CI: 3 to 15) at the end.
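The reported estimates can be cross-checked against one another with a few lines of arithmetic. The Python sketch below uses only numbers quoted in the text (point estimates, not the fitted model itself), so small discrepancies from rounding are expected. The helper names `logit` and `expit` and the variable names are introduced here for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

# Reported detection coefficients (logit scale).
beta_expo, beta_temp, beta_wind = 1.70, 0.18, -0.19

# Start from the reported detection at an exposed site in warm, calm conditions
# (0.22) and add the exposure effect to obtain the unexposed-site value.
p_unexposed_warm = expit(logit(0.22) + beta_expo)

# Shift the same exposed-site value to the coldest, windiest conditions:
# temperature 32.2 -> 11.1 degrees C, wind 1 -> 16 mph.
shift = beta_temp * (11.1 - 32.2) + beta_wind * (16 - 1)
p_exposed_cold = expit(logit(0.22) + shift)

# Per-day slopes (log scale) implied by the reported expected abundances
# on day 0 and day 109 of the season.
slope_exposed = math.log(42 / 125) / 109
slope_unexposed = math.log(7 / 160) / 109

print(round(p_unexposed_warm, 2), p_exposed_cold)
print(round(slope_exposed, 3), round(slope_unexposed, 3))
```

Under these inputs, the unexposed-site detection probability in warm, calm conditions comes out close to the reported 0.61, the cold and windy value falls inside the reported confidence interval for exposed sites, and the implied seasonal slope is roughly three times steeper at unexposed sites than at exposed sites.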

Finally, we assessed the goodness of fit of our AIC-top model using parametric bootstrapping [25,30]. We conducted 1000 parametric bootstrap simulations (i.e., simulated data from our fitted model) and computed two commonly used goodness of fit statistics: the sum of squared errors (SSE, the sum of the squared residuals [25]) and the Freeman-Tukey statistic (a metric that compares observations to expectations under the model [25]). For both goodness of fit statistics, the observed statistic (SSE or Freeman-Tukey) did not differ significantly from those produced via bootstrap simulations ($p_{SSE}$ = 0.153 and $p_{FT}$ = 0.215), suggesting that our model is a good fit to the observed data (Table A2, Figure 4).
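The two fit statistics and the bootstrap p-value described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure implemented in 'unmarked': in a full parametric bootstrap the model is typically refit to each simulated data set before the statistic is computed, whereas this toy example holds the expected values fixed. All data values and function names here are hypothetical.

```python
import math
import random

def sse(observed, expected):
    """Sum of squared differences between observed and expected counts."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))

def freeman_tukey(observed, expected):
    """Freeman-Tukey statistic: squared differences on the square-root scale."""
    return sum((math.sqrt(o) - math.sqrt(e)) ** 2 for o, e in zip(observed, expected))

def bootstrap_p_value(stat_obs, stats_boot):
    """Proportion of simulated statistics that exceed the observed statistic."""
    return sum(s > stat_obs for s in stats_boot) / len(stats_boot)

# Toy example: expected counts from a hypothetical fitted model with N = 40, p = 0.25.
rng = random.Random(42)
n_true, p, n_scans = 40, 0.25, 5
expected = [n_true * p] * n_scans

def simulate_counts():
    """Simulate one set of scan counts from the assumed binomial model."""
    return [sum(rng.random() < p for _ in range(n_true)) for _ in range(n_scans)]

observed = simulate_counts()   # stands in for the field data
stats_boot = [freeman_tukey(simulate_counts(), expected) for _ in range(1000)]
p_ft = bootstrap_p_value(freeman_tukey(observed, expected), stats_boot)
print("Freeman-Tukey bootstrap p-value:", p_ft)
```

A p-value near 0 would indicate that the observed data are more discrepant from the model than almost all simulated data sets, which is the sense in which values above 0.05 are read as adequate fit.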

Figure 4 (caption fragment). The vertical black line shows the observed test statistic and demonstrates that it is a reasonable assumption that the data could arise from the model used for inference.

Discussion
In this study, we demonstrate how pairing visual head count surveys with N-mixture modeling offers a complementary data collection and analysis framework for efficiently estimating local diamondback terrapin population sizes, while simultaneously accounting for factors that affect detectability and spatial variation in abundance. We found that detection was highest in warmer conditions and, as expected, negatively influenced by wind speed, and that DBTs were more likely to be detected in unexposed sites. Our results suggest that abundance was higher at sites along sheltered stretches of shoreline relative to sites that were exposed, and that per-site abundance declined over the course of the season, a decline that was more pronounced at unexposed sites.
Studies of DBTs have been largely focused on capture mark-recapture (CMR) and are plagued with reported issues of detectability [9,18,19]. Our study confirms that this challenge is not restricted to CMR, but also impacts visual head counts (see also [22]). However, repeat-survey designs, such as those called for under the N-mixture model, are developed specifically to capture variability in imperfect counts and relate that variability to variation in factors (e.g., environmental) thought to influence detectability. While robust design CMR is designed in the same way, an important distinction is that visual head counts do not require physical capture and are designed in line with the species' surfacing and aggregation behavior to maximize detection, and this appears to yield far greater sample sizes. For example, Isdell et al. [22] were able to relate variation in DBT site occupancy to terrestrial-aquatic connectivity using presence-absence surveys of emerged heads; our approach extended this protocol to explicitly model the counts to make inferences about abundance rather than occupancy.
While this precludes explicit estimation of demographic rates (survival and fecundity), when the inference objective is to estimate occurrence or abundance patterns then estimates from unmarked individuals as we have presented here have obvious value (see Figure 3).
Applying the N-mixture model to repeated head count surveys, we were able to both identify, and correct for, factors that influenced detectability and therefore estimate population size free of these specific biases. Specifically, for DBT in Wellfleet Bay, wind reduced detectability, detection was highest in warm conditions, and terrapins were more likely to be detected at sheltered sites than exposed sites (Figure 2). The negative effect of wind on detection was expected, and is likely related either to increased wave chop, which makes heads more difficult to observe, or to behavior, with DBTs surfacing less in higher winds. Likewise, the positive effect of air temperature on detectability is intuitive, considering DBTs are ectothermic and are most active at high tides, when our surveys were conducted [7]. Detectability was higher at unexposed sites (Figure 2), likely because they are relatively (compared to exposed sites) sheltered from the weather and tidal systems that can reduce detectability. Linking potential behavioral responses to environmental conditions using hierarchical models as we have done here, and as demonstrated by Isdell et al. [22], has potential implications for other capture methods that rely on visual detection of terrapins (e.g., dip netting, drones). Our study suggests that the ideal conditions for conducting visual head count surveys in this system are warm conditions with low-to-no wind.
One of the most appealing features of combining visual head counts and N-mixture modeling is the ability to generate estimates of local (scan area) population size for several locations within Wellfleet Bay, and link those estimates to spatially varying covariates. This is in contrast with intensive CMR efforts, which often require multiple seasons to generate sufficient recaptures [18,19], and have been restricted to just two locations within the bay [19]. This is common throughout the range, where CMR estimates of abundance are typically, and justifiably, made over a spatially restricted area that is small and thus not representative of the larger population of interest [11,12]. Moreover, this expansion of the spatial coverage and the statistical inference about DBT population status, at least in Wellfleet Bay, is achieved far more efficiently than with traditional methods [9,15], and arguably yields more management-relevant results. For example, the ability to move beyond point estimates of abundance and start to relate spatial variation in local abundance to habitat characteristics or environmental conditions is potentially far more valuable for informing proactive species- and habitat-specific conservation action.
Our focus here was to demonstrate the utility of visual head counts and N-mixture models as an efficient method for estimating abundance, and as such, we did not exhaustively explore all potential hypothesized predictors of abundance. Instead, we focused on differences between sheltered and exposed shorelines as a simple catch-all proxy for a wide range of environmental disturbance, and importantly, the extent and distribution of preferred salt marsh habitat. However, we were able to capture interesting and intuitive spatiotemporal patterns. We found that abundance was higher in unexposed sites at the beginning of the sampling season, but was lower than exposed sites by the end of the season due to a steeper population decline over the season. Despite being a crude measure of habitat quality, these model predictions are in line with what we would expect: unexposed sites experience less environmental disturbance (e.g., turbidity) where intact saltmarsh habitat is more likely to be found. The start of our sampling period coincides with post-brumation emergence of the DBT, and if sheltered areas are indeed a proxy for higher quality habitat, then our prediction of higher abundance in sheltered areas early in the season would be consistent with the idea that these areas are likely locations for courting and mating aggregations [6,7]. Likewise, DBT disaggregation, and in particular terrestrial nesting by females [7], is consistent with our prediction of reduced abundances in the better quality areas, coupled with an overall reduction in abundance. Our predictions do, however, have large associated uncertainty, especially for exposed sites (Figure 3), and should be interpreted with caution. 
While our measure of exposure broadly captures abundance patterns in Wellfleet Bay, there is more to be done to characterize finer scale drivers of spatiotemporal variation before these results can be used to inform conservation action; this remains an important area of research throughout the DBT range [3]. Encouragingly, though, we have demonstrated that the coupled field and statistical framework we have described here can be used to achieve this, but requires identifying the appropriate data that link covariates to standing hypotheses about DBT ecology. Our next step is to identify and develop a suite of spatiotemporal covariates to formally test hypotheses about drivers of population sizes in both space and time.

Conclusions
The visual head count methodology we describe naturally matches the ecology of DBT, is easy to conduct, requires very little training (i.e., low intensity), and, as demonstrated here, can be used to generate meaningful spatially referenced estimates of local abundance across a region of interest. This contrasts substantially with the widely applied CMR approaches, which involve substantial effort, both in terms of time and expertise (i.e., high intensity), and often suffer from extremely low capture rates that require multiple sampling seasons to generate abundance estimates, and as a result, are typically limited in terms of spatial coverage. Further, we demonstrate the application of N-mixture models, the canonical analytical framework for analyzing such repeated count data, and were able to identify ideal survey conditions that will maximize detection in future surveys, while revealing site-specific variation in estimates of abundance that are consistent with habitat preferences and phenology. We propose visual head count surveys and the associated hierarchical modeling framework as a promising method for generating spatially explicit estimates of diamondback terrapin abundance and investigating drivers of spatiotemporal variation in abundance.

Table A1. Full model selection table for the negative binomial (left side) and Poisson parameterizations of the N-mixture model. Here we show all 40 models for each parameterization. The models are ranked according to their negative binomial AIC scores (i.e., the AIC-preferred model, lowest is better). The Detection (p) and Abundance (λ) model formulations are provided, as is the number of parameters in the model (K), the AIC score, the difference in AIC relative to the top model (∆AIC), the AIC weight (ωAIC), which is a measure of relative model support, and the cumulative AIC weights (ΣωAIC). The '*' denotes interactions between terms and (·) denotes the intercept only (or null) model.

Table A2. Goodness of fit test statistics from a parametric bootstrap of 1000 simulations from the AIC-top model. SSE is the sum of squared errors and Freeman-Tukey compares the observed data to that expected under the model. In the table, $\theta_{obs}$ is the observed test statistic, i.e., that calculated from the actual data, $\theta_{boot}$ is the statistic from data generated from the model, the mean and standard deviation of the differences are the difference between the observed and each simulated data set, and Pr($\theta_{boot}$ > $\theta_{obs}$) is the probability that the observed statistic is greater than expected as compared to the bootstrapped distribution (i.e., p < 0.05 could be interpreted as significantly different). Both p-values are > 0.05, and the model is therefore considered to be adequate.

Figure A1. Visualization of the within-visit variation in counts at sites with at least one observed 0, which we use to demonstrate that variation in the number of heads counted can be related to turnover of specific individuals at the surface. Each panel represents a site, each line within a panel represents a visit, and the points joined by the lines show the number of heads observed in each of the five scans (blue to red color shades correspond to scan one through five). Here we show only visits that had at least a single count of zero to emphasize the fact that there is a high rate of turnover within visits (demonstrated by the variation between scans). We note that these observations are subject to imperfect detection, but conditions for each visit were similar, and in the absence of available empirical data on emergence rates, this serves as good evidence that we are not simply counting exactly the same individuals in each scan, i.e., there is randomness associated with which individuals are available to be detected from scan to scan.

Raw data for all data are provided here archived address here.