## Strategies to Estimate Global Prevalence of Trafficking for Sexual Exploitation

#### Global Estimation Based on Extrapolation

The research community faces two major challenges in estimating the global prevalence of human trafficking for sexual exploitation: the lack of common or agreed-upon measures, and the lack of feasible data collection methods. Both topics have their own difficulties and deserve separate discussions. We tackle the second challenge here by proposing a few data collection strategies predicated on two assumptions: that sexual exploitation occurs mainly in the commercial sex trade, within some definable geographical boundaries, and that potential victims maintain social networks or are linked to others in some fashion.

Our strategies can apply to two primary data scenarios commonly encountered in estimating prevalence in human trafficking activities: using existing records and collecting primary data. The first approach is designed for analysis of identified trafficking cases where a body of records already exists and has to be mined. The second strategy is intended for collecting first-hand data from hard-to-find victims through survey methods. Empirical research on human trafficking remains underdeveloped, and many gaps are still filled with wild claims and sensational stories.

This article attempts to propose workable estimation strategies to close the knowledge gaps on the incidence and prevalence of human trafficking activities. Campaigns against human trafficking cannot be sustained, or even remain credible, without sound empirical evidence.

The need for empirical research and estimates on the scale of human trafficking, and the challenges involved, have been well discussed. Laczko and Gozdziak (2005) detailed the multitude of conceptual and methodological challenges in these research efforts.

Any effort to establish a global estimate of human trafficking activities is fraught with limitations that often invite criticism and even alternative interpretations. Some even question the feasibility of estimating human trafficking activities at a macro level. For instance, Weitzer (2014) listed specific examples of a few notable macro-level estimates gone awry. However, political realities make such estimation necessary so countries and international organizations can mobilize (or de-mobilize) resources for fighting human trafficking or other gross human rights violations.

More importantly, from a scientific point of view, the burden falls upon the research community to answer the question of whether human trafficking is a social ill of sizable and significant proportions that call for proportionate counter-measures. Since global surveys of human trafficking are prohibitively expensive and logistically impractical, some methods of extrapolation (or generalization) must be used, by gathering local and regional data points to produce global estimates. To estimate the unknown, an extrapolation or scale-up scheme must be developed to generalize what is known. As additional local and regional data become available, such a conjecture strategy should improve over time.

#### The GSI Model of Extrapolation

An example of one such extrapolation is the Global Slavery Index (GSI), produced by the Australia-based Walk Free Foundation, which uses the Gallup Poll International to collect country data strategically and then a “vulnerability” model to estimate the prevalence of modern-day slavery for adjacent countries. This approach is based on the belief that countries with similar socio-economic and political conditions are likely to share similar levels of slavery.

While some may question the validity of applying one country’s condition to another, this is probably the only viable way to produce a regional or global estimate. Furthermore, GSI’s vulnerability model has gone through several iterations and reviews since its first appearance in 2014. Based on human security and crime prevention theories, the vulnerability model consists of 24 variables grouped into four dimensions: Political Rights and Safety, Financial and Health Protections, Protection for the Most Vulnerable, and Conflict. Demographers routinely use similar estimation efforts when they must study population trends and shifts between censuses. Extrapolation strategies are also frequently used in public health to estimate the prevalence of diseases.

Such an extrapolation scheme may also work when national surveys are unavailable or impractical to obtain. De Cock (2007) reviewed several strategies to assess the extent of trafficking activities, including national surveys to estimate prevalence, establishment-based surveys to target specific labor sectors, qualitative studies to seek in-depth knowledge about the nature of trafficking victimization, and a national database to gather all cases that came to the attention of police or service organizations. Although rare, some studies have used the traditional survey methods.

The best example is probably the survey conducted by the Gandhi Peace Foundation and National Labor Institute in the late 1970s, which drew a random sample of 1,000 villages in 10 Indian states where peasants were widely known to be attached to the landowners. The study estimated that there were 2.6 million bonded laborers in India. A more recent example was a study by Steinfatt and Baker (2011) in Cambodia, in which the researchers used geographic mapping techniques and informant-interviewers to estimate the population of sex trafficking victims in the country.

#### Strategies for Primary Data Collection

Researchers typically rely on two sources of data to assess human trafficking activities: cases already reported in the media or to government agencies or social service organizations, for which the records exist somewhere, and primary data collection using systematic methods to generate an estimate. Most existing knowledge of human trafficking reflects the first type of data.

Before we present our data collection strategies, a few parameters must be established to specify the boundaries of our methods.

First, sexual trafficking is a business and making money is traffickers’ primary objective. If this assumption stands, then sexual exploitation should occur mostly in places known and accessible to a sufficient population of potential customers, such as in urban centers or along major transit routes.

Second, as with most businesses, sex traffickers seek to maximize earnings. That means sex traffickers must find ways to attract customers, such as advertisements through printed materials or word of mouth. Keeping victims in total isolation and away from one another will increase operational complexity and reduce profit-making potential.

There are, of course, exceptions to these two assumptions. We present these data collection methods to accommodate two basic sets of field conditions: Either we know the geographical boundaries of commercial sex establishments or we have little knowledge of where the victims might be.

**Mark-Recapture Methods.** One of the few viable methods fit for estimating hidden population size is the mark-recapture sampling strategy. Mark-recapture sampling has its roots in wildlife studies and has gained traction in recent years among researchers studying criminal populations such as illicit drug makers and sex workers. The basic logic is fairly straightforward. An initial sample of the target population is identified, and all individuals in the sample are marked (or tagged) and then released back into the population.

After these marked individuals have dispersed, a second sample is drawn. The number of individuals in the second sample who were marked in the initial sample (the recaptures) is used to estimate the population size, based on the principle that the proportion marked in the second sample approximately equals the proportion of marked individuals in the population as a whole, as shown in the classic Lincoln-Petersen estimator,

$$\hat{N} = \frac{S_1 S_2}{R},$$

where $S_1$ is the number marked and released into the population (the size of the first sample); $S_2$ is the size of the second sample; $R$ is the number recaptured in the second sample; and $\hat{N}$ is the estimate of the population size. The smaller the proportion of recaptured individuals, the higher the population turnover, and the larger the estimate of the population size.
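As a minimal sketch of the arithmetic, the Python functions below compute the Lincoln-Petersen estimate and, for comparison, the bias-adjusted variant from Chapman (1951), which stays defined even when nobody is recaptured. The function names and the counts used in the note that follows are illustrative assumptions, not figures from any study.

```python
def lincoln_petersen(first_sample: int, second_sample: int, recaptured: int) -> float:
    """Lincoln-Petersen estimate: N_hat = S1 * S2 / R."""
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed for this estimator")
    return first_sample * second_sample / recaptured


def chapman(first_sample: int, second_sample: int, recaptured: int) -> float:
    """Chapman's variant: (S1 + 1) * (S2 + 1) / (R + 1) - 1."""
    return (first_sample + 1) * (second_sample + 1) / (recaptured + 1) - 1
```

With hypothetical counts of 100 individuals marked, 80 in the second sample, and 20 recaptured, `lincoln_petersen(100, 80, 20)` returns 400.0. Halving the recaptures to 10 doubles the estimate to 800.0, which is the "fewer recaptures, larger population" behavior described above.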

Researchers in the criminal justice community are beginning to recognize the importance of mark-recapture inference strategies. For instance, Brunovskis and Tyldum (2004) successfully used this method to estimate the size of the street-based sex industry population in Oslo. To estimate the size of the underage sex worker population in New York City, Curtis and his colleagues (2008) at the John Jay College of Criminal Justice compared their study sample with official records provided by the city’s Department of Criminal Justice Services.

However, mark-recapture methods face different sets of challenges when applied to human populations. Since the recruitment patterns of human populations can be radically different from wildlife populations, such as in the form of “self-selection,” complicated mark-recapture models are usually required to obtain meaningful estimates.

**Respondent-Driven Sampling (RDS).** Heckathorn (1997) developed a network-based method called *respondent-driven sampling* (RDS) that purports to eliminate the biases inherent in chain-referral sampling methods. The RDS approach relies on Markov chain theory to achieve diversity and equilibrium (the point at which the composition of successive waves no longer mirrors that of the initial sample) through successive waves of subject recruitment. This method modifies the traditional snowball sampling design through two basic changes. First, it employs a dual-incentive system whereby subjects are rewarded both for participating and for recruiting others into the study. Second, it uses referral coupons, so subjects do not have to identify referrals to a researcher, and the resulting anonymity encourages participation.

Confining the recruitment opportunities through a structured process ensures diversity, which can then be verified empirically. Also, volunteerism is minimized, since a dual-incentive system is used to encourage both participation and recruitment. Such a recruitment procedure also prevents researchers from deliberately seeking out particular subjects. “Masking” is minimized since group members are not pointed out to researchers but are instead recruited by other group members. Homophily is also minimized since recruitment is limited to three subjects per participant, and theories of Markov chains indicate that equilibrium can be achieved through a relatively small number of waves. Finally, RDS minimizes biases that may be introduced by those with larger personal networks. The RDS method has been successfully used in many studies on hard-to-reach populations.
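The Markov chain argument can be illustrated with a small sketch. It assumes a population split into two hypothetical groups and a made-up recruitment matrix (row *i* gives the probability that a recruiter from group *i* recruits someone from each group); none of these numbers comes from an actual RDS study. The point is only that the wave-by-wave composition settles to the same equilibrium regardless of who the seeds were.

```python
def next_wave(composition, transition):
    """Expected group shares in the next recruitment wave."""
    groups = range(len(composition))
    return [sum(composition[i] * transition[i][j] for i in groups) for j in groups]


def waves_to_equilibrium(seeds, transition, tol=0.01, max_waves=100):
    """Iterate waves until the composition changes by less than `tol`."""
    current = list(seeds)
    for wave in range(1, max_waves + 1):
        following = next_wave(current, transition)
        if max(abs(a - b) for a, b in zip(current, following)) < tol:
            return wave, following
        current = following
    return max_waves, current


# Hypothetical recruitment probabilities between groups A and B.
RECRUITMENT = [[0.7, 0.3],
               [0.4, 0.6]]

waves_a, mix_a = waves_to_equilibrium([1.0, 0.0], RECRUITMENT)  # all seeds from A
waves_b, mix_b = waves_to_equilibrium([0.0, 1.0], RECRUITMENT)  # all seeds from B
```

In this toy setting both runs converge within a handful of waves to roughly the same mix (about 4/7 from group A), even though the seed samples were as different as they could be.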

**Adaptive sampling.** Adaptive sampling was first conceived to study unevenly distributed populations such as endangered species or highly clustered, hidden drug-using populations. The method exploits the ability to observe units adjacent (neighboring) to a sampled unit once a unit of high interest has been found. The procedure retains the attractive features of conventional sampling strategies, such as the ability to obtain unbiased estimators and to control final sample sizes.

Upon selection of initial samples, one can develop referrals (also called nominations) and, among sampled subjects, observe overlaps and map relations, thus adaptively building up the final sample. A Rao-Blackwell inference strategy outlined by Vincent and Thompson (2016) has the ability to incorporate the adaptively selected members into the inference procedure without introducing any bias to population size and other population quantity estimators.

Estimation of the size of a hard-to-reach population with an adaptive sampling design has received little attention in the literature. The earliest known work, based on a snowball sampling design, was developed by Frank and Snijders (1994). More recently, Felix-Medina and Thompson (2004) developed a method that is based on the assumption that recruitment can be accomplished through the availability of a partial sampling frame for the hidden population and that referrals are made in a predictable fashion.

Adaptive sampling has received validation through an empirical simulation of actual data observed from a population at high risk for HIV/AIDS in a Colorado Springs study (Vincent and Thompson, 2016). The population consists of 595 individuals, and links in the network represent a drug-sharing relationship between pairs of individuals. Sampling is based on selecting initial samples at random and then adding a set of units via link-tracing. In the corresponding inferential setup, all referrals made between members in the final sample must be observed. However, referrals made to individuals outside the sample do not have to be observed.

In this validation study, Vincent and Thompson estimated the population size and other population quantities, and found that, even with a small amount of adaptive effort, the new strategy gains significantly in precision over its conventional counterpart. This is the most appealing feature of adaptive sampling: its ability to improve estimation rapidly with the addition of new observations recruited from the existing study subjects.

#### Estimating Prevalence Using Existing Records

Data mining among existing records has been used for some time to estimate human trafficking activities. In a recent prominent example, the ILO (2012) released its estimates of global trafficking victimization, obtained by applying the mark-recapture procedure to published reports and accounts of identified victims. The ILO used two teams of researchers working separately to code and verify all reported cases. By applying the mark-recapture methods to these identified cases, the ILO estimated the total number of forced laborers to be around 20.9 million, with the vast majority being exploited by individual employers or private enterprises. Victims of forced sexual exploitation made up about 22% of the victims.

Despite this significant undertaking, the ILO acknowledges the limitations of using existing victim reports and calls for increased efforts in primary data collection through national or regional surveys, where mark-recapture methods may be more appropriate.

Essentially, the ILO sampling method exploits the principle of mark-recapture by having two separate and independent teams of research assistants each build its own database of all the reported cases of forced labor it can find. The idea is that the reports found by one team represent a sample of identified forced labor incidents, and the reports found by both teams represent the overlap between the two “independent” samples.

Following the same ILO sampling logic, the basic mark-recapture model assumes a binomial probability distribution of the sample cases. Therefore, a trafficking report is either “captured” or “not captured” with respective probabilities *p* and 1-*p*. The values of *p* are the same for all reports, but may differ between the teams, say *p*=*p*_{1} for team 1 and *p*=*p*_{2} for team 2.
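Under this binomial model, the two-team estimate follows from a short moment argument. The symbols $N$ (the total number of reports), $n_1$ and $n_2$ (the number of reports found by each team), and $m$ (the number found by both) are introduced here for illustration, and the sketch assumes the two teams search independently.

$$E(n_1) = N p_1, \qquad E(n_2) = N p_2, \qquad E(m) = N p_1 p_2$$

$$\Rightarrow \quad \hat{p}_1 = \frac{m}{n_2}, \qquad \hat{p}_2 = \frac{m}{n_1}, \qquad \hat{N} = \frac{n_1 n_2}{m}.$$

The last expression is the Lincoln-Petersen estimator again, with the two teams playing the roles of the first and second samples.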

Since global estimates are not likely to be generated from primary data collected systematically from around the world, data mining must be used. The ILO example shows that it is possible and statistically sound to exploit reported cases that often represent some of the worst cases in trafficking violations. Moreover, the ILO method can be enhanced. For instance, with sufficient funding, one may explore the strategy of multiple recaptures where, for instance, four teams of research assistants are assigned to look for sex trafficking cases, with each team representing a sampling occasion. Elaborate mark-recapture models permit any combination of the following analysis scenarios:

• Samples based on captures made by individual teams will comprise hypothetical samples, so mark-recapture analyses can be applied to more than just a two-sample setup and more elaborate mark-recapture estimators can be used. For example, under the null model, $M_0$, where all capture probabilities are equal within and between sampling occasions, the probability distribution of the set of possible capture histories (vectors of zeros and ones denoting miss and capture, respectively, for the capture occasions) is

$$P\left(\{n_\omega\}\right) = \frac{N!}{\left[\prod_{\omega(1)} n_\omega!\right]\,(N - M_{t+1})!}\; p^{\,n}\,(1-p)^{\,tN - n},$$

where $n_\omega$ is the number of individuals with capture history $\omega$, $\omega(1)$ indexes capture histories with at least one capture, $N$ is the total number of individuals in the population, $M_{t+1}$ is the number of distinct individuals captured, $p$ is the probability of capture, $t$ is the number of sampling occasions, and $n$ is the total number of captures in the study. The estimators for the population size and capture probability are those that maximize the equation.

• Heterogeneity/stratification effects, so heterogeneity of capture probabilities within each sample is permitted; for example, units captured from academic publications will be treated differently from units captured from newspaper articles, and heterogeneity effects can also be attributed to the research teams. With such a model, the hypothesis is that the capture probabilities arise from a theoretical distribution $F$, so

$$E(f_i) = N \int_0^1 \binom{t}{i}\, p^{\,i} (1-p)^{\,t-i}\, dF(p), \qquad i = 0, 1, \ldots, t,$$

where $f_i$ is the number of units caught $i$ times and $t$ is the number of sampling occasions. One popular estimator with this setup is the lower-bound estimator, as derived by Chao (1989),

$$\hat{N} = M_{t+1} + \frac{f_1^2}{2 f_2},$$

where $M_{t+1}$ is the number of distinct units captured.

• Behavioral and time effects, so the probability of a captured unit is permitted to change from sample to sample; that is, the probability a unit is captured by one research team is permitted to be different from that by another research team. For example, with the time-effects model, $M_t$, Chao (1989) derived a lower-bound estimator for the population size. The bias-corrected Chao estimator is

$$\hat{N} = M_{t+1} + \frac{\sum_{i<j} Z_i Z_j}{f_2 + 1},$$

where $Z_j$ is the number of individuals captured only on the $j$th occasion, $M_{t+1}$ is the number of distinct individuals captured, and $f_2$ is the number of individuals captured exactly twice.

In each case, estimates of the number of individuals who are trafficked can be extrapolated by extracting the estimated mark-recapture parameters to estimate the probability of capturing trafficking cases at each geographical location, and then dividing the case size (the number of trafficking victims reported in the case) by this probability.
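The three estimators listed above can be sketched in a few lines of Python. This is a minimal illustration rather than a substitute for dedicated software: the function names are hypothetical, the $M_0$ fit uses a plain scan over integer population sizes (for a fixed $N$ the likelihood is maximized at $p = n/(tN)$), and the two Chao functions implement the standard forms $M_{t+1} + f_1^2/(2f_2)$ and $M_{t+1} + \sum_{i<j} Z_i Z_j/(f_2+1)$.

```python
import math


def m0_estimate(distinct, total_captures, occasions, max_factor=50):
    """Profile-likelihood estimate of (N, p) under the null model M_0.

    Terms of the likelihood that involve neither N nor p are dropped.
    An estimate sitting at the upper search bound should not be trusted.
    """
    if distinct <= 0:
        raise ValueError("need at least one captured unit")
    best = None
    for n_pop in range(distinct, distinct * max_factor + 1):
        p = total_captures / (occasions * n_pop)
        if p >= 1.0:
            continue  # log(1 - p) is undefined if every unit is always caught
        log_lik = (math.lgamma(n_pop + 1) - math.lgamma(n_pop - distinct + 1)
                   + total_captures * math.log(p)
                   + (occasions * n_pop - total_captures) * math.log(1.0 - p))
        if best is None or log_lik > best[0]:
            best = (log_lik, n_pop, p)
    return best[1], best[2]


def chao_mh_lower_bound(distinct, caught_once, caught_twice):
    """Chao lower bound under heterogeneity: M + f1^2 / (2 * f2)."""
    if caught_twice <= 0:
        raise ValueError("f2 must be positive")
    return distinct + caught_once ** 2 / (2 * caught_twice)


def chao_mt_bias_corrected(distinct, only_on_occasion, caught_twice):
    """Bias-corrected Chao bound under time effects.

    `only_on_occasion[j]` is Z_j, the number caught on occasion j only.
    """
    total = sum(only_on_occasion)
    squares = sum(z * z for z in only_on_occasion)
    pair_sum = (total * total - squares) / 2  # sum of Z_i * Z_j over i < j
    return distinct + pair_sum / (caught_twice + 1)
```

As a sanity check with two hypothetical teams that find 100 and 80 reports with 20 in common (160 distinct reports, 180 captures in total), `m0_estimate(160, 180, 2)` lands close to 400, and `chao_mt_bias_corrected(160, [80, 60], 20)` reproduces Chapman's two-sample estimator, $(100+1)(80+1)/(20+1) - 1$.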

Essentially, the Horvitz-Thompson estimator is used in this case. For example, let $\hat{p}_i$ be the estimated capture probability for case $i$ over the duration of the study; that is, the probability it is captured at least once. Let $y_i$ be the number of reported victims of trafficking associated with this case. An estimate for the total number of trafficked victims is

$$\hat{Y} = \sum_{i} \frac{y_i}{\hat{p}_i},$$

where the sum runs over the captured cases.
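A short sketch of this weighting step follows. The helper names and the example figures are assumptions made for illustration; in practice the capture probabilities would come from a fitted mark-recapture model.

```python
def capture_probability(per_occasion_probs):
    """Probability of being captured at least once over all occasions."""
    miss = 1.0
    for p in per_occasion_probs:
        miss *= 1.0 - p
    return 1.0 - miss


def horvitz_thompson_total(cases):
    """Sum of y_i / p_i over captured cases.

    `cases` is an iterable of (reported_victims, capture_probability) pairs.
    """
    total = 0.0
    for victims, prob in cases:
        if not 0.0 < prob <= 1.0:
            raise ValueError("capture probabilities must lie in (0, 1]")
        total += victims / prob
    return total
```

For instance, a case with 3 reported victims and a 0.5 chance of being found counts as 6 victims, and a case with 10 victims and a 0.25 chance counts as 40, so `horvitz_thompson_total([(3, 0.5), (10, 0.25)])` gives 46.0. Cases that are hard to find are weighted up because each one stands in for several that were missed.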

#### Limitations of the Mark-Recapture Approach

Aside from the possible violations of the assumptions behind the mark-recapture models, there are major limitations with this approach. First, mark-recapture methods rely on independent samples from a “hidden” population that is impossible or impractical to enumerate (one cannot, for example, drain a pond to count all the fish). However, publications these days are rarely inaccessible through publicly available venues, especially the Internet. If all teams of research assistants exercise due diligence, theoretically they should find all known cases of trafficking reported in the media, government reports, or agency reports. In other words, trafficking cases uncovered by all teams of research assistants should be identical and the overlap should be 100 percent, or close to it. If the two “independent” samples overlap perfectly, the mark-recapture method becomes pointless.

Second, there will inherently be some dependence within lists—that the probability of one “source” being captured can easily influence another “source” being captured on the same sampling occasion; that is, by the same research team. For example, a magazine may report two or more cases of trafficking victims being rescued in a major city. Evidently, if one case is captured, then it is very likely that the other case will also be captured. This violates one of the basic assumptions in mark-recapture—that capture probabilities between individuals are independent within sampling occasions.

There are, however, ways to mitigate these problems. For instance, one can avoid sampling with dependence within sampling occasions by recording only the first captured case that the team encounters, stopping, and then starting from scratch to find a new captured case. Also, the original approach could be used to come up with a semi-exhaustive set of captured cases, randomly permuting them and then taking the final sample to be every *k*th captured case in the permuted list. One can also repeat the mark-recapture inference procedure by re-permuting and evaluating the estimator over these lists, each based on every *k*th entry.

This strategy helps to dampen the effect of dependence; consider the analogous case of autocorrelation, where using only every *k*th entry removes dependence. With the arrival of mark-recapture software, statisticians can carry out sophisticated analyses these days. Available software includes the Rcapture package in R (Rivest and Baillargeon, 2014; R package version 1.4-2); Program Mark; and CARE (Chao, A., et al., 2001).
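The permute-and-thin idea described above might be sketched as follows. The function name, the fixed seed, and the choice to return plain lists are all assumptions made for illustration; any mark-recapture estimator can then be evaluated on each thinned list and the results averaged.

```python
import random


def thinned_samples(cases, k, repeats, seed=0):
    """Randomly permute the captured cases and keep every k-th one.

    Repeating the permutation `repeats` times yields several thinned
    lists over which an estimator can be re-evaluated.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    samples = []
    for _ in range(repeats):
        permuted = list(cases)
        rng.shuffle(permuted)
        samples.append(permuted[k - 1::k])  # entries k, 2k, 3k, ...
    return samples
```

With ten captured cases and `k=3`, each thinned list holds three cases, so two cases reported in the same magazine article are unlikely to land in the same list.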

#### Linking Sources: Applying Adaptive Sampling to Record Search

Using the same logic as with adaptive sampling, reported trafficking cases can be conceived of as being linked (or networked), in the sense that one source nominates another source, perhaps via weblinks if online or via references/citations if found in an academic press. In this case, it would be possible to apply an existing network sampling approach to estimate the population size. If links are abundant within the network, then simulation studies show that this approach is likely to result in more precise estimators than those obtained with a mark-recapture approach, and keeping multiple teams independent would no longer be a concern; the more the researchers communicate their findings to one another, the greater the precision in this one-sample approach.

An outline of this strategy is:

Define links (also called referrals or nominations) to other cases of sex trafficking in a suitable manner. For example, references from an academic source to another case, or a weblink from an online source to another case, would be considered a directed link (links are defined so such relations can be mapped and further selection from the literature can be steered to promising areas to adaptively build up the sample).

Carry out the following steps:

• Partition the population into strata, possibly based on whether a reported case falls under the academic press, news reports, NGO reports, or government reports category.

• Define $U_k$ to be stratum $k$. Select a random sample within each stratum. Suppose for stratum $k$ the sample is $S_{0k}$ and $n_{0k}$ is the number of units in this sample. For stratum $l$, define $r_{lk}$ to be the number of links from $S_{0l}$ to $S_{0k}$ and $s_{lk}$ to be the number of links from $S_{0l}$ to $U_k \setminus S_{0k}$.

• A consistent estimator of the size of stratum *k* is

The population size estimator is then the sum of the strata size estimators,

$$\hat{N} = \sum_{k} \hat{N}_k,$$

where $\hat{N}_k$ denotes the estimated size of stratum $k$.

• To obtain the Rao-Blackwellized (improved) estimators, employ a computationally intensive and elaborate Markov chain Monte Carlo procedure (see Vincent, 2016, for details). To best apply this procedure, links should be traced from a subset of nominations from each unit selected for the initial sample. Variance can be estimated with a jackknife procedure outlined by Vincent.
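To show how the link counts carry information about population size, here is a deliberately simplified single-stratum sketch. It is not the stratified estimator referred to in the steps above. It assumes the initial sample of $n_0$ units is a simple random sample from $N$ units and that links point to other units without regard to who was sampled, so a given link lands inside the sample with probability $(n_0 - 1)/(N - 1)$. Equating that to the observed share $r/(r+s)$ gives $\hat{N} = 1 + (n_0 - 1)(r + s)/r$. The function name and the example counts are hypothetical.

```python
def link_tracing_size(initial_sample_size, links_within, links_outside):
    """Single-stratum moment estimate of population size from link counts.

    links_within  (r): links from sampled units to other sampled units
    links_outside (s): links from sampled units to units outside the sample
    """
    if initial_sample_size < 2:
        raise ValueError("need at least two sampled units")
    if links_within <= 0:
        raise ValueError("need at least one link inside the initial sample")
    total_links = links_within + links_outside
    return 1 + (initial_sample_size - 1) * total_links / links_within
```

If 50 sampled sources cite one another 10 times and cite unsampled sources 90 times, only a tenth of the links stay inside the sample, and `link_tracing_size(50, 10, 90)` returns 491.0. If every link stayed inside the sample, the estimate would collapse to the sample size itself.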

Data sources for this link-tracing method can include media reports (newspapers, radio, TV, Internet sites); local, national, regional, international, or thematic NGOs; government documents, from ministries of justice, labor, social affairs, migration, foreign affairs, and interior or from special police or other units dedicated to combating trafficking and forced labor; other international organizations, through their national offices or headquarters; academic reports; ILO reports, including those of the Committee of Experts on the Application of Conventions and Recommendations (CEACR); and trade union and employers’ organization reports.

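To estimate the number of victims of sex trafficking, define $\bar{y}_{S_{0k}}$ to be the average number of victims reported per unit in $S_{0k}$. An estimate for the number of victims of sex trafficking in stratum $k$ is then

$$\hat{Y}_k = \hat{N}_k \, \bar{y}_{S_{0k}},$$

where $\hat{N}_k$ is the estimated size of stratum $k$.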

Summing these values over all strata gives an estimate of the victim population size. A plug-in variance estimate is

This estimator may be regarded as an estimate for the total number of *reported* victims of sex trafficking. To estimate the combined number of reported and unreported victims in a stratum/geographic region, knowledge from independent studies based on primary data may be required. In this case, it might be possible to approximate the proportion of existing victims who are reported, $p_k$, say, through a regression model based on socio-economic factors, much as the GSI vulnerability model does. For example,

$$\hat{p}_k = f(X_k; \hat{\beta}),$$

where $\hat{p}_k$ is the estimated value of $p_k$, the $\hat{\beta}$ values are the estimated coefficients of the regression model $f$, and the $X$ values are the reported demographic values (economic development, rate of crime, income, level of education, etc.). An estimate for the number of unreported cases in this stratum/geographic region is then

$$\hat{Y}_k \, \frac{1 - \hat{p}_k}{\hat{p}_k},$$

where $\hat{Y}_k$ is the estimated number of reported victims in stratum $k$.

A jackknife routine could be used to obtain variance estimates of this estimator.
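The scale-up from reported to unreported victims can be sketched as below. The logistic form chosen for the regression model is purely an assumption for illustration (the text does not specify $f$), as are the function names; the coefficients would have to be estimated from independent primary-data studies.

```python
import math


def reported_share(covariates, coefficients, intercept=0.0):
    """Hypothetical logistic model for the share of victims who are reported."""
    if len(covariates) != len(coefficients):
        raise ValueError("one coefficient is needed per covariate")
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def unreported_victims(reported, share):
    """Unreported count implied by a reported count and a reported share."""
    if not 0.0 < share <= 1.0:
        raise ValueError("the reported share must lie in (0, 1]")
    return reported * (1.0 - share) / share
```

If an estimated 100 reported victims in a region are believed to represent one tenth of all victims, `unreported_victims(100, 0.1)` implies about 900 unreported victims. The estimate is very sensitive to the reported share when that share is small, which is why the choice of multiplier matters so much.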

The most appealing feature of an adaptive sampling design is the expected gain in efficiency relative to conventional designs with comparable sampling effort. Vincent and Thompson (2016) find that even with a small amount of adaptive effort, the new strategy gains significantly in precision over its conventional counterparts, such as standard population size and mark-recapture estimators.

When searching the literature, teams of research assistants need not be concerned with conducting an exhaustive and intensive search of all reported cases because the estimators have been shown to be highly efficient even with modest adaptive sampling effort. In this adaptive sampling approach, the more the research teams communicate with one another, the better the estimator.

#### Guessing the Tip of the Iceberg

The biggest methodological challenge in using reported trafficking cases to estimate prevalence is evaluating the dark figure. In criminology parlance, the “dark figure” refers to the gap between crimes that are reported and those unknown to the authorities. Although the true extent of human trafficking or modern-day slavery will never be known, there is little disagreement that official crime statistics represent only a small portion of the actual crimes committed.

If we know approximately what share of the “iceberg” of all human trafficking activities the reported cases represent, we can then derive some estimates of the magnitude of the problem. The 2012 ILO study used select country surveys to provide the basis for such extrapolation. Once a multiplier is constructed, the extrapolation becomes easy. However, few, if any, studies have ever attempted to construct such multipliers.

Using reported trafficking cases to estimate unreported cases (the dark figure) is problematic even in the best situations. Researchers have compared self-reported criminal activities to those in official arrest records (mostly among juvenile delinquents), and it is generally agreed that fewer than roughly 10 percent of all criminal activities ever come to the attention of the authorities. However, such comparisons also show that certain offenses, such as homicides and auto thefts, tend to have much higher reporting rates than others, such as assaults and drug offenses.

There is no known research literature on the proportion of trafficking victims known to the authorities, or on the ratio of hidden to known trafficking cases; in other words, no one knows how large the tip of the iceberg is. Any extrapolation or scale-up estimation based on reported trafficking cases inevitably will be stigmatized as pure speculation that can be neither proven nor disproven.

One strategy to overcome this problem is to follow the ILO example and compare survey data against cases reported in the news media. A survey question can be added to ascertain whether a victim has reported his/her case to the authorities or the news media.

There are no easy solutions. Any attempts at establishing global estimation of human trafficking activities are bound to encounter criticism and doubts, simply because there are few practical ways to study the problem without enormous budget implications.

#### Promising Strategies

There is a great need for empirical research on trafficking for sexual exploitation, and two promising strategies may help answer the fundamental question of the magnitude of the problem. Both methods have found their way into recent research papers and ILO reports, as discussed above. Because of the needed resources implied in both methods, the extent to which researchers may apply these methods in their empirical work remains unclear. However, without reliable information about the scope of the problem, most seeking to influence policy-makers must resort to sensational claims and moral appeals, which sooner or later will create a credibility problem. Empirical research involving primary data collection can also help police agencies devise effective counter-measures.

#### Further Reading

Abdul-Quader, A.S., Heckathorn, D.D., Sabin, K., and Saidel, T. 2006. Implementation and analysis of respondent driven sampling: lessons learned from the field. *Journal of Urban Health 83*.S1:1–5.

Bouchard, M., and Tremblay, P.J. 2005. Risks of arrest across markets: a capture-recapture analysis of hidden dealer and user populations. *Drug Issues 34*, 733–754.

Bouchard, M. 2007. A capture-recapture model to estimate the size of criminal populations and the risks of detection in a marijuana cultivation industry. *Journal of Quantitative Criminology, 23*(3):221–241.

Brunovskis, A., and Tyldum, G. 2004. Crossing borders: an empirical study of transnational prostitution and trafficking in human beings. Fafo-Report 426.

Chao, A. 1989. Estimating population size for sparse data in capture-recapture experiments. *Biometrics 45*:427–438.

Chao, A. 2001. An Overview of Closed Mark-recapture Models. *Journal of Agricultural, Biological, and Environmental Statistics 6*:158–175.

Chao, A., Tsay, P.K., Lin, S.-H., Shau, W.-Y., and Chao, D.-Y. 2001. The applications of mark-recapture models to epidemiological data. *Statistics in Medicine 20*:3123–3157. *doi:10.1002/sim.996*.

Chapman, D. 1951. Some properties of the hypergeometric distribution with applications to zoological sample census. *University of California Publications in Statistics 1*, 131–160.

Collins, M.F., and Wilson, R.M. 1990. Automobile theft: Estimating the size of the criminal population. *Journal of Quantitative Criminology 6*, 395–409.

Curtis, R., Terry, K., Dank, M., Dombrowski, K., and Khan, B. 2008. *Commercial Sexual Exploitation of Children in New York City, Volume One: The CSEC Population in New York City: Size, Characteristics, and Needs*. New York, NY: Center for Court Innovation.

De Cock, M. 2007. *Directions for national and international data collection on forced labor (Working Paper No.30)*. Geneva, Switzerland: International Labor Organization.

Felix-Medina, M.H., and Thompson, S.K. 2004. Combining link-tracing sampling and cluster sampling to estimate the size of hidden populations. *Journal of Official Statistics 20*:19–38.

Frank, O., and Snijders, T. 1994. Estimating the size of hidden populations using snowball sampling. *Journal of Official Statistics 10*(1):53–67.

Gozdziak, E., and Collett, E.A. 2005. Research on Human Trafficking in North America: A Review of Literature. *International Migration 43*(1/2):99–128.

Heckathorn, D.D. 2007. Extensions of respondent-driven sampling: analyzing continuous variables and controlling for differential recruitment. *Sociological Methodology 37*(1):151–207.

Heckathorn, D.D. 2002. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. *Social Problems 49*(1):11–34.

Heckathorn, D.D. 1997. Respondent-driven sampling: a new approach to the study of hidden populations. *Social Problems 44*(2):174–199.

International Labor Organization (ILO). 2011. Hard to see, harder to count: Survey guidelines to estimate forced labor of adults and children. Geneva, Switzerland: ILO.

International Labor Organization. 2012. ILO global estimate of forced labor: Results and methodology. Geneva, Switzerland: ILO: Special Action Program to Combat Forced Labor (SAP-FL).

Klovdahl, A., Potterat, J., Woodhouse, D., Muth, J., Muth, S., and Darrow, W. 1994. Social networks and infectious disease: the Colorado Springs study. *Social Science & Medicine 38*:79–88.

Kwanisai, M. 2004. Estimation in link-tracing designs with subsampling. PhD diss., Pennsylvania State University.

Laczko, F., and Gozdziak, E. 2005. Data and research on human trafficking: A global survey. A special issue of *International Migration 43*(1/2):5–16.

Laczko, F., and Gramegna, M. 2003. Developing better indicators of human trafficking. *Brown Journal of World Affairs 10*(1):179–194.

Larsen, J.J., Datta, M.N., and Bales, K. 2015. Modern slavery: A global reckoning. *Significance 12*(5):33–35.

Petersen, C. 1896. The yearly immigration of young plaice into the Limfjord from the German Sea. *Report of the Danish Biological Station 6*:5–84.

Rivest, L., and Baillargeon, S. 2014. Rcapture: Loglinear Models for Capture-Recapture Experiments. R package version 1.4-2.

Roberts, J.M., and Brewer, D.D. 2006. Estimating the prevalence of male clients of prostitute women in Vancouver with a simple capture-recapture method. *Journal of the Royal Statistical Society: Series A 169*(4):745–756.

Robson, D.S., and Regier, H.A. 1964. Sample size in Petersen capture-recapture experiments. *Transactions of the American Fisheries Society 93*:215–226.

Rossmo, D.K., and Routledge, R. 1990. Estimating the Size of Criminal Populations. *Journal of Quantitative Criminology 6*:293–314.

Sarma, M. 1981. *Bonded labour in India*. New Delhi: Biblia Impex.

Steinfatt, T.M., and Baker, S. 2011. *Measuring the extent of sex trafficking in Cambodia: 2008*. Bangkok, Thailand: United Nations Interagency Project on Human Trafficking.

Thompson, S.K., and Seber, G.A.F. 1996. *Adaptive Sampling*. Wiley Series in Probability and Statistics. New York, NY: Wiley.

Tyldum, G., and Brunovskis, A. 2005. Describing the unobserved: methodological challenges in empirical studies on human trafficking. *International Migration 43*(1–2):17–34.

Vincent, K., and Thompson, S.K. 2016. Estimating population size with link-tracing sampling. *Journal of the American Statistical Association*. Accepted for publication.

Vincent, K. 2016. Recent advances in estimating population size with link-tracing sampling. Submitted.

Weitzer, R. 2014. New Directions in Research on Human Trafficking. *Annals of the American Academy of Political and Social Science 653*(1):6–24.

Zhang, S.X., Spiller, M.W., Finch, B.C., and Qin, Y. 2014. Estimating labor trafficking among unauthorized migrant workers in San Diego. *Annals of the American Academy of Political and Social Science 653*(1):65–86.

#### About the Authors

Sheldon Zhang, PhD, is professor and chair of the School of Criminology and Justice Studies, University of Massachusetts Lowell, and has been conducting research on transnational human smuggling and trafficking activities for more than two decades. He has published extensively on this topic.

Kyle Vincent, PhD, received his PhD in statistics from Simon Fraser University. His research focuses on developing link-tracing-based strategies for studying elusive networked populations.