Join the 200th Anniversary Celebration

Perspective

Digital Disease Detection — Harnessing the Web for Public Health Surveillance

John S. Brownstein, Ph.D., Clark C. Freifeld, B.S., and Lawrence C. Madoff, M.D.

N Engl J Med 2009; 360:2153-2157May 21, 2009

Article

The Internet has become a critical medium for clinicians, public health practitioners, and laypeople seeking health information. Data about diseases and outbreaks are disseminated not only through online announcements by government agencies but also through informal channels, ranging from press reports to blogs to chat rooms to analyses of Web searches (see Digital Resources for Disease Detection). Collectively, these sources provide a view of global health that is fundamentally different from that yielded by the disease reporting of the traditional public health infrastructure.1

Digital Resources for Disease Detection.

Sample Web-based data sources

ProMED-mail, www.promedmail.org

Global Public Health Intelligence Network (GPHIN), www.phac-aspc.gc.ca/media/nr-rp/2004/2004_gphin-rmispbk-eng.php

HealthMap, www.healthmap.org

MediSys, http://medusa.jrc.it

EpiSPIDER, www.epispider.org

BioCaster, http://biocaster.nii.ac.jp

Wildlife Disease Information Node, http://wildlifedisease.nbii.gov

H5N1 Google Earth mashup, www.nature.com/avianflu/google-earth

Avian Influenza Daily Digest and blog, www.aidailydigest.blogspot.com

Google Flu Trends, www.google.org/flutrends

Google Insights for Search, www.google.com/insights/search

DiSTRIBuTE, www.syndromic.org/projects/DiSTRIBuTE.htm

GeoSentinel, www.istm.org/geosentinel/main.html

Emerging Infections Network, http://ein.idsociety.org

Argus, http://biodefense.georgetown.edu

Sample health-related social-networking sites

Physicians, www.sermo.com

Patients, www.patientslikeme.com

Everyone, www.healthysocial.org

Over the past 15 years, Internet technology has become integral to public health surveillance. Systems using informal electronic information have been credited with reducing the time to recognition of an outbreak, preventing governments from suppressing outbreak information, and facilitating public health responses to outbreaks and emerging diseases. Because Web-based sources frequently contain data not captured through traditional government communication channels, they are useful to public health agencies, including the Global Outbreak Alert and Response Network of the World Health Organization (WHO), which relies on such sources for daily surveillance activities.

Early efforts in this area were made by the International Society for Infectious Diseases' Program for Monitoring Emerging Diseases, or ProMED-mail, which was founded in 1994 and has grown into a large, publicly available reporting system, with more than 45,000 subscribers in 188 countries.2 ProMED uses the Internet to disseminate information on outbreaks by e-mailing and posting case reports, including many gleaned from readers, along with expert commentary. In 1997, the Public Health Agency of Canada, in collaboration with the WHO, created the Global Public Health Intelligence Network (GPHIN), whose software retrieves relevant articles from news aggregators every 15 minutes, using extensive search queries. ProMED and GPHIN played critical roles in informing public health officials of the outbreak of SARS, or severe acute respiratory syndrome, in Guangdong, China, as early as November 2002, by identifying informal reports on the Web through news media and chat-room discussions.

More recently, the advent of openly available news aggregators and visualization tools has spawned a new generation of disease-surveillance “mashups” (Web application hybrids) that can mine, categorize, filter, and visualize online intelligence about epidemics in real time. For instance, HealthMap (see imageScreen Shot of HealthMap during the Recent Salmonella Typhimurium Outbreak.) is an openly available public health intelligence system that uses data from disparate sources to produce a global view of ongoing infectious disease threats. It has between 1000 and 150,000 users per day, including public health officials, clinicians, and international travelers. Other similar systems include MediSys, Argus, EpiSPIDER, BioCaster, and the Wildlife Disease Information Node. Automated analysis of online video materials and radio broadcasts will soon provide additional sources for early detection.

The ease of use of blogs, mailing lists, RSS (Really Simple Syndication) feeds, and freely available mapping technology has meant that even an individual expert can create an important global resource. For instance, Declan Butler, a reporter at Nature, took aggregated data from various sources to provide a view of the spread of H5N1 avian influenza on a Google Earth interface. Similarly, Claudinne Roe of the Office of the Director of National Intelligence produces the Avian Influenza Daily Digest and blog, a collection of unclassified information about confirmed and suspected human and animal cases of H5N1 influenza.

Although news media represent an important adjunct to the public health infrastructure, the information they report pales in comparison to the potential collective intelligence that can be garnered from the public. An estimated 37 to 52% of Americans seek health-related information on the Internet each year, generally using search engines to find advice on conditions, symptoms, and treatments. Logs of users' chosen keywords and location information encoded in their computers' IP (Internet Protocol) addresses can be analyzed to provide a low-cost data stream yielding important insights into current disease trends.3 The power of these data has been demonstrated by studies of search engines provided by Google4 and Yahoo,5 in which data on searches using influenza-related keywords were used to generate an epidemic curve that closely matched that generated by traditional surveillance for influenza-related illness, deaths, and laboratory results. Google Flu Trends now provides a prospective view of current influenza search patterns throughout the United States. By making the information freely available to public health officials, clinicians, and ordinary citizens, such tools could help to guide medical decision making and underscore the importance of vaccination and other preventive measures.

An example of the power of search-term surveillance can be found in an examination of the recent peanut-butter–associated outbreak of Salmonella enterica serotype Typhimurium. Using Google Insights for Search, a search-volume reporting tool from Google, we compared the epidemic curve of onset dates for confirmed infections with trends in the volume of Internet searches on related terms in the United States (see graphInfections with the Outbreak Strain of Salmonella Typhimurium, as reported by the CDC as of February 8, 2009.). Search terms included “diarrhea,” “peanut butter,” “food poisoning,” “recall,” and “salmonella,” and search volumes were compared with the corresponding volumes from the previous year. The initial public report of salmonella was released on January 7, 2009, triggering an increase in searches for “salmonella,” “recall,” and “peanut butter,” but we saw earlier peaks in searches for “diarrhea” and “food poisoning.” Admittedly, these data provide only preliminary evidence of an emerging problem and require further study, but they highlight possibilities for early disease detection.

Though mining the Web is a valuable new direction (see Influenza A (H1N1) Virus, 2009 — Online Monitoring), these sources cannot replace the efforts of public health practitioners and clinicians. The Internet is also providing new opportunities for connecting experts who identify and report outbreaks. Information technologies such as wikis, social networks, and Web-based portals can facilitate communication and collaboration to accelerate the dissemination of reports of infectious diseases and aid in mobilizing a response. Some scientific societies are now leveraging technologies for distributed data exchange, analysis, and visualization. For instance, the International Society for Disease Surveillance has created the Distributed Surveillance Taskforce for Real-Time Influenza Burden Tracking and Evaluation (DiSTRIBuTE), a group of state and local health departments that use the Web to share, integrate, and analyze health data across large regions. And the International Society of Travel Medicine, in collaboration with the Centers for Disease Control and Prevention (CDC), has created the GeoSentinel project, which brings together travel and tropical-medicine clinics in an electronic network for surveillance of travel-related illnesses. Similarly, the Emerging Infections Network, administered by the Infectious Diseases Society of America in collaboration with the CDC, is a Web-based network of more than 1000 infectious disease specialists that is geared toward finding cases during outbreaks and detecting new or unusual clinical events.

Broader Web-based networks are also proving useful for surveillance. Social-networking sites for clinicians, patients, and the general public hold potential for harnessing the collective wisdom of the masses for disease detection. Given the continued deployment of personally controlled electronic health records, we expect that patients' contributions to disease surveillance will increase. Eventually, mobile-phone technology, enabled by global positioning systems and coupled with short-message-service messaging (texting) and “microblogging” (with Twitter), might also come into play. For instance, an organization called Innovative Support to Emergencies, Diseases, and Disasters (InSTEDD) has developed open-source technology to permit seamless cross-border communication between mobile devices for early warning and response in resource-constrained settings.

These Internet-based systems are quickly becoming dominant sources of information on emerging diseases, though their effects on public health measures remain uncertain. Information overload, false reports, lack of specificity of signals, and sensitivity to external forces such as media interest may limit the realization of their potential for public health practice and clinical decision making. Sources such as analyses of search-term use and news media may also face difficulties with verification and follow-up. Though they hold promise, these new technologies require careful evaluation. Ultimately, the Internet provides a powerful communications channel, but it is health care professionals and the public who will best determine how to use this channel for surveillance, prevention, and control of emerging diseases.

Dr. Brownstein, Mr. Freifeld, and Dr. Madoff report receiving grant support from Google.org. No other potential conflict of interest relevant to this article was reported.

This article (10.1056/NEJMp0900702) was published at NEJM.org on May 7, 2009.

Source Information

Dr. Brownstein is a faculty member at the Children's Hospital Informatics Program, Children's Hospital Boston, and an assistant professor of pediatrics at Harvard Medical School, Boston. Mr. Freifeld is a research software developer at the Children's Hospital Informatics Program in Boston and a master's candidate in the New Media Medicine Group of the MIT Media Laboratory in Cambridge, MA. Dr. Brownstein and Mr. Freifeld are the cocreators of the HealthMap system. Dr. Madoff is a professor of medicine at the University of Massachusetts Medical School, Worcester, an infectious disease physician with the Massachusetts Department of Public Health, Boston, and editor of ProMED-mail, a program of the International Society for Infectious Diseases.

References

References

  1. 1

    Brownstein JS, Freifeld CC, Reis BY, Mandl KD. Surveillance Sans Frontières: Internet-based emerging infectious disease intelligence and the HealthMap Project. PLoS Med 2008;5:e151-e151
    CrossRef | Web of Science | Medline

  2. 2

    Madoff LC. ProMED-mail: an early warning system for emerging diseases. Clin Infect Dis 2004;39:227-232
    CrossRef | Web of Science | Medline

  3. 3

    Eysenbach G. Infodemiology: tracking flu-related searches on the web for syndromic surveillance. AMIA Annu Symp Proc 2006:244-8.

  4. 4

    Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature 2009;457:1012-1014
    CrossRef | Web of Science | Medline

  5. 5

    Polgreen PM, Chen Y, Pennock DM, Nelson FD. Using Internet searches for influenza surveillance. Clin Infect Dis 2008;47:1443-1448
    CrossRef | Web of Science | Medline

Citing Articles (21)

Citing Articles

  1. 1

    Courtney D. Corley, Mary J. Lancaster, Robert T. Brigantic, James S. Chung, Ronald A. Walters, Ray R. Arthur, Cynthia J. Bruckner-Lea, Augustin Calapristi, Glenn Dowling, David M. Hartley, Shaun Kennedy, Amy Kircher, Sara Klucking, Eva K. Lee, Taylor McKenzie, Noele P. Nelson, Jennifer Olsen, Carmen Pancerella, Teresa N. Quitugua, Jeremy Todd Reed, Carla S. Thomas. (2012) Assessing the Continuum of Event-Based Biosurveillance Through an Operational Lens. Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science120209122534006
    CrossRef

  2. 2

    Ryan W. Newkirk, Jeff B. Bender, Craig W. Hedberg. (2012) The Potential Capability of Social Media as a Component of Food Safety and Food Terrorism Surveillance Systems. Foodborne Pathogens and Disease120104063836006
    CrossRef

  3. 3

    Zvi Shimoni, Joseph Rodrig, Natan Dusseldorp, Mark Niven, Paul Froom. (2012) Increased emergency department chief complaints of fever identified the influenza (H1N1) pandemic before outpatient symptom surveillance. Environmental Health and Preventive Medicine 17:1, 69-72
    CrossRef

  4. 4

    Kamran Khan, Scott JN McNabb, Ziad A Memish, Rose Eckhardt, Wei Hu, David Kossowsky, Jennifer Sears, Julien Arino, Anders Johansson, Maurizio Barbeschi, Brian McCloskey, Bonnie Henry, Martin Cetron, John S Brownstein. (2012) Infectious disease surveillance and modelling across geographic frontiers and scientific specialties. The Lancet Infectious Diseases
    CrossRef

  5. 5

    A. Lyon, M. Nunn, G. Grossel, M. Burgman. (2011) Comparison of Web-Based Biosecurity Intelligence Systems: BioCaster, EpiSPIDER and HealthMap. Transboundary and Emerging Diseasesno-no
    CrossRef

  6. 6

    Sarah Elwood, Michael F. Goodchild, Daniel Z. Sui. (2011) Researching Volunteered Geographic Information: Spatial Data, Geographic Research, and New Social Practice. Annals of the Association of American Geographers110927093212001
    CrossRef

  7. 7

    Chintamani Atreya, Hira Nakhasi, Paul Mied, Jay Epstein, James Hughes, Marta Gwinn, Steven Kleinman, Roger Dodd, Susan Stramer, Mark Walderhaug, Peter Ganz, Raymond Goodrich, Clark Tibbetts, David Asher. (2011) FDA workshop on emerging infectious diseases: evaluating emerging infectious diseases (EIDs) for transfusion safety. Transfusion 51:8, 1855-1871
    CrossRef

  8. 8

    Lawrence C. Madoff, David O. Freedman. 2011. Detection of Infectious Diseases Using Unofficial Sources. , 11-21.
    CrossRef

  9. 9

    S. Briand, A. Mounts, M. Chamberland. (2011) Challenges of global surveillance during an influenza pandemic. Public Health 125:5, 247-256
    CrossRef

  10. 10

    Joseph T. Ripberger. (2011) Capturing Curiosity: Using Internet Search Trends to Measure Public Attentiveness. Policy Studies Journal 39:2, 239-259
    CrossRef

  11. 11

    John W. Ayers, Kurt M. Ribisl, John S. Brownstein. (2011) Tracking the Rise in Popularity of Electronic Nicotine Delivery Systems (Electronic Cigarettes) Using Search Query Surveillance. American Journal of Preventive Medicine 40:4, 448-453
    CrossRef

  12. 12

    Christopher P. Scheitle. (2011) Google's Insights for Search: A Note Evaluating the Use of Search Engine Data in Social Research*. Social Science Quarterly 92:1, 285-295
    CrossRef

  13. 13

    Charles L. Briggs. (2011) Communicating Biosecurity. Medical Anthropology 30:1, 6-29
    CrossRef

  14. 14

    Ronan W Glynn, John C Kelly, Norma Coffey, Karl J Sweeney, Michael J Kerin. (2011) The effect of breast cancer awareness month on internet search activity - a comparison with awareness campaigns for lung and prostate cancer. BMC Cancer 11:1, 442
    CrossRef

  15. 15

    Anette Hulth, Gustaf Rydevik. (2011) GET WELL: an automated surveillance system for gaining new epidemiological knowledge. BMC Public Health 11:1, 252
    CrossRef

  16. 16

    E. H. Chan, T. F. Brewer, L. C. Madoff, M. P. Pollack, A. L. Sonricker, M. Keller, C. C. Freifeld, M. Blench, A. Mawudeku, J. S. Brownstein. (2010) Global capacity for emerging infectious disease detection. Proceedings of the National Academy of Sciences 107:50, 21701-21706
    CrossRef

  17. 17

    Brian K. Lee. (2010) Epidemiologic Research and Web 2.0—the User-driven Web. Epidemiology 21:6, 760-763
    CrossRef

  18. 18

    Sherrilynne Fuller. (2010) Tracking the Global Express. Epidemiology 21:6, 769-771
    CrossRef

  19. 19

    Brownstein, John S., Freifeld, Clark C., Chan, Emily H., Keller, Mikaela, Sonricker, Amy L., Mekaru, Sumiko R., Buckeridge, David L., . (2010) Information Technology and Global Surveillance of Cases of 2009 H1N1 Influenza. New England Journal of Medicine 362:18, 1731-1735
    Full Text

  20. 20

    Toni González-Pacanowski, Pablo Medina-Aguerrebere. (2010) Sistemas de usuario en la gestión online de contenidos biomédicos. El Profesional de la Informacion 19:3, 260-268
    CrossRef

  21. 21

    Michael J. McCarthy. (2010) Internet monitoring of suicide risk in the population. Journal of Affective Disorders 122:3, 277-279
    CrossRef