Correspondence

No Place to Hide — Reverse Identification of Patients from Published Maps

N Engl J Med 2006; 355:1741-1742, October 19, 2006

To the Editor:

The mapping of health data is now widespread in both academic research and public health practice.1 Although the notion that location influences the risk of disease dates back to the mapping of yellow fever and cholera in the 1800s, research that integrates maps with human health is an emerging field based on the widespread availability of geographic information system (GIS) software.2 Such systems have broad applicability, and their use has been fueled by the availability of increased computing power, user-friendly software, and large geographic databases. The number of publications that use GIS data for health research has grown by about 26% per year, four times the rate of increase in the number of articles on human health in general.2 Patients' addresses are mapped to identify patterns, correlates, and predictors of disease. These maps are then published electronically and in print.1

Using keyword searches for the terms “geographic” and “map” in the figure legends of articles in five major medical journals published between 1994 and 2005, we identified 19 articles (including 5 in the Journal) that included maps with the addresses of patients plotted as individual dots or symbols. In these articles, more than 19,000 such addresses were plotted on maps.

Given the potential implications for the privacy of patients, we investigated whether we could use these published maps to reidentify the patients. We created a simulated map of 550 geographically coded addresses of patients in Boston, using the minimum figure resolution required for publication in the Journal (Figure 1A). We then used standard GIS techniques to determine the accuracy with which such addresses can be identified.3 Strikingly, the reverse-identification method precisely identified 432 of the addresses (79%) and identified all 550 addresses within 14 m of the correct address (Figure 1B).

Figure 1. Reverse Identification of Patients from a Simulated Health-Data Map of Boston.
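As an illustration only, accuracy figures of this kind (the share of addresses recovered exactly and the largest positional error) could be tallied along the following lines. This is a minimal sketch and not the procedure used in the letter: the function names, the haversine distance, and the 1 m tolerance for counting a match as "exact" are all assumptions introduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def reidentification_summary(true_points, recovered_points, exact_tolerance_m=1.0):
    """Compare recovered (lat, lon) pairs with the true ones, pair by pair.

    exact_tolerance_m is a hypothetical threshold for calling a match exact;
    the letter does not state how "precisely identified" was defined.
    """
    if not true_points or len(true_points) != len(recovered_points):
        raise ValueError("need two non-empty point lists of equal length")
    errors = [
        haversine_m(t[0], t[1], r[0], r[1])
        for t, r in zip(true_points, recovered_points)
    ]
    exact = sum(1 for e in errors if e <= exact_tolerance_m)
    return {
        "n": len(errors),
        "exact": exact,
        "exact_fraction": exact / len(errors),
        "max_error_m": max(errors),
    }
```

For the numbers reported above, 432 exact matches out of 550 addresses is a fraction of about 0.785, which rounds to the 79% quoted, and the 14 m bound corresponds to the maximum error over all 550 points.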

The publication of maps of disease with precise locations of patients jeopardizes patients' privacy. Guidelines for the display or publication of health data are needed to guarantee patients' anonymity.4 A common approach has been to map according to administrative unit rather than home address. However, the aggregation of data in this manner places constraints on the visualization of disease patterns. Another method is spatial skewing, or randomly relocating patients' addresses within a given distance of their true location. Skewing can allow a visualization that conveys the necessary information while preserving patients' privacy.5 Both aggregation and skewing are systematic and reliable means of de-identification that are far safer, in terms of protecting identifiable health information, than simply reducing the resolution of a map. Editors of journals and textbooks should consider implementing such policies to guide the safe reporting of spatial data.
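The two de-identification approaches described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the context-sensitive method of reference 5: `skew_point` moves each address to a uniformly random location within a fixed radius, `aggregate_by_unit` simply counts cases per administrative unit, and both names, along with the meters-per-degree constant, are hypothetical choices made for this example.

```python
import math
import random
from collections import Counter

# Rough length of one degree of latitude in meters; an assumption of this sketch.
METERS_PER_DEGREE_LAT = 111_320.0


def skew_point(lat, lon, max_distance_m, rng=random):
    """Relocate a point to a random position within max_distance_m of it."""
    # Taking the square root makes displaced points uniform over the disc's
    # area instead of clustering near the true location.
    r = max_distance_m * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    dx, dy = r * math.cos(theta), r * math.sin(theta)
    # Convert the meter offsets to degrees with a local flat-earth approximation.
    new_lat = lat + dy / METERS_PER_DEGREE_LAT
    new_lon = lon + dx / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return new_lat, new_lon


def aggregate_by_unit(unit_labels):
    """Count cases per administrative unit (for example, a ZIP code per patient)."""
    return Counter(unit_labels)
```

A fixed radius is the simplest choice but protects patients unevenly: in a sparsely populated area, a small displacement may still leave only a handful of candidate households. A radius that varies with local population density is one way to address that, which is the general idea behind context-sensitive skewing.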

John S. Brownstein, Ph.D.
Children's Hospital, Boston, MA 02115

Christopher A. Cassa, M.Eng.
Harvard–MIT Division of Health Sciences and Technology, Boston, MA 02139

Kenneth D. Mandl, M.D., M.P.H.
Harvard Medical School, Boston, MA 02115

References

  1. Croner CM, Sperling J, Broome FR. Geographic information systems (GIS): new perspectives in understanding human health and environmental relationships. Stat Med 1996;15:1961-1977.

  2. Pickle LW, Waller LA, Lawson AB. Current practices in cancer spatial data analysis: a call for guidance. Int J Health Geogr 2005;4:3.

  3. Brownstein JS, Cassa CA, Kohane IS, Mandl KD. Reverse geocoding: concerns about patient confidentiality in the display of geospatial health data. AMIA Annu Symp Proc 2005:905.

  4. Rushton G, Armstrong MP, Gittler J, et al. Geocoding in cancer research: a review. Am J Prev Med 2006;30:Suppl:S16-S24.

  5. Cassa CA, Grannis SJ, Overhage JM, Mandl KD. A context-sensitive approach to anonymizing spatial surveillance data: impact on outbreak detection. J Am Med Inform Assoc 2006;13:160-165.
