Feature Article: Ethical AOD Research

Confidentiality Protections in Research Involving People with Substance Use Disorders

Sylvia Baedorf Kassis, MPH

Clinical Research Manager, Center for Human Genetic Research and Division of Neurocritical Care and Emergency Neurology, Massachusetts General Hospital, Boston, MA, USA

As described in previous issues of this newsletter, there are numerous ethical considerations surrounding the conduct of research involving people with substance use disorders.1,2,3 Weighing the potential risks of the studies’ interventions, as well as determining capacity and obtaining informed consent, are just a few of the ethical issues. Protecting the confidentiality of research subjects and their data is another imperative.

Under the Code of Federal Regulations, 45 CFR 46.111 – Criteria for IRB Approval of Research, the seventh criterion requires “adequate provisions to protect the privacy of subjects and to maintain the confidentiality of data” in human subjects research.4 All researchers must therefore afford their subjects confidentiality protections. However, given the inherent vulnerability of the individuals most likely to be recruited into research on substance use disorders, and the sensitive information likely to be collected, additional attention must be paid to protecting this kind of research data from breaches and from subpoena for use in legal proceedings.

According to the National Institutes of Health (NIH), sensitive information “includes (but is not limited to) information relating to sexual attitudes, preferences, or practices; information relating to the use of alcohol, drugs, or other addictive products; information pertaining to illegal conduct; information that, if released, might be damaging to an individual's financial standing, employability, or reputation within the community or might lead to social stigmatization or discrimination; information pertaining to an individual's psychological well-being or mental health; and genetic information or tissue samples.”5

While it is important to acknowledge that 42 CFR Part 2 – Confidentiality of Alcohol and Drug Abuse Patient Records6 is a regulation that provides some guidance on the protection of clinically derived information, it does not speak to the more conservative provisions that are necessary to safeguard sensitive research data. The bar is set much higher in the conduct of research studies because, in general, the information being collected is not primarily for the care and treatment of the individual patient, but rather for the purpose of answering a research question and contributing to generalizable knowledge. This article discusses the features that are particularly salient to ensuring the protection of research subjects who participate in studies on substance use disorders.

Anonymous Versus Identifiable Research Data

Research studies frequently involve the collection of sensitive information beyond that which would usually be collected in a clinical context and included in a medical record. The onus is on researchers to safeguard this data to the greatest extent possible, especially if it is in any way identifiable. A complete understanding of the difference between anonymous and identifiable research data is essential to devising the most appropriate plan to protect research subjects’ confidentiality.

Identifiable research data contains identifying characteristics or a code that links to identifying characteristics, even when that code is stored separately. According to the NIH, identifying characteristics include a subject’s “name, address, social security or other identifying number, fingerprints, voiceprints, photographs, genetic information or tissue samples, or any other item or combination of data about a research participant which could reasonably lead, directly or indirectly by reference to other information, to identification of that research subject.”7 In contrast, anonymous means that there are no identifying characteristics and there exists no link to any identifying characteristics. Simply put, if a researcher is able to link research data to an individual subject’s identity, that data is identifiable. Thus, coded data linked to a master list that includes identifying characteristics is not considered anonymous and requires special confidentiality protections.

Strategies to Improve Protection

  • Conduct anonymous research studies

Whenever possible, researchers should avoid collecting direct identifiers such as name, medical record number (MRN), social security number (SSN), and date of birth. Collecting anonymous research data is the best way to protect research subjects’ confidentiality, although it is likely feasible only in studies involving a single study visit with no follow-up activities, whether additional study visits or repeated reviews of the subjects’ medical records. Most studies that require more than one participant contact cannot be done anonymously, but some can by using a participant-generated code or password that links the data collected at each contact without identifying the participant. The downside is that if the participant forgets the code or cannot easily regenerate it, the ability to link the data is lost.
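As a rough illustration of a participant-generated code, the sketch below builds a short code from answers the participant can reproduce at each contact. The three prompts (mother's first initial, number of older siblings, first letter of birth city), the function names, and the sample answers are hypothetical choices made for this example, not a recommended or validated scheme. Two participants could produce the same code, and in a small population the answers themselves might narrow down who someone is.

```python
def participant_code(mother_initial: str, older_siblings: int, birth_city_initial: str) -> str:
    """Build a code the participant can regenerate at each visit (hypothetical prompts)."""
    parts = [mother_initial, str(older_siblings), birth_city_initial]
    # Normalize case and whitespace so small differences in how the
    # answers are written down do not break the link between visits.
    return "".join(p.strip().upper() for p in parts)


# Responses are keyed by the code only; no name, MRN, SSN, or date of birth is stored.
responses: dict[str, list[dict]] = {}


def record_visit(code: str, answers: dict) -> None:
    """Append one visit's answers under the participant's code."""
    responses.setdefault(code, []).append(answers)


# Made-up data: the same participant regenerates the code at a second visit.
record_visit(participant_code("m", 2, "b"), {"visit": 1, "drinks_per_week": 14})
record_visit(participant_code(" M", 2, "B "), {"visit": 2, "drinks_per_week": 9})
```

If the participant answers a prompt differently at a later visit, the two records will not link, which is the downside noted above.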

Another option to consider when conducting research of a more ethnographic/qualitative nature is to ask subjects to assign themselves a pseudonym, which ultimately also renders their data anonymous.8 This is most appropriate in research where follow-up occurs but personally identifying characteristics (like name or date of birth) and linking to other records (via MRN or SSN) are not required to answer the research question. In these cases, the researchers never learn the identities of their subjects even when working with them directly and can more easily safeguard their confidentiality.

  • Destroy the link to identifying characteristics as soon as possible

When some identifiers are required, for example, to link research subjects’ survey responses to information in the medical record, one safeguard is to conduct the study in such a way that the data is anonymized as soon after collection as possible. This means that there would be no link to identifying characteristics and, therefore, no way to collect additional data. When considering this option, researchers often express concerns about being unable to confirm or correct data points and/or add new data to their analyses if an error is found or additional hypotheses emerge. While these are legitimate questions, paying careful attention to the development of complete data collection tools and implementing quality control safeguards during data entry can minimize these concerns. In deciding whether or not, and when, to anonymize their data, researchers should take into account the risk-benefit ratio of the study, thereby balancing the need for confidentiality protections with the study’s scientific and analytic needs and the value of the research’s contribution to generalizable knowledge.
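A minimal sketch of what destroying the link can mean for electronic data, assuming the study holds coded rows keyed by a study ID plus a separate master key that maps study IDs to identifiers. The function name, field names, and sample values are hypothetical. The sketch works only on in-memory structures; in practice the master list may also exist in files, backups, and on paper, and all of those copies would need to be destroyed.

```python
import secrets


def anonymize(coded_rows, master_key):
    """Destroy the master key and replace study IDs with fresh, unlinked labels."""
    # Remove the only mapping from study IDs to identifying characteristics.
    master_key.clear()
    anonymous_rows = []
    for row in coded_rows:
        anon = {k: v for k, v in row.items() if k != "study_id"}
        # A new random label keeps rows distinguishable, but it cannot be
        # matched against any surviving copy of the old study IDs.
        anon["row_label"] = secrets.token_hex(4)
        anonymous_rows.append(anon)
    return anonymous_rows


# Made-up example data.
master_key = {"S001": {"name": "Example Person A", "mrn": "0000001"}}
coded_rows = [{"study_id": "S001", "survey_score": 12, "chart_diagnosis": "example"}]

anonymous_rows = anonymize(coded_rows, master_key)
```

After this step no additional record data can be linked to a row, so any quality checks against the medical record would need to happen first.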

  • Take precautions to prevent breaches in confidentiality

In many cases it is simply not feasible to conduct a research study without collecting any identifiers about research subjects. When the gathering of identifying characteristics is absolutely necessary, researchers should consider the following:

      • Using passwords to protect all electronic data and securing paper records in locked cabinets and offices.

      • Limiting the number of individuals who have access to identifying characteristics and master codes, whether electronic or paper records.

      • Storing any master code files that contain identifiers separately from study data.

      • Ensuring paper research records and documents (e.g., surveys, data collection forms, etc.) are coded, do not contain any identifying characteristics, and are stored separately from any master code files that contain identifiers.

      • Never traveling between study sites or to study visits with identifiable information and the research records stored within the same folder or bag.

      • If sending letters or postcards to research subjects, avoiding making any details about the study visible to others at the recipient’s address. If initiating contact via phone, taking care not to reveal details of study participation if the subject is unavailable or a voice message has to be left.
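Two of the precautions above, coding the research records and keeping the master code file separate from the study data, can be sketched as follows. This is an illustrative outline under stated assumptions: the function name `split_identifiers`, the field names, and the sample records are invented for the example, and the two returned structures stand in for what would be separately stored, access-restricted files.

```python
import secrets


def split_identifiers(records, identifier_fields=("name", "mrn")):
    """Separate identifying fields from study data, joined only by a random study ID."""
    master_key = {}  # study_id -> identifiers; to be stored separately with limited access
    coded_rows = []  # study data carrying only the study ID
    for record in records:
        study_id = secrets.token_hex(4)
        while study_id in master_key:  # regenerate in the unlikely event of a repeat
            study_id = secrets.token_hex(4)
        master_key[study_id] = {f: record[f] for f in identifier_fields}
        coded = {k: v for k, v in record.items() if k not in identifier_fields}
        coded["study_id"] = study_id
        coded_rows.append(coded)
    return coded_rows, master_key


# Made-up example records.
records = [
    {"name": "Example Person A", "mrn": "0000001", "survey_score": 12},
    {"name": "Example Person B", "mrn": "0000002", "survey_score": 5},
]
coded_rows, master_key = split_identifiers(records)
```

Because the study IDs are random rather than derived from a name or MRN, the coded rows reveal nothing about identity on their own. They remain identifiable data, however, for as long as the master key exists.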

  • Protect research data from subpoena

Since research on substance use disorders often includes the collection of information beyond what would normally be recorded in a clinical context, extra protection of identifiable research information from forced disclosure is recommended through an NIH-issued Certificate of Confidentiality (COC).9 All types of research studies that collect identifiable research data on sensitive matters are eligible to apply, regardless of funding source or status. Retroactive to the start of the study, a COC permits anyone on the research team who has access to research records to refuse to disclose identifying information on research participants in any civil, criminal, administrative, legislative, or other proceeding, whether at the federal, state, or local level. By protecting researchers and institutions from being compelled to disclose information that would identify research subjects, COCs help achieve the research objectives and promote participation in studies by helping to assure confidentiality and privacy to participants.


The protection of subject confidentiality is essential in all research studies, but is particularly important when enrolling people into research on substance use disorders. Understanding the difference between anonymous and identifiable data is necessary for researchers to implement the most appropriate plan to protect their subjects’ research data. Some of the best practices in protecting the confidentiality of sensitive data have been elucidated above and include collecting anonymous research data or anonymizing it as soon as possible, instituting precautions to prevent breaches of confidentiality, and obtaining a Certificate of Confidentiality to prevent subpoena of subjects’ personal information. A thoughtful and well-executed plan of protection is an ethical imperative.

