Sunday, August 9, 2009

Pharmaceutical Data and Privacy

Today's New York Times has another misguided article on privacy in the medical world. This article seems to be designed to scare Americans into believing that health care privacy is endangered, and that such data is regularly and wantonly traded among companies.

My perspective is different, since I am coming from the side of analyzing data.

First, the pharmaceutical industry is different from virtually every other industry in the United States. For the most part, it is illegal for pharmaceutical manufacturers to identify the users of their products. This restriction originated with the Health Insurance Portability and Accountability Act (HIPAA), explained in more detail at this government site.

What is absurd about this situation is that pharmaceutical companies are, in theory, responsible for the health of the millions of people who use their products. To give an example of the dangers, imagine that you have a popular product that causes cardiac damage after several months of use. The cardiac damage, in turn, is sometimes fatal. How does the manufacturer connect the use of the product to death registries? The simple answer: they cannot.

This is not a made-up example. Millions of people used Cox-2 inhibitors such as Vioxx, which was on the market until 2004, when Merck voluntarily withdrew it. The issue here is whether the industry could have known earlier that such dangers lurked in the use of the drug. My contention is that the manufacturers did not have a chance, because they could not do something that virtually every other company can do -- match their customer records to publicly available mortality records.
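To make that kind of match concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `person_key` identifier, the sample records, and the `match_deaths` helper are invented for illustration, and real record linkage (on names, birth dates, and addresses) is far messier than an exact key lookup.

```python
from datetime import date

# Hypothetical customer records: (person_key, first_fill_date, last_fill_date)
customers = [
    ("A17", date(2003, 1, 10), date(2003, 9, 30)),
    ("B42", date(2003, 2, 1), date(2003, 3, 15)),
    ("C08", date(2003, 5, 20), date(2004, 1, 5)),
]

# Hypothetical public mortality records: person_key -> date of death
deaths = {"A17": date(2004, 2, 14), "Z99": date(2003, 7, 1)}


def match_deaths(customers, deaths):
    """Return (person_key, days from last fill to death) for matched customers."""
    matched = []
    for key, first_fill, last_fill in customers:
        died = deaths.get(key)
        # Keep only deaths that occurred after the customer started therapy.
        if died is not None and died >= first_fill:
            matched.append((key, (died - last_fill).days))
    return matched


matches = match_deaths(customers, deaths)
for key, days in matches:
    print(f"{key}: died {days} days after last fill")
```

In this toy data, a death that occurs months after the last fill still shows up in the match, which is the sort of signal a manufacturer has no way to see today.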

To be clear about the laws related to drugs: if someone has an adverse reaction while on a drug, then that must be reported to the pharmaceutical company. However, if the adverse reaction is detected a certain amount of time after the patient stops therapy (I believe two weeks), then there is no reporting requirement. Guess what? Cardiac damage caused by Cox-2 inhibitors does not necessarily kill patients right away. Nor is the damage necessarily detected while the patient is still on the therapy.

I have used de-identified records at pharmaceutical clients for various analyses that have ranged from the amusing (anniversary effects in the scripts for ED therapies) to the socially useful (do poor patients adhere less because of copayments?) to the actionable (what messages to give to prescribers). In all cases, we have had to do more work than necessary because of the de-identification requirements, and to make assumptions and work-arounds that may have hurt the analyses. And, contrary to what the New York Times article may lead you to believe, both IMS and Verispan take privacy very seriously. Were I inclined to try to identify particular records, it would be virtually impossible.

Every time a drug is used, there is perhaps an opportunity to learn about its effectiveness and interactions with other therapies. In many cases, these are questions that scientists do not even know to ask, and such exploratory data mining can be critical in establishing hypotheses. Questions such as:
  • Are the therapies equally effective, regardless of gender, age, race, and geography?
  • Do demographics affect adherence?
  • What interactions does a given therapy have with other therapies?
  • Does the use of a particular therapy have an effect on mortality?
Every time patients purchase scripts and the data is shielded from the manufacturers, opportunities to better understand and improve health care outcomes are lost. Even worse, as the New York Times article points out, HIPAA does not protect consumers from the actions of nefarious employees and adroit criminals. There has to be a better way.


  1. Dear Linoff.

    I'm a BI developer in South Korea.

    I have read your book, and I found what I was born to do. Yes, that is data mining.

    But even though I have a Computer Science background and am studying Calculus 1, 2, 3 and Probability Theory, I feel it's not enough to be a good miner. Please let me know what I should study or prepare.
    Any recommendations or suggestions would be appreciated. Thanks!

  2. To Bruce: I think there are two steps: 1) get a solid foundation in statistics, especially regression (that's my own bias showing!) and 2) do it.

    One of the things I really like about data mining as a field is that the field is by no means set in stone. Start doing it, and always be asking yourself what works, what doesn't work, and why.

    To Gordon: It's about marketing. From what I've read, people *really don't* want drug companies to be able to aggressively market to individual patients or doctors based on data.

    We need a way of managing health data while also managing the uses the data is put to. After all, we're basically conducting a large-scale, long-term experiment on drug interactions (among other things), so we might as well gather the data.

  3. Ed: I totally agree with your 2 step approach (get a solid foundation in statistics, especially regression and then do it).

    I once asked a very well paid consultant how he managed to gain enough experience to bring in as much work as he does, given what he charges. He said that on top of his hard-earnt top grades from a university education in maths, economics, and econometrics, he was lucky enough to be offered a problem to solve.

    Without that problem to solve, all he had was some bits of paper saying that in theory he *should* be capable of solving something.

    What I took from that was to not wait until a suitable problem falls into your lap courtesy of a paid job. Instead, look for problems everywhere you can, and see if you can solve them on your own time. Sooner or later the money and respect - and more importantly, wisdom - will follow.

  4. Finally, there's a clear need for regulators and advocates for evidence-based medicine, patient care, and health care privacy to be as innovative as industry is wealthy to fight back against the overwhelmingly negative patient health effects of prescriber-identifiable prescription tracking. This a great lens will credit this and save.

  5. This blog post focuses on one of the aspects of data mining that interests me the most, ethics. The example of whether data should be collected in order to improve medical health is an interesting one. I strongly believe that pharmaceutical companies should be able to collect relevant data on their drugs in a manner that isn’t hindering them from improving their products; however, I also see how that kind of information can be abused. I think it comes down to how much people care.

    For example, I’ve resigned myself to the fact that, thanks to social networks such as Facebook, privacy is fast becoming a thing of the past. A lot of information about me is available to companies willing to pay for it. There are almost a billion members on Facebook, but how many are aware that information about them is being sold to companies? Do people not care because they are ignorant? Or do they not care because the information they are sharing is what they want to share?

    It is interesting just how much people are willing to share about their lives, even with anonymity stripped away. In the case of Facebook, there are no apparent adverse effects to sharing content about oneself. While there have been stories about people being fired for something they wrote, it was the customer’s choice to share the information, so we don’t throw up our arms in outrage. It is when companies use our data in ways we didn’t expect, for things we didn’t explicitly share, that people are enraged (and rightly so).

    For example, if pharmaceutical companies were able to access more information about their consumers, their abuse of the data could reflect negatively on the consumer. So, in order for there to be any progress, the data that companies have access to should be data that can be used to benefit the customer without any apparent negative effects. Although even then I’m sure we’d spark debate on what people perceive as negative effects.

