Clouds of Drugs

I've been doing a bit of primitive database mining for my drug transporter work. I took drugs and their synonyms from DrugBank, and transporter genes and their synonyms from HGNC, searched for them in PubMed, and calculated some simple co-occurrence statistics.
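As a rough illustration of that kind of co-occurrence counting, here is a minimal Python sketch. It assumes the abstracts have already been fetched as plain strings, which is not necessarily how the searches were actually run, and the function names, synonym lists and example abstracts are all made up for illustration.

```python
import re
from collections import Counter
from itertools import product


def find_entities(text, synonyms):
    """Return canonical names with at least one synonym present as a whole word."""
    lowered = text.lower()
    found = set()
    for canonical, names in synonyms.items():
        for name in names:
            if re.search(r"\b" + re.escape(name.lower()) + r"\b", lowered):
                found.add(canonical)
                break  # one matching synonym is enough for this entity
    return found


def cooccurrence_counts(abstracts, drug_synonyms, gene_synonyms):
    """Count, per abstract, which (gene, drug) pairs appear together."""
    pair_counts = Counter()   # (gene, drug) -> number of abstracts mentioning both
    drug_counts = Counter()   # drug -> number of abstracts mentioning it at all
    for text in abstracts:
        drugs = find_entities(text, drug_synonyms)
        genes = find_entities(text, gene_synonyms)
        drug_counts.update(drugs)
        pair_counts.update(product(genes, drugs))
    return pair_counts, drug_counts


# Toy data, invented purely for illustration.
abstracts = [
    "Valaciclovir uptake via PEPT1 was measured.",
    "SLC15A1 transports valacyclovir and cidofovir prodrugs.",
    "Cidofovir nephrotoxicity in patients.",
]
drug_synonyms = {
    "Valaciclovir": ["valaciclovir", "valacyclovir"],
    "Cidofovir": ["cidofovir"],
}
gene_synonyms = {"SLC15A1": ["SLC15A1", "PEPT1"]}

pair_counts, drug_counts = cooccurrence_counts(abstracts, drug_synonyms, gene_synonyms)
for (gene, drug), n in sorted(pair_counts.items()):
    print(gene, drug, n)
```

Matching on whole words, case-insensitively, is about the simplest thing that could work, and it is exactly the sort of matching that produces the false positives you have to watch for with short or ambiguous synonyms.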

Wordle tag cloud for drugs that co-occur with SLC15A1 (peptide transporter)

I'll be doing other stuff with the data, but while it was there it was simple enough to throw it into Wordle to get a nice overview of the drugs that occur most frequently in the literature associated with a given transporter. Above is the example of SLC15A1 (a peptide transporter). Valaciclovir is large because it is mentioned in many SLC15A1 papers, whereas Cidofovir is mentioned in fewer. The counts are also normalised against a background of how many PubMed hits each drug has on its own. A drug that crops up in thousands of papers therefore carries less weight than one mentioned in only dozens, because the rarer drug is less likely to co-occur with the transporter by chance alone.
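To make the normalisation concrete, here is a small Python sketch that divides each drug's co-occurrence count by its background hit count. A plain ratio is an assumption on my part (the exact formula could differ), and the function name and all the numbers are invented for illustration.

```python
def normalised_weights(pair_hits, background_hits):
    """Weight each drug by co-occurrence hits divided by its own background hits.

    Drugs with no recorded background count are skipped rather than guessed at.
    """
    weights = {}
    for drug, hits in pair_hits.items():
        background = background_hits.get(drug, 0)
        if background > 0:
            weights[drug] = hits / background
    return weights


# Invented numbers: papers mentioning the drug together with one transporter...
pair_hits = {"Valaciclovir": 40, "Cidofovir": 10, "Glycine": 40}
# ...and papers mentioning the drug at all.
background_hits = {"Valaciclovir": 400, "Cidofovir": 2000, "Glycine": 40000}

weights = normalised_weights(pair_hits, background_hits)
for drug, weight in sorted(weights.items(), key=lambda item: item[1], reverse=True):
    print(f"{drug}: {weight:.4f}")
```

In this toy example Valaciclovir and Glycine co-occur with the transporter equally often, but Glycine ends up with a far smaller weight because it appears in so many papers overall.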

Note that you have to be careful about interpreting the numbers. Sometimes a drug is mentioned a lot because it is studied a lot, not because it is a particularly important transporter substrate. Sometimes DrugBank includes things that aren't what everyone would call a drug. Sometimes my synonym lists are rubbish. Sometimes I mangled the text. Sometimes a match occurs for some other reason (particularly amino acids, which might pop up because they are being discussed in the context of the protein, for example).

The buttons below take you to your own Wordle tag clouds for each transporter based upon my data, labelled "gene (number of drugs)". Note that I cut off any transporter with fewer than ten drugs because they make for boring word clouds. Also note that Wordle truncates to 150 words, but you can change this via Layout > Maximum Words.
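The cut-off and labelling could be sketched in Python roughly as follows. The `word:weight` line format is an assumption about what a weighted word-cloud input looks like, the function and gene names are hypothetical, and the 150-word cap simply mirrors Wordle's default rather than anything done to the data itself.

```python
def wordle_inputs(weights_by_gene, min_drugs=10, max_words=150):
    """Build one weighted word list per transporter, skipping sparse ones."""
    clouds = {}
    for gene, weights in weights_by_gene.items():
        if len(weights) < min_drugs:
            continue  # too few drugs to make an interesting cloud
        ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
        lines = [f"{drug}:{weight:.4f}" for drug, weight in ranked[:max_words]]
        # Label in the same "gene (number of drugs)" style as the buttons.
        clouds[f"{gene} ({len(weights)})"] = "\n".join(lines)
    return clouds


# Placeholder data: GENE_A has twelve drugs, GENE_B only one.
weights_by_gene = {
    "GENE_A": {f"drug{i}": float(i + 1) for i in range(12)},
    "GENE_B": {"drug0": 1.0},
}

clouds = wordle_inputs(weights_by_gene)
for label, text in clouds.items():
    print(label)
    print(text)
```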
