Black Patients Miss Out On Promising Cancer Drugs

September 25, 2018

Wrapped up my summer fellowship at ProPublica last week when our investigative piece was published! Give it a read here:

Black Patients Miss Out On Promising Cancer Drugs

A ProPublica analysis found that Black people and Native Americans are under-represented in clinical trials of new drugs, even when the treatment is aimed at a type of cancer that disproportionately affects them.

The accompanying data methodology is here: How We Compared Clinical Trial and Cancer Incidence Data

This story was co-published with STAT and can also be found on Mother Jones.

For this story, I pitched the idea and did a ton of research, data analysis, reporting, and interviews, plus all the data visualization. A huge thank you to my wonderful co-author Caroline Chen and amazing editor Sisi Wei!

The story was on the front page the day it was published and seemed to be well received. I learned so much from this fellowship and am super grateful to ProPublica and the Google News Lab for the opportunity.

Update—Statement of impact since our story was published:

Our story was featured on Information is the Best Medicine, a Black-owned talk radio station in Pennsylvania, as well as Axios, Vice, Mother Jones and The Atlantic’s People v. Cancer forum. It was reprinted in the Boston Globe and Indianz, a Native American publication. Nonprofit BIO Ventures for Global Health also wrote an op-ed in response to our story, noting that “clinical trials are perpetuating existing health care disparities across the globe.”

In the course of interviewing patients for the story, we realized that many people don’t understand how trials work, which prompted us to create the Cancer Patient’s Guide to Clinical Trials. The guide has been shared by the Leukemia and Lymphoma Society.

Update 2—

For this piece, my co-author and I were awarded the American Association for Cancer Research (AACR) June L. Biedler Prize for Cancer Journalism, as well as the Society of American Business Editors and Writers (SABEW) Best in Business Honorable Mention in the Health/Science category.

Predicting Readmission Risk after Orthopedic Surgery

May 23, 2018

My colleagues and I from the Clinical Research Informatics Core at Penn Medicine gave poster presentations at the Public Health session of the Symposium on Data Science and Statistics last week.

Here's the abstract:

Our project examined hospital readmissions after knee and hip replacement surgeries that took place within the University of Pennsylvania health system. We used a variety of information available within patient electronic health records and an assortment of machine learning tools to predict the risk of readmission for any given patient at the time of discharge after a primary joint replacement surgery. We faced challenges related to missing data. We used a number of different machine learning models, such as logistic regression, random forests, and gradient boosted trees. We also used an automated machine learning pipeline tool, TPOT, which uses a genetic algorithm to search the space of models and parameters and automatically suggest successful machine learning pipelines. We trained multiple models that predicted readmissions better than the existing clinical methods, with statistically significant increases in AUC over the clinical baseline. Finally, our models suggested a number of features useful for readmission prediction that are not used at all in the existing clinician model. We hope our new models can be used in practice to help target patients at high risk of readmission after joint replacement surgery, and to help inform which interventions may be most useful.
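For readers unfamiliar with AUC, here is a minimal sketch of how a model's risk scores can be compared against a baseline's. It is not our actual pipeline: the patient labels and both sets of scores are made-up toy values, and the names `auc`, `baseline_scores`, and `model_scores` are hypothetical. AUC is computed here as the fraction of (readmitted, not readmitted) patient pairs in which the readmitted patient got the higher risk score, with ties counting half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen
    negative case (label 0); ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the study): 1 = readmitted, 0 = not readmitted.
readmitted = [1, 0, 0, 1, 0, 1, 0, 0]
# Hypothetical risk scores from a clinical baseline and from a trained model.
baseline_scores = [0.6, 0.5, 0.2, 0.4, 0.7, 0.8, 0.3, 0.1]
model_scores = [0.7, 0.4, 0.2, 0.6, 0.65, 0.9, 0.3, 0.1]

print("baseline AUC:", round(auc(readmitted, baseline_scores), 3))
print("model AUC:   ", round(auc(readmitted, model_scores), 3))
```

A higher AUC on a handful of patients does not by itself show a real improvement. Claiming statistical significance, as the abstract does, requires a separate test on held-out data, which this sketch leaves out.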

Machine Learning for Healthcare

May 3, 2018

Yesterday I gave a dev talk at Philly Tech Week on machine learning for healthcare, slides embedded below.

Description: "How are machine learning and data science being adopted in healthcare? From diagnostics, risk predictions, and more, this session will provide an overview of machine learning applications using electronic health records, walk through the process of how a model might be trained and used, and discuss methods for improving interpretability to augment medical decision-making."

Here's a link to the talk slides with notes included.

I think the talk went pretty well. In fact, I think I am actually a pretty good speaker, although I'm not sure how much I get out of speaking personally. The talk was pretty well attended, and I received a lot of positive feedback, so hopefully I inspired some people in healthcare or machine learning in some way or another.

Music and Mood: Assessing the Predictive Value of Audio Features on Lyrical Sentiment

January 3, 2018

aka - what's the relationship between the audio features of a song and how positive or negative its lyrics are?

aka - data analysis of my Spotify music data + sentiment analysis + supervised machine learning

aka - my senior thesis

the full Jupyter notebook used to conduct this data analysis can be found on my GitHub here: Spotify Data Analysis

(pg. 32 and onward is just the full python jupyter notebook in the appendix.)
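to give a flavor of the question without opening the notebook, here is a tiny sketch of the general idea, not the method from the thesis. It scores lyrics with a made-up word list and checks how that score correlates with one audio feature. The word lists, the four example songs, their valence numbers, and the function names are all hypothetical stand-ins. The actual analysis used real Spotify data and supervised models.

```python
import math

# Hypothetical mini sentiment lexicon (an assumption for illustration only).
POSITIVE = {"love", "happy", "sunshine", "dance"}
NEGATIVE = {"cry", "lonely", "pain", "goodbye"}


def lyric_sentiment(lyrics):
    """Score lyrics from -1 (all negative words) to +1 (all positive words)."""
    words = lyrics.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up songs: (lyrics snippet, audio "valence" feature between 0 and 1).
songs = [
    ("love and sunshine make me dance", 0.9),
    ("lonely nights and pain", 0.2),
    ("happy to cry goodbye", 0.4),
    ("we dance we love", 0.8),
]

sentiments = [lyric_sentiment(text) for text, _ in songs]
valences = [valence for _, valence in songs]
r = pearson(sentiments, valences)
print("sentiment scores:", sentiments)
print("correlation with valence:", round(r, 3))
```

a correlation like this only describes a linear relationship for one feature at a time. Predicting sentiment from several audio features together is what calls for the supervised learning part.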