A Data Collection, Analysis, and Visualization Project Proposal
This project is proposed by Robert Nicholson, owner of the ED Treatment Information Center website (EDtreatment.info).
The goal of this project is to analyze Twitter “tweets” to understand attitudes towards Erectile Dysfunction (ED).
The project is loosely defined at this stage, as described below. I am looking for students who would be interested in undertaking this project during the summer of 2022. The student(s) will be encouraged to suggest directions for research and analysis.
The project will analyze tweets relating to erectile dysfunction, as well as several other medical conditions that may affect older people, to develop an understanding of how attitudes and emotional responses differ by condition. The conditions to be included are:
- Erectile Dysfunction
- Chronic Obstructive Pulmonary Disease: COPD
- Colon Cancer
- Heart Disease
The study should be based on English-language tweets collected, using the Twitter API, over a period of at least 4 weeks, and should include a minimum of 5,000 tweets for each condition included in the study.
See the following tutorial:
Data will be collected for each of the medical conditions listed above, including Erectile Dysfunction.
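As a starting point, the collection requirements could be captured in a small configuration like the sketch below. The search terms are hypothetical placeholders (this proposal does not specify them), and the helper names are illustrative only. The `lang:en` operator restricts a Twitter API v2 search to English-language tweets. Retweets are deliberately not filtered at this stage, so that totals can be reported both before and after cleaning.

```python
# Hypothetical search terms per condition; the proposal does not prescribe them.
CONDITION_TERMS = {
    "Erectile Dysfunction": ['"erectile dysfunction"'],
    "COPD": ["COPD", '"chronic obstructive pulmonary disease"'],
    "Colon Cancer": ['"colon cancer"'],
    "Heart Disease": ['"heart disease"'],
}

MIN_TWEETS_PER_CONDITION = 5000  # minimum stated in this proposal


def build_query(terms):
    """Join search terms with OR and restrict results to English-language tweets."""
    joined = " OR ".join(terms)
    return f"({joined}) lang:en"


def conditions_below_minimum(counts, minimum=MIN_TWEETS_PER_CONDITION):
    """Return the conditions whose collected tweet count is still under the minimum."""
    return [name for name in CONDITION_TERMS if counts.get(name, 0) < minimum]


# One query string per condition, ready to pass to whichever API client the team chooses.
queries = {name: build_query(terms) for name, terms in CONDITION_TERMS.items()}
```

Running the collection periodically over the 4-week window and checking `conditions_below_minimum` after each run would show which conditions still need more data.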
The project team should manually inspect portions of the resulting dataset to ensure that it meets the selection criteria, and should consider cleaning the data to remove extraneous results, duplicates, and retweets.
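One possible cleaning pass is sketched below. It assumes each tweet is a dictionary with a `text` field (an assumption about how the team stores its data), treats any text starting with `RT @user` as a retweet, and treats two tweets as duplicates when their text matches after URLs, @mentions, case, and extra whitespace are removed. These rules are illustrative heuristics, and the team may prefer stricter or looser ones.

```python
import re


def normalize(text):
    """Strip URLs and @mentions, collapse whitespace, and lowercase for duplicate detection."""
    text = re.sub(r"https?://\S+", "", text)
    text = re.sub(r"@\w+", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def is_retweet(text):
    """Heuristic: classic retweets begin with 'RT @user'."""
    return re.match(r"RT @\w+", text) is not None


def clean_tweets(tweets):
    """Drop retweets and near-identical duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = tweet["text"]
        if is_retweet(text):
            continue
        key = normalize(text)
        if not key or key in seen:
            continue
        seen.add(key)
        cleaned.append(tweet)
    return cleaned


# Made-up sample records, for illustration only.
raw = [
    {"id": 1, "text": "New COPD treatment options https://example.com/a"},
    {"id": 2, "text": "RT @someone: New COPD treatment options https://example.com/a"},
    {"id": 3, "text": "new COPD treatment options   https://example.com/b"},
    {"id": 4, "text": "Is heart disease hereditary?"},
]
cleaned = clean_tweets(raw)
print(len(raw), "tweets before cleaning,", len(cleaned), "after")
```

Keeping both `raw` and `cleaned` makes it straightforward to report the totals before and after cleaning.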
Results to Be Determined
Using Python code and libraries for natural language processing, the tweets will be examined to determine, for each condition:
- The total number of tweets, before and after cleaning.
- The proportion of tweets that refer to – or ask about – treatments or cures.
- The proportion of tweets that refer to a spouse, partner, husband/wife, boyfriend/girlfriend, significant other, etc. (In other words, we are trying to understand if people are talking about their own condition, or trying to find help for a partner.)
- The proportion of tweets that are questions.
- The proportion of tweets that express emotions. Optionally, this could be expanded into a full sentiment analysis.
- The proportion of tweets that refer to websites. (This is useful to understand how many people are relying on websites for health care information.)
- The most common websites mentioned.
- The most common keyword clusters (of 3 or more words).
- Demographic data (age, gender, marital status, etc.) – if this data is accessible through the Twitter API.
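Several of the metrics listed above could be approximated with simple pattern matching before moving on to full NLP libraries. The sketch below assumes tweets are plain text strings containing full (expanded) URLs. The keyword lists are hypothetical examples rather than a validated vocabulary, and "keyword clusters" are interpreted here as three-word sequences (trigrams), which is only one possible reading.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical keyword lists; the team would need to refine these.
TREATMENT_RE = re.compile(r"\b(treatments?|cures?|therapy|therapies|medications?)\b", re.I)
PARTNER_RE = re.compile(
    r"\b(spouse|partner|husband|wife|boyfriend|girlfriend|significant other)\b", re.I
)
URL_RE = re.compile(r"https?://\S+")
WORD_RE = re.compile(r"[a-z']+")


def proportion(tweets, predicate):
    """Fraction of tweets for which predicate(text) is true."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if predicate(t)) / len(tweets)


def summarize(tweets):
    """Compute simple per-condition proportions from a list of tweet texts."""
    return {
        "treatment": proportion(tweets, lambda t: bool(TREATMENT_RE.search(t))),
        "partner": proportion(tweets, lambda t: bool(PARTNER_RE.search(t))),
        # Remove URLs first so a '?' inside a query string is not counted as a question.
        "question": proportion(tweets, lambda t: "?" in URL_RE.sub("", t)),
        "website": proportion(tweets, lambda t: bool(URL_RE.search(t))),
    }


def top_domains(tweets, n=5):
    """Most frequently linked host names."""
    domains = Counter()
    for t in tweets:
        for url in URL_RE.findall(t):
            host = urlparse(url).netloc.lower()
            if host:
                domains[host] += 1
    return domains.most_common(n)


def top_trigrams(tweets, n=5):
    """Most frequent three-word sequences, ignoring URLs and case."""
    counts = Counter()
    for t in tweets:
        words = WORD_RE.findall(URL_RE.sub("", t).lower())
        counts.update(zip(words, words[1:], words[2:]))
    return counts.most_common(n)


# Made-up sample tweets, for illustration only.
sample = [
    "Is there a cure for colon cancer?",
    "My husband was diagnosed with heart disease last week",
    "Good overview of treatment options https://example.org/guide?id=3",
    "Walking every day helps me feel better",
]
summary = summarize(sample)
```

Keyword matching of this kind will miss paraphrases and produce some false positives, so manual spot checks of the matched tweets would still be needed.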
Additional results may also be considered and presented. Often, something “jumps out” in the course of analyzing the data.
Is the medical condition predictive of any of the metrics from the analysis of tweets? (E.g., sentiment.)
Are the tweet metrics clustered? Are they predictive of the medical condition?
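One way to begin exploring whether a tweet metric differs by medical condition is a chi-square test of independence on a table of counts. The sketch below computes the Pearson chi-square statistic by hand. The counts are invented purely for illustration, and the choice of test is an assumption; the team may prefer other methods, or a library routine such as `scipy.stats.chi2_contingency`.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts. Every row and column total
    is assumed to be non-zero, otherwise an expected count would be zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Invented counts: rows are two conditions, columns are [question, not a question].
example_table = [
    [30, 70],
    [50, 50],
]
stat, dof = chi_square(example_table)
print(f"chi-square = {stat:.2f} with {dof} degree(s) of freedom")
```

The statistic would then be compared against the chi-square distribution with `dof` degrees of freedom to judge whether the difference between conditions is larger than chance alone would suggest.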
The most obvious opportunity for visualization is to show how the various data results compare across the different medical conditions.
Other ideas might include the ability to superimpose or compare the different data items.
Publication of Results
Acceptance of Results
The project will be considered for publication on the ED Treatment Information Center (EDtreatment.info) website, provided it:
- Meets the requirements listed in this document, and
- Receives a grade of at least 85% or a B+, depending on the grading criteria used in your course.
If multiple projects meet these criteria, the project proposer (Robert Nicholson) will select one of the projects for publication, or consider publishing the results of more than one project.
The team will be expected to provide both the raw data (tweets) and the results of the analysis to Robert Nicholson.
The team will also provide a description of the methodology used to collect the data, and a description of any data cleaning that was applied.
The results of the project will be used as the basis for a research paper, written by Robert Nicholson and submitted for publication to one or more peer-reviewed journals. In the event that the paper is not accepted for publication in a peer-reviewed journal, it will be posted on a non-reviewed open access site such as arxiv.org, and on the EDtreatment.info website.
Members of the project team will have an opportunity to review the final paper before publication, and will be listed as co-authors on the paper. The ED Treatment Information Center will publicize the paper through a press release, and announcements on the Center’s mailing list and social media channels.
Please note that the ED Treatment Information Center website is not a refereed journal, and is generally not cited in other papers. Nevertheless, a publication on the website can be listed on resumes, discussed in job interviews, etc.
Members of the team will be required to grant the ED Treatment Information Center perpetual, non-exclusive rights to publish their research and visualization on the EDtreatment.info website.
This is a non-exclusive publication; members of the team are free to post or publish their results and visualization elsewhere as well.
No payment will be provided by Robert Nicholson or the ED Treatment Information Center.
The sole compensation for your work will be publication and credits.
Advice and Consulting
I anticipate we will have regular online meetings to discuss the project. I will also be available via telephone, email, or Skype to discuss goals, background, requirements, research direction, and technical questions.
Prior Research Based on Tweets
Erectile Dysfunction Research
A recent study examined tweets to understand attitudes about Erectile Dysfunction, and another sexual condition, Premature Ejaculation.
Reviewing the methodology of this study may be helpful to the project team.
General Medical Related Research
Analysis of tweets has become a useful and powerful tool for health-related research. A paper published in 2017 reviewed over 1,000 articles to understand how they used Twitter.
A review of this paper may provide useful ideas for the project team.
Prior Research Conducted by the ED Treatment Information Center
The ED Treatment Information Center conducts original research to better understand how erectile dysfunction affects men and their partners. Our previous studies include:
Comprehensive Study on the Impact of Erectile Dysfunction
Published: March 30, 2018
A survey of 597 adult men suffering from erectile dysfunction found high levels of stress, dissatisfaction with medical care and treatment options, and mental health issues.
The Impact of Erectile Dysfunction on Partners of Men with ED
Published: December 15, 2018
A survey of 129 adult partners of men with erectile dysfunction found high levels of relationship stress and a general lack of communication.
 For previous papers published by the ED Treatment Information Center, see: https://edtreatment.info/about-edtreatment-info/research-on-erectile-dysfunction/
 Sansone, A.; Cignarelli, A.; Ciocca, G.; Pozza, C.; Giorgino, F.; Romanelli, F.; and Jannini, E. A. “The Sentiment Analysis of Tweets as a New Tool to Measure Public Perception of Male Erectile and Ejaculatory Dysfunctions.” Sexual Medicine. August 2019: S2050-1161(19)30085-6.
 Sinnenberg, Lauren; Buttenheim, Alison M.; Padrez, Kevin; Mancheno, Christina; Ungar, Lyle; and Merchant, Raina M. “Twitter as a Tool for Health Research: A Systematic Review.” American Journal of Public Health. January 2017: 107(1): e1–e8.