SHARP


SHARP is short for Social Health Analytics Research Platform.

Online health-related social networks generate a rapidly growing stream of big data. This social health data is large in volume, created in real time, and highly noisy. To realize the potential of big data and social media for improving public and consumer health, techniques from the Semantic Web and Machine Learning need to be applied to new interdisciplinary problems that are hard to solve with traditional methods. Examples include integrating heterogeneous health data sources, monitoring disease outbreaks in real time, mining public sentiment toward epidemics, and predicting potential diseases for individual patients. Big data analytics can help patients, clinicians, and the general public make healthcare decisions based on better use of available data, building a foundation for improving healthcare services in the 21st century.

Monitoring threats to public health is important for the healthcare community. The traditional report-based method is expensive and suffers from significant time delays. The Disease Outbreak Monitoring component gathers real-time tweets containing specified health-related keywords (e.g. listeria), along with the associated user profile information, and stores them in our local relational database for subsequent analysis.
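
The collection step can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the table layout, the shape of the tweet dictionary, and the function names are assumptions made for the example, and SQLite stands in for whichever relational database is used.

```python
import sqlite3

# Hypothetical keyword list; "listeria" is the example given in the text.
KEYWORDS = {"listeria"}


def matches_keyword(text, keywords=KEYWORDS):
    """Return True if the tweet text contains any monitored keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def create_schema(conn):
    """Create a hypothetical two-table schema: user profiles and tweets."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "user_id INTEGER PRIMARY KEY, screen_name TEXT, "
        "location TEXT, followers INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweet_id INTEGER PRIMARY KEY, "
        "user_id INTEGER REFERENCES users(user_id), "
        "created_at TEXT, text TEXT)"
    )


def store_if_relevant(conn, tweet):
    """Store a tweet and its author's profile if it mentions a keyword.

    `tweet` is an assumed dictionary shape, not the Twitter API format.
    Returns True if the tweet was stored, False if it was skipped.
    """
    if not matches_keyword(tweet["text"]):
        return False
    user = tweet["user"]
    conn.execute(
        "INSERT OR REPLACE INTO users VALUES (?, ?, ?, ?)",
        (user["id"], user["screen_name"], user["location"], user["followers"]),
    )
    conn.execute(
        "INSERT OR REPLACE INTO tweets VALUES (?, ?, ?, ?)",
        (tweet["id"], user["id"], tweet["created_at"], tweet["text"]),
    )
    conn.commit()
    return True
```

Keeping the profile in its own table means the location and follower count needed by the later visualizations are stored once per user rather than once per tweet.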

To serve different epidemic-detection needs, four visualizations are generated. (1) The instance map shows tweets based on “single” users’ locations. (2) The distribution map displays the absolute and relative frequencies of tweets per state, where the relative frequency is the absolute frequency divided by the state’s population. This map shows which states have the most Twitter users tweeting about an epidemic. (3) The filter map lets users monitor the spread of an epidemic over time and by users’ influence: a (minimum, maximum) range of followers restricts the display to Twitter users in that range. (4) The hashtag cloud presents a frequency-based list of disease-related hashtags. Clicking a hashtag shows the tweets that mention it.
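
The arithmetic behind the distribution map and the filter map is simple enough to sketch. The function names and the tweet dictionary keys (`state`, `followers`) below are assumptions for illustration, and the population figures are supplied by the caller.

```python
from collections import Counter


def absolute_frequency(tweets):
    """Count tweets per state (the absolute frequency)."""
    return Counter(tweet["state"] for tweet in tweets)


def relative_frequency(absolute, population):
    """Divide each state's tweet count by that state's population.

    States with no known (or zero) population are skipped.
    """
    return {
        state: count / population[state]
        for state, count in absolute.items()
        if population.get(state, 0) > 0
    }


def filter_by_followers(tweets, minimum, maximum):
    """Keep only tweets whose author has a follower count in [minimum, maximum]."""
    return [t for t in tweets if minimum <= t["followers"] <= maximum]
```

Normalizing by population keeps populous states from dominating the map simply because they have more Twitter users.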


As seen in the recent Ebola virus scare, public concern may cause unnecessary anxiety and other negative social consequences. Keeping track of trends in public health concerns and identifying peaks of public concern are crucial tasks.

To address these problems, we classified user-generated social network data, i.e. Twitter messages, into different sentiment categories. We also quantified the sentiment trend by defining a measure of concern (MOC) derived from relevant Twitter messages. Our sentiment classification approach has two steps. In the first step, a subjective clue-based approach automatically labels training datasets to distinguish Personal tweets from News (i.e. non-personal) tweets. Machine Learning models are then trained on these automatically labeled datasets to classify new tweets as Personal or News. In the second step, an emotional clue-based method automatically labels another training dataset to distinguish Negative from Non-negative tweets. Finally, Machine Learning models trained on this dataset classify each Personal tweet as either Negative or Non-negative.
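
The automatic labeling in the two steps can be illustrated with a small sketch. The clue lists here are hypothetical placeholders, since the actual clue lexicons are not given in this description. Labeling only Personal tweets in the second step is likewise an assumption. Training the Machine Learning models on the resulting datasets is not shown.

```python
import re

# Hypothetical clue lists for illustration only.
PERSONAL_CLUES = {"i", "my", "me", "feel", "worried"}
NEGATIVE_CLUES = {"worried", "scared", "afraid", "sick", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_personal(text):
    """Step 1: label as 'Personal' if any subjective clue occurs, else 'News'."""
    return "Personal" if PERSONAL_CLUES & set(tokenize(text)) else "News"


def label_negative(text):
    """Step 2: label as 'Negative' if any emotional clue occurs, else 'Non-negative'."""
    return "Negative" if NEGATIVE_CLUES & set(tokenize(text)) else "Non-negative"


def auto_label(tweets):
    """Build the two automatically labeled training sets.

    Returns (step1, step2): step1 pairs every tweet with Personal/News,
    step2 pairs each Personal tweet with Negative/Non-negative.
    """
    step1 = [(text, label_personal(text)) for text in tweets]
    step2 = [
        (text, label_negative(text))
        for text, label in step1
        if label == "Personal"
    ]
    return step1, step2
```

Each returned list is a set of (text, label) pairs that could be passed to any standard text classifier as training data.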

Using the results of sentiment classification, we compute the MOC for each time interval. The MOC is displayed in real time to quantify public concern in both the temporal and geographic dimensions.
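
The formula for the MOC is not given in this description. Purely for illustration, the sketch below assumes the MOC of a time interval is the share of Personal tweets in that interval that were classified Negative. The function name, the input tuple layout, and the daily interval are also assumptions.

```python
from collections import defaultdict
from datetime import datetime


def measure_of_concern(classified, interval_format="%Y-%m-%d"):
    """Compute an assumed MOC per time interval.

    `classified` is an iterable of (created_at, is_personal, is_negative)
    tuples, where created_at is a datetime. Tweets are grouped by the
    string produced by `interval_format` (daily by default).
    Assumed definition: negative Personal tweets / all Personal tweets.
    """
    personal = defaultdict(int)
    negative = defaultdict(int)
    for created_at, is_personal, is_negative in classified:
        if not is_personal:
            continue  # News tweets do not contribute to the measure
        key = created_at.strftime(interval_format)
        personal[key] += 1
        if is_negative:
            negative[key] += 1
    return {key: negative[key] / personal[key] for key in personal}
```

Grouping by a state code instead of a date string would give the geographic view in the same way.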


Delivering contextually relevant knowledge resources into EHR systems at the point of care was proposed as the HL7 Infobutton standard. However, the Infobutton standard does not consider social media data or patients’ healthcare behaviors and practices. The Social InfoButtons system collects patient-generated social media data and other open health data to provide insights into healthcare trends and patients’ practices and issues. It uses a Semantic Integration model that supports Social Healthcare Knowledge for clinicians, patients, and policy makers.

With the Health 3.0 trend, it is becoming increasingly important to understand patients’ actual health practices, behaviors, trends, and concerns. The Social InfoButtons system generates contextually summarized information about social health practices along geographic or temporal dimensions, providing end users (e.g. patients, clinicians, or government officials) with healthcare information such as treatments, practices, conditions, experiences, sentiments, and behaviors reported by other patients through social media.