InForMID
Tufts Initiative for the Forecasting and Modeling of Infectious Diseases

TAWB – Oral Presentations

Group A: Nasir et al. Substance Use Disorder Outcome Predictions for Decision Support: A Holistic Data Approach

Substance use disorder treatment has become an increasingly important issue over the last few decades, and with the recent opioid epidemic, it is more critical than ever to understand and tackle the problem. Each day, 130 people in the US die due to complications from substance use. In this work, we propose a novel data analytics approach for building decision support tools for care providers using machine-learning models. Using the models, we also highlight potentially important factors and their dynamic relationships for the various outcomes, and we show that our model can provide useful information to augment decision-making processes.

Group A: Wang et al. Sleeping Disorders and Alzheimer's Disease: Evidence for Heterogeneity

Disturbed sleep is well known as a typical symptom of Alzheimer's disease (AD) and is highly prevalent among AD patients. However, emerging evidence indicates that disturbed sleep may also contribute to the risk of AD. Experimental studies suggest that the sleep-wake cycle directly influences the levels of toxic proteins related to AD in the brain. Poor sleep quality and sleep deprivation encourage the accumulation of, and impede the cleavage of, toxic proteins linked to AD. Building on previous studies, the purpose of the present study is twofold: (1) to understand the relationship between sleep disorders and the risk of late-onset AD based on a larger sample size than in past work; and (2) to investigate whether a particular sub-population of sleep disorders is related to a higher risk of late-onset AD. Applying a Bayesian survival analysis (i.e., a Bayesian approach to the Cox-Gompertz model), this large-scale retrospective longitudinal study reveals that patients who suffer from sleep disorders (excluding hypersomnia/insomnia) display a higher risk of late-onset AD after adjusting for other AD risk factors, including gender, ApoE genotype, and clinician-assessed health conditions. Our results suggest the existence of heterogeneity among sleep disorders in terms of how they affect the development of late-onset AD.

Group B: Stark. What a “Batman Graph” shows about US income: A cautionary tale of correlation, principal components, and nonlinear regression

This mini-case-study provides a striking example of why we need to graph the relationships we’re analyzing. Via principal components analysis, we can summarize many US Census income indicators into a Wealth index and a Poverty index, to describe each of 33,000 ZIP codes.  Simple correlation indicates a weak negative linear relationship between the two indices.  A plausible next step is to check for nonlinear fits, but regression with quadratic and cubic terms hardly outperforms correlation.  Visualization, however, reveals a complex and puzzling relationship that clearly requires more creative and resourceful analytic approaches.
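As a minimal illustration of the general point (not the Census data or the analysis from the talk), the sketch below computes a Pearson correlation for synthetic points placed evenly on a unit circle. The data, the `pearson_r` helper, and the choice of a circle are all assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic data (an assumption, not the Wealth/Poverty indices):
# 360 points evenly spaced on a unit circle.
angles = [2 * math.pi * k / 360 for k in range(360)]
xs = [math.cos(t) for t in angles]
ys = [math.sin(t) for t in angles]

r = pearson_r(xs, ys)
print(f"Pearson r = {r:.6f}")  # approximately zero despite the obvious structure
```

In this toy case each x value pairs with both a positive and a negative y value, so the average y at any x is zero and polynomial terms in x would not improve the fit either. Only a scatter plot makes the structure visible.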

Group B: Nasir et al. Synthetic Average Neighborhood Sampling Algorithm (SANSA): A Neighborhood Informed Synthetic Sample Placement Approach to Improve Learning from Imbalanced Data

Machine-learning classification models are increasingly used both in real-world applications and in the academic literature. However, many real-world phenomena of interest occur rarely, and these rare events are often the more interesting and, in many cases, much higher-stakes ones to predict. In this work, we propose a new synthetic data generation algorithm that uses a novel “placement” parameter, which can be tuned to adapt to each dataset's unique manifestation of the imbalance. SANSA also defines a novel modular framework to rank, generate, and scale the new samples, which can be used in the future with other functions to propose better methods.

Group C: Ericson. Minds in Motion: Using positional data to study human behavior and cognition

This talk provides an introduction to three analytical approaches for using positional data to gain insight into human behavior and cognition: (1) mouse tracking, (2) motion tracking, and (3) space syntax. Each approach is illustrated through brief presentations of ongoing research projects in human-computer interaction (HCI), spatial cognition, and environmental design. The first project uses mouse-tracking to examine the spatiotemporal dynamics of human decision-making during website use. The second project uses virtual reality (VR) motion-tracking data and directional (or circular) statistics to investigate human wayfinding and the geometry of our “cognitive maps.” The third project explores the limits of space syntax, a popular set of computational tools for predicting pedestrian walking patterns in urban and architectural environments. For each analytical approach, resources—including freely available software products and useful R packages—are noted for researchers interested in exploring the power of positional data in their own work.
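To show why heading data calls for directional (circular) statistics, here is a minimal sketch of the circular mean and mean resultant length. The function names and sample headings are illustrative assumptions and are not taken from the projects described in the talk.

```python
import math

def circular_mean_deg(headings_deg):
    """Mean direction in degrees, in [0, 360), via the unit-vector sum."""
    sin_sum = sum(math.sin(math.radians(h)) for h in headings_deg)
    cos_sum = sum(math.cos(math.radians(h)) for h in headings_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

def mean_resultant_length(headings_deg):
    """Concentration measure in [0, 1]; 1 means all headings coincide."""
    n = len(headings_deg)
    sin_sum = sum(math.sin(math.radians(h)) for h in headings_deg)
    cos_sum = sum(math.cos(math.radians(h)) for h in headings_deg)
    return math.hypot(sin_sum, cos_sum) / n

# Two hypothetical headings on either side of north.
headings = [350.0, 10.0]
arithmetic = sum(headings) / len(headings)  # 180.0, which points due south
circular = circular_mean_deg(headings)      # close to 0 (equivalently 360)
```

The arithmetic mean of these two headings points in the opposite direction from both of them, while the circular mean stays near north. The mean resultant length indicates how tightly the headings cluster around that direction.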

Group C: Abhilash et al. Building a Predictive Framework for Analyzing Opioid Patient Toxicology Data to Improve Medication-Assisted Therapy Adherence and Relapse

In the United States, over 2 million people are diagnosed with opioid use disorder (OUD), resulting in costs exceeding $500 billion each year. Pharmacological intervention by medication-assisted treatment (MAT) is the most effective approach, as judged by minimizing withdrawal severity, reducing relapse, and improving retention in treatment. Despite demonstrated MAT effectiveness, individual responses vary, reflected in high rates of dropout, non-adherence, and relapse. We have analyzed data from clinical toxicology tests over 18 months from 71 providers listing Suboxone as the MAT on order for approximately 1,500 patients. For these patients, analysis of clinical toxicology results (over 512,000 tests) enabled determination of the frequency of testing relative to MAT positives and unexpected positives of common drugs of abuse (including non-MAT opiates, amphetamine/methamphetamine, and benzodiazepines). Descriptive statistics demonstrate that testing frequency is related to adherence (measured by consistent appearance of MAT) and relapse (measured by unexpected positives). The analysis also found that more frequent testing was associated with decreased adherence and higher indicators of relapse for all drugs except benzodiazepines. In addition, significant associations between testing frequency, unexpected positives, and simulated adherence (i.e., dropping a drug directly into the urine) were observed among patients in a 6-month sample of the dataset. Finally, we identified geographically distributed patterns of drugs showing unexpected positives that were consistent with public health data on drug use. Our ongoing work includes predictive analytics to identify patients at risk for adverse adherence and relapse outcomes based on testing history, drug metabolism, and analysis of unexpected positive rates, as well as using the data to distinguish between metabolism and drug impurities and to determine the parent drug that was ingested. We propose that this predictive method can improve adherence and retention of MAT patients by aiding evidence-based treatment determination.

Group D: Bai & Wallbaum. Dual-dynamic Retirement Income Strategy for Better Retirement Outcomes

In response to the persistently low expected-return market environment, the present study proposes a dual-dynamic retirement management strategy aimed at improving retirement outcomes in the context of retirement portfolio decumulation. Using a Monte Carlo simulation approach, the study shows that the proposed dual-dynamic strategy, which embeds a target volatility asset management component, could improve the survival rate of a retirement portfolio and provide more sustainable retirement coverage over the two hypothetical retirement spans examined: a twenty-year retirement scheme and a twenty-five-year retirement scheme. Benefiting from the target volatility mechanism, the dual-dynamic retirement solution, which adjusts the portfolio asset allocation and the annual portfolio withdrawal rate simultaneously, could therefore be a new option for current retirement markets.
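For readers unfamiliar with how a portfolio survival rate is estimated by Monte Carlo simulation, the following is a minimal baseline sketch. It is not the authors' dual-dynamic strategy: it assumes a fixed allocation, a fixed annual withdrawal, and normally distributed annual returns, and every parameter value (`mean_return`, `volatility`, `n_paths`, the withdrawal rates) is a hypothetical choice for illustration.

```python
import random

def survival_rate(withdrawal_rate, years, n_paths=5000,
                  mean_return=0.04, volatility=0.10, seed=0):
    """Fraction of simulated paths that fund every annual withdrawal.

    Baseline only: fixed withdrawal (a fraction of the initial balance)
    and i.i.d. normal annual returns, both simplifying assumptions.
    """
    rng = random.Random(seed)
    initial_balance = 1.0
    withdrawal = withdrawal_rate * initial_balance
    survived = 0
    for _ in range(n_paths):
        # Draw the whole return path up front so the same seed gives the
        # same market scenarios regardless of the withdrawal rate.
        returns = [max(rng.gauss(mean_return, volatility), -1.0)
                   for _ in range(years)]
        balance = initial_balance
        depleted = False
        for annual_return in returns:
            if balance < withdrawal:
                depleted = True
                break
            # Withdraw at the start of the year, then apply the return.
            balance = (balance - withdrawal) * (1.0 + annual_return)
        if not depleted:
            survived += 1
    return survived / n_paths

if __name__ == "__main__":
    for span in (20, 25):
        print(span, survival_rate(0.05, span))
```

A dynamic strategy of the kind described in the abstract would replace the fixed `withdrawal` and the fixed return distribution with rules that adjust each year; comparing its survival rate against a baseline like this one is the usual way such improvements are reported.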

Group D: Mentzer et al. The Role of Gender and Party on the Twitter Conversation in the 2018 U.S. Senate Elections

This study examines the impact of gender and party on the Twittersphere conversation leading up to the 2018 U.S. Senate elections. We consider the gender and party not only of the candidate but also of the Twitter user, which allows us to examine who is driving the conversation on Twitter. We propose a novel technique, using sentiment analysis, to measure affective polarization.

Group E: McGuirk. Business Ethics at the Core of Successful and Sustainable Analytics Practices

The International Institute for Analytics has included 'ethics of analytics' as a top five business imperative in both its 2019 and 2020 Analytics Predictions and Priorities lists. Interestingly, this heightened need to focus on ethical practices coincides with a separate industry and business priority: ramping up efforts to identify new ways to collect consumer data and to apply analytics and data science techniques to this data to gain a competitive advantage. In fact, a recent Forrester Research study found that over 50% of the companies surveyed are planning to launch initiatives to expand their ability to source external data, and that many of these firms will appoint 'data hunters' to lead these initiatives. In this session we will dive into this critical moment for the analytics and data science communities, a time when an increased reliance on these practices will intensify the need to incorporate sound judgment and ethical business practices across all facets of analytics operations. Unfortunately, there have been too many recent cases of companies committing transgressions that quickly erode consumer trust in their ability to properly manage data and deploy ethical analytics practices. In 2018, Facebook enabled Cambridge Analytica to use millions of members’ personal data without their consent for targeted political advertising, and in 2019 Goldman Sachs came under fire for allowing blatant gender bias in algorithms used to establish credit limits for Apple Card customers. This session will explore the important role educational institutions can play in helping to inspire and empower students to use ethical and socially responsible data collection and analytic practices. It will also explore how industry can use cross-functional data governance teams to ensure that diverse and empathetic perspectives are considered when deciding what data should be collected, analyzed, and used to support insight-driven decision-making.