Bingjie Zhou

PhD candidate at Tufts University’s Friedman School of Nutrition Science and Policy, in the Division of Nutrition Epidemiology and Data Science.

Bingjie earned her MS in Biochemical and Molecular Nutrition from Tufts University in 2019 and her BA in Veterinary Medicine from China Agricultural University in 2017. Her current research focuses on developing a visual analytic framework to investigate time-series nutrition outcomes at individual, national, and global levels. Her work aims to improve the interpretation of complex nutritional data, including its geo-temporal relationships, and to contribute insights into the associations between dietary factors, such as sugar-sweetened beverages, and chronic diseases, in support of nutrition epidemiology and public health.

Final Presentation

Final Project

AI-Powered Thematic Analysis: Improving Accuracy and Efficiency in Qualitative Health Behavior Research

Project Description

Abstract:

The evaluation of nutrition data dashboards is critical for ensuring accurate communication of complex dietary information, yet manual evaluation is time-consuming, labor-intensive, and prone to bias. As the volume and complexity of nutrition-related data grow, there is a pressing need for efficient, automated evaluation methods. Artificial intelligence (AI), particularly large language models (LLMs), offers a promising solution, since LLMs now extend beyond traditional text processing to tasks such as image recognition, visual interpretation, and data visualization. This study aims to assess how effectively LLMs rank and comprehend nutrition data dashboards and visualizations. It will compare several LLM versions and three levels of instruction detail, using text and image data collected from the dashboards. The first level provides only website links and asks for a ranking without specific criteria; the second provides links together with detailed evaluation criteria; the third provides extracted data and visualizations together with comprehensive criteria. Anticipated discrepancies between LLM and human expert rankings would underscore the need for human involvement when LLMs are used for careful evaluation.

Introduction and Rationale:

The evaluation of nutrition data dashboards is crucial for ensuring that they convey complex dietary information accurately and comprehensively. However, manual evaluation processes are often time-consuming, labor-intensive, and subject to human biases. Moreover, with the increasing volume and complexity of nutrition-related data, there is a growing need for more efficient and automated evaluation methods. Artificial intelligence (AI), particularly large language models (LLMs), presents a promising solution to these challenges. LLM applications have moved beyond traditional text processing into areas such as image recognition, interpretation of visual information, and generation of figures and charts from raw data inputs.

Literature Review:

AI techniques like machine learning and deep learning are increasingly being applied to analyze complex nutritional data and provide personalized dietary recommendations.(1) Concurrently, interactive dashboards and data visualization tools are gaining popularity for monitoring and tracking food and nutrition information.(2) LLMs are being explored to generate textual summaries and insights from structured and unstructured data and from data visualizations.(3) However, ensuring the accuracy and reliability of AI-generated evaluations and insights, especially for critical healthcare applications like nutrition, is a significant challenge.(4) Another concern is addressing potential biases and limitations in the training data used for AI models, which could lead to skewed or incomplete evaluations.(5) Maintaining transparency and interpretability of AI models’ decision-making processes is therefore crucial for nutrition dashboard evaluations.

Use Case Description:

Objective: This study aims to assess how effectively LLMs rank and comprehend nutrition data dashboards and visualizations, comparing various versions of large language models and different levels of detail in the instructions and extracted data provided to them.

Methodology: We will collect both text and image data from the dashboards, provide three types of instructions with varying levels of detail, and compare the LLMs’ rankings and evaluations to those of human experts. The simplest instructions give only the website links of the dashboards and ask the LLMs to rank them without specific criteria. A more detailed approach provides the LLMs with the website links along with detailed evaluation criteria. The most comprehensive approach supplies the extracted data and visualizations from the dashboards, along with detailed evaluation criteria.
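The three instruction levels described above could be assembled programmatically so that every model receives identically worded prompts. The following is only a minimal sketch under stated assumptions: the `Dashboard` record, the `build_prompt` function, the prompt wording, and the example criteria are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Dashboard:
    """Hypothetical record for one dashboard under evaluation."""
    name: str
    url: str
    extracted_text: str = ""
    image_paths: list = field(default_factory=list)


def build_prompt(dashboards, tier, criteria=None):
    """Return instruction text for one of the three detail levels.

    tier 1: website links only, no criteria
    tier 2: website links plus detailed criteria
    tier 3: extracted text and visualization files plus detailed criteria
    """
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2, or 3")
    if tier > 1 and not criteria:
        raise ValueError("tiers 2 and 3 require evaluation criteria")

    lines = ["Rank the following nutrition data dashboards from best to worst."]
    if tier > 1:
        lines.append("Use these evaluation rules:")
        lines.extend(f"- {c}" for c in criteria)
    for i, d in enumerate(dashboards, start=1):
        lines.append(f"{i}. {d.name}: {d.url}")
        if tier == 3:
            # Only the most detailed tier includes the extracted material.
            lines.append(f"   Extracted text: {d.extracted_text}")
            lines.append(f"   Visualizations: {', '.join(d.image_paths)}")
    return "\n".join(lines)


# Illustrative input; the dashboard and criteria are placeholders.
example = [Dashboard("Example Dashboard", "https://example.org/a",
                     "Daily sugar intake by country", ["a_chart.png"])]
tier3_prompt = build_prompt(example, 3, ["clarity of labels", "data provenance"])
```

Keeping the tier logic in one function makes it easier to confirm that the only thing varying between conditions is the amount of detail supplied, not the wording of the request.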

Key stakeholders: The study involves AI researchers and developers, nutritionists, data scientists, and healthcare professionals. AI researchers and developers will be responsible for refining and testing the various versions of large language models. Nutritionists and healthcare professionals will contribute their expertise to ensure the evaluation criteria accurately reflect the quality and effectiveness of the nutrition data dashboards. Data scientists will facilitate the collection and processing of both text and image data from the dashboards, ensuring the integrity and reliability of the extracted data used in the study.

Proposed AI techniques or tools:

  • Large Language Models (LLMs): Models like GPT-4 or similar advanced LLMs will be utilized to comprehend and rank the nutrition data dashboards based on the provided instructions.

  • Natural Language Processing (NLP): NLP techniques will be employed to analyze and interpret the textual data extracted from the dashboards, ensuring accurate understanding and evaluation.

Knowledge Graphs and Conceptual/Causal Diagrams:

  • Include a knowledge graph to represent the relationships between entities in your use case.

  • Provide a conceptual or causal diagram to illustrate the relationships between concepts and their implications for the proposed solution.

Ethical Considerations:

Using AI tools to assess the quality of dashboards introduces several ethical considerations, particularly concerning data accessibility and usage. Current AI tools generally work best with structured inputs, so extracting information from interactive dashboards or websites may require web scraping. This raises ethical concerns regarding the legality and appropriateness of scraping data.

To address these ethical concerns, several measures can be taken:

  • From the Perspective of Web Scraping:

Ethical Considerations: Ensure that web scraping practices adhere to ethical guidelines, respecting the terms of service and privacy policies of the websites being scraped.

  • From the Perspective of Website Developers:

Terms of Use: Ensure that the website’s terms of use explicitly state the conditions under which data can be accessed and used, including any restrictions on automated data extraction.

Copyrighted Data: Respect intellectual property rights by ensuring that scraped data does not infringe on copyrights or other legal protections.
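One practical first check before scraping a dashboard is whether the site's robots.txt permits automated access to the page. The sketch below uses Python's standard `urllib.robotparser` module; the `may_fetch` helper, the sample robots.txt content, the user-agent name, and the URLs are illustrative assumptions. Passing this check does not by itself establish compliance with a site's terms of service, privacy policy, or copyright, which still need to be reviewed separately.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

allowed = may_fetch(SAMPLE_ROBOTS, "dashboard-eval-bot",
                    "https://example.org/dashboard")
blocked = may_fetch(SAMPLE_ROBOTS, "dashboard-eval-bot",
                    "https://example.org/private/data")
```

In a real workflow the robots.txt text would be retrieved from the target site (for example with `RobotFileParser.set_url` followed by `read`), and the result logged alongside a manual review of the site's terms of use.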

Conclusion and Recommendations:

Key findings:

This study anticipates discrepancies in dashboard ranking between LLMs and human experts, underscoring the critical role of human involvement when LLMs are used for careful evaluation. Collaborative efforts involving AI researchers, nutritionists, and healthcare professionals are crucial to refining evaluation criteria and ensuring the accuracy and relevance of dashboard assessments, whether conducted manually or with AI tools.
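The size of the discrepancy between an LLM ranking and a human expert ranking can be summarized with a rank correlation. The source does not name an agreement metric, so the sketch below assumes Spearman's rho as one reasonable option and assumes tie-free rankings; the `spearman_rho` function, the dashboard labels, and the rank values are hypothetical illustrations, not study data.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    Each argument maps an item name to its rank position (1 = best).
    Returns 1.0 for identical rankings and -1.0 for fully reversed ones.
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must cover the same dashboards")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two dashboards")
    # Sum of squared rank differences across all items.
    d_squared = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings of four dashboards labeled A to D.
human_ranks = {"A": 1, "B": 2, "C": 3, "D": 4}
llm_ranks = {"A": 2, "B": 1, "C": 3, "D": 4}
agreement = spearman_rho(human_ranks, llm_ranks)  # 0.8 for these made-up ranks
```

If ties are possible, for instance when experts score dashboards on a coarse scale, a tie-aware implementation such as the one in SciPy would be a better fit than this simplified formula.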

Limitations and areas for further research:

The challenge of ensuring the accuracy and reliability of AI-generated insights, particularly in healthcare domains like nutrition, is notable. Concerns arise from biases inherent in training data and the imperative for transparent AI decision-making processes. Future research should focus on investigating methodologies that enhance the transparency and interpretability of AI models, thereby fostering stakeholder trust and comprehension of AI-driven insights.

1. Theodore Armand TP, Nfor KA, Kim JI, Kim HC. Applications of Artificial Intelligence, Machine Learning, and Deep Learning in Nutrition: A Systematic Review. Nutrients. 2024 Apr 6;16(7):1073.

2. Zhou B, Liang S, Monahan KM, Singh GM, Simpson RB, Reedy J, et al. Food and Nutrition Systems Dashboards: A Systematic Review. Advances in Nutrition. 2022;13(3):748–57.

3. Han Y, Zhang C, Chen X, Yang X, Wang Z, Yu G, et al. ChartLlama: A Multimodal LLM for Chart Understanding and Generation [Internet]. arXiv; 2023 [cited 2024 Jun 17]. Available from: http://arxiv.org/abs/2311.16483

4. Zhang P, Kamel Boulos MN. Generative AI in Medicine and Healthcare: Promises, Opportunities and Challenges. Future Internet. 2023 Sep;15(9):286.

5. Norori N, Hu Q, Aellen FM, Faraci FD, Tzovara A. Addressing bias in big data and AI for health care: A call for open science. Patterns (N Y). 2021 Oct 8;2(10):100347.

7 Replies to “Bingjie Zhou”

  1. Bingjie, I really enjoyed the evolution of your project from the class. I really think there is utility in AI-generated evaluation of nutrition-related dashboards and you bring up an excellent point about the copyrights. I am wondering if there is a particular type of dashboard you would use as your use case. Is it something related to the general population or would you focus more on a specific disease?

    I am excited to watch this further evolve!

  2. Bingjie, I enjoyed your presentation. This timely project would provide valuable insight for accurately communicating nutrition information while reducing the burden. I appreciated your consideration of the ethical implications of using AI and suggestions for web developers.

  3. Hi Bingjie, I like how one focus of your use case is to compare the performance of LLMs with human evaluation. This is an important step for us to understand where AI tools like LLMs can improve our abilities, and where they might fall short. Nice job!

  4. Hi Bingjie, your project sounds very interesting, especially your approach to using AI to evaluate the effectiveness of the dashboard. I’m curious—will you be including self-evaluation as part of your methodology for this project?

  5. As my peers have noted, there is great importance in the delicate balance between human intervention and evaluation and the efficiency that AI provides. I love the way that you highlighted a significant place for both in a field that will increasingly require both parties to improve research methods and analyses. Thanks Bingjie!

  6. Bingjie! This is a great project. Dashboards have become an integral way of delivering critical information to consumers in a manageable form. I appreciate the ethical considerations you have noted, specifically around privacy. I am excited to see your continued efforts in this field and the improvements in nutrition health.

  7. Hi Bingjie, this is extremely fascinating and I appreciate the real-world applicability of your project. Dashboards are an incredibly useful and underutilized visual tool to help people understand nutrition information. I believe that AI would be very helpful in achieving these goals. You mentioned concerns for bias both from a human and AI standpoint. As we know, nutrition can be a very subjective field (even though maybe it shouldn’t be). I wonder if there’s a way to shape your project so that the AI reflects the nutrition knowledge through the lens of different types of nutrition experts (think about how low-carb vs low-fat scientists view nutrition). I’d be interested to see if the accuracy of an LLM differs based on the human expert inputs.
