# Statistical Analysis of Two Machine Learning Models Used for a Wearable Sweat Patch Mobile System

by Sevara Nastritdinova

mentor: Fiorenzo Omenetto, Biomedical Engineering; funding source: Gatof Summer Scholars Fund

As part of the Summer Scholars 2020 Program, I worked in Dr. Fiorenzo Omenetto’s lab, assisting with the development of a mobile system that quantitatively analyzes the pH of sweat as indicated by color changes in a wearable patch. My role involved analyzing the Mean Absolute Error (MAE) of two machine learning models: a Multilayer Perceptron (MLP) and Ridge Regression. Given that I had no background in advanced statistical analysis or machine learning, my journey encompassed not only getting numerical results for my mini-project, but also gaining tangible knowledge in both fields.

The first step of my plan was to study machine learning in more depth. Before that, however, I needed to learn at least introductory Python programming to understand the syntax used by authors on the web. I therefore completed a course, which also taught me web scraping and basic statistical analysis.

Next, I was confused by the names and approaches of the different models used in machine learning. What is more, I did not know how machine learning worked! Michael Pine, a graduate Tufts Computer Scientist, held an educational session in which he explained the concept and procedure. He taught Ordinary Least Squares Regression, Support Vector Machines, Random Forests and Neural Networks, as well as the pros and cons of each. The lecture also covered evaluation methods, including Mean Absolute Error and Mean Squared Error.
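For reference, these two metrics are commonly defined as follows. The symbols are chosen here for illustration and are not taken from the lecture: $n$ is the number of samples, $y_i$ a measured value (for example, a pH reading) and $\hat{y}_i$ the model's prediction for it.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
\qquad
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

Because the errors are squared, MSE penalizes a few large mistakes more heavily than MAE does, while MAE stays in the same units as the quantity being predicted.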

However, given the expansive nature of the overall project, I started by focusing on Principal Component Analysis, trying to find the Optimal Principal Component. I discussed the topic with Michael, who reassured me that the results had already been produced, so I would only need to confirm them. Once again, though, I did not know what Principal Component Analysis was. Consequently, I took an Advanced Statistical Analysis course on Coursera to get an idea of the concept. Although I learned that topic and many more, I was not able to apply my knowledge in Python because the Introduction to Python course did not cover it!

Therefore, given my very limited knowledge in the area, I decided to cover more material and calculate the MAE for two of the models used in the analysis: the MLP, a type of Neural Network, and Ridge Regression, a regularized variant of Ordinary Least Squares Regression. First, I at least knew the concepts from Michael’s mini-lecture. Second, I knew what Mean Absolute Error was from my Engineering Mathematics classes. Thankfully, Michael’s team had already obtained those results too. After a couple of weeks of struggling with the syntax, I obtained the Mean Absolute Error for both the training and validation data by referring to Michael’s results and Python notebook. My values matched Michael’s.
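As a rough illustration of the kind of calculation described above, the sketch below fits a one-feature ridge regression and reports the MAE on training and validation data. It is not the lab's notebook: the data are made up, the single "color reading" feature is an assumption, and the names `mean_absolute_error`, `fit_ridge_1d` and `predict` are hypothetical helpers written for this example. The MLP is left out for brevity; the same MAE function would apply to its predictions.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one feature with an unpenalized intercept.

    Minimizes sum((y - (w*x + b))**2) + alpha * w**2.
    With alpha=0 this reduces to ordinary least squares.
    """
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    w = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    b = y_mean - w * x_mean
    return w, b


def predict(x, w, b):
    """Apply the fitted line to each input value."""
    return [w * xi + b for xi in x]


# Synthetic, illustrative data only: a normalized color reading vs. pH.
x_train = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
y_train = [4.1, 4.9, 5.4, 6.2, 6.8, 7.6]
x_val = [0.20, 0.50, 0.80]
y_val = [4.6, 5.8, 7.3]

w, b = fit_ridge_1d(x_train, y_train, alpha=0.01)
train_mae = mean_absolute_error(y_train, predict(x_train, w, b))
val_mae = mean_absolute_error(y_val, predict(x_val, w, b))
print(f"training MAE:   {train_mae:.3f}")
print(f"validation MAE: {val_mae:.3f}")
```

Comparing the two numbers is the point of the exercise: a validation MAE far above the training MAE would suggest the model is overfitting the training data.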

To conclude, this summer was one of my most challenging learning experiences. It was not so much my lack of knowledge as the absence of in-person teaching and mentoring that limited how much I could teach myself about so many new concepts. I could have reached out to my team and Principal Investigator more often for guidance, but my idea of what a self-starting, independent researcher should be kept me from being more proactive. Regardless, I gained substantial knowledge of statistics and exposure to the machine learning field, and I can now understand some of the main concepts discussed by computer scientists working in it.

Hi Sevara! Sorry you had such a struggle with this project! Looks like you had a crazy uphill battle to go through but I’m glad to see you got something out of it. I’m looking forward to seeing the results of this research in practice 🙂

Nice job! Learning these ideas on your own is definitely not easy! I’m impressed by your perseverance. As both a math nerd and athlete, it was cool to read about your use of these ML tools applied to this really interesting wearable research.