Yiting Zhang
For Aspiring Data Scientist Yiting Zhang, Research Opportunities and Diverse Courses Distinguish BU MET
Research Assistant, Health Informatics Research Lab (HILab)
MS in Computer Science, Concentration in Data Analytics
BS, Penn State University
What compelled you to return to school and pursue a graduate education? What is your long-term objective?
I completed my bachelor’s degree in Industrial Engineering (IE) at Penn State University in 2018. I fully recognize the valuable role of IE technology in transforming businesses. Because the IE department at Penn State offered so many interdisciplinary courses, I developed a keen interest in, and a solid theoretical foundation in, mathematics, statistics, and computer science. This is why I wanted to pursue a different field here at BU and give myself a challenge.
As we all know, AI, big data, machine learning, and deep learning have become very popular in recent years, both in the United States and elsewhere in the world, including my home country. I hope to learn core knowledge and technologies here, since the US has the best educational institutions and I definitely need to take advantage of that. I believe pursuing a graduate degree will allow for a more in-depth exploration of my fields of interest and will prepare me well for my future career.
My objective now is to become a data scientist and, if possible, work for companies or organizations related to healthcare. I would like to eventually develop a consulting team to help different businesses with their data analysis and IT strategy, or even work with government and non-profit organizations.
Why did you choose BU MET for your graduate studies? What set BU MET apart from other programs you were considering?
My personal motivation is to quickly find a better job after I graduate. I noticed that BU MET has lots of fantastic career-oriented programs, which perfectly match my concerns and future plans. I also considered other programs, such as operations research or business analytics. However, BU MET’s courses in the Computer Science department were more appealing to me. MET has fundamental programming courses for Python, R, and Java.
Advanced courses, such as Data Science with Python (MET CS 677) and Data Analysis and Visualization with R (MET CS 555), teach data analysis and visualization methods using Python and R. For my own interests, as mentioned above, Machine Learning (MET CS 767) and Big Data Analytics (MET CS 777) are two wonderful courses for anyone who wishes to become a data scientist. BU MET has the most diverse courses among all the programs that I considered, so it became my first choice.
Is there a particular faculty member from your courses who has enhanced your experience at BU MET? Who and why?
I took Professor Kia Teymourian’s course Data Analysis and Visualization with R, and he became one of my favorite instructors at MET. His explanations of statistical concepts are very clear and understandable. His lectures helped me recall many concepts I had learned before, and they also introduced new, advanced statistical knowledge. The course uses R, which is my favorite language for processing data, and Professor Teymourian shares lots of sample code on GitHub. It is very helpful for my homework, and it saves time when I process data in real-life situations.
In Spring 2019, I was selected to become a research assistant in the Health Informatics Research Lab (HILab) under Professor Guanglan Zhang, and I am currently working with her team.
How do you apply concepts you are learning in your courses at BU MET in your research with the HILab?
It is fantastic to use R to run hypothesis tests and to understand the significance of the variables we compare in our research. R is also a strong tool for data visualization, including box plots, mosaic plots, scatter plots, histograms, and others.
R packages are also worth exploring. Here we found a package that can convert different ICD (International Classification of Diseases) codes to one another. It can also translate numerical codes into disease descriptions, which saves the time otherwise spent manually filtering and converting ICD codes.
Python is a hot language but a little bit unfamiliar to me. I have only taken Information Structures with Python (MET CS 521), but now I can use Python to process and filter the raw data from our research lab database. By taking the Python course and doing research at the same time, I have always tried to simplify my code with new knowledge from the lectures so I can cut down my run time when dealing with huge amounts of data from our database.