Virtual Poster Session

12:00 - 12:50 PM, Wednesday, April 21, via Gather.Town

Poster #1

Name: Neil Callahan



Title: Predicting Success of College Running Backs in the NFL

Abstract:

This poster will look at the relationship between National Football League (NFL) draft picks from National Collegiate Athletic Association (NCAA) football programs and the success of these players in the NFL. For this project, data were collected on running backs drafted from 2005 to 2020. The goal was to build a model to predict whether players would be successful in the NFL. I used four variables to predict the outcome. Various predictive models were built, but ultimately the best was a Naïve Bayes model that was around 80% accurate at classifying busts and successes. Career yards turned out to be the most important factor in making predictions, while BMI was the least important. The four distribution graphs compare the variables against the outcome and helped in making decisions about cutoffs when classifying the players as busts or successes.

 

 

Poster #2

Name: Joe Kulas



Title: Will Minor League Baseball Players Make it to the Major Leagues?

Abstract:

Many minor league baseball players never make it to the majors, especially given that there are many more players in the minor leagues than there are spots available on major league rosters. The goal of this project was to use predictive modeling to investigate which factors predict whether current minor leaguers will make it to the majors in the future. I collected data on minor league baseball statistics for current and former professional baseball players. Using these data, I implemented several prediction methods and used their misclassification rates to determine which model performed best. The random forest model was found to be superior to the other methods. For pitchers, the most important predictors of reaching the majors include strikeouts, games played, and hits allowed; for batters, they include games played, at-bats, and hits. Finally, this best model predicted that only about 120 of the thousands of current minor leaguers would make it to the majors in the future.

 

Poster #3

Name: Evan Rondeau



Title: Impacts of Data on Direct Marketing

Abstract:

Many businesses employ an analytics team to help them gain insight into industry trends and make decisions regarding workflow and revenue. What benefits can analytics offer to a business that does not employ such a team? My project will show what a short-term internship can contribute and the effect this work had on a marketing campaign surrounding a webinar series.

 

Poster #4

Name: Thomas Veenker



Title: Analyzing and Predicting the Success of Reddit User Submissions

Abstract:

For this project, I examined user submissions to Reddit, a popular social news aggregation website, to determine what factors generated community approval and led to higher visibility. To obtain the data, I created a unique Reddit API, learned basic programming in Python, and taught myself how to web scrape Reddit in Python using API wrappers. After scraping 25,000 user submissions from Reddit, I analyzed the data to ascertain the effects of certain parameters (e.g., keywords, sentiment, length, submission time/date) on the “success” of a Reddit submission, created a regression model to predict the “success” of any user submission, and developed a general strategy to maximize the potential visibility of a user submission. My research has promise for both advertisers and individual users who want to broadcast to a larger audience on Reddit.

 

Poster #5

Name: Benjamin Winters



Title: eSports Predictive Analysis - A Study of Hearthstone Tournaments

Abstract:

This poster will analyze and discuss how certain factors influence game outcomes in a tournament setting for the digital collectible card game Hearthstone. The main forms of analysis will be logistic regression and decision trees, used to determine significant factors and to make predictions. The features under consideration will mainly be in-game factors: which player goes first; concepts around mana, the resource with which players interact with the game; and the different ways in which cards can influence the state of play. The outcome of interest is the end result of each game, that is, a win or a loss.


Poster #6

Name: Rebecca Barter



Title: Survival Analysis

Abstract:

For my study, I was interested in biostatistics and, more specifically, survival analysis. My main goal was to learn about the statistical methods that can be applied to survival data. I obtained data on heart failure patients, along with several covariates that affected the length of survival for these patients. I learned about and applied methods such as the Kaplan-Meier estimator and the Cox proportional hazards model.
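
For readers unfamiliar with the Kaplan-Meier estimator mentioned above, the following is a minimal sketch of how it is computed. It is not the presenter's code: the function name `kaplan_meier` and the toy durations are hypothetical, and a real analysis would typically use a dedicated survival analysis library.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject.
    observed:  True if the event occurred, False if the subject was censored.
    """
    deaths = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (not from the study): six subjects, two censored.
durations = [5, 8, 8, 12, 15, 20]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
for time, prob in curve:
    print(f"t={time:>2}  S(t)={prob:.3f}")
```

Censored subjects never lower the curve directly; they only shrink the number at risk for later event times, which is what distinguishes this estimator from a simple proportion of survivors.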

 

Student Seminar

12:00 - 12:50 PM, Wednesday, April 14, via ZOOM

Data Engineering: Extract, Transform, and Load (ETL)

N’Dri Diby

Moving data from one place to another is an important step for a company that relies on its own data for decision making. Data come from different sources, and bringing them into one place helps businesses become more productive. Over the summer I worked for a financial technology company called Spave. As a Data Analytics Engineer intern, I built a data pipeline to move raw data, transform it per business logic, and load it into the target database, which enabled software engineers to display information on the front end of the app. In this presentation, I will go over the different steps I took to build the ETL data pipeline.
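
As a rough illustration of the extract, transform, and load steps described above, here is a minimal sketch using only the Python standard library. It is not the pipeline built at Spave: the CSV contents, the `transactions` table, its columns, and the cleaning rules are all hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a source system export.
RAW_CSV = """user_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,USD
"""


def extract(text):
    """Extract: read raw rows from the CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply (hypothetical) business rules to the raw rows."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # assumed rule: drop rows with a missing amount
        cleaned.append(
            (
                int(row["user_id"]),
                round(float(row["amount"]) * 100),  # store amounts as cents
                row["currency"].upper(),  # normalize currency codes
            )
        )
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions "
        "(user_id INTEGER, amount_cents INTEGER, currency TEXT)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
rows = conn.execute(
    "SELECT user_id, amount_cents, currency FROM transactions ORDER BY user_id"
).fetchall()
print(rows)
```

Keeping the three stages as separate functions makes each one easy to test and to swap out, for example replacing the CSV reader with an API client without touching the transform or load steps.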

Quality Assurance Analyst at Fastenal

Benjamin Garling

As part of the Quality Assurance team at Fastenal, my focus has been working with the Contract Management team, where we test the Contract Management application. The primary focus of my talk will be on what it is like to work for a big corporation, how testing programs makes you write better code yourself, and some of the ways I have applied my classwork on the job. I will also speak briefly about some challenges that come with working for such a big company and ways to get around some of the hurdles.

Student Seminar

12:00 - 12:50 PM, Wednesday, April 7, via ZOOM

Predictive Modeling for COVID-19

Aaron Schram

My capstone featured a dataset from Kaggle that was created to test a Long Short-Term Memory (LSTM) neural network method for building predictive models based on B-cell data. The goal of the B-cell data was to use machine learning to determine a reliable method for predicting the epitope regions of antigens that these B-cells can map onto. For this project, I explored various supervised learning methods to understand the process of creating a predictive model for a binary categorical response.

National Hockey League (NHL) Data Analyses

Rochelle Ziemann

Predicting performance statistics and finding differences among players is becoming more popular in all sports. I chose to investigate hockey because it is my favorite sport to watch, and Minnesota is considered the State of Hockey. This presentation will include analyses of differences between players and teams and of which performance statistics best predict whether a team will make the playoffs.