James Aaron

Unsupervised ML Algorithms

The Trade & Ahead project uses cluster analysis with K-Means, PCA and t-SNE to group publicly traded companies based on financial metrics and fundamentals. The analysis reliably separated the companies into 5 distinct groups with very different business dynamics and likelihoods of providing a return on investment.

Predictive Maintenance

The ReneWind project uses ML to predict wind turbine failures from a variety of onboard sensor data. Decision Tree, Bagging, Random Forest, Gradient Boosting, AdaBoost and XGBoost models were evaluated, and a Pipeline was built for the Gradient Boosting model because it had the best performance.

Supervised ML Algorithms with Hyperparameter Tuning

The EasyVisa model predicts an applicant's chances of being approved for a work visa based on demographic, economic and career-related data. Bagging, Boosting, Random Forest, AdaBoost, Gradient Boosting, XGBoost and Stacking models were evaluated with hyperparameter tuning to maximize performance.

Logistic Regression and Decision Tree

The INN Hotels project predicts the probability of customer cancellation for a Portugal-based hotel chain from demographic, customer history and behavioral data. It uses Logistic Regression with ROC-AUC analysis to develop explainable coefficients and compares the results to a Decision Tree.
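As a rough illustration of the ROC-AUC metric mentioned here, the sketch below computes AUC as the share of positive/negative pairs in which the positive case receives the higher predicted probability. The function name `roc_auc` and the labels and scores are hypothetical and are not taken from the project; this is a minimal pure-Python sketch, not the project's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = booking cancelled, 0 = booking kept.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cancellations from kept bookings.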

Linear Regression, Data Cleaning & EDA

The ReCell model uses OLS to predict the resale price of used cell phones based on screen size, 5G compatibility, camera resolution, RAM, weight, age and original sale price. The regression equation shows the influence of each variable and allows us to gain an edge by focusing on high-margin items.
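To show how an OLS coefficient can be read as the influence of a variable, here is a minimal sketch with a single predictor, using the closed-form least-squares solution. The project itself uses several predictors; the function name `fit_simple_ols` and the RAM and price figures below are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, not project results.
ram_gb = [2, 4, 6, 8]
resale_price = [70, 110, 150, 190]
intercept, slope = fit_simple_ols(ram_gb, resale_price)
print(f"price = {intercept:.1f} + {slope:.1f} * RAM")
```

In this toy data the slope says each additional GB of RAM is associated with a 20-unit higher resale price, which is the kind of reading a regression equation supports.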

Hypothesis Test & Statistical Analysis

The E-News Express project performs an A/B test of a new website against the old website using the Two-Sample Independent t-test, the Proportions z-test, the Chi-Square test of independence and ANOVA. The sites were compared based on the time users spent on the page, the users' preferred language and whether they clicked one of the links.
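For readers unfamiliar with the proportions z-test, the sketch below compares click rates between two page versions using the pooled two-proportion z statistic and a two-sided p-value. The function name `two_proportion_z_test` and the counts are hypothetical, not results from the project, and the sketch uses only the Python standard library rather than whatever statistics package the project used.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: clicks out of visitors on the new and old pages.
z, p = two_proportion_z_test(successes_a=33, n_a=50, successes_b=21, n_b=50)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value would suggest the difference in click rates is unlikely under the assumption that both pages perform equally.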
