Machine LeaRning

Time Series forecasting using Prophet in R

prophet
time series

I'm so excited to write this post. Time series forecasting was something that I thought was missing from the classes I took at SMU. Prof Roh pointed me in the direction of learning Prophet from Meta.

Kaggle Competition: Natural Language Processing with Disaster Tweets using quanteda

quanteda
classification
SVM
LASSO

Trying my hand at my second Kaggle competition using quanteda. I wonder how I will perform. Wish me luck!

text analysis in R: tidytext, stm, quanteda

LDA
stm
quanteda
tidytext

I completed Module 5 of my course, which was about Text Classification and Topic Modeling. I wanted to get some practice with what I learnt in class, as well as explore more about the quanteda package.

Kaggle competition: Multi-Class Prediction of Obesity Risk

PCA
tidymodels
classification

My first Kaggle competition. Wish me luck.

Exploring Principal Component Analysis with tidymodels and prcomp

PCA
tidymodels
prcomp

We ran out of time during class due to the immense amount of material that needed to be covered. Hence, these assessment questions became homework assignments. Within, you will find explorations of PCA the tidymodels way, as well as with prcomp from base R.

Using ML to predict customer attrition

classification
logistic regression
random forest
lightgbm
knn
xgboost

This short exercise was assigned as homework. Time to practice my ML skills by trying to predict customer attrition from this dataset.

Pharmaceutical machine learning with tidymodels

classification
random forest
logistic regression
tune_grid
tune_bayes
pca

Use Machine Learning to develop a model to determine if a proposed drug could be a mutagen.

using machine learning to predict risk of type 2 diabetes

classification
logistic regression
LASSO regression
random forest
decision tree
naive bayes
knn
xgboost

Type 2 diabetes is one of the most prevalent chronic diseases in the United States, affecting the health of millions of people, and putting an enormous financial burden on the US economy.

Hotel reservation cancellations

classification
logistic regression
LASSO regression
random forest
decision tree
naive bayes
knn
xgboost

In this next exercise, we try and predict the probability that a hotel's reservations will eventually be cancelled, given information we have on hand, such as ADR, customer segment, and deposit type...

HDB Rental Prices

regression
linear regression
random forest
knn
xgboost

Let's try to predict the median rental price per sqft of HDB flats, given information about their location, age, distance to MRT station...


Machine LeaRning

ML


I started Module 1 of SMU Academy’s Predictive Analytics and Machine Learning class in January 2024. This blog will serve as a repository for various Machine Learning projects I embark on.