Success Is No Accident

Kickstarter Campaign Success Prediction with ML Classification Algorithms


The goal of this project is to predict whether a Kickstarter campaign will succeed. The datasets came from the Web Robots website and cover January through April 2021. I compared several classification algorithms: KNN, Logistic Regression, Decision Tree, Random Forest, Naive Bayes, and XGBoost. My final model was XGBoost, which reached an F1 score of 0.80 and an AUC of 0.82. I interpreted the model with SHAP values to understand which features matter most for success. Lastly, I retrained the final model on the entire dataset and built a Flask app around it.

Tools

  • SQL
  • Python (SQLAlchemy, NumPy, pandas)
  • Matplotlib, Seaborn
  • Tableau
  • Scikit-learn, XGBoost, SHAP
  • Flask

Techniques/Algorithms

Classification Algorithms:

  • K-Nearest Neighbors (KNN)
  • Logistic Regression
  • Decision Tree, Random Forest
  • Naive Bayes - Gaussian, Bernoulli
  • XGBoost
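
As a rough illustration of how candidates like these can be compared, here is a minimal sketch using scikit-learn. It runs on synthetic data from make_classification rather than the Kickstarter dataset, and the hyperparameters and dictionary keys are assumptions chosen for the example, not the settings used in this project. XGBoost is left out to keep the sketch dependent on scikit-learn only; its XGBClassifier exposes the same fit/predict_proba interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the campaign features (assumption: not the real data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical settings, picked only to make the example run quickly.
candidates = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gaussian_nb": GaussianNB(),
}

# Fit each model and score it on the held-out split with ROC AUC.
auc_by_model = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    success_proba = model.predict_proba(X_test)[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, success_proba)

# Print the candidates from best to worst AUC.
for name, auc in sorted(auc_by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} AUC = {auc:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair; cross-validation would give a more stable ranking at the cost of extra training time.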

Metrics Selection:

  • ROC curve and AUC - for model comparison
  • F1 score - balances precision and recall
    • Minimize false positives: creators wouldn’t want the model to predict too many successes that turn out to be failures
    • Minimize false negatives: backers would want the model to capture as many successful campaigns as possible
  • Confusion matrix - shows the counts of correct and incorrect predictions for each class
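
To make these metrics concrete, the sketch below computes them by hand in plain Python on a tiny made-up example. The labels, scores, and function names are illustrative assumptions; in practice scikit-learn's confusion_matrix, f1_score, and roc_auc_score do the same job.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means success."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_from_labels(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def auc_from_scores(y_true, scores):
    """Probability that a random success outranks a random failure (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes and predicted success probabilities for eight campaigns.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

print(confusion_counts(y_true, y_pred))          # (3, 2, 1, 2)
print(round(f1_from_labels(y_true, y_pred), 3))  # 0.667
print(auc_from_scores(y_true, scores))           # 0.875
```

Here precision is 3/5 and recall is 3/4, so F1 penalizes both the two false positives and the one false negative, while AUC depends only on how the scores rank successes against failures and not on the threshold.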

Application Usage

Use Case 1

Flask demo from Crystal Huang on Vimeo.

Use Case 2

If the predicted outcome is bleak, the app provides suggestions based on the feature importance derived from SHAP values.
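
One way such suggestions could be produced is sketched below: take the per-feature SHAP values for a single campaign, keep the features that push the prediction toward failure (negative values), and map the strongest ones to advice. The feature names, SHAP values, advice strings, and the suggest_improvements helper are all hypothetical; this is not the app's actual logic.

```python
def suggest_improvements(shap_by_feature, advice, top_n=3):
    """Return advice for the features that most reduce the predicted chance of success."""
    # Negative SHAP values push this campaign's prediction toward failure.
    negatives = [(name, value) for name, value in shap_by_feature.items() if value < 0]
    negatives.sort(key=lambda item: item[1])  # most negative first
    return [
        advice.get(name, f"Review the '{name}' setting.")
        for name, _ in negatives[:top_n]
    ]


# Hypothetical SHAP values for one campaign (not taken from the real model).
campaign_shap = {
    "goal_usd": -0.42,
    "campaign_days": -0.10,
    "blurb_length": 0.05,
    "category_popularity": -0.25,
}

# Hypothetical advice text keyed by feature name.
advice = {
    "goal_usd": "Consider lowering the funding goal.",
    "campaign_days": "Consider a different campaign length.",
    "category_popularity": "Consider whether another category fits the project.",
}

suggestions = suggest_improvements(campaign_shap, advice, top_n=2)
for line in suggestions:
    print(line)
```

Features with positive SHAP values are skipped because they already help the prediction, so the output focuses on what a creator could change.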

To Learn More, Check Out My:

  • Blog
  • Code
  • App (Coming Soon)
  • Presentation (Coming Soon)

Let's Talk!