Data Science

Project 1
Binary Classification Regression Project

The team developed a binary classification project in Python.

Project 2
Easy R-analysis

The project consists of inspecting a small data set and loading the data properly.

The task is to fit several logistic regression models and a k-nearest-neighbours (kNN) model on a small data set. These models should be validated “using 10-fold cross-validation, repeated 5 times, with the area-under-the-ROC-curve metric (the ROC statistic in caret). Where appropriate, select the best settings for tuning parameters using the one-standard error rule. We use the same folds for all models (for example, by resetting the random seed appropriately).”
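The validation scheme quoted above can be illustrated with a short sketch. The project itself uses caret in R; what follows is a minimal Python version (the language of Project 1), standard library only, of two of its ideas: generating identical repeated folds for every model from a fixed seed, and applying the one-standard-error rule. The function names, the seed value, the example scores, and the assumption that candidate settings are listed from simplest to most complex are all illustrative and not taken from the project.

```python
import random
from math import sqrt
from statistics import mean, stdev


def repeated_kfold(n_samples, k=10, repeats=5, seed=42):
    """Return (train_idx, test_idx) pairs for k-fold CV repeated `repeats` times.

    Seeding the generator means every call with the same arguments yields
    the same folds, so all models can be scored on identical resamples.
    """
    rng = random.Random(seed)  # hypothetical seed value
    splits = []
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
        for i in range(k):
            test = sorted(folds[i])
            train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
            splits.append((train, test))
    return splits


def one_se_choice(candidates):
    """Pick the simplest setting within one standard error of the best.

    `candidates` is a list of (setting, scores) pairs, ordered from the
    simplest model to the most complex (an assumption of this sketch);
    `scores` holds one AUC value per resample.
    """
    summary = [(s, mean(v), stdev(v) / sqrt(len(v))) for s, v in candidates]
    best = max(summary, key=lambda t: t[1])
    threshold = best[1] - best[2]  # best mean minus its standard error
    for setting, avg, _ in summary:
        if avg >= threshold:
            return setting


# Hypothetical AUC scores for three kNN settings (k = 25, 15, 5).
example = [
    (25, [0.78, 0.80, 0.79, 0.77]),
    (15, [0.81, 0.83, 0.82, 0.80]),
    (5, [0.80, 0.86, 0.84, 0.80]),
]
print(len(repeated_kfold(50)), one_se_choice(example))
```

For kNN, a larger k gives a smoother and in that sense simpler model, which is why the hypothetical candidates are listed from the largest k to the smallest.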

After validation, the best-performing model should be chosen.

The whole project is delivered as three analyses:

1. A summary from a business perspective, intended to be shared with “the client”;

2. A more technical summary of the work done until now, for your data scientist colleagues; and

3. A list of priorities to investigate in subsequent iterations of the project, also for your colleagues.

Everything is provided in an R notebook and also as HTML output!