Customer churn is the number of customers who stop using a company's service. It is critical for a company to track because it directly affects business growth. It is often said that winning back a churned customer is harder than acquiring a new one, so it is better to understand the causes of churn in advance. Customers typically churn for one of two reasons: 1) they are unhappy with the company's services (and switch to another company), or 2) their own circumstances change (e.g. relocation, death, etc.).
In this project, a telecommunications dataset is used to investigate the reasons for customer churn in the company. The work is organized as follows.
1. Data
1.1 Data Overview
1.2 Data Manipulation
2. Exploratory Data Analysis
2.1 Visualize Class distribution
2.2 Variables distribution
2.3 Observation
3. Data Preprocessing
3.1 Change category values into dummy variables
3.2 Standardizing features
3.3 Train and Test split
3.4 Handle Imbalanced Data
4. Model
4.1 Logistic regression
4.2 Decision Tree Classifier
4.3 Random Forest
4.4 Gradient Boosting
4.5 XGBoost Classifier
4.6 LightGBM Classifier
# import necessary libraries
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import graphviz
import pydotplus
import plotly.figure_factory as ff #for table visualization
from plotly.offline import init_notebook_mode, iplot
from sklearn import tree
from scipy.stats import zscore
from imblearn.over_sampling import SMOTE # for handling imbalanced data
import joblib # sklearn.externals.joblib was removed from recent scikit-learn releases
from yellowbrick.classifier import ConfusionMatrix
from sklearn.tree import DecisionTreeClassifier
from xgboost.sklearn import XGBClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedShuffleSplit
from sklearn.metrics import accuracy_score,confusion_matrix, classification_report, roc_auc_score, roc_curve, recall_score, precision_score, f1_score
'''=== function to load dataset ===='''
def load_dataset(dataset):
    df = pd.read_csv(dataset)
    return df
# call load_dataset()
df = load_dataset('Telco-Customer-Churn.csv')
df.head()
# drop the customerID column and show the last 5 rows
df_copy = df.copy() # copy the dataframe
data = df_copy.drop('customerID', axis=1)
data.tail()
print ("Rows : " ,data.shape[0])
print ("Columns : " ,data.shape[1])
print ("\nFeatures : \n" ,data.columns.tolist())
This dataset has 7043 rows (observations) and 19 feature columns, plus a final column (Churn) that is the target variable.
'''=== Show descriptive statistics for numerical and categorical === '''
#numerical only
print('Descriptive statistics for numerical\n')
print(data.describe())
#categorical only
data.describe(include='O')
# check the datatype of each column
data.info()
The output above shows that most variables have a categorical datatype (object). These categorical variables must be converted to numerical ones before being fed to machine learning algorithms.
The column 'TotalCharges' also appears as a non-float data type although it should be float, so let's convert all of its values to float. Non-numeric values become NaN after the following command.
# convert column 'TotalCharges' to float data type
data['TotalCharges'] = pd.to_numeric(data['TotalCharges'], errors='coerce')
data.info()
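To see what errors='coerce' does in isolation, here is a minimal sketch on a toy column. The values are invented for illustration and are not taken from the dataset; the blank string stands in for whatever non-numeric entries 'TotalCharges' may contain.

```python
import pandas as pd

# Toy column: numbers stored as strings, plus one blank entry
# (values are invented for illustration only)
charges = pd.Series(["29.85", "1889.5", " ", "108.15"])

# errors='coerce' turns every value that cannot be parsed into NaN
converted = pd.to_numeric(charges, errors="coerce")

print(converted.dtype)         # float64
print(converted.isna().sum())  # 1
```

The single NaN produced here is the kind of value that the NaN check in the next step is meant to find.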
'TotalCharges' is now a float column, but it has only 7032 non-null values out of 7043 observations, so it contains NaN values. Let's check them and remove them.
'''== check Null values in all columns ===='''
# count the rows that contain NaN values
print('No. of rows with NaN: {}'.format(data[data.isnull().any(axis=1)].shape[0]))
#remove all Null values
data = data.dropna()
data.shape
# drop duplicates, keeping the first occurrence
#data[data.duplicated()]
data = data.drop_duplicates( keep='first')
data.shape
# check the category values in each categorical column
category_column = data.select_dtypes(include='object')
for x in category_column.columns:
    print(x, ':', data[x].unique())
In the output above, some columns use two different values with the same meaning, for example 'No phone service' and 'No', or 'No internet service' and 'No'. Let's map them to a single value.
'''change into the same word - No'''
binary_for = {'No phone service': 'No', 'No internet service': 'No'}
# function to merge the equivalent category values
def convert_value_binary(col_names: list):
    for col in col_names:
        data[col] = data[col].replace(binary_for)
# columns whose category values are changed
columns = ['MultipleLines', 'OnlineSecurity',
           'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies']
# call function
convert_value_binary(columns)
data.head(5)
In this section, each variable is visualized and analysed with respect to the target class (churn or no churn).
'''==== check class/label distribution in dataset ===='''
class_count = data['Churn'].value_counts()
print('Class count: \n', class_count) # number of samples in each class
y_pos = [class_count['No'], class_count['Yes']] # for y-axis (label-based lookup)
x_pos = ['No', 'Yes'] # for x-axis
# creates bar chart of class labels
plt.figure(figsize=(7,5))
plt.bar(x_pos, y_pos, width=0.50, color='g')
plt.xlabel('Class')
plt.ylabel('No. of sample in class')
plt.title('Class distribution')
plt.show()
#''' === another way to plot using frequency distribution ==='''
# creates histogram figure
#class_dist1 =list(np.array(df['Churn'])) # takes class column
#plt.figure()
#plt.hist(class_dist1)
#plt.xlabel('Class')
#plt.ylabel('No. of sample in class')
#plt.title('Class distribution')
#plt.show()
The bar chart above clearly shows that the class labels are imbalanced: class Yes makes up 27% and class No 73% of the dataset. We therefore need to balance them, either by up-sampling class Yes or by under-sampling class No, so that the model predicts unseen data correctly for both classes. Otherwise the model will be biased towards the majority class. This is handled in the imbalanced-data section.
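To make the risk of a majority-class bias concrete, the sketch below uses invented labels with the same 73/27 split and a hypothetical "model" that always predicts No. It only illustrates why accuracy alone can mislead on imbalanced data; it is not part of the original analysis.

```python
# Invented labels with the same 73% / 27% split as the Churn column
labels = ['No'] * 73 + ['Yes'] * 27

# A hypothetical "model" that always predicts the majority class
predictions = ['No'] * len(labels)

accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)

# Recall for the churn class: share of true 'Yes' rows predicted as 'Yes'
preds_for_true_yes = [p for p, t in zip(predictions, labels) if t == 'Yes']
recall_yes = sum(p == 'Yes' for p in preds_for_true_yes) / len(preds_for_true_yes)

print(accuracy)    # 0.73
print(recall_yes)  # 0.0
```

An accuracy of 0.73 looks reasonable, yet not a single churner is identified, which is why recall and precision for the Yes class are worth tracking as well.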
variable_not = ['MonthlyCharges', 'TotalCharges']
# show a bar chart for every column except the continuous charge columns
for x in data.columns[:-1]:
    if x not in variable_not:
        print('Show ' + x + ' in Class Distribution ')
        #a =data.groupby([x, 'Churn']).size().unstack(fill_value=0)
        # tenure has many distinct values, so it gets a wider figure
        figsize = (15, 7) if x == 'tenure' else (7, 5)
        pd.crosstab(data[x], data['Churn']).plot.bar(figsize=figsize)
        plt.show()
        print('\n\n')
"""=== Let's analyze the services based on the contract types ==="""
lists_ofservice = ['PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity',
                   'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV',
                   'StreamingMovies']
for x in lists_ofservice:
    print('Show ' + x + ' with contract type ')
    # count plot of Churn, one panel per (contract type, service value) pair
    g = sns.catplot(x='Churn', col=x, row='Contract', data=data, kind='count', aspect=0.8)
    plt.show()
    print('\n\n')
The plots above show that a large portion of churned customers have a short tenure, phone service (Yes), multiple lines (No), online security (No), online backup (No), device protection (No), tech support (No), fiber optic internet service, a month-to-month contract, and electronic check as the payment method.
In particular, customers on the shortest contract type, month-to-month, are the most likely to churn.
The reasons might be: 1) the company doesn't take the necessary steps before a contract ends, 2) customers might get a better offer from another company, and 3) since the majority of leaving customers use 'phone service' and 'internet service (fiber optic)', the services in these categories should be reviewed to make sure they are really better than the competitors'.
# change the category values into dummy variables and drop the first level of each (to avoid the dummy variable trap)
df_dummy= pd.get_dummies(data.iloc[:,:-1], drop_first=True)
df_dummy = pd.concat([df_dummy, data['Churn']], axis=1)
print('Dimension of dataset after dummy variable applied:\n',df_dummy.shape)
df_dummy.head()
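As a small illustration of what drop_first=True does, the sketch below encodes a toy 'Contract' column. The four rows are invented; only the column name and its three levels mirror the dataset.

```python
import pandas as pd

# Toy frame with a single three-level categorical column (invented rows)
toy = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year", "One year"]})

# drop_first=True drops the first level (in sorted order) of each column,
# so three levels become two indicator columns
dummies = pd.get_dummies(toy, drop_first=True)

print(dummies.columns.tolist())
# ['Contract_One year', 'Contract_Two year']
```

A row whose indicators are all zero (or False) therefore stands for the dropped level, here 'Month-to-month'.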
# apply z-score standardization to the continuous columns
columns = ['tenure', 'MonthlyCharges','TotalCharges']
new_data = df_dummy.copy()
new_data[columns] = new_data[columns].apply(zscore)
new_data.tail()
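The z-score used above subtracts the column mean and divides by the column standard deviation. A minimal NumPy sketch of the same formula, on invented tenure values, looks like this:

```python
import numpy as np

# Invented tenure values, for illustration only
tenure = np.array([1.0, 12.0, 24.0, 72.0])

# z = (x - mean) / std; np.std uses ddof=0 by default,
# which matches the default of scipy.stats.zscore
z = (tenure - tenure.mean()) / tenure.std()

print(z.mean())  # approximately 0
print(z.std())   # approximately 1
```

Note that the notebook standardizes the full dataset before splitting it. A stricter setup would probably compute the mean and standard deviation on the training split only and reuse them for the test split.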
# split data into train (70%) and test (30%)
def data_split(data, test_size):
    '''
    data = data to split
    test_size = size of test-data
    '''
    train, test = train_test_split(data, stratify=data['Churn'], test_size=test_size, random_state=3)
    X_train = train.drop(['Churn'], axis=1) # drop 'Churn' column
    y_train = train['Churn'] # Churn column for train
    X_test = test.drop(['Churn'], axis=1) # drop 'Churn' column
    y_test = test['Churn'] # Churn column for test
    return X_train, y_train, X_test, y_test
#call data split function
X_train, y_train, X_test, y_test = data_split(new_data, 0.3)
# check the class ratio in the test set
class_ratio = y_test.value_counts(normalize=True)
print(class_ratio['No']) # class 'No churn'
print(class_ratio['Yes']) # class 'Churn'
To handle this imbalanced data, an over-sampling technique (SMOTE) is used so that both classes have an equal number of samples. Up-sampling is applied only to the training dataset; the test dataset is left as it is. The resampled training data is then used in the model-building section below.
'''=== up-sampling is done to the minority class so that it becomes
balanced. This is done only to train data but not to test data ==='''
# over-sampling the data
def up_sampling(X_train, y_train):
    '''
    X_train = Input features of training dataset
    y_train = Output/class of training dataset
    '''
    # SMOTE is applied for up-sampling; recent imbalanced-learn releases use
    # sampling_strategy (formerly ratio) and fit_resample (formerly fit_sample)
    sm = SMOTE(random_state=42, sampling_strategy=1.0)
    X_train_new, y_train_new = sm.fit_resample(X_train, y_train) # up-sampling only the train data
    return X_train_new, y_train_new
# call up_sampling function
X_train_new, y_train_new = up_sampling(X_train, y_train)
print('Before up-sampling:\n', y_train.value_counts())
print()
unique_elements, counts_elements = np.unique(y_train_new, return_counts=True)
#print('\n After up-sampling:', np.bincount(X_train_new)) for int values types
print('\n After up-sampling:', counts_elements)
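For intuition, SMOTE creates new minority samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified, NumPy-only illustration of that single step. The function name smote_like_sample and the toy points are hypothetical, and this is not the imbalanced-learn implementation.

```python
import numpy as np

def smote_like_sample(minority, k=2, seed=0):
    """Create one synthetic point between a random minority sample and
    one of its k nearest minority neighbours (simplified SMOTE step)."""
    rng = np.random.default_rng(seed)
    i = rng.integers(len(minority))
    # Euclidean distances from sample i to every minority sample
    dists = np.linalg.norm(minority - minority[i], axis=1)
    # indices of the k nearest neighbours, skipping the sample itself
    neighbours = np.argsort(dists)[1:k + 1]
    j = rng.choice(neighbours)
    lam = rng.random()  # interpolation factor in [0, 1)
    return minority[i] + lam * (minority[j] - minority[i])

# Invented minority-class points in a 2-D feature space
minority = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]])
new_point = smote_like_sample(minority)
print(new_point)
```

Because the new point is a convex combination of two real samples, it always lies on the line segment between them.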
GridSearchCV with 10-fold cross-validation (CV) is used to tune the parameters (find the optimal parameters) of each algorithm. The selected optimal parameters are then used to build the model, which is evaluated on the test data. The confusion matrix of each model is displayed, and the AUC score and classification accuracy are used to measure model performance.
"""== return best parameter =="""
# parameter tuning with grid search
def find_optimal_para(algorithm, parameters, cv, X_train, y_train):
    """
    algorithm = instance of the algorithm
    parameters = dict of parameters to search
    cv = no. of cross-validation folds
    X_train = training dataset (features)
    y_train = training dataset (labels)
    """
    best_clf = GridSearchCV(algorithm, parameters, cv=cv)
    best_clf.fit(X_train, y_train)
    return best_clf.best_params_
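The grid search step can be tried in isolation on synthetic data. The sketch below is only an illustration: the dataset comes from make_classification, and the small grid, cv=5 and max_iter=1000 are arbitrary choices rather than the settings used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (not the churn dataset)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Every value of C is scored with 5-fold cross-validation
grid = {"C": [0.01, 1, 100]}
search = GridSearchCV(LogisticRegression(max_iter=1000), grid, cv=5)
search.fit(X, y)

print(search.best_params_)                 # the best-scoring combination
print(len(search.cv_results_["params"]))   # 3 candidate settings were tried
```

With the default refit=True, GridSearchCV also refits the best setting on all the data passed to fit and exposes it as best_estimator_, which could be used instead of refitting by hand.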
"""== return clasification accuracy, confusion matrix and classification report =="""
# model evaluation
def model_evaluaton(model, X_test, y_test):
    """
    model = final model (parameters tuned)
    X_test = testing dataset (features)
    y_test = testing dataset (labels)
    """
    predict = model.predict(X_test) # prediction on test data
    # scikit-learn metrics expect (y_true, y_pred) in this order
    accu_score = accuracy_score(y_test, predict) # classification accuracy
    con_matrix = confusion_matrix(y_test, predict) # confusion matrix (rows = true class)
    cla_report = classification_report(y_test, predict) # classification report
    return accu_score, con_matrix, cla_report
# creates confusion matrix (from yellowbrick)
def confusion_Matrix(model, X_test, y_test, label):
    """
    model = final model (parameters tuned)
    X_test = testing dataset (features)
    y_test = testing dataset (labels)
    label = labels/outputs in a list, e.g. ['Yes', 'No']
    """
    # ConfusionMatrix visualizer
    cm = ConfusionMatrix(model, classes=label)
    # fit() would fit the passed model; it is skipped because the model is already fitted
    #cm.fit(X_train, y_train)
    cm.score(X_test, y_test)
    cm.finalize()
'''=== ROC curve plot ===='''
def creat_roc_curve(model, X_test, y_test, pos_label):
    """
    model = final model (parameters tuned)
    X_test = testing dataset (features)
    y_test = testing dataset (labels)
    pos_label = positive class (either integer or string), e.g. 'Yes'
    """
    y_test_prob_tune = model.predict_proba(X_test) # class probabilities for the AUC score
    prob_pstive = y_test_prob_tune[:, 1] # probability of the positive class (second column)
    auc_score = roc_auc_score(y_test, prob_pstive)
    #print('Auc_score:', auc_score)
    # create figure: area under curve
    fpr, tpr, thresholds = roc_curve(y_test, prob_pstive, pos_label=pos_label) # false positive rate, true positive rate and thresholds
    plt.figure(figsize=(7,5))
    plt.plot(fpr, tpr, marker='o', label="auc_score = " + str(auc_score))
    plt.plot([0, 1], [0, 1], 'r--')
    plt.xlim([0, 1])
    plt.ylim([0, 1])
    plt.title('ROC Curve')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.legend(loc=4)
    plt.show()
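One way to read the AUC score is as the probability that a randomly chosen churner receives a higher predicted probability than a randomly chosen non-churner. The pure-Python sketch below computes it that way on four invented predictions. The helper auc_by_ranking is hypothetical and is meant as an illustration, not a replacement for roc_auc_score.

```python
def auc_by_ranking(y_true, scores, pos_label="Yes"):
    """AUC as the share of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == pos_label]
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented labels and predicted churn probabilities
y_true = ["No", "No", "Yes", "Yes"]
scores = [0.1, 0.4, 0.35, 0.8]

print(auc_by_ranking(y_true, scores))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly, which gives 0.75.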
"""=== train model with best paramter and evalute perfomance ==="""
# train model
def train_evaluate_model(algorithm, parameters, cv, X_train, y_train, X_test, y_test, label, pos_label):
    """
    algorithm = instance of the algorithm,
    parameters = dict of parameters to search,
    cv = no. of cross-validation folds,
    X_train = train dataset (features),
    y_train = train dataset (labels),
    X_test = test dataset (features),
    y_test = test dataset (labels),
    label = labels/outputs in a list, e.g. ['Yes', 'No'],
    pos_label = positive class (either integer or string), e.g. 'Yes'
    """
    best_parmeter = find_optimal_para(algorithm, parameters, cv, X_train, y_train) # return best parameters
    print('best parameter: ', best_parmeter)
    print()
    alg_with_best_par = algorithm.set_params(**best_parmeter) # set the best parameters on the algorithm instance
    alg_with_best_par.fit(X_train, y_train) # fit on the training data passed in, not on global variables
    accu_score, con_matrix, cla_report = model_evaluaton(alg_with_best_par, X_test, y_test) # call model_evaluaton
    print('Classification accuracy:', accu_score)
    print('\nConfusion Matrix:\n', con_matrix)
    print('\nclassification_report:\n', cla_report)
    print()
    print('Confusion Matrix from yellowbrick:')
    confusion_Matrix(alg_with_best_par, X_test, y_test, label)
    print()
    creat_roc_curve(alg_with_best_par, X_test, y_test, pos_label)
    # return model
    return alg_with_best_par
"""== function to save model =="""
# save the model to disk
def save_model(model, file_name):
    joblib.dump(model, 'Models/' + file_name) # save in the Models folder (the folder must already exist)
    return ('Model saved')
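A saved model can be loaded back with joblib.load. The sketch below shows the round trip with a plain dictionary standing in for a fitted model and a temporary directory standing in for the Models folder. Both are placeholders chosen for illustration.

```python
import os
import tempfile

import joblib

# Stand-in for a fitted model (any picklable Python object works)
model_like = {"C": 0.1, "solver": "lbfgs"}

with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "Models")
    # joblib.dump does not create missing folders, so create it first
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "demo_model")
    joblib.dump(model_like, path)
    restored = joblib.load(path)

print(restored == model_like)  # True
```

The same pattern applies to the estimators trained below, as long as the library versions used for saving and loading are compatible.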
# parameters
parameter_lr = {
    #'penalty' : ['l1', 'l2'],
    'C' : [0.001, 0.01, 0.1, 1, 10, 100], # inverse of regularization strength, C = 1/λ
    'solver' : ['newton-cg', 'lbfgs', 'liblinear'] # optimizer
}
# no. of folds for cross-validation
cv = 10
# train dataset
train_features = X_train_new
train_labels = y_train_new
# test dataset
test_features = X_test.to_numpy()
test_labels = y_test.to_numpy()
# for confusion matrix (yellowbrick)
label = ['Yes', 'No']
# positive class for roc_curve
pos_label = 'Yes'
%%time
# call train_evaluate_model()
algorithm_lr = LogisticRegression()
model_lr = train_evaluate_model(algorithm_lr, parameter_lr, cv, train_features, train_labels, test_features, test_labels, label, pos_label)
# save model
save_model(model_lr, 'final_model_lr')
# show the model coefficients (feature importance) in a diagram
feat_import = pd.Series(model_lr.coef_.flatten(), index=X_train.columns)
feat_import.nlargest(23).plot(kind='barh', figsize=(15,13))
plt.show()
# parameters
parameter_dt = {'criterion': ['entropy', 'gini'],
                'max_depth': range(1, 50),
                'min_samples_leaf': range(20, 100, 10)
                #'max_features': [ 5, 10, 12],
                #'splitter': ['best','random']
                }
%%time
# train with best parameters and evaluate
algorithm_dt = DecisionTreeClassifier(random_state=42)
model_dt = train_evaluate_model(algorithm_dt, parameter_dt, cv, train_features, train_labels, test_features, test_labels, label, pos_label)
# save the model
save_model(model_dt, 'final_model_dt')
# show in tree-structure
churnTree = tree.export_graphviz(model_dt, out_file=None,
                                 feature_names=list(X_train.columns.values),
                                 class_names=['No Churn', 'Churn'],
                                 filled=True,
                                 rounded=True,
                                 special_characters=True)
graph = graphviz.Source(churnTree)
#graph.render('decision_tree.gv', view=True)
pydot_graph = pydotplus.graph_from_dot_data(churnTree)
pydot_graph.write_png('original_tree.png') # save figure
#pydot_graph.set_size('"5,5!"')
#pydot_graph.write_png('resized_tree.png')
graph
'''==== plot train and test accuracy for max_depth from 1 to 29 ===='''
acc_train = []
acc_test = []
for n in range(1, 30):
    # decision tree instance
    deci_with_tune = DecisionTreeClassifier(criterion='gini', max_depth=n, min_samples_leaf=20)
    deci_with_tune.fit(train_features, train_labels)
    # prediction on test data
    predict_tuned = deci_with_tune.predict(test_features)
    accu_score = accuracy_score(test_labels, predict_tuned) # classification accuracy
    acc_test.append(accu_score)
    # prediction on train data
    train_predict_tuned = deci_with_tune.predict(train_features)
    train_accu_score = accuracy_score(train_labels, train_predict_tuned) # classification accuracy
    acc_train.append(train_accu_score)
# create the figure
plt.figure(figsize=(10,7))
x = np.arange(1, 30) # for x-axis
plt.plot(x, acc_train, label='training accuracy')
plt.plot(x, acc_test, label='testing accuracy')
plt.xlabel('max_depth')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
At first, with max_depth equal to 1, the model accuracy is low. As the depth increases the model gets better, but beyond a max_depth of about 9 the model fits the training data more and more closely while failing to generalize to new data. This shows that the model overfits if we keep increasing max_depth.
'''==== Show Top 12 features contribution to target labels ==='''
coeffs = model_dt.feature_importances_
list_of_coeffs = list(sorted(zip(coeffs, X_train.columns), reverse=True))
x_val = [x[0] for x in list_of_coeffs[:12]]
y_val = [x[1] for x in list_of_coeffs[:12]]
#create figure
plt.figure(figsize=(10,7))
plt.barh( y_val, x_val, align='center', color='blue')
plt.title('Top 12 Important Features')
plt.gca().invert_yaxis() # show the highest contribution first
plt.show()
"""==Another way=="""
#feat_importances = pd.Series(coeffs, index=X_train.columns)
#feat_importances.nlargest(10).plot(kind='barh')
'''=== Another way to draw the ROC curve, handy and efficient === '''
import scikitplot as skplt
y_test_prob_tune = model_dt.predict_proba(X_test) # class probabilities for the AUC score
y_true = y_test # ground truth labels
skplt.metrics.plot_roc(y_true, y_test_prob_tune) # takes true labels and predicted probabilities
plt.show()
# parameters
parameter_ran = {
    'n_estimators': [int(x) for x in np.linspace(start=50, stop=1000, num=10)], # number of trees in the forest
    #'max_features':['auto', 'sqrt'],
    'max_depth': [int(x) for x in np.linspace(5, 50, num=5)],
    'min_samples_leaf': [int(x) for x in np.linspace(5, 105, num=5)]
}
%%time
#random forest model
algorithm_rf = RandomForestClassifier(random_state=1)
model_rf =train_evaluate_model(algorithm_rf, parameter_ran, cv, train_features, train_labels,test_features, test_labels, label, pos_label )
# save the model
save_model(model_rf, 'final_model_rf')
# show feature importance in a diagram
feat_import_rn = pd.Series(model_rf.feature_importances_, index=X_train.columns)
feat_import_rn.nlargest(23).plot(kind='barh', figsize=(15,13)) # plot the random forest importances, not the logistic regression coefficients
plt.show()
# show the first tree of the forest in tree-structure
churnTree = tree.export_graphviz(model_rf.estimators_[0], out_file=None,
                                 feature_names=list(X_train.columns.values),
                                 class_names=['No Churn', 'Churn'],
                                 filled=True,
                                 rounded=True,
                                 special_characters=True)
graph = graphviz.Source(churnTree)
#graph.render('decision_tree.gv', view=True)
pydot_graph = pydotplus.graph_from_dot_data(churnTree)
pydot_graph.write_png('random_tree.png') # save figure
graph
# parameters
parameter_grd = {
    'max_depth': [int(x) for x in np.linspace(5, 50, num=5)],
    'min_samples_leaf': [int(x) for x in np.linspace(5, 105, num=5)],
    'max_features': ['sqrt'], # 'auto' was removed in recent scikit-learn releases; for this classifier it meant sqrt
    'learning_rate': [0.001, 0.01, 0.1, 1, 10, 100] # boosting learning rate
}
%%time
algorithm_grad_boosting = GradientBoostingClassifier(random_state=1)
model_gbm_boosting =train_evaluate_model(algorithm_grad_boosting, parameter_grd, cv, train_features, train_labels,test_features, test_labels, label, pos_label )
# save the model
save_model(model_gbm_boosting, 'final_gbm_boosting')
#show features importance in diagram
feat_import_gbm = pd.Series(model_gbm_boosting.feature_importances_, index=X_train.columns)
feat_import_gbm.nlargest(15).plot(kind='barh',figsize=(10,9)).invert_yaxis()# show most import first
plt.show()
# parameters
parameter_xgboost = {
    'max_depth': [int(x) for x in np.linspace(5, 50, num=5)],
    # note: 'num_leaves' is a LightGBM parameter name; XGBoost may ignore it
    # (its closest counterpart appears to be 'max_leaves')
    'num_leaves': [int(x) for x in np.linspace(5, 105, num=5)],
    #'max_features':['auto', 'sqrt'],
    #'booster': ['gbtree'],
    'learning_rate': [0.001, 0.01, 0.1, 1, 10],
    'reg_lambda': [0, 0.1, 1, 5, 10, 20, 50] # L2 regularization term on weights
}
#tune parameters and show performance on test set
algorithm_xgboost = XGBClassifier(random_state=1)
model_xgboost =train_evaluate_model(algorithm_xgboost, parameter_xgboost, cv, train_features, train_labels,test_features, test_labels, label, pos_label )
#show features importance in diagram
feat_import_gbm = pd.Series(model_xgboost.feature_importances_, index=X_train.columns)
feat_import_gbm.nlargest(15).plot(kind='barh',figsize=(10,9)).invert_yaxis()# show most import first
plt.show()
# save the model
save_model(model_xgboost, 'final_model_xgboost')
parameter_lightgbm = {
    'n_estimators': [int(x) for x in np.linspace(start=50, stop=1000, num=5)], # number of boosted trees to fit
    'max_depth': [int(x) for x in np.linspace(5, 50, num=5)],
    'num_leaves': [int(x) for x in np.linspace(5, 105, num=5)],
    #'max_features':['auto', 'sqrt'],
    'learning_rate': [0.001, 0.01, 0.1, 1, 10],
    #'booster': ['gbtree'],
    'reg_lambda': [0, 0.1, 1, 5, 10, 20, 50] # L2 regularization term on weights
}
%%time
algorithm_lightgbm = LGBMClassifier(random_state=1)
model_lgbm =train_evaluate_model(algorithm_lightgbm, parameter_lightgbm,cv, train_features, train_labels,test_features, test_labels, label, pos_label )
#save model
save_model(model_lgbm, 'final_model_lgbm')
#show features importance in diagram
feat_import_gbm = pd.Series(model_lgbm.feature_importances_, index=X_train.columns)
feat_import_gbm.nlargest(15).plot(kind='barh',figsize=(10,9)).invert_yaxis()# show most import first
plt.show()
Compare model performance
# dict to store all metrics
model_metric = {"Model": [],
                "Accuracy_score": [],
                "Auc_score": [],
                "Recall_score": [],
                "Precision": [],
                "f1_score": []
                }
# compare the performance of the models (with best parameters)
def compare_alg(model_name, model, test_feat, test_label):
    """
    model_name = algorithm name,
    model = model (with best parameters),
    test_feat = test dataset (features),
    test_label = test labels
    """
    # predict with test data
    predict_val = model.predict(test_feat)
    accuracy = accuracy_score(test_label, predict_val) # accuracy score
    # AUC score from the hard predictions (takes only numbers); uses the test_label argument, not a global
    roc_auc = roc_auc_score(np.where(test_label == 'Yes', 1, 0), np.where(predict_val == 'Yes', 1, 0))
    rec_score = recall_score(test_label, predict_val, pos_label='Yes') # recall score
    precio_score = precision_score(test_label, predict_val, pos_label='Yes') # precision score
    f1score = f1_score(test_label, predict_val, pos_label='Yes') # f1 score
    # append values to dict
    model_metric["Model"].append(model_name)
    model_metric["Accuracy_score"].append(accuracy)
    model_metric["Auc_score"].append(roc_auc)
    model_metric["Recall_score"].append(rec_score)
    model_metric["Precision"].append(precio_score)
    model_metric["f1_score"].append(f1score)
    print('metric appended')
# call compare_alg()
model_list = [model_lr, model_dt, model_rf, model_gbm_boosting, model_xgboost, model_lgbm]
model_name = ['Logistic Regression', 'Decision Tree', 'Random Forest', 'GradientBoosting', 'XGBoost Classifier', 'LightGBM classifier']
for model, name in zip(model_list, model_name):
    compare_alg(name, model, test_features, test_labels)
print('All metric appended')
# show the metrics in a pandas dataframe
metric_df = pd.DataFrame(model_metric) # change into pandas dataframe
metric_df
"""== using plotly to create table== """
init_notebook_mode(connected=True) # initiate the Plotly Notebook mode
table = ff.create_table(metric_df.round(3)) # create table
iplot(table)
In this section, I look at which models perform better at predicting the positive (Churn) class, so the metrics of interest are recall and precision.
Surprisingly, Logistic Regression, Decision Tree and Random Forest did very well on recall, while Random Forest, Gradient Boosting and XGBoost have higher precision. Logistic Regression has the lowest precision.
In addition, Random Forest did very well in overall classification, with the highest AUC score.
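For reference, precision, recall and F1 for the Churn class follow directly from the confusion-matrix counts. The sketch below works through the arithmetic with invented counts. They are not results from any of the models above, and the helper precision_recall_f1 is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Metrics for the positive class from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted churners who really churned
    recall = tp / (tp + fn)      # share of real churners that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented counts: 80 churners found, 40 false alarms, 20 churners missed
p, r, f = precision_recall_f1(tp=80, fp=40, fn=20)

print(round(p, 3))  # 0.667
print(round(r, 3))  # 0.8
print(round(f, 3))  # 0.727
```

A model with high recall but low precision, as described for Logistic Regression above, finds most churners at the cost of more false alarms.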