Number of customers: 400
Type: Customer churn data for telecom company
•Gender: 0 for male & 1 for female
•Tenure: Value in months
•Phone service: 0 for no & 1 for yes
•Internet service: 0 for no, 1 for DSL & 2 for Fiber Optic
•Contract: 0 for monthly, 1 for yearly & 2 for two-year
•Paperless Billing: 0 for no & 1 for yes
•Monthly charges: In dollars
•Churn: 0 for no & 1 for yes
Box plot & Descriptive Statistics
Across all customers, monthly charges had a mean of $66.14
For customers who left the service, mean monthly charges were $73.41
Customers who stayed with the service paid $9.75 less on average ($63.66)
There is also a big gap between the two groups' lower quartile.
Q1 for customers who left the service was $25.17 higher than for those who stayed.
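The group comparison above can be reproduced with a few lines of Python. This is a minimal sketch: the charge values are made up for illustration (they are not the 400-customer dataset), the function name `summarize` is hypothetical, and the quartiles use the default "exclusive" method of `statistics.quantiles`, which may or may not match the method behind the box plot.

```python
from statistics import mean, quantiles

# Hypothetical monthly charges in dollars, NOT the real 400-customer dataset.
charges_by_group = {
    "churned": [70.0, 80.0, 90.0, 100.0, 60.0, 75.0, 85.0],
    "stayed": [20.0, 25.0, 50.0, 65.0, 80.0, 95.0, 55.0],
}

def summarize(values):
    """Return the mean and the three quartiles of a list of numbers."""
    q1, median, q3 = quantiles(values, n=4)  # default method is "exclusive"
    return {"mean": mean(values), "q1": q1, "median": median, "q3": q3}

summary = {group: summarize(vals) for group, vals in charges_by_group.items()}

# Gaps between the two groups, as read off the box plot in the text.
mean_gap = summary["churned"]["mean"] - summary["stayed"]["mean"]
q1_gap = summary["churned"]["q1"] - summary["stayed"]["q1"]
```

Printing `summary`, `mean_gap`, and `q1_gap` gives the same kind of figures quoted on this slide.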
ANOVA: Stability
To determine whether the means of the two groups (churn vs no churn) are the same, an ANOVA test can be conducted.
The first step is to check the stability of the data.
The control chart shows that the data is stable: none of Minitab's control chart rules has been broken.
Null hypothesis: Same mean for two groups
Alternate hypothesis: Different means for two groups
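The slide does not say which control chart was used for the stability check. As a rough sketch, assuming an individuals (I) chart with the usual 2.66 moving-range constant, and checking only the simplest rule (a point beyond the control limits), the check could look like this. The function name and the sample values are hypothetical.

```python
from statistics import mean

def individuals_chart(values):
    """Return (lcl, center, ucl, flagged) for an individuals control chart.

    The limits are center +/- 2.66 * average moving range. Only one rule is
    checked here: a point outside the control limits. Minitab offers further
    rules (runs, trends, ...) that this sketch does not implement.
    """
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    ucl = center + 2.66 * mr_bar
    lcl = center - 2.66 * mr_bar
    flagged = [i for i, v in enumerate(values) if v > ucl or v < lcl]
    return lcl, center, ucl, flagged

# Illustrative values only.
stable = [10, 12, 11, 13, 12, 11, 10, 12]
with_spike = stable + [30]

_, _, _, stable_flags = individuals_chart(stable)
_, _, _, spike_flags = individuals_chart(with_spike)
```

An empty `flagged` list corresponds to the "stable" conclusion for this one rule; any index in the list marks a point outside the limits.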
ANOVA: Shape
The p-values from the probability plots of tenure (in months) and monthly charges can be used to determine whether the data are normally distributed.
In this case, the p-value is lower than 0.05 for both variables, which means that the data are not normally distributed.
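The source does not name the normality test behind the probability plot. The sketch below assumes an Anderson-Darling test with mean and standard deviation estimated from the sample, and uses the commonly cited D'Agostino and Stephens piecewise approximation for the p-value. Both the choice of test and the approximation are assumptions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """Return (adjusted A-squared, approximate p-value) for a normality test."""
    n = len(data)
    xs = sorted(data)
    mu, sigma = mean(xs), stdev(xs)
    std_normal = NormalDist()
    z = [(x - mu) / sigma for x in xs]

    total = 0.0
    for i in range(1, n + 1):
        lower = std_normal.cdf(z[i - 1])   # F(Y_i)
        upper = std_normal.cdf(-z[n - i])  # 1 - F(Y_(n+1-i)), via symmetry
        total += (2 * i - 1) * (math.log(lower) + math.log(upper))
    a2 = -n - total / n

    # Small-sample adjustment for estimated mean and variance.
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)

    # Piecewise p-value approximation (assumed, see lead-in).
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2_adj, p

# Illustrative samples: one heavily right-skewed, one built from normal quantiles.
skewed = [2 ** k for k in range(20)]
normal_like = [NormalDist().inv_cdf((i - 0.5) / 30) for i in range(1, 31)]

_, p_skewed = anderson_darling_normal(skewed)
_, p_normal = anderson_darling_normal(normal_like)
```

A p-value below 0.05, as for the skewed sample, leads to the same "not normally distributed" conclusion as on this slide.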
ANOVA: Spread
As the data for monthly charges is non-normal, Levene’s test is used to test for equal variances.
From the chart, the p-value of Levene’s test is less than 0.05, so the null hypothesis of equal variances is rejected.
There is unequal variance.
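To show what Levene's test measures, here is a small sketch of the test statistic W in its median-centered form (whether the original analysis used the median or the mean as the center is an assumption). The sketch stops at the statistic: the p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which the Python standard library does not provide. The function name and the sample values are hypothetical.

```python
from statistics import mean, median

def levene_w(*groups):
    """Return the median-centered Levene statistic W for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviations of each observation from its own group median.
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    group_means = [mean(d) for d in deviations]
    grand_mean = mean(x for d in deviations for x in d)

    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((x - m) ** 2
                 for d, m in zip(deviations, group_means) for x in d)
    return (n_total - k) * between / ((k - 1) * within)

# Illustrative values only: the second group is visibly more spread out.
narrow = [10, 12, 14, 16, 18]
wide = [5, 10, 15, 20, 25]
w_stat = levene_w(narrow, wide)
```

A larger W means the groups' spreads differ more; the decision itself still needs the F-based p-value, for example as reported by Minitab.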
ANOVA: Centering
As the data is non-normal and has unequal variance, the Kruskal-Wallis test is used to determine whether the medians are equal.
The null hypothesis is rejected, as the p-value is lower than 0.05.
The medians are not equal.
There is a statistically significant difference in monthly charges between customers who churned and those who did not.
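The Kruskal-Wallis test works on ranks rather than raw values. The sketch below handles the two-group case used here (churn vs no churn). With two groups the statistic H has one degree of freedom, so the chi-square p-value can be written with `math.erfc`. Function names and sample values are hypothetical, and the sketch assumes the data are not all identical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def kruskal_wallis_two_groups(a, b):
    """Return (H, p-value) for two groups, with the usual tie correction."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    rank_sum_a = sum(ranks[:len(a)])
    rank_sum_b = sum(ranks[len(a):])
    h = (12 / (n * (n + 1))
         * (rank_sum_a ** 2 / len(a) + rank_sum_b ** 2 / len(b))
         - 3 * (n + 1))
    h /= 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value

# Illustrative values only.
h_stat, p_val = kruskal_wallis_two_groups([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

A p-value below 0.05 corresponds to rejecting the null hypothesis of equal medians.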
Regression Analysis: Tenure vs Monthly Charges
Regression analysis can be done for tenure and monthly charges to determine whether the two variables are statistically related.
The average tenure for all 400 customers is 31.48 months. The highest tenure is 6 years and the lowest is one month.
The data for tenure is stable and does not break Minitab rules.
Regression Analysis: Tenure vs Monthly Charges
Both the linear and the quadratic regression have a very low R-squared. Tenure explains only about 5% of the variation in a customer's monthly charges.
For the quadratic regression, the R-squared is slightly higher at 5.6%.
With a p-value of 0.026, the quadratic term in this analysis is significant.
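The comparison of a linear and a quadratic fit can be sketched as follows: fit each polynomial by least squares, then compute R-squared as 1 - SS_residual / SS_total. This is a plain-Python illustration with hypothetical function names and made-up numbers, not the Minitab procedure, and it does not compute the p-value of the quadratic term.

```python
def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients [b0, b1, ...]."""
    size = degree + 1
    # Normal equations (X^T X) b = X^T y for the columns 1, x, x^2, ...
    a = [[sum(xi ** (r + c) for xi in x) for c in range(size)]
         for r in range(size)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(size)]

    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs

def r_squared(x, y, coeffs):
    """Share of the variation in y explained by the fitted polynomial."""
    y_mean = sum(y) / len(y)
    fitted = [sum(c * xi ** p for p, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up example: the response is exactly quadratic in the predictor.
tenure = [0, 1, 2, 3, 4, 5]
charges = [t ** 2 for t in tenure]

r2_linear = r_squared(tenure, charges, fit_polynomial(tenure, charges, 1))
r2_quadratic = r_squared(tenure, charges, fit_polynomial(tenure, charges, 2))
```

Adding a quadratic term can never lower R-squared on the same data, which is why the slide also looks at the significance of the quadratic term rather than at R-squared alone.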
Chi-Square Test: Internet Service vs Churn
Of the people who chose Fiber Optic, 110 customers churned.
If churn were independent of internet service, the expected number of churned customers would have been 133.
Whether a customer churns or not is dependent on the type of internet service he or she uses.
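The expected counts in a chi-square test of independence come from the row and column totals: expected = row total × column total / grand total. The sketch below applies this to a made-up 3 × 2 table (internet service type by churn). The counts are hypothetical and are not the figures from this analysis. For a 3 × 2 table there are 2 degrees of freedom, in which case the chi-square p-value simplifies to exp(-chi2 / 2).

```python
import math

def chi_square_independence(observed):
    """Return (chi2, degrees of freedom, expected counts) for a 2-D table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected

def p_value_two_dof(chi2):
    """Chi-square p-value, valid ONLY for 2 degrees of freedom."""
    return math.exp(-chi2 / 2)

# Hypothetical counts. Rows: no internet, DSL, Fiber Optic.
# Columns: stayed, churned.
table = [
    [40, 10],
    [30, 20],
    [20, 30],
]
chi2, dof, expected = chi_square_independence(table)
p_value = p_value_two_dof(chi2)
```

Comparing an observed cell with its expected count, as the slide does for Fiber Optic customers who churned, shows where the departure from independence comes from.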