Bellabeat Capstone Case Study

Author

Adrianne Padua

Published

October 5, 2025

1 Introduction

This capstone project analyzes smart device usage data to explore health behaviors and trends among fitness tracker users that may also apply to Bellabeat customers.

Bellabeat is a high-tech company that manufactures health-focused smart products designed specifically for women. The goal is to derive actionable insights that support Bellabeat’s marketing strategy and inform improvements to products such as the Leaf series and Ivy+.

2 About the Data

The dataset used in this case study is the FitBit Fitness Tracker Data (2016), publicly available on Kaggle.

It includes 30 participants and provides daily activity, sleep tracking, and weight logs. Although the sample is small and the data was collected in 2016, it is useful for exploratory analysis.

Code
library(tidyverse)
library(lubridate)

daily_activity <- read_csv("data/cleaned/dailyActivity_all.csv")
sleep_data    <- read_csv("data/cleaned/sleep_all.csv")
weight_data   <- read_csv("data/cleaned/weight_all.csv")

glimpse(daily_activity)
Rows: 1,397
Columns: 17
$ Id                       <dbl> 1503960366, 1503960366, 1503960366, 150396036…
$ ActivityDate             <date> 2016-03-25, 2016-03-26, 2016-03-27, 2016-03-…
$ TotalSteps               <dbl> 11004, 17609, 12736, 13231, 12041, 10970, 122…
$ TotalDistance            <dbl> 7.11, 11.55, 8.53, 8.93, 7.85, 7.16, 7.86, 7.…
$ TrackerDistance          <dbl> 7.11, 11.55, 8.53, 8.93, 7.85, 7.16, 7.86, 7.…
$ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ VeryActiveDistance       <dbl> 2.57, 6.92, 4.66, 3.19, 2.16, 2.36, 2.29, 3.3…
$ ModeratelyActiveDistance <dbl> 0.46, 0.73, 0.16, 0.79, 1.09, 0.51, 0.49, 0.8…
$ LightActiveDistance      <dbl> 4.07, 3.91, 3.71, 4.95, 4.61, 4.29, 5.04, 3.6…
$ SedentaryActiveDistance  <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.0…
$ VeryActiveMinutes        <dbl> 33, 89, 56, 39, 28, 30, 33, 47, 40, 15, 43, 3…
$ FairlyActiveMinutes      <dbl> 12, 17, 5, 20, 28, 13, 12, 21, 11, 30, 18, 18…
$ LightlyActiveMinutes     <dbl> 205, 274, 268, 224, 243, 223, 239, 200, 244, …
$ SedentaryMinutes         <dbl> 804, 588, 605, 1080, 763, 1174, 820, 866, 636…
$ Calories                 <dbl> 1819, 2154, 1944, 1932, 1886, 1820, 1889, 186…
$ ActiveMinutes            <dbl> 250, 380, 329, 283, 299, 266, 284, 268, 295, …
$ SedentaryRatio           <dbl> 0.7628083, 0.6074380, 0.6477516, 0.7923698, 0…
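
The last two columns, ActiveMinutes and SedentaryRatio, appear to have been derived during data preparation. The values printed above are consistent with ActiveMinutes being the sum of the three active-minute columns and SedentaryRatio being SedentaryMinutes divided by sedentary plus active minutes. The sketch below checks that assumption against the first four rows shown in the glimpse() output. It is an inference from the displayed values, not necessarily the code used to build the cleaned file, and `first_rows` and `derived` are names introduced here for illustration.

```r
library(dplyr)

# First four rows of the minute columns, copied from the glimpse() output above
first_rows <- tibble(
  VeryActiveMinutes    = c(33, 89, 56, 39),
  FairlyActiveMinutes  = c(12, 17, 5, 20),
  LightlyActiveMinutes = c(205, 274, 268, 224),
  SedentaryMinutes     = c(804, 588, 605, 1080)
)

# Assumed derivation (inferred from the displayed values)
derived <- first_rows %>%
  mutate(
    ActiveMinutes  = VeryActiveMinutes + FairlyActiveMinutes + LightlyActiveMinutes,
    SedentaryRatio = SedentaryMinutes / (SedentaryMinutes + ActiveMinutes)
  )

derived %>% select(ActiveMinutes, SedentaryRatio)
# ActiveMinutes:  250, 380, 329, 283
# SedentaryRatio: 0.7628083, 0.6074380, 0.6477516, 0.7923698
```

Both derived columns reproduce the first four values in the output, which supports the assumed formulas for these rows.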

3 Data Cleaning

The following steps were applied during data cleaning:

  • Checked for missing or null values
  • Converted date columns into proper Date format
  • Removed duplicates
  • Kept only users with complete activity and sleep logs
  • Merged relevant tables for deeper insights
Code
# Convert ActivityDate only if not already a Date
if (!inherits(daily_activity$ActivityDate, "Date")) {
  daily_activity <- daily_activity %>%
    mutate(ActivityDate = as.Date(ActivityDate, format = "%Y-%m-%d"))
}
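
Only the date conversion is shown above. The remaining steps in the list could look roughly like the following sketch, which uses small made-up tables so that it runs on its own. The toy values, the `SleepDay` column name, the object names, and the use of `inner_join()` to approximate the filtering step are all assumptions for illustration. The cleaned files may use different column names, and the actual cleaning code may differ.

```r
library(dplyr)

# Toy stand-ins for the activity and sleep tables (values are made up)
activity_raw <- tibble(
  Id           = c(1, 1, 1, 2, 2),
  ActivityDate = c("2016-04-12", "2016-04-12", "2016-04-13", "2016-04-12", NA),
  TotalSteps   = c(11004, 11004, 9500, 4200, 3100)
)
sleep_raw <- tibble(
  Id                 = c(1, 1, 2),
  SleepDay           = as.Date(c("2016-04-12", "2016-04-13", "2016-04-12")),
  TotalMinutesAsleep = c(420, 390, 450)
)

# Count missing values per column
missing_counts <- colSums(is.na(activity_raw))

# Drop rows with a missing date, convert to Date, remove exact duplicate rows
activity_clean <- activity_raw %>%
  filter(!is.na(ActivityDate)) %>%
  mutate(ActivityDate = as.Date(ActivityDate, format = "%Y-%m-%d")) %>%
  distinct()

# Merge activity and sleep, keeping only user-days that appear in both tables
activity_sleep <- activity_clean %>%
  inner_join(sleep_raw, by = c("Id", "ActivityDate" = "SleepDay"))

missing_counts
activity_sleep
```

An inner join drops any user-day that lacks a sleep record, which is a stricter filter than a left join; whether that trade-off is acceptable depends on how many records it removes.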

4 Analysis

4.2 Sleep Behavior

Code
sleep_summary <- sleep_data %>%
  summarise(
    avg_minutes_asleep = mean(TotalMinutesAsleep, na.rm = TRUE),
    avg_time_in_bed    = mean(TotalTimeInBed, na.rm = TRUE)
  )
sleep_summary
# A tibble: 1 × 2
  avg_minutes_asleep avg_time_in_bed
               <dbl>           <dbl>
1               419.            458.

Key Observations:

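  • On average, users spent about 39 minutes longer in bed than asleep (458 vs. 419 minutes), a sleep efficiency of roughly 91%.
  • Average sleep per logged night was about 7.0 hours (419 minutes), right at the lower bound of the CDC’s recommended 7 or more hours for adults, so a substantial share of nights likely falls short of it.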
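
The averages in the summary table can be converted into hours and a sleep-efficiency ratio with a few lines of arithmetic. The sketch below uses the rounded values printed above (419 and 458 minutes), so the results are approximate. Sleep efficiency is taken here to mean minutes asleep divided by minutes in bed, and the variable names are introduced for illustration.

```r
# Rounded averages copied from the sleep_summary output above
avg_minutes_asleep <- 419
avg_time_in_bed    <- 458

# Average sleep expressed in hours
avg_hours_asleep <- avg_minutes_asleep / 60

# Sleep efficiency: share of time in bed actually spent asleep
sleep_efficiency <- avg_minutes_asleep / avg_time_in_bed

# Average minutes spent in bed but not asleep
minutes_awake_in_bed <- avg_time_in_bed - avg_minutes_asleep

round(avg_hours_asleep, 2)   # 6.98
round(sleep_efficiency, 3)   # 0.915
minutes_awake_in_bed         # 39
```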

4.3 Weight and Activity

  • Fewer participants logged weight data than activity or sleep data, which limits the analysis.
  • Those who did often showed irregular tracking behavior.
  • There’s potential to encourage habit-building for long-term fitness goals.
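
One way to quantify how regularly participants log their weight is to count entries per user and look at the first and last logging dates. The sketch below does this on a small made-up table; the `Id` and `Date` column names and all values are assumptions for illustration, and the same pipeline would need to be adapted to the actual columns in `weight_data`.

```r
library(dplyr)

# Toy weight log (made-up values); one row per weigh-in
weight_toy <- tibble(
  Id   = c(1, 1, 1, 1, 2, 3, 3),
  Date = as.Date(c("2016-04-12", "2016-04-13", "2016-04-14", "2016-04-15",
                   "2016-04-20", "2016-04-12", "2016-05-01"))
)

# Number of weigh-ins and the first and last logging date, per user
logging_summary <- weight_toy %>%
  group_by(Id) %>%
  summarise(
    n_logs    = n(),
    first_log = min(Date),
    last_log  = max(Date),
    .groups   = "drop"
  )

logging_summary
```

Users with only one or two entries, or with long gaps between the first and last log, are the ones the observations above describe as tracking irregularly.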

5 Insights

  • Users are not consistently meeting physical activity recommendations.
  • Sedentary behavior dominates most user profiles.
  • Sleep patterns are inconsistent, and sleep tracking could benefit from deeper analysis.
  • Data logging (especially weight) is inconsistent, which limits feedback effectiveness.

6 Recommendations

  1. Nutrition & Sugar Intake Logging
    Expand the Bellabeat app to allow manual logging of nutrition and sugar intake, helping users monitor how their diet affects their health goals.

  2. Integration with Glucose Meters
    Add support for third-party glucose monitoring devices (e.g., OneTouch Verio), inspired by existing solutions like Kaiser Permanente’s KP Health Ally app. This feature can help diabetic users (like myself) track patterns more holistically.

  3. Expanded Pregnancy & Postpartum Support for Ivy+
    Enhance the Ivy+ smart wearable with features tailored to pregnant and postpartum users, such as gentle movement goals, hydration reminders, and sleep tracking optimized for these stages of life.

  4. Condition-Specific Product Innovations
    Consider designing specialized Leaf devices (e.g., Leaf Diabetic) aimed at chronic conditions. These could be developed in collaboration with healthcare providers and even prescribed as part of digital wellness programs.

  5. Data-Driven Behavior Nudges
    Incorporate personalized AI-based nudges in the Bellabeat ecosystem to encourage positive habits, such as hitting step goals, improving sleep hygiene, or managing sugar intake.

7 Conclusion

While the dataset had limitations (small sample, 2016 data), the findings provide useful direction for Bellabeat’s product and marketing teams.

By focusing on improving sleep feedback, daily movement, and user engagement, Bellabeat can better tailor its offerings to support women’s health and wellness habits through its smart products.