Project Overview
This project uses “The Instacart Online Grocery Shopping Dataset 2017” (the instacart dataset from the p8105.datasets package) to explore trends in offerings and purchases in online grocery shopping. The analysis was run in R version 4.3.1 with the R packages dplyr, ggplot2, tidyverse, and p8105.datasets.
The instacart dataset has 1384617 observations and 15 variables:
- order_id: order identifier, class integer.
- product_id: product identifier, class integer.
- add_to_cart_order: order in which each product was added to cart, class integer.
- reordered: 1 if this product has been ordered by this user in the past, 0 otherwise, class integer.
- user_id: customer identifier, class integer.
- eval_set: which evaluation set this order belongs in (note that the data for use in this class is exclusively from the “train” eval_set), class character.
- order_number: the order sequence number for this user (1 = first, n = nth), class integer.
- order_dow: the day of the week on which the order was placed, class integer.
- order_hour_of_day: the hour of the day on which the order was placed, class integer.
- days_since_prior_order: days since the last order, capped at 30, NA if order_number = 1, class integer.
- product_name: name of the product, class character.
- aisle_id: aisle identifier, class integer.
- department_id: department identifier, class integer.
- aisle: the name of the aisle, class character.
- department: the name of the department, class character.
Visualizations and EDA
Table for Number of Items Ordered, by Day of Week
order_dow | n_obs |
---|---|
0 | 324026 |
1 | 205978 |
2 | 160562 |
3 | 154381 |
4 | 155481 |
5 | 176910 |
6 | 207279 |
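A table like the one above can be produced by counting rows per day of the week. The sketch below is a minimal illustration on a made-up `toy_items` tibble (a hypothetical stand-in for the instacart data, which has one row per item in an order); it is not necessarily the code used in this project.

```r
library(dplyr)

# Toy stand-in for the instacart data: one row per item in an order.
# All values are invented for illustration.
toy_items <- tibble(
  order_id  = c(1, 1, 2, 3, 3, 3),
  order_dow = c(0, 0, 1, 6, 6, 6)
)

# Count rows (items) per day of the week; `name` sets the count column name.
items_by_dow <- toy_items |>
  count(order_dow, name = "n_obs")

items_by_dow
```

Because each row is an item rather than an order, the counts are item totals per day; counting distinct orders instead would need `n_distinct(order_id)` inside `summarize()`.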
There are 131209 unique users and 131209 unique orders. On average, 197802.43 items are ordered per day of the week. There are 134 aisles and 21 departments, and the most items are ordered from the fresh vegetables aisle.
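Summary figures of this kind can be computed with `n_distinct()` and `count()`. The following is a sketch under stated assumptions: `toy_items` and all of its values are hypothetical, and the mean is taken over the days present in the toy data (three here, seven in the real data).

```r
library(dplyr)

# Toy stand-in for the instacart data (invented values).
toy_items <- tibble(
  order_id  = c(1, 1, 2, 3, 3, 3),
  user_id   = c(10, 10, 11, 12, 12, 12),
  order_dow = c(0, 0, 1, 6, 6, 6),
  aisle     = c("yogurt", "fresh fruits", "fresh fruits",
                "fresh fruits", "yogurt", "fresh vegetables")
)

# Distinct users, orders, and aisles.
n_users  <- n_distinct(toy_items$user_id)
n_orders <- n_distinct(toy_items$order_id)
n_aisles <- n_distinct(toy_items$aisle)

# Mean number of items per day of the week: average the per-day counts.
mean_items_per_dow <- toy_items |>
  count(order_dow) |>
  summarize(mean_items = mean(n)) |>
  pull(mean_items)

# Aisle with the most items ordered.
top_aisle <- toy_items |>
  count(aisle, sort = TRUE) |>
  slice(1) |>
  pull(aisle)
```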
Department Barplot
Aisles Barplot, colored by departments
Number of items ordered in each aisle, for aisles with more than 10000 items ordered.
There are 39 aisles with more than 10,000 items ordered. Most of these aisles have fewer than 40,000 items ordered. Five aisles have more than 40,000 items ordered: fresh fruits, fresh vegetables, packaged cheese, packaged vegetables fruits, and yogurt.
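One way to build an aisle barplot restricted to popular aisles is to count items per aisle, filter on a threshold, and pass the result to `geom_col()`. This is a sketch only: `toy_items`, its values, and the small `min_items` threshold are assumptions for illustration (the text above uses 10000).

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the instacart data (invented values).
toy_items <- tibble(
  aisle      = c(rep("fresh fruits", 5), rep("yogurt", 3), "tea"),
  department = c(rep("produce", 5), rep("dairy eggs", 3), "beverages")
)

# Hypothetical threshold for the toy data; the analysis uses 10000.
min_items <- 2

# Items per aisle, keeping only aisles above the threshold.
popular_aisles <- toy_items |>
  count(aisle, department, name = "n_obs") |>
  filter(n_obs > min_items)

# Horizontal bars ordered by count and colored by department.
aisle_plot <- popular_aisles |>
  ggplot(aes(x = reorder(aisle, n_obs), y = n_obs, fill = department)) +
  geom_col() +
  coord_flip() +
  labs(x = "Aisle", y = "Number of items ordered", fill = "Department")
```

Flipping the coordinates keeps long aisle names readable, which matters once dozens of aisles pass the threshold.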
Three most popular items in aisles “baking ingredients”, “dog food care”, and “packaged vegetables fruits”
aisle | product_name | n_obs |
---|---|---|
packaged vegetables fruits | Organic Baby Spinach | 9784 |
packaged vegetables fruits | Organic Raspberries | 5546 |
packaged vegetables fruits | Organic Blueberries | 4966 |
baking ingredients | Light Brown Sugar | 499 |
baking ingredients | Pure Baking Soda | 387 |
baking ingredients | Cane Sugar | 336 |
dog food care | Snack Sticks Chicken & Rice Recipe Dog Treats | 30 |
dog food care | Organix Chicken & Brown Rice Recipe | 28 |
dog food care | Small Dog Biscuits | 26 |
Across the three aisles, Organic Baby Spinach is the most-ordered item, ordered 9784 times.
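A per-aisle "top three" table can be sketched with `count()` followed by a grouped `slice_max()`. The data and product names below are hypothetical placeholders, and `with_ties = FALSE` is one possible design choice that caps each aisle at exactly three rows.

```r
library(dplyr)

# Toy stand-in for the instacart data (invented rows and product names).
toy_items <- tibble(
  aisle = c(rep("baking ingredients", 10), rep("dog food care", 6)),
  product_name = c(
    rep("Product A", 4), rep("Product B", 3), rep("Product C", 2), "Product D",
    rep("Product X", 3), rep("Product Y", 2), "Product Z"
  )
)

aisles_of_interest <- c("baking ingredients", "dog food care")

# Count each product within its aisle, then keep the three largest counts.
top_items <- toy_items |>
  filter(aisle %in% aisles_of_interest) |>
  count(aisle, product_name, name = "n_obs") |>
  group_by(aisle) |>
  slice_max(n_obs, n = 3, with_ties = FALSE) |>
  ungroup() |>
  arrange(aisle, desc(n_obs))

top_items
```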
Mean hour of day at which Pink Lady Apples and Coffee Ice Cream are ordered on each day of the week
Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
---|---|---|---|---|---|---|
13.60 | 12.17 | 12.84 | 14.69 | 13.17 | 12.64 | 13.25 |
Wednesday has the latest mean order hour for Pink Lady Apples and Coffee Ice Cream, at about 14.69 (roughly 2:41 pm).
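A day-by-day table of mean order hours can be sketched with `group_by()`, `summarize()`, and `tidyr::pivot_wider()`. The example below uses invented rows and assumes `order_dow` 0 corresponds to Sunday, as the column headers above suggest. It groups by product as well, giving one row per product; dropping `product_name` from `group_by()` would instead give a single pooled row like the table above.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the instacart data (invented values).
toy_items <- tibble(
  product_name = c("Pink Lady Apples", "Pink Lady Apples", "Pink Lady Apples",
                   "Coffee Ice Cream", "Coffee Ice Cream", "Other Product"),
  order_dow         = c(0, 0, 1, 0, 1, 0),
  order_hour_of_day = c(10, 14, 12, 15, 16, 23)
)

# Assumption: order_dow 0 = Sunday, ..., 6 = Saturday.
day_labels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")

mean_hour_table <- toy_items |>
  filter(product_name %in% c("Pink Lady Apples", "Coffee Ice Cream")) |>
  mutate(day = factor(day_labels[order_dow + 1], levels = day_labels)) |>
  group_by(product_name, day) |>
  summarize(mean_hour = mean(order_hour_of_day), .groups = "drop") |>
  pivot_wider(names_from = day, values_from = mean_hour)

mean_hour_table
```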