Let's first have a look at the provided datasets:
This file lists all orders in the dataset, one row per order.
order_id | user_id | eval_set | order_number | order_dow | order_hour_of_day | days_since_prior_order |
---|---|---|---|---|---|---|
2539329 | 1 | prior | 1 | 2 | 8 | NA |
2398795 | 1 | prior | 2 | 3 | 7 | 15 |
473747 | 1 | prior | 3 | 3 | 12 | 21 |
2254736 | 1 | prior | 4 | 4 | 7 | 29 |
431534 | 1 | prior | 5 | 4 | 15 | 28 |
Observations: 3,421,083
Variables: 7
$ order_id <int> 2539329, 2398795, 473747, 2254736, 4315...
$ user_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, ...
$ eval_set <chr> "prior", "prior", "prior", "prior", "pr...
$ order_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 2...
$ order_dow <int> 2, 3, 3, 4, 4, 2, 1, 1, 1, 4, 4, 2, 5, ...
$ order_hour_of_day <int> 8, 7, 12, 7, 15, 7, 9, 14, 16, 8, 8, 11...
$ days_since_prior_order <dbl> NA, 15, 21, 29, 28, 19, 20, 14, 0, 30, ...
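As a minimal sketch of how the `NA` in `days_since_prior_order` might be handled when loading this file, the snippet below parses the five sample rows shown above with Python's standard `csv` module. The helper name `parse_orders` and the choice to map `NA` to `None` are assumptions made for illustration, not part of the original analysis.

```python
import csv
import io

# The five sample rows from the table above, written out as CSV text.
SAMPLE = """order_id,user_id,eval_set,order_number,order_dow,order_hour_of_day,days_since_prior_order
2539329,1,prior,1,2,8,NA
2398795,1,prior,2,3,7,15
473747,1,prior,3,3,12,21
2254736,1,prior,4,4,7,29
431534,1,prior,5,4,15,28
"""

INT_COLUMNS = ("order_id", "user_id", "order_number", "order_dow", "order_hour_of_day")


def parse_orders(text):
    """Parse orders CSV text into a list of dicts (hypothetical helper)."""
    orders = []
    for row in csv.DictReader(io.StringIO(text)):
        order = {"eval_set": row["eval_set"]}
        for col in INT_COLUMNS:
            order[col] = int(row[col])
        raw_gap = row["days_since_prior_order"]
        # Order number 1 has no earlier order, so its gap is recorded as NA.
        order["days_since_prior_order"] = None if raw_gap == "NA" else float(raw_gap)
        orders.append(order)
    return orders


orders = parse_orders(SAMPLE)
# Average gap between orders, skipping the missing value.
gaps = [o["days_since_prior_order"] for o in orders if o["days_since_prior_order"] is not None]
mean_gap = sum(gaps) / len(gaps)
print(len(orders), mean_gap)
```

Keeping the missing gap as `None` rather than `0` avoids confusing a first order with a same-day reorder, which also appears in the data as `0`.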
This file records which products (product_id) were ordered. It also gives the position (add_to_cart_order) at which each product was added to the cart and whether the product is a reorder (1) or not (0).
order_id | product_id | add_to_cart_order | reordered |
---|---|---|---|
1 | 49302 | 1 | 1 |
1 | 11109 | 2 | 1 |
1 | 10246 | 3 | 0 |
1 | 49683 | 4 | 0 |
1 | 43633 | 5 | 1 |
Observations: 1,384,617
Variables: 4
$ order_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 36, 36, 36, 36, 36, ...
$ product_id <int> 49302, 11109, 10246, 49683, 43633, 13176, 47...
$ add_to_cart_order <int> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7,...
$ reordered <int> 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1,...
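Because this file has one row per product per order, a basket analysis usually needs the rows grouped back into one basket per order. The sketch below does that for the five sample rows shown above and also computes the share of reordered items. The function names `build_baskets` and `reorder_share` are illustrative assumptions.

```python
from collections import defaultdict

# (order_id, product_id, add_to_cart_order, reordered) rows from the table above.
ROWS = [
    (1, 49302, 1, 1),
    (1, 11109, 2, 1),
    (1, 10246, 3, 0),
    (1, 49683, 4, 0),
    (1, 43633, 5, 1),
]


def build_baskets(rows):
    """Group product ids by order, sorted by their add-to-cart position."""
    by_order = defaultdict(list)
    for order_id, product_id, position, _reordered in rows:
        by_order[order_id].append((position, product_id))
    return {oid: [pid for _, pid in sorted(items)] for oid, items in by_order.items()}


def reorder_share(rows):
    """Fraction of rows whose reordered flag is 1."""
    return sum(row[3] for row in rows) / len(rows)


baskets = build_baskets(ROWS)
share = reorder_share(ROWS)
print(baskets[1])
print(share)
```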
This file contains the names of the products with their corresponding product_id.
product_id | product_name | aisle_id | department_id |
---|---|---|---|
1 | Chocolate Sandwich Cookies | 61 | 19 |
2 | All-Seasons Salt | 104 | 13 |
3 | Robust Golden Unsweetened Oolong Tea | 94 | 7 |
4 | Smart Ones Classic Favorites Mini Rigatoni With Vodka Cream Sauce | 38 | 1 |
5 | Green Chile Anytime Sauce | 5 | 13 |
Observations: 49,688
Variables: 4
$ product_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
$ product_name <chr> "Chocolate Sandwich Cookies", "All-Seasons Salt"...
$ aisle_id <int> 61, 104, 94, 38, 5, 11, 98, 116, 120, 115, 31, 1...
$ department_id <int> 19, 13, 7, 1, 13, 11, 7, 1, 16, 7, 7, 1, 11, 17,...
This file is structurally the same as order_products__train.csv.
order_id | product_id | add_to_cart_order | reordered |
---|---|---|---|
2 | 33120 | 1 | 1 |
2 | 28985 | 2 | 1 |
2 | 9327 | 3 | 0 |
2 | 45918 | 4 | 1 |
2 | 30035 | 5 | 0 |
Observations: 32,434,489
Variables: 4
$ order_id <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3,...
$ product_id <int> 33120, 28985, 9327, 45918, 30035, 17794, 401...
$ add_to_cart_order <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6,...
$ reordered <int> 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,...
This file contains the different aisles.
aisle_id | aisle |
---|---|
1 | prepared soups salads |
2 | specialty cheeses |
3 | energy granola bars |
4 | instant foods |
5 | marinades meat preparation |
Observations: 134
Variables: 2
$ aisle_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16...
$ aisle <chr> "prepared soups salads", "specialty cheeses", "energy...
This file contains the different departments.
department_id | department |
---|---|
1 | frozen |
2 | other |
3 | bakery |
4 | produce |
5 | alcohol |
Observations: 21
Variables: 2
$ department_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
$ department <chr> "frozen", "other", "bakery", "produce", "alcohol...
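The products table refers to aisles and departments only by id, so the three files have to be joined to get readable labels. The sketch below uses plain dictionaries built from the sample rows shown above; `label_products` is a hypothetical helper, and ids that fall outside the five sample rows of each lookup table simply come back as `None`.

```python
# Sample rows copied from the products, aisles, and departments tables above.
PRODUCTS = [
    (1, "Chocolate Sandwich Cookies", 61, 19),
    (2, "All-Seasons Salt", 104, 13),
    (3, "Robust Golden Unsweetened Oolong Tea", 94, 7),
    (4, "Smart Ones Classic Favorites Mini Rigatoni With Vodka Cream Sauce", 38, 1),
    (5, "Green Chile Anytime Sauce", 5, 13),
]
AISLES = {
    1: "prepared soups salads",
    2: "specialty cheeses",
    3: "energy granola bars",
    4: "instant foods",
    5: "marinades meat preparation",
}
DEPARTMENTS = {1: "frozen", 2: "other", 3: "bakery", 4: "produce", 5: "alcohol"}


def label_products(products, aisles, departments):
    """Attach aisle and department names to each product (left join)."""
    labelled = []
    for product_id, name, aisle_id, department_id in products:
        labelled.append({
            "product_id": product_id,
            "product_name": name,
            # .get() returns None when the id is missing from the lookup.
            "aisle": aisles.get(aisle_id),
            "department": departments.get(department_id),
        })
    return labelled


labelled = label_products(PRODUCTS, AISLES, DEPARTMENTS)
for item in labelled:
    print(item["product_id"], item["aisle"], item["department"])
```

With the full files (134 aisles and 21 departments) every product id should resolve; the `None` values here only reflect the truncated samples.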
When people buy groceries online.
Sunday and Monday are the days when people order the most on Instacart.
Banana is the most purchased item, followed by Organic Strawberries and Baby Spinach.
lhs | rhs | support | confidence | lift |
---|---|---|---|---|
{Organic Hass Avocado,Organic Raspberries,Organic Strawberries} | {Bag of Organic Bananas} | 0.0017377 | 0.5984252 | 5.072272 |
{Organic Cucumber,Organic Hass Avocado,Organic Strawberries} | {Bag of Organic Bananas} | 0.0010670 | 0.5468750 | 4.635331 |
{Organic Hass Avocado,Organic Kiwi} | {Bag of Organic Bananas} | 0.0014481 | 0.5459770 | 4.627720 |
{Organic Navel Orange,Organic Raspberries} | {Bag of Organic Bananas} | 0.0011508 | 0.5412186 | 4.587387 |
{Organic Hass Avocado,Organic Whole String Cheese} | {Bag of Organic Bananas} | 0.0011585 | 0.5314685 | 4.504745 |
{Organic Hass Avocado,Organic Navel Orange} | {Bag of Organic Bananas} | 0.0014938 | 0.5283019 | 4.477905 |
{Organic Hass Avocado,Organic Raspberries} | {Bag of Organic Bananas} | 0.0040470 | 0.5210991 | 4.416854 |
{Organic D’Anjou Pears,Organic Hass Avocado} | {Bag of Organic Bananas} | 0.0013871 | 0.5170455 | 4.382495 |
{Organic Hass Avocado,Organic Unsweetened Almond Milk} | {Bag of Organic Bananas} | 0.0012499 | 0.5141066 | 4.357585 |
{Organic Broccoli,Organic Hass Avocado} | {Bag of Organic Bananas} | 0.0011966 | 0.5348232 | 3.758898 |
Observations: 11
Variables: 5
$ lhs <chr> "{Organic Hass Avocado,Organic Raspberries,Organic ...
$ rhs <chr> "{Bag of Organic Bananas}", "{Bag of Organic Banana...
$ support <dbl> 0.001737686, 0.001067000, 0.001448071, 0.001150836,...
$ confidence <dbl> 0.5984252, 0.5468750, 0.5459770, 0.5412186, 0.53146...
$ lift <dbl> 5.072272, 4.635331, 4.627719, 4.587387, 4.504745, 4...
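The `support`, `confidence`, and `lift` columns follow the standard association-rule definitions: support is the share of baskets containing every item of the rule, confidence is that support divided by the support of the left-hand side, and lift is confidence divided by the support of the right-hand side. The sketch below computes all three on a handful of toy baskets; the baskets and the function names are hypothetical and are not taken from the Instacart data.

```python
# Hypothetical toy baskets, invented purely to illustrate the arithmetic.
BASKETS = [
    {"avocado", "banana"},
    {"avocado", "banana", "kiwi"},
    {"banana"},
    {"avocado"},
    {"kiwi"},
]


def support(itemset, baskets):
    """Share of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def rule_metrics(lhs, rhs, baskets):
    """Return (support, confidence, lift) for the rule lhs -> rhs."""
    rule_support = support(lhs | rhs, baskets)
    confidence = rule_support / support(lhs, baskets)
    lift = confidence / support(rhs, baskets)
    return rule_support, confidence, lift


sup, conf, lift_value = rule_metrics({"avocado"}, {"banana"}, BASKETS)
print(f"support={sup:.3f} confidence={conf:.3f} lift={lift_value:.3f}")
```

A lift above 1 means the right-hand side shows up more often in baskets containing the left-hand side than it does overall, which is why the rules above are ranked by lift.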
lhs | rhs | support | confidence | lift |
---|---|---|---|---|
{cereal,lunch meat} | {bread} | 0.0076595 | 0.4574420 | 2.793210 |
{chips pretzels,lunch meat,packaged cheese} | {bread} | 0.0073699 | 0.4472710 | 2.731105 |
{lunch meat,milk,yogurt} | {bread} | 0.0080406 | 0.4429051 | 2.704446 |
{lunch meat,milk,packaged cheese} | {bread} | 0.0084826 | 0.4427208 | 2.703320 |
{packaged cheese,preserved dips spreads} | {chips pretzels} | 0.0072708 | 0.4760479 | 2.694408 |
{cookies cakes,crackers} | {chips pretzels} | 0.0077129 | 0.4755639 | 2.691669 |
{eggs,fresh fruits,milk,packaged cheese} | {bread} | 0.0071413 | 0.4337963 | 2.648826 |
{lunch meat,pasta sauce} | {packaged cheese} | 0.0080254 | 0.6290323 | 2.645427 |
{fresh dips tapenades,ice cream ice} | {chips pretzels} | 0.0073699 | 0.4662488 | 2.638946 |
{fresh fruits,lunch meat,packaged cheese,yogurt} | {bread} | 0.0078348 | 0.4321143 | 2.638556 |
Observations: 10,022
Variables: 5
$ lhs <chr> "{cereal,lunch meat}", "{chips pretzels,lunch meat,...
$ rhs <chr> "{bread}", "{bread}", "{bread}", "{bread}", "{chips...
$ support <dbl> 0.007659536, 0.007369921, 0.008040607, 0.008482650,...
$ confidence <dbl> 0.4574420, 0.4472710, 0.4429051, 0.4427208, 0.47604...
$ lift <dbl> 2.793210, 2.731105, 2.704446, 2.703320, 2.694408, 2...
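Since lift is confidence divided by the support of the right-hand side, dividing confidence by lift recovers that support, and every rule with the same right-hand side should give the same value. The sketch below runs this sanity check on a few rows copied from the aisle-level table above; the 1e-5 tolerance is an assumption chosen to absorb the rounding of the printed figures.

```python
from collections import defaultdict

# (lhs, rhs, confidence, lift) copied from rows of the table above.
RULES = [
    ("{cereal,lunch meat}", "{bread}", 0.4574420, 2.793210),
    ("{chips pretzels,lunch meat,packaged cheese}", "{bread}", 0.4472710, 2.731105),
    ("{packaged cheese,preserved dips spreads}", "{chips pretzels}", 0.4760479, 2.694408),
    ("{cookies cakes,crackers}", "{chips pretzels}", 0.4755639, 2.691669),
    ("{lunch meat,pasta sauce}", "{packaged cheese}", 0.6290323, 2.645427),
]

# Implied support of each right-hand side: confidence / lift.
implied = defaultdict(list)
for _lhs, rhs, confidence, lift in RULES:
    implied[rhs].append(confidence / lift)

for rhs, values in implied.items():
    spread = max(values) - min(values)
    # Rules sharing a right-hand side should agree up to rounding.
    consistent = spread < 1e-5
    print(f"{rhs}: implied support ~ {values[0]:.4f}, consistent={consistent}")
```

On these rows the implied supports come out at roughly 0.164 for bread, 0.177 for chips pretzels, and 0.238 for packaged cheese.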
['Soda', 'Organic String Cheese', '0% Greek Strained Yogurt', 'XL Pick-A-Size Paper Towel Rolls', 'Milk Chocolate Almonds', 'Pistachios', 'Cinnamon Toast Crunch', 'Aged White Cheddar Popcorn', 'Organic Whole Milk', 'Organic Half & Half', 'Zero Calorie Cola']
['Soda', '0% Greek Strained Yogurt', 'Clementines', 'Bag of Organic Bananas', 'Organic Half & Half', 'Apples', 'Zero Calorie Cola', "Crunchy Oats 'n Honey Granola Bars", 'Extra Fancy Unsalted Mixed Nuts', 'Reduced Fat 2% Milk']
['Sugar, Organic', 'Penne Rigate #41 Pasta', 'Half & Half', 'Shredded Parmesan', 'Sustainably Soft Bath Tissue', 'Organic Seasoned Yukon Select Potatoes Hashed Browns', 'Classic Mild Cheddar Macaroni & Cheese', 'Original Coconut Milk Creamer', 'Coconut Milk Non Dairy Frozen Dessert No Sugar Added Mint Chip', 'Dairy Free Coconut Milk Yogurt Alternative', 'Organic Hass Avocado', 'Organic Extra Large Grade AA Brown Eggs', 'Organic Lightly Salted Sea Salt Thin & Crispy Restaurant Style Tortilla Chips', 'Basil Dish Soap', 'Shredded Mild Cheddar Cheese', 'Moisturizing Yuzu Shower Gel', 'Organic Tomato Cluster', 'Organic Russet Potato', 'Spinach']
['Organic Hass Avocado', 'Sparkling Water Grapefruit', 'Half & Half', 'Lime Sparkling Water', 'Sparkling Lemon Water', '2% Reduced Fat Milk', 'Organic Yellow Onion', 'Pure Sparkling Water', 'Organic Grape Tomatoes', 'Organic Garlic']
['Extra Creamy Whipped Cream', 'Ibuprofen 200 Mg', 'La Lechera Sweetened Condensed Milk, Fat Free', 'Vitamin D Added Evaporated Milk', 'Organic Whipping Cream', 'Blueberries', 'All Natural Virgin Lemonade', 'White Paper Towels']
['Half & Half', 'Organic Avocado', 'Organic Fuji Apple', 'Banana', 'Large Lemon', 'Organic Strawberries', 'Bag of Organic Bananas', 'Unsweetened Almondmilk', 'Raspberries', 'Organic Large Brown Grade AA Cage Free Eggs']
['Strained Non-Fat Strawberry Icelandic Style Skyr Yogurt', 'Icelandic Style Skyr Blueberry Non-fat Yogurt', 'French Vanilla Half+Half', 'Spring Water', 'Organic Iceberg Lettuce', 'Organic Granny Smith Apple', 'Organic Roma Tomato', 'Pico De Gallo', 'No Salt Added Black Beans', 'Organic Sour Cream', 'Brussels Sprouts', 'Organic Rainbow Carrots', "S'mores Chocolate Ice Cream", 'Organic Cucumber', 'Organic Sliced Mozzarella', 'Organic California Style Sprouted Bread']
['Spring Water', 'Cucumber Kirby', 'Organic Gala Apples', 'Organic Granny Smith Apple', 'Icelandic Style Skyr Blueberry Non-fat Yogurt', 'Non Fat Raspberry Yogurt', 'Vanilla Skyr Nonfat Yogurt', 'Organic Kiwi', 'Organic Grape Tomatoes', 'Nonfat Icelandic Style Strawberry Yogurt']
['Mexican Finely Shredded Cheese', 'Organic Unsweetened Black Tea', 'Organic Hot Salsa', 'Soft Taco Size White Flour Tortillas']
['Strawberries', 'Organic Yellow Onion', 'Organic Garlic', 'Large Alfresco Eggs', 'Raspberries', 'Organic Garnet Sweet Potato (Yam)', 'Organic Grape Tomatoes', 'Fat Free Milk', 'Icelandic Style Skyr Blueberry Non-fat Yogurt', 'Organic Small Bunch Celery']
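The product lists above carry no labels, so how they relate to each other is not stated. Assuming, purely for illustration, that consecutive lists form a pair to be compared, one simple way to quantify how much two baskets share is the Dice coefficient, twice the number of common items divided by the total size of both lists. The sketch below applies it to the first two lists; the pairing and the function name `dice_overlap` are assumptions, not something the source confirms.

```python
# The first two product lists shown above.
FIRST = [
    "Soda", "Organic String Cheese", "0% Greek Strained Yogurt",
    "XL Pick-A-Size Paper Towel Rolls", "Milk Chocolate Almonds", "Pistachios",
    "Cinnamon Toast Crunch", "Aged White Cheddar Popcorn", "Organic Whole Milk",
    "Organic Half & Half", "Zero Calorie Cola",
]
SECOND = [
    "Soda", "0% Greek Strained Yogurt", "Clementines", "Bag of Organic Bananas",
    "Organic Half & Half", "Apples", "Zero Calorie Cola",
    "Crunchy Oats 'n Honey Granola Bars", "Extra Fancy Unsalted Mixed Nuts",
    "Reduced Fat 2% Milk",
]


def dice_overlap(a, b):
    """Dice coefficient between two product lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty baskets
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


common = sorted(set(FIRST) & set(SECOND))
score = dice_overlap(FIRST, SECOND)
print(common)
print(round(score, 3))
```

If one list of a pair were a prediction and the other the actual basket, this same quantity would equal the F1 score of that prediction.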