I was solving Danny Ma's Pizza Runner SQL Case study. In one of the sections Pricing and Ratings, I got stuck in 2 questions and I'm unable to arrive at the right solution. Link to the case study: https://8weeksqlchallenge.com/case-study-2/
[cleaned_customer_orders table](https://i.stack.imgur.com/fEgXb.png)
[cleaned_runner_orders table](https://i.stack.imgur.com/sl6kd.png)
[extras table](https://i.stack.imgur.com/Ggm0B.png)
CREATE TABLE customer_orders (
"order_id" INTEGER,
"customer_id" INTEGER,
"pizza_id" INTEGER,
"exclusions" VARCHAR(4),
"extras" VARCHAR(4),
"order_time" TIMESTAMP
);
INSERT INTO customer_orders
("order_id", "customer_id", "pizza_id", "exclusions", "extras", "order_time")
VALUES
('1', '101', '1', '', '', '2020-01-01 18:05:02'),
('2', '101', '1', '', '', '2020-01-01 19:00:52'),
('3', '102', '1', '', '', '2020-01-02 23:51:23'),
('3', '102', '2', '', NULL, '2020-01-02 23:51:23'),
('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
('4', '103', '2', '4', '', '2020-01-04 13:23:46'),
('5', '104', '1', 'null', '1', '2020-01-08 21:00:29'),
('6', '101', '2', 'null', 'null', '2020-01-08 21:03:13'),
('7', '105', '2', 'null', '1', '2020-01-08 21:20:29'),
('8', '102', '1', 'null', 'null', '2020-01-09 23:54:33'),
('9', '103', '1', '4', '1, 5', '2020-01-10 11:22:59'),
('10', '104', '1', 'null', 'null', '2020-01-11 18:34:49'),
('10', '104', '1', '2, 6', '1, 4', '2020-01-11 18:34:49');
CREATE TABLE runner_orders (
"order_id" INTEGER,
"runner_id" INTEGER,
"pickup_time" VARCHAR(19),
"distance" VARCHAR(7),
"duration" VARCHAR(10),
"cancellation" VARCHAR(23)
);
INSERT INTO runner_orders
("order_id", "runner_id", "pickup_time", "distance", "duration", "cancellation")
VALUES
('1', '1', '2020-01-01 18:15:34', '20km', '32 minutes', ''),
('2', '1', '2020-01-01 19:10:54', '20km', '27 minutes', ''),
('3', '1', '2020-01-03 00:12:37', '13.4km', '20 mins', NULL),
('4', '2', '2020-01-04 13:53:03', '23.4', '40', NULL),
('5', '3', '2020-01-08 21:10:57', '10', '15', NULL),
('6', '3', 'null', 'null', 'null', 'Restaurant Cancellation'),
('7', '2', '2020-01-08 21:30:45', '25km', '25mins', 'null'),
('8', '2', '2020-01-10 00:15:02', '23.4 km', '15 minute', 'null'),
('9', '2', 'null', 'null', 'null', 'Customer Cancellation'),
('10', '1', '2020-01-11 18:50:20', '10km', '10minutes', 'null');
select order_id, customer_id, pizza_id,
case
when exclusions in ('null', '') then null
else exclusions
end as exclusions,
case
when extras in ('null', '') then null
else extras
end as extras,
order_time
into cleaned_customer_orders
from customer_orders
select
order_id,
runner_id,
cast(case
when pickup_time = 'null' then null
else pickup_time
end as timestamp) as pickup_time,
cast(case
when distance = 'null' then null
else trim('km' from distance)
end as float) as distance,
cast(case
when duration = 'null' then null
else substring(duration, 1, 2)
end as int)as duration,
case
when cancellation in ('null', '') then null
else cancellation
end as cancellation
into cleaned_runner_orders
from runner_orders;
select * from cleaned_customer_orders
select * from cleaned_runner_orders
alter table cleaned_customer_orders
add column record_id serial;
select * from cleaned_customer_orders;
-- to generate extras table
select
cco.record_id,
trim(e.value) as topping_id
into extras
from
cleaned_customer_orders as cco
cross join lateral unnest(string_to_array(cco.extras, ',')) as e(value);
Q1. Query
select
sum(case when cco.pizza_id = 1 then 12 else 10 end) as pizza_cost
from cleaned_customer_orders cco
join pizza_names pn on pn.pizza_id = cco.pizza_id
join cleaned_runner_orders cro on cro.order_id = cco.order_id
where cro.cancellation is null
Q2. Query
with cte_extras_cost as (select
sum(case when pn.pizza_id = 1 then 12 else 10 end) as pizza_cost,
count (e.topping_id)*1 as extras_cost
from cleaned_customer_orders cco
join pizza_names pn on pn.pizza_id = cco.pizza_id
left join extras e on e.record_id = cco.record_id
join cleaned_runner_orders cro on cro.order_id = cco.order_id
where cro.cancellation is null)
select pizza_cost, coalesce(extras_cost, 0) as extras_cost, (pizza_cost+coalesce(extras_cost, 0))
as total_cost
from cte_extras_cost;
Q5. Query
with cte as (
select
sum(cro.distance*0.30) as delivery_cost,
sum(case when cco.pizza_id = 1 then 12 else 10 end) as total_cost
from cleaned_runner_orders cro
join cleaned_customer_orders cco on cro.order_id = cco.order_id
where cro.cancellation is null)
select cte.total_cost, cte.delivery_cost, (cte.total_cost - cte.delivery_cost) as actual_cost
from cte
My output: Q1. 138 Q2. pizza_cost=150, extras_cost=4, total_cost=154 Q5. total_cost=138, delivery_cost=64, actual_cost=73
The correct answers are Q1. 138 Q2. pizza_cost=138, extras_cost=4, total_cost=142 Q5. total_cost=138, delivery_cost=43, actual_cost=94
Answers
Let's go through the questions one by one and identify the issues in your queries.
Q1. Query:
select
sum(case when cco.pizza_id = 1 then 12 else 10 end) as pizza_cost
from cleaned_customer_orders cco
join pizza_names pn on pn.pizza_id = cco.pizza_id
join cleaned_runner_orders cro on cro.order_id = cco.order_id
where cro.cancellation is null
This query calculates the total cost of pizzas. There doesn't seem to be any reference to the 'pizza_names' table, and it's not clear where the pizza prices are coming from. Assuming that pizza_id directly correlates with pizza cost, the logic to calculate the total pizza cost seems correct.
Q2. Query:
with cte_extras_cost as (
select
sum(case when pn.pizza_id = 1 then 12 else 10 end) as pizza_cost,
count(e.topping_id)*1 as extras_cost
from cleaned_customer_orders cco
join pizza_names pn on pn.pizza_id = cco.pizza_id
left join extras e on e.record_id = cco.record_id
join cleaned_runner_orders cro on cro.order_id = cco.order_id
where cro.cancellation is null
)
select
pizza_cost,
coalesce(extras_cost, 0) as extras_cost,
(pizza_cost + coalesce(extras_cost, 0)) as total_cost
from cte_extras_cost;
In this query, the pizza cost calculation logic seems correct. However, the extras cost calculation logic doesn't look accurate. You're counting the number of toppings and multiplying it by 1, which doesn't seem right. Instead, you should sum up the prices of the toppings. Assuming each topping has a price, you should join with the 'toppings' table (which is not provided in your data) and sum up the prices.
Q5. Query:
with cte as (
select
sum(cro.distance * 0.30) as delivery_cost,
sum(case when cco.pizza_id = 1 then 12 else 10 end) as total_cost
from cleaned_runner_orders cro
join cleaned_customer_orders cco on cro.order_id = cco.order_id
where cro.cancellation is null
)
select
cte.total_cost,
cte.delivery_cost,
(cte.total_cost - cte.delivery_cost) as actual_cost
from cte;
In this query, the logic to calculate the delivery cost seems correct. However, the total cost calculation logic seems to be similar to Q1, which doesn't seem right according to the expected output. You might need to review the calculation of the total cost.
For Q2 and Q5, you need additional information about the toppings' prices and more clarification on how to calculate the total cost. Once you have that information, you can adjust your queries accordingly.