I've been investigating an issue with one of our application queries running very slowly and it seems to be related to a particular view we're using.
Here is the full query with the selected columns omitted (there are 408 columns being selected)
SELECT `skus`.`yom_sku`
-- omitted columns (408)
FROM `skus`
LEFT JOIN `product_families_view`
ON `product_families_view`.`product_id` = `skus`.`yom_sku`
LEFT JOIN `product_profit_projection_view`
ON `product_profit_projection_view`.`product_id` =
`skus`.`yom_sku`
LEFT JOIN `product_media_types_view`
ON `product_media_types_view`.`product_id` = `skus`.`yom_sku`
LEFT JOIN `product`
ON `product`.`yom_sku` = `skus`.`yom_sku`
LEFT JOIN `product_listings`
ON `product_listings`.`asin` = `skus`.`asin`
AND `product_listings`.`seller_sku` = `skus`.`upc`
LEFT JOIN `analyst`
ON `analyst`.`yom_sku` = `skus`.`yom_sku`
LEFT JOIN `collection`
ON `collection`.`yom_sku` = `skus`.`yom_sku`
LEFT JOIN `supplier_pricing`
ON `supplier_pricing`.`yom_sku` = `skus`.`yom_sku`
LEFT JOIN `amazon_order`
ON `amazon_order`.`yom_sku` = `skus`.`yom_sku`
LEFT JOIN `amazon_pricing`
ON `amazon_pricing`.`yom_sku` = `skus`.`yom_sku`
WHERE `product`.`retired` = false
ORDER BY `skus`.`upc` ASC
LIMIT 100
The query is quite large, but one view in particular (the 'product_media_types_view') is taking a significant amount of read resources for some reason. This is particularly odd because it's nearly identical in structure to another view called "product_families_view" and there are no apparent indexing issues.
Here are the two views for comparison:
-- pricetooldb.product_families_view source
CREATE OR REPLACE
ALGORITHM = UNDEFINED VIEW `pricetooldb`.`product_families_view` AS
select
`pfp`.`product_id` AS `product_id`,
`pfp`.`part_number` AS `part_number`,
group_concat(`pf`.`family` order by `pf`.`family` ASC separator ', ') AS `families`
from
(`pricetooldb`.`product_families` `pf`
join `pricetooldb`.`product_families_products` `pfp` on
((`pfp`.`family_id` = `pf`.`id`)))
group by
`pfp`.`product_id`;
-- pricetooldb.product_media_types_view source
CREATE OR REPLACE
ALGORITHM = UNDEFINED VIEW `pricetooldb`.`product_media_types_view` AS
select
`pmt`.`product_id` AS `product_id`,
group_concat(`mt`.`name` order by `mt`.`name` ASC separator ' ') AS `media_types`
from
(`pricetooldb`.`media_types` `mt`
join `pricetooldb`.`product_media_types` `pmt` on
((`pmt`.`media_type_id` = `mt`.`id`)))
group by
`pmt`.`product_id`;
I've run EXPLAIN EXTENDED on the problem SQL query above and found the product_media_types_view takes significantly more resources than the highly similar product_families_view
[
{
"table": {
"table_name": "product_families_view",
"access_type": "ref",
"possible_keys": [
"<auto_key0>"
],
"key": "<auto_key0>",
"used_key_parts": [
"product_id"
],
"key_length": "4",
"ref": [
"pricetooldb.skus.yom_sku"
],
"rows_examined_per_scan": 10,
"rows_produced_per_join": 75142,
"filtered": "100.00",
"cost_info": {
"read_cost": "45073.14",
"eval_cost": "15028.53",
"prefix_cost": "165159.79",
"data_read_per_join": "4M"
},
"used_columns": [
"product_id",
"part_number",
"families"
],
"materialized_from_subquery": {
"using_temporary_table": true,
"dependent": false,
"cacheable": true,
"query_block": {
"select_id": 2,
"cost_info": {
"query_cost": "48405.37"
},
"grouping_operation": {
"using_temporary_table": true,
"using_filesort": true,
"cost_info": {
"sort_cost": "21756.26"
},
"nested_loop": [
{
"table": {
"table_name": "pf",
"access_type": "index",
"possible_keys": [
"PRIMARY"
],
"key": "product_families_family_unique",
"used_key_parts": [
"family"
],
"key_length": "257",
"rows_examined_per_scan": 2653,
"rows_produced_per_join": 2653,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "11.00",
"eval_cost": "530.60",
"prefix_cost": "541.60",
"data_read_per_join": "683K"
},
"used_columns": [
"id",
"family"
]
}
},
{
"table": {
"table_name": "pfp",
"access_type": "ref",
"possible_keys": [
"product_entry_unique",
"product_families_products_family_id_foreign"
],
"key": "product_families_products_family_id_foreign",
"used_key_parts": [
"family_id"
],
"key_length": "4",
"ref": [
"pricetooldb.pf.id"
],
"rows_examined_per_scan": 8,
"rows_produced_per_join": 21756,
"filtered": "100.00",
"cost_info": {
"read_cost": "21756.26",
"eval_cost": "4351.25",
"prefix_cost": "26649.11",
"data_read_per_join": "1M"
},
"used_columns": [
"id",
"product_id",
"family_id",
"part_number"
]
}
}
]
}
}
}
}
},
{
"table": {
"table_name": "product_media_types_view",
"access_type": "ref",
"possible_keys": [
"<auto_key0>"
],
"key": "<auto_key0>",
"used_key_parts": [
"product_id"
],
"key_length": "4",
"ref": [
"pricetooldb.skus.yom_sku"
],
"rows_examined_per_scan": 10,
"rows_produced_per_join": 1214678,
"filtered": "100.00",
"cost_info": {
"read_cost": "728729.15",
"eval_cost": "242935.67",
"prefix_cost": "1321922.89",
"data_read_per_join": "27M"
},
"used_columns": [
"product_id",
"media_types"
],
"materialized_from_subquery": {
"using_temporary_table": true,
"dependent": false,
"cacheable": true,
"query_block": {
"select_id": 5,
"cost_info": {
"query_cost": "122384.12"
},
"grouping_operation": {
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "p",
"access_type": "index",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"yom_sku"
],
"key_length": "4",
"rows_examined_per_scan": 46669,
"rows_produced_per_join": 46669,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "867.00",
"eval_cost": "9333.80",
"prefix_cost": "10200.80",
"data_read_per_join": "74M"
},
"used_columns": [
"yom_sku"
]
}
},
{
"table": {
"table_name": "pmt",
"access_type": "ref",
"possible_keys": [
"product_media_types_product_id_media_type_id_unique"
],
"key": "product_media_types_product_id_media_type_id_unique",
"used_key_parts": [
"product_id"
],
"key_length": "5",
"ref": [
"pricetooldb.p.yom_sku"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 46795,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "46669.22",
"eval_cost": "9359.16",
"prefix_cost": "66229.17",
"data_read_per_join": "731K"
},
"used_columns": [
"id",
"product_id",
"media_type_id"
]
}
},
{
"table": {
"table_name": "mt",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"pricetooldb.pmt.media_type_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 46795,
"filtered": "100.00",
"cost_info": {
"read_cost": "46795.79",
"eval_cost": "9359.16",
"prefix_cost": "122384.12",
"data_read_per_join": "731K"
},
"used_columns": [
"id",
"name"
]
}
}
]
}
}
}
}
}
]
It isn't clear to me why. I'll provide the 'product', 'product_media_types', and 'media_types' DDLs below (with irrelevant columns from 'product' redacted)
-- pricetooldb.product definition
CREATE TABLE `product` (
`yom_sku` int(11) unsigned NOT NULL,
PRIMARY KEY (`yom_sku`),
CONSTRAINT `yom_sku_prod_fk` FOREIGN KEY (`yom_sku`) REFERENCES `skus` (`yom_sku`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- pricetooldb.product_media_types definition
CREATE TABLE `product_media_types` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`product_id` int(11) unsigned DEFAULT NULL,
`media_type_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `product_media_types_product_id_media_type_id_unique` (`product_id`,`media_type_id`),
KEY `product_media_types_media_type_id_foreign` (`media_type_id`),
CONSTRAINT `product_media_types_media_type_id_foreign` FOREIGN KEY (`media_type_id`) REFERENCES `media_types` (`id`) ON DELETE CASCADE,
CONSTRAINT `product_media_types_product_id_foreign` FOREIGN KEY (`product_id`) REFERENCES `product` (`yom_sku`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=46247 DEFAULT CHARSET=latin1;
-- pricetooldb.media_types definition
CREATE TABLE `media_types` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
Please let me know if you have any suggestions or ideas about why the "product_media_types_view" is so expensive.
Answers
The performance difference between the "product_families_view" and the "product_media_types_view" views could be due to several factors. Here are some potential reasons:
-
Data Volume: The "product_media_types_view" may have a significantly larger volume of data compared to the "product_families_view". This could result in longer processing times for queries involving the "product_media_types_view".
-
Join Complexity: The join conditions and structure of the underlying tables in the "product_media_types_view" may be more complex than those in the "product_families_view". This complexity can impact query performance, especially when dealing with large datasets.
-
Indexing: Check if the tables involved in the "product_media_types_view" have appropriate indexes on the columns used in join and filter conditions. Inefficient indexing can lead to slow query execution.
-
Subquery Performance: The subquery within the "product_media_types_view" may be less optimized compared to the subquery in the "product_families_view". Review the execution plans and indexes used in the subqueries to identify any potential optimization opportunities.
-
Data Distribution: Analyze the data distribution in the underlying tables of the "product_media_types_view". Skewed data distribution or data outliers can affect query performance.
To improve the performance of the "product_media_types_view", you can consider the following steps:
- Optimize the underlying table structures, indexes, and join conditions to reduce query execution time.
- Use appropriate indexing strategies to speed up data retrieval.
- Consider partitioning or sharding large tables to distribute the data more evenly.
- Rewrite the view definition or underlying queries to minimize unnecessary calculations or joins.
- Monitor and analyze query execution plans to identify bottlenecks and areas for optimization.
Additionally, using tools like MySQL EXPLAIN and profiling queries can provide valuable insights into query performance and help identify areas for improvement.