Database Performance Improvement Story (EN)
- Pair room creation performed a full scan of all pair rooms to check for duplicate access codes.
- The problem was not noticeable with a small amount of data, but as the data grew it strained the server, and creation requests took a long time to respond.
- Similarly, link retrieval fetched all links into memory and then filtered them by pair room.
- As with pair room creation, this strained the server as the amount of dummy data increased.
- A bottleneck occurred during the execution of `findAll()`, which took over 5 minutes.
- An index was created on `PROVIDER_USER_ID` to optimize queries.
- Because of the issues above, we decided to check for and resolve database bottlenecks across all tables.
- Loaded Data: 5 million records
- Measurement Target: Query Time (sec)
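
The full-scan duplicate check described above can be contrasted with an existence query that lets the database do the filtering. This is only a minimal sketch: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the real database, the table is reduced to two columns, and the helper names `exists_by_full_scan` and `exists_by_query` are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the project's actual database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pair_room (id INTEGER PRIMARY KEY, access_code TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO pair_room (access_code) VALUES (?)",
    [(f"code-{i}",) for i in range(1000)],
)


def exists_by_full_scan(code):
    # Anti-pattern: load every row into memory, then compare in application code.
    rows = conn.execute("SELECT access_code FROM pair_room").fetchall()
    return any(row[0] == code for row in rows)


def exists_by_query(code):
    # Let the database filter; at most one row is returned to the application.
    row = conn.execute(
        "SELECT id FROM pair_room WHERE access_code = ? LIMIT 1", (code,)
    ).fetchone()
    return row is not None


print(exists_by_full_scan("code-42"), exists_by_query("code-42"))
```

Both helpers return the same answer, but the first one transfers and inspects every row, so its cost grows with the table, while the second can be served by an index on `access_code`.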
- Create Pair Room (pair_room)
  `INSERT INTO pair_room (access_code, created_at, driver, navigator, status, updated_at, id) VALUES (?, ?, ?, ?, ?, ?, default);`
- Replace Pair and Change Pair Room Status (pair_room)
  `UPDATE pair_room SET access_code=?, driver=?, navigator=?, status=?, updated_at=? WHERE id=?`
- Retrieve Pair Room Data by Specific Access Code (pair_room)
  `SELECT pair_room.id, pair_room.access_code, pair_room.created_at, pair_room.driver, pair_room.navigator, pair_room.status, pair_room.updated_at FROM pair_room WHERE pair_room.access_code = ?`
- Check Existence of Pair Room with Specific Access Code (pair_room)
  `SELECT pair_room.id FROM pair_room WHERE pair_room.access_code = ? LIMIT ?`
- Retrieve Pair Room-Member Reference Table Data by Specific Member ID (pair_room_member)
  `SELECT pair_room_member.id, pair_room_member.member_id, pair_room_member.pair_room_id FROM pair_room_member WHERE pair_room_member.member_id = ?`

The columns used in the query conditions for the pair room service are as follows:
- pair_room: `access_code`
- pair_room_member: `member_id`

Because `access_code` has a `UNIQUE` constraint and `member_id` has an `FK` constraint, both columns are already indexed, so we determined that separate index configuration was unnecessary. Running the retrieval queries confirms that the data is returned very quickly.
- Performance Test for Retrieving Pair Room Data by Specific Access Code (pair_room)
  - Query Time: `0.01 sec`
  - With the `Unique Index` already set, the retrieval performance is excellent.
- Performance Test for Retrieving Pair Room-Member Reference Table Data (pair_room_member)
  - Query Time: `0.00 sec`
  - With the `Index` already set, the retrieval performance is excellent.

It was decided not to set additional indexes for the tables related to the pair room service.
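
One way to see why the access-code lookup needs no extra index is to inspect the query plan on a table that declares the column `UNIQUE`. The sketch below again assumes an in-memory SQLite database as a stand-in, with a trimmed-down `pair_room` table; `query_plan` is a hypothetical helper. SQLite backs a `UNIQUE` constraint with an automatically created index, which the plan then names.

```python
import sqlite3

# Assumption: SQLite stand-in; only a subset of the pair_room columns is modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pair_room ("
    " id INTEGER PRIMARY KEY,"
    " access_code TEXT NOT NULL UNIQUE,"
    " status TEXT)"
)


def query_plan(sql, params=()):
    # The fourth column of each EXPLAIN QUERY PLAN row holds the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan = query_plan(
    "SELECT id, access_code, status FROM pair_room WHERE access_code = ?",
    ("abc123",),
)
print(plan)
```

The printed plan should report an index search rather than a table scan. Note that SQLite does not create an index for a foreign key on its own, so this sketch illustrates only the `UNIQUE` half of the reasoning.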
- Create Member (member)
  `SELECT m1_0.id, m1_0.access_token, m1_0.created_at, m1_0.provider_login_id, m1_0.profile_image, m1_0.updated_at, m1_0.provider_user_id, m1_0.user_name FROM member m1_0 WHERE m1_0.provider_user_id=?`
- Retrieve Specific Member Information by Provider User ID (member)
  `SELECT m1_0.id, m1_0.access_token, m1_0.created_at, m1_0.provider_login_id, m1_0.profile_image, m1_0.updated_at, m1_0.provider_user_id, m1_0.user_name FROM member m1_0 WHERE m1_0.provider_user_id=?`
- Check Existence of Specific Member by Provider User ID (member)
  `SELECT m1_0.id FROM member m1_0 WHERE m1_0.provider_user_id=? LIMIT ?`
- Check Existence of Specific Member by User ID (member)
  `SELECT m1_0.id FROM member m1_0 WHERE m1_0.user_id=? LIMIT ?`
- Check Existence of Non-Deleted Member Information by Provider User ID (member)
  `SELECT m1_0.id FROM member m1_0 WHERE m1_0.provider_user_id=? AND deleted_at IS NULL LIMIT ?`

The columns used in the query conditions for the member service are as follows:
- member: `user_id`, `provider_user_id`

We measured query times by running the same query before and after creating an index on the column used in the condition.
- The two queries used in the member service are almost identical, so only one was tested.
- The `user_id` column was excluded because it already has a unique constraint.

Performance Test for Retrieving Specific Member Information by Provider User ID (member)
![Before Index Configuration](https://github.com/user-attachments/assets/50c45547-d192-4139-831f-88d0218f9155)
- Time Taken: `1.45 sec`
![After Index Configuration](https://github.com/user-attachments/assets/6ccf25f0-37c6-4a95-adac-ad6cf650806d)
- Time Taken: `0.01 sec`
The query time dropped from `1.45 sec` to `0.01 sec`.

We decided to create an index on the `provider_user_id` column of the `member` table.
- The index produced a clear performance improvement.
- Because values in this column rarely change, the write overhead of maintaining the index is not a major concern.
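
The before/after comparison can be reproduced in miniature by checking how the query plan changes once the index exists. This is a sketch under assumptions, not the project's setup: an in-memory SQLite database stands in for the real one, the `member` table is cut down to three columns, and the index name `idx_member_provider_user_id` is hypothetical.

```python
import sqlite3

# Assumption: SQLite stand-in with a reduced member table and 10,000 dummy rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member ("
    " id INTEGER PRIMARY KEY,"
    " provider_user_id TEXT,"
    " user_name TEXT)"
)
conn.executemany(
    "INSERT INTO member (provider_user_id, user_name) VALUES (?, ?)",
    [(f"provider-{i}", f"user-{i}") for i in range(10_000)],
)

LOOKUP = (
    "SELECT id, provider_user_id, user_name "
    "FROM member WHERE provider_user_id = ?"
)


def query_plan(sql, params=()):
    # The fourth column of each EXPLAIN QUERY PLAN row holds the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = query_plan(LOOKUP, ("provider-42",))

# Hypothetical index name; the column is the one the document chose to index.
conn.execute(
    "CREATE INDEX idx_member_provider_user_id ON member (provider_user_id)"
)
plan_after = query_plan(LOOKUP, ("provider-42",))

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan is a scan of the whole table; with it, the plan becomes a search that names the new index. That matches the drop in query time measured above.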
- Retrieve all reference link data based on pair room ID (reference_link)
  `SELECT rl.id, rl.category_id, rl.created_at, rl.pair_room_id, rl.updated_at, rl.url FROM reference_link rl WHERE rl.pair_room_id=?`
- Delete a single reference link data based on reference link ID (reference_link)
  `DELETE FROM reference_link WHERE id=?`
- Retrieve all reference link category data based on pair room ID (category)
  `SELECT c.id, c.category_name, c.created_at, c.pair_room_id, c.updated_at FROM category c WHERE c.pair_room_id=?`
- Check the existence of a category based on pair room ID and category name (category)
  `SELECT c.id FROM category c WHERE c.category_name=? AND c.pair_room_id=? LIMIT ?`
- Check the existence of a category based on category ID and pair room ID (category)
  `SELECT c.id FROM category c WHERE c.id=? AND c.pair_room_id=? LIMIT ?`
- Delete a single category data based on category ID and pair room ID (category)
  `DELETE FROM category WHERE pair_room_id=? AND id=?`
- Retrieve all open graph data based on reference link ID (open_graph)
  `SELECT og.id, og.created_at, og.description, og.head_title, og.image, og.open_graph_title, og.reference_link_id, og.updated_at FROM open_graph og WHERE og.reference_link_id=?`
- Delete a single open graph data based on reference link ID (open_graph)
  `DELETE FROM open_graph WHERE reference_link_id=?`

The columns used for query conditions in the reference link service are as follows:
- reference_link
  - `id`
  - `pair_room_id`
- category
  - `pair_room_id`
  - `pair_room_id` & `category_name`
  - `id` & `pair_room_id`
- open_graph
  - `reference_link_id`

We considered an index only for the `category_name` column, because the other columns are already indexed as `PK` or `FK`.
Since `category_name` is never used alone in a condition but always alongside an `FK` value, we expected queries to be fast even without a separate index, so we ran an experiment.
- We tested only the query that filters on `pair_room_id` & `category_name`, since the other queries have similar conditions.
- Check the existence of a category based on pair room ID and category name (category)
  - Time taken for the query: `0.01 sec`

We confirmed that the query is fast when `category_name` is filtered together with the already indexed `pair_room_id`.
In conclusion, we decided not to create separate index settings for the tables related to the reference link service.
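
The reasoning for skipping a `category_name` index can be illustrated with a query plan. When the condition also includes `pair_room_id`, the database can narrow the rows through that index first and compare `category_name` only on the few rows left. The sketch assumes an in-memory SQLite stand-in and a reduced `category` table. SQLite does not index foreign keys automatically, so the index is created explicitly under the hypothetical name `idx_category_pair_room_id` to mirror the FK index the document relies on.

```python
import sqlite3

# Assumption: SQLite stand-in; the explicit index mimics an FK-backed index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    " id INTEGER PRIMARY KEY,"
    " pair_room_id INTEGER NOT NULL,"
    " category_name TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_category_pair_room_id ON category (pair_room_id)")
conn.executemany(
    "INSERT INTO category (pair_room_id, category_name) VALUES (?, ?)",
    [(room, f"category-{n}") for room in range(1, 101) for n in range(5)],
)

EXISTS_SQL = (
    "SELECT id FROM category "
    "WHERE category_name = ? AND pair_room_id = ? LIMIT 1"
)


def category_exists(name, pair_room_id):
    # True when the pair room already has a category with this name.
    row = conn.execute(EXISTS_SQL, (name, pair_room_id)).fetchone()
    return row is not None


# The plan shows which index (if any) serves the combined condition.
plan = " | ".join(
    row[3]
    for row in conn.execute("EXPLAIN QUERY PLAN " + EXISTS_SQL, ("category-3", 7))
)
print(plan)
print(category_exists("category-3", 7), category_exists("category-9", 7))
```

Because each pair room holds only a handful of categories, filtering `category_name` on the rows selected through `pair_room_id` stays cheap.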
- Retrieve all to-do data based on to-do ID (todo)
  `SELECT * FROM todo WHERE id = ?;`
- Retrieve all to-do data based on pair room ID in ascending order (todo)
  `SELECT * FROM todo td WHERE td.pair_room_id = ? ORDER BY td.sort ASC;`
- Retrieve the to-do with the largest sort value based on pair room ID (todo)
  `SELECT * FROM todo td WHERE td.pair_room_id = ? ORDER BY td.sort DESC LIMIT 1;`

The columns used for query conditions in the to-do service are as follows:
- todo
  - `id`
  - `pair_room_id`

Since all columns used in the conditions are already indexed as `PK` or `FK`, there is no need to consider additional index configurations.
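
As a small illustration of the "largest sort value" query, the sketch below runs it against a few rows and checks which index the plan uses. It assumes an in-memory SQLite stand-in, a reduced `todo` table with made-up rows, and an explicitly created index under the hypothetical name `idx_todo_pair_room_id`, which mirrors the FK index described above.

```python
import sqlite3

# Assumption: SQLite stand-in; the rows and the index name are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE todo ("
    " id INTEGER PRIMARY KEY,"
    " pair_room_id INTEGER NOT NULL,"
    " content TEXT,"
    " sort REAL NOT NULL)"
)
conn.execute("CREATE INDEX idx_todo_pair_room_id ON todo (pair_room_id)")
conn.executemany(
    "INSERT INTO todo (pair_room_id, content, sort) VALUES (?, ?, ?)",
    [
        (1, "write test", 1.0),
        (1, "refactor", 3.0),
        (1, "review", 2.0),
        (2, "deploy", 10.0),
    ],
)

LAST_TODO_SQL = (
    "SELECT id, content, sort FROM todo "
    "WHERE pair_room_id = ? ORDER BY sort DESC LIMIT 1"
)

# Descending order plus LIMIT 1 yields the row with the largest sort value.
last_todo = conn.execute(LAST_TODO_SQL, (1,)).fetchone()

plan = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + LAST_TODO_SQL, (1,))
)
print(last_todo)
print(plan)
```

The rows are located through the `pair_room_id` index and only that pair room's to-dos are sorted, which is why the sort step stays small as long as each pair room holds few to-dos.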
- Retrieve timer data based on pair room ID (timer)
  `SELECT te1_0.id, te1_0.created_at, te1_0.duration, te1_0.pair_room_id, te1_0.remaining_time, te1_0.updated_at FROM timer te1_0 WHERE te1_0.pair_room_id=?`
- Retrieve timer data based on pair room access code (timer, pair_room)
  `SELECT te1_0.id, te1_0.created_at, te1_0.duration, te1_0.pair_room_id, te1_0.remaining_time, te1_0.updated_at FROM timer te1_0 LEFT JOIN pair_room pre1_0 ON pre1_0.id=te1_0.pair_room_id WHERE pre1_0.access_code=?`
  `SELECT pre1_0.id, pre1_0.access_code, pre1_0.created_at, pre1_0.driver, pre1_0.navigator, pre1_0.status, pre1_0.updated_at FROM pair_room pre1_0 WHERE pre1_0.id=?`

The columns used for query conditions in the timer service are as follows:
- timer
  - `pair_room_id`

Since the `FK` is already indexed by default, we decided not to set an additional index.