A Guide to Upsert in MySQL

This post covers everything you need to know about upserts in MySQL.

What is upsert in MySQL?

Upsert, a portmanteau of "update" and "insert", is a database operation that inserts rows into a database table if they do not already exist, or updates them if they do. In MySQL, you typically do this with either the INSERT ... ON DUPLICATE KEY UPDATE syntax or the REPLACE INTO statement.

When should you use upsert?

Upsert is important when you want to avoid duplicate rows in your table while also making sure that the latest data is always present. It's especially useful in data synchronization tasks, bulk inserts, or when handling unique constraint violations.

Insert ... on duplicate key update

This is the most common approach for upsert operations in MySQL. It works by attempting an insert, and if a duplicate key error occurs (i.e., a row with the same primary or unique key already exists), it performs an update instead.

Syntax

INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...) ON DUPLICATE KEY UPDATE column1 = value1, column2 = value2, ...;

Example

INSERT INTO users (id, name, email) VALUES (1, 'John Doe', 'john@example.com') ON DUPLICATE KEY UPDATE name = 'John Doe', email = 'john@example.com';

In this example, if a user with id 1 already exists, their name and email will be updated.

Replace into

You can also do upserts in MySQL by using the REPLACE INTO statement. This approach first tries to insert a new row into the table. If a duplicate key error occurs, it deletes the old row and then inserts the new row.

Syntax

REPLACE INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

Example

REPLACE INTO users (id, name, email) VALUES (1, 'Jane Doe', 'jane@example.com');

Here, if a user with id 1 exists, they are deleted, and the new data is inserted.

How to determine fields affected by MySQL upsert

Using Information Functions

MySQL provides information functions that you can use to determine the effect of the last executed statement:

  • ROW_COUNT(): This function returns the number of rows affected by the last statement. In the context of upsert:
    • For INSERT ... ON DUPLICATE KEY UPDATE, it returns 1 if a row is inserted, and 2 if an existing row is updated.
    • For REPLACE INTO, it returns 1 for a new insert, and 2 if it replaces an existing row (one deletion and one insertion).

Examining the Last Insert ID

  • LAST_INSERT_ID(): Useful primarily with INSERT ... ON DUPLICATE KEY UPDATE, this function returns the AUTO_INCREMENT id of the last row inserted. If an update occurs instead of an insert, the function returns the AUTO_INCREMENT id of the updated row, if the table has an AUTO_INCREMENT column, otherwise 0.

Logging Affected Fields

For more detailed tracking:

  1. Triggers: Implementing triggers on your table can help log changes. You can create a trigger for both insert and update operations that logs the old and new values into a separate audit table.
  2. Application Logic: If the upsert is executed via an application, you can add logic to the application to check the current state of the row and log the changes accordingly.

Choosing between the two methods

You should use INSERT ... ON DUPLICATE KEY UPDATE if you need to preserve other column values that are not part of the unique key. This method only updates the specified columns and leaves others untouched.

Use REPLACE INTO when you want to replace the entire row with new data. But remember that this deletes the existing row and inserts a new one, which you might want to be careful of (changing auto-incremented keys, etc.)

A couple other things to consider

  • Make sure your table has a PRIMARY KEY or UNIQUE index. Upsert operations rely on these to determine duplicates.
  • For INSERT ... ON DUPLICATE KEY UPDATE, you can use the VALUES() function to refer to the new row’s values.
  • Performance varies between the two methods, especially for large datasets, due to the differences in how they handle existing rows.

Basedash and MySQL upsert operations

Basedash is a solid tool to manage operations on your MySQL database. It lets you:

  • View and edit data in your MySQL database with an admin panel that takes 2 minutes to build.
  • Share and manage database access with your team.
  • Write, optimize, and share SQL queries (including upsert operations!).
  • Create great charts and dashboards.

Invite only

We're building the next generation of data visualization.