یافته های شخصی من

طبقه بندی موضوعی
کلمات کلیدی

How To Delete Duplicate Rows in PostgreSQL

سه شنبه, ۱۴ فروردين ۱۳۹۷، ۱۱:۰۳ ق.ظ

Summary: in this tutorial, you will learn how to use various techniques to delete duplicate rows in PostgreSQL.

Preparing sample data

First, create a new table named basket that stores fruits:

Second, insert some fruits into the basket table.

Third, query data from the basket table:

fruit table

As you can see, we have some duplicate rows such as 2 apples and 3 oranges in the basket table.

Finding duplicate rows

If the table has few rows, you can see which ones are duplicate immediately. However, it is not the case with the big table.

The find the duplicate rows, you use the following statement:

Deleting duplicate rows using DELETE USING statement

The following statement uses the DELETE USING statement to remove duplicate rows:

In this example, we joined the basket table to itself and checked if two different rows (a.id < b.id) have the same value in the fruit column.

Let’s query the basket table again to verify whether the duplicate rows were deleted:

delete duplicate rows in postgresql example - keep highest id

As you can see, the statement removed the duplicate rows with lowest ids and keep the one with the highest id.

If you want to keep the duplicate rows with the lowest id, you use just need to flip the operator in the WHERE clause:

To check whether the statement works correctly, let’s verify the data in the basket table:

Result:

delete duplicate rows in postgresql example - keep lowest id

Perfect! the duplicate rows with the lowest ids are retained.

Deleting duplicate rows using subquery

The following statement uses a suquery to delete duplicate rows and keep the row with the lowest id.

In this example, the subquery returned the duplicate rows except for the first row in the duplicate group. And the outer DELETE statement deleted the duplicate rows returned by the subquery.

If you want to keep the duplicate row with highest id, just change the order in the subquery:

In case you want to delete duplicate based on values of multiple columns, here is the query template:

In this case, the statement will delete all rows with duplicate values in the column_1 and column_2columns.

Deleting duplicate rows using an immediate table

To delete rows using an immediate table, you use the following steps:

  1. Create a new table with the same structure as the one whose duplicate rows should be removed.
  2. Insert distinct rows from the source table to the immediate table.
  3. Drop the source table.
  4. Rename the immediate table to the name of the source table.

The following illustrates the steps of removing duplicate rows from the basket table:

In this tutorial, you have learned how to delete duplicate rows in PostgreSQL using the DELETE USINGstatement, subquery, and the immediate table techniques.

  • علی امین زاده

نظرات  (۰)

هیچ نظری هنوز ثبت نشده است
ارسال نظر آزاد است، اما اگر قبلا در بیان ثبت نام کرده اید می توانید ابتدا وارد شوید.
شما میتوانید از این تگهای html استفاده کنید:
<b> یا <strong>، <em> یا <i>، <u>، <strike> یا <s>، <sup>، <sub>، <blockquote>، <code>، <pre>، <hr>، <br>، <p>، <a href="" title="">، <span style="">، <div align="">
تجدید کد امنیتی