How to remove duplicates in pandas?

Updated: August 22, 2023

Reading time: 10.26 minutes

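To remove duplicate rows from a pandas DataFrame, use the drop_duplicates() method. It returns a new DataFrame in which only the first occurrence of each set of duplicate rows is kept; the original DataFrame is not modified. By default, rows are compared across all columns, but you can pass one or more column names to compare on those columns only. Here is an example: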

import pandas as pd

# Create a DataFrame with duplicate rows
df = pd.DataFrame({'col1': ['A', 'B', 'A'], 'col2': [1, 2, 1]})

# Remove duplicates based on the col1 and col2 columns
df = df.drop_duplicates(subset=['col1', 'col2'])

# Print the new DataFrame
print(df)

In this code, the third row ('A', 1) repeats the first, so drop_duplicates() removes it and keeps the rows at index 0 and 1. The resulting DataFrame contains only unique combinations of col1 and col2.
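As a minimal sketch of how the choice of columns changes the result, the example below compares rows on col1 only and uses the keep parameter to choose which occurrence survives. The sample values and the variable names first and last are assumptions made for illustration.

```python
import pandas as pd

# Two rows share col1 == 'A' but differ in col2
df = pd.DataFrame({'col1': ['A', 'B', 'A'], 'col2': [1, 2, 3]})

# Compare on col1 only; the default keep='first' keeps the earliest 'A' row
first = df.drop_duplicates(subset=['col1'])

# keep='last' keeps the latest 'A' row instead
last = df.drop_duplicates(subset=['col1'], keep='last')

print(first)
print(last)
```

Comparing on col1 and col2 together would keep all three rows here, because no two rows match in both columns.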

If you want to remove duplicates based on all columns, you can call drop_duplicates() without any arguments:

# Remove duplicates based on all columns
df = df.drop_duplicates()

# Print the new DataFrame
print(df)

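With no arguments, drop_duplicates() compares every column, so a row is dropped only when it matches an earlier row in all columns. Because df was already deduplicated in the previous step, this call leaves it unchanged here.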
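To see the all-columns behavior on fresh data, here is a small self-contained sketch. The sample values and the names deduped and renumbered are assumptions for illustration; ignore_index=True is an optional parameter (available since pandas 1.0) that renumbers the result from 0.

```python
import pandas as pd

# Row 2 repeats row 0 in every column; row 3 shares col1 but not col2
df = pd.DataFrame({'col1': ['A', 'B', 'A', 'A'], 'col2': [1, 2, 1, 3]})

# Only the fully identical row (index 2) is dropped
deduped = df.drop_duplicates()

# Same rows, but with the index renumbered 0..n-1
renumbered = df.drop_duplicates(ignore_index=True)

print(deduped)
print(renumbered)
```

Note that the surviving rows keep their original index labels unless ignore_index=True is passed.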
