How to remove duplicate lines from a text file using Linux command line?

Updated: August 22, 2023

To remove duplicate lines from a text file using Linux command line, you can use the sort and uniq commands together. Here’s the command:

sort file.txt | uniq > output_file.txt

This command sorts the lines in file.txt and then passes them to the uniq command, which removes any duplicate lines. The result is then written to output_file.txt.
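The same result can be obtained in a single step with sort's -u option, which de-duplicates while sorting. A quick demonstration (the file names here are just examples):

```shell
# Create a small sample file with duplicate lines.
printf 'banana\napple\nbanana\ncherry\napple\n' > file.txt

# sort -u sorts and removes duplicates in one step,
# equivalent to `sort file.txt | uniq`.
sort -u file.txt > output_file.txt

cat output_file.txt
# apple
# banana
# cherry
```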

Alternatively, you can use the awk command to achieve the same result:

awk '!seen[$0]++' file.txt > output_file.txt

This command uses an awk associative array named “seen” to count how many times each line has appeared. The pattern !seen[$0]++ is true only the first time a line occurs, so each line is printed exactly once. Unlike the sort-based approach, this preserves the original order of the lines.
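Running the awk one-liner on the same kind of sample file shows the order-preserving behavior:

```shell
# Sample file with duplicates scattered through it.
printf 'banana\napple\nbanana\ncherry\napple\n' > file.txt

# Keep only the first occurrence of each line, in original order.
awk '!seen[$0]++' file.txt > output_file.txt

cat output_file.txt
# banana
# apple
# cherry
```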

Note that uniq only removes adjacent duplicate lines, so the sort step can be skipped only when duplicates are already next to each other — for example, when the file is already sorted.

Also, be careful when using these commands on large files. The awk approach keeps every unique line in memory, which can exhaust RAM on very large inputs; sort spills to temporary files on disk instead, but may take a long time to run. In such cases, you may need to tune these tools or use more specialized scripts to remove duplicates efficiently.
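For very large files, one option is to tune GNU sort, which already does an external (disk-backed) merge sort. A sketch, assuming GNU coreutils (the file names are placeholders):

```shell
# -S caps the in-memory sort buffer; -T chooses where temporary
# spill files are written (pick a disk with enough free space).
# -u removes duplicates during the merge, so no separate uniq pass.
sort -u -S 512M -T /tmp bigfile.txt > output_file.txt
```

Because the heavy lifting happens in temporary files rather than RAM, this scales to inputs far larger than available memory.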

Blog Author

Passionate about technology, enjoys sharing, and keeps learning. Focused on web development, system architecture design, and artificial intelligence.