
How to Remove Non-ASCII Characters from a String in Python

Updated: August 22, 2023

To remove non-ASCII characters from a string in Python, you can use a regular expression or filter the string against the string.printable constant. Here are two examples:

Using regular expressions:

import re

my_string = "Héllo wörld!"
# Drop every run of characters outside the ASCII range (code points 0-127).
my_string = re.sub(r'[^\x00-\x7F]+', '', my_string)

print(my_string)

In this example, re.sub() replaces every run of characters outside the range \x00-\x7F (code points 0 to 127, the ASCII range) with an empty string. Because é and ö are deleted rather than converted to e and o, the output is “Hllo wrld!”.
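Deleting non-ASCII characters drops accented letters entirely. If the base letters should survive instead, one alternative is to decompose the text with unicodedata.normalize() and then discard whatever still cannot be encoded as ASCII. This is a different technique from the regular expression above, shown here only as a minimal sketch; the helper name to_ascii is hypothetical and not part of any library.

```python
import unicodedata


def to_ascii(text):
    """Hypothetical helper: strip accents, then drop remaining non-ASCII."""
    # NFKD splits characters such as "é" into a base letter "e"
    # followed by a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # errors="ignore" silently discards anything that is not ASCII,
    # which includes the combining marks produced above.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("H\u00e9llo w\u00f6rld!"))  # Hello world!
print(to_ascii("Stra\u00dfe"))             # Strae
```

Characters with no ASCII decomposition, such as ß in the second call, are still removed, so this approach only helps for letters that are an ASCII letter plus a diacritic.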

Using string.printable:

import string

my_string = "Héllo wörld!"
# Keep only characters that appear in string.printable (an ASCII-only set).
my_string = ''.join(filter(lambda x: x in string.printable, my_string))

print(my_string)

In this example, string.printable is a string containing the ASCII characters that Python considers printable: digits, letters, punctuation, and whitespace. filter() keeps only the characters in my_string that appear in string.printable. The output is also “Hllo wrld!”.

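Note that the two methods are not equivalent. string.printable contains only ASCII characters, so printable non-ASCII characters such as é are still removed. The second method also removes ASCII control characters such as \x00 and \x1b (escape), which the regular expression keeps. Tab, newline, and the other characters in string.whitespace are kept by both.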
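The difference between the two methods shows up on input that contains ASCII control characters. The following sketch runs both on the same string; the sample value and the variable names are made up for illustration.

```python
import re
import string

# Sample with two accented letters, a tab, a NUL byte and an escape character.
sample = "H\u00e9llo\tw\u00f6rld!\x00\x1b"

# Method 1: remove everything outside the ASCII range.
by_regex = re.sub(r'[^\x00-\x7F]+', '', sample)

# Method 2: keep only characters listed in string.printable.
by_printable = ''.join(ch for ch in sample if ch in string.printable)

print(repr(by_regex))      # 'Hllo\twrld!\x00\x1b'
print(repr(by_printable))  # 'Hllo\twrld!'
```

Both results keep the tab, but only the string.printable filter drops the NUL and escape characters, so the choice depends on whether control characters should survive.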
