
How to extract a numeric value from a string column in a Pandas DataFrame?

Updated: 2023-08-22

To extract a numeric value from a string column in a Pandas DataFrame, you can use regular expressions with the str.extract() method. Here is an example:

import pandas as pd

# Create a sample DataFrame with a column of strings containing numeric values
data = {'value': ['10 units', '5.6 kg', '22.5%', '3']}
df = pd.DataFrame(data)

# Extract the numeric values from the 'value' column
# Raw string avoids invalid-escape warnings; expand=False returns a Series
df['value_numeric'] = df['value'].str.extract(r'(\d+\.?\d*)', expand=False).astype(float)

# Print the updated DataFrame
print(df)

This code will output the following DataFrame with a new ‘value_numeric’ column containing the extracted numeric values:

      value  value_numeric
0  10 units           10.0
1    5.6 kg            5.6
2     22.5%           22.5
3         3            3.0

In this example, the pattern \d+\.?\d* matches one or more digits, optionally followed by a decimal point and further digits. str.extract() returns the first match in each string (and NaN when nothing matches), and astype(float) converts the extracted strings to floating-point numbers.
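To see what the pattern does and does not capture, it can help to try it on a few strings with Python's built-in re module. The sketch below assumes that str.extract() behaves like re.search() in returning only the first match; the helper name first_number and the extra sample strings are illustrative and not part of the original example.

```python
import re

# Same pattern as in the DataFrame example above
PATTERN = re.compile(r'(\d+\.?\d*)')

def first_number(text):
    """Return the first substring matched by the pattern, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

# Hypothetical sample strings, including a few awkward cases
samples = ['10 units', '5.6 kg', '22.5%', '3', 'n/a', '3 of 12', '-4.5 C', '.5 l']
results = {text: first_number(text) for text in samples}

for text, found in results.items():
    print(f'{text!r:12} -> {found!r}')
```

Note the edge cases: 'n/a' yields no match, '3 of 12' yields only the first number, '-4.5 C' loses its sign, and '.5 l' is read as '5' because the pattern requires a digit before the decimal point.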

Note that depending on the format of the strings in your DataFrame, you may need to adjust the regular expression pattern to correctly capture the numeric values.
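As one possible adjustment, the sketch below extends the pattern to accept an optional sign, a leading decimal point, and comma thousands separators. The sample data and the pattern itself are assumptions chosen for illustration; your own strings may call for something different.

```python
import pandas as pd

# Hypothetical strings with signs, a leading decimal point,
# thousands separators, and a value with no number at all
raw = pd.Series(['-4.5 C', '+12 dB', '.75 l', '1,250.50 USD', 'n/a'])

# Optional sign, then either digits (commas allowed) with an optional
# fraction, or a bare fraction such as ".75"
pattern = r'([-+]?(?:\d[\d,]*\.?\d*|\.\d+))'

extracted = raw.str.extract(pattern, expand=False)

# Drop thousands separators before converting to numbers
cleaned = extracted.str.replace(',', '', regex=False)

# errors='coerce' turns anything unparseable into NaN instead of raising
numeric = pd.to_numeric(cleaned, errors='coerce')
print(numeric)
```

Using pd.to_numeric() with errors='coerce' instead of astype(float) is a design choice here: rows without a usable number become NaN rather than raising an error during conversion.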
