I. Description
Calling astype on a pandas DataFrame raised the following error:
Traceback (most recent call last):
  File "pull_dsp_data.py", line 130, in <module>
    run(args)
  File "pull_dsp_data.py", line 106, in run
    tidbDF = get_tidb_data(args,official_conn)
  File "pull_dsp_data.py", line 34, in get_tidb_data
    return dspDF.astype(dtype=base_columns)
  File "/usr/local/lib/python2.7/dist-packages/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 4990, in astype
    results.append(col.astype(dtype[col_name], copy=copy))
  File "/usr/local/lib/python2.7/dist-packages/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 5001, in astype
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3714, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3581, in apply
    applied = getattr(b, f)(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 575, in astype
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 664, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/dtypes/cast.py", line 702, in astype_nansafe
    raise ValueError('Cannot convert non-finite values (NA or inf) to '
ValueError: Cannot convert non-finite values (NA or inf) to integer
II. Analysis
1. Printing the DataFrame shows that one value in the click_count column is NaN:
      advertiser_id  click_count       day  impr_count       revenue  \
0  adv3427130351616        219.0  20200809       40637   5.737697600
1  adv3282638354816         19.0  20200809        5951   0.687649090
2  adv3427187237632          1.0  20200809           8   0.001196250
3  adv3282638354816         56.0  20200808       10165   1.188067360
4  adv3427130351616        262.0  20200808       75339  10.952638380
5  adv3427187237632          NaN  20200808          83   0.009640220
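np.nan and np.inf are both float values with no integer equivalent, so astype cannot convert a column that contains them to int and raises the ValueError above. This is also why click_count prints as 219.0, 19.0 and so on: the single NaN forces the whole column to a float dtype.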
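The failure can be reproduced in isolation. The following is a minimal sketch, assuming a small hand-made DataFrame (the values are illustrative, loosely modeled on the output above, not the real query result). It shows that the column is float because of the NaN, that casting it to int raises ValueError, and how to list the offending rows with np.isfinite.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modeled on the printed result above.
df = pd.DataFrame({
    "advertiser_id": ["adv3427130351616", "adv3427187237632"],
    "click_count": [262.0, np.nan],
})

# The NaN forces click_count to a float dtype.
column_dtype = str(df["click_count"].dtype)
print(column_dtype)

# Casting the column to int fails because NaN has no integer representation.
try:
    df.astype({"click_count": "int64"})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Locate the rows whose click_count is NaN or +/-inf.
bad_rows = df[~np.isfinite(df["click_count"])]
print(bad_rows)
```

Checking for non-finite rows before the cast makes it easier to decide whether replacing them with 0 is acceptable for the data at hand.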
III. Solution
Replace NaN or inf with 0 before calling astype, as follows:
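import numpy as np

# Option 1: replace NaN with 0 (fillna does not touch inf)
df1 = df.fillna(0)
# Option 2: replace NaN and +/-inf with 0
# Do not combine assignment with inplace=True: replace() then returns None
df1 = df.replace([np.nan, np.inf, -np.inf], 0)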
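Putting the replacement and the cast together, a minimal sketch might look like the following. The helper name cast_with_zero_fill and the sample data are hypothetical and not part of the original script; the sketch only assumes that a missing click count can reasonably be treated as 0.

```python
import numpy as np
import pandas as pd

def cast_with_zero_fill(df, int_columns):
    """Hypothetical helper: replace NaN/inf with 0 in the given columns,
    then cast those columns to int64. The input DataFrame is not modified."""
    cleaned = df.copy()
    for col in int_columns:
        # Turn +/-inf into NaN first so that a single fillna covers both cases.
        cleaned[col] = cleaned[col].replace([np.inf, -np.inf], np.nan).fillna(0)
    return cleaned.astype({col: "int64" for col in int_columns})

# Illustrative data, not the real query result.
df = pd.DataFrame({
    "advertiser_id": ["adv3427130351616", "adv3427187237632"],
    "click_count": [262.0, np.nan],
})

result = cast_with_zero_fill(df, ["click_count"])
print(result)
print(result.dtypes)
```

Limiting the replacement to the columns that are about to be cast leaves missing values in other columns untouched, which a DataFrame-wide fillna(0) would not.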
Note: This article belongs to the author and may not be reproduced without the author's permission.