Why does TensorFlow 2 give a warning (but still work anyway) when the input is a pandas DataFrame?

use*_*623 7 numpy dataframe pandas tensorflow tensorflow2.0

On TensorFlow 2.0, whenever I pass a pandas DataFrame as the input, TensorFlow works fine but prints a warning: WARNING:tensorflow:Falling back from v2 loop because of error: Failed to find data adapter that can handle input: <class 'pandas.core.frame.DataFrame'>, <class 'NoneType'>. I don't recall ever getting that warning with TF 1.x, so this must be new. But why is it a warning?

I understand what it's asking for, and yes, converting that DataFrame to a pure NumPy array does make the warning go away. But why does TF care? Despite the warning, it's clearly able to work fine with a DataFrame. Scikit-learn also expects a NumPy array, yet it works fine when you pass a DataFrame. TF 1.x worked fine with a DataFrame too. Pandas is incredibly common, so why does TF 2.0 pretend it can't handle it (even though it clearly can)? Is it just an efficiency thing, where TF didn't want to pay the cost of converting that DataFrame to a tf.data.Dataset? But TF is now asking me to do that conversion instead, so how is that any more efficient than just letting TF do the conversion itself? (And besides, surely the overhead of converting pandas input a single time at the start is negligible compared to the billions of multiplications during training?)

import tensorflow as tf
import numpy as np
import pandas as pd

#Make some fake data
df = pd.DataFrame()
NUM_ROWS = 1000
NUM_FEATURES = 50
random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
y = df.sum(axis=1) + np.random.normal(size=(NUM_ROWS))

model = tf.keras.Sequential([
            tf.keras.layers.Dense(40, input_dim=df.shape[1], activation='relu'),
            tf.keras.layers.Dense(1, activation='linear')
        ])
NUM_EPOCHS = 500

model.compile(optimizer='adam', loss='mean_squared_error')
hist = model.fit(df, y, epochs=1, verbose=0)  # This gives the warning (but still works fine anyway)

What is the purpose of that warning?

小智 1

I was able to reproduce your issue in TF 2.0. It was fixed in TensorFlow 2.1 by commit 617f788, dated 23 November 2019.

So please upgrade TensorFlow to version 2.1 or 2.2 and the issue will be resolved.

The working code is shown below:

!pip install tensorflow==2.2.0

import tensorflow as tf
import numpy as np
import pandas as pd

print(tf.__version__)

#Make some fake data
df = pd.DataFrame()
NUM_ROWS = 1000
NUM_FEATURES = 50
random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
y = df.sum(axis=1) + np.random.normal(size=(NUM_ROWS))

model = tf.keras.Sequential([
            tf.keras.layers.Dense(40, input_dim=df.shape[1], activation='relu'),
            tf.keras.layers.Dense(1, activation='linear')
        ])
NUM_EPOCHS = 500

model.compile(optimizer='adam', loss='mean_squared_error')

hist = model.fit(df, y, epochs=1, verbose=1)

Output:

2.2.0
Train on 1000 samples
1000/1000 [==============================] - 0s 411us/sample - loss: 49.0524

As you can see from the output, the warning no longer appears.
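
If upgrading is not an option, the workaround mentioned in the question (passing NumPy arrays instead of pandas objects) might look roughly like the sketch below. This is only an illustration and not part of the fix described above. The names X and y_arr are made up for this example, and the float32 dtype is an assumption chosen to match the Keras default. The model.fit call is left as a comment so that the conversion step can be run without TensorFlow installed.

```python
import numpy as np
import pandas as pd

NUM_ROWS = 1000
NUM_FEATURES = 50

# Same kind of fake data as in the question
random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
y = df.sum(axis=1) + np.random.normal(size=NUM_ROWS)

# Convert the DataFrame and the Series to plain NumPy arrays.
# X and y_arr are hypothetical names used only in this sketch.
X = df.to_numpy(dtype=np.float32)
y_arr = y.to_numpy(dtype=np.float32)

print(type(X), X.shape, X.dtype)
print(type(y_arr), y_arr.shape, y_arr.dtype)

# Then train on the arrays instead of the DataFrame, for example:
# hist = model.fit(X, y_arr, epochs=1, verbose=0)
```

The column names are lost in the conversion, so keep df around if you need to map features back to names later.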