I am using MinMaxScaler to scale each column, but the scaled result for every column is always all zeros. For example, in the code below, the values of df_test_1 after scaling are all zero. Even so, calling inverse_transform on these zeros still recovers the original values. Why are the scaled results all zero?
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df_dict={'A':[-1,-0.5,0,1],'B':[2,6,10,18]}
df_test=pd.DataFrame(df_dict)
print('original scale data')
print(df_test)
scaler_model_list=[]
df_test_1=df_test.copy()
for col in df_test.columns:
    scaler = MinMaxScaler()
    scaler_model_list.append(scaler) # save one scaler per column, since each is fitted differently and is needed for inverse_transform() later
    df_test_1.loc[:,col]=scaler.fit_transform(df_test_1.loc[:,col].values.reshape(1,-1))[0]
print('after finishing scaling')
print(df_test_1)
print('after inverse transformation')
print(scaler_model_list[0].inverse_transform(df_test_1.iloc[:,0].values.reshape(1,-1)))
print(scaler_model_list[1].inverse_transform(df_test_1.iloc[:,1].values.reshape(1,-1)))
original scale data
     A   B
0 -1.0   2
1 -0.5   6
2  0.0  10
3  1.0  18
after finishing scaling
     A  B
0  0.0  0
1  0.0  0
2  0.0  0
3  0.0  0
after inverse transformation
[[-1.  -0.5  0.   1. ]]
[[ 2.  6. 10. 18.]]
According to the MinMaxScaler documentation:
X : array-like of shape (n_samples, n_features)
    The data used to compute the per-feature minimum and maximum used for later scaling along the features axis.
When you reshape your data here:
df_test_1.loc[:,df_test.columns[1]].values.reshape(1,-1)
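you get 1 row with 4 columns (4 features with a single value each) instead of 1 column with 4 rows. Every feature then has its minimum equal to its maximum, so each value is scaled to 0. The scaler still stores each value as that feature's minimum, which is why inverse_transform can recover the original values from the zeros.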
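The difference between the two shapes can be checked directly. This is a minimal sketch using column B from the question; the variable names (values, as_row, as_column, row_scaler, col_scaler) are illustrative and not part of the original code.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.array([2, 6, 10, 18], dtype=float)  # column B from the question

as_row = values.reshape(1, -1)     # shape (1, 4): 1 sample, 4 features
as_column = values.reshape(-1, 1)  # shape (4, 1): 4 samples, 1 feature

# Fitting on a single row: each feature sees one value, so min == max
row_scaler = MinMaxScaler()
row_scaled = row_scaler.fit_transform(as_row)

# Fitting on a single column: the one feature sees all 4 values
col_scaler = MinMaxScaler()
col_scaled = col_scaler.fit_transform(as_column)

print(as_row.shape, as_column.shape)
print(row_scaled)            # all zeros
print(row_scaler.data_min_)  # the original values, kept as per-feature minima
print(col_scaled.ravel())    # properly scaled into [0, 1]
```

The data_min_ attribute shows where the original values are kept in the row-shaped case, which is what inverse_transform uses to restore them.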
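You can fix your code in one of the following ways.

Reshape each column into a single feature with 4 samples:
df_test_1.loc[:,col]=scaler.fit_transform(df_test_1.loc[:,col].values.reshape(-1,1))

Or select the column as a one-column DataFrame, which is already 2D:
df_test_1.loc[:,col]=scaler.fit_transform(df_test_1.loc[:,[col]])

Or drop the loop and scale all columns with a single scaler, since MinMaxScaler scales each feature (column) independently:
df_test_2=df_test.copy()
scaler = MinMaxScaler()
cols = df_test.columns
df_test_2.loc[:,cols]=scaler.fit_transform(df_test_2.loc[:,cols])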
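For completeness, here is a sketch of the per-column approach as a full round trip, assuming you still want one scaler per column. The scalers dict, df_scaled and df_restored are hypothetical names introduced for illustration. The scaled values go into a new DataFrame instead of being assigned into the integer column B in place.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df_test = pd.DataFrame({'A': [-1, -0.5, 0, 1], 'B': [2, 6, 10, 18]})

scalers = {}  # hypothetical helper: one fitted scaler per column name
df_scaled = pd.DataFrame(index=df_test.index)
for col in df_test.columns:
    scaler = MinMaxScaler()
    # df_test[[col]] is a one-column DataFrame, i.e. shape (n_samples, 1)
    df_scaled[col] = scaler.fit_transform(df_test[[col]]).ravel()
    scalers[col] = scaler

print('after scaling')
print(df_scaled)

# Undo the scaling column by column with the matching scaler
df_restored = pd.DataFrame(index=df_scaled.index)
for col, scaler in scalers.items():
    df_restored[col] = scaler.inverse_transform(df_scaled[[col]]).ravel()

print('after inverse transformation')
print(df_restored)
```

Keying the scalers by column name rather than by list position makes it harder to pair a column with the wrong scaler when calling inverse_transform later.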