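import pandas as pd
from sklearn.impute import SimpleImputer

imputer = SimpleImputer(strategy='median')

# for the entire df (strategy='median' requires every column to be numeric)
df = pd.DataFrame(imputer.fit_transform(df), columns=df.columns, index=df.index)

# for a specific column
# 'median' fails on string data; if 'Embarked' holds strings, use 'most_frequent' instead
cat_imputer = SimpleImputer(strategy='most_frequent')
df['Embarked'] = cat_imputer.fit_transform(df[['Embarked']]).ravel()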
Here is what the above code is doing:
1. We create an instance of the SimpleImputer class.
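2. We set the imputation strategy to 'median', so each column's missing values are filled with that column's median.
3. We fit the imputer to the df, which computes one median per column.
4. We transform the df by replacing missing values with the learned medians. fit_transform performs steps 3 and 4 in a single call.
5. fit_transform returns a NumPy array, so we wrap the result in a new df and reuse the original column names.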
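
The steps above can be sketched on a small toy frame, with fit and transform called separately so that each step is visible. This is a minimal sketch, not part of the original answer: the data and the column names "Age" and "Fare" are made up for illustration, and it assumes every column is numeric.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data; the column names and values are illustrative only.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, 40.0],
    "Fare": [7.0, 8.0, np.nan, 100.0],
})

# Steps 1-2: create the imputer and choose the median strategy.
imputer = SimpleImputer(strategy="median")

# Step 3: fit computes one median per column, ignoring missing values.
imputer.fit(df)

# Step 4: transform fills the missing values and returns a NumPy array.
filled = imputer.transform(df)

# Step 5: wrap the array in a DataFrame, keeping the column names and index.
df_filled = pd.DataFrame(filled, columns=df.columns, index=df.index)

print(imputer.statistics_)  # the per-column medians learned in step 3
print(df_filled)
```

Calling fit and transform separately is mainly useful when the medians should be learned on one dataset (for example a training split) and then applied unchanged to another.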