FeatureDataCleaner

from eflow.data_pipeline_segments.feature_data_cleaner import FeatureDataCleaner

class FeatureDataCleaner(segment_id=None, create_file=True)

A multipurpose data cleaner.

drop_feature(df, df_features, feature_name, _add_to_que=True)

Drop a feature from the dataframe.

Args:

df: pd.DataFrame: Pandas DataFrame
df_features: DataFrameType from eflow: Organizes feature types into groups.
feature_name: string: Name of the feature in the dataframe
_add_to_que: bool: Pushes the function to the parent pipeline segment if set to ‘True’.
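
As a rough illustration of the effect on the dataframe itself, the plain-pandas sketch below drops a column by name. It does not call eflow: the helper `drop_column` and the sample data are hypothetical, and the real method presumably also updates `df_features` and queues the call on the pipeline segment.

```python
import pandas as pd


def drop_column(df: pd.DataFrame, feature_name: str) -> pd.DataFrame:
    """Hypothetical helper (not part of eflow): return a copy of df without feature_name."""
    if feature_name not in df.columns:
        raise KeyError(f"Feature '{feature_name}' is not in the dataframe")
    # DataFrame.drop returns a new frame; the input is left untouched.
    return df.drop(columns=[feature_name])


# Made-up sample data for illustration only.
df = pd.DataFrame({"age": [22, 35, 58], "cabin": ["C85", None, "E46"]})
cleaned = drop_column(df, "cabin")
```

Returning a copy keeps the original dataframe available for comparison; whether eflow modifies `df` in place is not stated in this reference.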

fill_nan_by_distribution(df, df_features, feature_name, percentile, z_score=None, _add_to_que=True)

Fill NaN values in a feature based on the distribution of its data.

Args:

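df: pd.DataFrame: Pandas DataFrame
df_features: DataFrameType from eflow: Organizes feature types into groups.
feature_name: string: Name of the feature in the dataframe
percentile: float or int
z_score:
_add_to_que: bool: Pushes the function to the parent pipeline segment if set to ‘True’.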
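
One plausible reading of "fill by distribution" is to replace missing values with a chosen percentile of the observed values. The sketch below shows that idea in plain pandas. It is an assumption, not eflow's implementation: the helper `fill_nan_with_percentile` is hypothetical, `percentile` is assumed to be on a 0 to 100 scale, and `z_score` is left out because its meaning is not documented here.

```python
import numpy as np
import pandas as pd


def fill_nan_with_percentile(df: pd.DataFrame, feature_name: str, percentile: float) -> pd.DataFrame:
    """Hypothetical helper (not part of eflow): fill NaNs with a percentile of the observed values."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile is assumed to lie in [0, 100]")
    # Series.quantile expects a fraction in [0, 1] and skips NaN values.
    fill_value = df[feature_name].quantile(percentile / 100)
    result = df.copy()
    result[feature_name] = result[feature_name].fillna(fill_value)
    return result


# Made-up sample data for illustration only.
df = pd.DataFrame({"fare": [10.0, np.nan, 20.0, 30.0]})
filled = fill_nan_with_percentile(df, "fare", 50)  # 50th percentile, i.e. the median
```

With `percentile=50` this reduces to a median fill; other percentiles shift the fill value toward the low or high end of the distribution.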

ignore_feature(df, df_features, feature_name, _add_to_que=True)

Ignore the given feature.

Args:

df: pd.DataFrame: Pandas DataFrame
df_features: DataFrameType from eflow: Organizes feature types into groups.
feature_name: string: Name of the feature in the dataframe
_add_to_que: bool: Pushes the function to the parent pipeline segment if set to ‘True’.

make_nan_assertions(df, df_features, feature_name, _add_to_que=True)

Make nan assertions for boolean features.

Args:

df: pd.DataFrame: Pandas DataFrame
df_features: DataFrameType from eflow: Organizes feature types into groups.
feature_name: string: Name of the feature in the dataframe
_add_to_que: bool: Pushes the function to the parent pipeline segment if set to ‘True’.

remove_nans(df, df_features, feature_name, _add_to_que=True)

Remove rows of data in which the given feature is NaN.

Args:

df: pd.DataFrame: Pandas DataFrame
df_features: DataFrameType from eflow: Organizes feature types into groups.
feature_name: string: Name of the feature in the dataframe
_add_to_que: bool: Pushes the function to the parent pipeline segment if set to ‘True’.
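
The row-level effect can be sketched in plain pandas with `DataFrame.dropna` restricted to one column. This is a minimal illustration rather than eflow's code: the helper `drop_rows_missing` and the sample data are hypothetical, and it assumes the method removes only rows where the named feature is missing.

```python
import numpy as np
import pandas as pd


def drop_rows_missing(df: pd.DataFrame, feature_name: str) -> pd.DataFrame:
    """Hypothetical helper (not part of eflow): drop rows where feature_name is NaN."""
    # subset limits the NaN check to one column; NaNs elsewhere are kept.
    return df.dropna(subset=[feature_name])


# Made-up sample data for illustration only.
df = pd.DataFrame({"age": [22.0, np.nan, 58.0], "name": ["a", "b", None]})
cleaned = drop_rows_missing(df, "age")
```

Note that the row with a missing `name` survives, because only `age` is inspected.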

run_widget(df, df_features, nan_feature_names=[])