eflow.data_pipeline_segments.data_encoder

Functions

get_parameters(f)

Get the parameters of a given function definition.

get_synonyms(word)
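
As a rough illustration of what "getting the parameters of a function definition" can look like, the sketch below uses the standard library's inspect module. The helper name list_parameters and the example function are hypothetical; this is not necessarily how get_parameters is implemented in eflow.

```python
import inspect


def list_parameters(f):
    """Return the parameter names of a callable, in declaration order."""
    return list(inspect.signature(f).parameters)


# Hypothetical function whose signature mimics the methods documented here.
def example(df, df_features, _add_to_que=True):
    pass


print(list_parameters(example))
```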

Classes

BOOL_STRINGS

alias of eflow._hidden.constants.Enum

DataEncoder([segment_id, create_file])

Attempts to convert features to the correct types.

DataPipelineSegment(object_type[, …])

Holds the function names and arguments to be pushed to a JSON file.

Exceptions

UnsatisfiedRequirments([error_message])

class DataEncoder(segment_id=None, create_file=True)

Attempts to convert features to the correct types. Will update the dataframe and df_features.

apply_value_representation(df, df_features, _add_to_que=True)

Translate features into their most understandable representation.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.

decode_data(df, df_features, apply_value_representation=True, _add_to_que=True)

Decode the data into non-numerical values for more descriptive analysis.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

apply_value_representation: bool

Whether to translate features into their most understandable representation.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.
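
To make the idea of decoding concrete, here is a minimal sketch in plain pandas that maps numeric codes back to descriptive labels. The column name, the codes, and the lookup table are all hypothetical, and the sketch does not show how eflow stores or applies its own mapping through df_features.

```python
import pandas as pd

# Hypothetical encoded data: the integers stand for category labels.
df = pd.DataFrame({"color": [0, 2, 1, 0]})

# Hypothetical code -> label lookup table.
color_labels = {0: "blue", 1: "green", 2: "red"}

# Work on a copy so the encoded frame is left untouched.
decoded = df.copy()
decoded["color"] = decoded["color"].map(color_labels)

print(decoded)
```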

encode_data(df, df_features, apply_value_representation=True, _add_to_que=True)

Encode the data into numerical values for machine learning processes.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

apply_value_representation: bool

Whether to translate features into their most understandable representation.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.
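
The kind of encoding described here can be sketched in plain pandas by assigning each label an integer code. The column name and the choice of sorted-label ordering are assumptions made for illustration; eflow may assign codes differently.

```python
import pandas as pd

# Hypothetical qualitative data.
df = pd.DataFrame({"color": ["red", "blue", "green", "blue"]})

# Build a label -> code lookup from the sorted unique labels.
labels = sorted(df["color"].unique())
label_to_code = {label: code for code, label in enumerate(labels)}

# Work on a copy so the original labels stay available.
encoded = df.copy()
encoded["color"] = encoded["color"].map(label_to_code)

print(label_to_code)
print(encoded)
```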

make_dummies(df, df_features, qualitative_features=[], _feature_values_dict=None, _add_to_que=True)

Create dummy features based on qualitative feature data and remove the original features.

Note

_feature_values_dict does not need to be initialized; it is used internally as a backend resource.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

qualitative_features: collection of strings

Feature names to convert the feature data into dummy features.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.
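
For readers unfamiliar with dummy features, the following sketch shows the general technique with pandas' own get_dummies. The data and column names are hypothetical, and the naming of the generated columns is pandas' convention, which may differ from what make_dummies produces.

```python
import pandas as pd

# Hypothetical data with one qualitative feature.
df = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 3]})
qualitative_features = ["color"]

# Each listed feature is replaced by one indicator column per distinct value;
# the original column is dropped from the result.
dummies_df = pd.get_dummies(df, columns=qualitative_features, prefix_sep="_")

print(list(dummies_df.columns))
```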

make_values_bool(df, df_features, _add_to_que=True)

Convert all string booleans to numeric boolean values.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.
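
A conversion of this kind might look like the sketch below. The lookup table BOOL_LOOKUP, the helper string_bools_to_numeric, and the set of accepted spellings are assumptions made for illustration; the strings eflow actually recognizes (see BOOL_STRINGS) are not documented here.

```python
import pandas as pd

# Hypothetical lookup of accepted spellings; not taken from eflow.
BOOL_LOOKUP = {"true": 1, "false": 0, "yes": 1, "no": 0}


def string_bools_to_numeric(series):
    """Map string booleans to 1/0, ignoring case and surrounding whitespace."""
    normalized = series.astype(str).str.strip().str.lower()
    return normalized.map(BOOL_LOOKUP)


df = pd.DataFrame({"active": ["True", "false", " YES ", "No"]})
df["active"] = string_bools_to_numeric(df["active"])

print(df)
```

Values that are not in the lookup come back as NaN with this approach, so a real pipeline would need to decide how to treat them.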

revert_dummies(df, df_features, qualitative_features=[], _add_to_que=True)

Convert dummy features back to the original feature.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

qualitative_features: collection of strings

Feature names to convert the dummy features into original feature data.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.
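
Reverting dummy features can be sketched in plain pandas as follows. The sketch assumes the dummy columns are named "<feature>_<value>" and that each row has exactly one indicator set; both are assumptions for illustration, not documented eflow behavior.

```python
import pandas as pd

# Hypothetical frame that already holds dummy columns for "color".
dummies_df = pd.DataFrame({
    "size": [1, 2, 3],
    "color_blue": [0, 1, 0],
    "color_red": [1, 0, 1],
})

feature = "color"
prefix = feature + "_"
dummy_cols = [c for c in dummies_df.columns if c.startswith(prefix)]

# idxmax(axis=1) gives, per row, the name of the column holding the 1;
# slicing off the prefix recovers the original value.
restored = dummies_df.drop(columns=dummy_cols)
restored[feature] = dummies_df[dummy_cols].idxmax(axis=1).str[len(prefix):]

print(restored)
```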

revert_value_representation(df, df_features, _add_to_que=True)

Translate features back into their original representation.

Args:
df: pd.DataFrame

Pandas dataframe.

df_features: DataFrameTypes from eflow

DataFrameTypes object.

_add_to_que: bool

Hidden variable to determine if the function should be pushed to the pipeline segment.