feast.infra package

Subpackages

Submodules

feast.infra.aws module

class feast.infra.aws.AwsProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

get_feature_server_endpoint() Optional[str][source]

Returns the endpoint for the feature server, if it exists.

teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.
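The effect of partial can be illustrated with a minimal sketch. It does not call Feast: reconcile and the plain string table names are hypothetical stand-ins, and the behaviour when partial is false (unlisted tables are removed) is an assumption inferred from the parameter description above, not confirmed provider behaviour.

```python
from typing import Iterable, Set


def reconcile(
    existing: Set[str],
    tables_to_delete: Iterable[str],
    tables_to_keep: Iterable[str],
    partial: bool,
) -> Set[str]:
    """Return the table names that remain after reconciliation (hypothetical helper)."""
    to_delete = set(tables_to_delete)
    to_keep = set(tables_to_keep)
    if partial:
        # The lists are not exhaustive: unlisted tables are left untouched.
        return (existing - to_delete) | to_keep
    # Assumption: the lists are exhaustive, so only the kept tables survive.
    return to_keep


existing = {"driver_stats", "customer_stats", "legacy_stats"}
partial_result = reconcile(existing, ["legacy_stats"], ["driver_stats"], partial=True)
full_result = reconcile(existing, ["legacy_stats"], ["driver_stats"], partial=False)
```

With partial=True the unlisted customer_stats table survives; with partial=False only driver_stats remains.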

feast.infra.gcp module

class feast.infra.gcp.GcpProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

This class only exists for backwards compatibility.

feast.infra.infra_object module

class feast.infra.infra_object.Infra(infra_objects: List[feast.infra.infra_object.InfraObject] = <factory>)[source]

Bases: object

Represents the set of infrastructure managed by Feast.

Parameters

infra_objects – A list of InfraObjects, each representing one infrastructure object.

classmethod from_proto(infra_proto: feast.core.InfraObject_pb2.Infra)[source]

Returns an Infra object created from a protobuf representation.

infra_objects: List[feast.infra.infra_object.InfraObject]
to_proto() feast.core.InfraObject_pb2.Infra[source]

Converts Infra to its protobuf representation.

Returns

An InfraProto protobuf.

class feast.infra.infra_object.InfraObject(name: str)[source]

Bases: abc.ABC

Represents a single infrastructure object (e.g. online store table) managed by Feast.

abstract static from_infra_object_proto(infra_object_proto: feast.core.InfraObject_pb2.InfraObject) Any[source]

Returns an InfraObject created from a protobuf representation.

Parameters

infra_object_proto – A protobuf representation of an InfraObject.

Raises

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

static from_proto(infra_object_proto: Any) Any[source]

Converts a protobuf representation of a subclass to an object of that subclass.

Parameters

infra_object_proto – A protobuf representation of an InfraObject.

Raises

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

property name: str
abstract teardown()[source]

Tears down the infrastructure object.

abstract to_infra_object_proto() feast.core.InfraObject_pb2.InfraObject[source]

Converts an InfraObject to its protobuf representation, wrapped in an InfraObjectProto.

abstract to_proto() Any[source]

Converts an InfraObject to its protobuf representation.

abstract update()[source]

Deploys or updates the infrastructure object.
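The abstract interface above can be sketched without Feast installed. In this illustration a plain dict plays the role of the protobuf message, and SketchInfraObject, SketchTable, InvalidInfraObjectType and the "type" tag are all hypothetical names; the real classes use generated protobuf types instead.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class InvalidInfraObjectType(Exception):
    """Stand-in for FeastInvalidInfraObjectType."""


class SketchInfraObject(ABC):
    def __init__(self, name: str):
        self._name = name

    @property
    def name(self) -> str:
        return self._name

    @abstractmethod
    def to_proto(self) -> Dict[str, Any]:
        """Serialize to a dict that stands in for the protobuf message."""

    @abstractmethod
    def update(self) -> None:
        """Deploy or update the infrastructure object."""

    @abstractmethod
    def teardown(self) -> None:
        """Tear down the infrastructure object."""

    @staticmethod
    def from_proto(proto: Dict[str, Any]) -> "SketchInfraObject":
        # Dispatch on a type tag; an unknown tag raises, mirroring the Raises section.
        cls = _REGISTRY.get(proto.get("type"))
        if cls is None:
            raise InvalidInfraObjectType(proto.get("type"))
        return cls(proto["name"])


class SketchTable(SketchInfraObject):
    deployed = False

    def to_proto(self) -> Dict[str, Any]:
        return {"type": "table", "name": self.name}

    def update(self) -> None:
        self.deployed = True

    def teardown(self) -> None:
        self.deployed = False


# Maps type tags to concrete subclasses (hypothetical registry).
_REGISTRY = {"table": SketchTable}

table = SketchInfraObject.from_proto({"type": "table", "name": "driver_stats"})
table.update()
round_trip = SketchInfraObject.from_proto(table.to_proto())
```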

feast.infra.key_encoding_utils module

feast.infra.key_encoding_utils.serialize_entity_key(entity_key: feast.types.EntityKey_pb2.EntityKey, entity_key_serialization_version=1) bytes[source]

Serializes an entity key to a byte string so that it can be used as a lookup key in a hash table.

This encoding must be stable, so plain protobuf serialization cannot be used here: it does not guarantee that two proto messages containing the same data will serialize to the same byte string [1].

[1] https://developers.google.com/protocol-buffers/docs/encoding
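A minimal sketch of what a stable encoding means in practice. The byte layout below (sorted join keys, each field length-prefixed) is a hypothetical illustration and not the layout Feast uses; stable_key_bytes is an invented name.

```python
import struct
from typing import Dict


def stable_key_bytes(entity_key: Dict[str, str]) -> bytes:
    """Encode join keys and values deterministically (hypothetical layout)."""
    out = []
    # Sorting by join key name makes the result independent of insertion order.
    for name, value in sorted(entity_key.items()):
        for part in (name.encode("utf8"), value.encode("utf8")):
            # A little-endian uint32 length prefix keeps field boundaries unambiguous.
            out.append(struct.pack("<I", len(part)))
            out.append(part)
    return b"".join(out)


a = stable_key_bytes({"driver_id": "1001", "region": "emea"})
b = stable_key_bytes({"region": "emea", "driver_id": "1001"})
```

Both calls describe the same entity, so a and b are identical and either can be used to look up the same row.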

feast.infra.key_encoding_utils.serialize_entity_key_prefix(entity_keys: List[str]) bytes[source]

Serializes entity key names to a byte string that can be used to prefix-scan items stored in the online store under keys produced by serialize_entity_key.

This encoding is a partial implementation of serialize_entity_key, only operating on the keys of entities, and not the values.
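The prefix relationship can be sketched as follows, assuming an encoding that writes all key names before any values. The layout and the names key_prefix and full_key are hypothetical and only illustrate why a keys-only encoding can drive a prefix scan.

```python
import struct
from typing import Dict, List


def _pack(text: str) -> bytes:
    raw = text.encode("utf8")
    return struct.pack("<I", len(raw)) + raw


def key_prefix(join_keys: List[str]) -> bytes:
    """Encode only the sorted join key names (hypothetical layout)."""
    return b"".join(_pack(k) for k in sorted(join_keys))


def full_key(entity_key: Dict[str, str]) -> bytes:
    """Key names first, then values in the same order, so key_prefix is a true prefix."""
    names = sorted(entity_key)
    return key_prefix(names) + b"".join(_pack(entity_key[n]) for n in names)


# A dict stands in for the online store.
store = {
    full_key({"driver_id": "1001"}): "row-a",
    full_key({"driver_id": "1002"}): "row-b",
    full_key({"customer_id": "77"}): "row-c",
}
prefix = key_prefix(["driver_id"])
matches = sorted(v for k, v in store.items() if k.startswith(prefix))
```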

feast.infra.local module

class feast.infra.local.LocalProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

This class only exists for backwards compatibility.

plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra[source]

Returns the Infra required to support the desired registry.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.

feast.infra.passthrough_provider module

class feast.infra.passthrough_provider.PassthroughProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.provider.Provider

The passthrough provider delegates all operations to the underlying online and offline stores.
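The delegation pattern can be sketched in a few lines. RecordingStore and SketchPassthroughProvider are hypothetical stand-ins written for this illustration; they mirror only the shape of the idea, not the Feast classes.

```python
from typing import Any, Callable, List, Optional, Tuple


class RecordingStore:
    """Stand-in online store that records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, int]] = []

    def online_write_batch(self, config: Any, table: str, data: list, progress: Any) -> None:
        self.calls.append(("online_write_batch", table, len(data)))


class SketchPassthroughProvider:
    def __init__(self, online_store: RecordingStore) -> None:
        self.online_store = online_store

    def online_write_batch(
        self,
        config: Any,
        table: str,
        data: list,
        progress: Optional[Callable[[int], Any]] = None,
    ) -> None:
        # No provider-specific logic: forward the call unchanged to the store.
        self.online_store.online_write_batch(config, table, data, progress)


backing_store = RecordingStore()
provider = SketchPassthroughProvider(backing_store)
provider.online_write_batch(config=None, table="driver_stats", data=[1, 2, 3])
```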

property batch_engine: feast.infra.materialization.batch_materialization_engine.BatchMaterializationEngine
get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.infra.registry.base_registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Retrieves the point-in-time correct historical feature values for the specified entity rows.

Parameters
  • config – The config for the current feature store.

  • feature_views – A list containing all feature views that are referenced in the entity rows.

  • feature_refs – The features to be retrieved.

  • entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe or a SQL query can be provided.

  • registry – The registry for the current feature store.

  • project – Feast project to which the feature views belong.

  • full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).

Returns

A RetrievalJob that can be executed to get the features.
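To make "point-in-time correct" and full_feature_names concrete, here is a minimal pure-Python sketch. It is not how the offline stores implement the join: point_in_time_join, the dict-based rows and the event_timestamp column name are assumptions made for illustration, and details such as feature view TTLs are ignored.

```python
from datetime import datetime
from typing import Any, Dict, List


def point_in_time_join(
    entity_rows: List[Dict[str, Any]],
    feature_rows: List[Dict[str, Any]],
    join_key: str,
    feature: str,
    feature_view: str,
    full_feature_names: bool,
) -> List[Dict[str, Any]]:
    # With full_feature_names the output column is "<feature_view>__<feature>".
    column = f"{feature_view}__{feature}" if full_feature_names else feature
    joined = []
    for row in entity_rows:
        # Only feature rows at or before the entity timestamp are eligible,
        # which prevents leaking future values into the result.
        candidates = [
            f for f in feature_rows
            if f[join_key] == row[join_key]
            and f["event_timestamp"] <= row["event_timestamp"]
        ]
        latest = max(candidates, key=lambda f: f["event_timestamp"], default=None)
        joined.append({**row, column: latest[feature] if latest else None})
    return joined


entity_rows = [{"customer_id": 1, "event_timestamp": datetime(2024, 1, 10)}]
feature_rows = [
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 5), "daily_transactions": 3},
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 9), "daily_transactions": 7},
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 12), "daily_transactions": 9},
]
result = point_in_time_join(
    entity_rows, feature_rows, "customer_id", "daily_transactions", "customer_fv", True
)
```

The row from 12 January is ignored because it lies after the entity timestamp, so the joined value comes from 9 January.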

ingest_df(feature_view: feast.feature_view.FeatureView, df: pandas.core.frame.DataFrame)[source]

Persists a dataframe to the online store.

Parameters
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The dataframe to be persisted.

ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, table: pyarrow.lib.Table)[source]

Persists a dataframe to the offline store.

Parameters
  • feature_view – The feature view to which the dataframe corresponds.

  • table – The data to be persisted, as a pyarrow table.

materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.infra.registry.base_registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None[source]

Writes latest feature values in the specified time range to the online store.

Parameters
  • config – The config for the current feature store.

  • feature_view – The feature view to materialize.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

  • registry – The registry for the current feature store.

  • project – Feast project to which the objects belong.

  • tqdm_builder – A function to monitor the progress of materialization.
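The idea of writing the latest feature values in a time range can be sketched as below. latest_in_range and the tuple-based rows are hypothetical, and treating both range boundaries as inclusive is an assumption; the real method delegates this work to the configured materialization engine.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Tuple


def latest_in_range(
    rows: Iterable[Tuple[Any, datetime, Any]],
    start_date: datetime,
    end_date: datetime,
) -> Dict[Any, Tuple[datetime, Any]]:
    """Keep, per entity, the newest (timestamp, value) inside the range."""
    latest: Dict[Any, Tuple[datetime, Any]] = {}
    for entity_id, ts, value in rows:
        # Assumption: both boundaries are inclusive.
        if not (start_date <= ts <= end_date):
            continue
        if entity_id not in latest or ts > latest[entity_id][0]:
            latest[entity_id] = (ts, value)
    return latest


rows = [
    ("driver_1", datetime(2024, 1, 1), 10),
    ("driver_1", datetime(2024, 1, 3), 30),
    ("driver_1", datetime(2024, 1, 9), 90),  # outside the range below
    ("driver_2", datetime(2024, 1, 2), 20),
]
online_rows = latest_in_range(rows, datetime(2024, 1, 1), datetime(2024, 1, 5))
```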

property offline_store
offline_write_batch(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, data: pyarrow.lib.Table, progress: Optional[Callable[[int], Any]]) None[source]
online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: List[str] = None) List[source]

Reads feature values for the given entity keys.

Parameters
  • config – The config for the current feature store.

  • table – The feature view whose feature values should be read.

  • entity_keys – The list of entity keys for which feature values should be read.

  • requested_features – The list of features that should be read.

Returns

A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.
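The shape of the return value can be sketched with a dict standing in for the online store. read_rows is a hypothetical helper, plain integers replace the proto values, and returning (None, None) for a missing key is an assumption suggested by the Optional types in the abstract signature.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

Row = Tuple[Optional[datetime], Optional[Dict[str, int]]]


def read_rows(
    store: Dict[str, Tuple[datetime, Dict[str, int]]],
    entity_keys: List[str],
    requested_features: Optional[List[str]] = None,
) -> List[Row]:
    results: List[Row] = []
    for key in entity_keys:
        hit = store.get(key)
        if hit is None:
            # Assumption: a missing entity still occupies a slot in the result.
            results.append((None, None))
            continue
        ts, features = hit
        if requested_features is not None:
            features = {n: v for n, v in features.items() if n in requested_features}
        results.append((ts, features))
    return results


ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
store = {"driver_1": (ts, {"trips": 4, "rating": 5})}
rows = read_rows(store, ["driver_1", "driver_2"], requested_features=["trips"])
```

The result has one entry per requested entity key, in the same order, so callers can pair results with keys by position.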

property online_store
online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None[source]

Writes a batch of feature rows to the online store.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters
  • config – The config for the current feature store.

  • table – Feature view to which these feature rows correspond.

  • data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Function to be called once a batch of rows is written to the online store, used to show progress.
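The timestamp rule and the progress callback can be sketched as follows. to_utc and write_batch are hypothetical helpers, the dict store and batch_size parameter are invented for the illustration, and passing the number of rows in each batch to progress is an assumption based on its Callable[[int], Any] type.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, List, Optional, Tuple

Quad = Tuple[str, Dict[str, Any], datetime, Optional[datetime]]


def to_utc(ts: datetime) -> datetime:
    if ts.tzinfo is None:
        # A tz-naive timestamp is assumed to already be in UTC.
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def write_batch(
    store: Dict[str, tuple],
    data: List[Quad],
    progress: Optional[Callable[[int], Any]] = None,
    batch_size: int = 2,
) -> None:
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        for entity_key, values, event_ts, created_ts in batch:
            created = to_utc(created_ts) if created_ts is not None else None
            store[entity_key] = (values, to_utc(event_ts), created)
        if progress is not None:
            # Assumption: report how many rows the finished batch contained.
            progress(len(batch))


store: Dict[str, tuple] = {}
reported: List[int] = []
data: List[Quad] = [
    ("driver_1", {"trips": 4}, datetime(2024, 1, 1, 12), None),
    ("driver_2", {"trips": 6}, datetime(2024, 1, 1, 12, tzinfo=timezone(timedelta(hours=2))), None),
    ("driver_3", {"trips": 1}, datetime(2024, 1, 2), datetime(2024, 1, 2, 1)),
]
write_batch(store, data, progress=reported.append)
```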

retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Reads logged features for the specified time window.

Parameters
  • feature_service – The feature service whose logs should be retrieved.

  • start_date – The start of the window.

  • end_date – The end of the window.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

Returns

A RetrievalJob that can be executed to get the feature service logs.

retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Reads a saved dataset.

Parameters
  • config – The config for the current feature store.

  • dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.

Returns

A RetrievalJob that can be executed to get the saved dataset.

teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.

write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, str], config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry)[source]

Writes features and entities logged by a feature server to the offline store.

The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.

Parameters
  • feature_service – The feature service to be logged.

  • logs – The logs, either as an arrow table or as a path to a parquet directory.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

feast.infra.provider module

class feast.infra.provider.Provider(config: feast.repo_config.RepoConfig)[source]

Bases: abc.ABC

A provider defines an implementation of a feature store object. It orchestrates the various components of a feature store, such as the offline store, online store, and materialization engine. It is configured through a RepoConfig object.

get_feature_server_endpoint() Optional[str][source]

Returns the endpoint for the feature server, if it exists.

abstract get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.infra.registry.base_registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Retrieves the point-in-time correct historical feature values for the specified entity rows.

Parameters
  • config – The config for the current feature store.

  • feature_views – A list containing all feature views that are referenced in the entity rows.

  • feature_refs – The features to be retrieved.

  • entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe or a SQL query can be provided.

  • registry – The registry for the current feature store.

  • project – Feast project to which the feature views belong.

  • full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).

Returns

A RetrievalJob that can be executed to get the features.

ingest_df(feature_view: feast.feature_view.FeatureView, df: pandas.core.frame.DataFrame)[source]

Persists a dataframe to the online store.

Parameters
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The dataframe to be persisted.

ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, df: pyarrow.lib.Table)[source]

Persists a dataframe to the offline store.

Parameters
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The data to be persisted, as a pyarrow table.

abstract materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.infra.registry.base_registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None[source]

Writes latest feature values in the specified time range to the online store.

Parameters
  • config – The config for the current feature store.

  • feature_view – The feature view to materialize.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

  • registry – The registry for the current feature store.

  • project – Feast project to which the objects belong.

  • tqdm_builder – A function to monitor the progress of materialization.

abstract online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: Optional[List[str]] = None) List[Tuple[Optional[datetime.datetime], Optional[Dict[str, feast.types.Value_pb2.Value]]]][source]

Reads feature values for the given entity keys.

Parameters
  • config – The config for the current feature store.

  • table – The feature view whose feature values should be read.

  • entity_keys – The list of entity keys for which feature values should be read.

  • requested_features – The list of features that should be read.

Returns

A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.

abstract online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None[source]

Writes a batch of feature rows to the online store.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters
  • config – The config for the current feature store.

  • table – Feature view to which these feature rows correspond.

  • data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Function to be called once a batch of rows is written to the online store, used to show progress.

plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra[source]

Returns the Infra required to support the desired registry.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.

abstract retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Reads logged features for the specified time window.

Parameters
  • feature_service – The feature service whose logs should be retrieved.

  • start_date – The start of the window.

  • end_date – The end of the window.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

Returns

A RetrievalJob that can be executed to get the feature service logs.

abstract retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Reads a saved dataset.

Parameters
  • config – The config for the current feature store.

  • dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.

Returns

A RetrievalJob that can be executed to get the saved dataset.

abstract teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity])[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

abstract update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.

abstract write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, pathlib.Path], config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry)[source]

Writes features and entities logged by a feature server to the offline store.

The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.

Parameters
  • feature_service – The feature service to be logged.

  • logs – The logs, either as an arrow table or as a path to a parquet directory.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

feast.infra.provider.get_provider(config: feast.repo_config.RepoConfig) feast.infra.provider.Provider[source]

Module contents