feast.infra package

Submodules

feast.infra.aws module

class feast.infra.aws.AwsProvider(config: RepoConfig)[source]

Bases: PassthroughProvider

get_feature_server_endpoint() str | None[source]

Returns the endpoint for the feature server, if one exists.

teardown_infra(project: str, tables: Sequence[FeatureView], entities: Sequence[Entity]) None[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

update_infra(project: str, tables_to_delete: Sequence[FeatureView], tables_to_keep: Sequence[FeatureView], entities_to_delete: Sequence[Entity], entities_to_keep: Sequence[Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.
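
In normal use these lifecycle methods are invoked indirectly by `feast apply` and `feast teardown` rather than called by hand. A minimal sketch of a direct call, assuming a repo whose feature_store.yaml sets the AWS provider (the repo path and object names are hypothetical):

    from feast import FeatureStore
    from feast.infra.provider import get_provider

    store = FeatureStore(repo_path=".")    # assumes feature_store.yaml with provider: aws
    provider = get_provider(store.config)  # resolves to AwsProvider for this config

    views = store.list_feature_views()
    entities = store.list_entities()

    # Create or refresh infrastructure for every registered object; with
    # partial=False the lists are treated as exhaustive.
    provider.update_infra(
        project=store.project,
        tables_to_delete=[],
        tables_to_keep=views,
        entities_to_delete=[],
        entities_to_keep=entities,
        partial=False,
    )

    # Remove it all again (this is what `feast teardown` ultimately does).
    provider.teardown_infra(project=store.project, tables=views, entities=entities)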

feast.infra.gcp module

class feast.infra.gcp.GcpProvider(config: RepoConfig)[source]

Bases: PassthroughProvider

This class only exists for backwards compatibility.

feast.infra.infra_object module

class feast.infra.infra_object.Infra(infra_objects: List[InfraObject] = <factory>)[source]

Bases: object

Represents the set of infrastructure managed by Feast.

Parameters:

infra_objects – A list of InfraObjects, each representing one infrastructure object.

classmethod from_proto(infra_proto: Infra)[source]

Returns an Infra object created from a protobuf representation.

infra_objects: List[InfraObject]

to_proto() Infra[source]

Converts Infra to its protobuf representation.

Returns:

An InfraProto protobuf.
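
A quick round-trip sketch (an empty Infra, purely for illustration):

    from feast.infra.infra_object import Infra

    infra = Infra(infra_objects=[])     # no managed infrastructure yet
    proto = infra.to_proto()            # serialize to the Infra protobuf
    restored = Infra.from_proto(proto)  # and back
    assert restored.infra_objects == infra.infra_objects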

class feast.infra.infra_object.InfraObject(name: str)[source]

Bases: ABC

Represents a single infrastructure object (e.g. online store table) managed by Feast.

abstract static from_infra_object_proto(infra_object_proto: InfraObject) Any[source]

Returns an InfraObject created from a protobuf representation.

Parameters:

infra_object_proto – A protobuf representation of an InfraObject.

Raises:

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

static from_proto(infra_object_proto: Any) Any[source]

Converts a protobuf representation of a subclass to an object of that subclass.

Parameters:

infra_object_proto – A protobuf representation of an InfraObject.

Raises:

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

property name: str

abstract teardown()[source]

Tears down the infrastructure object.

abstract to_infra_object_proto() InfraObject[source]

Converts an InfraObject to its protobuf representation, wrapped in an InfraObjectProto.

abstract to_proto() Any[source]

Converts an InfraObject to its protobuf representation.

abstract update()[source]

Deploys or updates the infrastructure object.
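
For a concrete feel, the SQLite online store ships one such object, SqliteTable. A hedged sketch of its lifecycle (the database path and table name are made up):

    from feast.infra.online_stores.sqlite import SqliteTable

    table = SqliteTable(path="/tmp/online_store.db", name="demo_driver_stats")
    table.update()                         # idempotently create the backing table
    proto = table.to_infra_object_proto()  # InfraObjectProto wrapper, for the registry
    table.teardown()                       # drop the table again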

feast.infra.key_encoding_utils module

feast.infra.key_encoding_utils.serialize_entity_key(entity_key: EntityKey, entity_key_serialization_version=1) bytes[source]

Serialize entity key to a bytestring so it can be used as a lookup key in a hash table.

We need this encoding to be stable; therefore we cannot just use protobuf serialization here since it does not guarantee that two proto messages containing the same data will serialize to the same byte string[1].

[1] https://developers.google.com/protocol-buffers/docs/encoding

feast.infra.key_encoding_utils.serialize_entity_key_prefix(entity_keys: List[str]) bytes[source]

Serialize entity key names to a bytestring so it can be used to prefix-scan through items stored in the online store using serialize_entity_key.

This encoding is a partial implementation of serialize_entity_key, only operating on the keys of entities, and not the values.
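
A short sketch of the two functions together, assuming a single hypothetical driver_id join key. Because serialize_entity_key_prefix encodes only the key names, its output is a byte prefix of the full serialized key:

    from feast.infra.key_encoding_utils import (
        serialize_entity_key,
        serialize_entity_key_prefix,
    )
    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    key = EntityKey(join_keys=["driver_id"], entity_values=[Value(int64_val=1001)])

    row_key = serialize_entity_key(key)                 # stable bytes for hash lookup
    prefix = serialize_entity_key_prefix(["driver_id"])
    assert row_key.startswith(prefix)                   # prefix-scan all driver_id rows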

feast.infra.local module

class feast.infra.local.LocalProvider(config: RepoConfig)[source]

Bases: PassthroughProvider

This class only exists for backwards compatibility.

plan_infra(config: RepoConfig, desired_registry_proto: Registry) Infra[source]

Returns the Infra required to support the desired registry.

Parameters:
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.
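
A sketch of inspecting the planned infrastructure for the current registry, assuming a local-provider repo and the registry's proto() accessor:

    from feast import FeatureStore
    from feast.infra.provider import get_provider

    store = FeatureStore(repo_path=".")
    provider = get_provider(store.config)

    infra = provider.plan_infra(store.config, store.registry.proto())
    for obj in infra.infra_objects:
        print(type(obj).__name__, obj.name)  # e.g. SqliteTable <project>_<view>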

feast.infra.passthrough_provider module

class feast.infra.passthrough_provider.PassthroughProvider(config: RepoConfig)[source]

Bases: Provider

The passthrough provider delegates all operations to the underlying online and offline stores.

property batch_engine: BatchMaterializationEngine

get_historical_features(config: RepoConfig, feature_views: List[FeatureView], feature_refs: List[str], entity_df: DataFrame | str, registry: BaseRegistry, project: str, full_feature_names: bool) RetrievalJob[source]

Retrieves the point-in-time correct historical feature values for the specified entity rows.

Parameters:
  • config – The config for the current feature store.

  • feature_views – A list containing all feature views that are referenced in the entity rows.

  • feature_refs – The features to be retrieved.

  • entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe can be provided or a SQL query.

  • registry – The registry for the current feature store.

  • project – Feast project to which the feature views belong.

  • full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).

Returns:

A RetrievalJob that can be executed to get the features.
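
This method is normally reached through FeatureStore.get_historical_features, which delegates to the provider and offline store. A minimal sketch (feature view and column names are hypothetical):

    from datetime import datetime
    import pandas as pd
    from feast import FeatureStore

    store = FeatureStore(repo_path=".")

    # Entity rows: entity column(s) plus the event_timestamp column.
    entity_df = pd.DataFrame(
        {
            "driver_id": [1001, 1002],
            "event_timestamp": [datetime(2023, 1, 1), datetime(2023, 1, 2)],
        }
    )

    job = store.get_historical_features(
        entity_df=entity_df,
        features=["driver_hourly_stats:conv_rate"],
        full_feature_names=False,
    )
    df = job.to_df()  # the RetrievalJob is lazy; to_df() executes the join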

ingest_df(feature_view: FeatureView, df: DataFrame)[source]

Persists a dataframe to the online store.

Parameters:
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The dataframe to be persisted.
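
The usual entry point is FeatureStore.write_to_online_store, which resolves the feature view by name and calls ingest_df. A sketch with hypothetical names:

    import pandas as pd
    from feast import FeatureStore

    store = FeatureStore(repo_path=".")
    df = pd.DataFrame(
        {
            "driver_id": [1001],
            "conv_rate": [0.85],
            "event_timestamp": [pd.Timestamp.now(tz="UTC")],
        }
    )
    store.write_to_online_store("driver_hourly_stats", df)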

ingest_df_to_offline_store(feature_view: FeatureView, table: Table)[source]

Persists a dataframe to the offline store.

Parameters:
  • feature_view – The feature view to which the table corresponds.

  • table – The pyarrow table to be persisted.

materialize_single_feature_view(config: RepoConfig, feature_view: FeatureView, start_date: datetime, end_date: datetime, registry: BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm]) None[source]

Writes latest feature values in the specified time range to the online store.

Parameters:
  • config – The config for the current feature store.

  • feature_view – The feature view to materialize.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

  • registry – The registry for the current feature store.

  • project – Feast project to which the objects belong.

  • tqdm_builder – A function to monitor the progress of materialization.
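
Materialization is normally driven by FeatureStore.materialize (or the `feast materialize` CLI), which calls this method once per feature view with a tqdm progress bar. A sketch:

    from datetime import datetime, timedelta, timezone
    from feast import FeatureStore

    store = FeatureStore(repo_path=".")
    end = datetime.now(timezone.utc)
    store.materialize(start_date=end - timedelta(days=1), end_date=end)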

property offline_store

offline_write_batch(config: RepoConfig, feature_view: FeatureView, data: Table, progress: Callable[[int], Any] | None) None[source]

online_read(config: RepoConfig, table: FeatureView, entity_keys: List[EntityKey], requested_features: List[str] | None = None) List[source]

Reads feature values for the given entity keys.

Parameters:
  • config – The config for the current feature store.

  • table – The feature view whose feature values should be read.

  • entity_keys – The list of entity keys for which feature values should be read.

  • requested_features – The list of features that should be read.

Returns:

A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.
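
A hedged sketch of a direct read, assuming the single-argument get_provider signature documented below and a hypothetical driver_hourly_stats view:

    from feast import FeatureStore
    from feast.infra.provider import get_provider
    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    store = FeatureStore(repo_path=".")
    provider = get_provider(store.config)
    fv = store.get_feature_view("driver_hourly_stats")

    key = EntityKey(join_keys=["driver_id"], entity_values=[Value(int64_val=1001)])
    rows = provider.online_read(
        config=store.config,
        table=fv,
        entity_keys=[key],
        requested_features=["conv_rate"],
    )
    event_ts, values = rows[0]  # values: Dict[str, ValueProto], or None on a miss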

property online_store

online_write_batch(config: RepoConfig, table: FeatureView, data: List[Tuple[EntityKey, Dict[str, Value], datetime, datetime | None]], progress: Callable[[int], Any] | None) None[source]

Writes a batch of feature rows to the online store.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters:
  • config – The config for the current feature store.

  • table – Feature view to which these feature rows correspond.

  • data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Function to be called once a batch of rows is written to the online store, used to show progress.
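
A sketch of assembling one such quadruplet (view and feature names are hypothetical):

    from datetime import datetime, timezone
    from feast import FeatureStore
    from feast.infra.provider import get_provider
    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    store = FeatureStore(repo_path=".")
    provider = get_provider(store.config)
    fv = store.get_feature_view("driver_hourly_stats")

    row = (
        EntityKey(join_keys=["driver_id"], entity_values=[Value(int64_val=1001)]),
        {"conv_rate": Value(double_val=0.85)},  # feature name -> ValueProto
        datetime.now(timezone.utc),             # event timestamp (tz-aware)
        None,                                   # created timestamp, if known
    )
    provider.online_write_batch(config=store.config, table=fv, data=[row], progress=None)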

retrieve_feature_service_logs(feature_service: FeatureService, start_date: datetime, end_date: datetime, config: RepoConfig, registry: BaseRegistry) RetrievalJob[source]

Reads logged features for the specified time window.

Parameters:
  • feature_service – The feature service whose logs should be retrieved.

  • start_date – The start of the window.

  • end_date – The end of the window.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

Returns:

A RetrievalJob that can be executed to get the feature service logs.

retrieve_saved_dataset(config: RepoConfig, dataset: SavedDataset) RetrievalJob[source]

Reads a saved dataset.

Parameters:
  • config – The config for the current feature store.

  • dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.

Returns:

A RetrievalJob that can be executed to get the saved dataset.
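
Saved datasets are usually read back through FeatureStore.get_saved_dataset, which delegates here. A sketch with a hypothetical dataset name:

    from feast import FeatureStore

    store = FeatureStore(repo_path=".")
    dataset = store.get_saved_dataset("my_training_data")
    df = dataset.to_df()  # executes the underlying RetrievalJob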

teardown_infra(project: str, tables: Sequence[FeatureView], entities: Sequence[Entity]) None[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

update_infra(project: str, tables_to_delete: Sequence[FeatureView], tables_to_keep: Sequence[FeatureView], entities_to_delete: Sequence[Entity], entities_to_keep: Sequence[Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.

write_feature_service_logs(feature_service: FeatureService, logs: Table | str, config: RepoConfig, registry: BaseRegistry)[source]

Writes features and entities logged by a feature server to the offline store.

The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.

Parameters:
  • feature_service – The feature service whose logs are being written.

  • logs – The logs, either as an arrow table or as a path to a parquet directory.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

feast.infra.provider module

class feast.infra.provider.Provider(config: RepoConfig)[source]

Bases: ABC

A provider defines an implementation of a feature store. It orchestrates the various components of a feature store, such as the offline store, online store, and materialization engine. It is configured through a RepoConfig object.
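
Custom providers are supported by pointing the provider field of feature_store.yaml at a fully qualified class path. A minimal sketch, assuming PassthroughProvider as the base so only the hooks you care about need overriding (class and module names are made up):

    from feast.infra.passthrough_provider import PassthroughProvider

    # Referenced from feature_store.yaml as:  provider: my_package.MyProvider
    class MyProvider(PassthroughProvider):
        def update_infra(self, project, tables_to_delete, tables_to_keep,
                         entities_to_delete, entities_to_keep, partial):
            # e.g. provision extra resources here, then fall back to the default.
            super().update_infra(project, tables_to_delete, tables_to_keep,
                                 entities_to_delete, entities_to_keep, partial)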

get_feature_server_endpoint() str | None[source]

Returns the endpoint for the feature server, if one exists.

abstract get_historical_features(config: RepoConfig, feature_views: List[FeatureView], feature_refs: List[str], entity_df: DataFrame | str, registry: BaseRegistry, project: str, full_feature_names: bool) RetrievalJob[source]

Retrieves the point-in-time correct historical feature values for the specified entity rows.

Parameters:
  • config – The config for the current feature store.

  • feature_views – A list containing all feature views that are referenced in the entity rows.

  • feature_refs – The features to be retrieved.

  • entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe can be provided or a SQL query.

  • registry – The registry for the current feature store.

  • project – Feast project to which the feature views belong.

  • full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).

Returns:

A RetrievalJob that can be executed to get the features.

ingest_df(feature_view: FeatureView, df: DataFrame)[source]

Persists a dataframe to the online store.

Parameters:
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The dataframe to be persisted.

ingest_df_to_offline_store(feature_view: FeatureView, df: Table)[source]

Persists a dataframe to the offline store.

Parameters:
  • feature_view – The feature view to which the dataframe corresponds.

  • df – The dataframe to be persisted.

abstract materialize_single_feature_view(config: RepoConfig, feature_view: FeatureView, start_date: datetime, end_date: datetime, registry: BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm]) None[source]

Writes latest feature values in the specified time range to the online store.

Parameters:
  • config – The config for the current feature store.

  • feature_view – The feature view to materialize.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

  • registry – The registry for the current feature store.

  • project – Feast project to which the objects belong.

  • tqdm_builder – A function to monitor the progress of materialization.

abstract online_read(config: RepoConfig, table: FeatureView, entity_keys: List[EntityKey], requested_features: List[str] | None = None) List[Tuple[datetime | None, Dict[str, Value] | None]][source]

Reads feature values for the given entity keys.

Parameters:
  • config – The config for the current feature store.

  • table – The feature view whose feature values should be read.

  • entity_keys – The list of entity keys for which feature values should be read.

  • requested_features – The list of features that should be read.

Returns:

A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.

abstract online_write_batch(config: RepoConfig, table: FeatureView, data: List[Tuple[EntityKey, Dict[str, Value], datetime, datetime | None]], progress: Callable[[int], Any] | None) None[source]

Writes a batch of feature rows to the online store.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters:
  • config – The config for the current feature store.

  • table – Feature view to which these feature rows correspond.

  • data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Function to be called once a batch of rows is written to the online store, used to show progress.

plan_infra(config: RepoConfig, desired_registry_proto: Registry) Infra[source]

Returns the Infra required to support the desired registry.

Parameters:
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.

abstract retrieve_feature_service_logs(feature_service: FeatureService, start_date: datetime, end_date: datetime, config: RepoConfig, registry: BaseRegistry) RetrievalJob[source]

Reads logged features for the specified time window.

Parameters:
  • feature_service – The feature service whose logs should be retrieved.

  • start_date – The start of the window.

  • end_date – The end of the window.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

Returns:

A RetrievalJob that can be executed to get the feature service logs.

abstract retrieve_saved_dataset(config: RepoConfig, dataset: SavedDataset) RetrievalJob[source]

Reads a saved dataset.

Parameters:
  • config – The config for the current feature store.

  • dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.

Returns:

A RetrievalJob that can be executed to get the saved dataset.

abstract teardown_infra(project: str, tables: Sequence[FeatureView], entities: Sequence[Entity])[source]

Tears down all cloud resources for the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables – Feature views whose corresponding infrastructure should be deleted.

  • entities – Entities whose corresponding infrastructure should be deleted.

abstract update_infra(project: str, tables_to_delete: Sequence[FeatureView], tables_to_keep: Sequence[FeatureView], entities_to_delete: Sequence[Entity], entities_to_keep: Sequence[Entity], partial: bool)[source]

Reconciles cloud resources with the specified set of Feast objects.

Parameters:
  • project – Feast project to which the objects belong.

  • tables_to_delete – Feature views whose corresponding infrastructure should be deleted.

  • tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.

  • entities_to_delete – Entities whose corresponding infrastructure should be deleted.

  • entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.

  • partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.

abstract write_feature_service_logs(feature_service: FeatureService, logs: Table | Path, config: RepoConfig, registry: BaseRegistry)[source]

Writes features and entities logged by a feature server to the offline store.

The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.

Parameters:
  • feature_service – The feature service whose logs are being written.

  • logs – The logs, either as an arrow table or as a path to a parquet directory.

  • config – The config for the current feature store.

  • registry – The registry for the current feature store.

feast.infra.provider.get_provider(config: RepoConfig) Provider[source]
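
Resolves the provider named in the given RepoConfig (a built-in name such as local, gcp, or aws, or a fully qualified class path) and returns an instance of it. For example:

    from feast import FeatureStore
    from feast.infra.provider import get_provider

    store = FeatureStore(repo_path=".")
    provider = get_provider(store.config)
    print(type(provider).__name__)  # e.g. LocalProvider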
