feast.infra package

Submodules

feast.infra.aws module

class feast.infra.aws.AwsProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

get_feature_server_endpoint() Optional[str][source]

Returns the endpoint for the feature server, if it exists.

teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None[source]

Tear down all cloud resources for a repo.

Parameters
  • project – Feast project to which tables belong

  • tables – Tables that are declared in the feature repo.

  • entities – Entities that are declared in the feature repo.

update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconcile cloud resources with the objects declared in the feature repo.

Parameters
  • project – Project to which tables belong

  • tables_to_delete – Tables that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.

  • tables_to_keep – Tables that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.

  • entities_to_delete – Entities that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.

  • entities_to_keep – Entities that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.

  • partial – If true, then tables_to_delete and tables_to_keep are not exhaustive lists. There may be other tables that are not touched by this update.
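
For example, a deployment flow might drive these methods directly rather than going through feast apply. A minimal sketch, assuming provider is an AwsProvider and driver_stats_fv and driver are a FeatureView and an Entity loaded from an existing repo:

    # Reconcile resources for one feature view; partial=False means the
    # lists are exhaustive for this project.
    provider.update_infra(
        project="my_project",
        tables_to_delete=[],
        tables_to_keep=[driver_stats_fv],
        entities_to_delete=[],
        entities_to_keep=[driver],
        partial=False,
    )

    # Later, remove everything the repo declared.
    provider.teardown_infra(
        project="my_project",
        tables=[driver_stats_fv],
        entities=[driver],
    )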

class feast.infra.aws.S3RegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)[source]

Bases: feast.registry_store.RegistryStore

get_registry_proto()[source]

Retrieves the registry proto from the registry path. If there is no file at that path, raises a FileNotFoundError.

Returns

The registry proto stored at the registry path.

teardown()[source]

Tear down the registry.

update_registry_proto(registry_proto: feast.core.Registry_pb2.Registry)[source]

Overwrites the current registry proto with the proto passed in. This method writes to the registry path.

Parameters

registry_proto – the new RegistryProto
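
A minimal sketch of constructing and using an S3RegistryStore; the bucket name and object key here are placeholders, and feature_store.yaml normally wires this up via the registry path:

    from pathlib import Path

    from feast.infra.aws import S3RegistryStore
    from feast.repo_config import RegistryConfig

    registry_config = RegistryConfig(path="s3://my-bucket/registry.pb")
    store = S3RegistryStore(registry_config, repo_path=Path("."))

    # Raises FileNotFoundError if no object exists at the registry path.
    registry_proto = store.get_registry_proto()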

feast.infra.gcp module

class feast.infra.gcp.GCSRegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)[source]

Bases: feast.registry_store.RegistryStore

get_registry_proto()[source]

Retrieves the registry proto from the registry path. If there is no file at that path, raises a FileNotFoundError.

Returns

The registry proto stored at the registry path.

teardown()[source]

Tear down the registry.

update_registry_proto(registry_proto: feast.core.Registry_pb2.Registry)[source]

Overwrites the current registry proto with the proto passed in. This method writes to the registry path.

Parameters

registry_proto – the new RegistryProto

class feast.infra.gcp.GcpProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

This class only exists for backwards compatibility.

feast.infra.infra_object module

class feast.infra.infra_object.Infra(infra_objects: List[feast.infra.infra_object.InfraObject] = <factory>)[source]

Bases: object

Represents the set of infrastructure managed by Feast.

Parameters

infra_objects – A list of InfraObjects, each representing one infrastructure object.

classmethod from_proto(infra_proto: feast.core.InfraObject_pb2.Infra)[source]

Returns an Infra object created from a protobuf representation.

infra_objects: List[feast.infra.infra_object.InfraObject]
to_proto() feast.core.InfraObject_pb2.Infra[source]

Converts Infra to its protobuf representation.

Returns

An InfraProto protobuf.
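
For example, an (empty) Infra can be round-tripped through its protobuf representation:

    from feast.infra.infra_object import Infra

    infra = Infra(infra_objects=[])
    infra_proto = infra.to_proto()           # feast.core.InfraObject_pb2.Infra
    restored = Infra.from_proto(infra_proto)
    assert restored.infra_objects == infra.infra_objects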

class feast.infra.infra_object.InfraObject(name: str)[source]

Bases: abc.ABC

Represents a single infrastructure object (e.g. online store table) managed by Feast.

abstract static from_infra_object_proto(infra_object_proto: feast.core.InfraObject_pb2.InfraObject) Any[source]

Returns an InfraObject created from a protobuf representation.

Parameters

infra_object_proto – A protobuf representation of an InfraObject.

Raises

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

static from_proto(infra_object_proto: Any) Any[source]

Converts a protobuf representation of a subclass to an object of that subclass.

Parameters

infra_object_proto – A protobuf representation of an InfraObject.

Raises

FeastInvalidInfraObjectType – The type of InfraObject could not be identified.

property name: str
abstract teardown()[source]

Tears down the infrastructure object.

abstract to_infra_object_proto() feast.core.InfraObject_pb2.InfraObject[source]

Converts an InfraObject to its protobuf representation, wrapped in an InfraObjectProto.

abstract to_proto() Any[source]

Converts an InfraObject to its protobuf representation.

abstract update()[source]

Deploys or updates the infrastructure object.

feast.infra.key_encoding_utils module

feast.infra.key_encoding_utils.serialize_entity_key(entity_key: feast.types.EntityKey_pb2.EntityKey, entity_key_serialization_version=1) bytes[source]

Serialize entity key to a bytestring so it can be used as a lookup key in a hash table.

We need this encoding to be stable; therefore we cannot just use protobuf serialization here since it does not guarantee that two proto messages containing the same data will serialize to the same byte string [1].

[1] https://developers.google.com/protocol-buffers/docs/encoding

feast.infra.key_encoding_utils.serialize_entity_key_prefix(entity_keys: List[str]) bytes[source]

Serialize keys to a bytestring so that it can be used to prefix-scan through items stored in the online store under keys produced by serialize_entity_key.

This encoding is a partial implementation of serialize_entity_key, only operating on the keys of entities, and not the values.
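
A minimal sketch of both functions. This assumes the feast.protos.* proto layout used by recent Feast releases; the proto import path can vary by version:

    from feast.infra.key_encoding_utils import (
        serialize_entity_key,
        serialize_entity_key_prefix,
    )
    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    key = EntityKey(
        join_keys=["driver_id"],
        entity_values=[Value(int64_val=1001)],
    )
    row_key = serialize_entity_key(key, entity_key_serialization_version=2)

    # The prefix covers only the key names, so it is a prefix of the full
    # serialized key and can seed a prefix scan in the online store.
    prefix = serialize_entity_key_prefix(["driver_id"])
    assert row_key.startswith(prefix)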

feast.infra.local module

class feast.infra.local.LocalProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.passthrough_provider.PassthroughProvider

This class only exists for backwards compatibility.

plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra[source]

Returns the Infra required to support the desired registry.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.

class feast.infra.local.LocalRegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)[source]

Bases: feast.registry_store.RegistryStore

get_registry_proto()[source]

Retrieves the registry proto from the registry path. If there is no file at that path, raises a FileNotFoundError.

Returns

The registry proto stored at the registry path.

teardown()[source]

Tear down the registry.

update_registry_proto(registry_proto: feast.core.Registry_pb2.Registry)[source]

Overwrites the current registry proto with the proto passed in. This method writes to the registry path.

Parameters

registry_proto – the new RegistryProto

feast.infra.passthrough_provider module

class feast.infra.passthrough_provider.PassthroughProvider(config: feast.repo_config.RepoConfig)[source]

Bases: feast.infra.provider.Provider

The Passthrough provider delegates all operations to the underlying online and offline stores.

property batch_engine: feast.infra.materialization.batch_materialization_engine.BatchMaterializationEngine
get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob[source]
ingest_df(feature_view: feast.feature_view.FeatureView, entities: List[feast.entity.Entity], df: pandas.core.frame.DataFrame)[source]

Ingests a DataFrame directly into the online store.

ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, table: pyarrow.lib.Table)[source]

Ingests an Arrow Table directly into the offline store.
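
A minimal sketch of pushing a small batch through both ingestion paths, assuming provider, driver_stats_fv, and driver come from an existing feature repo:

    import pandas as pd
    import pyarrow as pa

    df = pd.DataFrame(
        {
            "driver_id": [1001],
            "conv_rate": [0.85],
            "event_timestamp": [pd.Timestamp.utcnow()],
        }
    )

    provider.ingest_df(driver_stats_fv, [driver], df)  # online store
    provider.ingest_df_to_offline_store(driver_stats_fv, pa.Table.from_pandas(df))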

materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None[source]
property offline_store
offline_write_batch(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, data: pyarrow.lib.Table, progress: Optional[Callable[[int], Any]]) None[source]
online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: Optional[List[str]] = None) List[source]

Read feature values given an Entity Key. This is a low-level interface, not expected to be used by users directly.

Returns

Data is returned as a list, one item per entity key. Each item in the list is a tuple of the event_ts for the row and the feature data as a dict from feature names to values. Values are returned as Value proto messages.

property online_store
online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None[source]

Write a batch of feature rows to the online store. This is a low-level interface, not expected to be used by users directly.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • table – Feast FeatureView

  • data – a list of quadruplets containing Feature data. Each quadruplet contains an Entity Key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Optional function to be called once every mini-batch of rows is written to the online store. Can be used to display progress.
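
A minimal sketch of writing one row, assuming provider, store (a FeatureStore), and driver_stats_fv exist, and assuming the feast.protos.* proto layout:

    from datetime import datetime

    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    # One quadruplet: (entity key, feature values, event ts, created ts).
    rows = [
        (
            EntityKey(join_keys=["driver_id"], entity_values=[Value(int64_val=1001)]),
            {"conv_rate": Value(double_val=0.85)},
            datetime.utcnow(),  # tz-naive, therefore treated as UTC
            None,               # no created timestamp for this row
        )
    ]

    provider.online_write_batch(
        config=store.config,
        table=driver_stats_fv,
        data=rows,
        progress=lambda n: print(f"wrote {n} rows"),
    )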

retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Read logged features from an offline store for a given time window [from, to). The target table is determined from the logging configuration of the feature service.

Returns

RetrievalJob object, which wraps the query to the offline store.

retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Read a saved dataset from the offline store. All parameters for retrieval (path, datetime boundaries, column names for both keys and features, etc.) are determined from the SavedDataset object.

Returns

RetrievalJob object, which is a lazy wrapper for the actual query performed under the hood.
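
For example (store is an assumed FeatureStore, and "my_training_dataset" a hypothetical saved dataset name):

    dataset = store.get_saved_dataset("my_training_dataset")
    job = provider.retrieve_saved_dataset(config=store.config, dataset=dataset)
    df = job.to_df()  # the RetrievalJob is lazy; the query runs here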

teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None[source]

Tear down all cloud resources for a repo.

Parameters
  • project – Feast project to which tables belong

  • tables – Tables that are declared in the feature repo.

  • entities – Entities that are declared in the feature repo.

update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconcile cloud resources with the objects declared in the feature repo.

Parameters
  • project – Project to which tables belong

  • tables_to_delete – Tables that were deleted from the feature repo, so provider needs to clean up the corresponding cloud resources.

  • tables_to_keep – Tables that are still in the feature repo. Depending on implementation, provider may or may not need to update the corresponding resources.

  • entities_to_delete – Entities that were deleted from the feature repo, so provider needs to clean up the corresponding cloud resources.

  • entities_to_keep – Entities that are still in the feature repo. Depending on implementation, provider may or may not need to update the corresponding resources.

  • partial – if true, then tables_to_delete and tables_to_keep are not exhaustive lists. There may be other tables that are not touched by this update.

write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, str], config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry)[source]

Write features and entities logged by a feature server to an offline store.

The schema of the logs table is inferred from the provided feature service. Only feature services with configured logging are accepted.

The logs dataset can be passed as an Arrow Table or as a path to a Parquet directory.
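
A minimal sketch of the logging round trip, assuming feature_service has logging configured, logs_table is an Arrow Table matching the inferred schema, and store is a FeatureStore exposing its config and registry:

    from datetime import datetime, timedelta

    provider.write_feature_service_logs(
        feature_service=feature_service,
        logs=logs_table,
        config=store.config,
        registry=store.registry,
    )

    # Read back the last day of logged features.
    job = provider.retrieve_feature_service_logs(
        feature_service=feature_service,
        start_date=datetime.utcnow() - timedelta(days=1),
        end_date=datetime.utcnow(),
        config=store.config,
        registry=store.registry,
    )
    logged_df = job.to_df()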

feast.infra.provider module

class feast.infra.provider.Provider(config: feast.repo_config.RepoConfig)[source]

Bases: abc.ABC

get_feature_server_endpoint() Optional[str][source]

Returns the endpoint for the feature server, if it exists.

abstract get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob[source]
ingest_df(feature_view: feast.feature_view.FeatureView, entities: List[feast.entity.Entity], df: pandas.core.frame.DataFrame)[source]

Ingests a DataFrame directly into the online store.

ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, df: pyarrow.lib.Table)[source]

Ingests an Arrow Table directly into the offline store.

abstract materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None[source]
abstract online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: Optional[List[str]] = None) List[Tuple[Optional[datetime.datetime], Optional[Dict[str, feast.types.Value_pb2.Value]]]][source]

Read feature values given an Entity Key. This is a low-level interface, not expected to be used by users directly.

Returns

Data is returned as a list, one item per entity key. Each item in the list is a tuple of the event_ts for the row and the feature data as a dict from feature names to values. Values are returned as Value proto messages.
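
A minimal sketch of reading one entity's features, assuming provider, store, and driver_stats_fv exist and assuming the feast.protos.* proto layout:

    from feast.protos.feast.types.EntityKey_pb2 import EntityKey
    from feast.protos.feast.types.Value_pb2 import Value

    keys = [EntityKey(join_keys=["driver_id"], entity_values=[Value(int64_val=1001)])]
    results = provider.online_read(
        config=store.config,
        table=driver_stats_fv,
        entity_keys=keys,
        requested_features=["conv_rate"],
    )
    for event_ts, features in results:  # one (timestamp, dict) pair per key
        if features is not None:
            print(event_ts, features["conv_rate"].double_val)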

abstract online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None[source]

Write a batch of feature rows to the online store. This is a low-level interface, not expected to be used by users directly.

If a tz-naive timestamp is passed to this method, it is assumed to be UTC.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • table – Feast FeatureView

  • data – a list of quadruplets containing Feature data. Each quadruplet contains an Entity Key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.

  • progress – Optional function to be called once every mini-batch of rows is written to the online store. Can be used to display progress.

plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra[source]

Returns the Infra required to support the desired registry.

Parameters
  • config – The RepoConfig for the current FeatureStore.

  • desired_registry_proto – The desired registry, in proto form.

abstract retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Read logged features from an offline store for a given time window [from, to). The target table is determined from the logging configuration of the feature service.

Returns

RetrievalJob object, which wraps the query to the offline store.

abstract retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob[source]

Read a saved dataset from the offline store. All parameters for retrieval (path, datetime boundaries, column names for both keys and features, etc.) are determined from the SavedDataset object.

Returns

RetrievalJob object, which is a lazy wrapper for the actual query performed under the hood.

abstract teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity])[source]

Tear down all cloud resources for a repo.

Parameters
  • project – Feast project to which tables belong

  • tables – Tables that are declared in the feature repo.

  • entities – Entities that are declared in the feature repo.

abstract update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]

Reconcile cloud resources with the objects declared in the feature repo.

Parameters
  • project – Project to which tables belong

  • tables_to_delete – Tables that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.

  • tables_to_keep – Tables that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.

  • entities_to_delete – Entities that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.

  • entities_to_keep – Entities that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.

  • partial – If true, then tables_to_delete and tables_to_keep are not exhaustive lists. There may be other tables that are not touched by this update.

abstract write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, pathlib.Path], config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry)[source]

Write features and entities logged by a feature server to an offline store.

The schema of the logs table is inferred from the provided feature service. Only feature services with configured logging are accepted.

The logs dataset can be passed as an Arrow Table or as a path to a Parquet directory.

feast.infra.provider.get_provider(config: feast.repo_config.RepoConfig, repo_path: pathlib.Path) feast.infra.provider.Provider[source]
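
A minimal sketch of resolving the configured provider (local, aws, gcp, or a custom class path) for a repo in the current directory:

    from pathlib import Path

    from feast import FeatureStore
    from feast.infra.provider import get_provider

    store = FeatureStore(repo_path=".")
    provider = get_provider(store.config, Path("."))
    print(type(provider).__name__)  # e.g. LocalProvider or AwsProvider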
