feast.infra package
Subpackages
- feast.infra.materialization package
- feast.infra.offline_stores package
- Subpackages
- Submodules
- feast.infra.offline_stores.bigquery module
- feast.infra.offline_stores.bigquery_source module
- feast.infra.offline_stores.file module
- feast.infra.offline_stores.file_source module
- feast.infra.offline_stores.offline_store module
- feast.infra.offline_stores.offline_utils module
- feast.infra.offline_stores.redshift module
- feast.infra.offline_stores.redshift_source module
- feast.infra.offline_stores.snowflake module
- feast.infra.offline_stores.snowflake_source module
- Module contents
- feast.infra.online_stores package
- Subpackages
- Submodules
- feast.infra.online_stores.datastore module
- feast.infra.online_stores.dynamodb module
- feast.infra.online_stores.helpers module
- feast.infra.online_stores.online_store module
- feast.infra.online_stores.redis module
- feast.infra.online_stores.snowflake module
- feast.infra.online_stores.sqlite module
- Module contents
- feast.infra.registry_stores package
- feast.infra.utils package
Submodules
feast.infra.aws module
- class feast.infra.aws.AwsProvider(config: feast.repo_config.RepoConfig)
Bases: feast.infra.passthrough_provider.PassthroughProvider
- get_feature_server_endpoint() → Optional[str]
Returns endpoint for the feature server, if it exists.
- teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) → None
Tear down all cloud resources for a repo.
- Parameters
project – Feast project to which tables belong
tables – Tables that are declared in the feature repo.
entities – Entities that are declared in the feature repo.
- update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)
Reconcile cloud resources with the objects declared in the feature repo.
- Parameters
project – Project to which tables belong
tables_to_delete – Tables that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
tables_to_keep – Tables that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
entities_to_delete – Entities that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
entities_to_keep – Entities that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists; there may be other tables that this update does not touch.
- class feast.infra.aws.S3RegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)
feast.infra.gcp module
- class feast.infra.gcp.GCSRegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)
- class feast.infra.gcp.GcpProvider(config: feast.repo_config.RepoConfig)
Bases: feast.infra.passthrough_provider.PassthroughProvider
This class only exists for backwards compatibility.
feast.infra.infra_object module
- class feast.infra.infra_object.Infra(infra_objects: List[feast.infra.infra_object.InfraObject] = <factory>)
Bases: object
Represents the set of infrastructure managed by Feast.
- Parameters
infra_objects – A list of InfraObjects, each representing one infrastructure object.
- classmethod from_proto(infra_proto: feast.core.InfraObject_pb2.Infra)
Returns an Infra object created from a protobuf representation.
- infra_objects: List[feast.infra.infra_object.InfraObject]
- class feast.infra.infra_object.InfraObject(name: str)
Bases: abc.ABC
Represents a single infrastructure object (e.g. online store table) managed by Feast.
- abstract static from_infra_object_proto(infra_object_proto: feast.core.InfraObject_pb2.InfraObject) → Any
Returns an InfraObject created from a protobuf representation.
- Parameters
infra_object_proto – A protobuf representation of an InfraObject.
- Raises
FeastInvalidInfraObjectType – The type of InfraObject could not be identified.
- static from_proto(infra_object_proto: Any) → Any
Converts a protobuf representation of a subclass to an object of that subclass.
- Parameters
infra_object_proto – A protobuf representation of an InfraObject.
- Raises
FeastInvalidInfraObjectType – The type of InfraObject could not be identified.
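The dispatch described for from_proto (pick the right subclass, fail if the type is unknown) can be sketched without Feast at all. In the sketch below every name (ToyInfraObject, ToyTable, from_record, from_any) is hypothetical, plain dicts stand in for protobuf messages, and ValueError stands in for FeastInvalidInfraObjectType; it illustrates the pattern only and is not Feast's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Type


class ToyInfraObject(ABC):
    """Hypothetical stand-in for an infrastructure object; not Feast's class."""

    # Maps a type tag to the subclass that knows how to decode it.
    _registry: Dict[str, Type["ToyInfraObject"]] = {}

    def __init__(self, name: str) -> None:
        self.name = name

    def __init_subclass__(cls, **kwargs: Any) -> None:
        # Register every subclass under its class name.
        super().__init_subclass__(**kwargs)
        ToyInfraObject._registry[cls.__name__] = cls

    @staticmethod
    @abstractmethod
    def from_record(record: Dict[str, str]) -> "ToyInfraObject":
        """Build an instance of a concrete subclass from its serialized form."""

    @staticmethod
    def from_any(record: Dict[str, str]) -> "ToyInfraObject":
        """Dispatch to the subclass named by the record's type tag."""
        cls = ToyInfraObject._registry.get(record.get("type", ""))
        if cls is None:
            raise ValueError(f"unknown infra object type: {record.get('type')!r}")
        return cls.from_record(record)


class ToyTable(ToyInfraObject):
    @staticmethod
    def from_record(record: Dict[str, str]) -> "ToyTable":
        return ToyTable(name=record["name"])


obj = ToyInfraObject.from_any({"type": "ToyTable", "name": "driver_stats"})
```

Here the base class owns the dispatch while each subclass owns its own decoding, which mirrors the split between from_proto and from_infra_object_proto documented above.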
feast.infra.key_encoding_utils module
- feast.infra.key_encoding_utils.serialize_entity_key(entity_key: feast.types.EntityKey_pb2.EntityKey, entity_key_serialization_version=1) → bytes
Serialize entity key to a bytestring so it can be used as a lookup key in a hash table.
We need this encoding to be stable; therefore we cannot just use protobuf serialization here since it does not guarantee that two proto messages containing the same data will serialize to the same byte string[1].
[1] https://developers.google.com/protocol-buffers/docs/encoding
- feast.infra.key_encoding_utils.serialize_entity_key_prefix(entity_keys: List[str]) → bytes
Serialize keys to a bytestring, so it can be used to prefix-scan through items stored in the online store using serialize_entity_key.
This encoding is a partial implementation of serialize_entity_key, only operating on the keys of entities, and not the values.
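To illustrate why a stable encoding and a matching prefix encoding are useful, here is a minimal sketch of the idea using only the standard library. It is a toy encoding, not Feast's actual byte layout: the functions toy_serialize_key and toy_serialize_key_prefix, the length-prefixed format, and the use of plain strings for values are all assumptions made for this example.

```python
import struct
from typing import Dict, List


def _pack_str(s: str) -> bytes:
    """Length-prefix a UTF-8 string so adjacent fields cannot run together."""
    raw = s.encode("utf-8")
    return struct.pack("<I", len(raw)) + raw


def toy_serialize_key(entity: Dict[str, str]) -> bytes:
    """Hypothetical stable encoding: sorted names first, then values in the same order."""
    names = sorted(entity)
    out = b"".join(_pack_str(n) for n in names)
    out += b"".join(_pack_str(entity[n]) for n in names)
    return out


def toy_serialize_key_prefix(names: List[str]) -> bytes:
    """Hypothetical prefix: encodes only the sorted names, matching the start of a full key."""
    return b"".join(_pack_str(n) for n in sorted(names))


# The same data in a different insertion order yields the same bytes.
a = toy_serialize_key({"driver_id": "1001", "region": "eu"})
b = toy_serialize_key({"region": "eu", "driver_id": "1001"})
prefix = toy_serialize_key_prefix(["region", "driver_id"])
```

Because names are sorted before encoding, a and b are byte-for-byte identical, and because names are written before values, every full key for those entities starts with prefix, which is what makes a prefix scan possible.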
feast.infra.local module
- class feast.infra.local.LocalProvider(config: feast.repo_config.RepoConfig)
Bases: feast.infra.passthrough_provider.PassthroughProvider
This class only exists for backwards compatibility.
- plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) → feast.infra.infra_object.Infra
Returns the Infra required to support the desired registry.
- Parameters
config – The RepoConfig for the current FeatureStore.
desired_registry_proto – The desired registry, in proto form.
- class feast.infra.local.LocalRegistryStore(registry_config: feast.repo_config.RegistryConfig, repo_path: pathlib.Path)
feast.infra.passthrough_provider module
- class feast.infra.passthrough_provider.PassthroughProvider(config: feast.repo_config.RepoConfig)
Bases: feast.infra.provider.Provider
The Passthrough provider delegates all operations to the underlying online and offline stores.
- property batch_engine: feast.infra.materialization.batch_materialization_engine.BatchMaterializationEngine
- get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.BaseRegistry, project: str, full_feature_names: bool) → feast.infra.offline_stores.offline_store.RetrievalJob
- ingest_df(feature_view: feast.feature_view.FeatureView, entities: List[feast.entity.Entity], df: pandas.core.frame.DataFrame)
Ingests a DataFrame directly into the online store.
- ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, table: pyarrow.lib.Table)
Ingests a pyarrow Table directly into the offline store.
- materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) → None
- property offline_store
- offline_write_batch(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, data: pyarrow.lib.Table, progress: Optional[Callable[[int], Any]]) → None
- online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: List[str] = None) → List
Read feature values for the given entity keys. This is a low-level interface that users are not expected to call directly.
- Returns
A list with one item per entity key. Each item is a tuple of the event timestamp (event_ts) for the row and the feature data as a dict mapping feature names to values. Values are returned as Value proto messages.
- property online_store
- online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) → None
Write a batch of feature rows to the online store. This is a low-level interface that users are not expected to call directly.
If a tz-naive timestamp is passed to this method, it is assumed to be UTC.
- Parameters
config – The RepoConfig for the current FeatureStore.
table – Feast FeatureView
data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict of feature values, an event timestamp for the row, and the created timestamp for the row if it exists.
progress – Optional function called each time a mini-batch of rows is written to the online store. Can be used to display progress.
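The progress callback has the shape Callable[[int], Any]. A minimal sketch of how a writer might drive such a callback is shown below. The helper write_in_batches and its batch_size parameter are hypothetical, and passing the number of rows in each mini-batch to the callback is an assumption made for this example rather than something the signature states.

```python
from typing import Any, Callable, List, Optional, Sequence


def write_in_batches(
    rows: Sequence[Any],
    batch_size: int,
    progress: Optional[Callable[[int], Any]] = None,
) -> List[List[Any]]:
    """Hypothetical writer: splits rows into mini-batches and reports each batch size."""
    written: List[List[Any]] = []
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        written.append(batch)  # a real store would persist the batch here
        if progress is not None:
            progress(len(batch))  # assumed meaning: rows written in this batch
    return written


# Collect the reported counts in a list instead of drawing a progress bar.
reported: List[int] = []
batches = write_in_batches(list(range(5)), batch_size=2, progress=reported.append)
```

Any callable that accepts an int works here, so a progress bar's update method could be passed in place of reported.append.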
- retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry) → feast.infra.offline_stores.offline_store.RetrievalJob
Read logged features from an offline store for the half-open time window [start_date, end_date). The target table is determined by the logging configuration of the feature service.
- Returns
RetrievalJob object, which wraps the query to the offline store.
- retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) → feast.infra.offline_stores.offline_store.RetrievalJob
Read a saved dataset from the offline store. All retrieval parameters (such as path, datetime boundaries, and column names for both keys and features) are determined from the SavedDataset object.
- Returns
RetrievalJob object, which is a lazy wrapper around the actual query.
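The lazy-wrapper behaviour of the returned job can be illustrated with a small stand-in. The class LazyJob and its run method are hypothetical names for this sketch and do not reflect the RetrievalJob interface; the point is only that constructing the wrapper does no work, and the query executes when the result is requested.

```python
from typing import Callable, List


class LazyJob:
    """Hypothetical lazy wrapper: stores a query and runs it only on demand."""

    def __init__(self, query: Callable[[], List[int]]) -> None:
        self._query = query  # nothing is executed at construction time

    def run(self) -> List[int]:
        return self._query()


calls: List[str] = []


def expensive_query() -> List[int]:
    calls.append("ran")  # record that the query actually executed
    return [1, 2, 3]


job = LazyJob(expensive_query)
calls_before_run = len(calls)  # still zero: the job has only been built
result = job.run()
```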
- teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) → None
Tear down all cloud resources for a repo.
- Parameters
project – Feast project to which tables belong
tables – Tables that are declared in the feature repo.
entities – Entities that are declared in the feature repo.
- update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)
Reconcile cloud resources with the objects declared in the feature repo.
- Parameters
project – Project to which tables belong
tables_to_delete – Tables that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
tables_to_keep – Tables that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
entities_to_delete – Entities that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
entities_to_keep – Entities that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists; there may be other tables that this update does not touch.
- write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, str], config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry)
Write features and entities logged by a feature server to an offline store.
The schema of the logs table is inferred from the provided feature service. Only feature services with logging configured are accepted.
The logs dataset can be passed as an Arrow Table or as a path to a Parquet directory.
feast.infra.provider module
- class feast.infra.provider.Provider(config: feast.repo_config.RepoConfig)
Bases: abc.ABC
- get_feature_server_endpoint() → Optional[str]
Returns endpoint for the feature server, if it exists.
- abstract get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.BaseRegistry, project: str, full_feature_names: bool) → feast.infra.offline_stores.offline_store.RetrievalJob
- ingest_df(feature_view: feast.feature_view.FeatureView, entities: List[feast.entity.Entity], df: pandas.core.frame.DataFrame)
Ingests a DataFrame directly into the online store.
- ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, df: pyarrow.lib.Table)
Ingests a pyarrow Table directly into the offline store.
- abstract materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) → None
- abstract online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: Optional[List[str]] = None) → List[Tuple[Optional[datetime.datetime], Optional[Dict[str, feast.types.Value_pb2.Value]]]]
Read feature values for the given entity keys. This is a low-level interface that users are not expected to call directly.
- Returns
A list with one item per entity key. Each item is a tuple of the event timestamp (event_ts) for the row and the feature data as a dict mapping feature names to values. Values are returned as Value proto messages.
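Since both elements of each returned tuple are Optional, callers need to handle empty entries. The sketch below shows one way to consume a result of this shape. It uses plain floats in place of Value protos, treats a (None, None) tuple as "no data for this entity key" (an assumption suggested by the Optional types, not stated in the text), and the helper found_rows is hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# Stand-in for the documented item type; floats replace Value protos here.
Row = Tuple[Optional[datetime], Optional[Dict[str, float]]]


def found_rows(rows: List[Row]) -> List[Dict[str, float]]:
    """Keep only the feature dicts for entity keys that had data."""
    return [features for _ts, features in rows if features is not None]


rows: List[Row] = [
    (datetime(2022, 1, 1, tzinfo=timezone.utc), {"conv_rate": 0.5}),
    (None, None),  # assumed representation of a missing entity key
]
hits = found_rows(rows)
```

The result list keeps one item per requested entity key, so positions in rows line up with positions in the entity_keys argument even when some entries are empty.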
- abstract online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) → None
Write a batch of feature rows to the online store. This is a low-level interface that users are not expected to call directly.
If a tz-naive timestamp is passed to this method, it is assumed to be UTC.
- Parameters
config – The RepoConfig for the current FeatureStore.
table – Feast FeatureView
data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict of feature values, an event timestamp for the row, and the created timestamp for the row if it exists.
progress – Optional function called each time a mini-batch of rows is written to the online store. Can be used to display progress.
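The rule that tz-naive timestamps are assumed to be UTC can be made explicit before building the data quadruplets. The helper to_utc below is a hypothetical name for this sketch; it relies only on the standard datetime module and shows one reasonable way to normalize timestamps, not how any particular store does it.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: datetime) -> datetime:
    """Treat naive timestamps as UTC; convert aware ones to UTC."""
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)  # attach UTC without shifting the clock time
    return ts.astimezone(timezone.utc)  # shift the clock time into UTC


naive = datetime(2022, 1, 1, 12, 0, 0)
aware = datetime(2022, 1, 1, 14, 0, 0, tzinfo=timezone(timedelta(hours=2)))
```

Normalizing up front avoids mixing naive and aware timestamps in the same batch, which Python refuses to compare.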
- plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) → feast.infra.infra_object.Infra
Returns the Infra required to support the desired registry.
- Parameters
config – The RepoConfig for the current FeatureStore.
desired_registry_proto – The desired registry, in proto form.
- abstract retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry) → feast.infra.offline_stores.offline_store.RetrievalJob
Read logged features from an offline store for the half-open time window [start_date, end_date). The target table is determined by the logging configuration of the feature service.
- Returns
RetrievalJob object, which wraps the query to the offline store.
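The time window above is half-open: the start is included and the end is excluded. A minimal sketch of that membership test follows; the helper in_window is a hypothetical name used only for illustration.

```python
from datetime import datetime
from typing import List


def in_window(ts: datetime, start_date: datetime, end_date: datetime) -> bool:
    """True when ts falls in the half-open interval [start_date, end_date)."""
    return start_date <= ts < end_date


start = datetime(2022, 1, 1)
end = datetime(2022, 1, 2)
events: List[datetime] = [
    datetime(2022, 1, 1),         # equal to start: included
    datetime(2022, 1, 1, 23, 59), # inside the window: included
    datetime(2022, 1, 2),         # equal to end: excluded
]
selected = [ts for ts in events if in_window(ts, start, end)]
```

With half-open windows, consecutive windows such as [Jan 1, Jan 2) and [Jan 2, Jan 3) cover every timestamp exactly once, so logs are neither skipped nor read twice.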
- abstract retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) → feast.infra.offline_stores.offline_store.RetrievalJob
Read a saved dataset from the offline store. All retrieval parameters (such as path, datetime boundaries, and column names for both keys and features) are determined from the SavedDataset object.
- Returns
RetrievalJob object, which is a lazy wrapper around the actual query.
- abstract teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity])
Tear down all cloud resources for a repo.
- Parameters
project – Feast project to which tables belong
tables – Tables that are declared in the feature repo.
entities – Entities that are declared in the feature repo.
- abstract update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)
Reconcile cloud resources with the objects declared in the feature repo.
- Parameters
project – Project to which tables belong
tables_to_delete – Tables that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
tables_to_keep – Tables that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
entities_to_delete – Entities that were deleted from the feature repo, so the provider needs to clean up the corresponding cloud resources.
entities_to_keep – Entities that are still in the feature repo. Depending on the implementation, the provider may or may not need to update the corresponding resources.
partial – If true, tables_to_delete and tables_to_keep are not exhaustive lists; there may be other tables that this update does not touch.
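One way a caller might derive the delete and keep lists is by comparing what was previously registered with what the feature repo now declares. The sketch below works on plain table names rather than FeatureView objects, and the helper split_tables is hypothetical; how Feast itself computes these arguments is not described here.

```python
from typing import List, Sequence, Tuple


def split_tables(
    previous: Sequence[str], declared: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Return (to_delete, to_keep) given previously registered and currently declared names."""
    declared_set = set(declared)
    # Anything registered before but no longer declared must be cleaned up.
    to_delete = [name for name in previous if name not in declared_set]
    # Everything currently declared is kept (and created if it is new).
    to_keep = list(declared)
    return to_delete, to_keep


to_delete, to_keep = split_tables(
    previous=["driver_stats", "old_view"],
    declared=["driver_stats", "new_view"],
)
```

When partial is true, lists built this way would cover only the objects involved in the current update, so a provider should not treat absence from both lists as a signal to delete.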
- abstract write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, pathlib.Path], config: feast.repo_config.RepoConfig, registry: feast.registry.BaseRegistry)
Write features and entities logged by a feature server to an offline store.
The schema of the logs table is inferred from the provided feature service. Only feature services with logging configured are accepted.
The logs dataset can be passed as an Arrow Table or as a path to a Parquet directory.
- feast.infra.provider.get_provider(config: feast.repo_config.RepoConfig, repo_path: pathlib.Path) → feast.infra.provider.Provider