feast package¶
Subpackages¶
Submodules¶
feast.cli module¶
feast.client module¶
feast.config module¶
feast.constants module¶
feast.data_format module¶
- class feast.data_format.AvroFormat(schema_json: str)[source]¶
Bases:
feast.data_format.StreamFormat
Defines the Avro streaming data format that encodes data in Avro format
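As an illustration of the schema_json argument, the sketch below builds an Avro record schema as a JSON string using only the standard library. The record name DriverStats and its fields are hypothetical, borrowed from the driver statistics example elsewhere in this reference; the resulting string is the kind of value one would pass as AvroFormat(schema_json=...).

```python
import json

# Hypothetical Avro record schema describing one streamed feature row.
driver_stats_schema = {
    "type": "record",
    "name": "DriverStats",
    "fields": [
        {"name": "driver_id", "type": "long"},
        {"name": "conv_rate", "type": "float"},
        {
            "name": "event_timestamp",
            "type": {"type": "long", "logicalType": "timestamp-micros"},
        },
    ],
}

# AvroFormat takes the schema as a JSON string rather than a dict.
schema_json = json.dumps(driver_stats_schema)
```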
- class feast.data_format.FileFormat[source]¶
Bases:
abc.ABC
Defines an abstract file format used to encode feature data in files
- class feast.data_format.ParquetFormat[source]¶
Bases:
feast.data_format.FileFormat
Defines the Parquet data format
- class feast.data_format.ProtoFormat(class_path: str)[source]¶
Bases:
feast.data_format.StreamFormat
Defines the Protobuf data format
feast.data_source module¶
- class feast.data_source.DataSource(event_timestamp_column: Optional[str] = None, created_timestamp_column: Optional[str] = None, field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = None)[source]¶
Bases:
abc.ABC
DataSource that can be used to source features.
- Parameters
event_timestamp_column (optional) – Event timestamp column used for point in time joins of feature values.
created_timestamp_column (optional) – Timestamp column indicating when the row was created, used for deduplicating rows.
field_mapping (optional) – A dictionary mapping of column names in this data source to feature names in a feature table or view. Only used for feature columns, not entity or timestamp columns.
date_partition_column (optional) – Timestamp column used for partitioning.
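To illustrate the field_mapping parameter, here is a minimal sketch in plain Python. It is not Feast's implementation: the helper name apply_field_mapping and the sample column names are hypothetical, and it assumes that unmapped columns (such as entity and timestamp columns) pass through unchanged.

```python
from typing import Any, Dict, List


def apply_field_mapping(
    rows: List[Dict[str, Any]], field_mapping: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Rename source columns to feature names; unmapped columns keep their names."""
    return [
        {field_mapping.get(column, column): value for column, value in row.items()}
        for row in rows
    ]


# Hypothetical source row: "trips_avg" is the column name in the data source.
source_rows = [
    {"driver_id": 1001, "event_timestamp": "2021-03-17 19:31", "trips_avg": 861}
]
mapped_rows = apply_field_mapping(source_rows, {"trips_avg": "avg_daily_trips"})
```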
- property created_timestamp_column: str¶
Returns the created timestamp column of this data source.
- property date_partition_column: str¶
Returns the date partition column of this data source.
- property event_timestamp_column: str¶
Returns the event timestamp column of this data source.
- property field_mapping: Dict[str, str]¶
Returns the field mapping of this data source.
- abstract static from_proto(data_source: feast.core.DataSource_pb2.DataSource) Any [source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- get_table_query_string() str [source]¶
Returns a string that can directly be used to reference this table in SQL.
- abstract static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- abstract to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.data_source.KafkaOptions(bootstrap_servers: str, message_format: feast.data_format.StreamFormat, topic: str)[source]¶
Bases:
object
DataSource Kafka options used to source features from Kafka messages
- property bootstrap_servers¶
Returns a comma-separated list of Kafka bootstrap servers
- classmethod from_proto(kafka_options_proto: feast.core.DataSource_pb2.KafkaOptions)[source]¶
Creates a KafkaOptions from a protobuf representation of a kafka option
- Parameters
kafka_options_proto – A protobuf representation of Kafka options
- Returns
Returns a KafkaOptions object based on the kafka_options protobuf
- property message_format¶
Returns the data format that is used to encode the feature data in Kafka messages
- to_proto() feast.core.DataSource_pb2.KafkaOptions [source]¶
Converts a KafkaOptions object to its protobuf representation.
- Returns
KafkaOptionsProto protobuf
- property topic¶
Returns the Kafka topic to collect feature data from
- class feast.data_source.KafkaSource(event_timestamp_column: str, bootstrap_servers: str, message_format: feast.data_format.StreamFormat, topic: str, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '')[source]¶
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- property kafka_options¶
Returns the kafka options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.data_source.KinesisOptions(record_format: feast.data_format.StreamFormat, region: str, stream_name: str)[source]¶
Bases:
object
DataSource Kinesis options used to source features from Kinesis records
- classmethod from_proto(kinesis_options_proto: feast.core.DataSource_pb2.KinesisOptions)[source]¶
Creates a KinesisOptions from a protobuf representation of a kinesis option
- Parameters
kinesis_options_proto – A protobuf representation of Kinesis options
- Returns
Returns a KinesisOptions object based on the kinesis_options protobuf
- property record_format¶
Returns the data format used to encode the feature data in the Kinesis records.
- property region¶
Returns the AWS region of Kinesis stream
- property stream_name¶
Returns the Kinesis stream name to obtain feature data from
- class feast.data_source.KinesisSource(event_timestamp_column: str, created_timestamp_column: str, record_format: feast.data_format.StreamFormat, region: str, stream_name: str, field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '')[source]¶
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- property kinesis_options¶
Returns the kinesis options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
feast.driver_test_data module¶
- class feast.driver_test_data.EventTimestampType(value)[source]¶
Bases:
enum.Enum
An enumeration.
- TZ_AWARE_FIXED_OFFSET = 2¶
- TZ_AWARE_US_PACIFIC = 3¶
- TZ_AWARE_UTC = 1¶
- TZ_NAIVE = 0¶
- feast.driver_test_data.create_customer_daily_profile_df(customers, start_date, end_date) pandas.core.frame.DataFrame [source]¶
Example df generated by this function:
| event_timestamp  | customer_id | current_balance | avg_passenger_count | lifetime_trip_count | created          |
|------------------+-------------+-----------------+---------------------+---------------------+------------------|
| 2021-03-17 19:31 | 1010        | 0.889188        | 0.049057            | 412                 | 2021-03-24 19:38 |
| 2021-03-18 19:31 | 1010        | 0.979273        | 0.212630            | 639                 | 2021-03-24 19:38 |
| 2021-03-19 19:31 | 1010        | 0.976549        | 0.176881            | 70                  | 2021-03-24 19:38 |
| 2021-03-20 19:31 | 1010        | 0.273697        | 0.325012            | 68                  | 2021-03-24 19:38 |
| 2021-03-21 19:31 | 1010        | 0.438262        | 0.313009            | 192                 | 2021-03-24 19:38 |
|                  | …           | …               | …                   | …                   |                  |
| 2021-03-19 19:31 | 1001        | 0.738860        | 0.857422            | 344                 | 2021-03-24 19:38 |
| 2021-03-20 19:31 | 1001        | 0.848397        | 0.745989            | 106                 | 2021-03-24 19:38 |
| 2021-03-21 19:31 | 1001        | 0.301552        | 0.185873            | 812                 | 2021-03-24 19:38 |
| 2021-03-22 19:31 | 1001        | 0.943030        | 0.561219            | 322                 | 2021-03-24 19:38 |
| 2021-03-23 19:31 | 1001        | 0.354919        | 0.810093            | 273                 | 2021-03-24 19:38 |
- feast.driver_test_data.create_driver_hourly_stats_df(drivers, start_date, end_date) pandas.core.frame.DataFrame [source]¶
Example df generated by this function:
| event_timestamp  | driver_id | conv_rate | acc_rate | avg_daily_trips | created          |
|------------------+-----------+-----------+----------+-----------------+------------------|
| 2021-03-17 19:31 | 5010      | 0.229297  | 0.685843 | 861             | 2021-03-24 19:34 |
| 2021-03-17 20:31 | 5010      | 0.781655  | 0.861280 | 769             | 2021-03-24 19:34 |
| 2021-03-17 21:31 | 5010      | 0.150333  | 0.525581 | 778             | 2021-03-24 19:34 |
| 2021-03-17 22:31 | 5010      | 0.951701  | 0.228883 | 570             | 2021-03-24 19:34 |
| 2021-03-17 23:31 | 5010      | 0.819598  | 0.262503 | 473             | 2021-03-24 19:34 |
|                  | …         | …         | …        | …               |                  |
| 2021-03-24 16:31 | 5001      | 0.061585  | 0.658140 | 477             | 2021-03-24 19:34 |
| 2021-03-24 17:31 | 5001      | 0.088949  | 0.303897 | 618             | 2021-03-24 19:34 |
| 2021-03-24 18:31 | 5001      | 0.096652  | 0.747421 | 480             | 2021-03-24 19:34 |
| 2021-03-17 19:31 | 5005      | 0.142936  | 0.707596 | 466             | 2021-03-24 19:34 |
| 2021-03-17 19:31 | 5005      | 0.142936  | 0.707596 | 466             | 2021-03-24 19:34 |
feast.entity module¶
- class feast.entity.Entity(name: str, value_type: feast.value_type.ValueType = ValueType.UNKNOWN, description: str = '', join_key: Optional[str] = None, labels: Optional[Dict[str, str]] = None)[source]¶
Bases:
object
Represents a collection of entities and associated metadata.
- Parameters
name – Name of the entity.
value_type (optional) – The type of the entity, such as string or float.
description (optional) – Additional information to describe the entity.
join_key (optional) – A property that uniquely identifies different entities within the collection. Used as a key for joining entities with their associated features. If not specified, defaults to the name of the entity.
labels (optional) – User-defined metadata in dictionary form.
- property created_timestamp: Optional[datetime.datetime]¶
Gets the created_timestamp of this entity.
- property description: str¶
Gets the description of this entity.
- classmethod from_dict(entity_dict)[source]¶
Creates an entity from a dict.
- Parameters
entity_dict – A dict representation of an entity.
- Returns
An Entity object based on the entity dict.
- classmethod from_proto(entity_proto: feast.core.Entity_pb2.Entity)[source]¶
Creates an entity from a protobuf representation of an entity.
- Parameters
entity_proto – A protobuf representation of an entity.
- Returns
An Entity object based on the entity protobuf.
- classmethod from_yaml(yml: str)[source]¶
Creates an entity from a YAML string body or a file path.
- Parameters
yml – Either a file path containing a yaml file or a YAML string.
- Returns
An Entity object based on the YAML file.
- is_valid()[source]¶
Validates the state of this entity locally.
- Raises
ValueError – The entity does not have a name or does not have a type.
- property join_key: str¶
Gets the join key of this entity.
- property labels: Dict[str, str]¶
Gets the labels of this entity.
- property last_updated_timestamp: Optional[datetime.datetime]¶
Gets the last_updated_timestamp of this entity.
- property name: str¶
Gets the name of this entity.
- to_dict() Dict [source]¶
Converts entity to dict.
- Returns
Dictionary object representation of entity.
- to_proto() feast.core.Entity_pb2.Entity [source]¶
Converts an entity object to its protobuf representation.
- Returns
An EntityV2Proto protobuf.
- to_spec_proto() feast.core.Entity_pb2.EntitySpecV2 [source]¶
Converts an Entity object to its EntitySpecV2 protobuf representation. Used when passing an EntitySpecV2 object in a Feast request.
- Returns
An EntitySpecV2 protobuf.
- to_yaml()[source]¶
Converts an entity to a YAML string.
- Returns
An entity string returned in YAML format.
- property value_type: feast.value_type.ValueType¶
Gets the type of this entity.
feast.errors module¶
- exception feast.errors.EntityTimestampInferenceException(expected_column_name: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastClassImportError(module_name, class_name, class_type='provider')[source]¶
Bases:
Exception
- exception feast.errors.FeastClassInvalidName(class_name: str, class_type: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastEntityDFMissingColumnsError(expected, missing)[source]¶
Bases:
Exception
- exception feast.errors.FeastExtrasDependencyImportError(extras_type: str, nested_error: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastJoinKeysDuringMaterialization(source: str, join_key_columns: Set[str], source_columns: Set[str])[source]¶
Bases:
Exception
- exception feast.errors.FeastModuleImportError(module_name: str, module_type: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastOfflineStoreUnsupportedDataSource(offline_store_name: str, data_source_name: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastOnlineStoreInvalidName(online_store_class_name: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastOnlineStoreUnsupportedDataSource(online_store_name: str, data_source_name: str)[source]¶
Bases:
Exception
- exception feast.errors.FeastProviderLoginError[source]¶
Bases:
Exception
Error class that indicates a user has not authenticated with their provider.
- exception feast.errors.FeatureNameCollisionError(feature_refs_collisions: List[str], full_feature_names: bool)[source]¶
Bases:
Exception
- exception feast.errors.RegistryInferenceFailure(repo_obj_type: str, specific_issue: str)[source]¶
Bases:
Exception
feast.feature module¶
- class feast.feature.Feature(name: str, dtype: feast.value_type.ValueType, labels: Optional[Dict[str, str]] = None)[source]¶
Bases:
object
A Feature represents a class of servable feature.
- Parameters
name – Name of the feature.
dtype – The type of the feature, such as string or float.
labels (optional) – User-defined metadata in dictionary form.
- property dtype: feast.value_type.ValueType¶
Gets the data type of this feature.
- classmethod from_proto(feature_proto: feast.core.Feature_pb2.FeatureSpecV2)[source]¶
- Parameters
feature_proto – FeatureSpecV2 protobuf object
- Returns
Feature object
- property labels: Dict[str, str]¶
Gets the labels of this feature.
- property name¶
Gets the name of this feature.
- class feast.feature.FeatureRef(name: str, feature_table: str)[source]¶
Bases:
object
Feature Reference represents a reference to a specific feature.
- classmethod from_proto(proto: feast.serving.ServingService_pb2.FeatureReferenceV2)[source]¶
Construct a feature reference from the given FeatureReference proto
- Parameters
proto – Protobuf FeatureReference to construct from
- Returns
FeatureRef that refers to the given feature
- classmethod from_str(feature_ref_str: str)[source]¶
Parses the given string feature reference into a FeatureRef. The string should be in the format feature_table:feature, where “feature_table” and “feature” are the feature table name and feature name respectively.
- Parameters
feature_ref_str – String representation of the feature reference
- Returns
FeatureRef that refers to the given feature
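The parsing rule described for from_str can be sketched in plain Python as follows. This is an illustration rather than Feast's own code: ParsedFeatureRef and parse_feature_ref are hypothetical names, and raising ValueError on malformed input is an assumption.

```python
from typing import NamedTuple


class ParsedFeatureRef(NamedTuple):
    """Hypothetical stand-in for FeatureRef, holding the two parsed parts."""

    feature_table: str
    name: str


def parse_feature_ref(feature_ref_str: str) -> ParsedFeatureRef:
    """Split a 'feature_table:feature' string into its two parts."""
    parts = feature_ref_str.split(":")
    # Assumption: exactly one colon with non-empty text on both sides is valid.
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"Expected 'feature_table:feature', got {feature_ref_str!r}"
        )
    return ParsedFeatureRef(feature_table=parts[0], name=parts[1])


ref = parse_feature_ref("driver_hourly_stats:conv_rate")
```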
feast.feature_store module¶
- class feast.feature_store.FeatureStore(repo_path: Optional[str] = None, config: Optional[feast.repo_config.RepoConfig] = None)[source]¶
Bases:
object
A FeatureStore object is used to define, create, and retrieve features.
- Parameters
repo_path (optional) – Path to a feature_store.yaml used to configure the feature store.
config (optional) – Configuration object used to configure the feature store.
- apply(objects: Union[feast.entity.Entity, feast.feature_view.FeatureView, feast.feature_service.FeatureService, List[Union[feast.feature_view.FeatureView, feast.entity.Entity, feast.feature_service.FeatureService]]], commit: bool = True)[source]¶
Register objects to metadata store and update related infrastructure.
The apply method registers one or more definitions (e.g., Entity, FeatureView) and registers or updates these objects in the Feast registry. Once the registry has been updated, the apply method will update related infrastructure (e.g., create tables in an online store) in order to reflect these new definitions. All operations are idempotent, meaning they can safely be rerun.
- Parameters
objects – A single object, or a list of objects that should be registered with the Feature Store.
commit – Whether to commit changes to the registry.
- Raises
ValueError – The ‘objects’ parameter could not be parsed properly.
Examples
Register an Entity and a FeatureView.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id")
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver]) # register entity and feature view
- config: feast.repo_config.RepoConfig¶
- delete_feature_service(name: str)[source]¶
Deletes a feature service.
- Parameters
name – Name of feature service.
- Raises
FeatureServiceNotFoundException – The feature service could not be found.
- delete_feature_view(name: str)[source]¶
Deletes a feature view.
- Parameters
name – Name of feature view.
- Raises
FeatureViewNotFoundException – The feature view could not be found.
- get_entity(name: str) feast.entity.Entity [source]¶
Retrieves an entity.
- Parameters
name – Name of entity.
- Returns
The specified entity.
- Raises
EntityNotFoundException – The entity could not be found.
- get_feature_service(name: str) feast.feature_service.FeatureService [source]¶
Retrieves a feature service.
- Parameters
name – Name of feature service.
- Returns
The specified feature service.
- Raises
FeatureServiceNotFoundException – The feature service could not be found.
- get_feature_view(name: str) feast.feature_view.FeatureView [source]¶
Retrieves a feature view.
- Parameters
name – Name of feature view.
- Returns
The specified feature view.
- Raises
FeatureViewNotFoundException – The feature view could not be found.
- get_historical_features(entity_df: Union[pandas.core.frame.DataFrame, str], features: Optional[Union[List[str], feast.feature_service.FeatureService]] = None, feature_refs: Optional[List[str]] = None, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
Enrich an entity dataframe with historical feature values for either training or batch scoring.
This method joins historical feature data from one or more feature views to an entity dataframe by using a time travel join.
Each feature view is joined to the entity dataframe using all entities configured for the respective feature view. All configured entities must be available in the entity dataframe. Therefore, the entity dataframe must contain all entities found in all feature views, but the individual feature views can have different entities.
Time travel is based on the configured TTL for each feature view. A shorter TTL will limit the amount of scanning that will be done in order to find feature data for a specific entity key. Setting a short TTL may result in null values being returned.
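The time travel join and the effect of the TTL can be sketched for a single entity row in plain Python. This is a simplified illustration, not the offline store's actual query: point_in_time_lookup is a hypothetical helper, the sample values are invented, and treating both window bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


def point_in_time_lookup(
    feature_rows: List[Dict[str, Any]],
    driver_id: int,
    as_of: datetime,
    ttl: timedelta,
) -> Optional[Dict[str, Any]]:
    """Return the newest row for driver_id with as_of - ttl <= event_timestamp <= as_of."""
    candidates = [
        row
        for row in feature_rows
        if row["driver_id"] == driver_id
        and as_of - ttl <= row["event_timestamp"] <= as_of
    ]
    if not candidates:
        return None  # nothing inside the TTL window, so the feature value is null
    return max(candidates, key=lambda row: row["event_timestamp"])


feature_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 8, 0), "conv_rate": 0.2},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 10, 0), "conv_rate": 0.5},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 12, 0), "conv_rate": 0.9},
]
as_of = datetime(2021, 4, 12, 10, 59, 42)

# A one-day TTL finds the 10:00 row; the 12:00 row lies in the future and is ignored.
within_ttl = point_in_time_lookup(feature_rows, 1001, as_of, ttl=timedelta(days=1))
# A 30-minute TTL leaves no candidate rows, which yields a null value.
short_ttl = point_in_time_lookup(feature_rows, 1001, as_of, ttl=timedelta(minutes=30))
```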
- Parameters
entity_df (Union[pd.DataFrame, str]) – An entity dataframe is a collection of rows containing all entity columns (e.g., customer_id, driver_id) on which features need to be joined, as well as an event_timestamp column used to ensure point-in-time correctness. Either a Pandas DataFrame or a string SQL query can be provided. The query must be of a format supported by the configured offline store (e.g., BigQuery).
features – A list of features that should be retrieved from the offline store. Either a list of string feature references or a FeatureService object can be provided. Feature references are of the format “feature_view:feature”, e.g., “customer_fv:daily_transactions”.
full_feature_names – A boolean that provides the option to add the feature view prefixes to the feature names, changing them from the format “feature” to “feature_view__feature” (e.g., “daily_transactions” changes to “customer_fv__daily_transactions”). By default, this value is set to False.
- Returns
RetrievalJob which can be used to materialize the results.
- Raises
ValueError – Both or neither of features and feature_refs are specified.
Examples
Retrieve historical features from a local offline store.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> import pandas as pd
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before retrieving historical features, we must register the appropriate entity and featureview.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id")
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver]) # register entity and feature view
>>> entity_df = pd.DataFrame.from_dict(
...     {
...         "driver_id": [1001, 1002],
...         "event_timestamp": [
...             datetime(2021, 4, 12, 10, 59, 42),
...             datetime(2021, 4, 12, 8, 12, 10),
...         ],
...     }
... )
>>> retrieval_job = fs.get_historical_features(
...     entity_df=entity_df,
...     features=[
...         "driver_hourly_stats:conv_rate",
...         "driver_hourly_stats:acc_rate",
...         "driver_hourly_stats:avg_daily_trips",
...     ],
... )
>>> feature_data = retrieval_job.to_df()
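The effect of the full_feature_names option on output column names, as described for get_historical_features, can be sketched as a small pure-Python function. The helper name output_column_name is hypothetical and only mirrors the naming rule stated in the parameter description.

```python
def output_column_name(feature_ref: str, full_feature_names: bool = False) -> str:
    """Map a 'feature_view:feature' reference to the name of its result column."""
    feature_view, feature = feature_ref.split(":", 1)
    if full_feature_names:
        # Prefix with the feature view name, joined by a double underscore.
        return f"{feature_view}__{feature}"
    return feature


short_name = output_column_name("customer_fv:daily_transactions")
full_name = output_column_name("customer_fv:daily_transactions", full_feature_names=True)
```

Prefixed names are mainly useful when two feature views expose a feature with the same name, since the short form would otherwise collide.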
- get_online_features(features: Union[List[str], feast.feature_service.FeatureService], entity_rows: List[Dict[str, Any]], feature_refs: Optional[List[str]] = None, full_feature_names: bool = False) feast.online_response.OnlineResponse [source]¶
Retrieves the latest online feature data.
Note: This method will download the full feature registry the first time it is run. If you are using a remote registry like GCS or S3 then that may take a few seconds. The registry remains cached up to a TTL duration (which can be set to infinity). If the cached registry is stale (more time than the TTL has passed), then a new registry will be downloaded synchronously by this method. This download may introduce latency to online feature retrieval. In order to avoid synchronous downloads, please call refresh_registry() prior to the TTL being reached. Remember it is possible to set the cache TTL to infinity (cache forever).
- Parameters
features – List of feature references that will be returned for each entity. Each feature reference should have the following format: “feature_table:feature” where “feature_table” and “feature” refer to the feature table and feature names respectively. Only the feature name is required.
entity_rows – A list of dictionaries where each key-value is an entity-name, entity-value pair.
- Returns
OnlineResponse containing the feature data in records.
- Raises
Exception – No entity with the specified name exists.
Examples
Materialize all features into the online store over the interval from 3 hours ago to 10 minutes ago, and then retrieve these online features.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> import pandas as pd
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before getting online features, we must register the appropriate entity and featureview and then materialize the features.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver]) # register entity and feature view
>>> fs.materialize(
...     start_date=datetime.utcnow() - timedelta(hours=3), end_date=datetime.utcnow() - timedelta(minutes=10)
... )
Materializing...
...
>>> online_response = fs.get_online_features(
...     features=[
...         "driver_hourly_stats:conv_rate",
...         "driver_hourly_stats:acc_rate",
...         "driver_hourly_stats:avg_daily_trips",
...     ],
...     entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}, {"driver_id": 1003}, {"driver_id": 1004}],
... )
>>> online_response_dict = online_response.to_dict()
- list_entities(allow_cache: bool = False) List[feast.entity.Entity] [source]¶
Retrieves the list of entities from the registry.
- Parameters
allow_cache – Whether to allow returning entities from a cached registry.
- Returns
A list of entities.
- list_feature_services() List[feast.feature_service.FeatureService] [source]¶
Retrieves the list of feature services from the registry.
- Returns
A list of feature services.
- list_feature_views() List[feast.feature_view.FeatureView] [source]¶
Retrieves the list of feature views from the registry.
- Returns
A list of feature views.
- materialize(start_date: datetime.datetime, end_date: datetime.datetime, feature_views: Optional[List[str]] = None) None [source]¶
Materialize data from the offline store into the online store.
This method loads feature data in the specified interval from either the specified feature views, or all feature views if none are specified, into the online store where it is available for online serving.
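Conceptually, materialization copies the newest in-range row per entity from the offline store into the online store. The sketch below shows that idea with plain dictionaries. It is not how Feast providers are implemented: materialize_rows, the dictionary-based stores, and the inclusive interval bounds are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Any, Dict, List


def materialize_rows(
    offline_rows: List[Dict[str, Any]],
    online_store: Dict[int, Dict[str, Any]],
    start_date: datetime,
    end_date: datetime,
) -> None:
    """Copy the newest row per driver_id within [start_date, end_date] into online_store."""
    for row in offline_rows:
        if not (start_date <= row["event_timestamp"] <= end_date):
            continue  # outside the requested interval
        current = online_store.get(row["driver_id"])
        if current is None or row["event_timestamp"] > current["event_timestamp"]:
            online_store[row["driver_id"]] = row


offline_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 9, 0), "conv_rate": 0.1},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 11, 0), "conv_rate": 0.4},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 4, 12, 10, 30), "conv_rate": 0.7},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 4, 12, 13, 0), "conv_rate": 0.9},
]
online_store: Dict[int, Dict[str, Any]] = {}
materialize_rows(
    offline_rows,
    online_store,
    start_date=datetime(2021, 4, 12, 10, 0),
    end_date=datetime(2021, 4, 12, 12, 0),
)
```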
- Parameters
start_date (datetime) – Start date for time range of data to materialize into the online store
end_date (datetime) – End date for time range of data to materialize into the online store
feature_views (List[str]) – Optional list of feature view names. If selected, will only run materialization for the specified feature views.
Examples
Materialize all features into the online store over the interval from 3 hours ago to 10 minutes ago.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before materializing, we must register the appropriate entity and featureview.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver]) # register entity and feature view
>>> fs.materialize(
...     start_date=datetime.utcnow() - timedelta(hours=3), end_date=datetime.utcnow() - timedelta(minutes=10)
... )
Materializing...
...
- materialize_incremental(end_date: datetime.datetime, feature_views: Optional[List[str]] = None) None [source]¶
Materialize incremental new data from the offline store into the online store.
This method loads incremental new feature data up to the specified end time from either the specified feature views, or all feature views if none are specified, into the online store where it is available for online serving. The start time of the interval materialized is either the most recent end time of a prior materialization or (now - ttl) if no such prior materialization exists.
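The rule for choosing the start of the incremental interval can be written out as a small function. This is a sketch of the rule as stated above, not Feast's code; incremental_start and its parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def incremental_start(
    now: datetime,
    ttl: timedelta,
    last_materialized_end: Optional[datetime],
) -> datetime:
    """Pick the start of the next incremental materialization interval."""
    if last_materialized_end is not None:
        # Resume where the most recent materialization stopped.
        return last_materialized_end
    # No prior materialization: look back one TTL from now.
    return now - ttl


now = datetime(2021, 4, 12, 12, 0)
first_run_start = incremental_start(now, timedelta(days=1), None)
later_run_start = incremental_start(now, timedelta(days=1), datetime(2021, 4, 12, 9, 0))
```

The dependence on ttl in the first-run case is why a feature view without a TTL cannot be materialized incrementally.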
- Parameters
end_date (datetime) – End date for time range of data to materialize into the online store
feature_views (List[str]) – Optional list of feature view names. If selected, will only run materialization for the specified feature views.
- Raises
Exception – A feature view being materialized does not have a TTL set.
Examples
Materialize all features into the online store up to 5 minutes ago.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before materializing, we must register the appropriate entity and featureview.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver]) # register entity and feature view
>>> fs.materialize_incremental(end_date=datetime.utcnow() - timedelta(minutes=5))
Materializing...
...
- property project: str¶
Gets the project of this feature store.
- refresh_registry()[source]¶
Fetches and caches a copy of the feature registry in memory.
Explicitly calling this method allows for direct control of the state of the registry cache. Every time this method is called, the complete registry state is retrieved from the remote registry store backend (e.g., GCS, S3) and the cache timer is reset. If refresh_registry() is run before get_online_features() is called, then get_online_features() will use the cached registry instead of retrieving (and caching) the registry itself.
Additionally, the TTL for the registry cache can be set to infinity (by setting it to 0), which means that refresh_registry() becomes the only way to update the cached registry. If the TTL is set to a value greater than 0, then once the cache becomes stale (more time than the TTL has passed), a new cache will be downloaded synchronously, which may increase latencies if the triggering method is get_online_features().
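The caching rules described above can be sketched in plain Python. This is a minimal illustration, not the Feast implementation: the class name CachedRegistry, the fetch callable, and the explicit now argument are all assumptions introduced here so the behavior can be exercised without a real registry backend.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional


class CachedRegistry:
    """Hypothetical stand-in for a registry cache (not Feast's own class)."""

    def __init__(self, fetch: Callable[[], dict], ttl: timedelta):
        self._fetch = fetch  # downloads the complete registry state
        self._ttl = ttl      # a zero TTL is treated as "never expires"
        self._cached: Optional[dict] = None
        self._created: Optional[datetime] = None

    def refresh(self, now: datetime) -> None:
        # Always re-download and reset the cache timer.
        self._cached = self._fetch()
        self._created = now

    def get(self, now: datetime) -> dict:
        never_loaded = self._cached is None
        can_expire = self._ttl.total_seconds() > 0
        stale = (
            not never_loaded
            and can_expire
            and now - self._created > self._ttl
        )
        if never_loaded or stale:
            self.refresh(now)  # synchronous download on the read path
        return self._cached


downloads = []


def fetch() -> dict:
    downloads.append(1)
    return {"version": len(downloads)}


t0 = datetime(2021, 1, 1)
reg = CachedRegistry(fetch, ttl=timedelta(seconds=600))
reg.refresh(t0)                        # explicit refresh: first download
reg.get(t0 + timedelta(seconds=10))    # within the TTL: served from cache
reg.get(t0 + timedelta(seconds=601))   # stale: second download
```

With a TTL of zero in this sketch, only the first read and explicit refresh() calls trigger a download, which mirrors the "cache forever" setting described above.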
- property registry: feast.registry.Registry¶
Gets the registry of this feature store.
- repo_path: pathlib.Path¶
feast.feature_table module¶
- class feast.feature_table.FeatureTable(name: str, entities: List[str], features: List[feast.feature.Feature], batch_source: Optional[feast.data_source.DataSource] = None, stream_source: Optional[Union[feast.data_source.KafkaSource, feast.data_source.KinesisSource]] = None, max_age: Optional[google.protobuf.duration_pb2.Duration] = None, labels: Optional[MutableMapping[str, str]] = None)[source]¶
Bases:
object
Represents a collection of features and associated metadata.
- add_feature(feature: feast.feature.Feature)[source]¶
Adds a new feature to the feature table.
- property batch_source¶
Returns the batch source of this feature table
- property created_timestamp¶
Returns the created_timestamp of this feature table
- property entities¶
Returns the entities of this feature table
- property features¶
Returns the features of this feature table
- classmethod from_dict(ft_dict)[source]¶
Creates a feature table from a dict
- Parameters
ft_dict – A dict representation of a feature table
- Returns
Returns a FeatureTable object based on the feature table dict
- classmethod from_proto(feature_table_proto: feast.core.FeatureTable_pb2.FeatureTable)[source]¶
Creates a feature table from a protobuf representation of a feature table
- Parameters
feature_table_proto – A protobuf representation of a feature table
- Returns
Returns a FeatureTable object based on the feature table protobuf
- classmethod from_yaml(yml: str)[source]¶
Creates a feature table from a YAML string body or a file path
- Parameters
yml – Either a file path containing a yaml file or a YAML string
- Returns
Returns a FeatureTable object based on the YAML file
- is_valid()[source]¶
Validates the state of a feature table locally. Raises an exception if feature table is invalid.
- property labels¶
Returns the labels of this feature table. This is the user defined metadata defined as a dictionary.
- property last_updated_timestamp¶
Returns the last_updated_timestamp of this feature table
- property max_age¶
Returns the maximum age of this feature table. This is the total maximum amount of staleness that will be allowed during feature retrieval for each specific feature that is looked up.
- property name¶
Returns the name of this feature table
- property stream_source¶
Returns the stream source of this feature table
- to_dict() Dict [source]¶
Converts feature table to dict
- Returns
Dictionary object representation of feature table
- to_proto() feast.core.FeatureTable_pb2.FeatureTable [source]¶
Converts a feature table object to its protobuf representation
- Returns
FeatureTableProto protobuf
feast.feature_view module¶
- class feast.feature_view.FeatureView(name: str, entities: List[str], ttl: Union[google.protobuf.duration_pb2.Duration, datetime.timedelta], input: Optional[feast.data_source.DataSource] = None, batch_source: Optional[feast.data_source.DataSource] = None, stream_source: Optional[feast.data_source.DataSource] = None, features: Optional[List[feast.feature.Feature]] = None, tags: Optional[Dict[str, str]] = None, online: bool = True)[source]¶
Bases:
object
A FeatureView defines a logical grouping of serveable features.
- Parameters
name – Name of the group of features.
entities – The entities to which this group of features is associated.
ttl – The amount of time this group of features lives. A ttl of 0 indicates that this group of features lives forever. Note that large ttl values or a ttl of 0 can result in extremely computationally intensive queries.
input – The source of data where this group of features is stored.
batch_source (optional) – The batch source of data where this group of features is stored.
stream_source (optional) – The stream source of data where this group of features is stored.
features (optional) – The set of features defined as part of this FeatureView.
tags (optional) – A dictionary of key-value pairs used for organizing FeatureViews.
- batch_source: feast.data_source.DataSource¶
- created_timestamp: Optional[datetime.datetime] = None¶
- features: List[feast.feature.Feature]¶
- classmethod from_proto(feature_view_proto: feast.core.FeatureView_pb2.FeatureView)[source]¶
Creates a feature view from a protobuf representation of a feature view.
- Parameters
feature_view_proto – A protobuf representation of a feature view.
- Returns
A FeatureView object based on the feature view protobuf.
- infer_features_from_batch_source(config: feast.repo_config.RepoConfig)[source]¶
Infers the set of features associated with this feature view from the input source.
- Parameters
config – Configuration object used to configure the feature store.
- Raises
RegistryInferenceFailure – The set of features could not be inferred.
- is_valid()[source]¶
Validates the state of this feature view locally.
- Raises
ValueError – The feature view does not have a name or does not have entities.
- last_updated_timestamp: Optional[datetime.datetime] = None¶
- materialization_intervals: List[Tuple[datetime.datetime, datetime.datetime]]¶
- property most_recent_end_time: Optional[datetime.datetime]¶
Retrieves the latest time up to which the feature view has been materialized.
- Returns
The latest time, or None if the feature view has not been materialized.
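As a rough illustration of how such a value can be derived from materialization_intervals, the sketch below takes a list of (start, end) tuples and returns the latest end time. The free function and the sample intervals are assumptions made for this example; the property itself lives on FeatureView.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Interval = Tuple[datetime, datetime]


def most_recent_end_time(intervals: List[Interval]) -> Optional[datetime]:
    """Return the latest interval end, or None if nothing was materialized."""
    if not intervals:
        return None
    return max(end for _start, end in intervals)


# Hypothetical materialization history for one feature view.
intervals = [
    (datetime(2021, 4, 1), datetime(2021, 4, 2)),
    (datetime(2021, 4, 2), datetime(2021, 4, 5)),
]
latest = most_recent_end_time(intervals)
```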
- stream_source: Optional[feast.data_source.DataSource] = None¶
- to_proto() feast.core.FeatureView_pb2.FeatureView [source]¶
Converts a feature view object to its protobuf representation.
- Returns
A FeatureViewProto protobuf.
- ttl: datetime.timedelta¶
feast.names module¶
feast.online_response module¶
- class feast.online_response.OnlineResponse(online_response_proto: feast.serving.ServingService_pb2.GetOnlineFeaturesResponse)[source]¶
Bases:
object
Defines an online response in feast.
- property field_values¶
Getter for GetOnlineResponse’s field_values.
feast.registry module¶
- class feast.registry.GCSRegistryStore(uri: str)[source]¶
Bases:
feast.registry.RegistryStore
- class feast.registry.LocalRegistryStore(repo_path: pathlib.Path, registry_path_string: str)[source]¶
Bases:
feast.registry.RegistryStore
- class feast.registry.Registry(registry_path: str, repo_path: pathlib.Path, cache_ttl: datetime.timedelta)[source]¶
Bases:
object
Registry: A registry allows for the management and persistence of feature definitions and related metadata.
- apply_entity(entity: feast.entity.Entity, project: str, commit: bool = True)[source]¶
Registers a single entity with Feast
- Parameters
entity – Entity that will be registered
project – Feast project that this entity belongs to
commit – Whether the change should be persisted immediately
- apply_feature_service(feature_service: feast.feature_service.FeatureService, project: str, commit: bool = True)[source]¶
Registers a single feature service with Feast
- Parameters
feature_service – A feature service that will be registered
project – Feast project that this feature service belongs to
commit – Whether the change should be persisted immediately
- apply_feature_table(feature_table: feast.feature_table.FeatureTable, project: str, commit: bool = True)[source]¶
Registers a single feature table with Feast
- Parameters
feature_table – Feature table that will be registered
project – Feast project that this feature table belongs to
commit – Whether the change should be persisted immediately
- apply_feature_view(feature_view: feast.feature_view.FeatureView, project: str, commit: bool = True)[source]¶
Registers a single feature view with Feast
- Parameters
feature_view – Feature view that will be registered
project – Feast project that this feature view belongs to
commit – Whether the change should be persisted immediately
- apply_materialization(feature_view: feast.feature_view.FeatureView, project: str, start_date: datetime.datetime, end_date: datetime.datetime, commit: bool = True)[source]¶
Updates materialization intervals tracked for a single feature view in Feast
- Parameters
feature_view – Feature view that will be updated with an additional materialization interval tracked
project – Feast project that this feature view belongs to
start_date (datetime) – Start date of the materialization interval to track
end_date (datetime) – End date of the materialization interval to track
commit – Whether the change should be persisted immediately
- cached_registry_proto: Optional[feast.core.Registry_pb2.Registry] = None¶
- cached_registry_proto_created: Optional[datetime.datetime] = None¶
- cached_registry_proto_ttl: datetime.timedelta¶
- delete_feature_service(name: str, project: str, commit: bool = True)[source]¶
Deletes a feature service or raises an exception if not found.
- Parameters
name – Name of feature service
project – Feast project that this feature service belongs to
commit – Whether the change should be persisted immediately
- delete_feature_table(name: str, project: str, commit: bool = True)[source]¶
Deletes a feature table or raises an exception if not found.
- Parameters
name – Name of feature table
project – Feast project that this feature table belongs to
commit – Whether the change should be persisted immediately
- delete_feature_view(name: str, project: str, commit: bool = True)[source]¶
Deletes a feature view or raises an exception if not found.
- Parameters
name – Name of feature view
project – Feast project that this feature view belongs to
commit – Whether the change should be persisted immediately
- get_entity(name: str, project: str, allow_cache: bool = False) feast.entity.Entity [source]¶
Retrieves an entity.
- Parameters
name – Name of entity
project – Feast project that this entity belongs to
- Returns
Returns either the specified entity, or raises an exception if none is found
- get_feature_service(name: str, project: str, allow_cache: bool = False) feast.feature_service.FeatureService [source]¶
Retrieves a feature service.
- Parameters
name – Name of feature service
project – Feast project that this feature service belongs to
- Returns
Returns either the specified feature service, or raises an exception if none is found
- get_feature_table(name: str, project: str) feast.feature_table.FeatureTable [source]¶
Retrieves a feature table.
- Parameters
name – Name of feature table
project – Feast project that this feature table belongs to
- Returns
Returns either the specified feature table, or raises an exception if none is found
- get_feature_view(name: str, project: str) feast.feature_view.FeatureView [source]¶
Retrieves a feature view.
- Parameters
name – Name of feature view
project – Feast project that this feature view belongs to
- Returns
Returns either the specified feature view, or raises an exception if none is found
- list_entities(project: str, allow_cache: bool = False) List[feast.entity.Entity] [source]¶
Retrieve a list of entities from the registry
- Parameters
allow_cache – Whether to allow returning entities from a cached registry
project – Filter entities based on project name
- Returns
List of entities
- list_feature_services(project: str, allow_cache: bool = False) List[feast.feature_service.FeatureService] [source]¶
Retrieve a list of feature services from the registry
- Parameters
allow_cache – Whether to allow returning feature services from a cached registry
project – Filter feature services based on project name
- Returns
List of feature services
- list_feature_tables(project: str) List[feast.feature_table.FeatureTable] [source]¶
Retrieve a list of feature tables from the registry
- Parameters
project – Filter feature tables based on project name
- Returns
List of feature tables
- list_feature_views(project: str, allow_cache: bool = False) List[feast.feature_view.FeatureView] [source]¶
Retrieve a list of feature views from the registry
- Parameters
allow_cache – Allow returning feature views from the cached registry
project – Filter feature views based on project name
- Returns
List of feature views
- class feast.registry.RegistryStore[source]¶
Bases:
abc.ABC
RegistryStore: abstract base class implemented by specific backends (local file system, GCS) containing lower level methods used by the Registry class that are backend-specific.
- class feast.registry.S3RegistryStore(uri: str)[source]¶
Bases:
feast.registry.RegistryStore
feast.repo_config module¶
- class feast.repo_config.FeastBaseModel(**extra_data: Any)[source]¶
Bases:
pydantic.main.BaseModel
Feast Pydantic Configuration Class
- class feast.repo_config.FeastConfigBaseModel[source]¶
Bases:
pydantic.main.BaseModel
Feast Pydantic Configuration Class
- class feast.repo_config.RegistryConfig(*, path: pydantic.types.StrictStr, cache_ttl_seconds: pydantic.types.StrictInt = 600, **extra_data: Any)[source]¶
Bases:
feast.repo_config.FeastBaseModel
Metadata Store Configuration. Configuration that relates to reading from and writing to the Feast registry.
- cache_ttl_seconds: pydantic.types.StrictInt¶
The cache TTL is the amount of time registry state will be cached in memory. If this TTL is exceeded then the registry will be refreshed when any feature store method asks for access to registry state. The TTL can be set to infinity by setting TTL to 0 seconds, which means the cache will only be loaded once and will never expire. Users can manually refresh the cache by calling feature_store.refresh_registry()
- Type
- class feast.repo_config.RepoConfig(*, registry: Union[pydantic.types.StrictStr, feast.repo_config.RegistryConfig] = 'data/registry.db', project: pydantic.types.StrictStr, provider: pydantic.types.StrictStr, online_store: Any = None, offline_store: Any = None, repo_path: pathlib.Path = None, **data: Any)[source]¶
Bases:
feast.repo_config.FeastBaseModel
Repo config. Typically loaded from feature_store.yaml
- offline_store: Any¶
Offline store configuration (optional depending on provider)
- Type
OfflineStoreConfig
- online_store: Any¶
Online store configuration (optional depending on provider)
- Type
OnlineStoreConfig
- project: pydantic.types.StrictStr¶
Feast project id. This can be any alphanumeric string up to 16 characters. You can have multiple independent feature repositories deployed to the same cloud provider account, as long as they have different project ids.
- Type
- registry: Union[pydantic.types.StrictStr, feast.repo_config.RegistryConfig]¶
Path to metadata store. Can be a local path, or remote object storage path, e.g. a GCS URI
- Type
- repo_path: Optional[pathlib.Path]¶
- feast.repo_config.load_repo_config(repo_path: pathlib.Path) feast.repo_config.RepoConfig [source]¶
feast.repo_operations module¶
- class feast.repo_operations.ParsedRepo(feature_tables, feature_views, entities, feature_services)[source]¶
Bases:
tuple
- property entities¶
Alias for field number 2
- property feature_services¶
Alias for field number 3
- property feature_tables¶
Alias for field number 0
- property feature_views¶
Alias for field number 1
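The "Alias for field number N" entries are what Sphinx emits for a named tuple, so ParsedRepo can be read either by field name or by position. The sketch below rebuilds an equivalent tuple locally to show the correspondence; the set-of-names contents are placeholders chosen for illustration, not the objects that parse_repo actually returns.

```python
from collections import namedtuple

# Local stand-in with the same field order as the documented class.
ParsedRepo = namedtuple(
    "ParsedRepo",
    ["feature_tables", "feature_views", "entities", "feature_services"],
)

repo = ParsedRepo(
    feature_tables=set(),
    feature_views={"driver_hourly_stats"},  # placeholder contents
    entities={"driver_id"},
    feature_services=set(),
)

# Positional and named access refer to the same fields.
same_entities = repo[2] is repo.entities
same_views = repo[1] is repo.feature_views
```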
- feast.repo_operations.apply_total(repo_config: feast.repo_config.RepoConfig, repo_path: pathlib.Path, skip_source_validation: bool)[source]¶
- feast.repo_operations.cli_check_repo(repo_path: pathlib.Path)[source]¶
- feast.repo_operations.get_ignore_files(repo_root: pathlib.Path, ignore_paths: List[str]) Set[pathlib.Path] [source]¶
Get all ignore files that match any of the user-defined ignore paths
- feast.repo_operations.get_repo_files(repo_root: pathlib.Path) List[pathlib.Path] [source]¶
Get the list of all repo files, ignoring undesired files & directories specified in .feastignore
- feast.repo_operations.is_valid_name(name: str) bool [source]¶
A name should contain only alphanumeric characters and underscores, and must not start with an underscore
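One way to express this rule is a single regular expression. The sketch below is an assumption about how the check could be written, not a copy of the library code; in particular it restricts names to ASCII letters and digits and rejects the empty string, which the documented rule does not spell out.

```python
import re

# First character: letter or digit. Remaining characters: letters, digits, underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")


def is_valid_name_sketch(name: str) -> bool:
    """Illustrative check; ASCII-only and non-empty by assumption."""
    return _NAME_PATTERN.fullmatch(name) is not None


checks = {
    name: is_valid_name_sketch(name)
    for name in ["driver_hourly_stats", "_private", "bad-name", "stats2"]
}
```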
- feast.repo_operations.parse_repo(repo_root: pathlib.Path) feast.repo_operations.ParsedRepo [source]¶
Collect feature table definitions from feature repo
- feast.repo_operations.py_path_to_module(path: pathlib.Path, repo_root: pathlib.Path) str [source]¶
- feast.repo_operations.read_feastignore(repo_root: pathlib.Path) List[str] [source]¶
Read .feastignore in the repo root directory (if exists) and return the list of user-defined ignore paths
- feast.repo_operations.registry_dump(repo_config: feast.repo_config.RepoConfig, repo_path: pathlib.Path)[source]¶
For debugging only: output contents of the metadata registry
- feast.repo_operations.teardown(repo_config: feast.repo_config.RepoConfig, repo_path: pathlib.Path)[source]¶
feast.telemetry module¶
feast.type_map module¶
- feast.type_map.feast_value_type_to_python_type(field_value_proto: feast.types.Value_pb2.Value) Any [source]¶
Converts a field value Proto to a Dict and returns the field’s Feast Value Type value as its corresponding Python value.
- Parameters
field_value_proto – Field value Proto
- Returns
Python native type representation/version of the given field_value_proto
- feast.type_map.pa_to_feast_value_type(pa_type_as_str: str) feast.value_type.ValueType [source]¶
- feast.type_map.python_type_to_feast_value_type(name: str, value, recurse: bool = True) feast.value_type.ValueType [source]¶
Finds the equivalent Feast Value Type for a Python value. Both native and Pandas types are supported. This function will recursively look for nested types when arrays are detected. All types must be homogeneous.
- Parameters
name – Name of the value or field
value – Value that will be inspected
recurse – Whether to recursively look for nested types in arrays
- Returns
Feast Value Type
- feast.type_map.python_value_to_proto_value(value: Any, feature_type: Optional[feast.value_type.ValueType] = None) feast.types.Value_pb2.Value [source]¶
- feast.type_map.redshift_to_feast_value_type(redshift_type_as_str: str) feast.value_type.ValueType [source]¶
feast.utils module¶
- feast.utils.make_tzaware(t: datetime.datetime) datetime.datetime [source]¶
We assume tz-naive datetimes are UTC
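A minimal sketch of the behavior this one-line description implies: naive datetimes are stamped as UTC without shifting the clock time, and aware datetimes pass through unchanged. The function name is suffixed to make clear it is an illustration rather than the library function.

```python
from datetime import datetime, timezone


def make_tzaware_sketch(t: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave aware datetimes untouched."""
    if t.utcoffset() is None:
        return t.replace(tzinfo=timezone.utc)
    return t


naive = datetime(2021, 4, 12, 10, 59, 42)
aware = make_tzaware_sketch(naive)
```

Note that replace() only labels the value as UTC. If the naive datetime actually represented local time, the result would be wrong by the local offset, which is why the UTC assumption is worth knowing about.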
feast.value_type module¶
- class feast.value_type.ValueType(value)[source]¶
Bases:
enum.Enum
Feature value type. Used to define data types in Feature Tables.
- BOOL = 7¶
- BOOL_LIST = 17¶
- BYTES = 1¶
- BYTES_LIST = 11¶
- DOUBLE = 5¶
- DOUBLE_LIST = 15¶
- FLOAT = 6¶
- FLOAT_LIST = 16¶
- INT32 = 3¶
- INT32_LIST = 13¶
- INT64 = 4¶
- INT64_LIST = 14¶
- STRING = 2¶
- STRING_LIST = 12¶
- UNIX_TIMESTAMP = 8¶
- UNIX_TIMESTAMP_LIST = 18¶
- UNKNOWN = 0¶
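In the listing above, each list type's value is its scalar type's value plus 10 (for example INT64 = 4 and INT64_LIST = 14). The sketch below copies a few members into a local enum to show that relationship; list_type_of is a hypothetical helper for illustration and is not part of feast.value_type.

```python
from enum import Enum


class ValueTypeSketch(Enum):
    """Local copy of a few documented members, for illustration only."""

    BYTES = 1
    STRING = 2
    INT64 = 4
    BYTES_LIST = 11
    STRING_LIST = 12
    INT64_LIST = 14


def list_type_of(scalar: ValueTypeSketch) -> ValueTypeSketch:
    """Hypothetical helper: look up the list variant of a scalar type by name."""
    return ValueTypeSketch[scalar.name + "_LIST"]


int64_list = list_type_of(ValueTypeSketch.INT64)
```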
feast.version module¶
feast.wait module¶
- feast.wait.wait_retry_backoff(retry_fn: Callable[[], Tuple[Any, bool]], timeout_secs: int = 0, timeout_msg: Optional[str] = 'Timeout while waiting for retry_fn() to return True', max_interval_secs: int = 60) Any [source]¶
Repeatedly calls the given retry_fn until it returns a True boolean success flag. Waits with an exponential backoff between retries until the timeout is reached, at which point it throws TimeoutError.
- Parameters
retry_fn – Callable that returns a result and a boolean success flag.
timeout_secs – Timeout in seconds after which to give up retrying and throw TimeoutError, or 0 to retry perpetually.
timeout_msg – Message to use when throwing TimeoutError.
max_interval_secs – Maximum wait in seconds between retries.
- Returns
Returned Result from retry_fn() if success flag is True.
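The retry loop described here can be sketched as follows. This is an illustration under stated assumptions, not the library source: the initial_interval_secs and sleep parameters do not appear in the documented signature and are added only so the doubling schedule is visible and testable without real waiting.

```python
import time
from typing import Any, Callable, Tuple


def retry_with_backoff(
    retry_fn: Callable[[], Tuple[Any, bool]],
    timeout_secs: float = 0,
    timeout_msg: str = "Timeout while waiting for retry_fn() to return True",
    max_interval_secs: float = 60,
    initial_interval_secs: float = 1,  # assumption: not in the documented signature
    sleep: Callable[[float], None] = time.sleep,  # assumption: injectable for tests
) -> Any:
    # A timeout of 0 means "retry perpetually".
    deadline = None if timeout_secs == 0 else time.monotonic() + timeout_secs
    interval = initial_interval_secs
    while True:
        result, success = retry_fn()
        if success:
            return result
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(timeout_msg)
        sleep(interval)
        # Double the wait each round, capped at max_interval_secs.
        interval = min(interval * 2, max_interval_secs)


attempts = []
waits = []


def flaky() -> Tuple[str, bool]:
    """Succeeds on the fourth call."""
    attempts.append(1)
    return "ready", len(attempts) >= 4


result = retry_with_backoff(flaky, max_interval_secs=3, sleep=waits.append)
```

Here the recorded waits are 1, 2, and 3 seconds: the interval doubles after each failure until it reaches the cap.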
Module contents¶
- class feast.BigQuerySource(event_timestamp_column: Optional[str] = '', table_ref: Optional[str] = None, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '', query: Optional[str] = None)[source]¶
Bases:
feast.data_source.DataSource
- property bigquery_options¶
Returns the bigquery options of this data source
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- get_table_query_string() str [source]¶
Returns a string that can directly be used to reference this table in SQL
- property query¶
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- property table_ref¶
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.Entity(name: str, value_type: feast.value_type.ValueType = ValueType.UNKNOWN, description: str = '', join_key: Optional[str] = None, labels: Optional[Dict[str, str]] = None)[source]¶
Bases:
object
Represents a collection of entities and associated metadata.
- Parameters
name – Name of the entity.
value_type (optional) – The type of the entity, such as string or float.
description (optional) – Additional information to describe the entity.
join_key (optional) – A property that uniquely identifies different entities within the collection. Used as a key for joining entities with their associated features. If not specified, defaults to the name of the entity.
labels (optional) – User-defined metadata in dictionary form.
- property created_timestamp: Optional[datetime.datetime]¶
Gets the created_timestamp of this entity.
- property description: str¶
Gets the description of this entity.
- classmethod from_dict(entity_dict)[source]¶
Creates an entity from a dict.
- Parameters
entity_dict – A dict representation of an entity.
- Returns
An Entity object based on the entity dict.
- classmethod from_proto(entity_proto: feast.core.Entity_pb2.Entity)[source]¶
Creates an entity from a protobuf representation of an entity.
- Parameters
entity_proto – A protobuf representation of an entity.
- Returns
An Entity object based on the entity protobuf.
- classmethod from_yaml(yml: str)[source]¶
Creates an entity from a YAML string body or a file path.
- Parameters
yml – Either a file path containing a yaml file or a YAML string.
- Returns
An Entity object based on the YAML file.
- is_valid()[source]¶
Validates the state of this entity locally.
- Raises
ValueError – The entity does not have a name or does not have a type.
- property join_key: str¶
Gets the join key of this entity.
- property labels: Dict[str, str]¶
Gets the labels of this entity.
- property last_updated_timestamp: Optional[datetime.datetime]¶
Gets the last_updated_timestamp of this entity.
- property name: str¶
Gets the name of this entity.
- to_dict() Dict [source]¶
Converts entity to dict.
- Returns
Dictionary object representation of entity.
- to_proto() feast.core.Entity_pb2.Entity [source]¶
Converts an entity object to its protobuf representation.
- Returns
An EntityV2Proto protobuf.
- to_spec_proto() feast.core.Entity_pb2.EntitySpecV2 [source]¶
Converts an Entity object to its EntitySpecV2 protobuf representation. Used when passing an EntitySpecV2 object in a Feast request.
- Returns
An EntitySpecV2 protobuf.
- to_yaml()[source]¶
Converts an entity to a YAML string.
- Returns
An entity string returned in YAML format.
- property value_type: feast.value_type.ValueType¶
Gets the type of this entity.
- class feast.Feature(name: str, dtype: feast.value_type.ValueType, labels: Optional[Dict[str, str]] = None)[source]¶
Bases:
object
A Feature represents a class of serveable feature.
- Parameters
name – Name of the feature.
dtype – The type of the feature, such as string or float.
labels (optional) – User-defined metadata in dictionary form.
- property dtype: feast.value_type.ValueType¶
Gets the data type of this feature.
- classmethod from_proto(feature_proto: feast.core.Feature_pb2.FeatureSpecV2)[source]¶
- Parameters
feature_proto – FeatureSpecV2 protobuf object
- Returns
Feature object
- property labels: Dict[str, str]¶
Gets the labels of this feature.
- property name¶
Gets the name of this feature.
- class feast.FeatureService(name: str, features: List[Union[feast.feature_table.FeatureTable, feast.feature_view.FeatureView, feast.feature_view_projection.FeatureViewProjection]], tags: Optional[Dict[str, str]] = None)[source]¶
Bases:
object
A feature service is a logical grouping of features for retrieval (training or serving). The features grouped by a feature service may come from any number of feature views.
- Parameters
name – Unique name of the feature service.
features – A list of Features that are grouped as part of this FeatureService. The list may contain Feature Views, Feature Tables, or a subset of either.
tags (optional) – A dictionary of key-value pairs used for organizing Feature Services.
- created_timestamp: Optional[datetime.datetime] = None¶
- features: List[feast.feature_view_projection.FeatureViewProjection]¶
- static from_proto(feature_service_proto: feast.core.FeatureService_pb2.FeatureService)[source]¶
Converts a FeatureServiceProto to a FeatureService object.
- Parameters
feature_service_proto – A protobuf representation of a FeatureService.
- last_updated_timestamp: Optional[datetime.datetime] = None¶
- class feast.FeatureStore(repo_path: Optional[str] = None, config: Optional[feast.repo_config.RepoConfig] = None)[source]¶
Bases:
object
A FeatureStore object is used to define, create, and retrieve features.
- Parameters
repo_path (optional) – Path to a feature_store.yaml used to configure the feature store.
config (optional) – Configuration object used to configure the feature store.
- apply(objects: Union[feast.entity.Entity, feast.feature_view.FeatureView, feast.feature_service.FeatureService, List[Union[feast.feature_view.FeatureView, feast.entity.Entity, feast.feature_service.FeatureService]]], commit: bool = True)[source]¶
Register objects to metadata store and update related infrastructure.
The apply method takes one or more definitions (e.g., Entity, FeatureView) and registers or updates them in the Feast registry. Once the registry has been updated, the apply method will update related infrastructure (e.g., create tables in an online store) in order to reflect these new definitions. All operations are idempotent, meaning they can safely be rerun.
- Parameters
objects – A single object, or a list of objects that should be registered with the Feature Store.
commit – Whether to commit changes to the registry.
- Raises
ValueError – The ‘objects’ parameter could not be parsed properly.
Examples
Register an Entity and a FeatureView.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id")
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver])  # register entity and feature view
- config: feast.repo_config.RepoConfig¶
- delete_feature_service(name: str)[source]¶
Deletes a feature service.
- Parameters
name – Name of feature service.
- Raises
FeatureServiceNotFoundException – The feature service could not be found.
- delete_feature_view(name: str)[source]¶
Deletes a feature view.
- Parameters
name – Name of feature view.
- Raises
FeatureViewNotFoundException – The feature view could not be found.
- get_entity(name: str) feast.entity.Entity [source]¶
Retrieves an entity.
- Parameters
name – Name of entity.
- Returns
The specified entity.
- Raises
EntityNotFoundException – The entity could not be found.
- get_feature_service(name: str) feast.feature_service.FeatureService [source]¶
Retrieves a feature service.
- Parameters
name – Name of feature service.
- Returns
The specified feature service.
- Raises
FeatureServiceNotFoundException – The feature service could not be found.
- get_feature_view(name: str) feast.feature_view.FeatureView [source]¶
Retrieves a feature view.
- Parameters
name – Name of feature view.
- Returns
The specified feature view.
- Raises
FeatureViewNotFoundException – The feature view could not be found.
- get_historical_features(entity_df: Union[pandas.core.frame.DataFrame, str], features: Optional[Union[List[str], feast.feature_service.FeatureService]] = None, feature_refs: Optional[List[str]] = None, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
Enrich an entity dataframe with historical feature values for either training or batch scoring.
This method joins historical feature data from one or more feature views to an entity dataframe by using a time travel join.
Each feature view is joined to the entity dataframe using all entities configured for the respective feature view. All configured entities must be available in the entity dataframe. Therefore, the entity dataframe must contain all entities found in all feature views, but the individual feature views can have different entities.
Time travel is based on the configured TTL for each feature view. A shorter TTL will limit the amount of scanning that will be done in order to find feature data for a specific entity key. Setting a short TTL may result in null values being returned.
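To make the effect of the TTL concrete, the sketch below performs the lookup for a single entity row in plain Python: it picks the latest feature row at or before the entity's event timestamp, but only if that row is no older than the TTL. The function name, the in-memory row format, and the inclusive window boundaries are assumptions for illustration; offline stores perform this join in their own query engines, and the special case of a TTL of 0 is not modeled here.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def point_in_time_lookup(
    feature_rows: List[Dict],
    driver_id: int,
    event_timestamp: datetime,
    ttl: timedelta,
) -> Optional[Dict]:
    """Latest row with event_timestamp - ttl <= row time <= event_timestamp."""
    candidates = [
        row
        for row in feature_rows
        if row["driver_id"] == driver_id
        and event_timestamp - ttl <= row["event_timestamp"] <= event_timestamp
    ]
    if not candidates:
        return None  # would surface as null feature values
    return max(candidates, key=lambda row: row["event_timestamp"])


# Hypothetical feature data for one driver.
feature_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 8, 0), "conv_rate": 0.4},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 10, 0), "conv_rate": 0.5},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 12, 0), "conv_rate": 0.9},
]
as_of = datetime(2021, 4, 12, 10, 59, 42)

hit = point_in_time_lookup(feature_rows, 1001, as_of, ttl=timedelta(days=1))
miss = point_in_time_lookup(feature_rows, 1001, as_of, ttl=timedelta(minutes=30))
```

With a one-day TTL the 10:00 row is returned and the later 12:00 row is ignored. With a 30-minute TTL no row falls inside the window, so the lookup yields nothing, which corresponds to the null values mentioned above.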
- Parameters
entity_df (Union[pd.DataFrame, str]) – An entity dataframe is a collection of rows containing all entity columns (e.g., customer_id, driver_id) on which features need to be joined, as well as an event_timestamp column used to ensure point-in-time correctness. Either a Pandas DataFrame or a string SQL query can be provided. The query must be of a format supported by the configured offline store (e.g., BigQuery).
features – A list of features that should be retrieved from the offline store. Either a list of string feature references or a FeatureService object can be provided. Feature references are of the format “feature_view:feature”, e.g., “customer_fv:daily_transactions”.
full_feature_names – A boolean that provides the option to add the feature view prefixes to the feature names, changing them from the format “feature” to “feature_view__feature” (e.g., “daily_transactions” changes to “customer_fv__daily_transactions”). By default, this value is set to False.
- Returns
RetrievalJob which can be used to materialize the results.
- Raises
ValueError – Both or neither of features and feature_refs are specified.
Examples
Retrieve historical features from a local offline store.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> import pandas as pd
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before retrieving historical features, we must register the appropriate entity and feature view.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id")
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver])  # register entity and feature view
>>> entity_df = pd.DataFrame.from_dict(
...     {
...         "driver_id": [1001, 1002],
...         "event_timestamp": [
...             datetime(2021, 4, 12, 10, 59, 42),
...             datetime(2021, 4, 12, 8, 12, 10),
...         ],
...     }
... )
>>> retrieval_job = fs.get_historical_features(
...     entity_df=entity_df,
...     features=[
...         "driver_hourly_stats:conv_rate",
...         "driver_hourly_stats:acc_rate",
...         "driver_hourly_stats:avg_daily_trips",
...     ],
... )
>>> feature_data = retrieval_job.to_df()
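The feature reference format and the effect of full_feature_names on output column names can be illustrated with two small helpers. Both functions are hypothetical and written only for this example; they follow the “feature_view:feature” and “feature_view__feature” formats described in the parameters of get_historical_features.

```python
from typing import Tuple


def parse_feature_ref(ref: str) -> Tuple[str, str]:
    """Split 'feature_view:feature' into its two parts (illustrative helper)."""
    view, sep, feature = ref.partition(":")
    if not sep or not view or not feature:
        raise ValueError(f"Expected 'feature_view:feature', got {ref!r}")
    return view, feature


def output_name(ref: str, full_feature_names: bool = False) -> str:
    """Column name for a reference, with or without the feature view prefix."""
    view, feature = parse_feature_ref(ref)
    return f"{view}__{feature}" if full_feature_names else feature


short = output_name("customer_fv:daily_transactions")
full = output_name("customer_fv:daily_transactions", full_feature_names=True)
```

The prefixed form is useful when two feature views expose a feature with the same name, since the short names would otherwise collide in the output.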
- get_online_features(features: Union[List[str], feast.feature_service.FeatureService], entity_rows: List[Dict[str, Any]], feature_refs: Optional[List[str]] = None, full_feature_names: bool = False) feast.online_response.OnlineResponse [source]¶
Retrieves the latest online feature data.
Note: This method will download the full feature registry the first time it is run. If you are using a remote registry like GCS or S3 then that may take a few seconds. The registry remains cached up to a TTL duration (which can be set to infinity). If the cached registry is stale (more time than the TTL has passed), then a new registry will be downloaded synchronously by this method. This download may introduce latency to online feature retrieval. In order to avoid synchronous downloads, please call refresh_registry() prior to the TTL being reached. Remember it is possible to set the cache TTL to infinity (cache forever).
- Parameters
features – List of feature references that will be returned for each entity. Each feature reference should have the following format: “feature_table:feature” where “feature_table” and “feature” refer to the feature table and feature names respectively. Only the feature name is required.
entity_rows – A list of dictionaries where each key-value is an entity-name, entity-value pair.
- Returns
OnlineResponse containing the feature data in records.
- Raises
Exception – No entity with the specified name exists.
Examples
Materialize all features into the online store over the interval from 3 hours ago to 10 minutes ago, and then retrieve these online features.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> import pandas as pd
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before getting online features, we must register the appropriate entity and feature view and then materialize the features.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver])  # register entity and feature view
>>> fs.materialize(
...     start_date=datetime.utcnow() - timedelta(hours=3), end_date=datetime.utcnow() - timedelta(minutes=10)
... )
Materializing...
...
>>> online_response = fs.get_online_features(
...     features=[
...         "driver_hourly_stats:conv_rate",
...         "driver_hourly_stats:acc_rate",
...         "driver_hourly_stats:avg_daily_trips",
...     ],
...     entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}, {"driver_id": 1003}, {"driver_id": 1004}],
... )
>>> online_response_dict = online_response.to_dict()
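The feature reference format accepted by get_online_features can be illustrated with a small parser. This is a minimal sketch, assuming only the “feature_table:feature” format described in the parameters; parse_feature_ref is a hypothetical name and not a Feast function.

```python
from typing import Optional, Tuple


def parse_feature_ref(ref: str) -> Tuple[Optional[str], str]:
    """Split "feature_table:feature" into (feature_table, feature).

    Hypothetical helper for illustration only. The table part is optional,
    so a bare "feature" yields (None, "feature").
    """
    table, _, feature = ref.rpartition(":")
    if not feature:
        raise ValueError(f"feature name missing in reference: {ref!r}")
    return (table or None, feature)


refs = ["driver_hourly_stats:conv_rate", "acc_rate"]
parsed = [parse_feature_ref(r) for r in refs]
```

The second reference omits the table name, which the description allows because only the feature name is required.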
- list_entities(allow_cache: bool = False) List[feast.entity.Entity] [source]¶
Retrieves the list of entities from the registry.
- Parameters
allow_cache – Whether to allow returning entities from a cached registry.
- Returns
A list of entities.
- list_feature_services() List[feast.feature_service.FeatureService] [source]¶
Retrieves the list of feature services from the registry.
- Returns
A list of feature services.
- list_feature_views() List[feast.feature_view.FeatureView] [source]¶
Retrieves the list of feature views from the registry.
- Returns
A list of feature views.
- materialize(start_date: datetime.datetime, end_date: datetime.datetime, feature_views: Optional[List[str]] = None) None [source]¶
Materialize data from the offline store into the online store.
This method loads feature data in the specified interval from either the specified feature views, or all feature views if none are specified, into the online store where it is available for online serving.
- Parameters
start_date (datetime) – Start date for time range of data to materialize into the online store
end_date (datetime) – End date for time range of data to materialize into the online store
feature_views (List[str]) – Optional list of feature view names. If selected, will only run materialization for the specified feature views.
Examples
Materialize all features into the online store over the interval from 3 hours ago to 10 minutes ago.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before materializing, we must register the appropriate entity and feature view.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver])  # register entity and feature view
>>> fs.materialize(
...     start_date=datetime.utcnow() - timedelta(hours=3), end_date=datetime.utcnow() - timedelta(minutes=10)
... )
Materializing...
...
- materialize_incremental(end_date: datetime.datetime, feature_views: Optional[List[str]] = None) None [source]¶
Materialize incremental new data from the offline store into the online store.
This method loads incremental new feature data up to the specified end time from either the specified feature views, or all feature views if none are specified, into the online store where it is available for online serving. The start time of the interval materialized is either the most recent end time of a prior materialization or (now - ttl) if no such prior materialization exists.
- Parameters
end_date (datetime) – End date for time range of data to materialize into the online store
feature_views (List[str]) – Optional list of feature view names. If selected, will only run materialization for the specified feature views.
- Raises
Exception – A feature view being materialized does not have a TTL set.
Examples
Materialize all features into the online store up to 5 minutes ago.
>>> from feast import FeatureStore, Entity, FeatureView, Feature, ValueType, FileSource, RepoConfig
>>> from datetime import datetime, timedelta
>>> fs = FeatureStore(config=RepoConfig(registry="feature_repo/data/registry.db", project="feature_repo", provider="local"))
>>> # Before materializing, we must register the appropriate entity and feature view.
>>> driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
>>> driver_hourly_stats = FileSource(
...     path="feature_repo/data/driver_stats.parquet",
...     event_timestamp_column="event_timestamp",
...     created_timestamp_column="created",
... )
>>> driver_hourly_stats_view = FeatureView(
...     name="driver_hourly_stats",
...     entities=["driver_id"],
...     ttl=timedelta(seconds=86400 * 1),
...     features=[
...         Feature(name="conv_rate", dtype=ValueType.FLOAT),
...         Feature(name="acc_rate", dtype=ValueType.FLOAT),
...         Feature(name="avg_daily_trips", dtype=ValueType.INT64),
...     ],
...     batch_source=driver_hourly_stats,
... )
>>> fs.apply([driver_hourly_stats_view, driver])  # register entity and feature view
>>> fs.materialize_incremental(end_date=datetime.utcnow() - timedelta(minutes=5))
Materializing...
...
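The rule for choosing the start of the incremental interval (the most recent prior end time, or now - ttl when no prior materialization exists) can be sketched as a pure function. The name incremental_start and its parameters are hypothetical and only restate the description of materialize_incremental; this is not Feast's implementation.

```python
from datetime import datetime, timedelta
from typing import Optional


def incremental_start(
    now: datetime,
    ttl: timedelta,
    most_recent_end_time: Optional[datetime] = None,
) -> datetime:
    """Pick the start of the interval to materialize (illustrative sketch)."""
    if most_recent_end_time is not None:
        # Continue from where the previous materialization stopped.
        return most_recent_end_time
    # No prior materialization: go back one TTL from the current time.
    return now - ttl


now = datetime(2021, 4, 12, 12, 0, 0)
first_run = incremental_start(now, ttl=timedelta(days=1))
later_run = incremental_start(
    now, ttl=timedelta(days=1), most_recent_end_time=datetime(2021, 4, 12, 9, 0, 0)
)
```

The fallback branch needs a TTL to compute a start time, which is why a feature view without a TTL raises an exception.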
- property project: str¶
Gets the project of this feature store.
- refresh_registry()[source]¶
Fetches and caches a copy of the feature registry in memory.
Explicitly calling this method allows for direct control of the state of the registry cache. Every time this method is called the complete registry state will be retrieved from the remote registry store backend (e.g., GCS, S3), and the cache timer will be reset. If refresh_registry() is run before get_online_features() is called, then get_online_features() will use the cached registry instead of retrieving (and caching) the registry itself.
Additionally, the TTL for the registry cache can be set to infinity (by setting it to 0), which means that refresh_registry() will become the only way to update the cached registry. If the TTL is set to a value greater than 0, then once the cache becomes stale (more time than the TTL has passed), a new cache will be downloaded synchronously, which may increase latencies if the triggering method is get_online_features().
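The staleness rule described above, including the convention that a TTL of 0 means the cache never expires, can be sketched as follows. The function cache_is_stale and its parameters are hypothetical names for illustration, not part of Feast.

```python
from datetime import datetime, timedelta


def cache_is_stale(cached_at: datetime, now: datetime, ttl_seconds: int) -> bool:
    """Return True when a cached registry should be re-downloaded (sketch)."""
    if ttl_seconds == 0:
        # A TTL of 0 is treated as infinite: only an explicit refresh updates the cache.
        return False
    # Stale once more time than the TTL has passed since the cache was filled.
    return now - cached_at > timedelta(seconds=ttl_seconds)


cached_at = datetime(2021, 4, 12, 10, 0, 0)
now = datetime(2021, 4, 12, 10, 15, 0)
stale_with_ttl = cache_is_stale(cached_at, now, ttl_seconds=600)
stale_with_infinite_ttl = cache_is_stale(cached_at, now, ttl_seconds=0)
```

With a 600-second TTL the cache is stale after 15 minutes, so the next get_online_features() call would trigger a synchronous download; with a TTL of 0 it is never stale.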
- property registry: feast.registry.Registry¶
Gets the registry of this feature store.
- repo_path: pathlib.Path¶
- class feast.FeatureTable(name: str, entities: List[str], features: List[feast.feature.Feature], batch_source: Optional[feast.data_source.DataSource] = None, stream_source: Optional[Union[feast.data_source.KafkaSource, feast.data_source.KinesisSource]] = None, max_age: Optional[google.protobuf.duration_pb2.Duration] = None, labels: Optional[MutableMapping[str, str]] = None)[source]¶
Bases:
object
Represents a collection of features and associated metadata.
- add_feature(feature: feast.feature.Feature)[source]¶
Adds a new feature to the feature table.
- property batch_source¶
Returns the batch source of this feature table
- property created_timestamp¶
Returns the created_timestamp of this feature table
- property entities¶
Returns the entities of this feature table
- property features¶
Returns the features of this feature table
- classmethod from_dict(ft_dict)[source]¶
Creates a feature table from a dict
- Parameters
ft_dict – A dict representation of a feature table
- Returns
Returns a FeatureTable object based on the feature table dict
- classmethod from_proto(feature_table_proto: feast.core.FeatureTable_pb2.FeatureTable)[source]¶
Creates a feature table from a protobuf representation of a feature table
- Parameters
feature_table_proto – A protobuf representation of a feature table
- Returns
Returns a FeatureTable object based on the feature table protobuf
- classmethod from_yaml(yml: str)[source]¶
Creates a feature table from a YAML string body or a file path
- Parameters
yml – Either a file path containing a yaml file or a YAML string
- Returns
Returns a FeatureTable object based on the YAML file
- is_valid()[source]¶
Validates the state of a feature table locally. Raises an exception if feature table is invalid.
- property labels¶
Returns the labels of this feature table. This is the user defined metadata defined as a dictionary.
- property last_updated_timestamp¶
Returns the last_updated_timestamp of this feature table
- property max_age¶
Returns the maximum age of this feature table. This is the total maximum amount of staleness that will be allowed during feature retrieval for each specific feature that is looked up.
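One way to read max_age is as a bound on how old a feature value may be relative to the time of the lookup. The sketch below assumes that reading; within_max_age is a hypothetical helper, it uses datetime.timedelta rather than the protobuf Duration type, and treating the boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def within_max_age(feature_ts: datetime, lookup_ts: datetime, max_age: timedelta) -> bool:
    """Return True if a feature value is fresh enough to be served (sketch)."""
    # Staleness is the time elapsed between the feature value and the lookup.
    staleness = lookup_ts - feature_ts
    # Assumption: a value exactly max_age old is still accepted.
    return timedelta(0) <= staleness <= max_age


lookup_ts = datetime(2021, 4, 12, 12, 0, 0)
fresh = within_max_age(datetime(2021, 4, 12, 11, 30, 0), lookup_ts, timedelta(hours=1))
expired = within_max_age(datetime(2021, 4, 12, 10, 0, 0), lookup_ts, timedelta(hours=1))
```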
- property name¶
Returns the name of this feature table
- property stream_source¶
Returns the stream source of this feature table
- to_dict() Dict [source]¶
Converts feature table to dict
- Returns
Dictionary object representation of feature table
- to_proto() feast.core.FeatureTable_pb2.FeatureTable [source]¶
Converts a feature table object to its protobuf representation
- Returns
FeatureTableProto protobuf
- class feast.FeatureView(name: str, entities: List[str], ttl: Union[google.protobuf.duration_pb2.Duration, datetime.timedelta], input: Optional[feast.data_source.DataSource] = None, batch_source: Optional[feast.data_source.DataSource] = None, stream_source: Optional[feast.data_source.DataSource] = None, features: Optional[List[feast.feature.Feature]] = None, tags: Optional[Dict[str, str]] = None, online: bool = True)[source]¶
Bases:
object
A FeatureView defines a logical grouping of servable features.
- Parameters
name – Name of the group of features.
entities – The entities to which this group of features is associated.
ttl – The amount of time this group of features lives. A ttl of 0 indicates that this group of features lives forever. Note that a large ttl or a ttl of 0 can result in extremely computationally intensive queries.
input – The source of data where this group of features is stored.
batch_source (optional) – The batch source of data where this group of features is stored.
stream_source (optional) – The stream source of data where this group of features is stored.
features (optional) – The set of features defined as part of this FeatureView.
tags (optional) – A dictionary of key-value pairs used for organizing FeatureViews.
- batch_source: feast.data_source.DataSource¶
- created_timestamp: Optional[datetime.datetime] = None¶
- features: List[feast.feature.Feature]¶
- classmethod from_proto(feature_view_proto: feast.core.FeatureView_pb2.FeatureView)[source]¶
Creates a feature view from a protobuf representation of a feature view.
- Parameters
feature_view_proto – A protobuf representation of a feature view.
- Returns
A FeatureView object based on the feature view protobuf.
- infer_features_from_batch_source(config: feast.repo_config.RepoConfig)[source]¶
Infers the set of features associated with this feature view from the input source.
- Parameters
config – Configuration object used to configure the feature store.
- Raises
RegistryInferenceFailure – The set of features could not be inferred.
- is_valid()[source]¶
Validates the state of this feature view locally.
- Raises
ValueError – The feature view does not have a name or does not have entities.
- last_updated_timestamp: Optional[datetime.datetime] = None¶
- materialization_intervals: List[Tuple[datetime.datetime, datetime.datetime]]¶
- property most_recent_end_time: Optional[datetime.datetime]¶
Retrieves the latest time up to which the feature view has been materialized.
- Returns
The latest time, or None if the feature view has not been materialized.
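The relationship between materialization_intervals and most_recent_end_time can be sketched as taking the latest end time across all recorded intervals. This is an illustrative reading of the description, assuming each interval is a (start, end) tuple; most_recent_end is a hypothetical standalone function, not the property's actual implementation.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def most_recent_end(
    intervals: List[Tuple[datetime, datetime]]
) -> Optional[datetime]:
    """Return the latest end time, or None if nothing was materialized (sketch)."""
    if not intervals:
        return None
    # Each interval is (start, end); only the end times matter here.
    return max(end for _, end in intervals)


intervals = [
    (datetime(2021, 4, 10), datetime(2021, 4, 11)),
    (datetime(2021, 4, 11), datetime(2021, 4, 12)),
]
latest = most_recent_end(intervals)
nothing_yet = most_recent_end([])
```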
- stream_source: Optional[feast.data_source.DataSource] = None¶
- to_proto() feast.core.FeatureView_pb2.FeatureView [source]¶
Converts a feature view object to its protobuf representation.
- Returns
A FeatureViewProto protobuf.
- ttl: datetime.timedelta¶
- class feast.FileSource(event_timestamp_column: Optional[str] = '', file_url: Optional[str] = None, path: Optional[str] = None, file_format: Optional[feast.data_format.FileFormat] = None, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '')[source]¶
Bases:
feast.data_source.DataSource
- property file_options¶
Returns the file options of this data source
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- property path¶
Returns the file path of this feature data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.KafkaSource(event_timestamp_column: str, bootstrap_servers: str, message_format: feast.data_format.StreamFormat, topic: str, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '')[source]¶
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- property kafka_options¶
Returns the kafka options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.KinesisSource(event_timestamp_column: str, created_timestamp_column: str, record_format: feast.data_format.StreamFormat, region: str, stream_name: str, field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '')[source]¶
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- property kinesis_options¶
Returns the kinesis options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.RedshiftSource(event_timestamp_column: Optional[str] = '', table: Optional[str] = None, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '', query: Optional[str] = None)[source]¶
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]¶
Converts data source config in FeatureTable spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]¶
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- get_table_query_string() str [source]¶
Returns a string that can directly be used to reference this table in SQL
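Since a RedshiftSource is constructed from either a table or a query, a string that can reference the source in SQL might be built along these lines. This is a sketch under stated assumptions: table_query_string is a hypothetical standalone function, and the choices to double-quote the table name and wrap the query in parentheses are assumptions rather than confirmed behavior of get_table_query_string().

```python
from typing import Optional


def table_query_string(table: Optional[str] = None, query: Optional[str] = None) -> str:
    """Build a SQL fragment usable in a FROM clause (illustrative sketch)."""
    if table:
        # Assumption: the table name is quoted as an identifier.
        return f'"{table}"'
    if query:
        # Assumption: a query is wrapped so it can be used as a subquery.
        return f"({query})"
    raise ValueError("either table or query must be set")


from_table = table_query_string(table="driver_stats")
from_query = table_query_string(query="SELECT * FROM driver_stats")
```

Either result could then be interpolated into a statement such as SELECT ... FROM <fragment>.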
- property query¶
- property redshift_options¶
Returns the Redshift options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]¶
Returns the callable method that returns Feast type given the raw column type.
- property table¶
- to_proto() feast.core.DataSource_pb2.DataSource [source]¶
Converts a DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]¶
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.
- class feast.RepoConfig(*, registry: Union[pydantic.types.StrictStr, feast.repo_config.RegistryConfig] = 'data/registry.db', project: pydantic.types.StrictStr, provider: pydantic.types.StrictStr, online_store: Any = None, offline_store: Any = None, repo_path: pathlib.Path = None, **data: Any)[source]¶
Bases:
feast.repo_config.FeastBaseModel
Repo config. Typically loaded from feature_store.yaml
- offline_store: Any¶
Offline store configuration (optional depending on provider)
- Type
OfflineStoreConfig
- online_store: Any¶
Online store configuration (optional depending on provider)
- Type
OnlineStoreConfig
- project: pydantic.types.StrictStr¶
Feast project id. This can be any alphanumeric string up to 16 characters. You can have multiple independent feature repositories deployed to the same cloud provider account, as long as they have different project ids.
- Type
- registry: Union[pydantic.types.StrictStr, feast.repo_config.RegistryConfig]¶
Path to metadata store. Can be a local path, or remote object storage path, e.g. a GCS URI
- Type
- repo_path: Optional[pathlib.Path]¶
- class feast.SourceType(value)[source]¶
Bases:
enum.Enum
DataSource value type. Used to define source types in DataSource.
- BATCH_BIGQUERY = 2¶
- BATCH_FILE = 1¶
- STREAM_KAFKA = 3¶
- STREAM_KINESIS = 4¶
- UNKNOWN = 0¶
- class feast.ValueType(value)[source]¶
Bases:
enum.Enum
Feature value type. Used to define data types in Feature Tables.
- BOOL = 7¶
- BOOL_LIST = 17¶
- BYTES = 1¶
- BYTES_LIST = 11¶
- DOUBLE = 5¶
- DOUBLE_LIST = 15¶
- FLOAT = 6¶
- FLOAT_LIST = 16¶
- INT32 = 3¶
- INT32_LIST = 13¶
- INT64 = 4¶
- INT64_LIST = 14¶
- STRING = 2¶
- STRING_LIST = 12¶
- UNIX_TIMESTAMP = 8¶
- UNIX_TIMESTAMP_LIST = 18¶
- UNKNOWN = 0¶