feast.infra.offline_stores package¶
Submodules¶
feast.infra.offline_stores.bigquery module¶
- class feast.infra.offline_stores.bigquery.BigQueryOfflineStore[source]¶
Bases:
feast.infra.offline_stores.offline_store.OfflineStore
- static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
- static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have already been mapped to the column names of the source table; the values passed into this function are those source-table column names.
- class feast.infra.offline_stores.bigquery.BigQueryOfflineStoreConfig(*, type: typing_extensions.Literal[bigquery] = 'bigquery', dataset: pydantic.types.StrictStr = 'feast', project_id: Optional[pydantic.types.StrictStr] = None)[source]¶
Bases:
feast.repo_config.FeastConfigBaseModel
Offline store config for GCP BigQuery
- dataset: pydantic.types.StrictStr¶
(optional) BigQuery Dataset name for temporary tables
- project_id: Optional[pydantic.types.StrictStr]¶
(optional) GCP project name used for the BigQuery offline store
- type: typing_extensions.Literal[bigquery]¶
Offline store type selector
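For reference, this config is typically populated from the offline_store block of a repository's feature_store.yaml. A minimal sketch (the project, registry path, and GCP project names below are placeholders, not defaults):

```yaml
project: my_feature_repo
registry: data/registry.db
provider: gcp
offline_store:
  type: bigquery              # selects BigQueryOfflineStore
  dataset: feast              # optional; dataset used for temporary tables
  project_id: my-gcp-project  # optional; GCP project for the offline store
```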
- class feast.infra.offline_stores.bigquery.BigQueryRetrievalJob(query, client, config)[source]¶
Bases:
feast.infra.offline_stores.offline_store.RetrievalJob
- to_bigquery(job_config: Optional[google.cloud.bigquery.job.query.QueryJobConfig] = None, timeout: int = 1800, retry_cadence: int = 10) Optional[str] [source]¶
Triggers the execution of a historical feature retrieval query and exports the results to a BigQuery table. Runs for a maximum amount of time specified by the timeout parameter (defaulting to 30 minutes).
- Parameters
job_config – An optional bigquery.QueryJobConfig to specify options like destination table, dry run, etc.
timeout – An optional number of seconds for setting the time limit of the QueryJob.
retry_cadence – An optional number of seconds specifying how often the job should be checked for completion.
- Returns
The destination table name, or None if job_config.dry_run is True.
- feast.infra.offline_stores.bigquery.block_until_done(client: google.cloud.bigquery.client.Client, bq_job: Union[google.cloud.bigquery.job.query.QueryJob, google.cloud.bigquery.job.load.LoadJob], timeout: int = 1800, retry_cadence: int = 10)[source]¶
Waits for bq_job to finish running, up to a maximum amount of time specified by the timeout parameter (defaulting to 30 minutes).
- Parameters
client – A bigquery.client.Client to monitor the bq_job.
bq_job – The bigquery.job.QueryJob or bigquery.job.LoadJob to wait on until it has finished running.
timeout – An optional number of seconds for setting the time limit of the job.
retry_cadence – An optional number of seconds specifying how often the job should be checked for completion.
- Raises
BigQueryJobStillRunning exception if the function has blocked for longer than the specified timeout.
BigQueryJobCancelled exception to signify that the job has been cancelled (e.g. by a timeout or KeyboardInterrupt).
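The timeout/retry_cadence contract described above can be illustrated with a simplified, stand-alone polling loop. This is a sketch of the logic only, not the actual implementation; poll_job is a hypothetical stand-in for a BigQuery job status check, and a plain TimeoutError replaces the Feast-specific exceptions:

```python
import time
from typing import Callable


def wait_until_done(poll_job: Callable[[], bool],
                    timeout: int = 1800,
                    retry_cadence: int = 10) -> None:
    """Poll poll_job() every retry_cadence seconds until it reports
    completion; raise TimeoutError once timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll_job():  # True once the job has finished
            return
        time.sleep(retry_cadence)
    raise TimeoutError("job still running after timeout")
```

The real function additionally distinguishes a still-running job from a cancelled one and accepts a bigquery.Client, but the wait/retry shape is the same.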
feast.infra.offline_stores.file module¶
- class feast.infra.offline_stores.file.FileOfflineStore[source]¶
Bases:
feast.infra.offline_stores.offline_store.OfflineStore
- static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
- static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have already been mapped to the column names of the source table; the values passed into this function are those source-table column names.
- class feast.infra.offline_stores.file.FileOfflineStoreConfig(*, type: typing_extensions.Literal[file] = 'file')[source]¶
Bases:
feast.repo_config.FeastConfigBaseModel
Offline store config for local (file-based) store
- type: typing_extensions.Literal[file]¶
Offline store type selector
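As with the BigQuery store, this config comes from the offline_store block of feature_store.yaml. A minimal sketch (project name and registry path are placeholders):

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
offline_store:
  type: file  # selects FileOfflineStore
```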
feast.infra.offline_stores.helpers module¶
feast.infra.offline_stores.offline_store module¶
- class feast.infra.offline_stores.offline_store.OfflineStore[source]¶
Bases:
abc.ABC
OfflineStore is an object used for all interaction between Feast and the service used for offline storage of features.
- abstract static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
- abstract static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]¶
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have already been mapped to the column names of the source table; the values passed into this function are those source-table column names.
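The OfflineStore contract follows the standard abc pattern: concrete stores subclass the ABC and implement both abstract static methods. A simplified, self-contained sketch of that pattern (toy names only, not the real feast classes, and with the signatures heavily reduced for illustration):

```python
from abc import ABC, abstractmethod
from typing import List


class ToyOfflineStore(ABC):
    """Simplified stand-in for feast's OfflineStore interface."""

    @staticmethod
    @abstractmethod
    def pull_latest_from_table_or_query(join_key_columns: List[str],
                                        feature_name_columns: List[str]) -> str:
        """Return a query selecting the latest row per join key."""


class ToySqlStore(ToyOfflineStore):
    @staticmethod
    def pull_latest_from_table_or_query(join_key_columns, feature_name_columns):
        # As the note above states, the columns are assumed to already be
        # mapped to source-table column names before this method is called.
        cols = ", ".join(join_key_columns + feature_name_columns)
        return f"SELECT {cols} FROM source_table"
```

Because the methods are abstract, the base class cannot be instantiated; only concrete subclasses such as ToySqlStore are usable.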