feast.infra.offline_stores.contrib.mssql_offline_store package
Subpackages
Submodules
feast.infra.offline_stores.contrib.mssql_offline_store.mssql module
- class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStore[source]
Bases:
feast.infra.offline_stores.offline_store.OfflineStore
Microsoft SQL Server based offline store, supporting Azure Synapse or Azure SQL.
Note: to use this, you'll need the Microsoft ODBC Driver 17 for SQL Server installed. See https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/install-microsoft-odbc-driver-sql-server-macos?view=sql-server-ver15#17
- static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.infra.registry.base_registry.BaseRegistry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Retrieves the point-in-time correct historical feature values for the specified entity rows.
- Parameters
config – The config for the current feature store.
feature_views – A list containing all feature views that are referenced in the entity rows.
feature_refs – The features to be retrieved.
entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe can be provided or a SQL query.
registry – The registry for the current feature store.
project – Feast project to which the feature views belong.
full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).
- Returns
A RetrievalJob that can be executed to get the features.
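The point-in-time join this method performs can be sketched with pandas (an illustrative model of the semantics, not the SQL the store actually generates; the `driver_stats` view name and columns are hypothetical):

```python
import pandas as pd

# Hypothetical feature rows as they might live in a SQL Server table.
features = pd.DataFrame(
    {
        "driver_id": [1001, 1001, 1002],
        "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-02"]),
        "daily_transactions": [10, 30, 20],
    }
)

# Entity rows supplied by the caller (the entity_df argument).
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": pd.to_datetime(["2024-01-02", "2024-01-04"]),
    }
)

# For each entity row, take the latest feature value at or before its timestamp.
joined = pd.merge_asof(
    entity_df.sort_values("event_timestamp"),
    features.sort_values("event_timestamp"),
    on="event_timestamp",
    by="driver_id",
)

# With full_feature_names=True, features are prefixed with the view name.
joined = joined.rename(
    columns={"daily_transactions": "driver_stats__daily_transactions"}
)
```

Each entity row gets the newest feature value that was known at its own timestamp, which is what makes the join leakage-free.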
- static offline_write_batch(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, table: pyarrow.lib.Table, progress: Optional[Callable[[int], Any]])[source]
Writes the specified arrow table to the data source underlying the specified feature view.
- Parameters
config – The config for the current feature store.
feature_view – The feature view whose batch source should be written.
table – The arrow table to write.
progress – Function to be called once a portion of the data has been written, used to show progress.
- static pull_all_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], timestamp_field: str, start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Extracts all the entity rows (i.e. the combination of join key columns, feature columns, and timestamp columns) from the specified data source that lie within the specified time range.
All of the column names should refer to columns that exist in the data source. In particular, any mapping of column names must have already happened.
- Parameters
config – The config for the current feature store.
data_source – The data source from which the entity rows will be extracted.
join_key_columns – The columns of the join keys.
feature_name_columns – The columns of the features.
timestamp_field – The timestamp column.
start_date – The start of the time range.
end_date – The end of the time range.
- Returns
A RetrievalJob that can be executed to get the entity rows.
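The selection this method describes can be sketched in pandas (a conceptual model only, assuming an inclusive time range; the column names are hypothetical):

```python
import pandas as pd

# Hypothetical source rows; "ignored_column" stands in for columns that are
# neither join keys, features, nor the timestamp field.
source = pd.DataFrame(
    {
        "driver_id": [1001, 1001, 1002],
        "event_timestamp": pd.to_datetime(["2024-01-01", "2024-02-15", "2024-01-20"]),
        "trips": [5, 7, 9],
        "ignored_column": ["a", "b", "c"],
    }
)

join_keys, feature_cols, ts = ["driver_id"], ["trips"], "event_timestamp"
start = pd.Timestamp("2024-01-01")
end = pd.Timestamp("2024-01-31")

# Keep every row in range, projected onto the requested columns.
pulled = source.loc[source[ts].between(start, end), join_keys + feature_cols + [ts]]
```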
- static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], timestamp_field: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Extracts the latest entity rows (i.e. the combination of join key columns, feature columns, and timestamp columns) from the specified data source that lie within the specified time range.
All of the column names should refer to columns that exist in the data source. In particular, any mapping of column names must have already happened.
- Parameters
config – The config for the current feature store.
data_source – The data source from which the entity rows will be extracted.
join_key_columns – The columns of the join keys.
feature_name_columns – The columns of the features.
timestamp_field – The timestamp column, used to determine which rows are the most recent.
created_timestamp_column – The column indicating when the row was created, used to break ties.
start_date – The start of the time range.
end_date – The end of the time range.
- Returns
A RetrievalJob that can be executed to get the entity rows.
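The "latest row per join key, ties broken by creation time" semantics can be sketched in pandas (again a model of the behavior, not the store's generated SQL; column names are hypothetical):

```python
import pandas as pd

rows = pd.DataFrame(
    {
        "driver_id": [1001, 1001, 1001, 1002],
        "event_timestamp": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-01"]
        ),
        "created": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01"]
        ),
        "trips": [1, 2, 3, 4],
    }
)

start, end = pd.Timestamp("2024-01-01"), pd.Timestamp("2024-01-31")
in_range = rows[rows["event_timestamp"].between(start, end)]

# Sort by event time, then by creation time to break ties, and keep the
# last (i.e. most recent) row for each join key.
latest = (
    in_range.sort_values(["event_timestamp", "created"])
    .drop_duplicates("driver_id", keep="last")
    .sort_values("driver_id")
)
```

Note how driver 1001 has two rows with the same `event_timestamp`; the `created` column decides which one wins.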
- static write_logged_features(config: feast.repo_config.RepoConfig, data: Union[pyarrow.lib.Table, pathlib.Path], source: feast.feature_logging.LoggingSource, logging_config: feast.feature_logging.LoggingConfig, registry: feast.infra.registry.base_registry.BaseRegistry)[source]
Writes logged features to a specified destination in the offline store.
If the specified destination exists, data will be appended; otherwise, the destination will be created and data will be added. Thus this function can be called repeatedly with the same destination to flush logs in chunks.
- Parameters
config – The config for the current feature store.
data – An arrow table or a path to parquet directory that contains the logs to write.
source – The logging source that provides a schema and some additional metadata.
logging_config – A LoggingConfig object that determines where the logs will be written.
registry – The registry for the current feature store.
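The append-or-create contract described above can be sketched with a plain local file standing in for the offline store destination (a hypothetical simplification; the real method writes arrow data through the store):

```python
import tempfile
from pathlib import Path

def flush_logs(dest: Path, chunk: list[str]) -> None:
    # Append if the destination already exists; otherwise create it.
    mode = "a" if dest.exists() else "w"
    with dest.open(mode) as f:
        f.writelines(line + "\n" for line in chunk)

with tempfile.TemporaryDirectory() as d:
    dest = Path(d) / "logs.txt"
    flush_logs(dest, ["log-1"])  # first call creates the destination
    flush_logs(dest, ["log-2"])  # repeated calls append in chunks
    content = dest.read_text()
```

Because each call is safe against a pre-existing destination, the caller can flush logs incrementally without tracking whether earlier flushes happened.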
- class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStoreConfig(*, type: Literal['mssql'] = 'mssql', connection_string: pydantic.types.StrictStr = 'mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test?driver=ODBC+Driver+17+for+SQL+Server', **extra_data: Any)[source]
Bases:
feast.repo_config.FeastBaseModel
Offline store config for SQL Server
- connection_string: pydantic.types.StrictStr
Connection string containing the host, port, and configuration parameters for SQL Server. Format: a SQLAlchemy connection string, e.g. mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test?driver=ODBC+Driver+17+for+SQL+Server
- type: Literal['mssql']
Offline store type selector
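A minimal feature_store.yaml sketch wiring up this config (the project name and registry path are placeholders; the connection string is the documented default):

```yaml
project: my_project
registry: data/registry.db
provider: local
offline_store:
    type: mssql
    connection_string: mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test?driver=ODBC+Driver+17+for+SQL+Server
```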
- class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerRetrievalJob(query: str, engine: sqlalchemy.engine.base.Engine, config: feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStoreConfig, full_feature_names: bool, on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]], metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata] = None, drop_columns: Optional[List[str]] = None)[source]
Bases:
feast.infra.offline_stores.offline_store.RetrievalJob
- property full_feature_names: bool
Returns True if full feature names should be applied to the results of the query.
- property metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata]
Returns metadata about the retrieval job.
- property on_demand_feature_views: List[feast.on_demand_feature_view.OnDemandFeatureView]
Returns a list containing all the on demand feature views to be handled.
- persist(storage: feast.saved_dataset.SavedDatasetStorage, allow_overwrite: bool = False)[source]
Synchronously executes the underlying query and persists the result in the same offline store at the specified destination.
- Parameters
storage – The saved dataset storage object specifying where the result should be persisted.
allow_overwrite – If True, a pre-existing location (e.g. table or file) can be overwritten. Currently not all individual offline store implementations make use of this parameter.
- supports_remote_storage_export() bool [source]
Returns True if the RetrievalJob supports to_remote_storage.
- to_remote_storage() List[str] [source]
Synchronously executes the underlying query and exports the results to remote storage (e.g. S3 or GCS).
Implementations of this method should export the results as multiple parquet files, each file sized appropriately depending on how much data is being returned by the retrieval job.
- Returns
A list of parquet file paths in remote storage.
- feast.infra.offline_stores.contrib.mssql_offline_store.mssql.make_engine(config: feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStoreConfig) sqlalchemy.engine.base.Engine [source]
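Conceptually, make_engine builds a SQLAlchemy Engine from the configured connection string. The URL components the engine will use can be inspected without opening a connection (assuming SQLAlchemy 1.4+ for `make_url`):

```python
from sqlalchemy.engine import make_url

# Parse the documented default connection string into its components.
url = make_url(
    "mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)
print(url.drivername, url.host, url.port, url.database)
```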
feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source module
- class feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerOptions(connection_str: Optional[str], table_ref: Optional[str])[source]
Bases:
object
Options for a Microsoft SQL Server data source, used to source features from a SQL Server query or table.
- property connection_str
Returns the SQL Server connection string referenced by this source
- classmethod from_proto(sqlserver_options_proto: feast.core.DataSource_pb2.CustomSourceOptions) feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerOptions [source]
Creates an MsSqlServerOptions from a protobuf representation of a SQL Server option
- Parameters
sqlserver_options_proto – A protobuf representation of a DataSource.
- Returns
An MsSqlServerOptions object based on the sqlserver_options protobuf
- property table_ref
Returns the table ref of this SQL Server source
- class feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerSource(name: str, table_ref: Optional[str] = None, event_timestamp_column: Optional[str] = None, created_timestamp_column: Optional[str] = '', field_mapping: Optional[Dict[str, str]] = None, date_partition_column: Optional[str] = '', connection_str: Optional[str] = '', description: Optional[str] = None, tags: Optional[Dict[str, str]] = None, owner: Optional[str] = None)[source]
Bases:
feast.data_source.DataSource
- static from_proto(data_source: feast.core.DataSource_pb2.DataSource)[source]
Converts data source config in protobuf spec to a DataSource class object.
- Parameters
data_source – A protobuf representation of a DataSource.
- Returns
A DataSource class object.
- Raises
ValueError – The type of DataSource could not be identified.
- get_table_column_names_and_types(config: feast.repo_config.RepoConfig) Iterable[Tuple[str, str]] [source]
Returns the list of column names and raw column types.
- Parameters
config – Configuration object used to configure a feature store.
- get_table_query_string() str [source]
Returns a string that can directly be used to reference this table in SQL
- property mssqlserver_options
Returns the SQL Server options of this data source
- static source_datatype_to_feast_value_type() Callable[[str], feast.value_type.ValueType] [source]
Returns the callable method that returns Feast type given the raw column type.
- property table_ref
- to_proto() feast.core.DataSource_pb2.DataSource [source]
Converts this DataSource object to its protobuf representation.
- validate(config: feast.repo_config.RepoConfig)[source]
Validates the underlying data source.
- Parameters
config – Configuration object used to configure a feature store.