feast.infra.offline_stores.contrib.mssql_offline_store package

Subpackages

Submodules

feast.infra.offline_stores.contrib.mssql_offline_store.mssql module

class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStore[source]

Bases: OfflineStore

Microsoft SQL Server based offline store, supporting Azure Synapse or Azure SQL.

Note: to use this offline store, you need the Microsoft ODBC Driver 17 for SQL Server installed. See https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/install-microsoft-odbc-driver-sql-server-macos?view=sql-server-ver15#17

static get_historical_features(config: RepoConfig, feature_views: List[FeatureView], feature_refs: List[str], entity_df: DataFrame | str, registry: BaseRegistry, project: str, full_feature_names: bool = False) RetrievalJob[source]

Retrieves the point-in-time correct historical feature values for the specified entity rows.

Parameters:
  • config – The config for the current feature store.

  • feature_views – A list containing all feature views that are referenced in the entity rows.

  • feature_refs – The features to be retrieved.

  • entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Can be provided either as a pandas DataFrame or as a SQL query.

  • registry – The registry for the current feature store.

  • project – Feast project to which the feature views belong.

  • full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).

Returns:

A RetrievalJob that can be executed to get the features.
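The point-in-time semantics and the full_feature_names naming convention can be sketched in plain Python. This is an illustrative model only, not the SQL that the offline store generates. The helper point_in_time_join, the row layout, and the sample values are assumptions made for this example. For each entity row, it picks the most recent feature row whose timestamp is at or before the entity timestamp.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def point_in_time_join(
    entity_rows: List[Dict[str, Any]],
    feature_rows: List[Dict[str, Any]],
    join_key: str,
    feature: str,
    feature_view: str,
    full_feature_names: bool = False,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: attach the latest feature value known at each entity timestamp."""
    # Output column follows the "feature" vs "feature_view__feature" convention.
    column = f"{feature_view}__{feature}" if full_feature_names else feature
    result = []
    for entity in entity_rows:
        best: Optional[Dict[str, Any]] = None
        for row in feature_rows:
            same_entity = row[join_key] == entity[join_key]
            # Never look into the future relative to the entity row.
            not_in_future = row["event_timestamp"] <= entity["event_timestamp"]
            if same_entity and not_in_future:
                if best is None or row["event_timestamp"] > best["event_timestamp"]:
                    best = row
        joined = dict(entity)
        joined[column] = best[feature] if best is not None else None
        result.append(joined)
    return result


# Assumed sample data: two entity rows for the same customer at different times.
entity_rows = [
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 10)},
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 1)},
]
feature_rows = [
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 5), "daily_transactions": 3},
    {"customer_id": 1, "event_timestamp": datetime(2024, 1, 9), "daily_transactions": 7},
]

joined = point_in_time_join(
    entity_rows,
    feature_rows,
    join_key="customer_id",
    feature="daily_transactions",
    feature_view="customer_fv",
    full_feature_names=True,
)
```

In this sketch the entity row dated 2024-01-10 receives the value recorded on 2024-01-09, while the entity row dated 2024-01-01 receives None because no feature row existed yet.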

static offline_write_batch(config: RepoConfig, feature_view: FeatureView, table: Table, progress: Callable[[int], Any] | None)[source]

Writes the specified arrow table to the data source underlying the specified feature view.

Parameters:
  • config – The config for the current feature store.

  • feature_view – The feature view whose batch source should be written.

  • table – The arrow table to write.

  • progress – Function to be called once a portion of the data has been written, used to show progress.
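The role of the progress callback can be illustrated with a small stand-alone sketch. The write_in_chunks function below is hypothetical and does not touch a database. It also assumes the callback receives the number of rows in each written portion, which the description above does not state explicitly.

```python
from typing import Any, Callable, List, Optional


def write_in_chunks(
    rows: List[dict],
    chunk_size: int,
    progress: Optional[Callable[[int], Any]] = None,
) -> int:
    """Hypothetical writer: reports the size of each chunk after it is written."""
    written = 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        # A real implementation would insert `chunk` into the backing table here.
        written += len(chunk)
        if progress is not None:
            progress(len(chunk))
    return written


# Collect the reported chunk sizes; a progress bar's update method would fit here too.
reported: List[int] = []
total = write_in_chunks(
    [{"id": i} for i in range(5)], chunk_size=2, progress=reported.append
)
```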

static pull_all_from_table_or_query(config: RepoConfig, data_source: DataSource, join_key_columns: List[str], feature_name_columns: List[str], timestamp_field: str, start_date: datetime, end_date: datetime) RetrievalJob[source]

Extracts all the entity rows (i.e. the combination of join key columns, feature columns, and timestamp columns) from the specified data source that lie within the specified time range.

All of the column names should refer to columns that exist in the data source. In particular, any mapping of column names must have already happened.

Parameters:
  • config – The config for the current feature store.

  • data_source – The data source from which the entity rows will be extracted.

  • join_key_columns – The columns of the join keys.

  • feature_name_columns – The columns of the features.

  • timestamp_field – The timestamp column.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

Returns:

A RetrievalJob that can be executed to get the entity rows.

static pull_latest_from_table_or_query(config: RepoConfig, data_source: DataSource, join_key_columns: List[str], feature_name_columns: List[str], timestamp_field: str, created_timestamp_column: str | None, start_date: datetime, end_date: datetime) RetrievalJob[source]

Extracts the latest entity rows (i.e. the combination of join key columns, feature columns, and timestamp columns) from the specified data source that lie within the specified time range.

All of the column names should refer to columns that exist in the data source. In particular, any mapping of column names must have already happened.

Parameters:
  • config – The config for the current feature store.

  • data_source – The data source from which the entity rows will be extracted.

  • join_key_columns – The columns of the join keys.

  • feature_name_columns – The columns of the features.

  • timestamp_field – The timestamp column, used to determine which rows are the most recent.

  • created_timestamp_column – The column indicating when the row was created, used to break ties.

  • start_date – The start of the time range.

  • end_date – The end of the time range.

Returns:

A RetrievalJob that can be executed to get the entity rows.
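The selection rule described above (latest row per join key within the time range, with the created timestamp breaking ties) can be modelled in plain Python. The latest_rows helper, the column names, and the inclusive range boundaries are assumptions for illustration; the offline store itself expresses this logic as a SQL query.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def latest_rows(
    rows: List[Dict[str, Any]],
    join_key_columns: List[str],
    timestamp_field: str,
    created_timestamp_column: str,
    start_date: datetime,
    end_date: datetime,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep the most recent row per join key within the range."""
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for row in rows:
        # Assumption: both range boundaries are inclusive.
        if not (start_date <= row[timestamp_field] <= end_date):
            continue
        key = tuple(row[c] for c in join_key_columns)
        # Compare by event timestamp first, then by creation timestamp to break ties.
        rank = (row[timestamp_field], row[created_timestamp_column])
        current = latest.get(key)
        if current is None or rank > (
            current[timestamp_field],
            current[created_timestamp_column],
        ):
            latest[key] = row
    return list(latest.values())


rows = [
    {"driver_id": 1, "event_ts": datetime(2024, 1, 1), "created_ts": datetime(2024, 1, 1), "trips": 10},
    {"driver_id": 1, "event_ts": datetime(2024, 1, 2), "created_ts": datetime(2024, 1, 2), "trips": 11},
    # Same event timestamp as the previous row, but created later, so it wins the tie.
    {"driver_id": 1, "event_ts": datetime(2024, 1, 2), "created_ts": datetime(2024, 1, 3), "trips": 12},
    # Outside the requested range, so it is ignored.
    {"driver_id": 2, "event_ts": datetime(2024, 2, 1), "created_ts": datetime(2024, 2, 1), "trips": 5},
]

result = latest_rows(
    rows,
    join_key_columns=["driver_id"],
    timestamp_field="event_ts",
    created_timestamp_column="created_ts",
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31),
)
```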

static write_logged_features(config: RepoConfig, data: Table | Path, source: LoggingSource, logging_config: LoggingConfig, registry: BaseRegistry)[source]

Writes logged features to a specified destination in the offline store.

If the specified destination exists, data will be appended; otherwise, the destination will be created and data will be added. Thus this function can be called repeatedly with the same destination to flush logs in chunks.

Parameters:
  • config – The config for the current feature store.

  • data – An arrow table or a path to parquet directory that contains the logs to write.

  • source – The logging source that provides a schema and some additional metadata.

  • logging_config – A LoggingConfig object that determines where the logs will be written.

  • registry – The registry for the current feature store.
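The append-or-create behaviour can be pictured with an in-memory stand-in. The destinations dictionary and the write_logs function are hypothetical and only model the semantics; they are not part of Feast.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for offline store destinations (name -> rows).
destinations: Dict[str, List[dict]] = {}


def write_logs(destination: str, chunk: List[dict]) -> None:
    """Create the destination on first use, then append on later calls."""
    destinations.setdefault(destination, []).extend(chunk)


# Repeated calls with the same destination flush the logs in chunks.
write_logs("feature_logs", [{"request_id": "a"}])
write_logs("feature_logs", [{"request_id": "b"}, {"request_id": "c"}])
```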

class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerOfflineStoreConfig(*, type: typing_extensions.Literal[mssql] = 'mssql', connection_string: StrictStr = 'mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test?driver=ODBC+Driver+17+for+SQL+Server', **extra_data: Any)[source]

Bases: FeastBaseModel

Offline store config for SQL Server.

connection_string: StrictStr

Connection string containing the host, port, and configuration parameters for SQL Server. Format: a SQLAlchemy connection string, e.g. mssql+pyodbc://sa:yourStrong(!)Password@localhost:1433/feast_test?driver=ODBC+Driver+17+for+SQL+Server
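A connection string of this shape can be assembled and inspected with the standard library alone. The sketch below is a hedged illustration: build_connection_string is a hypothetical helper, and percent-encoding the password is a precaution commonly recommended for SQLAlchemy URLs when credentials contain special characters such as "@" or "/". The values mirror the default shown above.

```python
from urllib.parse import parse_qs, quote_plus, unquote, urlsplit


def build_connection_string(
    user: str, password: str, host: str, port: int, database: str, driver: str
) -> str:
    """Hypothetical helper: build a mssql+pyodbc URL with encoded credentials."""
    # quote_plus percent-encodes characters such as "(", "!" and ")",
    # and turns spaces into "+".
    return (
        f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?driver={quote_plus(driver)}"
    )


conn = build_connection_string(
    user="sa",
    password="yourStrong(!)Password",
    host="localhost",
    port=1433,
    database="feast_test",
    driver="ODBC Driver 17 for SQL Server",
)

# Split the URL back into its parts to check what was configured.
parts = urlsplit(conn)
decoded_password = unquote(parts.password)
driver_name = parse_qs(parts.query)["driver"][0]
```

Parsing the string back, as in the last lines, is a quick way to confirm the host, port, and driver before handing the value to the offline store config.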

type: typing_extensions.Literal[mssql]

Offline store type selector.

class feast.infra.offline_stores.contrib.mssql_offline_store.mssql.MsSqlServerRetrievalJob(query: str, engine: Engine, config: MsSqlServerOfflineStoreConfig, full_feature_names: bool, on_demand_feature_views: List[OnDemandFeatureView] | None = None, metadata: RetrievalMetadata | None = None, drop_columns: List[str] | None = None)[source]

Bases: RetrievalJob

property full_feature_names: bool

Returns True if full feature names should be applied to the results of the query.

property metadata: RetrievalMetadata | None

Returns metadata about the retrieval job.

property on_demand_feature_views: List[OnDemandFeatureView]

Returns a list containing all the on demand feature views to be handled.

persist(storage: SavedDatasetStorage, allow_overwrite: bool | None = False, timeout: int | None = None)[source]

Synchronously executes the underlying query and persists the result in the same offline store at the specified destination.

Parameters:
  • storage – The saved dataset storage object specifying where the result should be persisted.

  • allow_overwrite – If True, a pre-existing location (e.g. table or file) can be overwritten. Currently not all individual offline store implementations make use of this parameter.

supports_remote_storage_export() bool[source]

Returns True if the RetrievalJob supports to_remote_storage.

to_remote_storage() List[str][source]

Synchronously executes the underlying query and exports the results to remote storage (e.g. S3 or GCS).

Implementations of this method should export the results as multiple parquet files, each file sized appropriately depending on how much data is being returned by the retrieval job.

Returns:

A list of parquet file paths in remote storage.
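Because not every retrieval job supports remote export, callers may want to check supports_remote_storage_export() before calling to_remote_storage(). The sketch below shows that guard with a stub class. StubJob, export_if_supported, and the example path are hypothetical and say nothing about whether the SQL Server job itself supports export.

```python
from typing import List


class StubJob:
    """Hypothetical stand-in for a retrieval job; not the Feast class."""

    def __init__(self, supports_export: bool) -> None:
        self._supports_export = supports_export

    def supports_remote_storage_export(self) -> bool:
        return self._supports_export

    def to_remote_storage(self) -> List[str]:
        if not self._supports_export:
            raise NotImplementedError("remote export is not available")
        # Made-up path, used only to give the stub something to return.
        return ["s3://example-bucket/part-0.parquet"]


def export_if_supported(job: StubJob) -> List[str]:
    """Return exported file paths, or an empty list when export is unsupported."""
    if job.supports_remote_storage_export():
        return job.to_remote_storage()
    return []


exported = export_if_supported(StubJob(supports_export=True))
skipped = export_if_supported(StubJob(supports_export=False))
```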

feast.infra.offline_stores.contrib.mssql_offline_store.mssql.make_engine(config: MsSqlServerOfflineStoreConfig) Engine[source]

feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source module

class feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerOptions(connection_str: str | None, table_ref: str | None)[source]

Bases: object

DataSource options for SQL Server, used to source features from a SQL Server query.

property connection_str

Returns the SQL Server connection string referenced by this source.

classmethod from_proto(sqlserver_options_proto: CustomSourceOptions) MsSqlServerOptions[source]

Creates an MsSqlServerOptions object from a protobuf representation of SQL Server options.

Parameters:

sqlserver_options_proto – A protobuf representation of a DataSource.

Returns:

An MsSqlServerOptions object based on the sqlserver_options protobuf.

property table_ref

Returns the table reference of this SQL Server source.

to_proto() CustomSourceOptions[source]

Converts an MsSqlServerOptions object to a protobuf representation.

Returns:

A CustomSourceOptions protobuf.

class feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerSource(name: str, table_ref: str | None = None, event_timestamp_column: str | None = None, created_timestamp_column: str | None = '', field_mapping: Dict[str, str] | None = None, date_partition_column: str | None = '', connection_str: str | None = '', description: str | None = None, tags: Dict[str, str] | None = None, owner: str | None = None)[source]

Bases: DataSource

created_timestamp_column: str
date_partition_column: str
description: str
field_mapping: Dict[str, str]

static from_proto(data_source: DataSource)[source]

Converts data source config in protobuf spec to a DataSource class object.

Parameters:

data_source – A protobuf representation of a DataSource.

Returns:

A DataSource class object.

Raises:

ValueError – The type of DataSource could not be identified.

get_table_column_names_and_types(config: RepoConfig) Iterable[Tuple[str, str]][source]

Returns the list of column names and raw column types.

Parameters:

config – Configuration object used to configure a feature store.

get_table_query_string() str[source]

Returns a string that can be used directly to reference this table in SQL.

property mssqlserver_options

Returns the SQL Server options of this data source.

name: str
owner: str

static source_datatype_to_feast_value_type() Callable[[str], ValueType][source]

Returns the callable method that returns Feast type given the raw column type.
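The pattern of returning a callable that maps a raw column type to a value type can be sketched as follows. Everything here is a stand-in: ValueKind imitates the role of ValueType, and the entries in the mapping are assumptions chosen for illustration, not the mapping Feast applies to SQL Server types.

```python
from enum import Enum
from typing import Callable, Dict


class ValueKind(Enum):
    """Hypothetical stand-in for Feast's ValueType."""

    UNKNOWN = 0
    INT64 = 1
    DOUBLE = 2
    STRING = 3
    BOOL = 4


# Assumed mapping from raw column types to value kinds (illustrative only).
_TYPE_MAP: Dict[str, ValueKind] = {
    "bigint": ValueKind.INT64,
    "float": ValueKind.DOUBLE,
    "nvarchar": ValueKind.STRING,
    "bit": ValueKind.BOOL,
}


def source_datatype_to_value_kind() -> Callable[[str], ValueKind]:
    """Return a converter from a raw column type string to a ValueKind."""

    def convert(raw_type: str) -> ValueKind:
        # Unrecognized raw types fall back to UNKNOWN instead of raising.
        return _TYPE_MAP.get(raw_type.lower(), ValueKind.UNKNOWN)

    return convert


convert = source_datatype_to_value_kind()
kinds = [convert(t) for t in ("BIGINT", "nvarchar", "geography")]
```

Returning a callable, instead of converting a single value, lets the caller apply the same converter to every (column name, raw type) pair produced by a schema lookup.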

property table_ref

tags: Dict[str, str]
timestamp_field: str

to_proto() DataSource[source]

Converts this data source object to its protobuf representation.

validate(config: RepoConfig)[source]

Validates the underlying data source.

Parameters:

config – Configuration object used to configure a feature store.

Module contents