Skip to main content
Version: Next

SQL Queries

Testing

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[sql-queries]'

Config Details

Note that a . is used to denote nested fields in the YAML recipe.

FieldDescription
platform 
string
The platform for which to generate data, e.g. snowflake
query_file 
string
Path to file to ingest
default_db
string
The default database to use for unqualified table names
default_schema
string
The default schema to use for unqualified table names
platform_instance
string
The instance of the platform that all assets produced by this recipe belong to
env
string
The environment that all assets produced by this connector belong to
Default: PROD
usage
BaseUsageConfig
The usage config to use when generating usage statistics
Default: {'bucket_duration': 'DAY', 'end_time': '2023-10-05...
usage.bucket_duration
Enum
Size of the time window to aggregate usage stats.
Default: DAY
usage.end_time
string(date-time)
Latest date of lineage/usage to consider. Default: Current time in UTC
usage.format_sql_queries
boolean
Whether to format sql queries
Default: False
usage.include_operational_stats
boolean
Whether to display operational stats.
Default: True
usage.include_read_operational_stats
boolean
Whether to report read operational stats. Experimental.
Default: False
usage.include_top_n_queries
boolean
Whether to ingest the top_n_queries.
Default: True
usage.start_time
string(date-time)
Earliest date of lineage/usage to consider. Default: Last full day in UTC (or hour, depending on bucket_duration). You can also specify relative time with respect to end_time such as '-7 days' Or '-7d'.
usage.top_n_queries
integer
Number of top queries to save to each table.
Default: 10
usage.user_email_pattern
AllowDenyPattern
regex patterns for user emails to filter in usage.
Default: {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
usage.user_email_pattern.allow
array(string)
usage.user_email_pattern.deny
array(string)
usage.user_email_pattern.ignoreCase
boolean
Whether to ignore case sensitivity during pattern matching.
Default: True

Code Coordinates

  • Class Name: datahub.ingestion.source.sql_queries.SqlQueriesSource
  • Browse on GitHub

Questions

If you've got any questions on configuring ingestion for SQL Queries, feel free to ping us on our Slack.