How to use Datometry's Table Override Feature

Overview

In scenarios where precise control over table creation options in Synapse is needed, you can override the default Hyper-Q behavior at an individual statement level using a comment hint:

-- @dtm_synapse_table_options_override=...

The hint can reference any of the valid options in the <table_option> section of the CREATE TABLE statement on Synapse Analytics after the = sign. Hyper-Q adds these options as-is into the table option section of the CREATE TABLE statement passed to Synapse.

 

Hints are preferable as it inlines the changes directly into the Teradata DDL and removes potential human errors of manually maintaining DDL through Hyper-Q and within Synapse.

Specifying the distribution column

The hint can be used to pick a specific column among the multiple columns in the primary index in the Teradata definition as the HASH distribution column in Synapse, as follows:

CREATE TABLE example_hash_override( 
-- @dtm_synapse_table_options_override=DISTRIBUTION = HASH(col2)
col1 INTEGER,
col2 INTEGER,
col3 INTEGER,
col4 INTEGER )
UNIQUE PRIMARY INDEX (col1, col2, col3);

In this example, of the three columns designated as the primary index in the Teradata source statement, col2 is chosen as the distribution (HASH) column in the resulting table in Synapse.

 

Specifying REPLICATE distribution

The following example illustrates how to select ‘replicate' as the table distribution style at an individual statement level:

CREATE TABLE example_replicate(
-- @dtm_synapse_table_options_override=DISTRIBUTION = REPLICATE
col1 INTEGER,
col2 INTEGER,
col3 INTEGER,
col4 INTEGER
) UNIQUE PRIMARY INDEX (col1, col2, col3);

Support for multiple lines

If necessary, the table options can be provided in a block comment spanning multiple lines, as in this example:

create table example_multiline(
/*
@dtm_synapse_table_options_override=HEAP,
DISTRIBUTION = HASH ( col2 ),
PARTITION(col3 RANGE RIGHT FOR VALUES(1))
*/
col1 INTEGER,
col2 INTEGER,
col3 INTEGER,
col4 INTEGER
);

Please ensure that no other comments are included in this comment block.

Error Reporting

Since the table options in the hint are passed as-is to Synapse, be prepared to handle errors returned by Synapse in the same manner as for a native Synapse CREATE TABLE statement with these options.

Usage Notes

Some clients may strip comments before the statement body. Therefore, it is recommended to include the comment with a hint inside the statement body as in the examples above, to ensure that is passed through to Hyper-Q.

Full list of supported table options

Please refer to the <table_option> section below for a full list of options that can be passed to the hint, based on the Microsoft documentation for CREATE TABLE. As with the native CREATE TABLE syntax, the options may be combined and separated by a comma. Full option list (refer to Microsoft documentation for the latest version):

-- Create a new table.
CREATE TABLE { database_name.schema_name.table_name | schema_name.table_name | table_name }
(
{ column_name <data_type> [ <column_options> ] } [ ,...n ]
)
[ WITH ( <table_option> [ ,...n ] ) ]
[;]

...

<table_option> ::=
{
CLUSTERED COLUMNSTORE INDEX --default for Azure Synapse Analytics
| CLUSTERED COLUMNSTORE INDEX ORDER (column [,...n])
| HEAP --default for Parallel Data Warehouse
| CLUSTERED INDEX ( { index_column_name [ ASC | DESC ] } [ ,...n ] ) -- default is ASC
}
{
DISTRIBUTION = HASH ( distribution_column_name ) 
| DISTRIBUTION = ROUND_ROBIN -- default for Azure Synapse Analytics 
| DISTRIBUTION = REPLICATE -- default for Parallel Data Warehouse
}  
| PARTITION ( partition_column_name RANGE [ LEFT | RIGHT ] -- default is LEFT 24 FOR VALUES ( [ boundary_value [,...n] ] ) )

 

 

 

 

 

Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.