Tracer Log Reference

Introduction

Each tracer log line corresponds to a “tracer event” that is published during execution of a query. Various different components within Hyper-Q publish different events, but every event published will always correspond to a particular client request within an established client session. Events may also correspond to a particular backend request, if applicable. Additionally, events may also correspond to a given LSN if their session is part of a bulk job, though for normal sessions the LSN field will show a default value of "0".

Every tracer event also includes both a “tag category” and a “tag name”. The category is a rough indicator of the type of event, and includes labels like “Preprocessing”, “Emulation”, and “Results” (a complete list of categories is included below for reference). The category is mainly useful as a filtering mechanism, in case you’re only interested in a particular stage of query processing, though it may also be useful as a rough indicator of when different stages start and end. The tag name is a much more specific label, and indicates more precisely what that particular event is for; thus, it is useful for getting a more fine-grained window into the handling of a given query. Events may also include a “tag value”, which contains additional info about that event. For example, the "Start Client Query” event includes the actual text of the client query as the tag value for that event.

Most of this document discusses the current stable version of the tracer log, known as the "default tracer" log, but there is a newer version of the log that is available for preview via a special configuration flag. This new version is known as the "beta tracer", and will be detailed later at the end of this article.

Log File Naming and Rotation

All tracer logs are rotated when Hyper-Q starts up. This means that there will always be a fresh new log created for any given instance of the Hyper-Q server process; you will never see log messages from two different instances of the server process in the same log file. The logs will also be rotated at midnight (as determined based on the local time zone of the server). At some point in the future we may add support for the use of external log rotation tools as well, though for now the Hyper-Q rotation behavior is automatic and mostly non-configurable.

The precise file naming for the tracer log varies depends on whether the legacy logger is enabled or not (see the "enable_legacy_logger" option below for more details). If the legacy logger is enabled, then every log file name includes the timestamp from what that file was first opened, and rotation does not actually rename any log files, but rather simply closes the current log file and opens a new one. If the legacy logger is disabled, then the tracer log file name contains no timestamp when it is opened (i.e. it is just named “default_tracer.csv”), and the file is renamed to add a timestamp when that file is rotated, with the timestamp being based on the time of rotation.

File names with timestamps are formatted as “base_log_name-YYYY-MM-DDTHH-MM-SS.extension”. Example: "default_tracer-2021-03-25T23-59-53.csv".

Format and Columns

The default tracer log is formatted as CSV, and always has the same columns in every log file, which are listed below. If the legacy logger is enabled, a column header line will be printed at the top of each log file; if the legacy logger is disabled, the header line will be omitted. Regardless of the presence or absence of a column header, the log file will always be formatted as valid CSV. Keep in mind though that some events may span multiple lines, which can happen if a multiline string is included in the tag_value field; while this is considered valid CSV as per RFC 4180, some tools which are otherwise able to parse CSV files may have problems with this.

The CSV columns are as follows:

▪ “Time”: the timestamp of when the event occurred, using the local server timezone and formatted with millisecond-precision timestamps as “YYYY-MM-DD HH:MM:SS.mmm”
▪ “Session Id”: the integer session identifier for the session that an event belong to
▪ “Client Request Id”: a UUID string identifier for the current client request being handled
▪ “Backend Request Id”: a UUID string identifier for the current backend request being executed, if applicable
▪ “LSN”: the “logon sequence number” integer ID, generated for bulk jobs that utilize multiple parallel sessions, and usable to tie together different sessions with distinct session IDs that are part of the same job
▪ “Tag Category”: the tracer event category, formatted as both a numeric code prefixed with “DLC” and a user-readable string. Ex: "DLC002--Preprocessing"
▪ “Tag Name”: the tracer event name, formatted as both a numeric code prefix with “DLE” and a user-readable string. Ex: "DLE0003--Start Client Query"
▪ “Tag Value”: the tracer tag value (if one is included in the event). This value is free-form, and may span multiple lines, with the meaning of the value dependent on the particular event.

Configuration Options

There are a few different options that can be specified in dtm.ini which affect the behavior of the tracer logs. All of these settings are optional, and they are all specified under the [general_settings] header. They are as follows:

log_dir

Specifies the directory that the log files are written to. Note that this also affects other log files, such as the error log and system_info log. (There is not currently a way to specify different directories for different types of logs.) Also note that if the -logdir flag is specified on the command line at startup, then the CLI flag will override this setting.

default_tracer_verbosity_level

Specifies how verbose the tracer logs should be, with lower settings causing some events to be suppressed from the logs, and higher settings resulting in more events being written. The specific verbosity value for each event is listed in a later section. If the verbosity level of an event is less than or equal to the configured value of default_tracer_verbosity_level, then it will be logged; otherwise, it will be suppressed and will not appear in the logs.

enable_legacy_logger

If set to “true”, log naming, formatting, and rotation will follow the same behavior as in past versions of Hyper-Q. If set to “false”, some of those legacy behaviors will change, as described elsewhere in this document. In addition to these small behavior differences, setting this to “false” will enable a log overload protection system to help maintain system stability in situations where very high load causes large amounts of log messages to be written at once. In these types of situations, the overload protection system may drop messages if it becomes necessary to maintain system stability; if this happens, notice of the dropped messages will be written separately to the error log.

enable_beta_tracer

If set to “true”, the beta tracer log will be written alongside the default tracer log. If set to “false”, the log will not be written. More information about this log is contained at the end of this document.

Categories

Category Code Category Name
DLC002 Pre-processing
DLC003 Execution
DLC004 Results
DLC005 Metadata
DLC006 Logic Processing
DLC007 Fast Export
DLC008 Feature Extraction
DLC009 Emulation
DLC010 Debug

Events

Event Code Event Name Verbosity
DLE0003 Start Client Query 2
DLE0055 Start XC Request Handling 5
DLE0004 Start Binding 5
DLE0005 Start Transformation 5
DLE0006 Start Serialization 5
DLE0007 Start Data Conversion 2
DLE0008 Query Sent to Backend 2
DLE0009 Query Completed 2
DLE0010 Error 1
DLE0011 Start Catalog Lookup 4
DLE0012 Catalog Lookup Completed 4
DLE0013 Start MD-Store Lookup 4
DLE0014 MD-Store Lookup Complete 4
DLE0019 Start Transaction Abort 2
DLE0021 Start Parameterized Query 5
DLE0023 Initialize MD-Store Update 4
DLE0024 MD-Store Update Complete 4
DLE0025 Intermediate Query Response Received 3
DLE0026 Fetch Chunk Complete 5
DLE0027 Start TDF Cursor Query 4
DLE0028 TDF Cursor Query Complete 4
DLE0029 Query Cancelled 4
DLE0030 Row Count 4
DLE0031 Bulk Load Session 1
DLE0032 Start MD-Store Update 4
DLE0033 Client Query Hash 2
DLE0034 CTAS with UPI 4
DLE0035 DML with Quantified Subquery 4
DLE0036 Emulate Sequence 4
DLE0037 Enforce SET Table INSERT 4
DLE0038 UPI INSERT 4
DLE0039 UPI UPDATE 4
DLE0040 UPDATE with Join 4
DLE0041 UPDATE Temp Table in Xact 4
DLE0042 SET Table INSERT 4
DLE0043 SET Table INSERT Values 4
DLE0044 MERGE 4
DLE0045 SET Table UPDATE 4
DLE0046 MLOAD Uniqueness Check 4
DLE0048 Materialize GTT 4
DLE0049 Eliminate Nested Case Expressions 4
DLE0050 Quantified Subquery With Long Const List 4
DLE0051 Start Fetch Chunk 5
DLE0053 DEBUG 4
DLE0054 Client Query Config 2
DLE0060 Query Type Select 2
DLE0061 Query Type DDL 2
DLE0062 Query Type DML 2
DLE0063 Query Type Other 2

Beta Tracer

A new version of the tracer log is in development, which is being referred to as the “beta tracer”. The plan is for this log to eventually replace the default tracer log that exists now. For the time being however, the beta tracer is subject to change as we continue to iterate and improve it.

Content and Formatting

The biggest change in the beta tracer is that each line is formatted as a JSON object instead of a CSV row. This allows for better compatibility with many popular tools for log storage and analysis, and also gives us more flexibility in terms of adding custom metadata to tracer events; in a CSV file, adding a new field requires adding a column that is included in every row, but with JSON we can pick and choose which fields are included in different events.

Note that for succinctness, the beta tracer uses shorter event category and name strings. Instead of using the DL codes, it has a simple tag string for each category and event, which typically consists of a few words separated by underscores. For example, the current default tracer might show the string "DLE0013--Start MD-Store Lookup" while the beta tracer would just show "start_mdstore_lookup”. The beta tracer also includes some additional fields to assist in debugging, such as the location in the source code where the event was published. Timestamps are also formatted using the RFC3339 standard, and include greater precision than the time values in the existing default tracer log.

Beta Tracer Log File Naming and Rotation

The file naming for the beta tracer is unaffected by the legacy logger config, as it always uses the new logging code, and thus always behaves the same way as the default tracer with the legacy logger disabled. That is to say, a newly opened beta tracer log is named “beta_tracer.json”, and a timestamp is added on rotation.

Spans

Another important addition to the beta tracer is the inclusion of “span” info. While events in the default tracer log are often useful to see when specific actions began, it can be more difficult to see when things end, and therefore hard to determine how long different operations take. Spans are used to fill this informational gap. Any event may indicate that it’s starting a span by including the “new_span_id” field in the event. Later, when the given span ends, a special “end_span” event will be published with the “span_id” field set to the ID of the span that is ending. Any intermediate events published mid-span by the same internal process that started the span will also be tagged with a “span_id” field to indicate that they are happening within that particular span. Spans can also be nested inside of one another, in which case they will be closed in LIFO order.

Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.