Each tracer log line corresponds to a “tracer event” that is published during execution of a query. Various different components within Hyper-Q publish different events, but every event published will always correspond to a particular client request within an established client session. Events may also correspond to a particular backend request, if applicable. Additionally, events may also correspond to a given LSN if their session is part of a bulk job, though for normal sessions the LSN field will show a default value of "0".
Every tracer event also includes both a “tag category” and a “tag name”. The category is a rough indicator of the type of event, and includes labels like “Preprocessing”, “Emulation”, and “Results” (a complete list of categories is included below for reference). The category is mainly useful as a filtering mechanism, in case you’re only interested in a particular stage of query processing, though it may also be useful as a rough indicator of when different stages start and end. The tag name is a much more specific label, and indicates more precisely what that particular event is for; thus, it is useful for getting a more fine-grained window into the handling of a given query. Events may also include a “tag value”, which contains additional info about that event. For example, the "Start Client Query” event includes the actual text of the client query as the tag value for that event.
Most of this document discusses the current stable version of the tracer log, known as the "default tracer" log, but there is a newer version of the log that is available for preview via a special configuration flag. This new version is known as the "beta tracer", and will be detailed later at the end of this article.
Log File Naming and Rotation
All tracer logs are rotated when Hyper-Q starts up. This means that there will always be a fresh new log created for any given instance of the Hyper-Q server process; you will never see log messages from two different instances of the server process in the same log file. The logs will also be rotated at midnight (as determined based on the local time zone of the server). At some point in the future we may add support for the use of external log rotation tools as well, though for now the Hyper-Q rotation behavior is automatic and mostly non-configurable.
The precise file naming for the tracer log varies depends on whether the legacy logger is enabled or not (see the "enable_legacy_logger" option below for more details). If the legacy logger is enabled, then every log file name includes the timestamp from what that file was first opened, and rotation does not actually rename any log files, but rather simply closes the current log file and opens a new one. If the legacy logger is disabled, then the tracer log file name contains no timestamp when it is opened (i.e. it is just named “default_tracer.csv”), and the file is renamed to add a timestamp when that file is rotated, with the timestamp being based on the time of rotation.
File names with timestamps are formatted as “base_log_name-YYYY-MM-DDTHH-MM-SS.extension”. Example: "default_tracer-2021-03-25T23-59-53.csv".
Format and Columns
The default tracer log is formatted as CSV, and always has the same columns in every log file, which are listed below. If the legacy logger is enabled, a column header line will be printed at the top of each log file; if the legacy logger is disabled, the header line will be omitted. Regardless of the presence or absence of a column header, the log file will always be formatted as valid CSV. Keep in mind though that some events may span multiple lines, which can happen if a multiline string is included in the tag_value field; while this is considered valid CSV as per RFC 4180, some tools which are otherwise able to parse CSV files may have problems with this.
The CSV columns are as follows:
▪ “Time”: the timestamp of when the event occurred, using the local server timezone and formatted with millisecond-precision timestamps as “YYYY-MM-DD HH:MM:SS.mmm”
▪ “Session Id”: the integer session identifier for the session that an event belong to
▪ “Client Request Id”: a UUID string identifier for the current client request being handled
▪ “Backend Request Id”: a UUID string identifier for the current backend request being executed, if applicable
▪ “LSN”: the “logon sequence number” integer ID, generated for bulk jobs that utilize multiple parallel sessions, and usable to tie together different sessions with distinct session IDs that are part of the same job
▪ “Tag Category”: the tracer event category, formatted as both a numeric code prefixed with “DLC” and a user-readable string. Ex: "DLC002--Preprocessing"
▪ “Tag Name”: the tracer event name, formatted as both a numeric code prefix with “DLE” and a user-readable string. Ex: "DLE0003--Start Client Query"
▪ “Tag Value”: the tracer tag value (if one is included in the event). This value is free-form, and may span multiple lines, with the meaning of the value dependent on the particular event.
There are a few different options that can be specified in dtm.ini which affect the behavior of the tracer logs. All of these settings are optional, and they are all specified under the [general_settings] header. They are as follows:
Specifies the directory that the log files are written to. Note that this also affects other log files, such as the error log and system_info log. (There is not currently a way to specify different directories for different types of logs.) Also note that if the -logdir flag is specified on the command line at startup, then the CLI flag will override this setting.
Specifies how verbose the tracer logs should be, with lower settings causing some events to be suppressed from the logs, and higher settings resulting in more events being written. The specific verbosity value for each event is listed in a later section. If the verbosity level of an event is less than or equal to the configured value of default_tracer_verbosity_level, then it will be logged; otherwise, it will be suppressed and will not appear in the logs.
If set to “true”, log naming, formatting, and rotation will follow the same behavior as in past versions of Hyper-Q. If set to “false”, some of those legacy behaviors will change, as described elsewhere in this document. In addition to these small behavior differences, setting this to “false” will enable a log overload protection system to help maintain system stability in situations where very high load causes large amounts of log messages to be written at once. In these types of situations, the overload protection system may drop messages if it becomes necessary to maintain system stability; if this happens, notice of the dropped messages will be written separately to the error log.
If set to “true”, the beta tracer log will be written alongside the default tracer log. If set to “false”, the log will not be written. More information about this log is contained at the end of this document.
|Category Code||Category Name|
|Event Code||Event Name||Verbosity|
|DLE0003||Start Client Query||2|
|DLE0055||Start XC Request Handling||5|
|DLE0007||Start Data Conversion||2|
|DLE0008||Query Sent to Backend||2|
|DLE0011||Start Catalog Lookup||4|
|DLE0012||Catalog Lookup Completed||4|
|DLE0013||Start MD-Store Lookup||4|
|DLE0014||MD-Store Lookup Complete||4|
|DLE0019||Start Transaction Abort||2|
|DLE0021||Start Parameterized Query||5|
|DLE0023||Initialize MD-Store Update||4|
|DLE0024||MD-Store Update Complete||4|
|DLE0025||Intermediate Query Response Received||3|
|DLE0026||Fetch Chunk Complete||5|
|DLE0027||Start TDF Cursor Query||4|
|DLE0028||TDF Cursor Query Complete||4|
|DLE0031||Bulk Load Session||1|
|DLE0032||Start MD-Store Update||4|
|DLE0033||Client Query Hash||2|
|DLE0034||CTAS with UPI||4|
|DLE0035||DML with Quantified Subquery||4|
|DLE0037||Enforce SET Table INSERT||4|
|DLE0040||UPDATE with Join||4|
|DLE0041||UPDATE Temp Table in Xact||4|
|DLE0042||SET Table INSERT||4|
|DLE0043||SET Table INSERT Values||4|
|DLE0045||SET Table UPDATE||4|
|DLE0046||MLOAD Uniqueness Check||4|
|DLE0049||Eliminate Nested Case Expressions||4|
|DLE0050||Quantified Subquery With Long Const List||4|
|DLE0051||Start Fetch Chunk||5|
|DLE0054||Client Query Config||2|
|DLE0060||Query Type Select||2|
|DLE0061||Query Type DDL||2|
|DLE0062||Query Type DML||2|
|DLE0063||Query Type Other||2|
A new version of the tracer log is in development, which is being referred to as the “beta tracer”. The plan is for this log to eventually replace the default tracer log that exists now. For the time being however, the beta tracer is subject to change as we continue to iterate and improve it.
Content and Formatting
The biggest change in the beta tracer is that each line is formatted as a JSON object instead of a CSV row. This allows for better compatibility with many popular tools for log storage and analysis, and also gives us more flexibility in terms of adding custom metadata to tracer events; in a CSV file, adding a new field requires adding a column that is included in every row, but with JSON we can pick and choose which fields are included in different events.
Note that for succinctness, the beta tracer uses shorter event category and name strings. Instead of using the DL codes, it has a simple tag string for each category and event, which typically consists of a few words separated by underscores. For example, the current default tracer might show the string "DLE0013--Start MD-Store Lookup" while the beta tracer would just show "start_mdstore_lookup”. The beta tracer also includes some additional fields to assist in debugging, such as the location in the source code where the event was published. Timestamps are also formatted using the RFC3339 standard, and include greater precision than the time values in the existing default tracer log.
Beta Tracer Log File Naming and Rotation
The file naming for the beta tracer is unaffected by the legacy logger config, as it always uses the new logging code, and thus always behaves the same way as the default tracer with the legacy logger disabled. That is to say, a newly opened beta tracer log is named “beta_tracer.json”, and a timestamp is added on rotation.
Another important addition to the beta tracer is the inclusion of “span” info. While events in the default tracer log are often useful to see when specific actions began, it can be more difficult to see when things end, and therefore hard to determine how long different operations take. Spans are used to fill this informational gap. Any event may indicate that it’s starting a span by including the “new_span_id” field in the event. Later, when the given span ends, a special “end_span” event will be published with the “span_id” field set to the ID of the span that is ending. Any intermediate events published mid-span by the same internal process that started the span will also be tagged with a “span_id” field to indicate that they are happening within that particular span. Spans can also be nested inside of one another, in which case they will be closed in LIFO order.