Normalization
Data Normalization
Tarsal normalizes logs from every data source to make analysis easy. It also applies a set of standard fields across all log sources to make cross-log correlation simple.
For example, every event has a time at which it occurred, but sources do not name their timestamp attribute the same way, and there is no guarantee that a source's timestamps use a timezone consistent with other data sources. Tarsal appends a UTC-normalized field called t_event_time to each log, which maps to the log's corresponding event time. That lets you query logs from multiple data sources using t_event_time to align and correlate them despite their disparate schemas.
We append the following fields to every log record:
- t_event_time: The event time for the log, normalized to UTC.
- t_parse_time: The time when the event was parsed, normalized to UTC. If an event does not have a timestamp, t_event_time is set to t_parse_time.
- t_ip_address: The IP address for the log source. Even if one source names its IP address field ipAddr and another names it srcIpAddress, you can query across both by searching for t_ip_address.
- t_email_address: The email address of the user.
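As an illustration of the behavior described above, here is a minimal Python sketch of how standard fields could be derived from records with different schemas. It is not Tarsal's implementation: the function names, the candidate timestamp keys, and the choice to treat timezone-less timestamps as UTC are all assumptions made for the example. Only the t_ field names and the ipAddr / srcIpAddress attributes come from the text above.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical source-specific attribute names; real sources will differ.
TIMESTAMP_KEYS = ("timestamp", "eventTime", "created_at")
IP_KEYS = ("ipAddr", "srcIpAddress")


def to_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp and convert it to UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption for this sketch: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def add_standard_fields(record: dict, parse_time: Optional[datetime] = None) -> dict:
    """Return a copy of the record with the standard t_ fields appended."""
    out = dict(record)
    parse_time = parse_time or datetime.now(timezone.utc)
    out["t_parse_time"] = parse_time.isoformat()

    # Use the first timestamp attribute found; fall back to the parse time.
    raw_time = next((record[k] for k in TIMESTAMP_KEYS if k in record), None)
    event_time = to_utc(raw_time) if raw_time is not None else parse_time
    out["t_event_time"] = event_time.isoformat()

    # Map whichever IP attribute the source uses onto one common field.
    raw_ip = next((record[k] for k in IP_KEYS if k in record), None)
    if raw_ip is not None:
        out["t_ip_address"] = raw_ip
    return out


parsed_at = datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
source_a = add_standard_fields(
    {"timestamp": "2024-01-15T10:30:00+02:00", "ipAddr": "10.0.0.1"}, parsed_at
)
source_b = add_standard_fields({"srcIpAddress": "10.0.0.2"}, parsed_at)
print(source_a["t_event_time"], source_a["t_ip_address"])
print(source_b["t_event_time"], source_b["t_ip_address"])
```

The second record has no timestamp, so its t_event_time falls back to t_parse_time. Both records expose t_ip_address even though their original attribute names differ. A t_email_address mapping would follow the same pattern.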
Table/Bucket/Container Normalization
When you configure a flow with one of the following destinations, these normalization types are available:
Snowflake
Normalization Type | Default | Description |
---|---|---|
NONE | yes | Each stream ends up in its own raw table. A raw table has a single column that holds each record as a JSON blob. Turns off both SINGLE_TABLE and BASIC. |
SINGLE_TABLE | no | All streams end up in one raw table. A raw table has a single column that holds each record as a JSON blob. Turns off both NONE and BASIC. |
BASIC | no | All streams end up in a table where each column is a top-level key. JSON sub-objects are inserted as JSON blobs. Turns on NONE; turns off SINGLE_TABLE. Note: Basic normalization is not yet supported for the Standard Inserts loading method. To use normalized tables, use one of the other two Snowflake loading methods, such as S3 Staging. The loading method is set in the destination configuration. |
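
To make the difference between the three types concrete, the following Python sketch shows the shape of the rows that one record might produce under each type. It only illustrates the table above and is not how Tarsal writes to Snowflake. The table names (raw_<stream>, raw_all_streams), the data column name, and the reading of "Turns on NONE" as "the per-stream raw table is also kept" are assumptions.

```python
import json


def layout_rows(stream: str, record: dict, mode: str = "NONE") -> dict:
    """Return {table_name: row} for one record under a normalization type."""
    # Raw tables hold the whole record as one JSON blob in a single column.
    # The column name "data" is hypothetical.
    raw_row = {"data": json.dumps(record, sort_keys=True)}

    if mode == "NONE":
        # One raw table per stream (hypothetical naming scheme).
        return {f"raw_{stream}": raw_row}
    if mode == "SINGLE_TABLE":
        # Every stream shares one raw table (hypothetical name).
        return {"raw_all_streams": raw_row}
    if mode == "BASIC":
        # Top-level keys become columns; nested values stay as JSON blobs.
        columns = {
            key: json.dumps(value, sort_keys=True)
            if isinstance(value, (dict, list))
            else value
            for key, value in record.items()
        }
        # Assumption: BASIC keeps the per-stream raw table as well.
        return {f"raw_{stream}": raw_row, stream: columns}
    raise ValueError(f"unknown normalization type: {mode}")


record = {"user": "a", "geo": {"city": "X"}}
for mode in ("NONE", "SINGLE_TABLE", "BASIC"):
    print(mode, layout_rows("logins", record, mode))
```

Under BASIC, the nested geo object remains a JSON string in its column, while user becomes a plain column value.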