[tracecompass-dev] Trace Compass state snapshot exchange format

Hello people.

Alex and I were discussing an exchange format for a Trace Compass
state snapshot and we came up with an idea. Consider this a proposal
for something to put on the wishlist, not a priority.

The current format of a Trace Compass state history is made of two
parts, both within the same binary file:

1. The intervals. Each interval has a (numeric) key, which uniquely
   identifies a "timeline". In other words, two intervals with the same
   key cannot overlap. Each interval holds a value of some type (null,
   integer, string, etc.).

   Trace Compass calls this key a "quark" because it associates it with
   a path string (more below).

2. An association of keys to state tree nodes (identified by paths).

We consider the second part a simple extension of the first one. It's
possible to have a state history with only part 1, but you cannot
organize it as a tree without part 2.

The first problem I see is that a whole timeline (a given key) can have
a great many intervals whose values all share the same type. As it is
designed right now, each interval holds its value _and_ its type ID
(I think there's one byte allocated for the type ID).

Here's a suggestion:

1. Keep part 1 of the file, but, if a given timeline only has interval
   values which share the same type, do not write the type ID there.

2. Have a JSON file, `types.json` for example, which associates keys
   to types of values:

       [
         {"key": 0, "type": "int"},
         {"key": 1, "type": "string"},
         {"key": 2, "type": "int"},
         {"key": 3, "type": "int"},
         {"key": 4, "type": "bool"},
         {"key": 5, "type": "string"},
         {"key": 6, "type": "none"},
         {"key": 7, "type": "any"},
         {"key": 8, "type": "int"}
       ]

   Other layouts are possible. This is just an idea.

   The `any` type means that the type ID is part of the interval in the
   binary file (just like right now). This is a variant type. It could
   also be called `variant`.

   The `none` type means all the intervals which have this key have
   no value (they just exist).

3. Have a JSON file, `tree.json` for example, which associates keys to
   tree nodes (this example is not related to the key-to-type
   association example above):

       {
         "root": {
           "key": 0,
           "children": {
             "Threads": {
               "key": 1,
               "children": {
                 "390": {
                   "key": 2,
                   "children": {
                     "Exec_name": {"key": 3},
                     "Prio": {"key": 4},
                     "System_call": {"key": 5},
                     "PPID": {"key": 6}
                   }
                 },
                 "1823": {
                   "key": 20,
                   "children": {
                     "Exec_name": {"key": 21},
                     "Prio": {"key": 22},
                     "System_call": {"key": 23},
                     "PPID": {"key": 24}
                   }
                 }
               }
             },
             "CPUS": {
               "key": 7,
               "children": {
                 "0": {
                   "key": 8,
                   "children": {
                     "Current_thread": {"key": 9},
                     "IRQs": {"key": 10},
                     "Soft_IRQs": {"key": 11}
                   }
                 },
                 "1": {
                   "key": 12,
                   "children": {
                     "Current_thread": {"key": 13},
                     "IRQs": {"key": 14},
                     "Soft_IRQs": {"key": 15}
                   }
                 },
                 "2": {
                   "key": 16,
                   "children": {
                     "Current_thread": {"key": 17},
                     "IRQs": {"key": 18},
                     "Soft_IRQs": {"key": 19}
                   }
                 }
               }
             }
           }
         }
       }

So there you have it: `history.bin`, `types.json`, `tree.json`.
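
To make the key-to-node association concrete, here is a minimal Python
sketch of how a consumer might walk `tree.json`. It assumes the layout
proposed above (a `root` object, a `key` per node, an optional
`children` object). The function names `path_to_key` and
`key_to_path_map` are hypothetical and not part of Trace Compass.

```python
import json

# Hypothetical excerpt of tree.json, following the layout proposed above.
TREE_JSON = """
{
  "root": {
    "key": 0,
    "children": {
      "Threads": {
        "key": 1,
        "children": {
          "390": {
            "key": 2,
            "children": {
              "Exec_name": {"key": 3},
              "Prio": {"key": 4}
            }
          }
        }
      }
    }
  }
}
"""


def path_to_key(tree, path):
    """Return the key of the node at `path` (node names below the root)."""
    node = tree["root"]
    for name in path:
        # Leaf nodes carry no "children" property in the proposed layout.
        node = node.get("children", {})[name]
    return node["key"]


def key_to_path_map(tree):
    """Build the reverse association: key -> tuple of node names."""
    result = {}

    def walk(node, path):
        result[node["key"]] = tuple(path)
        for name, child in node.get("children", {}).items():
            walk(child, path + [name])

    walk(tree["root"], [])
    return result


tree = json.loads(TREE_JSON)
prio_key = path_to_key(tree, ["Threads", "390", "Prio"])
paths = key_to_path_map(tree)
```

A missing node name raises `KeyError` here. A real consumer would
probably want a friendlier error.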

Now, a Trace Compass state snapshot is made of:

1. `types.json` (same file used for the history).

2. `tree.json` (same file used for the history).

3. `state.json`, a simple key-to-value association:

       [
         {"key": 0, "value": 1728},
         {"key": 1, "value": "lttng-sessiond"},
         {"key": 2, "value": true},
         {"key": 3, "value": -17.23}
       ]

You get the idea. Add a few version properties here and there to make
this as backward and forward compatible as possible.
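
As an illustration of how the three snapshot files could fit together,
the following Python sketch loads `state.json` and checks each value
against the type declared in `types.json`. The sample data is made up
for this example, and `matches` and `load_snapshot` are hypothetical
names. It only handles the type names used in this proposal (`int`,
`string`, `bool`, `none`, `any`).

```python
import json

# Hypothetical sample files, consistent with each other.
TYPES_JSON = """[
  {"key": 0, "type": "int"},
  {"key": 1, "type": "string"},
  {"key": 2, "type": "bool"},
  {"key": 3, "type": "any"},
  {"key": 4, "type": "none"}
]"""

STATE_JSON = """[
  {"key": 0, "value": 1728},
  {"key": 1, "value": "lttng-sessiond"},
  {"key": 2, "value": true},
  {"key": 3, "value": -17.23}
]"""


def matches(type_name, value):
    """Return True if `value` is acceptable for the declared type."""
    if type_name == "any":
        return True
    if type_name == "none":
        return value is None
    if type_name == "bool":
        return isinstance(value, bool)
    if type_name == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    raise ValueError(f"unknown type name: {type_name!r}")


def load_snapshot(types_text, state_text):
    """Return a key -> value dict, enforcing the declared types."""
    types = {entry["key"]: entry["type"] for entry in json.loads(types_text)}
    snapshot = {}
    for entry in json.loads(state_text):
        key, value = entry["key"], entry.get("value")
        if not matches(types[key], value):
            raise TypeError(f"key {key}: expected {types[key]}, got {value!r}")
        snapshot[key] = value
    return snapshot


snapshot = load_snapshot(TYPES_JSON, STATE_JSON)
```

Key 3 is declared `any`, so its floating-point value is accepted without
needing a dedicated type name.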

Some advantages of this layout:

* `types.json` means you save some space in all the intervals (type ID).

* `types.json` means you enforce the one type per timeline rule if you
  need it.

  The `any` or `variant` type still exists for situations where a state
  tree node has a current value of which the type can change over time.
  I'm thinking about the current value of a Python variable in an
  application trace, for example.

* A state history and a state snapshot have an intersection
  (`types.json` and `tree.json`), which means common code to produce and
  consume both. In fact, both files should already exist by the time the
  snapshot is taken.

Perhaps a single bit should be reserved in each interval to indicate if
it has a value or not (null or not). It's not necessary to use a variant
for this use case which seems so common.
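
Here is a rough Python sketch of the "has a value" bit. The byte layout
below (start, end, key, one flags byte, then an optional 64-bit integer
value) is purely an assumption made for illustration. It is not the
actual Trace Compass history layout, and the names are hypothetical. It
covers a timeline declared as `int` in `types.json`, so no type ID is
written.

```python
import struct

# Assumed layout: int64 start, int64 end, int32 key, uint8 flags.
HEADER = struct.Struct("<qqiB")
INT_VALUE = struct.Struct("<q")
HAS_VALUE = 0x01  # bit 0 of the flags byte: the interval carries a value


def encode_int_interval(start, end, key, value=None):
    """Serialize one interval of an `int` timeline; None means null."""
    flags = HAS_VALUE if value is not None else 0
    data = HEADER.pack(start, end, key, flags)
    if value is not None:
        data += INT_VALUE.pack(value)
    return data


def decode_int_interval(data):
    """Inverse of encode_int_interval; returns (start, end, key, value)."""
    start, end, key, flags = HEADER.unpack_from(data, 0)
    value = None
    if flags & HAS_VALUE:
        (value,) = INT_VALUE.unpack_from(data, HEADER.size)
    return start, end, key, value


with_value = encode_int_interval(100, 200, 4, 1728)
null_interval = encode_int_interval(200, 300, 4)
```

With this assumed layout, a null interval simply omits the value bytes,
so the timeline can stay typed as `int` rather than falling back to
`any`.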

We have ideas to handle this on the API side too. I could give details
if there's interest.

Again, this is intended for TC's wishlist. We came up with a snapshot
format (also JSON) which works with the current state history format,
but it needs to include the type information and it's completely
independent of the state history's format.

Thoughts?

Philippe Proulx
EfficiOS Inc.
http://www.efficios.com/

