[tracecompass-dev] Fwd: Re: LTTng/TraceCompass cooperation



-------- Forwarded Message --------
Subject: 	Re: LTTng/TraceCompass cooperation
Date: 	Fri, 19 Dec 2014 16:48:25 -0500
From: 	Julien Desfossez <jdesfossez@xxxxxxxxxxxx>
To: 	Matthew Khouzam <matthew.khouzam@xxxxxxxxxxxx>, Marc-André Laperle
<marc-andre.laperle@xxxxxxxxxxxx>,
alexandre.montplaisir-gon.alves@xxxxxxxxxxxx,
bernd.hufmann@xxxxxxxxxxxx, Dominique Toupin
<dominique.toupin@xxxxxxxxxxxx>, patrick.tasse@xxxxxxxxxxxx,
marco.masse@xxxxxxxxxxxx, Mathieu Desnoyers
<mathieu.desnoyers@xxxxxxxxxxxx>, Jérémie Galarneau
<jeremie.galarneau@xxxxxxxxxxxx>, Christian Babeux
<christian.babeux@xxxxxxxxxxxx>, Geneviève Bastien
<genevieve.bastien@xxxxxxxxxx>, Michel Dagenais
<michel.dagenais@xxxxxxxxxx>



Hi Matthew,

Thanks for your reply and for the details of what you have so far. I
do agree that focusing on the user interface for offline traces is much
more important at first. And I'll be glad to provide feedback on
LTTng-related features such as live trace reading.

Yes, that's a great idea to archive this discussion on the tracecompass
mailing list.

Thanks,

Julien

On 14-12-19 09:36 AM, Matthew Khouzam wrote:
> Hi Julien,
> 
> First off, thank you for the list of steps to take to get up and running
> properly.
> 
> Right now, what we have is not a true live trace reading mode (remote,
> streamed, in-memory, handling all the cases of relayd). If it were, I
> think you know me and my predisposition for spectacle: I would be
> shouting it from the top of the highest mountains.
> 
> What we have right now:
> 
> We read the latest "safe" time from relayd and read the local trace up
> to that time, then we repeat.
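> 
> In Java-ish pseudocode, that loop looks roughly like this (RelayClient
> and TraceReader are illustrative stand-ins, not our actual classes):
> 
>     // Minimal sketch of the current polling mode: ask relayd for the
>     // latest "safe" timestamp, read the local trace copy up to it,
>     // then repeat.
>     interface RelayClient {
>         long latestSafeTimestamp();
>         boolean sessionAlive();
>     }
> 
>     interface TraceReader {
>         void readUpTo(long timestamp);
>     }
> 
>     class SafeTimePoller {
>         static void poll(RelayClient relay, TraceReader trace, long delayMs)
>                 throws InterruptedException {
>             while (relay.sessionAlive()) {
>                 trace.readUpTo(relay.latestSafeTimestamp()); // read up to the safe bound
>                 Thread.sleep(delayMs);                       // then poll again
>             }
>         }
>     }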
> 
> The following patches already exist, but we have not yet merged them;
> they need a little more work to be cleaner.
>  - Update metadata when there is more
>     * The ctf parser part is done: it reads partial metadata fragments
> with a nice performance boost versus re-reading the whole thing (see
> the sketch after this list).
>     * The signal to update trace information is not yet propagated; it
> still needs to be done. (This is a problem inherent to CTF not being
> the only trace format supported by TC.)
>  - Add new streams
>     * same as above
>  - Support reading packets from relayd
>     * Not yet: the ctf parser can handle in-memory packets pretty well,
> but we have issues making a clean implementation since we need to
> support many trace types in TC. Maybe this should be in ctf, and not
> TC?
>     * We do not support very large packets though... they have to fit
> into a memory map.
>  - Our way of reading is racy, yes. It was based on the assumption
> that if we have a "safe" time, it is truly safe. We will certainly
> revisit it with a "write to disk, then re-read" policy, which keeps
> seeks working. We will look into seekless trace streams in what I
> would call phase 3.
> 
>  - Our code is pretty much untested; we don't yet have a non-manual
> testing strategy for this, and feedback is welcome.
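> 
> To make the metadata item concrete, here is roughly the
> partial-fragment idea (hypothetical names, not our real ctf parser
> classes):
> 
>     // Sketch: feed only the newly received TSDL fragment to the
>     // parser, which keeps its declaration state between calls, instead
>     // of re-parsing the whole metadata text on every update.
>     import java.util.ArrayList;
>     import java.util.List;
> 
>     class IncrementalMetadataParser {
>         private final List<String> fragments = new ArrayList<>();
> 
>         /** Parse the new fragment only; earlier declarations stay valid. */
>         void appendFragment(String tsdlFragment) {
>             fragments.add(tsdlFragment);
>             parseDeclarations(tsdlFragment); // cost grows with the new text,
>                                              // not with the whole metadata
>         }
> 
>         private void parseDeclarations(String tsdl) {
>             /* update event/stream declarations from this fragment */
>         }
>     }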
> 
> We know there's a long road ahead to get monitoring in TC. We have
> worked mostly on the front-end, since that also gives user experience
> improvements for offline traces (higher-hanging but MUCH larger fruit).
> We have also worked on making the CTF parser support streaming, and it
> works reasonably well. The main issue at this moment is making sure the
> glue that connects everything together works well.
> 
> This conversation is very good and interesting; I believe we should
> archive it on the tracecompass mailing list if you're OK with that.
> 
> On 14-12-18 01:22 PM, Julien Desfossez wrote:
>> Hi,
>>
>> Last Friday, I had a chance to talk to the TraceCompass team about the
>> live tracing feature and realized that it would be good to establish a
>> communication channel between our two teams.
>> I saw a proof-of-concept of live tracing in which TraceCompass was
>> able to refresh the view and the state of an active trace, and I would
>> like to make sure that the other requirements for processing a live
>> session are well detailed, so that we are all on the same page as to
>> what to expect when this feature is ready for inclusion upstream.
>>
>> I think it will be a good exercise and we should do it whenever a
>> feature in TraceCompass is related to a core feature of LTTng.
>>
>> I propose that, for each high-level item that relates to LTTng, we
>> split the task into multiple subtasks to simplify the whole problem
>> and make sure we are on the right track, with all the features and
>> optimizations implemented in the right order.
>> I am sure something like that already exists in your development process
>> and I don't want to interfere with it, but for the items that have
>> dependencies on LTTng components, I think it would be good to start this
>> kind of cooperation.
>>
>> These should be short bullet points that need to be addressed before
>> the feature is considered complete; optimizations would then follow
>> (in a predefined order of impact as well).
>> Judging from our schedules and work habits, these lists should be
>> exchanged by email to allow quick commenting, multiple iterations and
>> archival of the resulting discussions.
>>
>> We don't have the necessary background/time/workforce to work directly
>> on TraceCompass-specific issues, but I think we would all benefit from
>> this process if you sent us your semi-detailed plan before working on
>> an LTTng feature, so that we could comment on it.
>>
>> As an example, here are the steps I think would be good to implement
>> the live feature in TC, and I'd like to get your feedback (a rough
>> skeleton of the resulting loop follows the list):
>> - connect to the relay with the live viewer protocol
>> - exchange versions
>> - list the sessions established on a relay and show them in a dedicated
>> view (not the session control view) with the details: traced hostname,
>> session name, live timer, number of connected clients, etc.
>> - attach to a session from this list:
>> -- receive all the streams
>> -- receive all the metadata
>> - implement the same interaction with the relay as Babeltrace:
>> -- for each stream, get a data index and the corresponding data
>> (ideally keep the data in memory, but a temporary file could be a start)
>> -- when we have one packet for each stream: process and display the
>> trace we have up to min(safe_time)
>> -- when one stream is at the end of its packet: get the next packet from
>> the relay with get_data
>> -- process the trace up to the next packet end, and refresh the display
>> -- make sure to honour the flags from get_next_index:
>> --- if flagged: update the metadata before processing the next data
>> packet
>> --- if flagged: update the list of active streams before processing
>> the next data packet
>> -- when all the streams have hung up, close the connection with the relay
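>>
>> To make the interaction concrete, here is a rough Java-style skeleton
>> of that loop. The LiveRelayClient and Index types are hypothetical
>> stand-ins; get_next_index, get_data and the new-metadata/new-stream
>> flags are the protocol elements described above:
>>
>>     import java.util.ArrayList;
>>     import java.util.List;
>>
>>     class LiveViewerLoop {
>>         // Hypothetical client for the commands exchanged with relayd.
>>         interface LiveRelayClient {
>>             Index getNextIndex(long streamId); // "get_next_index"
>>             byte[] getData(Index idx);         // "get_data"
>>             void updateMetadata();             // fetch new metadata fragments
>>             List<Long> newStreams();           // streams announced since last call
>>             boolean allStreamsHungUp();
>>             void close();
>>         }
>>
>>         // What an index entry tells us, including the protocol flags.
>>         static class Index {
>>             long streamId;
>>             boolean newMetadata; // metadata changed: fetch it before the data
>>             boolean newStream;   // new streams exist: attach them first
>>             boolean hungUp;      // this stream is done
>>         }
>>
>>         static void run(LiveRelayClient relay, List<Long> initialStreams) {
>>             List<Long> streams = new ArrayList<>(initialStreams);
>>             while (!relay.allStreamsHungUp()) {
>>                 List<Long> discovered = new ArrayList<>();
>>                 for (long stream : streams) {
>>                     Index idx = relay.getNextIndex(stream);
>>                     if (idx.newMetadata) relay.updateMetadata(); // honour flags
>>                     if (idx.newStream) discovered.addAll(relay.newStreams());
>>                     if (idx.hungUp) continue;
>>                     byte[] packet = relay.getData(idx);
>>                     processUpToSafeTime(packet); // parse and refresh the views
>>                 }
>>                 streams.addAll(discovered); // attach the new streams
>>             }
>>             relay.close(); // all streams hung up: close the relay connection
>>         }
>>
>>         private static void processUpToSafeTime(byte[] packet) {
>>             /* feed the packet to the ctf parser; display up to min(safe_time) */
>>         }
>>     }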
>>
>> With that list of items, you will have a live viewer that can work
>> with a remote relay, limit polling of the relay to only when it is
>> needed (when more data is required to make progress), and support the
>> dynamic features related to LTTng (new streams, new metadata,
>> tracefile rotation, undefined storage backend on the relayd side, data
>> availability guarantees, etc.). If you have your own plan of steps for
>> this feature, please share it with us so we can discuss it.
>>
>> After that, we could do an iteration on possible optimizations, but I
>> would not start that before all of these items are complete and
>> thoroughly tested.
>>
>> Again, I don't have all the background on the architecture of
>> TraceCompass, so I am interested in reviewing this plan with you and
>> making sure we can find a way to implement all of it.
>>
>> I hope you will see this as a constructive process to make sure the
>> features we implement have the maximum impact on our users and that we
>> all work in the same direction.
>>
>> Best regards,
>>
>> Julien
> 
