Hi Emily,
Thank you for this suggestion. We do have instructions on the wiki for
users who might want to create their own custom logger. The URL is:
https://wiki.eclipse.org/STEM_Create_EMF_Project
The example is for a JSON logger, but you could create a new delimited-file
logger by a similar set of steps.
Some things to consider: for very long and/or large runs, the format you
suggest below would create much larger log files, because the strings used
to represent the node UIDs would be repeated for each node and each time
step.
IMHO the simplest thing you might try first (simpler than creating a new
logger) is to write a small script in either R or Python that reads the
current STEM log format and exports it to the new format you need for your
other software. This could be very simple code, and you could test it with
a test run of your STEM scenario using a small graph and just a few time
steps. You might try using a Pandas data frame if you choose to go with
Python.
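
A minimal sketch of such a script in Python with Pandas, under some
assumptions: the log is comma-delimited, its first two columns are named
"Iteration" and "time", and every remaining column is one node. The
output column name "value", the helper name wide_to_long, and the sample
data are made up for illustration and are not part of STEM.

```python
import io

import pandas as pd


def wide_to_long(df, id_cols=("Iteration", "time")):
    """Reshape one-column-per-node data into one row per node and time step."""
    # Every column not listed in id_cols is treated as a node column.
    return df.melt(id_vars=list(id_cols), var_name="nodeID", value_name="value")


# Made-up sample standing in for a small STEM log file (assumed layout).
sample = io.StringIO(
    "Iteration,time,stem://org.eclipse.stem/node/1,stem://org.eclipse.stem/node/2\n"
    "0,2018-02-01,1,0\n"
    "1,2018-02-02,2,3\n"
)

long_df = wide_to_long(pd.read_csv(sample))
print(long_df)
```

For a real log you would pass the file path to pd.read_csv instead of the
in-memory sample and write the result out with long_df.to_csv(...,
index=False). This approach loads the whole file into memory, so it is
best suited to testing on a small graph first.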
01/07/2019 09:00 AM Subject:
Vol 117, Issue 1 Sent by:
Send stem-dev mailing list submissions to
Message: 1
Date: Mon, 7 Jan 2019 14:47:53 +0000
From: Emily Nixon <emily.nixon@xxxxxxxxxxxxx>
To: STEM developer mailing list <stem-dev@xxxxxxxxxxx>
Subject: Re: [stem-dev] stem-dev Digest, Vol 116, Issue 6
Message-ID:
Content-Type: text/plain; charset="us-ascii"
That makes sense, I can definitely see why you don't produce summary data
I do often choose to change the log interval so that the recorded simulation
files are smaller. However, for what I am trying to do (calculating how
many farms have an incidence >1 per iteration), I need to have each
What about my other suggestion, to change the format of how the results
are recorded by STEM in the csv files in the recorded simulations folder?
Would it be difficult to change it so that, instead of having nodes as
individual columns, there is just one column which specifies the node? This
would greatly reduce the number of columns and increase the number of rows
instead, which would enable me to read even very large files into the
software I am trying to use to analyse the data.
So instead of the current format:
Iteration  time        stem://org.eclipse.stem/node/1
0          2018-02-01  1
We would have:
Iteration  time        nodeID
0          2018-02-01  stem://org.eclipse.stem/node/1
0          2018-02-01  stem://org.eclipse.stem/node/2
On much smaller files, of course it is possible to modify the output myself
in order to have this format. However, due to the sheer number of nodes
I am using, and therefore the sheer number of columns, I am not able to
use my database management systems to store the data first before I modify
it into this format.
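
One possible way around the column limit is to convert the file one row
at a time, so that the wide table never has to be loaded into a database
or held in memory. The sketch below uses only the Python standard library
and assumes a comma-delimited log whose first two columns are the
iteration and the time, with one column per node after that. The function
name stream_wide_to_long, the output column "value", and the sample data
are hypothetical.

```python
import csv
import io


def stream_wide_to_long(src, dst, n_id_cols=2):
    """Copy a wide CSV stream to a long CSV stream, one input row at a time."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    id_names, node_ids = header[:n_id_cols], header[n_id_cols:]
    writer.writerow(id_names + ["nodeID", "value"])
    rows_written = 0
    for row in reader:
        ids, values = row[:n_id_cols], row[n_id_cols:]
        # Emit one output row per node for this time step.
        for node_id, value in zip(node_ids, values):
            writer.writerow(ids + [node_id, value])
            rows_written += 1
    return rows_written


# Made-up sample standing in for a recorded simulation file (assumed layout).
wide = io.StringIO(
    "Iteration,time,stem://org.eclipse.stem/node/1,stem://org.eclipse.stem/node/2\n"
    "0,2018-02-01,1,0\n"
)
long_out = io.StringIO()
n = stream_wide_to_long(wide, long_out)
print(long_out.getvalue())
```

With real files, src and dst would be file objects opened with
newline="" as the csv module recommends. Memory use then depends on the
width of a single row rather than on the size of the whole file.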
I think this format would be useful to others too, as it is more compatible
with a number of data analysis packages and languages.
How difficult do you think it would be to change the format in which STEM
records the data to the one I've suggested above?
Emily Nixon
PhD Student
School of Biological Sciences
University of Bristol
Bristol Life Sciences Building
24 Tyndall Avenue
Bristol BS8 1TQ
Tel +44 (0)117 394 1389
________________________________
From: stem-dev-bounces@xxxxxxxxxxx <stem-dev-bounces@xxxxxxxxxxx> on behalf of James Kaufman <jhkauf@xxxxxxxxxx>
Sent: 20 December 2018 17:44:05
To: stem-dev@xxxxxxxxxxx
Subject: Re: [stem-dev] stem-dev Digest, Vol 116, Issue 6
Hi Emily,
The current logger allows you to select which compartments to log, but it
does not do integration over the spatial nodes. STEM used to have this
feature (it was called an integrating disease model), but it was removed
for a few reasons.
In order to integrate (e.g.) level 3 nodes up to level 0 (country) nodes,
it is necessary to form a graph with all hierarchical admin levels and
connect them by containment edges. The containment edges still exist in
STEM, so this feature could be reimplemented, but it has several issues:
1) It adds a lot more nodes and edges to your graph.
2) It adds a lot of computational overhead.
3) You need to rethink decoration of the graph itself. You don't want
to propagate the disease at all levels, just aggregate above the lowest
node. But what if the graph is not uniform in depth? Computing this adds
to the overhead as well.
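
To illustrate the aggregation idea outside of STEM, here is a rough
post-processing sketch in Python. It is not how STEM's removed
integrating disease model worked. It assumes you already have a mapping
from each node to its containing node (a hypothetical "parent"
dictionary that you would have to build yourself) and logged values for
the lowest nodes only. Walking up the parent chain copes with branches
of different depth, and it assumes the containment mapping has no
cycles. All node names are invented.

```python
def aggregate_up(values, parent):
    """Add each lowest-level value to the node itself and to every ancestor."""
    nodes = set(values) | set(parent) | set(parent.values())
    totals = dict.fromkeys(nodes, 0)
    for node, value in values.items():
        current = node
        # Follow containment upwards until a node has no parent (the root).
        while current is not None:
            totals[current] += value
            current = parent.get(current)
    return totals


# Hypothetical containment: two counties inside state_1; state_2 has no
# children, so it is the lowest node in its own (shallower) branch.
parent = {
    "county_a": "state_1",
    "county_b": "state_1",
    "state_1": "country",
    "state_2": "country",
}
values = {"county_a": 3, "county_b": 4, "state_2": 5}

totals = aggregate_up(values, parent)
print(totals["state_1"], totals["country"])
```

Because this runs on the logged output, it adds no nodes, edges, or
overhead to the simulation graph itself.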
We decided it was best to keep things simple. The aggregation feature caused
users to make too many errors in composing their graph. As an alternative,
you might consider changing the log time interval if the logs are too big.
The log time interval is not dependent on your integration time interval.
From: stem-dev-request@xxxxxxxxxxx
To: stem-dev@xxxxxxxxxxx
Date: 12/20/2018 03:58 AM
Subject: stem-dev Digest, Vol 116, Issue 6
Sent by: stem-dev-bounces@xxxxxxxxxxx
________________________________
Message: 1
Date: Thu, 20 Dec 2018 11:58:01 +0000
From: Emily Nixon <emily.nixon@xxxxxxxxxxxxx>
To: developer mailing list STEM <stem-dev@xxxxxxxxxxx>
Subject: [stem-dev] Request for a new way of outputting data from STEM
Message-ID: <VI1PR0601MB2352EDD4E8CA3D4513E36CD1BFBF0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"
As some of you may know, I am running scenarios on STEM that contain graphs
with thousands of nodes and edges.
Therefore, the output csv files I get for each compartment of the disease
are all very large which makes them difficult to work with and they take
up a lot of space.
I am trying to develop ways in R to summarise this data so that I have
some meaningful results. However, my supervisor thought that it might be
worth checking with the STEM developers whether it is possible to have
a new logger, or an option on an existing logger, so that only
summary data is output. I know you are very busy at the moment, but she
thought that it might not take too long, depending on how the logger code
is currently written, and that it could be useful for other people too.
Perhaps it could be a new feature on STEM4?
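
As an illustration of the kind of summary meant here, the sketch below
counts, for each iteration, how many nodes have a value above a
threshold (for example, an incidence greater than 1). It is written in
Python rather than R and reads the file row by row with the standard
library. It assumes a comma-delimited file whose first two columns are
the iteration and the time, followed by one numeric column per node. The
function name count_nodes_above and the sample values are made up.

```python
import csv
import io


def count_nodes_above(src, threshold=1.0, n_id_cols=2):
    """Return (iteration, time, count) for each row of a wide CSV stream."""
    reader = csv.reader(src)
    next(reader)  # skip the header row of node names
    summary = []
    for row in reader:
        # Count the node columns whose value exceeds the threshold.
        count = sum(1 for v in row[n_id_cols:] if float(v) > threshold)
        summary.append((row[0], row[1], count))
    return summary


# Made-up sample with three nodes and two iterations (assumed layout).
sample = io.StringIO(
    "Iteration,time,node/1,node/2,node/3\n"
    "0,2018-02-01,1,0,5\n"
    "1,2018-02-02,2,3,0\n"
)
summary = count_nodes_above(sample)
print(summary)
```

The result has one row per iteration regardless of how many nodes the
graph contains, so it stays small even for graphs with thousands of
nodes.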
Specifically, what I would like to have is something like the following
for the Incidence and I_0 csv files:
So instead of having a column for every node, having nodeID as a column
and then having the node names in this column. This would make the list
have more rows but fewer columns, which would make it much easier to work
with.
If anyone has any thoughts or suggestions, if anyone else would find this
useful or if anyone thinks they could help with implementing this in STEM,
then I would greatly appreciate it!