STEM Plan for Version 2 [message #492980]
Thu, 22 October 2009 10:59
Eclipse User
Let's start a thread here about an update to the website reflecting our plan for Version 2.
We might want to update the plan web page to reflect what we definitely have on track, as well as a wish list of features where we would especially welcome/encourage new committers.
In particular:
Matthias Filter wrote:
> Dear Dan,
> yes, you were right: the thing I was looking for was a graph editor. The ability to edit or modify preinstalled graphs could be quite useful, especially for users who are not familiar with programming or who run into technical problems when following the developer instructions. So if it could be implemented without too much effort, it would certainly be a big plus.
> Concerning the graph visualization tool: I can imagine that there are a lot of other tasks on the "wish" list. So maybe one could describe these higher-level goals on the project plan website and thereby further increase support for the development team. I myself will be able to invest some of my resources in this project next year, and such information would be very helpful for planning that as well.
>
> Matthias
Re: Features in Version 2 - some ideas [message #501595 is a reply to message #501357]
Wed, 02 December 2009 18:49
Eclipse User
Hi Matthias,
these are all very good suggestions. Let me try to comment on some of them, but I'm sure others have thoughts on them as well.
As for the feature request, that's what we originally wanted to do, but there really isn't any other suitable forum provided by Eclipse where anybody can easily provide input, unless we write the web app ourselves. A wiki page was suggested, but that might be too difficult to use for a general audience.
I'm not sure I understand your first suggestion. If you know the distribution of farm size and the number of farms in each district, wouldn't you be able to roughly estimate the number of cattle in each district? That could then easily be incorporated into a STEM model.
We do not have a "parameter sensitivity" estimate right now, but I can see it being incorporated into the Experiment features in STEM. Experiments allow you to run many simulations varying all the values of the disease model parameters (either automatically or manually). I think it is a good suggestion. If you could give us an indication of the "importance" of such a feature to a public health person such as yourself it would be very helpful to us. It is actually possible to manually calculate the sensitivity of each parameter right now by looking at the log files generated when running experiments, but that would require some skills.
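To make the sensitivity idea concrete, here is a minimal one-at-a-time sketch in Python. It does not use STEM's Experiment feature or its log files; the SIR model, the function names `run_sir` and `oat_sensitivity`, and all parameter values are assumptions chosen only for illustration.

```python
def run_sir(beta, gamma, s0=0.99, i0=0.01, steps=200, dt=0.5):
    """Integrate a basic SIR model with Euler steps.

    Returns the peak infected fraction. This is a toy stand-in for a
    simulation run, not a STEM disease model.
    """
    s, i = s0, i0
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return peak


def oat_sensitivity(model, baseline, rel_step=0.10):
    """One-at-a-time sensitivity: perturb each parameter by rel_step and
    report the relative output change divided by the relative input change."""
    base_out = model(**baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1 + rel_step)
        out = model(**perturbed)
        result[name] = ((out - base_out) / base_out) / rel_step
    return result


# Hypothetical baseline parameter values
baseline = {"beta": 0.4, "gamma": 0.1}
sens = oat_sensitivity(run_sir, baseline)
for name, value in sens.items():
    print(f"{name}: {value:+.2f}")
```

A positive value means the peak grows when the parameter grows; a negative value means it shrinks. Because the models are nonlinear, such a local estimate is only valid near the chosen baseline.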
I don't know what the traffic light principle is. STEM allows you to compare the outcome of one simulation with another, and even compare simulation results with actual reference data if available. Perhaps you can explain this feature a little more?
The ability to add/remove nodes and edges from within the application has been requested many times, so we need to do it for sure. One complication is that models in STEM point to graphs in the standard STEM library, which cannot be modified. So one alternative is to make a copy of all the nodes/edges in the STEM library when you incorporate them into your model so you can edit them. However, the STEM libraries can be HUGE (like every region on the planet), so that might not be the best solution. A better solution is to keep a "change" log where the modifications done by the user are applied on top of the STEM library ones.
Importing other file formats into STEM is easier for some formats than for others right now. We support ESRI shape files for GIS data, for instance, but other formats would require more work. Is there any particular format that you encounter frequently that would be useful?
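The "change" log idea could be sketched roughly as follows. This is only an illustration in Python: the class `GraphOverlay` and its methods are hypothetical and say nothing about how STEM actually stores graphs.

```python
class GraphOverlay:
    """Read-only library graph plus a log of user edits applied on top.

    Hypothetical sketch: the library data is never copied or modified,
    only the (usually small) list of changes is stored with the model.
    """

    def __init__(self, library_nodes, library_edges):
        self._nodes = frozenset(library_nodes)   # immutable library nodes
        self._edges = frozenset(library_edges)   # immutable (source, target) pairs
        self.log = []                            # ordered list of (operation, argument)

    def add_node(self, node):
        self.log.append(("add_node", node))

    def remove_node(self, node):
        self.log.append(("remove_node", node))

    def add_edge(self, edge):
        self.log.append(("add_edge", edge))

    def remove_edge(self, edge):
        self.log.append(("remove_edge", edge))

    def materialize(self):
        """Replay the log in order on top of the library graph."""
        nodes, edges = set(self._nodes), set(self._edges)
        for op, arg in self.log:
            if op == "add_node":
                nodes.add(arg)
            elif op == "remove_node":
                nodes.discard(arg)
                # Removing a node also drops every edge touching it
                edges = {e for e in edges if arg not in e}
            elif op == "add_edge":
                edges.add(arg)
            elif op == "remove_edge":
                edges.discard(arg)
        return nodes, edges


library_nodes = {"A", "B", "C"}
library_edges = {("A", "B"), ("B", "C")}

overlay = GraphOverlay(library_nodes, library_edges)
overlay.add_node("D")
overlay.add_edge(("C", "D"))
overlay.remove_node("A")

nodes, edges = overlay.materialize()
```

Here `nodes` ends up as B, C and D and `edges` as (B, C) and (C, D), while `library_nodes` and `library_edges` are untouched. The cost of replaying grows with the number of edits, not with the size of the library.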
Regards,
/ Stefan
Re: Features in Version 2 - some ideas [message #502521 is a reply to message #501595]
Tue, 08 December 2009 11:46
Eclipse User
Hi Stefan,
To Stefan's comments:
To the first point (nodes without GIS information):
Maybe the example was not the best. Of course you could work around the issue if you don't have the exact position of a node, but this is extra work for a user who simply wants to apply STEM to, e.g., a simple edge list. In the end this might keep potential users from applying STEM. In the area I'm working in, for example, we don't want to locate the farm or the production site on a map, because this would cause problems with data privacy. So for me the question is whether it would take much effort to open the system to that kind of data, because the epidemiological models and the simulation infrastructure connected with them are, as far as I understand, not directly affected by the exact location of a node.
So if this (nodes without GIS information) were possible, one could also extend the STEM functionality with features from the graph theory community that help to describe the network used in a simulation, e.g. the distribution of the in-degrees or out-degrees of nodes. One solution might be to manually assign the same coordinates to all nodes whose exact locations are missing and have the system apply some jittering when displaying the network. But this is certainly just one possibility.
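As an illustration of the kind of graph-theory summary meant here, the in-degree and out-degree distributions can be computed from a plain edge list without any coordinates. This is a standalone Python sketch; the function name and the farm identifiers are made up, and it is not a STEM feature.

```python
from collections import Counter


def degree_distributions(edges):
    """Return (in_dist, out_dist) for a directed edge list.

    Each distribution maps a degree to the number of nodes with that degree.
    """
    indeg, outdeg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
        nodes.update((src, dst))
    # Looking up a missing key in a Counter yields 0, so nodes with no
    # incoming (or outgoing) edges are counted under degree 0.
    in_dist = Counter(indeg[n] for n in nodes)
    out_dist = Counter(outdeg[n] for n in nodes)
    return dict(in_dist), dict(out_dist)


# Hypothetical trade contacts between farms, with no location data at all
edges = [("farm1", "farm2"), ("farm1", "farm3"), ("farm2", "farm3")]
in_dist, out_dist = degree_distributions(edges)
```

In this small example every degree from 0 to 2 occurs exactly once, both for incoming and for outgoing edges.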
- Parameter sensitivity issue:
Your idea is very good; the Experiment features should be able to cover this. For me personally, a sensitivity analysis would be a must if I had to base a decision on a simulation (unless I can verify my simulation against independent historical real-world data). But to be honest, I don't have to make that kind of decision here.
- Traffic light principle: this means a simple three-colour coding scheme, where red is bad and green is good.
I will comment on the other points later.
Matthias
Re: Features in Version 2 - some ideas [message #503835 is a reply to message #502584]
Tue, 15 December 2009 17:56
Eclipse User
Hi Matthias,
to clarify, we do not require the exact location of farms, people, etc. in STEM, since we know that would be a privacy problem. For instance, all the public health data we use to evaluate our models in STEM must be de-identified and contain aggregate information only, for example how many new cases of flu were reported in ZIP code xyz on a given day.
So in your example, we don't want the exact lat/lon position of the farm; all we'd need to know is what STEM region it is located in. Right now, the finest-granularity regions we have in STEM go down to admin level 2, which typically is a county. So disclosing the county a farm is in should be okay, right?
Using your concept of "traffic lights" to indicate the confidence we have in the input to the model is an interesting idea. One thing we do know is the year our population data for a country comes from, so if that year is far in the past we would be less confident in those numbers. It would take some effort to implement such a feature; right now I would put a higher priority on being able to handle zoonotic diseases, multi-serotype disease models and new, improved stochastic models in STEM.
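A minimal sketch of how the age of the population data could be mapped to a traffic light, in Python. The function name and the thresholds of 5 and 15 years are arbitrary assumptions for illustration, not values used by STEM.

```python
def data_age_light(data_year, current_year, green_max_age=5, yellow_max_age=15):
    """Map the age of a data set (in years) to a traffic-light label.

    The thresholds are assumptions; sensible values would have to be
    agreed on for each kind of input data.
    """
    age = current_year - data_year
    if age < 0:
        raise ValueError("data_year lies in the future")
    if age <= green_max_age:
        return "green"
    if age <= yellow_max_age:
        return "yellow"
    return "red"


# Hypothetical population data years, judged from the year 2009
for year in (2008, 2000, 1990):
    print(year, data_age_light(year, current_year=2009))
```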
Jamie knows more about the ESRI shapefile import into STEM. Jamie, can you let Matthias know how we support that?
Regards,
/ Stefan
Re: Features in Version 2 - some ideas [message #561911 is a reply to message #561892]
Thu, 17 December 2009 11:59
Eclipse User
Hi Stefan,
thanks a lot for the comments and the feedback.
To the location issue:
OK, then I misinterpreted the available possibilities.
I assumed that I could create my own property files (e.g. DEU_3_node.properties) and the corresponding SPATIAL_URI (e.g. DEU_3_MAP.xml) for my specific needs, and that after recompiling I could run STEM on my own property files.
So if this is not possible, then it would be another "nice to have".
Concerning the "traffic light" issue: it is clear that development resources are limited and that not all wishes will come true, even though it is close to Christmas. I just want to mention that this feature could be applied to every model you create with STEM, while e.g. the ability to model multi-serotype diseases would "only" affect a limited number of diseases. Of course, if you generate models just for fundamental research, then the documentation and evaluation of model assumptions might not be that critical, but if you want to base a decision on a model, you would want to know that. At least this is the feedback I get from responsible risk managers here in Germany. So for the prioritization of development tasks one might ask: who is the main target group for STEM? Is it the scientific community, the risk assessor or the risk manager?
By the way, a traffic light score for a certain parameter could also be generated by the community. So if, for example, several STEM users independently assign a good score / green label to a parameter (e.g. the population size in a region), this would increase my confidence in the data as well.
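One possible way to combine independent labels from several users, sketched in Python. The majority rule, the minimum of three votes and the fallback to "yellow" when users disagree are my own assumptions for illustration.

```python
from collections import Counter


def community_light(votes, min_votes=3):
    """Combine independent user labels for one parameter.

    Returns None while there are too few votes, the majority label if
    more than half of the users agree, and "yellow" otherwise.
    """
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    if count > len(votes) / 2:
        return label
    return "yellow"


# Hypothetical labels for "population size in a region"
print(community_light(["green", "green", "red"]))     # majority says green
print(community_light(["green", "red", "yellow"]))    # no majority
print(community_light(["green"]))                     # too few votes
```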
Regards,
Matthias
Re: Features in Version 2 - some ideas [message #562016 is a reply to message #561892]
Fri, 08 January 2010 15:26
Eclipse User
Matthias,
Per this week's STEM call, I look forward to your paper on the traffic light idea. I think we need a general framework that represents not only confidence in a parameter (red, yellow, green) but also a measure of how important the parameter is to the model. This is difficult, as the models are all nonlinear.
Dublin Core was supposed to accomplish this, but it does not address individual model parameters. It does, however, describe the source and validity of denominator data.