Re: [scava-dev] Dataset detailing the project details on Eclipse

On 13/10/2018 08:45, Pal, Sukrit wrote:
Good Morning Baldassari,
Good morning Sukrit,

I was trying to use the API but was not able to make much headway with it. After I get the mailing list data, I would need project-specific information such as:
  * Number of project members and their ids
That you can get by analysing the dataset or API results.

  * Number of days they have spent in the project
That you can get by analysing the dataset or API results.

  * Id of the project owner/manager
That you can get from the project description:
https://projects.eclipse.org/

  * Any organizational associations
?? I don't know what you mean. As above, you may find the project description useful for that:
https://projects.eclipse.org/

  * Some other control variables
??

However, it seemed that the API would not be able to provide these details, though I may be completely wrong about that. Please advise.
There is an R Markdown [0] documentation file that describes the mailing list dataset [1,2]. Depending on the tools you use for your work, this file provides examples of analysis and answers some of your questions. In some cases it might be necessary to combine several sources and do some cross-analysis.


[0] https://rmarkdown.rstudio.com/
[1] https://software-data.org/datasets/eclipse_mls/resources/mbox_analysis.html
[2] https://software-data.org/datasets/eclipse_mls/resources/mbox_analysis.rmd
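
As a rough illustration of the "analyse the dataset" answers above (number of members, days spent in a project), here is a minimal Python sketch. It assumes the data has been exported to CSV with the columns `list`, `sender_id` and `sent_at`; those column names and the sample rows are hypothetical and not taken from the actual dataset, so they would need adapting to the real headers.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def member_activity(rows):
    """Return {list_name: {sender_id: (first_day, last_day, days_spanned)}}."""
    seen = defaultdict(dict)
    for row in rows:
        # Assumed ISO 8601 timestamps; only the calendar day is kept.
        day = datetime.fromisoformat(row["sent_at"]).date()
        first, last = seen[row["list"]].get(row["sender_id"], (day, day))
        seen[row["list"]][row["sender_id"]] = (min(first, day), max(last, day))
    return {
        name: {
            sid: (first, last, (last - first).days + 1)  # inclusive span
            for sid, (first, last) in members.items()
        }
        for name, members in seen.items()
    }


# Made-up sample standing in for the real export (hypothetical columns).
SAMPLE = """list,sender_id,sent_at
scava-dev,a1,2018-10-01T09:00:00
scava-dev,b2,2018-10-03T12:30:00
scava-dev,a1,2018-10-13T08:45:00
"""

activity = member_activity(csv.DictReader(io.StringIO(SAMPLE)))
for name, members in activity.items():
    print(name, "members:", len(members))
    for sid, (first, last, days) in sorted(members.items()):
        print(" ", sid, "first:", first, "last:", last, "span in days:", days)
```

The span between a member's first and last message is only a proxy for "days spent in the project"; someone may be active without posting to the list.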


Hope that helps.

Cheers,


--
boris




Regards,
Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business

On Oct 12, 2018, at 9:30 PM, Pal, Sukrit <palsukri@xxxxxxxxxxxxx> wrote:

Thanks Boris for the file. I think it will be very helpful for the project. I would love to share the results with you once I have completed my analysis.

Going forward, I will try to download some project-specific information using the API you mentioned. If I face any issues, I may get in touch with you again.
Thanks for helping me!

Regards,
Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business

On Oct 12, 2018, at 4:33 PM, Boris Baldassari <boris@xxxxxxxxxxxxxx> wrote:

[Note: for the mailing list, I've had to switch to another email address. Sorry for the noise.]

On 12/10/2018 21:40, Pal, Sukrit wrote:
Good Afternoon All,
Good evening Sukrit,

I am trying to study the empirical impact that team cohesion may have on project success. Hence, I need the mailing list data to build a network of connectedness among the team members working on projects. I do not need the full emails; a record showing which two ids have exchanged an email should be enough. Along with that, I would need project-specific information: the developers associated with each project, the project manager id, etc.
I'd be very interested in the results (visualisation, analysis of the relationships, etc.) if you plan to make them public!
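
For what it's worth, a record of "which two ids have exchanged an email" can be approximated from header data alone. On a mailing list every message is addressed to the list, so the sketch below treats a direct reply as an exchange between two senders. It is a minimal Python sketch under assumptions: the field names `message_id`, `in_reply_to` and `sender_id` and the sample messages are hypothetical, and whether the dataset exposes reply headers would need checking.

```python
from collections import Counter


def reply_edges(messages):
    """Count undirected sender pairs linked by a direct reply."""
    sender_of = {m["message_id"]: m["sender_id"] for m in messages}
    edges = Counter()
    for m in messages:
        parent = sender_of.get(m.get("in_reply_to"))
        if parent is None or parent == m["sender_id"]:
            continue  # thread starter, unknown parent, or self-reply
        # Sort the pair so (a, b) and (b, a) count as the same edge.
        edges[tuple(sorted((parent, m["sender_id"])))] += 1
    return edges


# Made-up messages with hypothetical field names.
MESSAGES = [
    {"message_id": "m1", "in_reply_to": None, "sender_id": "a1"},
    {"message_id": "m2", "in_reply_to": "m1", "sender_id": "b2"},
    {"message_id": "m3", "in_reply_to": "m2", "sender_id": "a1"},
    {"message_id": "m4", "in_reply_to": "m1", "sender_id": "c3"},
    {"message_id": "m5", "in_reply_to": "m4", "sender_id": "c3"},
]

edges = reply_edges(MESSAGES)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

The resulting weighted edge list can be loaded into any graph tool. If reply headers are not available, grouping senders by thread subject would be a coarser alternative.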

A quick question: the file available for download at the link below is only 1 KB, whereas it is listed as 12 MB, expanding to 63 MB. I am not able to open the zipped file that I download from this link using any unzipping software on Mac. I am not sure if I am doing something wrong.
https://software-data.org/datasets/eclipse_mls
That was a problem on my side. It should work now; please try again and let me know if it still fails.


Cheers,


--
boris




Regards,
Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business
On Oct 12, 2018, at 3:45 AM, Philippe Krief <philippe.krief@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Let's continue this interesting exchange on one of our public mailing lists.

Thanks
*—*
Philippe Krief
Research Relations Director | Eclipse Foundation, Inc.
Eclipse Foundation <http://www.eclipse.org/>: The Platform for Open Innovation and Collaboration
M: +33 (0)6 21 01 06 81

Annastr. 44, D-64673 Zwingenberg
Handelsregister: Darmstadt HRB 92821
Managing Directors: Ralph Mueller, Mike Milinkovich, Chris Laroque

Begin forwarded message:

*From: *Boris Baldassari <boris.baldassari@castalia.solutions>
*Subject: **Re: Fwd: Dataset detailing the project details on Eclipse*
*Date: *12 October 2018 at 09:17:00 CEST
*To: *"Pal, Sukrit" <palsukri@xxxxxxxxxxxxx>
*Cc: *Philippe Krief <philippe.krief@xxxxxxxxxxxxxxxxxxxxxx>, Gaël Blondelle <gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx>, Davide Di Ruscio <davide.diruscio@xxxxxxxxx>

Good morning Sukrit,

Hope you are fine.

There are a number of ways to get information regarding communication channels in Eclipse, although there is no equivalent of the GHTorrent initiative yet.

* The Eclipse API [1] provides access points for mailing lists and forums of all projects, among other things. The anonymisation technique however is fairly brutal (e.g. on email addresses) and some information may be missing, depending on what you are looking for.

* If you're interested in issues (and their comments), the new version of Bugzilla hosted at Eclipse provides an API that can be used. The first paragraphs of the Eclipse API documentation [1] link to the API documentation of the individual tools. There is also a dataset for Eclipse bugs, published by an external lab, available from [5].

* There is a curated dataset for mailing lists available at [2], which includes basic header information without the body of the emails. Its strong point is that the anonymisation used preserves the uniqueness of IDs, thus enabling research on collaboration. Once again, it depends on your requirements. Please note however that this URL [2] may change in the upcoming months -- but then we'll provide a link to the new location.

* The Crossminer research project [3] also addresses communication channels, analysing the interactions between collaborators of projects. The outcomes (dashboards and results) of the project will not be available before 2019-Q3, but some datasets have already been published -- the above-mentioned dataset for mailing lists is one of them -- and you may want to follow its Eclipse offspring, Scava [4], for more information. Several other datasets will be published by Scava in the upcoming months.
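
To illustrate what "preserves the uniqueness of IDs" means in practice, here is a minimal Python sketch of one common approach, keyed hashing. This is an assumption for illustration only and not necessarily the method used for the dataset at [2]; the key and the addresses are made up. The same address always maps to the same pseudonym, so collaboration links survive, while the address itself is not exposed.

```python
import hashlib
import hmac


def pseudonymise(address, secret_key):
    """Map an email address to a stable, opaque identifier."""
    # Normalise first so trivial variations map to the same pseudonym.
    normalised = address.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


KEY = b"example-key-kept-private"  # hypothetical secret, never published

a = pseudonymise("Alice@example.org", KEY)
b = pseudonymise("alice@example.org ", KEY)
c = pseudonymise("bob@example.org", KEY)

print(a == b)  # same person, same pseudonym
print(a == c)  # different people, different pseudonyms
```

A scheme that replaces every address with a fresh random token, by contrast, would hide identities but also destroy the links between messages from the same sender.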


Are there other areas of collaboration that you're interested in?
Please let me know if you have any questions or remarks regarding data access for the Eclipse forge.


Have a wonderful day, cheers! :-)



[1] https://api.eclipse.org/docs
[2] https://software-data.org/datasets/eclipse_mls
[3] https://crossminer.org
[4] https://projects.eclipse.org/projects/technology.scava
[5] https://www.st.cs.uni-saarland.de/softevo/bug-data/eclipse/


--
boris



---------- Forwarded message ----------
*From:* Pal, Sukrit <palsukri@xxxxxxxxxxxxx>
*Date:* 10 Oct 2018 at 18:04 +0200
*To:* gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx
*Cc:* research@xxxxxxxxxxx
*Subject:* Dataset detailing the project details on Eclipse
Good Afternoon Team,

Greetings! I hope you are doing great.
I am a PhD candidate at Michigan State University and am looking to study collaboration patterns among team members on open source software development platforms that promote productivity and efficiency in OSS development efforts. I would like to base my dissertation project on this study. For the study I need data related to communications among project team members (among many others). It would be great if you could point me to a site that maintains project details for Eclipse as a dataset (much like GHTorrent does for GitHub). Any information on the topic would be very helpful.

Regards,
Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business


--
Boris Baldassari
Castalia Solutions -- Elegant Software Engineering
Web: http://castalia.solutions
Tel: +33 6 48 03 82 89






