[scava-dev] Fwd: Dataset detailing the project details on Eclipse

Let's continue this interesting exchange on one of our public mailing lists.

Philippe Krief
Research Relations Director | Eclipse Foundation, Inc.
Eclipse Foundation: The Platform for Open Innovation and Collaboration
M: +33 (0)6 21 01 06 81


Annastr. 44, D-64673 Zwingenberg
Handelsregister: Darmstadt HRB 92821
Managing Directors: Ralph Mueller, Mike Milinkovich, Chris Laroque

Begin forwarded message:

From: Boris Baldassari <>
Subject: Re: Fwd: Dataset detailing the project details on Eclipse
Date: 12 October 2018 at 09:17:00 CEST
To: "Pal, Sukrit" <palsukri@xxxxxxxxxxxxx>

Good morning Sukrit,

Hope you are fine.

There are a number of ways to get information about communication channels at Eclipse, although there is no equivalent of the GHTorrent initiative yet.

* The Eclipse API [1] provides access points for mailing lists and forums of all projects, among other things. The anonymisation technique however is fairly brutal (e.g. on email addresses) and some information may be missing, depending on what you are looking for.

* If you're interested in issues (and their comments), the new version of Bugzilla hosted at Eclipse provides an API that can be used. The first paragraphs of the Eclipse API documentation [1] link to the documentation of the tool APIs. There is also a dataset of Eclipse bugs available from [5], published by an external lab.

* There is a curated dataset for mailing lists available at [2] which includes basic header information, without the body of emails. Its strong point is that the anonymisation used preserves the uniqueness of IDs, thus enabling research on collaboration. Once again, it depends on your requirements.
Please note however that this URL [2] may change in the coming months -- but then we'll provide a link to the new location.

* The Crossminer research project [3] also addresses communication channels, analysing the interactions between collaborators of projects. The outcomes (dashboards and results) of the project will not be available before 2019-Q3, but some datasets have already been published -- the above-mentioned dataset for mailing lists is one of them -- and you may want to follow its Eclipse offspring, Scava [4], for more information. Several other datasets will be published by Scava in the upcoming months.
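
To illustrate why an anonymisation that preserves the uniqueness of IDs still allows collaboration research, here is a minimal sketch in Python. It is not the method used for the dataset at [2]; the keyed hash (HMAC-SHA256), the `pseudonymise` helper, the secret key and the sample addresses are all assumptions made up for this example.

```python
import hashlib
import hmac

# Hypothetical secret; a real publisher would keep its own key private.
SECRET_KEY = b"example-secret-not-a-real-key"


def pseudonymise(email: str, key: bytes = SECRET_KEY) -> str:
    """Map an email address to a stable, opaque ID with a keyed hash."""
    normalised = email.strip().lower()
    digest = hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep a short prefix of the digest as the pseudonymous ID.
    return digest[:16]


# Made-up (sender, recipient) header pairs, not taken from any real dataset.
messages = [
    ("alice@example.org", "dev-list@example.org"),
    ("bob@example.org", "dev-list@example.org"),
    ("Alice@Example.org", "bob@example.org"),
]

# The same person always gets the same ID, so "who talks to whom" survives
# even though the addresses themselves are no longer visible.
anonymised = [(pseudonymise(s), pseudonymise(r)) for s, r in messages]

for sender_id, recipient_id in anonymised:
    print(sender_id, "->", recipient_id)
```

Because each address always maps to the same ID, interaction graphs can still be built from the anonymised headers, which is not possible when addresses are simply masked.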

Are there other areas of collaboration that you're interested in?
Please let me know if you have any questions or remarks regarding data access for the Eclipse forge.

Have a wonderful day, cheers! :-)



---------- Forwarded message ----------
*From:* Pal, Sukrit <palsukri@xxxxxxxxxxxxx>
*Date:* 10 Oct 2018 at 18:04 +0200
*To:* gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx <gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx>
*Cc:* research@xxxxxxxxxxx <research@xxxxxxxxxxx>
*Subject:* Dataset detailing the project details on Eclipse
Good Afternoon Team,

Greetings! I hope you are doing great.
I am a PhD candidate at Michigan State University, and I would like to study the collaboration patterns among team members on open source software development platforms that promote productivity and efficiency in OSS development efforts. I want to base my dissertation project on this study. For it, I need data on communications among project team members (among many other things).
It would be great if you could point me to a site that maintains project details as a dataset for Eclipse (much like GHTorrent does for GitHub). Any information on the topic would be very helpful.

Sukrit Pal
PhD Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business

Boris Baldassari
Castalia Solutions -- Elegant Software Engineering
Tel: +33 6 48 03 82 89
