Re: [scava-dev] Dataset detailing the project details on Eclipse

[Note: for the mailing list, I have to switch to another email address. Sorry for the noise.]

On 12/10/2018 21:40, Pal, Sukrit wrote:
Good Afternoon All,
Good evening Sukrit,

I am trying to study any empirical impact that team cohesion may have on project success. Hence, I need the mailing list data to build a network of connectedness among the team members working on projects. I do not need the detailed emails; a record showing which two IDs have exchanged an email should be enough. Along with that, I would need project-specific information: the developers associated with each project, the project manager ID, etc.
I'd be very interested in the results (visualisation, analysis of the relationships..) if you plan to make them public!

A quick question: the file available for download at the link below is only 1KB, whereas it is listed as 12MB, expanding to 63M. I am not able to open the zipped file that I download from this link using any unzipping software on Mac. I am not sure whether I am doing something wrong.
That was on my side. It should work now, please just try again, and let me know if it still fails.



Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business

On Oct 12, 2018, at 3:45 AM, Philippe Krief <philippe.krief@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Let's continue this interesting exchange on one of our public mailing lists.

Philippe Krief
Research Relations Director | Eclipse Foundation, Inc.
Eclipse Foundation: The Platform for Open Innovation and Collaboration
M: +33 (0)6 21 01 06 81

Annastr. 44, D-64673 Zwingenberg
Handelsregister: Darmstadt HRB 92821
Managing Directors: Ralph Mueller, Mike Milinkovich, Chris Laroque

Begin forwarded message:

*From: *Boris Baldassari
*Subject: **Re: Fwd: Dataset detailing the project details on Eclipse*
*Date: *12 October 2018 at 09:17:00 CEST
*To: *"Pal, Sukrit" <palsukri@xxxxxxxxxxxxx>
*Cc: *Philippe Krief <philippe.krief@xxxxxxxxxxxxxxxxxxxxxx>, Gaël Blondelle <gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx>, Davide Di Ruscio <davide.diruscio@xxxxxxxxx>

Good morning Sukrit,

Hope you are fine.

There are a number of ways to get information about communication channels in Eclipse, although there is no equivalent of the GHTorrent initiative yet.

* The Eclipse API [1] provides access points for the mailing lists and forums of all projects, among other things. The anonymisation technique, however, is fairly brutal (e.g. on email addresses), and some information may be missing, depending on what you are looking for.

* If you're interested in issues (and their comments), the new version of Bugzilla hosted at Eclipse provides an API that can be used. The first paragraphs of the Eclipse API documentation [1] link to the API documentation of each tool. There is also a dataset of Eclipse bugs, published by an external lab, available from [5].

* There is a curated dataset for mailing lists available at [2], which includes basic header information without the body of the emails. Its strong point is that the anonymisation preserves the uniqueness of IDs, thus enabling research on collaboration. Once again, it depends on your requirements. Please note, however, that this URL [2] may change in the upcoming months -- but then we'll provide a link to the new location.

* The Crossminer research project [3] also addresses communication channels, analysing the interactions between collaborators of projects. The outcomes (dashboards and results) of the project will not be available before 2019-Q3, but some datasets have already been published -- the above-mentioned dataset for mailing lists is one of them -- and you may want to follow its Eclipse offspring, Scava [4], for more information. Several other datasets will be published by Scava in the upcoming months.
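
As a rough illustration of how a headers-only dataset with stable anonymised IDs could be turned into the "who exchanged email with whom" network mentioned in this thread, here is a minimal sketch. The column names (message_id, sender_id, in_reply_to) and the sample rows are hypothetical and are not taken from the actual dataset at [2]; the real files would need to be inspected first. It counts one undirected link each time one ID replies to a message sent by another ID.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: the real dataset's columns and IDs may differ.
SAMPLE = """message_id,sender_id,in_reply_to
m1,a1f3,
m2,b7c9,m1
m3,a1f3,m2
m4,c2d8,m1
m5,c2d8,
"""


def build_reply_edges(rows):
    """Return a Counter mapping (id_a, id_b) pairs to the number of replies exchanged."""
    rows = list(rows)
    # Map each message to its (anonymised) sender so replies can be resolved.
    sender_of = {r["message_id"]: r["sender_id"] for r in rows}
    edges = Counter()
    for r in rows:
        parent = r["in_reply_to"]
        # Skip thread starters and replies to messages missing from the data.
        if not parent or parent not in sender_of:
            continue
        a, b = r["sender_id"], sender_of[parent]
        # Ignore self-replies: they say nothing about collaboration.
        if a == b:
            continue
        # Sort the pair so that (a, b) and (b, a) count as the same link.
        edges[tuple(sorted((a, b)))] += 1
    return edges


edges = build_reply_edges(csv.DictReader(io.StringIO(SAMPLE)))
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weighted edge list can be loaded into any graph tool for cohesion measures. Treating a reply as a link is only one possible definition of "exchanged an email"; co-participation in a thread would be another.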

Are there other areas of collaboration that you're interested in?
Please let me know if you have any questions or remarks regarding data access for the Eclipse forge.

Have a wonderful day, cheers! :-)



---------- Forwarded message ----------
*From:* Pal, Sukrit <palsukri@xxxxxxxxxxxxx>
*Date:* 10 Oct 2018 at 18:04 +0200
*To:* gael.blondelle@xxxxxxxxxxxxxxxxxxxxxx
*Cc:* research@xxxxxxxxxxx
*Subject:* Dataset detailing the project details on Eclipse
Good Afternoon Team,

Greetings! I hope you are doing great.
I am a PhD candidate at Michigan State University, and I would like to study the collaboration patterns among team members on Open Source Software development platforms that promote productivity and efficiency in OSS development efforts. I want to base my dissertation project on this study. For the study I need data on communications among project team members (among many other things). It would be great if you could point me to a site that maintains Eclipse project details as a dataset (much like GHTorrent does for GitHub). Any information on the topic would be very helpful.

Sukrit Pal
PhD  Student | Operations & Sourcing Management
Michigan State University - Eli Broad School of Business

Boris Baldassari
Castalia Solutions -- Elegant Software Engineering
Tel: +33 6 48 03 82 89

