[cross-project-issues-dev] Anonymisation of public data

From: Boris Baldassari <boris@xxxxxxxxxxxxxx>
Date: Thu, 26 Apr 2018 07:18:20 +0200

Hello good people,

In the context of the Crossminer research project [1], we plan to publish a number of datasets for the public and the research community. This includes public data from the Eclipse forge (i.e. data is fetched from public data sources and APIs only), and we want to set up an anonymisation process that would:

* Efficiently and safely remove all personally identifiable data -- we don't want to help spammers or malicious harvesters, and
* Still provide valuable information and datasets for the research community -- e.g. ability to identify identical IDs across sources without specifically knowing them.

The basic idea is to simply replace all identifiers with asymmetrically encrypted strings, so that identical IDs always yield the same ciphered result. RSA is used for the encryption, and the private key is thrown away once the encoding is done, making it impossible (according to common encryption standards) to retrieve the original string.
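
To make the idea concrete, here is a minimal sketch in Python. It is not the code of the prototype, only an illustration under stated assumptions: it uses unpadded ("textbook") RSA, because the same input must always give the same output and randomised paddings such as OAEP would break that; the function names (make_public_key, anonymise) and the key size are hypothetical. The key generator returns only the public pair (n, e) and never computes the private exponent, which stands in for throwing away the private key.

```python
import math
import secrets


def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_prime(bits):
    """Return a random probable prime with exactly `bits` bits."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


def make_public_key(bits=1024, e=65537):
    """Return only (n, e). The primes are local and are dropped on return,
    and the private exponent is never computed (assumption: this models
    'throwing away the private key')."""
    while True:
        p = random_prime(bits // 2)
        q = random_prime(bits // 2)
        if p != q and math.gcd(e, (p - 1) * (q - 1)) == 1:
            return p * q, e


def anonymise(identifier, public_key):
    """Deterministically replace an identifier with a hex string
    using unpadded RSA: c = m**e mod n."""
    n, e = public_key
    data = identifier.encode("utf-8")
    max_len = (n.bit_length() - 1) // 8  # guarantees m < n
    if len(data) > max_len:
        raise ValueError("identifier too long for this key size")
    m = int.from_bytes(data, "big")
    return format(pow(m, e, n), "x")
```

With a key created once per dataset, e.g. key = make_public_key(), calling anonymise("jdoe@example.org", key) on every source gives the same string each time, so the same (made-up) ID can be matched across sources without being readable.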

A prototype has already been published [2, 3] and we would like to ask people to review it so as to make sure that our privacy-preserving mechanism is safe.


Any feedback, concern or contribution is warmly welcome.

[1] https://www.crossminer.org/
[2] https://github.com/borisbaldassari/data-anonymiser
[3] https://borisbaldassari.github.io/data-anonymiser/

Thanks in advance, have a wonderful week!

--
boris
