PDP4E: Leverage Eclipse Tools for GDPR Compliance

The EU General Data Protection Regulation (GDPR), applicable since May 25, 2018, poses a set of challenges to organizations that process personal data (even if the organization itself has no establishment in the EU). But GDPR is not just a matter for lawyers or managers; on the contrary, it also concerns those who create systems, products and services, that is... engineers! Fortunately, GDPR compliance can be supported by activities carried out within engineering disciplines (requirements, risk management, design and modelling, assurance...).

Since last year, many tools and methods have appeared that try to ease compliance with the regulation; however, there is still a lack of tools that specifically address engineers. That is why the PDP4E project aims to put engineers in the loop by integrating privacy and data protection into engineering practice: it extends methods and tools already applied in mainstream engineering work with features that deal with privacy and data protection. In particular, we are reusing a set of open source tools (most of them part of the Eclipse ecosystem) and introducing features from state-of-the-art privacy and data protection research, aligning them with mainstream software and systems engineering practice, as described in the sections below.

Risk Management: Be Proactive, not Reactive

GDPR is said to be "risk-oriented" in that compliance requires analysis of potential risks and impacts to the data subjects. Risk management processes involve a proactive attitude from the onset of a project (rather than waiting until incidents have already happened and then reacting). This discipline has a long track record of systematically dealing with security risks, and the same approaches can be extended to also deal with privacy and data protection. In particular, PDP4E is extending and linking existing tools from a previous EU project (MUSA) and tools provided by the French supervisory authority (CNIL).

The Risk Management tool provided by PDP4E will help engineers get involved in risk management from a technical perspective. This will facilitate the integration of legal requirements with the actual technical mitigation actions that engineers implement during the software development process. The tool will not only allow creating a risk management plan at design time, but will also provide means for the continuous management of risks by monitoring the implementation status of mitigation actions. GDPR also entails that organizations shall reassess their treatment plans as new privacy and data protection risks are discovered. It is also advisable to document your risk assessment processes so that you can review their contents and make changes throughout the project lifecycle.
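To make the idea of continuous risk management more concrete, here is a minimal sketch of a risk register whose risks stay open until every mitigation action has been implemented. It is an illustration only, not the data model of the PDP4E tool: the class names, the 1 to 4 scales and the "likelihood times severity" score are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PLANNED = "planned"
    IN_PROGRESS = "in progress"
    DONE = "done"


@dataclass
class Mitigation:
    description: str
    status: Status = Status.PLANNED


@dataclass
class Risk:
    title: str
    likelihood: int  # assumed scale: 1 (unlikely) to 4 (very likely)
    severity: int    # assumed scale: 1 (negligible) to 4 (maximum)
    mitigations: list[Mitigation] = field(default_factory=list)

    def level(self) -> int:
        # Simplistic score (an assumption): likelihood multiplied by severity.
        return self.likelihood * self.severity

    def is_treated(self) -> bool:
        # A risk with no mitigation at all is never considered treated.
        return bool(self.mitigations) and all(
            m.status is Status.DONE for m in self.mitigations
        )


def open_risks(register: list[Risk]) -> list[Risk]:
    """Risks that still need work, highest level first."""
    pending = [r for r in register if not r.is_treated()]
    return sorted(pending, key=Risk.level, reverse=True)


# Hypothetical register for an imaginary web shop.
register = [
    Risk("Re-identification from analytics export", 2, 4,
         [Mitigation("Pseudonymise user IDs", Status.DONE)]),
    Risk("Unauthorised access to customer records", 3, 3,
         [Mitigation("Encrypt at rest", Status.DONE),
          Mitigation("Role-based access", Status.IN_PROGRESS)]),
    Risk("Excessive retention of logs", 3, 2),
]

for risk in open_risks(register):
    print(risk.level(), risk.title)
```

Re-running `open_risks` whenever a mitigation changes status, or whenever a newly discovered risk is appended to the register, gives a simple form of the reassessment described above.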

The risk management methodology considers both privacy and security issues. The main user of the tool will be the engineer, so it is being implemented with usability in mind, following an approach that is as transparent and non-intrusive for the engineer as possible.

Requirements Engineering: Code is Law

Legal code and computer code may both seem to be sets of hardcoded rules, but the former is more than a closed set of rules: it needs interpretation, which may depend on the context. Hence, translating GDPR provisions into actionable technical requirements for a given project is not straightforward and can benefit from the use of appropriate tools.

GDPR establishes a set of data protection principles (including e.g. consent as one possible lawfulness basis) that must guide the development of any system; it compels data controllers and processors to abide by a set of legal obligations; and it requires them to honor several rights of the data subjects (including data portability, the right to erasure, etc.). All those legal aspects shall be operationalized into requirements that can be integrated as first-class citizens in the backlog of the products under development, and which engineers can implement in the products they create.
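As a rough illustration of what "operationalizing" a legal provision can look like, the sketch below turns two data subject rights into backlog items for a concrete data store. The templates, the `elicit` helper and the example store name are hypothetical and far simpler than a real requirements taxonomy; only the article numbers (Art. 17 for erasure, Art. 20 for data portability) come from GDPR itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    source: str      # GDPR provision the requirement derives from
    statement: str   # what the system shall do
    acceptance: str  # how the team checks that it is done


# Hypothetical templates; the wording is illustrative, not legal advice.
TEMPLATES = {
    "erasure": (
        "GDPR Art. 17",
        "The system shall erase all personal data of a data subject "
        "held in {store} upon a valid request.",
        "After an erasure request, no record in {store} refers to the "
        "data subject.",
    ),
    "portability": (
        "GDPR Art. 20",
        "The system shall export the personal data that a data subject "
        "provided to {store} in a machine-readable format.",
        "An export from {store} can be parsed by a standard JSON or CSV "
        "reader.",
    ),
}


def elicit(right: str, store: str) -> Requirement:
    """Instantiate the template for one right and one data store."""
    source, statement, acceptance = TEMPLATES[right]
    return Requirement(
        source,
        statement.format(store=store),
        acceptance.format(store=store),
    )


# Example: backlog items for an imaginary orders database.
backlog = [elicit(right, "the orders database")
           for right in ("erasure", "portability")]

for item in backlog:
    print(f"[{item.source}] {item.statement}")
```

Keeping the legal source next to each statement preserves traceability from the backlog item back to the provision it implements.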

PDP4E provides a method and tools for the elicitation of privacy-related requirements in systems development projects. In particular, this method takes into account the legal obligations introduced by GDPR and seeks to incorporate them into a development project at the early stages. The approach is mainly inspired by Problem-based Privacy Analysis (ProPAn). The ProPAn method is being extended to address needs identified in PDP4E, such as additional requirement taxonomies and more specific contextual, data and software artifacts that are comprehensible to the project's stakeholders and, in particular, to engineers.

The requirements management tool for data protection relies on the Eclipse platform, and more specifically on the Papyrus framework, which is leveraged to support engineers who are not privacy experts during the specification, analysis, and elicitation of GDPR-specific requirements.

Model-Driven Design: Know Thyself

Privacy and data protection should be addressed "by design", that is, from the onset of a project rather than as an afterthought. Organizations must be aware of all the kinds of personal data they are dealing with, the data subjects affected, the processing operations the data undergo, etc. This knowledge is critical to honoring data subject rights (e.g. right of access, right to be forgotten, data portability), carrying out data protection impact assessments, etc. Appropriate software and system models can be leveraged and enriched with metadata that signals who processes personal data, where and how.

A privacy and data protection by design (PDPbD) framework is being specified and developed in PDP4E. Several model-driven engineering techniques and platforms such as Papyrus are leveraged to support engineers who are not privacy experts in conducting typical systems and software design activities. Our approach to PDPbD combines three views at different levels of abstraction: data-oriented, process-oriented, and architecture models are consistently developed and enriched in pursuit of a three-fold goal.

  • First, the design models shall conform to the requirements that integrate the specificities of GDPR and the typical privacy concerns. For this conformity to be truly ensured, personal data should be identified accurately and early. This means, for instance, properly labeling which database fields store personal data, which functions carry out data processing operations, and in which realms they are deployed.
  • Secondly, the design phase should provide confidence about the effectiveness of privacy controls elicited during the risk assessment phase.
  • Last but not least, the PDPbD framework should implement algorithms and techniques that facilitate the application of data protection strategies.

In addition, the validation and verification of privacy-related properties is addressed, in particular at code level, relying on the Frama-C platform.
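The labeling of personal data mentioned under the first goal can be pictured with a small analogy in plain Python. PDP4E works on design models (for instance in Papyrus), not on Python classes, so everything below is an assumption made for illustration: the `personal` helper, the metadata keys and the `Customer` record are hypothetical. The sketch simply attaches metadata to the fields that hold personal data and then queries it.

```python
from dataclasses import dataclass, field, fields


def personal(category: str):
    """Mark a field as personal data (hypothetical metadata keys)."""
    return field(metadata={"personal_data": True, "category": category})


@dataclass
class Customer:
    customer_id: int
    email: str = personal("contact")
    birth_date: str = personal("identity")
    newsletter_opt_in: bool = False


def personal_fields(model) -> dict[str, str]:
    """Map each field labeled as personal data to its category."""
    return {
        f.name: f.metadata["category"]
        for f in fields(model)
        if f.metadata.get("personal_data")
    }


print(personal_fields(Customer))
```

Once such labels exist, questions like "which stores must an erasure request touch?" can be answered by querying the model instead of reading the code by hand.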

Software and Systems Assurance: Be Good and Look Like It

GDPR establishes the accountability and transparency principles, which entail that organizations show in an accessible and comprehensible way how they are processing personal data and that they demonstrate they are appropriately implementing all the requirements posed by GDPR.

An Assurance Case is a set of auditable claims, arguments, and evidence created to support the claim that a defined system or service will satisfy particular requirements. Assurance Cases have a successful track record in exchanging information between system stakeholders, such as suppliers and acquirers or operators and regulators, where knowledge (related to e.g. the safety and security of the system) is communicated in a clear and defensible way.

Assurance methods and tools are being used in PDP4E to demonstrate that compliance, by recording evidence that the processes determined by GDPR (or by ancillary standards and regulations) have been carried out and by adding arguments that support the compliance claim.
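The structure of an assurance case can be sketched as a tree of claims, each backed either by sub-claims or by evidence. The code below is a minimal sketch under that assumption and is not how OpenCert represents assurance cases; the `Claim` class, the `gaps` method and the evidence file name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    statement: str
    argument: str = ""  # why the sub-claims or evidence support the claim
    evidence: list[str] = field(default_factory=list)
    subclaims: list["Claim"] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """Statements of leaf claims that have no evidence attached."""
        if not self.subclaims:
            return [] if self.evidence else [self.statement]
        return [gap for sub in self.subclaims for gap in sub.gaps()]


# Hypothetical compliance case with one supported and one open claim.
case = Claim(
    "The service processes personal data in compliance with GDPR",
    argument="Argue over each process that GDPR requires",
    subclaims=[
        Claim("A data protection impact assessment was carried out",
              evidence=["dpia-report.pdf"]),
        Claim("Erasure requests are honoured"),
    ],
)

print(case.gaps())
```

Listing the gaps shows an auditor, and the team itself, which claims still lack recorded evidence.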

PDP4E takes advantage of the Eclipse OpenCert project as a solution for assurance and certification management of Cyber-Physical Systems (CPS). Further information on assurance cases and OpenCert can be found in the AMASS project, also featured in this newsletter, and in a video about OpenCert on the AMASS YouTube channel.

Method Engineering

All in all, PDP4E is not only describing tools but, more importantly, engineering methods and knowledge bases that capture best privacy engineering practice (and to which those tools provide support). These methods should (ideally) be included within the software development lifecycle, following the preferred development methodology chosen by engineers, thanks to a flexible and modular approach for the methods themselves.

PDP4E and the Eclipse Community

Open source, and the Eclipse ecosystem in particular, plays a key role in this project: most of the background tools are already part of this ecosystem, which we also plan to leverage to reach the community of developers and get feedback from the users (i.e. the engineers). The core functionality of the toolset and the related methods will be released under open licenses.

The PDP4E project is currently developing a first version of the toolset, which we plan to present at this year's edition of EclipseCon Europe, to get initial user feedback.

About PDP4E

More information about the PDP4E Project is available at:

Authors of Article

Antonio Kung is co-founder of Trialog. With more than 30 years of experience in the field of cyber-physical systems and the Internet of Things, he brings expertise and know-how particularly on architecture, interoperability, and data security and protection. He has coordinated numerous national and European collaborative projects in these fields. He is active in standardisation on the Internet of Things, security and data protection, and is the editor of ISO/IEC standards 27550, 27556, 27030, 27570, 21823-3. He became CEO of Trialog in 2018. Antonio has a master's degree from Harvard University and an engineering degree from Ecole Centrale Paris.
Yod-Samuel Martín is a Researcher at the Departamento de Ingeniería de Sistemas Telemáticos of Universidad Politécnica de Madrid (DIT-UPM). His research work focuses on different categories of non-functional software and service requirements, especially on the categories of accessibility and privacy, understood from different points of view. Yod-Samuel is currently the Scientific and Technical Lead of the PDP4E project. The results from his research have been applied, in collaboration with private companies, to fields like telecommunications, banking and financial services, social networks, transportation and logistics, etc.
Dr. Victor Muntés-Mulero is the co-founder, CEO and Scientific Director of Beawre (beawre.com). Before this, he was vice president of research at CA Technologies (CA), leading the Strategic Research team worldwide. He was responsible for leading research with the potential to impact the strategic direction of CA products, in collaboration with universities. Dr. Muntés has more than 70 peer-reviewed research publications, as well as 8 granted patents plus 31 patents pending evaluation. He has also authored a book, as well as several book chapters. He has been mentioned in the press more than 250 times. Prior to joining CA, he was an associate professor at the Universitat Politècnica de Catalunya (UPC), doing research related to managing very large data volumes. In addition, he was named Honorary Professor at Universidad de San Martín de Porres (Lima, Perú) in August 2015, and he taught master's courses related to data management systems and data stream mining at University of Reading (UK, 2003) and Pontifícia Universidad Javeriana (Bogotá, Colombia, 2011), respectively. Dr. Muntés acted as the industrial chair of BPM 2017. Finally, he is co-founder of Sparsity Technologies SL, a spin-off started at UPC in 2010.
Patrick Tessier obtained a PhD in Computer Science in 2005 from the University of Lille (France) and the CEA. His PhD was about managing variability in the design of real-time system families in the context of a model-driven approach. Today, he is a researcher at CEA LIST/LECS (CEA - French Atomic Energy Agency, System Requirements and Compliance Laboratory), where he works on requirements management and traceability issues. He is also involved as technical lead of the Eclipse Papyrus tool (http://www.eclipse.org/papyrus).
Gabriel Pedroza is a Research Engineer at the Laboratory of Systems Requirements and Conformity Engineering (LECS) of the CEA institute in France. He conducts research in the field of systems security, safety and privacy by exploring, using, defining and extending high-level languages and methods to conduct systems modelling and multi-concern analysis. His work includes the development of techniques like modelling by reverse engineering, and language transformation towards formal frameworks in order to validate systems properties. He has participated in several projects like EVITA (FP7), SESAM-Grids (French R&D project), MOSARIS (French, ANR), AMASS (ECSEL JU) and more recently PDP4E (H2020).
Alejandra Ruiz holds a Ph.D. degree in Telecommunications and Computer Engineering (2015, University of Deusto), an MSc in Advanced Artificial Intelligence (2012, UNED) and a degree in Telecommunication Engineering (2005, University of Deusto). She joined TECNALIA in 2007 and is a Research Engineer in the Cyber Security and Safety group. She currently leads the area of Modular Assurance and Certification of Safety-critical Systems, with particular focus on the automotive, aerospace, railway and medical device industries. She has been leading the AMASS project (Architecture-driven, Multi-concern and Seamless Assurance and Certification of Cyber-Physical Systems) and is the main contributor in these areas for European projects such as RECOMP (Reduced Certification Costs for Trusted Multicore Platforms), OPENCOSS (Open Platform for EvolutioNary Certification of Safety-critical Systems), SafeAdapt (Safe Adaptive Software for Fully Electric Vehicles) and EMC2 (Embedded Multi-Core systems for Mixed Criticality applications in dynamic and changeable real-time environments). She is taking care of the assurance-related work in the PDP4E project.
David Sánchez-Charles obtained an industrial PhD in Computer Science in 2017 from the Universitat Politècnica de Catalunya (Spain) and CA Technologies. His PhD was focused on the application of innovative user behavior modelling and anomaly detection mechanisms in access control management systems. He has been involved in research transfer and improvement of open innovation schemes for small and large enterprises. His research is now focused on the integration of privacy and data protection into organizational and engineering processes. David Sánchez-Charles is currently the Research and Innovation Manager of the PDP4E project.
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 787034.