[technology-pmc] [CQ 5453] Code Recommenders Usage Data Collector v1

http://dev.eclipse.org/ipzilla/show_bug.cgi?id=5453





--- Comment #6 from Marcel Bruch <bruch@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>  2011-08-04 08:55:03 ---
(In reply to comment #5)
> What data does this UDC collect

It records how an API is used in code, i.e., which methods have been called
on an object. The information includes the names of the enclosing class and
method as well as variable names, but nothing about control flow. Data can
also be anonymized so that it includes only the methods invoked on an object.
Sharing is enabled on a per-project basis or per package selection. By default,
no data is shared. The user is asked in a way similar to Eclipse's usage data
collector (upload now, automatically, never, etc.).
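
As an illustration only, here is a minimal Python sketch of what such a usage
record and its anonymized form might look like. The class, the field names and
the example values are hypothetical and not the actual Code Recommenders
format; keeping the receiver type in the anonymized record is likewise an
assumption.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ObjectUsage:
    """Hypothetical usage record; all field names are illustrative assumptions."""
    enclosing_class: str
    enclosing_method: str
    variable_name: str
    receiver_type: str
    invoked_methods: list = field(default_factory=list)


def anonymize(usage):
    """Drop enclosing class, enclosing method and variable name.

    Only the methods invoked on the object are kept, plus the receiver
    type (an assumption, so the calls can be attributed to an API).
    """
    return {
        "receiver_type": usage.receiver_type,
        "invoked_methods": list(usage.invoked_methods),
    }


# Made-up example: a Text widget used inside a dialog class.
usage = ObjectUsage(
    enclosing_class="com.example.LoginDialog",
    enclosing_method="createContents",
    variable_name="userField",
    receiver_type="org.eclipse.swt.widgets.Text",
    invoked_methods=["setText", "setLayoutData"],
)

full_record = asdict(usage)          # what a non-anonymized record would hold
anonymous_record = anonymize(usage)  # what remains after anonymization
```

Note that neither form carries anything about control flow; the record is a
flat list of calls per object.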

The UDC also allows vendors to build and deliver their own models for their
own libraries and customers. It is still possible to use Code Recommenders'
features without sharing any usage data.

> and where does it store data (both locally and
> once uploaded)?

Locally, the data is currently stored inside the project's .recommenders/data/
folder in human-readable JSON format. Remotely, the same data (minus anything
removed by the privacy/filtering settings) is stored on disk or in whatever
persistence provider is used. The server URL is configurable and can point to
any address. At the moment (during incubation) it points to a university server
where we generate the models for Code Recommenders in a nightly build job.
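
The local storage and the filtering before upload could be sketched roughly as
follows. This is a Python sketch under stated assumptions, not the actual
implementation: only the .recommenders/data/ folder and the JSON format come
from the description above, while the function names, the file naming scheme
and the record keys are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def store_locally(project_dir, name, record):
    """Write one record as human-readable JSON under .recommenders/data/."""
    data_dir = Path(project_dir) / ".recommenders" / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / f"{name}.json"  # file naming is an assumption
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target


def prepare_upload(record, shared_keys):
    """Apply a (hypothetical) privacy filter: keep only the shared keys."""
    return {key: value for key, value in record.items() if key in shared_keys}


# Made-up record for illustration.
record = {
    "enclosing_class": "com.example.LoginDialog",
    "variable_name": "userField",
    "invoked_methods": ["setText", "setLayoutData"],
}

# A temporary directory stands in for the project folder.
with tempfile.TemporaryDirectory() as project_dir:
    path = store_locally(project_dir, "LoginDialog", record)
    relative = path.relative_to(project_dir).as_posix()
    loaded = json.loads(path.read_text(encoding="utf-8"))

upload = prepare_upload(record, shared_keys={"invoked_methods"})
```

The local file keeps the full record, while the upload contains only what the
privacy settings let through.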


> Does the user opt into data collection?

Yes, she has to opt in before anything is uploaded. A "terms of usage" page
following the example of the Eclipse Usage Data Collector still needs to be
written.

> Are there any issue
> with respect to user privacy that need to be addressed?

Privacy settings exist, such as no sharing at all or anonymization.
Fine-grained control over which libraries' data is shared is in progress.

In my view, control structures and business logic are almost impossible to
reconstruct, since we do not track any information about control statements,
return values, values (such as strings, ints, etc.), or objects from the Java
standard library. However, there is obviously some information that can be
extracted, such as how to use a Text widget.

The upload wizard contains a preview window showing which data gets filtered,
to give a good feeling for what gets submitted. Further refinements are in
progress. This contribution will be developed in a separate branch until
privacy requirements are met and terms of usage are set up properly. Any
concerns?



