
Re: [] Improving license check for dependencies


Thanks for starting this thread.  It's been super constructive and this is exactly the role of this council.

As Modeling PMC lead, I generally approve CQs within minutes, but on occasion one will slip past my attention in my daily mail avalanche.  I'd be ecstatic to abdicate my approval rubber stamp to the Foundation staff.  Is that a shift of burden? :-P

1. Remove the need to get PMC approval.
+1 +1 +1 !!!

2. Provide some way to get faster approval for version upgrades.
As Wayne mentioned, I was under the impression that "delta" CQs were already further streamlined.  Certainly the previously endless stream of piggyback CQs has stopped flowing into my mailbox, thank the powers that be!


On 20.03.2020 01:38, Jim Hughes wrote:

Hi Wayne,

Yes, there is a branch here:  Thanks for giving it a spin!

The tough part around provided dependencies is, as your issue notes (paraphrased), that "The transitive closure of [a workswith dependency] ... requires no further scrutiny."  I've added some dependency tree output below as an example*.

From that little fragment, the use of hadoop-common likely should be a workswith dependency, and for those, we do have CQs which say "hadoop Version: 2.2.0 and Later".  Assuming that's ok, then that particular library is covered.

Ideally though, it'd be great to have a tool which calculated the dependency list and stopped at the first provided dependency.  Once you had a list of GAVs + scopes, you could sort by GAV and then keep the highest-scoped (compile beats provided) one.  The compile GAVs would need a prereq-CQ and the provided ones a workswith-CQ?  In either case, it sounds like ClearlyDefined metadata may be sufficient?
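That dedup step could be sketched roughly like this (a hypothetical helper, not part of any existing tool; the function name, the input shape, and the treatment of scopes other than compile/system/provided are all my assumptions):

```python
# Sketch: given (GAV, scope) pairs aggregated across modules, keep the
# highest scope per GAV ("compile beats provided") and split the result
# into GAVs needing a prereq CQ vs. a workswith CQ.
SCOPE_RANK = {"compile": 2, "system": 2, "provided": 1}  # assumed ranking

def classify(gav_scopes):
    """gav_scopes: iterable of (gav, scope) pairs.
    Returns (prereq, workswith) as sorted lists of GAVs."""
    best = {}
    for gav, scope in gav_scopes:
        rank = SCOPE_RANK.get(scope, 0)
        if rank > best.get(gav, (0, None))[0]:
            best[gav] = (rank, scope)
    prereq = sorted(g for g, (r, s) in best.items() if s in ("compile", "system"))
    workswith = sorted(g for g, (r, s) in best.items() if s == "provided")
    return prereq, workswith
```

So a GAV that shows up as provided in one module but compile in another ends up on the prereq side.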

If I were trying to bang through this, I'd probably fuss with the source of the Maven dependency plugin:  You might be able to fork it and toss in the ClearlyDefined checks to decorate a dependency tree/list.  You could have a Visitor that, once it saw a 'provided' scope, marks the status as child-of-provided or something.
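The visitor idea might look something like the following sketch (the node shape and status labels are invented for illustration; a real implementation would live inside the forked plugin's tree visitor):

```python
# Sketch: walk a dependency tree and, once a 'provided'-scoped node is
# seen, mark its whole subtree as child-of-provided (needs no scrutiny).
def mark(node, under_provided=False):
    """node: dict with 'gav', 'scope', 'children'. Annotates in place."""
    if under_provided:
        node["status"] = "child-of-provided"
    elif node["scope"] == "provided":
        node["status"] = "provided"  # the workswith dependency itself
        under_provided = True
    else:
        node["status"] = "needs-review"
    for child in node.get("children", []):
        mark(child, under_provided)
    return node

# Tiny example mirroring the tree fragment below.
example = mark({"gav": "geomesa-utils", "scope": "compile", "children": [
    {"gav": "org.apache.hadoop:hadoop-common:jar:2.8.5", "scope": "provided",
     "children": [
         {"gav": "commons-cli:commons-cli:jar:1.2", "scope": "provided",
          "children": []}]}]})
```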

As an idle thought / wish, it'd be lovely if there were some way to mark up a pom with a comment that'd say "Hey, this dependency doesn't have the CD metadata, but it is covered by CQ-1234!".  By the time you put all that together, the only other things you could ask for the tool to do are 1) ask you for your Eclipse u/p to start the CQs, and 2) send email.

Anyhow, hope that helps some! 



* Portion of the dependency tree output:
[INFO] org.locationtech.geomesa:geomesa-utils_2.11:jar:2.5.0-SNAPSHOT
[INFO] +- org.locationtech.geomesa:geomesa-z3_2.11:jar:2.5.0-SNAPSHOT:compile
[INFO] +- org.apache.hadoop:hadoop-common:jar:2.8.5:provided
[INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:2.8.5:provided
[INFO] |  |  \-
[INFO] |  +- commons-cli:commons-cli:jar:1.2:provided
[INFO] |  +- org.apache.commons:commons-math3:jar:3.6.1:provided
[INFO] |  +- xmlenc:xmlenc:jar:0.52:provided
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.5.2:provided
[INFO] |  |  \- org.apache.httpcomponents:httpcore:jar:4.4.4:provided
[INFO] |  +- commons-lang:commons-lang:jar:2.6:provided
[INFO] |  +- commons-configuration:commons-configuration:jar:1.6:provided
[INFO] |  |  +- commons-digester:commons-digester:jar:1.8:provided
[INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:provided
[INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided

On 3/19/2020 6:44 PM, Wayne Beaton wrote:
Hey Jim. Thanks for running the tool. Your experience has been helpful for me.

I assume that you are consistently marking "workswith" dependencies as being in the Maven "provided" scope. This is a pretty good solution, I think, to issue 13. Assuming that this is correct, then the solution would be to filter out the "provided" dependencies when you're assembling the dependency list. Anything left over should be pre-requisite.
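A minimal sketch of that filtering, assuming you're starting from `mvn dependency:list`-style output (the function name and the exact line format handled are assumptions; real output can also carry a classifier field):

```python
# Sketch: parse 'mvn dependency:list'-style lines (g:a:type:version:scope)
# and keep only entries that would need a prereq CQ, dropping 'provided'
# (treated as workswith) and 'test' scopes.
def prerequisites(dependency_list_lines):
    keep = []
    for line in dependency_list_lines:
        entry = line.replace("[INFO]", "").strip()
        parts = entry.split(":")
        if len(parts) == 5 and parts[4] not in ("provided", "test"):
            keep.append(":".join(parts[:2] + [parts[3]]))  # g:a:version
    return keep
```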

As I've been testing, I've compared the "problematic" content against the CQ record (oftentimes referring to the dependency tree) to manually skip over/ignore "workswith" stuff.

I've been tinkering with finding a means of cutting off the branch of the dependency tree when you identify a "works with", but using the "provided" scope sounds way easier. Assuming that it makes sense and projects can consistently do this, it sounds like a best practice.

It's probably the "-incubating" suffix that's messing up org.apache.htrace:htrace-core:jar:3.2.0-incubating: the tool cuts off the "-incubating" part. I'll change the tool to keep all of the version information (I clipped it because of a hack involving qualifiers in OSGi bundle versioning, but have another idea for how to handle that). In the meantime, it looks like you don't need a CQ. Have you committed this to a branch that I can test? (Note that the IP Policy now lets you push content that hasn't been fully vetted; it must be vetted before you do a release.)


On Thu, Mar 19, 2020 at 6:06 PM Jim Hughes <jnh5y@xxxxxxxx> wrote:

Hi Wayne,

Very cool tool.  Thanks for forwarding the issue to the IP team for them to check off those CQs.

Sorry that I've been unavailable to participate more fully on calls and discussions about the tool.  As a concrete suggestion, for Maven based projects, it'd be awesome for there to be a little Maven plugin to automate running it.

For that one artifact, I'm curious...  the GAV is


And running the tool on that comes back empty.  That said, there's info for 3.1.0-incubating.  And the website will do something useful when you search for org.apache.htrace/htrace-core.

I wonder what's missing in the 3.2.0 release that is causing the tool to fail?

Thanks for running the tool against the GeoMesa build.  I ran it myself, and I did find one dependency we seemed to have missed (and I'm following up on it).  Separate from that, I'll own up to being relaxed around 'provided' dependencies.  The approach detailed suggests looking at system, provided, and compile dependencies.

My understanding/interpretation has been that provided dependencies don't necessarily need to be declared.  (Or that declaring them was kind of a nice-to-have / sanity check.)  Is that acceptable / correct?  If not, I probably just owe another 30-ish CQs...




On 3/19/2020 5:04 PM, Wayne Beaton wrote:
Thanks for your input, Jim.

It may well be time to revisit the requirement that the PMC approve CQs (especially with the recent changes to the IP Policy). TBH, I'm concerned that the PMC is not engaging; as part of the project leadership chain, we depend on the PMC to help ensure that project teams are implementing the process.

We've had a streamlined process for version upgrades for some time. We have, however, been wrestling with a bug that's been putting CQs into the wrong state.  I'll forward your bug list to the IP team to ask that they process them and then investigate why they're not in the right state.

More generally, recent changes in the IP Policy (at least partially motivated by your experience) are moving us in a direction where hopefully most (in some cases, all) CQs are not required. Rather than CQs being a means of tracking content, we're changing them to be purely concerned with vetting content that has not previously been vetted.

A few weeks ago, I posted on this list about a tool that we've developed in-house (and are contributing to the Eclipse Dash project) that checks the trusted sources of license information that we have available to us (IPzilla and ClearlyDefined). The tool reports that we have trusted license information for org.apache.accumulo:accumulo-core:2.0.0, org.apache.commons:commons-configuration2:2.5, and org.apache.zookeeper:zookeeper:3.4.14, so (assuming that I've correctly guessed at the right Maven coordinates) three of the CQs that you've listed aren't required (I couldn't map org.apache.htrace:htrace-core:3.2, so that CQ is required).

The tool can be used to test a single CQ, e.g.:

$ echo "org.apache.accumulo:accumulo-core:2.0.0" | java -jar /gitroot/dash/ -
Vetted license information was found for all content. No further investigation is required.
$ _

I ran the tool on your full build (there are instructions for this in the tool's repository's README) and it came back with a pretty big list of content that it couldn't map to trusted license information. I'm pretty sure, however, that we have CQs for most/all of that content; we just don't have a mapping between the content id and the corresponding CQ. I'll work through those.

In the meantime, when you do create CQs, it would be helpful if you could use the Maven id/coordinates as the name. With that, we'll have an automatic mapping.

It would be helpful if the AC members could test this tool on code bases that they work with and provide feedback (issues on the GitHub repository are welcome).

I'll vent a bit myself... we've been discussing this on calls for some time and I did post about this on the mailing list with a request for assistance to develop this tool into something that can be used to save project teams time. I have stated repeatedly that I owe the community a few blog posts on this topic, but I brought the AC into the loop as far back as when we started drafting edits to the IP Policy.


On Thu, Mar 19, 2020 at 3:08 PM Jim Hughes <jnh5y@xxxxxxxx> wrote:
Hi all,

In the spirit of reducing friction, I'd like to share a little about my
week in trying to get some CQs approved.  I work on a project called
GeoMesa which has dependencies on two different ecosystems: Java-based
geospatial and Apache Software Foundation 'big data' projects (like
HBase, Cassandra, etc).  (Consequently, GeoMesa has entered over 400 CQs
over its lifetime.)

For an upcoming release, we are upgrading versions of our Apache
projects.  I tossed in 4 CQs* for ASF code.  All of the versions are
upgrades of software projects which have already been approved.

In order to make sure that we don't miss something, we wait for Eclipse
approval in IPZilla before merging a PR with a dependency change. 
(We've got a little bit of scripting with Maven's Dependency plugin
which helps us monitor this.  Let me know if you are interested; I'd be
happy to share more info.)

I'm hitting a few pieces of slow down / friction / frustration.

First, I am required to get PMC approval for each CQ.  It took me two
days of pinging my fellow PMCers to get one of them to vote +1 and click
4 boxes in IPZilla.

Second, the CQs are now back in the 'new' status, and I am
completely unsure what the next steps are.  Does anyone know how long
the automated checks take to run?

Third, each of these projects is an Apache Software Foundation project for
which previous versions have been used by Eclipse projects.  I know
there's always a chance that something goes screwy with licensing in any
project at any time.  That said, if any of these CQs fails an automated
check, then I imagine an Eclipse employee is gonna have to open the CQ,
see that it is an ASF project, and click 'approve'.

Anyhow, apologies for venting.  Thanks for reading, thanks in advance
for any suggestions.  For my position, I think there are some changes we
could make:

1. Remove the need to get PMC approval.

2. Provide some way to get faster approval for version upgrades.

Either of those approaches would make my life better.  Together, 95% of
my frustration with IP concerns would be gone.

As the AC, what are we in a position to recommend / request?



* If you're interested in watching from home, the CQs are here:

_______________________________________________ mailing list
To unsubscribe from this list, visit


Wayne Beaton

Director of Open Source Projects | Eclipse Foundation, Inc.



