|Re: [tycho-dev] [discuss] tycho repository layout and metadata format|
See inline -- Regards, Igor On 11-05-31 4:54 AM, Sievers, Jan wrote:
* focus on artifact deploy and dependency resolution during the build. To help us manage scope of this work, I want to explicitly exclude IU categories and other user-facing features from the scope.

We do support anything that is published by tycho in p2artifacts.xml/p2content.xml though? I am thinking e.g. about feature root files or source bundles.
Yes, we need to support these, which I think means that each Tycho project should be able to deploy multiple artifacts and multiple IUs and Tycho repository layout has to support this.
If we want to exclude other features, we have to make this very explicit as people will expect the repo to behave just like any plain p2 repo. So effectively are we saying this is a repo format for tycho and at build time only, it's not something you can paste into the p2 update UI and install/update from it?
The problem is IU categorization and filtering. For example, if all Helios and Indigo artifacts were deployed in a Tycho repository, how do we let the client choose between the two? I am not saying this is not possible or not desirable, I just want to concentrate on build-related use cases for now.
Another requirement that's important to us to ensure build reproducibility is good support for dependency management, i.e. exact control over which IUs are in the search scope of a build and which are not. With Import-Package and repositories the size of 200K+ IUs, requirements are bound to become ambiguous. Also, there is widespread use of dependency ranges, but there is often an implicit assumption that the dependency is resolved against a certain eclipse release train repo. What we need IMHO is a concept of a restricted "view" on the repository. In contrast to maven POM dependency management, for a build against released versions I would like to have a "white list" or "bill of materials" of IU versions that defines the resolution scope.
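A minimal sketch of the "bill of materials" idea, purely illustrative: the `IU` tuple, the `candidates` function, the `org.example.core` id and the `helios_bom` set are all hypothetical names and not part of Tycho or p2. It assumes a half-open version range and shows how a white list turns an ambiguous requirement into a single candidate.

```python
from typing import NamedTuple


class IU(NamedTuple):
    """Hypothetical stand-in for an installable unit: an id plus a version."""
    id: str
    version: tuple  # e.g. (3, 6, 2)


def candidates(repo, iu_id, low, high, bom=None):
    """Return IUs with the given id whose version is in [low, high).

    If a bill of materials (a set of exact IUs) is given, only IUs
    listed in it are considered part of the resolution scope.
    """
    matches = [iu for iu in repo if iu.id == iu_id and low <= iu.version < high]
    if bom is not None:
        matches = [iu for iu in matches if iu in bom]
    return sorted(matches)


# One repository holding several versions of the same (made-up) bundle.
repo = [
    IU("org.example.core", (3, 6, 0)),
    IU("org.example.core", (3, 6, 2)),
    IU("org.example.core", (3, 7, 0)),
]

# Hypothetical white list describing one release train.
helios_bom = {IU("org.example.core", (3, 6, 2))}

# The same range requirement, with and without the restricted view.
unrestricted = candidates(repo, "org.example.core", (3, 6, 0), (4, 0, 0))
restricted = candidates(repo, "org.example.core", (3, 6, 0), (4, 0, 0), bom=helios_bom)
```

Without the white list the range matches three versions and the resolver has to pick one; with it, exactly one candidate remains.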
Not sure I understand. Does this "bill of materials" list all target platform IUs, or does it provide additional constraints to help the resolver choose among multiple versions? One aspect of build reproducibility is the ability to make small changes to the project, to produce a bugfix for an old release for example. This implies the ability to make small changes to the target platform definition and expect correspondingly small changes to the resolved target platform. I don't think a resolved target platform "snapshot" provides this property.
Don't know whether dependency management should be done on client or server side or maybe both. If it's on the server side, I could imagine e.g. "helios" and "indigo" views on the same repo. Each view would have a distinct URL similar to nexus repo groups. If dependency management is done on the client side, we don't want everybody to define their own or copy-paste the target definition. So we would need a concept for reusing and composability of the target definition.
We, too, thought about repository "views" and we actually implemented something like this as part of one of Sonatype's commercial products. The way I see it, "views" can be built on top of "raw" repositories used by the build and do filtering, aggregation, categorization and anything else necessary to expose build results to the end user. This is why I am suggesting we push user-facing behaviour out of scope for now, and deal with it separately from build-related use cases. I do not think these "views" can be used for build reproducibility, however. To really guarantee a reproducible build, everything interesting/important should happen on the client, and the server should just be a relatively dumb metadata store without any smarts. Servers get moved and server software gets updated, and we need to make sure build results stay the same even if these results are technically "wrong". Here I assume that the exact version of Tycho is part of the client. And yes, pervasive use of dependency version ranges makes reproducible builds much harder to guarantee. In p2 3.5, it was possible to tell if a given p2 resolution result was completely locked down or there was some wiggle-room, so it was possible to validate whether a project's target platform was reproducible or some additional constraints were needed. After the introduction of query-based requirements in 3.6, this became a much harder question to answer.
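To illustrate what "completely locked down" versus "wiggle-room" could mean for plain version-range requirements, here is a rough sketch. It is not how p2 does it: `matching_versions`, `is_locked_down` and the sample data are hypothetical, and the check only works because every requirement here is a simple range that can be enumerated. Query-based requirements cannot be enumerated this way.

```python
def matching_versions(requirements, available):
    """For each required id, list the available versions inside [low, high)."""
    return {
        iu_id: sorted(v for v in available.get(iu_id, []) if low <= v < high)
        for iu_id, (low, high) in requirements.items()
    }


def is_locked_down(requirements, available):
    """True only if every requirement has exactly one possible match."""
    return all(
        len(versions) == 1
        for versions in matching_versions(requirements, available).values()
    )


# Hypothetical repository content: id -> available versions.
available = {
    "org.example.core": [(3, 6, 0), (3, 6, 2)],
    "org.example.ui": [(1, 0, 0)],
}

# A wide range leaves the resolver a choice between 3.6.0 and 3.6.2.
loose_reqs = {
    "org.example.core": ((3, 6, 0), (4, 0, 0)),
    "org.example.ui": ((1, 0, 0), (2, 0, 0)),
}

# A narrow range pins the same requirement to a single version.
pinned_reqs = {
    "org.example.core": ((3, 6, 2), (3, 6, 3)),
    "org.example.ui": ((1, 0, 0), (2, 0, 0)),
}

loose_result = is_locked_down(loose_reqs, available)
pinned_result = is_locked_down(pinned_reqs, available)
```

A requirement with no match at all also counts as not locked down in this sketch, since the build could not resolve it.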
* long term metadata compatibility strategy, i.e. artifacts deployed with tycho 1.0 should be consumable by tycho 2.3.1

For this we need to be clear whether we reuse the p2 metadata format or we define our own. Since we use the p2 publishers, we have a dependency on the p2 metadata format. AFAIK the p2 metadata format is not API.
IUSerializer and IUDeserializer introduced in P2 3.7 are API and were introduced to support this exact use case. They, however, only provide forward compatibility, i.e. future versions of P2 will be able to read metadata generated by older versions but not vice versa. The question here is: what do we do if an older tycho runs into metadata generated by a newer tycho? Ignore it? Fail the build? This is another aspect of build reproducibility. Also, the p2 publisher does NOT depend on the repository metadata format AFAIK, so we are free to use whatever format we choose as long as we implement IMetadataRepository and IArtifactRepository. I am not saying we should invent our own format, but we can if we decide we need to.
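The "ignore it or fail the build" choice can be sketched as an explicit client-side policy. This is an illustration only: the numeric `format` field, `SUPPORTED_FORMAT`, `Policy` and `readable_entries` are assumptions made for the example and do not describe any existing Tycho or p2 behaviour.

```python
from enum import Enum


class Policy(Enum):
    """What an older client does with metadata written in a newer format."""
    IGNORE = "ignore"
    FAIL = "fail"


class UnsupportedMetadataError(Exception):
    """Raised under Policy.FAIL when an entry is newer than the client."""


# Hypothetical: the highest metadata format version this client can read.
SUPPORTED_FORMAT = 2


def readable_entries(entries, policy, supported=SUPPORTED_FORMAT):
    """Return the entries this client can read, applying the given policy.

    Entries at or below the supported format are always kept (forward
    compatibility). Newer entries are skipped or cause a failure.
    """
    result = []
    for entry in entries:
        if entry["format"] <= supported:
            result.append(entry)
        elif policy is Policy.FAIL:
            raise UnsupportedMetadataError(
                f"{entry['id']} uses format {entry['format']}, "
                f"client supports up to {supported}"
            )
        # Policy.IGNORE: the newer entry is silently left out.
    return result


entries = [
    {"id": "org.example.old", "format": 1},
    {"id": "org.example.new", "format": 3},
]

ignored = readable_entries(entries, Policy.IGNORE)
```

Either policy is deterministic for a given client version. Ignoring keeps old builds working but can hide artifacts, while failing makes the incompatibility visible.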
Regards
Jan

-----Original Message-----
From: tycho-dev-bounces@xxxxxxxxxxx [mailto:tycho-dev-bounces@xxxxxxxxxxx] On Behalf Of Igor Fedorenko
Sent: Friday, 27 May 2011 18:17
To: tycho-dev@xxxxxxxxxxx
Subject: [tycho-dev] [discuss] tycho repository layout and metadata format

I think I am finally ready to give TYCHO-335 another try. For the uninitiated, this is about being able to share artifacts and corresponding p2 metadata via a repository.

Before I do anything I'd like us to discuss and agree on high-level requirements for this

* synchronous or near-synchronous metadata update after deploy. So if one Hudson (or cli) build deploys artifacts, the next build is expected to be able to consume the artifacts
* efficiently support both deploy-only (i.e. RELEASE) and deploy-remove (i.e. SNAPSHOT) repositories. Although desirable, it is not a hard requirement to support both usage patterns with a single format, we can define two formats if needed.
* scale to 200K+ artifacts/installable units. To put this in context,
  - there are ~160K artifacts in maven central
  - indigo M7 repo had ~11K IUs and was ~4.6M in size when jarred
  - assuming comparable compression level, 200K IUs will be >65M jarred
* long term metadata compatibility strategy, i.e. artifacts deployed with tycho 1.0 should be consumable by tycho 2.3.1
* focus on artifact deploy and dependency resolution during the build. To help us manage scope of this work, I want to explicitly exclude IU categories and other user-facing features from the scope.
* providing "simple" or "composite" p2 repository layouts is explicitly NOT a requirement. Likewise, using Maven2 repository layout is NOT a requirement. Let's keep our options open

For bonus points

* allow efficient caching-proxy and aggregating repositories
* allow efficient implementation of fine-grained access control
* possibility to interact with maven-bundle-plugin and other maven OSGi tools

Did I miss anything? Comments, ideas?
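The size estimate in the scaling requirement can be cross-checked with a quick linear extrapolation. This assumes jarred metadata size grows proportionally with IU count, which is only an approximation since compression ratios vary with content; the variable names are illustrative.

```python
# Figures quoted in the message for the indigo M7 repository.
indigo_ius = 11_000
indigo_jar_size_m = 4.6  # jarred size, in the same "M" unit as the message

# Target scale from the requirement.
target_ius = 200_000

# Assumption: jarred size scales linearly with the number of IUs.
size_per_iu_m = indigo_jar_size_m / indigo_ius
estimated_jar_size_m = size_per_iu_m * target_ius
```

Under that assumption the estimate is roughly 84M, which is consistent with the ">65M" lower bound quoted in the list.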
https://issues.sonatype.org/browse/TYCHO-335  http://search.maven.org/#stats