Home » Modeling » EMF » Feature IDs in generated code(Custom code generation for VCS friendly code )
Feature IDs in generated code [message #1835085] |
Tue, 24 November 2020 13:28 |
Janik S Messages: 12 Registered: November 2017 |
Junior Member |
|
|
Hi,
we are facing issues with maintaining checked in emf generated code across multiple branches. The main problem is that the generated code for the package class is containing enumerated feature ids. If the model is changed by multiple developers, it is very hard to do out of order merges. The only safe way to achieve this, would be a manual integrate with regenerating the code on the target branch. Right now we are considering to avoid checking in the generated code but instead generating it on the fly in our build process. I think both approaches have advantages and disadvantages, so far our preference would still be to have the generated code checked in (e.g. because of customized edit code, easier understanding of code history in revision control).
Are there any tips how we could solve this problem?
We thought of customizing the code generation to generate more merge friendly code. If this is possible, we think that the following changes could simplify the merging process:
a) In the ModelPackageImpl the index based feature access could use the static feature ids defined on the interface instead of duplicating the index hardcoded.
E.g:
public EAttribute getModel_Version() {
return (EAttribute)getModel().getEStructuralFeatures().get(10);
}
//change to
public EAttribute getModel_Version() {
return (EAttribute)getModel().getEStructuralFeatures().get(CorePackage.MODEL__VERSION);
}
b) Don't hardcode the ids on the interface, but instead generate them with some static helper. Needs to provide global counters and sub-counters for offset feature ids.
int NAMED_MODEL_ELEMENT = 16;
//change to
int NAMED_MODEL_ELEMENT = CoreIdGenerator.metaObjectId().incrementAndGet()
int NAMED_MODEL_ELEMENT__NAME = MODEL_ELEMENT_FEATURE_COUNT + 0;
//change to
int NAMED_MODEL_ELEMENT__NAME = MODEL_ELEMENT_FEATURE_COUNT + CoreIdGenerator.featureId(NAMED_MODEL_ELEMENT).incrementAndGet();
int NAMED_MODEL_ELEMENT_FEATURE_COUNT = MODEL_ELEMENT_FEATURE_COUNT + 1;
//change to
int NAMED_MODEL_ELEMENT_FEATURE_COUNT = MODEL_ELEMENT_FEATURE_COUNT + CoreIdGenerator.featureId(NAMED_MODEL_ELEMENT).get();
This is just a rough idea how it could work without thinking much about details like thread safety (static initialization is done once per class loader?). Have there already been some thoughts about something like this? Or can you give any other recommendation for avoiding the merge problems?
If this is a valid approach, what would be the best way to customize the generation? Is it required to mirror the code and hack the changes into it, or are there any other possible hooks for modifications like this?
I think the relevant pointers for the JET template / model are here:
https://github.com/eclipse/emf/blob/master/plugins/org.eclipse.emf.codegen.ecore/templates/model/PackageClass.javajet#L200
https://github.com/eclipse/emf/blob/8b4397688e71b577c5c9b1f915da1769bab8ade0/plugins/org.eclipse.emf.codegen.ecore/src/org/eclipse/emf/codegen/ecore/genmodel/impl/GenClassImpl.java#L1043
Thanks
Janik
|
|
| |
Re: Feature IDs in generated code [message #1835106 is a reply to message #1835088] |
Tue, 24 November 2020 19:30 |
Janik S Messages: 12 Registered: November 2017 |
Junior Member |
|
|
Hey Ed,
thanks for the very quick answer. Hmm I think my point is, that the final id as integer number should be irrelevant, if all references are always going through the package class API. It turns out to be something like an enum. The eGet/eSet/eIsSet/eUnset methods which you have mentioned are such examples, e.g.:
public Object eGet(int featureID, boolean resolve, boolean coreType) {
switch (featureID) {
case CorePackage.MODEL__NAME: //I don't care if it is 0 or 57 and it is subject to change anyways
return getName();
case CorePackage.MODEL__FILE_VERSION:
return getFileVersion();
[...]
I would say that at least in java code, the features should never be accessed via a hardcoded integer, but rather by using the static field on the package interface. Whether this field has the value 57 or 58 becomes irrelevant. That is what I recommend as change a) in the package impl class. eGet/eSet/eIsSet/eUnset are already doing this correctly.
Thinking of your example now, if the ids are not hard coded but statically uniquely assigned, the order would also not matter - or am I msissing something?
[...]
int X_Foo = CoreIdGenerator.featureId(X).incrementAndGet(); //might be 57
int X_Bar = CoreIdGenerator.featureId(X).incrementAndGet(); //might be 58
[...]
or
[...]
int X_Bar = CoreIdGenerator.featureId(X).incrementAndGet(); //might be 57
int X_Foo = CoreIdGenerator.featureId(X).incrementAndGet(); //might be 58
[...]
Referencing via the interface (CorePackage.X_Bar) would always be correct. Referencing via the integer value would today already be dangerous or even wrong?!
So far I can only imagine some problems, if we are leaving java and want to carry some feature id (e.g. serialize a feature value update: set "feature 57" to "foo" and apply it later on on a loaded model where feature 57 became a different feature). This would neither work with today's hardcoded approach.
Of course regenerating the code on the target branch is the safest approach, but this is not realistic in our case, as we have multiple release branches with staging branches. In our case we are also not trying to solve the problem of merging two separately evolved models, but we can assume that the model is changed only on one branch. We just want to relax developer dependencies and allow for out of order merges. So the target branch(es) will always follow the source branch.
Getting back to your example:
1. Developer A adds feature Foo to class X on branch Develop. Regenerate.
-> some methods will be added in several places, new feature id will be generated (and unfortunately other ids change)
2. Developer B adds feature Bar to class X on bnanch Develop.
-> some methods will be added in several places, new feature id will be generated (and unfortunately other ids change)
[state on branch Develop now: 2x several methods added -> ok to merge, easy diff in the model file -> ok to merge, some fixed ids as result of both changes -> not ok to out of order merge]
Our goal is to eventually have both changes on the Release branch. Today, it is already "ok" to first merge Developer A's changes and then Developer B's changes. In the end the Develop and Release branch will be in the same state.
With my limited knowledge about the EMF details, I'm currently only seeing the fixed ids from preventing us to merge the two changes in a different order.
I can see a problem when the generated changes are in two neighboring lines, then the code can be out of sync after a merge, but should be still functionaly correct. Of course this should be a corner case and we should take care when manually merging that the target branch should always try to get into sync with the source branch.
Can you see any other problems, which I might have overlooked?
Sorry for the long explanation, but I hope that I could better explain the idea and you can give it a second thought with the assumption that IDs should never be directly used in generated code but always accessed via the static indirection.
If you still say, this is fundamentally not possible, is the best approach for us to NOT check the generated code in but generate it on the fly?
Thanks for your help! :)
|
|
| | |
Re: Feature IDs in generated code [message #1835132 is a reply to message #1835118] |
Wed, 25 November 2020 11:26 |
Janik S Messages: 12 Registered: November 2017 |
Junior Member |
|
|
Thanks Ed and Ed :)
Ed Merks,
sorry I made some wrong assumption, now I see the problem and you are totally right, what I proposed is not possible in Java:
public class CompileTimeConstantTest {
public final static class CoreIds {
private final static AtomicInteger globalIds = new AtomicInteger(0);
private final static Map<Integer, AtomicInteger> subIds = new HashMap<>();
public static int metaObjectId_GetAndIncrement() {
return globalIds.getAndIncrement();
}
public static int featureId_GetAndIncrement(int base) {
AtomicInteger subId = subIds.computeIfAbsent(base, b -> new AtomicInteger(0));
return subId.incrementAndGet();
}
}
public static interface CorePackageFail {
final int MODEL = CoreIds.metaObjectId_GetAndIncrement();
int MODEL_NAME = CoreIds.featureId_GetAndIncrement(MODEL);
int MODEL_FOO = CoreIds.featureId_GetAndIncrement(MODEL);
int MEMORY = CoreIds.metaObjectId_GetAndIncrement();
int MEMORY_NAME = CoreIds.featureId_GetAndIncrement(MEMORY);
int MEMORY_BAR = CoreIds.featureId_GetAndIncrement(MEMORY);
}
@Test
public void testCompileTimeConstantsError() {
assertEquals(0, CorePackageFail.MODEL);
switch (0) {
case CorePackageFail.MODEL: //Error: Case expression must be constant expression
break;
default:
fail();
}
}
public static interface CorePackage {
final int MODEL = 0;
int MODEL_NAME = 0;
int MODEL_FOO = MODEL_NAME + 1;
int MODEL_BAR = MODEL_FOO + 1;
int MEMORY = 1;
int MEMORY_NAME = 0;
int MEMORY_BAR = MEMORY_NAME + 1;
}
@Test
public void testCompileTimeConstants() {
assertEquals(0, CorePackage.MODEL);
switch (0) {
case CorePackage.MODEL:
break;
default:
fail();
}
}
}
Nevertheless, maybe there are other similar possibilities to minimize the effect of a model change on the generated code. You already have expressions in the assignments today, so some increment expression could be used to reduce the number of changes (see unit test above). Alternatively maybe Enums could be used in some way, as the positioning alone already defines some kind of uniquiness - but maybe not ideal, as we would need multiple enums?!
Ed Willink, I had a quick look at your discussion and solution and this is an interesting point - are the ids public API or not?! First I was thinking, maybe it is not even required, as the real features (and object identity) are API enough. On the second look though, I see that they are also referred to in e.g. notifications, hence they should be publicly available. So I would say, yes, the IDs serving as "unique identifiers" is public API, but the real "integer values" themselves are an implementation detail and should never be directly written in generated code except for the definition. Like this all references remain stable. (this is problem 'a)' I was mentioning)
I see that you have moved the feature ids from the package into the class. I could not see how they are used in the notification infrastructure, on the package they are completely removed right? Are they still exposed in the Class interfaces or you could you eliminate them completely from the API?
|
|
| | |
Re: Feature IDs in generated code [message #1835165 is a reply to message #1835146] |
Thu, 26 November 2020 09:23 |
Ed Willink Messages: 7655 Registered: July 2009 |
Senior Member |
|
|
Hi
The more general problem of stabilizing order might benefit from a change of merger philosophy. Currently the merger endeavours to preserve existing ordering, which is sensible for a single linear development. However I find that if I develop a change that ultimately involves changes A, B, C where A,B,C are in the order of the Ecore file declarations, but incrementally I developed it as B then C then A, the incremental merge accommodates B,C,A and is inconveniently different to the start again with A,B,C at the end of the incremental experiment/research. You are probably seeing similar issues with concurrent developments.
If rather than preserving prevailing order, the merger imposed a sensible (typically alphabetical) order there would be far fewer merge conflicts as the results of changes A,B,C are accommodated in distinct orders. The merger could therefore read old/new files as a hierarchy of 'paragraphs' such as package/class/nested-class/operation/field taking care to aggregate all comments to the nearest 'paragraph'. The hierarchies can then be sorted and interleaved with merge conflicts left distinctively marked to force user attention. A similar functionality would be helpful in GIT, except that would then make GIT Java-language aware.
Regards
Ed Willink
|
|
| | | | | |
Re: Feature IDs in generated code [message #1835212 is a reply to message #1835201] |
Fri, 27 November 2020 08:23 |
Janik S Messages: 12 Registered: November 2017 |
Junior Member |
|
|
I thought subclassing + an item provider factory returning the specialized item providers is the generation gap pattern. But yes, if it is not required, then we go for the simpler approach.
I was not considerung this change to be a fundamentally different "generation pattern", more a cleanup of the existing "generation pattern". I also thought that it should not be too complicated, as it is done "correctly" in most other places already. So I was more just interested in whether you agree with the point, or if there is a concrete reason, why in certain places the numbers are hardcoded and in other places the indirection via the API is done? If you say this makes sense, I could also try to come up with some proposal, but of course it doesn't make sense to look into it, if you don't agree. :)
|
|
| | |
Re: Feature IDs in generated code [message #1835232 is a reply to message #1835229] |
Fri, 27 November 2020 14:04 |
|
Janik S wrote on Fri, 27 November 2020 14:24[...]Btw, we even observed different emf output in exactly the same eclipse (different users) - so there must be some user code-formatting preferences affecting the generation output - but this is a different story.
Yes, you can, for example, execute the source code formatter as part of the generation procedure. If you do that in multiple workspaces you better make sure that you store the formatter profile in the project-specific settings - not in the global workspace preferences - and commit it to your source code repository along with your sources.
Cheers
/Eike
----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper
|
|
| |
Re: Feature IDs in generated code [message #1835239 is a reply to message #1835237] |
Fri, 27 November 2020 16:04 |
Ed Merks Messages: 33133 Registered: July 2009 |
Senior Member |
|
|
Okay, we really are going down the rat hole and then down into the side tunnels of the rat hole... :-(
Firstly as for the StringBuilder/StringBuffer, it seemed unlikely to me that anyone would want an option to keep generating StringBuilder. I could be wrong, but no one complained. I actually only did this because the Platform, which uses EMF, went on a StringBuffer witch hunt, and were whining about this, otherwise who really cares? It's not a performance issue! But it's simply impossible to keep everyone happy...
As for the hard coded numbers in methods like this: public EReference getEClass_EAllSuperTypes()
{
return (EReference)eClassEClass.getEStructuralFeatures().get(11);
} You should note that feature IDs from the package interface are indexes in eClass.getEAllStructureFeatures(), but in this code we are accessing eClass.getEStructuralFeatures(), This is a different list (the features declared local to this class as opposed to all feature available via inheritance). For this there are no constants in the interface for the things at each index. So your suggested change is actually wrong, and generating yet more constant seems overkill because they'd not be used outside of this initialization code.
Soon it will be time to send a invoice for my 1 hour of time!! :-P
Ed Merks
Professional Support: https://www.macromodeling.com/
|
|
|
Re: Feature IDs in generated code [message #1835240 is a reply to message #1835237] |
Fri, 27 November 2020 16:18 |
Janik S Messages: 12 Registered: November 2017 |
Junior Member |
|
|
Sorry Ed, seems like there is some misunderstanding on my or your side.
My point is that 62 should only be used in ONE single place, the place where the id is declared: FEATURE = 62. It should NOT be used in any switch case statement .. NEITHER in the place which I mentioned in my last post : return (EAttribute)getModel().getEStructuralFeatures().get(62);
In this place it should rather be return (EAttribute)getModel().getEStructuralFeatures().get(CorePackage.FEATURE); ... which is not the case today
There is only one argument I could imagine: if the feature name changes, the static field name changes, hence all references change - if the int value was used instead for references, only the field name changes and all references (via the int) are stable. This is not a strong argument in my eyes.
Eike, thanks for the recommendation, maybe we should fixate the auto formating for those particular projects then. An option in the gen model for an eclipse preference independant generation would be good - or maybe already exist?
EDIT:
After reading Ed Merks last post it makes sense now. I was checking our diffs and it looked very much like duplicated ids. If this is not the case, please forget about it. Thanks for the explanations and sorry for wasting your time. :)
EDIT2:
I'm not complaining about the StringBuilder change, this is totally fine - it just popped into my mind first, when you said EMF users are not used to generator changes.
[Updated on: Sat, 28 November 2020 09:35] Report message to a moderator
|
|
| |
Goto Forum:
Current Time: Thu Apr 18 06:23:48 GMT 2024
Powered by FUDForum. Page generated in 0.03404 seconds
|