Underscore rules in generated values [message #425502] Fri, 28 November 2008 10:31
Originally posted by: neko.ticino.com

Hello,

I'm having trouble understanding how EMF generates names for
Diagnostic values. What are the rules for placing underscores?

I've found these rules:
- place underscore between lowercase and uppercase chars
- place underscore before the last uppercase char in a row of at least 3
uppercase chars, if it is not the last one in the string.

But how does it work if the name already contains an underscore? And what if
there are digits? These are neither lowercase nor uppercase...

Here's an example of what EMF generates:

/**
* The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
* constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
* Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
*
* @generated
*/
public static final int
GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
71;

To generate the name
GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES, EMF
uses the entity names (in this case Gate, Definition and
CompatibleGatesContainingCorrectValues). That part is fine; I can replicate
what EMF does...
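
For reference, the composition described here (class name, a double underscore, then the constraint name) can be sketched in a few lines of Java. This is only an illustration under assumptions: the class DiagnosticCodeNames and its methods are hypothetical, the two input names are inferred from the constant shown above, and the splitting handles only the plain lowercase-to-uppercase transition.

```java
import java.util.Locale;

final class DiagnosticCodeNames {

    // Inserts '_' at every lowercase-to-uppercase transition, then upper-cases the result.
    // Acronyms, digits and existing underscores are deliberately not handled here.
    static String upperSnake(String camelCase) {
        return camelCase.replaceAll("(?<=[a-z])(?=[A-Z])", "_").toUpperCase(Locale.ROOT);
    }

    // Joins the class part and the constraint part with a double underscore,
    // as seen in the generated constant quoted in this post.
    static String constraintCode(String className, String constraintName) {
        return upperSnake(className) + "__" + upperSnake(constraintName);
    }

    public static void main(String[] args) {
        System.out.println(constraintCode(
            "GateDefinition", "ValidateCompatibleGatesContainingCorrectValues"));
    }
}
```

With these assumed inputs the sketch yields the same constant name as the generated code above; the harder cases listed below are exactly where such a naive version stops matching.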

But...:
- if the name of the entity is 'Gate_' the result is still
GATE_DEFINITION (it seems the underscore is removed)
- If the name is Gate_7 the result is 'GATE_7DEFINITION'...
- If the name is already uppercase like 'GATE' the result is still
GATE_DEFINITION, but according to the second rule it should be
GAT_EDEFINITION...

Thanks for helping!

Alex
Re: Underscore rules in generated values [message #425506 is a reply to message #425502] Fri, 28 November 2008 12:15
Ed Merks
Messages: 33140
Registered: July 2009
Senior Member
Alex,

Comments below.


Alex Nekoti wrote:
> Hello,
>
> I've some problems understanding how emg generates names for
> Diagnostic's values. How are the rules for placing 'underscores'?
>
When generating constant names, we generally use an _ to separate words
(which are determined by where lower case letters become upper case
again, i.e., where the transitions in the camel case occur).
> I've found these rules:
> - place underscore between lowercase and uppercase chars
> - place underscore before the last uppercase char in a row of at least 3
> uppercase chars, if it is not the last one in the string.
>
Yes, since it's quite common for acronyms to be used, e.g., XSDComponent
would become XSD_COMPONENT.
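
The two rules discussed so far can be approximated with a single regular expression. The following is a simplified sketch, not the CodeGenUtil logic: the class name CamelCaseConstants is hypothetical, and digits and existing underscores are ignored on purpose.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class CamelCaseConstants {

    // Break 1: between a lowercase letter and an uppercase letter (Gate|Definition).
    // Break 2: inside an uppercase run, just before the uppercase letter that is
    //          followed by a lowercase letter (XSD|Component).
    private static final Pattern WORD_BREAK =
        Pattern.compile("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    static String toConstantName(String camelCase) {
        return String.join("_", WORD_BREAK.split(camelCase)).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"XSDComponent", "GateDefinition", "GATEDefinition"}) {
            System.out.println(name + " -> " + toConstantName(name));
        }
    }
}
```

Note that the second break only applies when a lowercase letter follows the uppercase run, which is why GATEDefinition splits as GATE and Definition rather than inside GATE.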
> But how it works if in the name there is already an underscore?
Generally one should not be using _ in feature or type names when
following Java's language naming conventions...
> And if
> there are numbers? These are neither lowercase or uppercase...
>
An _ will represent a word break. A number will not result in a word break.
> Here's an example this is what emf generates:
>
> /**
> * The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
> * constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
> * Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
> *
> * @generated
> */
> public static final int
> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
> 71;
>
> To generate the name
> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES emf
> uses entities names (in this case Gate, Definition and
> CompatibleGatesContainingCorrectValues). And this is ok, I can replicate
> what emf does...
>
> But...:
> - if the name of the entity is 'Gate_' the result is still
> GATE_DEFINITION (it seems the underscore is removed)
>
Yes.
> - If the name is Gate_7 the result is 'GATE_7DEFINITION'...
>
Gross name. The _ causes a word break and the 7 being neutral just
associates with the next word.
> - If the name is already uppercase like 'GATE' the result is still
> GATE_DEFINITION, but according to the second rule it should be
> GAT_EDEFINITION...
>
The gory details of word breaking are in
org.eclipse.emf.codegen.util.CodeGenUtil.format.
> Thanks for helping!
>
> Alex
>
>
>


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Underscore rules in generated values [message #425507 is a reply to message #425506] Fri, 28 November 2008 13:33
Originally posted by: neko.ticino.com

Thank you, Ed, for the answers, but I still have some trouble.

Ed Merks wrote:

> Alex,
>
> Comments below.
>
>
> Alex Nekoti wrote:
>> Hello,
>>
>> I've some problems understanding how emg generates names for
>> Diagnostic's values. How are the rules for placing 'underscores'?
>>
> When generating constant names, we generally use an _ to separate words
> (which are determined by where lower case letters become upper case
> again, i.e, where the transitions in the camel case occur).
That's ok.

>> I've found these rules:
>> - place underscore between lowercase and uppercase chars
>> - place underscore before the last uppercase char in a row of at least 3
>> uppercase chars, if it is not the last one in the string.
>>
> Yes, since it's quite common for acronyms to be used, i.e., XSDComponent
> would become XSD_COMPONENT.
This is ok too.

>> But how it works if in the name there is already an underscore?
> Generally one should not be using _ in feature or type names when
> following Java's language naming conventions...
I know. But I still have to handle cases where someone uses names that
don't follow Java conventions.

>> And if
>> there are numbers? These are neither lowercase or uppercase...
>>
> An _ will represent a word break. A number will not result in a word break.
>> Here's an example this is what emf generates:
>>
>> /**
>> * The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
>> * constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
>> * Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
>> *
>> * @generated
>> */
>> public static final int
>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
>> 71;
>>
>> To generate the name
>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES emf
>> uses entities names (in this case Gate, Definition and
>> CompatibleGatesContainingCorrectValues). And this is ok, I can replicate
>> what emf does...
>>
>> But...:
>> - if the name of the entity is 'Gate_' the result is still
>> GATE_DEFINITION (it seems the underscore is removed)
>>
> Yes.
But is it removed only if it is at the end of a word? Underscores in the
middle of a word are treated as word breaks, is that right?
So 'Gate_' is like 'Gate', but 'Gate_A' is not like 'GateA'?

>> - If the name is Gate_7 the result is 'GATE_7DEFINITION'...
>>
> Gross name. The _ causes a word break and the 7 being neutral just
> associates with the next word.
OK, let's try something more difficult (sorry Ed!):
Gate_1A3Definition is generated as Gate_1A3_Definition. Why is A3 not
separated when 3D is?

What does it mean that 7 is neutral? The rule says 'place underscore
before the last uppercase char in a row of at least 3', but does a digit
affect the row count or not?
GATEDefinition has a row of 5 uppercase letters, so it becomes
GATE_Definition, but what about GA6EDefinition?

>> - If the name is already uppercase like 'GATE' the result is still
>> GATE_DEFINITION, but according to the second rule it should be
>> GAT_EDEFINITION...
>>
> The gory details of word breaking are in
> org.eclipse.emf.codegen.util.CodeGenUtil.format.
>> Thanks for helping!
>>
>> Alex
>>
>>
>>
Thank you again!
Alex
Re: Underscore rules in generated values [message #425509 is a reply to message #425507] Fri, 28 November 2008 14:11
Ed Merks

Alex,

Comments below.

Alex Nekoti wrote:
> Thank you Ed for the answers, but I still have some troubles
>
> Ed Merks wrote:
>
>
>> Alex,
>>
>> Comments below.
>>
>>
>> Alex Nekoti wrote:
>>
>>> Hello,
>>>
>>> I've some problems understanding how emg generates names for
>>> Diagnostic's values. How are the rules for placing 'underscores'?
>>>
>>>
>> When generating constant names, we generally use an _ to separate words
>> (which are determined by where lower case letters become upper case
>> again, i.e, where the transitions in the camel case occur).
>>
> That's ok.
>
>
>>> I've found these rules:
>>> - place underscore between lowercase and uppercase chars
>>> - place underscore before the last uppercase char in a row of at least 3
>>> uppercase chars, if it is not the last one in the string.
>>>
>>>
>> Yes, since it's quite common for acronyms to be used, i.e., XSDComponent
>> would become XSD_COMPONENT.
>>
> This is ok too.
>
>
>>> But how it works if in the name there is already an underscore?
>>>
>> Generally one should not be using _ in feature or type names when
>> following Java's language naming conventions...
>>
> I know. But I still have to consider cases where someone uses names not
> according to java conventions.
>
If you use unconventional names it's reasonable that the results aren't
so conventional either. Where are such names coming from anyway that
they're beyond anyone's control?
>
>>> And if
>>> there are numbers? These are neither lowercase or uppercase...
>>>
>>>
>> An _ will represent a word break. A number will not result in a word break.
>>
>>> Here's an example this is what emf generates:
>>>
>>> /**
>>> * The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
>>> * constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
>>> * Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
>>> *
>>> * @generated
>>> */
>>> public static final int
>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
>>> 71;
>>>
>>> To generate the name
>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES emf
>>> uses entities names (in this case Gate, Definition and
>>> CompatibleGatesContainingCorrectValues). And this is ok, I can replicate
>>> what emf does...
>>>
>>> But...:
>>> - if the name of the entity is 'Gate_' the result is still
>>> GATE_DEFINITION (it seems the underscore is removed)
>>>
>>>
>> Yes.
>>
> But it is removed only if it is at the end of a word? Underscores in the
> middle of a word are considered word breaks, that's right?
> So, 'Gate_' is like 'Gate', but 'Gate_A' is not like 'GateA' ?
>
Well I think Gate_A is like GateA because both will become "Gate", "A"
in the parsed word list.
>
>>> - If the name is Gate_7 the result is 'GATE_7DEFINITION'...
>>>
>>>
>> Gross name. The _ causes a word break and the 7 being neutral just
>> associates with the next word.
>>
> Ok, let's try something more difficult (sorry Ed!):
> Gate_1A3Definition generates as Gate_1A3_Definition. Why A3 is not
> separated but 3D it is?
>
Because it's assumed that words might contain or end with numbers, but
that they don't start with them.
> What does mean that 7 is neutral?
It's treated more like a lower case letter.
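
The behaviour described in this thread (an underscore always ends a word, a digit never starts one on its own) can be reproduced with a small state machine. This is a sketch written to match the examples quoted here, not the CodeGenUtil source; the class WordBreakSketch is hypothetical, and the rule that one-character words get no trailing separator is an assumption chosen because it yields GATE_7DEFINITION.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class WordBreakSketch {

    /** Splits a name into words; the separator is dropped and always ends the current word. */
    static List<String> parseName(String name, char separator) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean lastIsLower = false;
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // A digit is neutral: uppercase-like unless it follows a lowercase-like character.
            boolean upperLike = Character.isUpperCase(c) || (!lastIsLower && Character.isDigit(c));
            if (upperLike || c == separator) {
                boolean endsLowerWord = lastIsLower && current.length() > 1;
                boolean endsAtSeparator = c == separator && current.length() > 0;
                if (endsLowerWord || endsAtSeparator) {
                    words.add(current.toString());
                    current.setLength(0);
                }
                lastIsLower = false;
            } else {
                // Lowercase-like character after an uppercase run:
                // the last character of the run becomes the start of the new word.
                if (!lastIsLower && current.length() > 1) {
                    char last = current.charAt(current.length() - 1);
                    current.setLength(current.length() - 1);
                    words.add(current.toString());
                    current.setLength(0);
                    current.append(last);
                }
                lastIsLower = true;
            }
            if (c != separator) {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    /** Joins the words with '_', omitting the separator after one-character words. */
    static String toConstantName(String name) {
        List<String> words = parseName(name, '_');
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            result.append(word);
            if (i < words.size() - 1 && word.length() > 1) {
                result.append('_');
            }
        }
        return result.toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Gate_7Definition", "Gate_1A3Definition", "Gate_A", "GateA"}) {
            System.out.println(name + " -> " + toConstantName(name));
        }
    }
}
```

In this sketch Gate_A and GateA both parse to the words Gate and A, and GA6EDefinition comes out as GA6E_DEFINITION; whether EMF agrees on inputs not quoted in this thread has to be checked against CodeGenUtil itself.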
> The rule says 'place underscore
> before the last uppercase char in a row of at least 3', but a digit does
> affect the row count o not?
>
I could answer many questions but in the end, what I say must match
exactly the logic in the code.
> GATEDefinition is a row of 5 uppercase letters so it becomes
> GATE_Definition, but GA6EDefinition?
>
Unfortunately I don't think any syntactic rule will ever get it all
exactly right, e.g., XSDXPath is really intended to be XSD XPath. Or
EJBRDBMapping is maybe supposed to be EJB_RDB_Mapping...
>
>>> - If the name is already uppercase like 'GATE' the result is still
>>> GATE_DEFINITION, but according to the second rule it should be
>>> GAT_EDEFINITION...
>>>
>>>
>> The gory details of word breaking are in
>> org.eclipse.emf.codegen.util.CodeGenUtil.format.
>>
>>> Thanks for helping!
>>>
>>> Alex
>>>
>>>
>>>
>>>
> Thank you again!
>
If you don't like the name that's generated, you can always add by hand
a nicer variant; of course the generated code will always use the
generated form...
> Alex
>
>



Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Underscore rules in generated values [message #425512 is a reply to message #425509] Fri, 28 November 2008 14:36
Originally posted by: neko.ticino.com

Ed Merks wrote:

> Alex,
>
> Comments below.
>
> Alex Nekoti wrote:
>> Thank you Ed for the answers, but I still have some troubles
>>
>> Ed Merks wrote:
>>
>>
>>> Alex,
>>>
>>> Comments below.
>>>
>>>
>>> Alex Nekoti wrote:
>>>
>>>> Hello,
>>>>
>>>> I've some problems understanding how emg generates names for
>>>> Diagnostic's values. How are the rules for placing 'underscores'?
>>>>
>>>>
>>> When generating constant names, we generally use an _ to separate words
>>> (which are determined by where lower case letters become upper case
>>> again, i.e, where the transitions in the camel case occur).
>>>
>> That's ok.
>>
>>
>>>> I've found these rules:
>>>> - place underscore between lowercase and uppercase chars
>>>> - place underscore before the last uppercase char in a row of at least 3
>>>> uppercase chars, if it is not the last one in the string.
>>>>
>>>>
>>> Yes, since it's quite common for acronyms to be used, i.e., XSDComponent
>>> would become XSD_COMPONENT.
>>>
>> This is ok too.
>>
>>
>>>> But how it works if in the name there is already an underscore?
>>>>
>>> Generally one should not be using _ in feature or type names when
>>> following Java's language naming conventions...
>>>
>> I know. But I still have to consider cases where someone uses names not
>> according to java conventions.
>>
> If you use unconventional names it's reasonable that the results aren't
> so conventional either. Where are such names coming from anyway that
> they're beyond anyone's control?
From users! Sure, I can suggest that they use conventional names,
but if EMF allows non-standard names, I should be able to
match what EMF generates.

>>
>>>> And if
>>>> there are numbers? These are neither lowercase or uppercase...
>>>>
>>>>
>>> An _ will represent a word break. A number will not result in a word break.
>>>
>>>> Here's an example this is what emf generates:
>>>>
>>>> /**
>>>> * The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
>>>> * constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
>>>> * Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
>>>> *
>>>> * @generated
>>>> */
>>>> public static final int
>>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
>>>> 71;
>>>>
>>>> To generate the name
>>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES emf
>>>> uses entities names (in this case Gate, Definition and
>>>> CompatibleGatesContainingCorrectValues). And this is ok, I can replicate
>>>> what emf does...
>>>>
>>>> But...:
>>>> - if the name of the entity is 'Gate_' the result is still
>>>> GATE_DEFINITION (it seems the underscore is removed)
>>>>
>>>>
>>> Yes.
>>>
>> But it is removed only if it is at the end of a word? Underscores in the
>> middle of a word are considered word breaks, that's right?
>> So, 'Gate_' is like 'Gate', but 'Gate_A' is not like 'GateA' ?
>>
> Well I think Gate_A is like GateA because both will become "Gate", "A"
> in the parsed word list.
>>
>>>> - If the name is Gate_7 the result is 'GATE_7DEFINITION'...
>>>>
>>>>
>>> Gross name. The _ causes a word break and the 7 being neutral just
>>> associates with the next word.
>>>
>> Ok, let's try something more difficult (sorry Ed!):
>> Gate_1A3Definition generates as Gate_1A3_Definition. Why A3 is not
>> separated but 3D it is?
>>
> Because it's assumed that words might contain or end with numbers, but
> that they don't start with them.
>> What does mean that 7 is neutral?
> It's treated more like a lower case letter.
>> The rule says 'place underscore
>> before the last uppercase char in a row of at least 3', but a digit does
>> affect the row count o not?
>>
> I could answer many questions but in the end, what I say must match
> exactly the logic in the code.
>> GATEDefinition is a row of 5 uppercase letters so it becomes
>> GATE_Definition, but GA6EDefinition?
>>
> Unfortunately I don't think any syntactic rule will every get is all
> exactly right, e.g., XSDXPath is really intended to be XSD XPath. Or
> EJBRDBMapping is maybe supposed to be ERJ_RDB_Mapping...
>>
>>>> - If the name is already uppercase like 'GATE' the result is still
>>>> GATE_DEFINITION, but according to the second rule it should be
>>>> GAT_EDEFINITION...
>>>>
>>>>
>>> The gory details of word breaking are in
>>> org.eclipse.emf.codegen.util.CodeGenUtil.format.
>>>
>>>> Thanks for helping!
>>>>
>>>> Alex
>>>>
>>>>
>>>>
>>>>
>> Thank you again!
>>
> If you don't like the name that's generated, you can always add by hand
> a nicer variant; of course the generated code will always use the
> generated form...
That's not the issue; the generated names are perfectly suitable. The problem
is that I can't replicate them! The goal is to match the generated names
with my own generator.

Thank you Ed!

Alex
Re: Underscore rules in generated values [message #425513 is a reply to message #425512] Fri, 28 November 2008 14:42
Ed Merks
This is a multi-part message in MIME format.
--------------070906010209020301050103
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Alex,

Comments below.


Alex Nekoti wrote:
> Ed Merks wrote:
>
>
>> Alex,
>>
>> Comments below.
>>
>> Alex Nekoti wrote:
>>
>>> Thank you Ed for the answers, but I still have some troubles
>>>
>>> Ed Merks wrote:
>>>
>>>
>>>
>>>> Alex,
>>>>
>>>> Comments below.
>>>>
>>>>
>>>> Alex Nekoti wrote:
>>>>
>>>>
>>>>> Hello,
>>>>>
>>>>> I've some problems understanding how emg generates names for
>>>>> Diagnostic's values. How are the rules for placing 'underscores'?
>>>>>
>>>>>
>>>>>
>>>> When generating constant names, we generally use an _ to separate words
>>>> (which are determined by where lower case letters become upper case
>>>> again, i.e, where the transitions in the camel case occur).
>>>>
>>>>
>>> That's ok.
>>>
>>>
>>>
>>>>> I've found these rules:
>>>>> - place underscore between lowercase and uppercase chars
>>>>> - place underscore before the last uppercase char in a row of at least 3
>>>>> uppercase chars, if it is not the last one in the string.
>>>>>
>>>>>
>>>>>
>>>> Yes, since it's quite common for acronyms to be used, i.e., XSDComponent
>>>> would become XSD_COMPONENT.
>>>>
>>>>
>>> This is ok too.
>>>
>>>
>>>
>>>>> But how it works if in the name there is already an underscore?
>>>>>
>>>>>
>>>> Generally one should not be using _ in feature or type names when
>>>> following Java's language naming conventions...
>>>>
>>>>
>>> I know. But I still have to consider cases where someone uses names not
>>> according to java conventions.
>>>
>>>
>> If you use unconventional names it's reasonable that the results aren't
>> so conventional either. Where are such names coming from anyway that
>> they're beyond anyone's control?
>>
> From users! Sure, it's possible to suggest to use conventional names,
> but if emf allows to have not standard conventions I should be able to
> match the emf generation.
>
That assumes one algorithm can handle all possible inputs in a way that
humans consider ideal, and I don't think that's 100% possible.
>
>>>
>>>
>>>>> And if
>>>>> there are numbers? These are neither lowercase or uppercase...
>>>>>
>>>>>
>>>>>
>>>> An _ will represent a word break. A number will not result in a word break.
>>>>
>>>>
>>>>> Here's an example this is what emf generates:
>>>>>
>>>>> /**
>>>>> * The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
>>>>> * constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
>>>>> * Definition'. <!-- begin-user-doc --> <!-- end-user-doc -->
>>>>> *
>>>>> * @generated
>>>>> */
>>>>> public static final int
>>>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
>>>>> 71;
>>>>>
>>>>> To generate the name
>>>>> GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES emf
>>>>> uses entities names (in this case Gate, Definition and
>>>>> CompatibleGatesContainingCorrectValues). And this is ok, I can replicate
>>>>> what emf does...
>>>>>
>>>>> But...:
>>>>> - if the name of the entity is 'Gate_' the result is still
>>>>> GATE_DEFINITION (it seems the underscore is removed)
>>>>>
>>>>>
>>>>>
>>>> Yes.
>>>>
>>>>
>>> But it is removed only if it is at the end of a word? Underscores in the
>>> middle of a word are considered word breaks, that's right?
>>> So, 'Gate_' is like 'Gate', but 'Gate_A' is not like 'GateA' ?
>>>
>>>
>> Well I think Gate_A is like GateA because both will become "Gate", "A"
>> in the parsed word list.
>>
>>>
>>>
>>>>> - If the name is Gate_7 the result is 'GATE_7DEFINITION'...
>>>>>
>>>>>
>>>>>
>>>> Gross name. The _ causes a word break and the 7 being neutral just
>>>> associates with the next word.
>>>>
>>>>
>>> Ok, let's try something more difficult (sorry Ed!):
>>> Gate_1A3Definition generates as Gate_1A3_Definition. Why A3 is not
>>> separated but 3D it is?
>>>
>>>
>> Because it's assumed that words might contain or end with numbers, but
>> that they don't start with them.
>>
>>> What does mean that 7 is neutral?
>>>
>> It's treated more like a lower case letter.
>>
>>> The rule says 'place underscore
>>> before the last uppercase char in a row of at least 3', but a digit does
>>> affect the row count o not?
>>>
>>>
>> I could answer many questions but in the end, what I say must match
>> exactly the logic in the code.
>>
>>> GATEDefinition is a row of 5 uppercase letters so it becomes
>>> GATE_Definition, but GA6EDefinition?
>>>
>>>
>> Unfortunately I don't think any syntactic rule will every get is all
>> exactly right, e.g., XSDXPath is really intended to be XSD XPath. Or
>> EJBRDBMapping is maybe supposed to be ERJ_RDB_Mapping...
>>
>>>
>>>
>>>>> - If the name is already uppercase like 'GATE' the result is still
>>>>> GATE_DEFINITION, but according to the second rule it should be
>>>>> GAT_EDEFINITION...
>>>>>
>>>>>
>>>>>
>>>> The gory details of word breaking are in
>>>> org.eclipse.emf.codegen.util.CodeGenUtil.format.
>>>>
>>>>
>>>>> Thanks for helping!
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>> Thank you again!
>>>
>>>
>> If you don't like the name that's generated, you can always add by hand
>> a nicer variant; of course the generated code will always use the
>> generated form...
>>
> That's not the case, generated names are perfectly suitables. The fact
> is that I can't replicate them! The goal is to match generated names
> with my own generator.
>
Ah, I see. :-) Reusing the GenModel's methods that return the names
it's using would be best. Reusing CodeGenUtil would be second
best. And reimplementing the same algorithms by copy and paste would be
a last resort.
> Thank you Ed!
>
> Alex
>
>
>

--------------070906010209020301050103
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Alex,<br>
<br>
Comments below.<br>
<br>
<br>
Alex Nekoti wrote:
<blockquote cite="mid:ggovlh$ng7$1@build.eclipse.org" type="cite">
<pre wrap="">Ed Merks wrote:

</pre>
<blockquote type="cite">
<pre wrap="">Alex,

Comments below.

Alex Nekoti wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Thank you Ed for the answers, but I still have some troubles

Ed Merks wrote:


</pre>
<blockquote type="cite">
<pre wrap="">Alex,

Comments below.


Alex Nekoti wrote:

</pre>
<blockquote type="cite">
<pre wrap="">Hello,

I've some problems understanding how emg generates names for
Diagnostic's values. How are the rules for placing 'underscores'?


</pre>
</blockquote>
<pre wrap="">When generating constant names, we generally use an _ to separate words
(which are determined by where lower case letters become upper case
again, i.e, where the transitions in the camel case occur).

</pre>
</blockquote>
<pre wrap="">That's ok.


</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">I've found these rules:
- place underscore between lowercase and uppercase chars
- place underscore before the last uppercase char in a row of at least 3
uppercase chars, if it is not the last one in the string.


</pre>
</blockquote>
<pre wrap="">Yes, since it's quite common for acronyms to be used, i.e., XSDComponent
would become XSD_COMPONENT.

</pre>
</blockquote>
<pre wrap="">This is ok too.


</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">But how it works if in the name there is already an underscore?

</pre>
</blockquote>
<pre wrap="">Generally one should not be using _ in feature or type names when
following Java's language naming conventions...

</pre>
</blockquote>
<pre wrap="">I know. But I still have to consider cases where someone uses names not
according to java conventions.

</pre>
</blockquote>
<pre wrap="">If you use unconventional names it's reasonable that the results aren't
so conventional either. Where are such names coming from anyway that
they're beyond anyone's control?
</pre>
</blockquote>
<pre wrap="">From users! Sure, I can suggest that they use conventional names,
but if EMF allows non-standard names, I should be able to
match what EMF generates.
</pre>
</blockquote>
That assumes one algorithm can handle all possible inputs in a way
that humans consider ideal.&nbsp; I don't think that's 100% possible.<br>
<blockquote cite="mid:ggovlh$ng7$1@build.eclipse.org" type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap=""> And if
there are numbers? These are neither lowercase nor uppercase...


</pre>
</blockquote>
<pre wrap="">An _ will represent a word break. A number will not result in a word break.

</pre>
<blockquote type="cite">
<pre wrap="">Here's an example of what EMF generates:

/**
* The {@link org.eclipse.emf.common.util.Diagnostic#getCode() code} for
* constraint 'Validate Compatible Gates Containing Correct Values' of 'Gate
* Definition'. &lt;!-- begin-user-doc --&gt; &lt;!-- end-user-doc --&gt;
*
* @generated
*/
public static final int
GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES =
71;

To generate the name
GATE_DEFINITION__VALIDATE_COMPATIBLE_GATES_CONTAINING_CORRECT_VALUES, EMF
uses the entity names (in this case Gate, Definition and
CompatibleGatesContainingCorrectValues). This is fine, I can replicate
what EMF does...

But...:
- if the name of the entity is 'Gate_' the result is still
GATE_DEFINITION (it seems the underscore is removed)


</pre>
</blockquote>
<pre wrap="">Yes.

</pre>
</blockquote>
<pre wrap="">But is it removed only if it is at the end of a word? Underscores in the
middle of a word are considered word breaks, is that right?
So 'Gate_' is like 'Gate', but 'Gate_A' is not like 'GateA'?

</pre>
</blockquote>
<pre wrap="">Well I think Gate_A is like GateA because both will become "Gate", "A"
in the parsed word list.
</pre>
<blockquote type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">- If the name is Gate_7 the result is 'GATE_7DEFINITION'...


</pre>
</blockquote>
<pre wrap="">Gross name. The _ causes a word break and the 7 being neutral just
associates with the next word.

</pre>
</blockquote>
<pre wrap="">Ok, let's try something more difficult (sorry Ed!):
Gate_1A3Definition is generated as Gate_1A3_Definition. Why is A3 not
separated when 3D is?

</pre>
</blockquote>
<pre wrap="">Because it's assumed that words might contain or end with numbers, but
that they don't start with them.
</pre>
<blockquote type="cite">
<pre wrap="">What does it mean that 7 is neutral?
</pre>
</blockquote>
<pre wrap="">It's treated more like a lower case letter.
</pre>
<blockquote type="cite">
<pre wrap="">The rule says 'place underscore
before the last uppercase char in a row of at least 3', but does a digit
affect the row count or not?

</pre>
</blockquote>
<pre wrap="">I could answer many questions but in the end, what I say must match
exactly the logic in the code.
</pre>
<blockquote type="cite">
<pre wrap="">GATEDefinition is a row of 5 uppercase letters so it becomes
GATE_Definition, but GA6EDefinition?

</pre>
</blockquote>
<pre wrap="">Unfortunately I don't think any syntactic rule will ever get it all
exactly right, e.g., XSDXPath is really intended to be XSD XPath. Or
EJBRDBMapping is maybe supposed to be EJB_RDB_Mapping...
</pre>
<blockquote type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">- If the name is already uppercase like 'GATE' the result is still
GATE_DEFINITION, but according to the second rule it should be
GAT_EDEFINITION...


</pre>
</blockquote>
<pre wrap="">The gory details of word breaking are in
org.eclipse.emf.codegen.util.CodeGenUtil.format.

</pre>
<blockquote type="cite">
<pre wrap="">Thanks for helping!

Alex




</pre>
</blockquote>
</blockquote>
<pre wrap="">Thank you again!

</pre>
</blockquote>
<pre wrap="">If you don't like the name that's generated, you can always add a nicer
variant by hand; of course the generated code will always use the
generated form...
</pre>
</blockquote>
<pre wrap="">That's not the issue; the generated names are perfectly suitable. The
problem is that I can't replicate them! The goal is to match the generated
names with my own generator.
</pre>
</blockquote>
Ah, I see. :-)&nbsp; Reusing the GenModel's methods that return the names
it's using would be best.&nbsp; Reusing CodeGenUtil would be second
best.&nbsp; And reimplementing the same algorithms by copy and paste would
be a last resort.<br>
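<br>
For the last-resort route, the word-breaking rules discussed in this thread can be approximated roughly as follows. This is only a sketch under stated assumptions, not the logic in CodeGenUtil: the class and method names (ConstantNameSketch, toConstantName, diagnosticCodeName) are made up for illustration, and digits are assumed never to cause a word break. With that assumption it reproduces GATE_7DEFINITION, but not the break after the 3 reported above for Gate_1A3Definition, so the real code must be consulted for exact matching.<br>

```java
public class ConstantNameSketch {

    /**
     * Hypothetical approximation of the rules described in this thread:
     * an underscore is a word break and is dropped, a lower-to-upper
     * transition is a word break, and the last upper-case letter of an
     * acronym starts a new word when a lower-case letter follows it.
     * Digits never trigger a break here (an assumption, see the lead-in).
     */
    public static String toConstantName(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '_') {
                continue; // the break is emitted when the next word starts
            }
            boolean wordStart = false;
            if (i > 0) {
                char prev = name.charAt(i - 1);
                boolean nextIsLower = i + 1 < name.length()
                        && Character.isLowerCase(name.charAt(i + 1));
                wordStart = prev == '_'
                        || (Character.isUpperCase(c) && Character.isLowerCase(prev))
                        || (Character.isUpperCase(c) && Character.isUpperCase(prev) && nextIsLower);
            }
            // Never emit a leading separator, e.g. for "_Gate".
            if (wordStart && out.length() > 0) {
                out.append('_');
            }
            out.append(Character.toUpperCase(c));
        }
        return out.toString();
    }

    /** Joins classifier and constraint names with a double underscore, as in the generated constant quoted above. */
    public static String diagnosticCodeName(String classifierName, String constraintName) {
        return toConstantName(classifierName) + "__" + toConstantName(constraintName);
    }

    public static void main(String[] args) {
        System.out.println(toConstantName("XSDComponent"));     // XSD_COMPONENT
        System.out.println(toConstantName("Gate_"));            // GATE
        System.out.println(toConstantName("Gate_7Definition")); // GATE_7DEFINITION
        System.out.println(toConstantName("XSDXPath"));         // XSDX_PATH, not the intended XSD_XPATH
        System.out.println(diagnosticCodeName("GateDefinition",
                "ValidateCompatibleGatesContainingCorrectValues"));
    }
}
```

The XSDXPath case shows the limitation mentioned earlier: a purely syntactic rule cannot know where one acronym ends and the next word begins.<br>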
<blockquote cite="mid:ggovlh$ng7$1@build.eclipse.org" type="cite">
<pre wrap="">
Thank you Ed!

Alex


</pre>
</blockquote>
</body>
</html>



Ed Merks
Professional Support: https://www.macromodeling.com/