Re: [cdt-dev] Of Char[] and String


From: Alex Blewitt <alex.blewitt@xxxxxxxxx>
Date: Tue, 21 Jul 2009 08:15:22 +0100

On 21 Jul 2009, at 00:13, musset <musset@xxxxxxxxxxx> wrote:

When you don't need all the algorithms provided by class String, char arrays are much more efficient than Strings. The main problem with String is that its representation is final (all Strings are constants). Thus you are always copying the same characters when you use it in your methods.

But passing a String around doesn't copy the characters.

In fact, calls like .toCharArray() are generally the inefficient part, since each call requires duplicating the array itself.
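To illustrate the two points above, here is a minimal sketch (the class name CopyDemo and the method passThrough are made up for this example). It assumes nothing beyond the standard library: passing a String hands over a reference, while every toCharArray() call returns a fresh copy.

```java
class CopyDemo {
    // Hypothetical helper: receives and returns the same reference.
    static String passThrough(String s) {
        return s;
    }

    public static void main(String[] args) {
        String s = "token";

        // Passing a String passes a reference; no characters are copied.
        System.out.println(passThrough(s) == s); // prints true

        // Each toCharArray() call allocates a new array and copies into it.
        char[] a = s.toCharArray();
        char[] b = s.toCharArray();
        System.out.println(a == b); // prints false

        // Mutating the copy leaves the String untouched.
        a[0] = 'T';
        System.out.println(s); // prints token
    }
}
```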

I believe the original issue (in JDT) was more to do with PermGen space, but that tends to be less of an issue if you don't intern() all the time.

But having the literal in the source file defeats the point of not using Strings in the first place.

For further explanations, I recommend this excellent book by Jack: Java Performance Tuning, now available on Google books http://books.google.fr/books?id=iPHtCfZQyqQC.
Read page 150 et seq.

--
Nicolas

On Mon, 20 Jul 2009 15:51:55 -0400, Doug Schaefer wrote:

Ah, memories ;)

This came at a time when we were very worried about size/performance of the parser. The JDT seemed to get away with things by using char arrays instead of Strings. Essentially, that removes the extra wrapper objects, which is significant since the parser creates a lot of "strings". To keep things uniform, we used char arrays all over the place.
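As a rough sketch of what "removing the extra objects" can look like, the hypothetical helper below compares a region of a source buffer against a keyword without allocating a String per token. The names CharRegion and matches are invented for illustration and are not taken from the CDT or JDT code.

```java
class CharRegion {
    // Hypothetical helper: compares buf[start, start + len) with keyword
    // in place, so no intermediate String or array is allocated.
    static boolean matches(char[] buf, int start, int len, char[] keyword) {
        if (len != keyword.length || start < 0 || start + len > buf.length) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (buf[start + i] != keyword[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] source = "int x = 1;".toCharArray();
        char[] kwInt = {'i', 'n', 't'};
        System.out.println(matches(source, 0, 3, kwInt)); // prints true
        System.out.println(matches(source, 4, 1, kwInt)); // prints false
    }
}
```

The String-based equivalent would be something like new String(source, 0, 3).equals("int"), which allocates a new String (and a copy of the characters) for every token examined.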

I don't get your statement about String being more efficient than char array, since all String does is wrap char arrays with algorithms. I would do a little more research on whether Strings are actually smaller. Constants, maybe, since we may be generating code to convert them from compiler-generated Strings. A good compiler would optimize that out. At any rate, we don't have that many constants, so we didn't worry about that much.

On Mon, Jul 20, 2009 at 2:43 PM, Alex Blewitt <alex.blewitt@xxxxxxxxx> wrote:

As a general observation, I'm confused by the amount of char[] usage in the CDT codebase. Is this a general consequence of C programmers working in Java, or are there underlying reasons? I happened to come across this today:

   private static final char[] EMPTY_CHAR_ARRAY = new char[0];
   private static final char[] ONE = "1".toCharArray(); //$NON-NLS-1$

The problem with char[] is that it's generally less efficient for storage than the underlying String model, and in any case, you end up with the String literal being backed by a similar array in the first place (and that literal is then interned).
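The interning point can be checked with a small sketch (the class and method names here are made up for illustration). It relies only on standard behaviour: identical String literals resolve to one shared instance, whereas a String built from a char[] at run time is a separate object until intern() is called.

```java
class InternDemo {
    static final char[] ONE_CHARS = "1".toCharArray();
    static final String ONE = "1";

    // Both sides are compile-time constants, so they share one interned instance.
    static boolean literalsShared() {
        return ONE == "1";
    }

    // A String built at run time from the array is a distinct object.
    static boolean freshStringShared() {
        return new String(ONE_CHARS) == ONE;
    }

    // intern() maps it back to the shared instance.
    static boolean internedShared() {
        return new String(ONE_CHARS).intern() == ONE;
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());    // prints true
        System.out.println(freshStringShared()); // prints false
        System.out.println(internedShared());    // prints true
    }
}
```

Note that the "1" literal still exists in the class even when only the char[] constant is declared, because toCharArray() has to be called on it at class initialization.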

Consider the following class:

public class Test {
  // Compile with exactly one of the three declarations enabled.
  public static final char[] foo = "1".toCharArray();
  // public static final char[] foo = {'1'};
  // public static final String foo = "1";
}

If I compile this (Mac OS X with Java 6), I get the following class file sizes:

char[] with toCharArray() = 329 bytes
char[] with in-line array = 272 bytes
String = 248 bytes

What I can't understand is why we keep the string "1" (which will take up space in the class's intern pool) and then take up more space on top of that than if we'd just used the string on its own.

There's probably a reason, but one that isn't immediately obvious to me. Perhaps someone could enlighten me? It's probably all related to the fact that Token has a char[] getCharImage(), but that in itself just raises the question 'why doesn't that return a String ...'

Alex

_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cdt-dev
