Re: [pdt-dev] H2Cache questions

Hi,

It's me again.

I modified the PHP performance tests bundle (hardcoded the ZF location and enabled all tests from ProjectSuite). The builds under test:
Patched DLTK - my GitHub indexing branch
Original DLTK - DLTK master branch
Patched PDT - PDT master + https://git.eclipse.org/r/#/c/28183/

Test machine: MBP 17", OS X 10.9.3, 12 GB RAM, SSD, Oracle Java 1.7u55
CMD: rm -rf ~/.perf_results/ && mvn clean install -o

In general the test results are comparable; sometimes the first configuration is faster, sometimes the second:

Patched PDT + Patched DLTK :
  Test Time: 39.651s 
  Total time: 57.579s
  Memory: 84M/1110M

Patched PDT + Original DLTK:
  Test Time: 40.281s
  Total time: 58.762s
  Memory: 83M/1109M

The results are similar for the standard PDT tests, but the profiler output looks even better:
1. Open Eclipse with PDT
2. Import any large project
3. Enable all PDT features (folding, highlighting, etc.)
4. Open a 1k-line file
5. Start working on the file
6. Connect any profiler (JVM Monitor, NetBeans, YourKit, JProfiler, whatever) and look at the live results
7. Observe: less CPU usage with the modified DLTK

Please review ;)

I think that later, instead of the current h2cache approach (load everything, then manually iterate over hash maps), we could try running a second H2 instance in in-memory mode (copying the on-disk database into the in-memory database on startup).
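
A minimal sketch of the copy-on-startup idea, not the DLTK implementation. It uses Python's built-in sqlite3 module as a stand-in for H2 only because it runs without extra dependencies; in H2 the in-memory side would be a `jdbc:h2:mem:` database instead. The table name, column names and sample rows are hypothetical.

```python
import os
import sqlite3
import tempfile


def load_into_memory(disk_path):
    """Copy an on-disk database into a fresh in-memory one and return it."""
    disk = sqlite3.connect(disk_path)
    memory = sqlite3.connect(":memory:")
    try:
        # Connection.backup copies the whole source database into the target.
        disk.backup(memory)
    finally:
        disk.close()
    return memory


def demo():
    # Build a tiny on-disk "index" database (hypothetical schema and data).
    workdir = tempfile.mkdtemp()
    disk_path = os.path.join(workdir, "index.db")
    disk = sqlite3.connect(disk_path)
    disk.execute(
        "CREATE TABLE elements (id INTEGER PRIMARY KEY, name TEXT, file TEXT)"
    )
    disk.execute("CREATE INDEX idx_elements_name ON elements (name)")
    disk.executemany(
        "INSERT INTO elements (name, file) VALUES (?, ?)",
        [
            ("render", "View.php"),
            ("dispatch", "Front.php"),
            ("render", "Layout.php"),
        ],
    )
    disk.commit()
    disk.close()

    # "Startup": load the disk database into memory once.
    memory = load_into_memory(disk_path)

    # Lookups are plain SQL against the in-memory copy, so the database's own
    # indexes do the work instead of hand-written iteration over hash maps.
    rows = memory.execute(
        "SELECT file FROM elements WHERE name = ? ORDER BY file", ("render",)
    ).fetchall()
    memory.close()
    return [file for (file,) in rows]
```

Calling `demo()` returns the files that declare `render`. Writes would still have to go to both databases (or be replayed to disk), which this sketch leaves out.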

Sorry for the spam and my English :P

--
Dawid Pakuła
+48 795 996 064


From: Jacek Pospychała jacek.pospychala@xxxxxxxxx
Reply: PDT Developers pdt-dev@xxxxxxxxxxx
Date: 8 June 2014 at 08:56:09
To: PDT Developers pdt-dev@xxxxxxxxxxx
Subject:  Re: [pdt-dev] H2Cache questions

great work Dawid!
So does it mean that this h2cache is useless? Or maybe content assist
now just shows fewer matches?
It'd be great to see a performance test that exhibits the difference
before and after applying your changes.

On Sun, Jun 8, 2014 at 4:59 AM, Dawid Pakuła <zulus@xxxxxxxxx> wrote:
> Hi,
>
> I made some Java and SQL optimizations:
> * Indexes
> * Subqueries replaced by inner joins
> * 1=1 removed from the WHERE clause ;)
> * Tweaks to H2ConnectionFactory
>
> Now, on large projects in my test environment, h2cache is not required. For
> me, everything is faster and uses less memory.
>
> Here is my experiment:
> https://github.com/zulus/dltk.core/commit/5be4dc62ffed6c19a431b91915c1b9010c7363bb
>
> How I'm testing:
>
> * 2 large Symfony 2 projects in the workspace
> * Enable all highlighters
> * Open a 2k-line file with many nested calls
> * Format the file, then run content assist (CA)
> * Check your profiler ;)
>
> Advantages:
>
> * Faster Eclipse startup!
> * Open PHP Type always works (without "wait for indexer")
>
> Before I send a patch to bugzilla, I have to perform some code cleanup. I
> also see two other points for optimization in h2 indexer.
>
> --
> Dawid Pakuła
> +48 795 996 064
>
> From: Kaloyan Raev kaloyan.r@xxxxxxxx
> Reply: PDT Developers pdt-dev@xxxxxxxxxxx
> Date: 23 October 2013 at 09:30:15
> To: PDT Developers pdt-dev@xxxxxxxxxxx
> Subject: Re: [pdt-dev] H2Cache questions
>
> The test I ran yesterday was only for importing a huge existing project into the
> workspace. There is a search operation for each file before it is indexed
> (to avoid indexing the same file twice). In this case the h2 cache is indeed
> an overhead, because most of the time the searched file is available
> neither in the h2 cache nor in the h2 db.
>
> Today I ran an additional test: I searched for a method reference in this
> same huge project. Well, without the cache this operation is 3 times slower.
> My system has an SSD. I guess that on an HDD it will be even slower.
>
> So it seems that the h2 cache really optimizes search operations.
>
> Kaloyan
>
>
> On Tue, Oct 22, 2013 at 6:35 PM, Alexey Panchenko <alex.panchenko@xxxxxxxxx>
> wrote:
>>
>> Hi Kaloyan,
>>
>> How often does your performance test execute search operations? What
>> operations are executed at all in the test?
>>
>> I'm afraid this indexer is at the moment used only by PDT, so I don't have
>> any performance data.
>>
>> Regards,
>> Alex
>>
>>
>> On Tue, Oct 22, 2013 at 8:59 PM, Kaloyan Raev <kaloyan.r@xxxxxxxx> wrote:
>>>
>>> Hi again,
>>>
>>> I did a quick experiment with removing H2Cache. My performance tests show
>>> slight improvement without this cache.
>>>
>>> Alex, I'll be curious to hear if it's the same in your adopter's product.
>>> Here is a commit to cherry pick:
>>> https://github.com/kaloyan-raev/dltk.core/commit/e8bfa12aa5408341d230c57530474db281ef132c
>>>
>>> Greetings,
>>> Kaloyan
>>>
>>>
>>> On Mon, Oct 21, 2013 at 10:10 AM, Kaloyan Raev <kaloyan.r@xxxxxxxx>
>>> wrote:
>>>>
>>>> Hi Alex,
>>>>
>>>> The same thoughts crossed my mind when I worked on improving the
>>>> performance in Zend Studio a couple of months ago.
>>>>
>>>> H2Cache is a set of maps with strong references, which makes it really
>>>> look more like an in-memory copy of the h2 db, rather than a cache. Over
>>>> time, I suspect, this may cause memory consumption problems.
>>>>
>>>> I suppose that the H2Cache was introduced in the past, because of some
>>>> inefficiencies in the h2 db schema - remember eclip.se/415137. But now, when
>>>> the necessary index is added to the schema, the benefits of H2Cache are not
>>>> really visible.
>>>>
>>>> One of the ideas on my to-do list for performance optimizations is indeed
>>>> to try removing the H2Cache and measure the impact. Unfortunately, I was
>>>> distracted from the performance topic with other things, but I hope I'll be
>>>> back on it very soon.
>>>>
>>>> Greetings,
>>>> Kaloyan
>>>>
>>>>
>>>> On Sat, Oct 19, 2013 at 10:13 AM, Alexey Panchenko
>>>> <alex.panchenko@xxxxxxxxx> wrote:
>>>>>
>>>>> Hi PDT-team,
>>>>>
>>>>> I have some questions regarding this class
>>>>>
>>>>>
>>>>> http://git.eclipse.org/c/dltk/org.eclipse.dltk.core.git/log/core/plugins/org.eclipse.dltk.core.index.sql.h2/src/org/eclipse/dltk/internal/core/index/sql/h2/H2Cache.java
>>>>>
>>>>> which was contributed some time ago by Michael and committed by Roy.
>>>>>
>>>>> As I understand the code, it looks like *all* the data from SQL
>>>>> database is loaded into this class and then updates happen to both the
>>>>> in-memory copy and the underlying SQL database.
>>>>> For me, that effectively defeats the purpose of the SQL database, as the
>>>>> same result could be reached by eventually saving the data to a file
>>>>> using Java serialization.
>>>>>
>>>>> So, I am curious about the following:
>>>>> - how much memory does it use?
>>>>> - is it supposed to be a cache (and contain recently used data) or a
>>>>> full in-memory copy?
>>>>> - how much does it improve the performance? Can the same effect be
>>>>> reached in other ways?
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>> _______________________________________________
>>>>> pdt-dev mailing list
>>>>> pdt-dev@xxxxxxxxxxx
>>>>> https://dev.eclipse.org/mailman/listinfo/pdt-dev