[orion-dev] Orion out of memory errors explained

Last week we started getting out of memory errors on our Orion server. I have finally figured out what was going on, and am posting a synopsis here in case anyone else runs into this problem on their own Orion servers (and for future google searches on the topic). The errors we were getting looked like this:

Caused by: Map failed
        at org.apache.lucene.index.SegmentCoreReaders.<init>(
        at org.apache.lucene.index.SegmentReader.get(
        at org.apache.lucene.index.IndexWriter$ReaderPool.get(
        at org.apache.lucene.index.IndexWriter.mergeMiddle(
        at org.apache.lucene.index.IndexWriter.merge(
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(
        at org.apache.lucene.index.ConcurrentMergeScheduler$

The problem was persistent: after a server restart it would fail again within 30 minutes. The end-user symptom was that indexed search operations on the client side were completely broken.
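For anyone wondering where this non-heap memory comes from: the "Map failed" error is thrown when FileChannel.map() cannot reserve address space for a memory-mapped file, which is what Lucene does (presumably via its MMapDirectory) when opening index segment files. A minimal sketch of the same call, using a throwaway temp file rather than a real Lucene index:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    public static void main(String[] args) throws IOException {
        // A small temp file stands in for a Lucene segment file.
        Path file = Files.createTempFile("mmap-demo", ".bin");
        Files.write(file, new byte[]{1, 2, 3, 4});

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // map() consumes virtual address space *outside* the JVM heap;
            // under a tight ulimit -v, this is the call that fails with
            // "java.io.IOException: Map failed".
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println(buf.get(0) + buf.get(3)); // prints 5
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The key point is that the mapped buffer is charged against the process's virtual memory limit, not against -Xmx, which is why a large heap can starve these mappings.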

This is not a JVM heap space problem. Lucene is trying to allocate direct non-heap memory for memory-mapped I/O. The physical machine we run the server on has 6GB of RAM, of which 4GB was allocated as JVM heap space. The server is a SLES 11 machine, configured with a per-process virtual memory limit (ulimit -v) of roughly 5GB. Essentially, we were allocating too much memory to the heap, leaving too little left over for the process's non-heap allocations, such as memory-mapped file buffers, class file space, etc.

I resolved the problem by actually *decreasing* the JVM heap allocation to 3GB, thereby leaving more memory available to the process for non-heap use. After decreasing the heap size to 3GB we have run for 4 days with no further errors. The problem could alternatively be solved by increasing the per-process virtual memory limit. Today I increased that limit to 10GB to give us some breathing room, so we can grow the heap again later if needed without running into this problem.
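The fix boils down to two knobs: the per-process virtual memory limit and the JVM heap ceiling. A sketch of the commands involved (the 10GB limit and 3GB heap match the values above; "orion.jar" is a placeholder for however your server is actually launched):

```shell
# Show the current per-process virtual memory limit, in KB.
ulimit -v

# Raise the limit to 10GB (ulimit -v takes KB: 10 * 1024 * 1024 = 10485760).
# This must run in the shell that launches the server, before launching it.
ulimit -v 10485760

# Launch with a smaller heap so non-heap mappings have room left over.
java -Xmx3g -jar orion.jar
```

Note that ulimit applies per shell session; to make the limit permanent on SLES you would set it in /etc/security/limits.conf for the user the server runs as.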
