Openfire goes down with an out-of-memory exception

Hi everybody!

Until recently my Openfire server worked fine with about 1700 simultaneously connected users.

But after the number of simultaneous connections grew to 2100, the Openfire service started going down periodically (it happens when all the users start chatting during work hours, before and after the lunch break).

My logs show that before each shutdown there are java.lang.OutOfMemoryError: unable to create new native thread errors.

Here is part of the error log:

2009.09.29 12:00:27 [org.jivesoftware.openfire.nio.ConnectionHandler.exceptionCaught(ConnectionHandler.java:110)]
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at net.kano.joscar.ratelim.QueueRunner.startThread(QueueRunner.java:125)
at net.kano.joscar.ratelim.QueueRunner.startThreadIfNecessary(QueueRunner.java:119)
at net.kano.joscar.ratelim.QueueRunner.update(QueueRunner.java:100)
at net.kano.joscar.ratelim.ConnectionQueueMgrImpl.queueSnac(ConnectionQueueMgrImpl.java:197)
at net.kano.joscar.ratelim.RateLimitingQueueMgr.queueSnac(RateLimitingQueueMgr.java:179)
at net.kano.joscar.snac.ClientSnacProcessor.sendSnac(ClientSnacProcessor.java:547)
at org.jivesoftware.openfire.gateway.protocols.oscar.AbstractFlapConnection.sendRequest(AbstractFlapConnection.java:154)
at org.jivesoftware.openfire.gateway.protocols.oscar.OSCARSession.handleRequest(OSCARSession.java:583)
at org.jivesoftware.openfire.gateway.protocols.oscar.OSCARSession.request(OSCARSession.java:606)
at org.jivesoftware.openfire.gateway.protocols.oscar.OSCARSession.request(OSCARSession.java:600)
at org.jivesoftware.openfire.gateway.protocols.oscar.OSCARSession.updateStatus(OSCARSession.java:797)
at org.jivesoftware.openfire.gateway.BaseTransport.processPacket(BaseTransport.java:378)
at org.jivesoftware.openfire.gateway.BaseTransport.processPacket(BaseTransport.java:198)
at org.jivesoftware.openfire.component.InternalComponentManager$RoutableComponents.process(InternalComponentManager.java:619)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:260)
at org.jivesoftware.openfire.roster.Roster.broadcastPresence(Roster.java:590)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.broadcastUpdate(PresenceUpdateHandler.java:283)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:124)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:112)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:176)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:134)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:70)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:76)
at org.jivesoftware.openfire.net.StanzaHandler.processPresence(StanzaHandler.java:337)
at org.jivesoftware.openfire.net.ClientStanzaHandler.processPresence(ClientStanzaHandler.java:85)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:254)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:176)
at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:133)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.IoFilterAdapter.messageReceived(IoFilterAdapter.java:80)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:185)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:239)
at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:283)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Unknown Source)

And from the warning log:

2009.09.29 12:01:00 handle failed
java.lang.IllegalStateException
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:360)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)
2009.09.29 12:01:00 dispatch failed!
2009.09.29 12:01:00 EXCEPTION
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at org.mortbay.thread.QueuedThreadPool.newThread(QueuedThreadPool.java:436)
at org.mortbay.thread.QueuedThreadPool.dispatch(QueuedThreadPool.java:143)
at org.mortbay.jetty.nio.SelectChannelConnector$1.dispatch(SelectChannelConnector.java:86)
at org.mortbay.io.nio.SelectChannelEndPoint.dispatch(SelectChannelEndPoint.java:67)
at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:471)
at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:166)
at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)

I'm running Windows Server 2003 Enterprise (x86) with 2 GB of RAM (plus a page file of 2 to 4 GB) on a Xeon 3.2 GHz, with SQL Server as the external database.

The Java memory settings in the openfire-service.vmoptions file are:

-Xms512m
-Xmx1024m

My guess was that Java does not have enough memory. But the web console says there is more than enough memory with 1800 users connected:

Java Version: 1.6.0_03 Sun Microsystems Inc. – Java HotSpot™ Server VM
Appserver: jetty-6.1.x
OS / Hardware: Windows 2003 / x86
Locale / Timezone: en / Vladivostok (Russia) Time (10 GMT)
Java Memory
139,15 MB of 986,12 MB (14,1%) used

And in the Windows Task Manager, openfire-service uses about 350 MB of memory. The whole system uses 900 MB of RAM.
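One thing worth checking at this point: "unable to create new native thread" is raised when the JVM cannot start another thread, and that can happen while the heap figure in the console still looks healthy. Below is a minimal sketch that prints heap usage next to the live thread count using the standard java.lang.management API. It reports on the JVM it runs in, so it is assumed to run inside the Openfire process (for example from a plugin). The class name JvmSnapshot is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Openfire.
final class JvmSnapshot {

    /** Formats heap usage and thread counts of the current JVM on one line. */
    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long mb = 1024L * 1024L;
        return String.format(
                "heap used=%d MB, heap max=%d MB, live threads=%d, peak threads=%d",
                heap.getUsed() / mb,
                heap.getMax() / mb,            // getMax() may be -1 if undefined
                threads.getThreadCount(),      // threads alive right now
                threads.getPeakThreadCount()); // highest count since JVM start
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the thread count climbs together with the number of connected users while the heap stays low, the limit being hit is probably native memory for thread stacks rather than the Java heap.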

So my question is: what is wrong with my server, and what do I need to do to avoid these errors?

Thanks to everybody for helping me resolve this problem.

P.S. I have developed some plugins for Openfire and want to share them. Please give me an e-mail contact I can write to about this.

Hi there,

I think I’m having the same problem.

See http://kraken.blathersource.org/node/197

and http://kraken.blathersource.org/node/28#comment-653

Let me know your outcome / thoughts.

Cheers,

M

I think I found the problem

My server had only 2 GB of RAM. Adding another 2 GB did not help by itself.

After I changed the Windows Server settings to allow 3 GB for user processes and set the thread stack size to 128k with -Xss, the server has been working fine to this day.

What happened: I looked at the Openfire sources and the JVM native memory allocation docs. For every new connection, Openfire (through the JVM) creates a new thread, and that thread's memory is allocated from NATIVE memory, not from the Java heap. That is why the -Xmx and -Xms parameters do not help here.
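A rough back-of-the-envelope sketch of that reasoning in Java: each thread stack has to fit into whatever part of the process address space is left after the heap and other native allocations. The 200 MB of other native overhead and the 320 KB default stack size are assumptions for illustration, not measured values, and the class name ThreadBudget is hypothetical.

```java
// Rough estimate only: a real process also loses address space to DLLs,
// the JIT code cache, the permanent generation and fragmentation.
final class ThreadBudget {
    static final long KB = 1024L;
    static final long MB = 1024L * 1024L;

    /** Upper bound on the number of thread stacks that fit beside the heap. */
    static long maxThreads(long userAddressSpace, long heapBytes,
                           long otherNativeBytes, long stackBytes) {
        long free = userAddressSpace - heapBytes - otherNativeBytes;
        return free <= 0 ? 0 : free / stackBytes;
    }

    public static void main(String[] args) {
        long other = 200 * MB; // ASSUMPTION: misc native overhead

        // Before: 2 GB user address space, -Xmx1024m,
        // ASSUMED default stack size of 320 KB per thread.
        long before = maxThreads(2048 * MB, 1024 * MB, other, 320 * KB);

        // After: 3 GB user address space, -Xmx1200m, -Xss128k.
        long after = maxThreads(3072 * MB, 1200 * MB, other, 128 * KB);

        System.out.println("estimated max threads before: " + before);
        System.out.println("estimated max threads after:  " + after);
    }
}
```

With these assumed numbers the estimate goes from roughly 2,600 threads to roughly 13,000, which is in line with the server falling over a bit above 2000 users before the change and not after it.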

Going from 2 GB to 4 GB of RAM and setting the /3GB switch for user processes allowed me to handle more connections without any errors.

Today it's at about 2100 users permanently and still growing (in the Russian tradition: knock, knock, knock on wood).

Hi Kot,

We’ve just started load testing before we go live and we’re also getting out of memory errors after a certain amount of time. We get to about 1500 users and the memory and CPU go up and up and up. We’ve removed all plugins to make sure it’s not something we have introduced.

What are the settings you are using? We have:

OPENFIRE_OPTS="-XX:+PrintGCDetails -Xloggc:/tmp/openfire-gc.log -XX:+HeapDumpOnOutOfMemoryError -Xms256m -Xmx2048m -Xss1024k -Xoss1024k -XX:ThreadStackSize=1024".

Thanks,

Michael

I’ve successfully load-tested 1200 messages per second (72,000 packets/minute) on OpenFire 3.6.4 running on Linux (2.6.18 kernel) before running out of CPU (single-core 3 GHz). The only performance adjustment was increasing the Java heap size a bit: OPENFIRE_OPTS="-Xms256m -Xmx512m". Maybe your database is eating up the available memory?

thanks noahd - how long did your test run before you ran out of CPU? I’m using MySQL for the database…

We ran the test three times, each for 10 minutes. OpenFire never crashed, but message delivery slowed down when we were maxing out the CPU. It would take 5-10 seconds from pressing enter to receiving the message. We are also using MySQL (v5), and it made up a substantial amount of that CPU load in the testing.

Which JRE are you using? Sun Java 6? or Java 1.5? We’re using the JRE packaged with the OpenFire RPM.

We’re using the RPM installation so it’s the Java that’s bundled with that. It takes a certain amount of time for the memory leak (if that’s actually what it is) to manifest so we don’t normally see it until some minutes in. We’ve been running load tests for 1 hour + at a time today and we get consistent results each time.

Going to try again with a clean OS build on Monday as we’re testing against Fedora 10. Just in case : )

Thanks!

Hmm, I haven’t experienced any type of memory leak. Currently my OpenFire instance has been online for 114 days straight. It normally sits around 1300 concurrent connections, and is steadily using around 200-300 MB of Java memory. Could you post the top output from one of your load tests?

(edit) for cut & paste error

Regarding your request about Java settings…

As I wrote above, my settings of java machine are:

-Xms1200m
-Xmx1200m
-Xss128k

physical machine:

4 GB of RAM (3 GB user mode), 4 GB page file

Windows Server 2003 Enterprise (x86), 2x Xeon 3.2 GHz.
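To confirm that the JVM actually picked up options such as -Xss128k from the vmoptions file, the arguments it was started with can be read back through the standard RuntimeMXBean. This is only a sketch: JvmFlags and stackSizeFlag are hypothetical names, it looks only at -Xss (not at -XX:ThreadStackSize), and it has to run inside the JVM being checked.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Openfire.
final class JvmFlags {

    /**
     * Returns the value of the last -Xss option (e.g. "128k"),
     * or null if none is present. ASSUMPTION: when the option is
     * repeated, the later occurrence is the one that counts.
     */
    static String stackSizeFlag(List<String> jvmArgs) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xss")) {
                found = arg.substring("-Xss".length());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM (not to the main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String xss = stackSizeFlag(jvmArgs);
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println(xss == null
                ? "No -Xss option set, default stack size in use"
                : "Thread stack size option: -Xss" + xss);
    }
}
```

For the settings listed above, stackSizeFlag would return "128k" for the argument list [-Xms1200m, -Xmx1200m, -Xss128k].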

Thanks noahd, that’s encouraging, I’ll try to get some load test output.

Where do I find these settings to change? I’ve been searching around for a while and I have the same problem. I’m using a Windows 2003 VM.

Which settings are you looking for? Is it the JVM memory tuning or the Windows memory settings?