Emmanuel Touzery
2018-10-22 14:04:52 UTC
Hello,
we have a tomee+ 7.0.3 installation with activemq, using kahadb as
the persistent message store. We have an activemq.xml, which we plugged
in through:
BrokerXmlConfig = xbean:file:/opt/app/tomee/conf/activemq.xml
in the tomee.xml. The activemq broker runs within TOMEE:
ServerUrl = tcp://127.0.0.1:61616
We have a prefetch of 2000:
<transportConnector name="nio"
uri="nio://0.0.0.0:61616?jms.prefetchPolicy.all=2000"/>
We use mKahaDB. We disabled flow control.
So that everything would work, we had to add the following JARs to
the TOMEE lib folder:
activemq-spring-5.14.3.jar
spring-beans-3.2.9.RELEASE.jar
spring-context-3.2.9.RELEASE.jar
spring-core-3.2.9.RELEASE.jar
spring-expression-3.2.9.RELEASE.jar
spring-web-3.2.9.RELEASE.jar
xbean-spring-3.9.jar
We are "reading" from JMS through message-driven beans,
implementing MessageListener and with @MessageDriven annotations.
The application is pretty simple... Receive the data from
HTTP/JSON, and store it to SQL (through hibernate).
Everything works fine as long as the traffic is normal. However
when there is a surge of incoming traffic, sometimes the JMS consumers
stop getting called, and the queue only grows. The issue does not get
fixed until TOMEE is restarted. And then we've seen the issue reappear
maybe 40 minutes later. After a while, the server clears the queue
and everything is fine again.
We took a jstack thread dump of the application when it's in that
"hung" state:
https://www.dropbox.com/s/p8wy7uz6inzsmlj/jstack.txt?dl=0
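The idle-consumer check done by hand on the jstack output could also be done from inside the same JVM. The following is only a minimal sketch under assumptions: the class name `ThreadStateSummary` and the thread-name fragment passed in are hypothetical (the real consumer thread names have to be read from the thread dump), and the code must run inside the TOMEE process, for example from a diagnostic servlet.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of TOMEE or ActiveMQ.
public class ThreadStateSummary {

    /**
     * Counts the live threads of this JVM whose name contains nameFragment,
     * grouped by thread state (RUNNABLE, WAITING, BLOCKED, ...).
     */
    public static Map<Thread.State, Integer> countStates(String nameFragment) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().contains(nameFragment)) {
                counts.merge(t.getState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The fragment is an assumption; pass the one seen in the thread dump.
        String fragment = args.length > 0 ? args[0] : "main";
        System.out.println(countStates(fragment));
    }
}
```

Logging this summary every few seconds would show whether the consumer threads move to WAITING all at once or in steps, without having to catch the hang with jstack.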
What's interesting is that the write rate falls quite fast, and in
steps: generally not all at once, but not slowly either:
[Image not available]
After a restart things are fine again immediately.
We're not sure what the cause is. From what we can tell from the
thread dump, the consumers are idle; they just don't get notified that
work is available. The server is certainly aware there are items in the
queue: we monitor the queue through JMX and the queue size keeps growing
during these episodes. We don't see anything out of the ordinary in the
logs. We looked at thread IDs for consumers just before the issue, and it
doesn't look like the consumers deadlock one after the other, for
instance. It seems that a bunch of them are called in the last
minute before the dropoff. Also, during a blackout the JDBC
pool usage is at 0 according to our JMX monitoring, so it doesn't seem
to be about a deadlocked JDBC connection.
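The JMX observation above (queue size growing while nothing is consumed) can be turned into an automatic check. This is a minimal sketch, not our actual monitoring: the class `QueueStallCheck`, the broker name and the queue name are hypothetical, and it assumes the `org.apache.activemq:type=Broker,...` object-name layout and the `QueueSize` / `DequeueCount` attributes that ActiveMQ 5.8+ exposes for queues.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of TOMEE or ActiveMQ.
public class QueueStallCheck {

    /**
     * Builds the JMX name of a queue MBean (ActiveMQ 5.8+ naming scheme).
     * Names containing characters such as ',' or '=' would need escaping.
     */
    public static ObjectName queueName(String brokerName, String queue) {
        try {
            return new ObjectName("org.apache.activemq:type=Broker,brokerName=" + brokerName
                    + ",destinationType=Queue,destinationName=" + queue);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Reads {QueueSize, DequeueCount} from the queue MBean. */
    public static long[] sample(MBeanServerConnection conn, ObjectName name) throws Exception {
        long size = ((Number) conn.getAttribute(name, "QueueSize")).longValue();
        long dequeued = ((Number) conn.getAttribute(name, "DequeueCount")).longValue();
        return new long[] {size, dequeued};
    }

    /**
     * True when, between two samples, the queue is non-empty, has not shrunk,
     * and not a single message was dequeued.
     */
    public static boolean isStalled(long prevSize, long prevDequeued,
                                    long curSize, long curDequeued) {
        return curSize > 0 && curSize >= prevSize && curDequeued == prevDequeued;
    }
}
```

Sampling inside the TOMEE JVM could use `ManagementFactory.getPlatformMBeanServer()` as the connection, for example `sample(server, queueName("localhost", "incoming"))` with both names replaced by the real ones. Triggering a thread dump the first time `isStalled` returns true would capture the state right at the dropoff rather than minutes later.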
We did notice the following activemq warnings in the log file, but
the timestamps don't match with any particular events and from what we
found out, they don't seem to be particularly worrying or likely to be
related to the issue:
WARNING [ActiveMQ Journal Checkpoint Worker]
org.apache.activemq.store.kahadb.MessageDatabase.getNextLocationForAckForward
Failed to load next journal location: null
WARNING [ActiveMQ NIO Worker 6]
org.apache.activemq.broker.TransportConnection.serviceTransportException
Transport Connection to: tcp://127.0.0.1:37024 failed: java.io.EOFException
Do you have any suggestions for fixing this issue (which we sadly
can't reproduce at will, and which happens only rarely)? Should we
rather ask on the activemq mailing list?
Regards,
emmanuel