Spark Reconnect Logic

If it loses its connection to the server, Spark attempts to reconnect every 10 seconds. This violates the recommended reconnect logic specified in Section 5.7 of rfc3920bis, which states: “If the first reconnection attempt does not succeed, an entity SHOULD back off exponentially on the time between subsequent reconnection attempts.” This is not friendly. Please fix.
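For illustration, here is a minimal sketch of what the quoted recommendation could look like in a client. The draft only says to back off exponentially, so the doubling factor, the function name `backoff_delay`, and the use of Spark's current 10-second interval as the starting point are all assumptions; this is not Spark or Smack code.

```python
def backoff_delay(attempt, base_seconds=10.0, factor=2.0):
    """Seconds to wait before reconnection attempt number `attempt` (1-based).

    Attempt 1 waits `base_seconds`; each later attempt multiplies the
    previous wait by `factor` (assumed to be 2, the draft names no value).
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return base_seconds * factor ** (attempt - 1)


# Waits for the first five attempts, in seconds.
delays = [backoff_delay(n) for n in range(1, 6)]
```

With these assumed values the waits grow as 10, 20, 40, 80 and 160 seconds, so a client that keeps failing quickly stops hammering the server.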

This reminds me of when I was testing clients to choose one for our internal IM system. In Gaim I didn’t find an option to change this interval, so I asked the developers why there is no option for that. The answer I got was, in translation, “this is not for a user to decide”. Anyway, Gaim had a lot of other issues (“this is not for a user to use either”), so another client was chosen, but this issue was one of the reasons. OK, for a server with thousands of users it could be a burden if they were all reconnecting every second, but how can a server owner make them use, say, Gaim, or make every client developer follow this RFC? And for internal use I care that users get back online as quickly as possible. So if connectivity is lost for 11 secs, I prefer them to reconnect after 20 secs, not after 25. Or rather, I prefer the Exodus way more: it can retry at random intervals. So it could be even 12 secs (or 57, or 84). So Exodus in its default configuration violates this RFC too. IMHO, I don’t care. It could create a big load for my server, so I should take care of the hardware, and Jive is making improvements in Openfire to make this load less painful. Well, I still want a preference to be able to change this value, but 10 secs is OK so far. Not to mention that the absence of Quiet Reconnect is a more annoying issue.

Well we have 200,000+ users on the jabber.org server, with typically 10k or 15k logged in at once. If the server restarts, the reconnect load is pretty significant!

I understand, but as I said, I see no point in making one client follow that RFC, because you can’t make people use only that client. Or can you? And from my point of view I need to be able to tweak that feature, because I have only 160 users. So I don’t mind if Derek adds that exponential backoff, but with an option to customise it.

BTW, do you have any stats about which clients jabber.org users are using?

Hi Peter,

is the following schedule, as described in http://www.igniterealtime.org/builds/smack/docs/latest/javadoc/org/jivesoftware/smack/ReconnectionManager.html, fine?

  1. First it will try 6 times every 10 seconds.

  2. Then it will try 10 times every 1 minute.

  3. Finally it will try indefinitely every 5 minutes.
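To make the numbers concrete, the three-step schedule listed above can be written down as a small function. This is only a sketch of the schedule as quoted in this post, not the actual Smack implementation; the names `staged_delay` and `elapsed_before` are made up for the example, and the time spent on each failed connection attempt is ignored.

```python
def staged_delay(attempt):
    """Seconds to wait before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if attempt <= 6:      # step 1: 6 tries, 10 seconds apart
        return 10
    if attempt <= 16:     # step 2: 10 tries, 1 minute apart
        return 60
    return 300            # step 3: every 5 minutes, indefinitely


def elapsed_before(attempt):
    """Total seconds waited up to and including the wait before `attempt`."""
    return sum(staged_delay(n) for n in range(1, attempt + 1))
```

Under this reading, the first step is used up after 1 minute and the second after 11 minutes (`elapsed_before(16)` is 660 seconds); after that a client retries only every 5 minutes.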

I totally agree that an exponential reconnect delay is much better for reducing network traffic, but if all clients are dropped at 00:00 and the server is back online at 00:09, all clients could reconnect at 00:15 - this does not really help. So at least the 2nd reconnect delay should contain a random delay value.

I guess that there should be a max delay of 10 or 15 minutes as no one wants to stay offline too long.
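A rough sketch of how these two ideas could be combined: exponential growth, a random component from the second attempt on, and an upper limit. The 15-minute cap is taken from the guess above; the doubling factor, the 10-second base, the uniform distribution and the name `jittered_delay` are assumptions for illustration only.

```python
import random


def jittered_delay(attempt, base_seconds=10.0, max_seconds=900.0, rng=random):
    """Randomised, capped wait in seconds before attempt number `attempt` (1-based).

    The first retry uses the fixed base delay. From the second retry on,
    the wait is drawn uniformly between the base delay and an exponentially
    growing ceiling that never exceeds `max_seconds`.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if attempt == 1:
        return base_seconds
    # Limit the exponent so very high attempt numbers cannot overflow.
    exponent = min(attempt - 1, 30)
    ceiling = min(max_seconds, base_seconds * 2 ** exponent)
    return rng.uniform(base_seconds, ceiling)
```

Passing a seeded `random.Random` instance as `rng` makes the waits reproducible for testing; in a real client every installation would use its own unseeded generator so that clients do not pick the same delays.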

LG

Hi Oleg,

at least the Spark users will be the first to reconnect on jabber.org … quite funny to enter a big server with nearly no users.

Even you, with 160 users, will benefit from such an option if you have a slow server. Doing 160 TLS/SSL handshakes, fetching rosters, sending presences, etc. takes some time.

LG

And we shouldn’t forget that most clients (Gaim as well) have a Reconnect button, so users can press it every second for a long time.

LG, we have 256 MB of RAM now! Yes, our server was a bit slower in the mornings when users were turning on their desktops, though we don’t use TLS/SSL. Roster fetching is the longest procedure. But now with more RAM and a P4 CPU it’s faster. And I can’t do anything about that; I think jabber.org must be facing the same problems at some hours, even without server restarts.

Hi Oleg,

I guess that they face the same problems without restarts. So you may guess what happens after a server restart.

Also your server may choke if all 160 clients connect at 8 o’clock.

There are reconnect buttons but I think that no user will press it every second for more than five minutes … this reminds me of an unlimited ammo-cheat for a game and a joystick without auto-fire support … after enough clicks your fingers get tired.

LG

“There are reconnect buttons but I think that no user will press it every second for more than five minutes”

I didn’t actually mean that they would do that for real, it was hyperbole. But say some people are chatting (maybe a few thousand of those 15k logged in at jabber.org, maybe more at rush hours), and they lose the connection. Then they see Gaim try to reconnect once, and the window with Reconnect just stays on the screen doing nothing (no timer), just the Reconnect button. How many users actually know about some exponential logic in there? I think most of the users who were in the middle of some important chat will try to press that button once or twice, maybe they will quit the program and launch it again, and so on.

Anyway, I think random reconnect cycles are not such a bad choice. The server will have less load, and there is a chance that users will be back online faster than with a constant value.
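A toy simulation can show the load side of this argument. Everything in it is an assumption made up for the example: 1000 clients dropped at the same moment, a server that comes back after 95 seconds, a constant 10-second retry versus a retry drawn uniformly from 5 to 15 seconds, and a fixed seed so the run is repeatable. It models no real client.

```python
import random
from collections import Counter


def first_success_time(delays, server_up):
    """Time of the first retry at or after `server_up`; earlier retries fail."""
    t = 0.0
    for d in delays:
        t += d
        if t >= server_up:
            return t
    raise ValueError("ran out of retries before the server came back")


def peak_logins_per_second(times):
    """Largest number of clients reconnecting within the same one-second bucket."""
    return max(Counter(int(t) for t in times).values())


rng = random.Random(42)          # fixed seed, only to make the run repeatable
clients, server_up = 1000, 95.0  # hypothetical numbers

# Constant interval: every client retries at 10, 20, 30, ... seconds.
fixed = [first_success_time([10.0] * 20, server_up) for _ in range(clients)]

# Random interval: every wait is drawn independently from 5 to 15 seconds.
randomised = [
    first_success_time([rng.uniform(5.0, 15.0) for _ in range(40)], server_up)
    for _ in range(clients)
]

fixed_peak = peak_logins_per_second(fixed)
random_peak = peak_logins_per_second(randomised)
```

With the constant interval all 1000 clients hit the server in the same second (at 100 seconds), so `fixed_peak` is 1000, while the randomised clients are spread over several seconds and `random_peak` comes out lower. Whether an individual user gets back sooner depends on the draw, which matches the "there is a chance" wording above.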