Patch: Improve Message Parser Robustness and Message Body I18N

Hi,

here is a patch that improves the robustness of the message packet parser by interpreting everything within a tag as string. So even if the tag contains other tags (which is forbidden by XMPP specification but may happen since the content is user generated and there may be clients that don’t escape the text correctly) the connection to the XMPP server won’t be closed because of an XMLPullParserException, instead the content will be parsed as string ignoring all tags. It will still crash on invalid XML though.

Additionally for the parser to correctly set message bodies in different languages that may occur in a message packet the patch fixes the faulty handling of localized bodies.
A message tag may contain an xml:lang attribute which sets the default language of the message. This attribute is now not ignored anymore when correlating the message bodies to a language. Every message body now has a language associated (no null language for the default language of the message or application at the moment the body was set).

The patch comes with some unit tests that require the library java-xmlbuilder.
The java-xmlbuilder is a small and powerful library to easily and clearly create XML documents (compared to the ugliness of creating XML by concatenating strings and variables).
Download the library here and put the jar file in the build/build/ folder (next to the xmlunit library).

Best regards,
Henning
improve_message_parse_robustness.patch.zip (4848 Bytes)

Do you think, that your patch can also fix my issue in thread http://www.igniterealtime.org/community/thread/40374 ?

<message xmlns='jabber:client' from='kenai@muc.kenai.com/Frederic Jean'
id='1258145104804+purplef5b633d8' to='jbecicka@kenai.com/NetBeans'
type='groupchat' xml:lang='en'><body>502 - Bad Gateway
A 502 status code indicates that a server, while acting as a proxy,
received a response from a server further upstream that it judged
invalid.</body><body xmlns='http://www.w3.org/1999/xhtml'><span
style='font-weight: bold;'>502 - Bad Gateway</span>A 502 status code
indicates that a server, while acting as a proxy, received a response
from a server further upstream that it judged invalid.</body><delay
xmlns='urn:xmpp:delay' stamp='2009-11-13T20:45:04.804+00:00'/></message>

Thanks,

Jan

Yes,

actually I startet writing this patch because of your thread. But instead of just ignoring the invalid body by catching the exception the whole content of the body is interpreted as string. So your example message packet would create a Message instance that contains the invalid body as string. But since this message contains multiple bodies for the same language only the first one will be available if you call message.getBody().

You can look at the test cases included in the patch to see that your problem is solved.

Best regard,

Henning

Great! Thanks!

Hi!

Nice patches. I just wanted to let you know that you can find your patches in a working version of smack under http://github.com/dereulenspiegel/smack and soon probably under http://github.com/rtreffer/smack. In the first one are also patches for XEP-0115 and Roster Versioning. But the later is highliy experimental.

Thanks for letting me know. I’m new to Smack development and maybe I have stupid questions. But why did you fork Smack? Fragmenting projects usually means, that there is somethning wrong with the project.

I think, that there should be some common consensus, that some patch is good or bad and should be applied or refused or tested in some experimental branch.

I’m little bit frustrated: we have a bug, we have a fix, but we don’t have idea, when it will be merged to trunk or when new version of smack with this patch will be released.

a git branch isn’t exactly forking the project… They’re just maintaining the code with the patch applied for our convenience until whoever has commit access can test and integrate it. (if that ever happens…)

Right. This github repo is just for convience so that there is a place where community patches can be integrated. It is very sad that we have some good developer who supply good and necessary patches and no one integrates them. But thats also the beauty of open source. You can integrate them yourself.

Well, no, I cannot integrate it myself, because I don’t have commit rights.

But I understand you. I can use this patch and release “my” version of smack. But do we want to have several versions of smack?

BTW is there any “release schedule”. Is there any plans for smack 3.1.1 or something like that? I’m willing to help.

Normally I would say that we don’t need more versions of smack. A single version is enough. But the problem is that some patches exist for months (some even over a year) without any of the devs noticing it. So René imported the whole trunk ti github and started cherry picking some patches. And I forked that repo and started to implement some new things.

But thats hopefully that’s only a temporal solution. I still hope that something starts to happen here and development on smack (and also openfire) becomes active again. At least bugs should be fixed quick.

They meant you can integrate the patch yourself in checked out code… (with the patch) or, you can use the integrated version on that git branch.

Jbecicka, read up on how git works. This isn’t a new version of smack… it’s branched code. Whenever you use git for anything you are effectively creating a branch of the main trunk.

It is trivially easy to pull down updates from the main trunk into that branch, and trivially easy to merge those changes back into the main trunk. So, again, not a new version.

If someone went off and created a new SVN repo and called their project “Smack but better” THAT would be a new version. This is just branching without the permission of the missing in action smack devs.

Yes, I fully understand. But if I build smack.jar from the git repository - what version it is? Is it 3.1.0? No it is not. Is it 3.1.1? No it is not. It is smack version from your git repository. And if I distribute this jar and someone files a bug. Is it bug in smack 3.1.0? Maybe. But it can be also bug in those patches.

I really appreciate your work, but I insist on different versions

I agree with @jbecicka that the version number should be changed to reflect that this is no official ignite build.

However, I don’t agree about the fork part. Currently, Smack seems to be in a sorry state, as critical bugs have not been fixed for several years (or in a different case, have maybe been fixed, but no new version has been released).

So forking the project in order to release fixes much more frequently, might currently help Smack much more than it would hurt.