#710 ✓resolved
Lyubomir Ivanov

Reading messages files breaks when they contain a certain character

Reported by Lyubomir Ivanov | April 7th, 2011 @ 07:56 PM | in 1.2.1

While trying to migrate to 1.2RC2, I found that non-Latin languages can no longer be used in messages files. Specifically, the problem is with a certain letter of the Cyrillic alphabet. The suspect code is in the OrderSafeProperties class. I did some debugging and can confirm that OrderSafeProperties does not interpret UTF-8 encoding properly. Here is my description:

The Cyrillic letter "х" is encoded in UTF-8 as the two bytes 0xD1 0x85. OrderSafeProperties.load() interprets each byte individually, so the second byte of this letter (0x85) is wrongly interpreted on its own rather than in the context of the preceding byte (0xD1). This is a blocking bug, so please give us a fix in the 1.2 release or even earlier. Should you need any cooperation on this issue, I can run tests or do whatever else you need.
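To make the byte-level problem concrete, here is a minimal Java sketch. It is not the OrderSafeProperties code; the class and method names (`CyrillicByteDemo`, `decodeBytewise`, `decodeAsUtf8`) are made up for illustration, and `decodeBytewise` only imitates a loader that treats every byte as a character of its own.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Play.
class CyrillicByteDemo {

    // Decodes the whole byte sequence as UTF-8, so the two bytes of "х" stay together.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Imitates a loader that maps each byte to one character, ignoring UTF-8 sequences.
    static String decodeBytewise(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0445 is the Cyrillic letter "х".
        byte[] bytes = "\u0445".getBytes(StandardCharsets.UTF_8);
        for (byte b : bytes) {
            System.out.printf("0x%02X ", b & 0xFF);
        }
        System.out.println();

        String bytewise = decodeBytewise(bytes);
        System.out.println("UTF-8 decoding yields " + decodeAsUtf8(bytes).length() + " character(s)");
        System.out.println("Bytewise decoding yields " + bytewise.length() + " character(s)");
        // The second "character" is 0x85, which a bytewise parser can mistake for NEL.
        System.out.println("Second bytewise char is 0x85: " + (bytewise.charAt(1) == 0x85));
    }
}
```

Decoded bytewise, the single letter turns into two characters, and the second one has the value 0x85, the same value the loader checks for when looking for a line break.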

Digging deeper into the code, I can see that 0x85 is used to detect a new line. This is the Unicode character NEL (U+0085). In UTF-8, however, this character is represented by the two-byte sequence 0xC2 0x85, which the code clearly does not check for. More generally, I can't see anywhere in the code where a byte is interpreted in the context of the lead byte of its UTF-8 sequence, and a UTF-8 sequence can be as long as 4 bytes.
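One possible way around this, sketched here under assumptions, is to decode the bytes to characters before any line splitting happens. The sketch uses the standard `java.util.Properties.load(Reader)` with a UTF-8 `InputStreamReader`. It is not the actual fix applied to OrderSafeProperties, and plain `Properties` does not preserve key order. The class name `Utf8MessagesDemo` and the `greeting` key are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of Play.
class Utf8MessagesDemo {

    // Decodes the content as UTF-8 first, then lets Properties parse characters, not bytes.
    static Properties loadUtf8(byte[] fileContent) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(fileContent), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // NEL (U+0085) itself takes two bytes in UTF-8.
        byte[] nel = String.valueOf((char) 0x85).getBytes(StandardCharsets.UTF_8);
        for (byte b : nel) {
            System.out.printf("0x%02X ", b & 0xFF);
        }
        System.out.println();

        // A one-line messages file whose value starts with the problematic letter "х".
        byte[] content = "greeting=\u0445\u0430\n".getBytes(StandardCharsets.UTF_8);
        Properties props = loadUtf8(content);
        System.out.println("\u0445\u0430".equals(props.getProperty("greeting")));
    }
}
```

Because decoding happens before parsing, the 0x85 continuation byte of "х" never reaches the line-terminator check as a standalone value.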

P.S. Initially I put this explanation in a previous ticket, but since it was marked as invalid I'm not sure that my comment there has been seen.


