DateFormat garbage characters in output #1213
Comments
I'm going to transfer this to the support repository -
When you say "other JVMs" are you suggesting it passes with an equivalent OpenJDK version from other vendors? Can you say which ones? |
Thank you @sxa. Other JVMs are the following:
The labels are mine - the lines under the labels are the output of "java -version". |
@artnaseef What you are observing is another version of https://bugs.openjdk.org/browse/JDK-8324308 caused by the CLDR 42.0 update done in JDK 20 (also included in JDK 21). What you need to do is use a custom formatter to get the simple space (over the horizontal non-breaking space before |
Is there a straight-forward way to get the plain-text / ASCII-compatible I'm having a little trouble wrapping my head around |
BTW, I notice the label "WAITING ON OP". Is there something more I need to do here? |
I haven't dug into this too much, but a quick query to Copilot gives me: To ensure that the output format has a simple space instead of any unexpected characters before AM/PM, you can use SimpleDateFormat from the java.text package. Here’s the updated code:
Explanation: When you run this code, the output will look like this: 4:49 PM with a regular space before AM/PM. |
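The code from that comment did not survive in this thread. As a stand-in, here is a minimal sketch of the fixed-pattern approach it describes, not the original snippet. The class name `FixedPatternTime` and the pattern `"h:mm a"` are assumptions chosen for illustration. Because the separator is written literally in the pattern, it should come out as an ordinary ASCII space regardless of the CLDR data.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not taken from the issue.
class FixedPatternTime {
    // Formats with an explicit pattern, so the separator before AM/PM is the
    // literal ASCII space from the pattern string rather than locale data.
    static String format(Date date, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("h:mm a", Locale.US);
        fmt.setTimeZone(zone);
        return fmt.format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(), TimeZone.getDefault()));
    }
}
```

The trade-off is that a fixed pattern gives up locale-specific time formats.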
Thank you for the response. In my case, the formatted date is going to individuals who may be anywhere geographically, so I don't want to use fixed date and time formats - I want to use the formats that are specific to their locale. Ignore the hard-coded locale in my snippet please. |
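One possible way to keep locale-specific formats while still getting ASCII-compatible spacing is to post-process the formatted string. The sketch below is an assumption about what might fit this use case, not something proposed in the thread. The names `LocaleTimePlainSpaces`, `plainSpaces` and `formatTime` are hypothetical.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not taken from the issue.
class LocaleTimePlainSpaces {
    // Replaces NARROW NO-BREAK SPACE (U+202F) and NO-BREAK SPACE (U+00A0)
    // with an ordinary space (U+0020); everything else is left untouched.
    static String plainSpaces(String s) {
        return s.replace('\u202F', ' ').replace('\u00A0', ' ');
    }

    // Uses the locale's own short time format, then normalizes the spaces.
    static String formatTime(Date date, Locale locale) {
        DateFormat fmt = DateFormat.getTimeInstance(DateFormat.SHORT, locale);
        return plainSpaces(fmt.format(date));
    }

    public static void main(String[] args) {
        System.out.println(formatTime(new Date(), Locale.US));
    }
}
```

Note that some locales use these non-breaking spaces deliberately, so replacing them changes how the text may wrap when displayed.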
You could try if -Djava.locale.providers=COMPAT works, but that option is gone in later JDKs. |
Thanks Severin. So the standard (CLDR?) does not address this?
Perhaps this is just my lack of understanding UTF-8. Is it reasonable to
expect standard regular expression processors (e.g. java.util.regex.Matcher) to
treat this non-breaking space as a space (e.g. matching with \s predefined
character class in a java regex)?
Art
…On Tue, Jan 21, 2025 at 8:29 AM Severin Gehwolf ***@***.***> wrote:
You could try if -Djava.locale.providers=COMPAT works, but that option is
gone in later JDKs.
|
Just tested it, and the regex failed to match the non-breaking character. Art |
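For reference, a small sketch of how `java.util.regex` treats U+202F (NARROW NO-BREAK SPACE). By default `\s` covers only the ASCII whitespace characters, so it does not match. As documented for `java.util.regex.Pattern`, `\h` (horizontal whitespace) does include U+202F, and `\s` also matches once `Pattern.UNICODE_CHARACTER_CLASS` is enabled. The class name `NnbspRegexCheck` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not taken from the issue.
class NnbspRegexCheck {
    static final String NNBSP = "\u202F"; // NARROW NO-BREAK SPACE

    // Default \s is [ \t\n\x0B\f\r], which excludes U+202F.
    static boolean defaultWhitespace() {
        return Pattern.matches("\\s", NNBSP);
    }

    // \h is the horizontal-whitespace class, which lists U+202F.
    static boolean horizontalWhitespace() {
        return Pattern.matches("\\h", NNBSP);
    }

    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space property.
    static boolean unicodeWhitespace() {
        return Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS)
                      .matcher(NNBSP)
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println("\\s (default): " + defaultWhitespace());
        System.out.println("\\h:           " + horizontalWhitespace());
        System.out.println("\\s (unicode): " + unicodeWhitespace());
    }
}
```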
Any thoughts on how to pursue this further? It feels to me like the JDK is doing the wrong thing here since some of the text tools seem to make use of the full UTF-8 space (e.g. the date formatting), while others ignore it (e.g. regex). If there is a desire to go all-in with UTF-8, then shouldn't the regex handle it? This is a breaking issue. |
Is there another / more-appropriate place to raise this concern? |
Feel free to raise this issue on core-libs-dev on the OpenJDK project. |
The
That doesn't include a narrow non-breaking space, AFAIK. |
What are you trying to do?
Format the date/time with the following code:
DateFormat.getTimeInstance(DateFormat.SHORT, Locale.US).format(date)
Expected behaviour:
The result string contains the properly formatted date and no garbage/extraneous characters.
Observed behaviour:
The result string contains the date, but also contains garbage characters instead of a space preceding the "AM" / "PM" text.
Any other comments:
Tested with the following:
OpenJDK Runtime Environment Temurin-21.0.2+13 (build 21.0.2+13-LTS)
OpenJDK Runtime Environment Temurin-21.0.5+11 (build 21.0.5+11-LTS)
Also tested SUCCESSFULLY (i.e. no garbage in the output) with the following, and other JVMs:
OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
Here is example output using cat -v and xxd:
4:49M-bM-^@M-/PM
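In `cat -v` notation, `M-` marks a byte with the high bit set, so `M-b M-^@ M-/` reads as the bytes E2 80 AF, which is the UTF-8 encoding of U+202F (NARROW NO-BREAK SPACE). The sketch below shows one way to confirm this from Java by hex-dumping a string's UTF-8 bytes. The class name `Utf8Hex` is hypothetical, and the sample string is built by hand rather than taken from a formatter so the result does not depend on the JDK version.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the issue.
class Utf8Hex {
    // Returns the UTF-8 bytes of s as space-separated lowercase hex pairs.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built sample containing U+202F between the time and "PM".
        System.out.println(hex("4:49\u202FPM"));
    }
}
```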
Here is the code of a complete test program: