Author Topic: UTF-8 Multibyte characters are garbled after source code beautify.  (Read 5055 times)

Toru

  • Community Member
  • Posts: 7
  • Hero Points: 0
Hi SlickEdit developer,

This is my first post on this forum.
Thanks for letting me use such a great product.

This is a bug report.
As I mentioned in the subject, UTF-8 Japanese multibyte characters are corrupted after Beautify.
This happens every time I run Beautify.
With SJIS, no corruption occurs. (I don't know about EUC, JIS, or the other Japanese encodings.)

I've mainly used SE10.0.3 on Windows for 4 years.
I also tried the SE2009 trial download recently. Unfortunately, it still has the problem.

Would it be possible for you to look into it?

Regards,

Toru

Toru

  • Community Member
  • Posts: 7
  • Hero Points: 0
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #1 on: December 09, 2009, 08:27:49 am »
I've attached the source code with Japanese comments, along with some screenshots.

testUTF8j.cpp   : original source
testUTF8j2.cpp : broken one

You can see that the characters are broken and that the hex codes differ between the two files.
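For readers who want to make the same hex comparison without screenshots, here is a minimal sketch in Python. The helper names `first_difference` and `hex_window` are hypothetical, and the "broken" bytes are produced by one assumed corruption path (UTF-8 bytes misread as Shift_JIS and written back as UTF-8). The thread does not confirm that this is what Beautify actually did.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        return min(len(a), len(b))
    return None


def hex_window(data: bytes, offset: int, width: int = 8) -> str:
    """Format `width` bytes starting at `offset` as space-separated hex."""
    return data[offset:offset + width].hex(" ")


# Stand-ins for the attached files; real use would read the files' bytes.
original = "// テスト".encode("utf-8")
# Assumed corruption: UTF-8 bytes decoded as cp932, then re-encoded as UTF-8.
broken = original.decode("cp932", errors="replace").encode("utf-8")

offset = first_difference(original, broken)
if offset is not None:
    print("first difference at byte", offset)
    print("original:", hex_window(original, offset))
    print("broken:  ", hex_window(broken, offset))
```

To compare the actual attachments, the two byte strings could be loaded with `pathlib.Path("testUTF8j.cpp").read_bytes()` and `pathlib.Path("testUTF8j2.cpp").read_bytes()` instead.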

David_O

  • Senior Community Member
  • Posts: 152
  • Hero Points: 8
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #2 on: December 14, 2009, 03:20:43 pm »
Would you please double-check the encoding of your sample files?  When I open the file 'testUTF8j.cpp', I do not get anything that looks like the screenshot 'original.jpg'.  However, I am able to create, display and beautify other Japanese files correctly.  I wonder if there is a problem with your samples.  Any other information about how you create your files would also help.

Thank you,

Toru

  • Community Member
  • Posts: 7
  • Hero Points: 0
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #3 on: December 15, 2009, 01:53:54 am »
Hello David,

Thanks for your investigation.

testUTF8j.cpp's file encoding is UTF-8 (without BOM).
All the editors I use here can open it: Notepad, Hidemaru (Maruo), SE10.0.3, SE2009 and MS Word.
After your comment, I tried UTF-8 (with BOM) on SE 10.0.3, and Beautify worked. (Unfortunately, my SE2009 trial has expired.)

Doesn't Beautify support UTF-8 (without BOM)?
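To illustrate the difference between the two variants, here is a minimal Python sketch. A UTF-8 BOM is the three bytes EF BB BF at the start of the file. Without it, a tool has to guess the encoding, and a wrong guess garbles multibyte text. The function names are hypothetical, and decoding as cp932 (Shift_JIS) is only an assumed example of a wrong guess, not a statement about how SlickEdit's Beautify works internally.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def has_utf8_bom(data: bytes) -> bool:
    """True if the byte string starts with the UTF-8 byte order mark."""
    return data.startswith(UTF8_BOM)


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8 (with or without BOM)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


comment = "// 日本語のコメント\n"
without_bom = comment.encode("utf-8")
with_bom = UTF8_BOM + without_bom

# With a BOM, the encoding is unambiguous; "utf-8-sig" strips the marker.
assert with_bom.decode("utf-8-sig") == comment

# Without a BOM, the bytes are still valid UTF-8, but a tool that assumes
# a legacy code page (cp932 here, as an assumption) produces garbage.
garbled = without_bom.decode("cp932", errors="replace")
print(has_utf8_bom(without_bom), is_valid_utf8(without_bom))
print(repr(garbled))
```

A check like `is_valid_utf8` is one common way for a tool to recognize BOM-less UTF-8 instead of falling back to the system code page.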

Thanks.

David_O

  • Senior Community Member
  • Posts: 152
  • Hero Points: 8
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #4 on: December 17, 2009, 02:46:01 pm »
Thank you for your response.  I've been able to reproduce this and we will have a fix for the next release.

Toru

  • Community Member
  • Posts: 7
  • Hero Points: 0
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #5 on: December 18, 2009, 04:31:24 am »
Hello David,

I'm happy that you reproduced the problem and will fix it.

BTW, what do you mean by "next release"? Do you plan to make a new hotfix before v15?
I'm wondering whether I should buy v14 now or wait for v15.

Thanks for your help.

Toru

jimlangrunner

  • Senior Community Member
  • Posts: 354
  • Hero Points: 30
  • Jim Lang - always a student.
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #6 on: December 18, 2009, 10:42:16 am »
If you're buying support along with SlickEdit (which I would recommend strongly), it does not matter when you buy it, as you'll be able to get upgrades as they become available.

If you're not buying support, I would (personally) recommend you wait for 15, as there is a limited support period, and upgrades are not guaranteed.

Jim

Toru

  • Community Member
  • Posts: 7
  • Hero Points: 0
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #7 on: February 05, 2010, 04:24:31 am »
I tried the v15 beta and confirmed that this hasn't been fixed yet.

> jimlangrunner
Thanks, I bought v14 with 1 year of support.

David_O

  • Senior Community Member
  • Posts: 152
  • Hero Points: 8
Re: UTF-8 Multibyte characters are garbled after source code beautify.
« Reply #8 on: February 24, 2010, 10:10:23 pm »
This has been fixed and should be in the next beta release for testing.