Topic: Why would brand new software NOT work with Unicode? / Challenges

daddydave

You are creating an application today. It will not be Unicode compliant. Why? For older programs, it may not be worth the rewrite effort, but if you are writing code from scratch, intended for a worldwide audience, this would seem to be automatic, and yet it doesn't seem to be. What are some of the challenges you face in creating software that leaves the ASCII character set of yesteryear behind (except for backward compatibility)? Be as technical as you want, don't worry about it being over my head. Maybe it will be, but I can still use it as a starting point for further research, and others will benefit.

On the other hand, be as non-technical as you want as well. I think I am seeing a trend of requesting Unicode support for various DonationCoder applications (maybe I am just noticing it more now that I am learning Devanagari script), so I think a lot of people want to know why this isn't easy!
« Last Edit: January 05, 2012, 08:12 AM by daddydave »

vlastimil

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #1 on: January 05, 2012, 09:24 AM »
Just a few random thoughts:
* Is Unicode supported by default in the development tool being used? If not, someone may not bother to switch it on.
* From what I have heard, Windows 98 is still relatively widespread in some countries and unicode support on W9x is problematic.
* There is UCS16, UTF8 and other flavors. Dealing with all of them may be fiddly.
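
To make the "flavors" point concrete, here is a minimal Python sketch (the sample string is just an illustration, not from any real application). It encodes the same text three ways and then shows what happens when bytes are decoded with the wrong assumption:

```python
# The same five-character string in three Unicode encodings.
text = "na\u00efve"  # "naive" with a diaeresis on the i (one non-ASCII character)

utf8 = text.encode("utf-8")        # 1 byte per ASCII char, 2 bytes for U+00EF
utf16 = text.encode("utf-16-le")   # 2 bytes per character for this string
utf32 = text.encode("utf-32-le")   # always 4 bytes per code point

print(len(utf8), len(utf16), len(utf32))  # 6 10 20

# Decoding UTF-8 bytes as Latin-1 does not raise an error;
# it silently produces garbage in place of the accented letter.
wrong = utf8.decode("latin-1")
print(wrong)
```

The fiddly part is that nothing in the bytes themselves says which encoding was used, so the program has to know or guess.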

f0dder

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #2 on: January 05, 2012, 09:30 AM »
It could be the programming environment you're working in that has very poor unicode support (Delphi, C++ Builder, various scripting languages).

Then there's unicode itself - it's more than just "characters are wider than one byte" - you have issues like LTR/RTL, combining characters, and whatnot... stuff that I've blissfully chosen to ignore for my hobbyist stuff, and hope I won't have to deal with professionally. Non-english sucks, really.
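
As a rough illustration of the combining and direction issues, here is a small Python sketch using the standard `unicodedata` module. The sample characters are chosen only for the example:

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single precomposed code point
decomposed = "e\u0301"   # plain "e" followed by COMBINING ACUTE ACCENT

# Both render as the same letter, yet they compare unequal
# and have different lengths.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# Normalizing to NFC makes the two forms comparable.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)                 # True

# Each character also carries a direction class:
# "L" is left-to-right, "R" is right-to-left.
latin_dir = unicodedata.bidirectional("a")
hebrew_dir = unicodedata.bidirectional("\u05d0")  # HEBREW LETTER ALEF
print(latin_dir, hebrew_dir)           # L R
```

So even a plain string comparison or a "length in characters" needs a decision about normalization before it means anything.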

Quote from: vlastimil
"There is UCS16, UTF8 and other flavors. Dealing with all of them may be fiddly."

UCS-2, you mean? :) (Windows used to be UCS-2 afaik, then switched to UTF-16).
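
The practical difference between UCS-2 and UTF-16 can be sketched in a few lines of Python. The helper name `utf16_units` is made up for this example:

```python
def utf16_units(s):
    """Number of 16-bit code units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

bmp = "\u0905"         # DEVANAGARI LETTER A, inside the Basic Multilingual Plane
astral = "\U0001F600"  # an emoji, outside the Basic Multilingual Plane

print(utf16_units(bmp))     # 1
print(utf16_units(astral))  # 2, stored as a surrogate pair

# The two code units of the pair, shown big-endian for readability.
pair_hex = astral.encode("utf-16-be").hex()
print(pair_hex)             # d83dde00
```

UCS-2 has only the one-unit form, so code written with a "one 16-bit unit is one character" assumption miscounts or splits anything outside the BMP.
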
- carpe noctem

Renegade

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #3 on: January 05, 2012, 09:33 AM »
Some languages do not include robust unicode support innately. Some require hacks or libraries or mind-numbingly stupid amounts of work to get unicode to work.

The newer, more modern languages all have unicode support innately, or have easily integrated unicode support.

So, for example, if you use an older version of Delphi, you're hosed. Update and you may need to rewrite things -- but ask a Delphi whiz about that for details.

To be honest, it's really a complete disaster. The root cause is that computing resources were very limited and expensive a long time ago, and we're still paying for that now.

Slow Down Music - Where I commit thought crimes...

Freedom is the right to be wrong, not the right to do wrong. - John Diefenbaker

mouser

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #4 on: January 05, 2012, 05:37 PM »
I have found unicode universally painful as hell to deal with -- even when working with languages that have good support for it (Python). 

Renegade

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #5 on: January 05, 2012, 06:19 PM »
Quote from: mouser
"I have found unicode universally painful as hell to deal with -- even when working with languages that have good support for it (Python)."

My personal unicode hell began a very long time ago, and I quickly tried to rectify things. I suppose that since then, I've simply tried to stick to things that pre-empt the problems I've faced in the past, so I really don't have any issues with it now.




app103

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #6 on: January 07, 2012, 12:57 AM »
Quote from: f0dder
"It could be the programming environment you're working in that has very poor unicode support (Delphi, C++ Builder, various scripting languages)."

Bingo!

Quote from: vlastimil
"* From what I have heard, Windows 98 is still relatively widespread in some countries and unicode support on W9x is problematic."

Bingo again!

A lot of my apps were written while stuck on an old WinME PC.

tranglos

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #7 on: January 07, 2012, 06:28 AM »
Quote from: f0dder
"It could be the programming environment you're working in that has very poor unicode support (Delphi, C++ Builder, various scripting languages)."

Oooh, I have to correct this :) Delphi has had brilliant Unicode support since version 2009. The downside is that apps compiled in these new versions of Delphi won't run on anything below Win2k/XP.

Of course, earlier versions of Delphi are still very much in use as well. In those it's still possible to create unicode-enabled apps, but without native compiler support it can be quite painful and you have a very limited choice of third-party libraries.

tranglos

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #8 on: January 07, 2012, 06:47 AM »
Quote from: f0dder
"Non-english sucks, really."

But you could localize applications and handle user data in languages other than English before Unicode. Theoretically, Unicode is only truly required if your app needs to support languages that don't share a ("classic") codepage at the same time. Say, if you need to display English and Russian on the same screen at the same time. Other than that, applications used to run in any language you wished without Unicode.
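
A small Python sketch of the codepage limitation, assuming the two classic Windows codepages cp1251 (Cyrillic) and cp1252 (Western European). Plain English would survive in either one, since both share the ASCII range, so the mixed sample below uses an accented Latin letter next to Russian. The helper `fits` is made up for this example:

```python
russian = "Привет"

# The same bytes mean different letters depending on the active codepage.
raw = russian.encode("cp1251")
mojibake = raw.decode("cp1252")
print(mojibake)  # Ïðèâåò

def fits(text, codepage):
    """True if every character of text exists in the given codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

mixed = "café Привет"
print(fits(mixed, "cp1251"))  # False: no e-acute in the Cyrillic codepage
print(fits(mixed, "cp1252"))  # False: no Cyrillic in the Western codepage
print(fits(mixed, "utf-8"))   # True
```

Each language on its own fits its codepage fine, which is why pre-Unicode localization worked; it is the mix that has nowhere to go.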

It's cool to be able to handle all languages (more or less) without having to think about it twice, but some things have become much harder and much slower too.

In Delphi world, there used to be a bunch of highly optimized string manipulation libraries, written mostly in assembly. They rocked, and I based my whole glossary / translation memory application around one. Even registered my domain specifically to distribute that app, and it's still my username in most places :)
That application knows nothing about unicode, but it is FAST! In unicode world, that application is dead in the water. I could convert it in theory, but at the cost of twice as much memory use and searches taking ten times as long as they used to.
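
The memory half of that trade-off is easy to see in a short Python sketch, assuming the old strings were single-byte codepage text and the new ones are UTF-16. The word list is invented for the example, and the sketch says nothing about search speed:

```python
# Hypothetical glossary entries, all representable in a single-byte codepage.
entries = ["glossary", "translation", "memory", "naïve", "café"]

ansi_bytes = sum(len(e.encode("cp1252")) for e in entries)
wide_bytes = sum(len(e.encode("utf-16-le")) for e in entries)

print(ansi_bytes, wide_bytes)  # 34 68
```

For text like this, every character goes from one byte to two, so the raw string data doubles.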



erikts

Re: Why would brand new software NOT work with Unicode? / Challenges
« Reply #9 on: January 13, 2012, 01:53 AM »
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
by Joel Spolsky

... A couple of years ago, a beta tester for FogBUGZ was wondering whether it could handle incoming email in Japanese. Japanese? They have email in Japanese? I had no idea. When I looked closely at the commercial ActiveX control we were using to parse MIME email messages, we discovered it was doing exactly the wrong thing with character sets, so we actually had to write heroic code to undo the wrong conversion it had done and redo it correctly. ...
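
The "undo the wrong conversion and redo it correctly" step can be sketched in Python. This is not the actual FogBUGZ code; it assumes, purely for illustration, that the message bytes were Shift_JIS and that the faulty component decoded them as Latin-1:

```python
original = "こんにちは"

# What arrives on the wire (assumed charset for this example).
wire_bytes = original.encode("shift_jis")

# The wrong conversion: treating those bytes as Latin-1 text.
garbled = wire_bytes.decode("latin-1")

# The repair: Latin-1 maps every byte value to exactly one character,
# so encoding back to Latin-1 recovers the original bytes unchanged.
# They can then be decoded with the charset they really were in.
recovered_bytes = garbled.encode("latin-1")
repaired = recovered_bytes.decode("shift_jis")

print(repaired == original)  # True
```

The repair only works because the wrong decoding happened to be lossless. If the faulty step had dropped or replaced bytes it did not understand, there would be nothing left to recover.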