Author Topic: Non-english characters changed to underscore when renaming a screenshot (Read 6241 times)

Jesper Hertel · « **on:** February 10, 2011, 01:02 PM »

When I rename a screenshot by changing the name in the "Name" box below the image, all Danish characters (æ, ø, å, Æ, Ø, Å) are changed into underscores ("_"). It seems completely unnecessary, as file names can contain these characters without problems.

Could this be fixed?

I am trying to attach an image with an example, but I cannot get a preview of it. But maybe it works when I post it?

Jesper

Screenshot - 2011-02-10 , 19_57_01.png

Jesper Hertel · « **Reply #1 on:** February 10, 2011, 01:03 PM »

Yes, it worked when I posted it :-).

What I typed in the Name field was: "trådløst", not "tr_dl_st" as it was immediately changed into by ScreenshotCaptor.

Jesper Hertel · « **Reply #2 on:** February 10, 2011, 01:04 PM »

I am using v2.89.01 of ScreenshotCaptor, by the way.

Jesper Hertel · « **Reply #3 on:** February 10, 2011, 01:06 PM »

On Vista Home Premium SP2

mouser · « **Reply #4 on:** February 10, 2011, 02:03 PM »

I should be able to fix this.

f0dder · « **Reply #5 on:** February 10, 2011, 02:35 PM »

mouser, time to move to a compiler+framework that supports unicode? *nudge nudge*

Jesper: if Århus can change to Aarhus, you can write "traadloest netvaerk"

mouser · « **Reply #6 on:** February 10, 2011, 02:45 PM »

Finding a Unicode compiler is not so much a problem... but where do I trade my old, out-of-date brain in for a Unicode brain?

f0dder · « **Reply #7 on:** February 10, 2011, 02:49 PM »

Finding a Unicode compiler is not so much a problem... but where do I trade my old, out-of-date brain in for a Unicode brain?
-mouser (February 10, 2011, 02:45 PM)

- not as big a problem as getting all the components you're using updated to unicode-supporting versions, imho.

Jesper Hertel · « **Reply #8 on:** February 10, 2011, 04:22 PM »

Ah, so it's a Unicode problem... There is no other way than changing to Unicode?

Can UTF-8 be used in some way as a half-way solution? I know, though, that UTF-8 has its own challenges, with characters varying in length from 1 to 4 bytes each, which makes string slicing and character counting quite difficult.
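To illustrate the variable-length point, here is a minimal C++ sketch of why byte count and character count diverge in UTF-8. It assumes the name is held as valid UTF-8 in a `std::string`; the function name `utf8_code_points` is hypothetical and not taken from ScreenshotCaptor.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in a string that is assumed to hold valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx, so every byte that does
// NOT match that pattern starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For "trådløst" encoded as UTF-8, `size()` reports 10 bytes while `utf8_code_points` reports 8 characters, because å and ø each take two bytes.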

But... I am able to type the Danish letters in the input box... Then they have to be converted to 1 byte/character strings? ASCII??

And yes, I could use the translations ae, oe and aa for æ, ø and å, but then I have to think of that every time... I hope to avoid that...

f0dder · « **Reply #9 on:** February 10, 2011, 06:54 PM »

UTF-8 is just one form of Unicode representation; it generally tends to require more work than the UTF-16 that Windows uses internally (this used to be UCS-2 in older NT versions). Long story, though.

In general, a lot of languages, at least simple ones like our Danish, can be handled without going Unicode, because of codepages; for non-Unicode apps, Windows internally maps back and forth between UTF-16 and an 8-bit codepage-specific encoding.
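A rough C++ sketch of what such a mapping looks like in the simplest case, ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point. This is only an illustration: Windows uses its own per-codepage tables (reachable through the WinAPI call `MultiByteToWideChar`), and the function name `latin1_to_utf16` is hypothetical. The Danish letters æ, ø and å have the same byte values in Latin-1 and Windows-1252, so the sketch covers them.

```cpp
#include <string>

// Widen a single-byte ISO-8859-1 (Latin-1) string to UTF-16 code units.
// Latin-1 is the simplest case: each byte value equals its Unicode code
// point, so no lookup table is needed. Other codepages need a table.
std::u16string latin1_to_utf16(const std::string& bytes) {
    std::u16string wide;
    wide.reserve(bytes.size());
    for (unsigned char byte : bytes) {
        wide.push_back(static_cast<char16_t>(byte));
    }
    return wide;
}
```

The 8-bit string stays one byte per character, which is why a non-Unicode app can still accept Danish input in a text box.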

My guess is this particular bug is because mouser, or one of the components he relies on, uses one of the C++ library functions to ask "is this an ASCII character?" or "is this a printable character?" - which tend to say "nope, it isn't" for characters outside the 7-bit ASCII range. In an English-only world, it makes sense to replace those "unprintable" characters with underscores.
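A minimal C++ sketch of the kind of check being guessed at here. It is not ScreenshotCaptor's actual code; the function name `sanitize_strict` is hypothetical, and the sketch assumes the name arrives as single-byte codepage text and that the program runs in the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical over-strict cleanup: anything std::isprint rejects becomes '_'.
// In the default "C" locale, bytes above 0x7F (such as the codepage bytes
// for å and ø) are typically not classified as printable.
std::string sanitize_strict(const std::string& name) {
    std::string out = name;
    for (char& c : out) {
        // Cast to unsigned char first: passing a negative char value to
        // std::isprint is undefined behaviour.
        if (!std::isprint(static_cast<unsigned char>(c))) {
            c = '_';
        }
    }
    return out;
}
```

Fed the Windows-1252 bytes for "trådløst", a check like this would produce the "tr_dl_st" seen in the screenshot.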

mouser · « **Reply #10 on:** February 10, 2011, 06:59 PM »

f0dder has it right -- in this case the conversion is simply happening because my code is being overly paranoid about trying to clean up the filename before saving it, getting rid of not just known illegal filename characters but any characters that seem out of the ordinary. I should be able to fix it simply by relaxing the conversion rules -- look for an update in the next few days.
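One way such relaxed rules could look, as a hedged C++ sketch rather than the actual fix: replace only the characters Windows reserves in file names (`\ / : * ? " < > |` and control characters below 0x20) and leave everything else alone. The function name `sanitize_relaxed` is hypothetical.

```cpp
#include <string>

// Replace only characters that Windows forbids in file names:
// \ / : * ? " < > | and control characters (byte values below 0x20).
// Every other byte, including codepage bytes above 0x7F, is kept.
std::string sanitize_relaxed(const std::string& name) {
    static const std::string illegal = "\\/:*?\"<>|";
    std::string out = name;
    for (char& c : out) {
        const unsigned char byte = static_cast<unsigned char>(c);
        if (byte < 0x20 || illegal.find(c) != std::string::npos) {
            c = '_';
        }
    }
    return out;
}
```

With rules like these, "trådløst" passes through unchanged, while a name such as "a:b?c.png" still becomes "a_b_c.png".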

f0dder · « **Reply #11 on:** February 10, 2011, 07:01 PM »

There's probably some locale-specific (whether BCB/C++ runtime or WinAPI) version of isprint() (or whatever) you can call.
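In standard C++, one candidate is the two-argument `std::isprint` from `<locale>`, which takes the locale to consult. The sketch below assumes that route; the helper names `is_printable_in` and `locale_or_classic` are hypothetical, locale names differ between systems, and whether the BCB runtime of the time behaves the same way is not something this thread confirms.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Ask a specific locale whether a single-byte character is printable.
bool is_printable_in(char c, const std::locale& loc) {
    return std::isprint(c, loc);
}

// Try to load a named locale; fall back to the classic "C" locale when
// the system does not provide it (std::locale throws std::runtime_error).
std::locale locale_or_classic(const std::string& name) {
    try {
        return std::locale(name.c_str());
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

Passing `locale_or_classic("")` picks up the user's configured locale where one is available, so the answer for a byte like å then depends on that locale's character tables.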

Jesper Hertel · « **Reply #12 on:** February 10, 2011, 08:18 PM »

I should be able to fix it simply by relaxing the conversion rules -- look for an update in the next few days.
-mouser (February 10, 2011, 06:59 PM)

That sounds really great! Thanks!

Jesper Hertel · « **Reply #13 on:** February 10, 2011, 08:19 PM »

And thanks to f0dder for your explanations!
