Author Topic: Expect some final server hiccups this weekend (2/12/11) - final fixes  (Read 11293 times)

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,896
We are going to try to finally fix the character set issue on the forum this weekend (2/12/2011), so buckle up and expect some hiccups.
« Last Edit: February 12, 2011, 07:20 PM by mouser »

mouser

Re: Expect some final server hiccups this weekend (2/5/11) - final fixes
« Reply #1 on: February 10, 2011, 04:57 AM »
It was so hairy we had to put it off for a week, but after a day of high-stress panic, I think we finally figured out the original cause of the character set problems, and a final fix that should make everything perfect going forward. The forum should be down for a few minutes on Thursday while we do the final tweaks.

mouser

Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #2 on: February 12, 2011, 07:22 PM »
OK, the forum was offline for a few minutes today while we implemented the final phase of resolving the character set issues, which resulted in a full switchover to a UTF-8 database.

Everything should be good now, nice and clean and working smoothly.  If you notice any issues, of course please speak up.

If people are curious I can post more about the character set issues and our sleuthing and final solutions.

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 9,153
  • [Well, THAT escalated quickly!]
Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #3 on: February 13, 2011, 04:54 AM »
If people are curious I can post more about the character set issues and our sleuthing and final solutions.
Please do!
- carpe noctem

mouser

Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #4 on: February 13, 2011, 05:41 AM »
Well basically here is what seems to have happened:

On the old server, the old MySQL forum database was using the latin1 character set (with the latin1_swedish_ci collation).
And the forum software was configured to serve up pages using HTML charset=ISO-8859-1.

This combination worked fine in the past and everything was right with the world.



Ok so now we get to the server move, where we backed up the old forum database and imported it into a new database on the new server.

Now when you back up and restore (export and import) MySQL databases using mysqldump, it appears there are some tricky pitfalls to watch out for regarding character sets.

Our intention was to keep the exact same database character set.  But when we imported the database on the new server, *something* changed in the character set, and we weren't exactly sure how or why. The MySQL tables were still declared as latin1 (latin1_swedish_ci), but the data seemed to be coming out as UTF-8.

At that time we had some threads on the forum about character set issues after the move. That was when I made my first mistake: rather than figuring out the exact cause of the problem, I found a solution without understanding the problem completely.  The solution was to switch the forum software into UTF-8 mode, which basically serves up pages labeled as having character set UTF-8 in the HTML headers.

This essentially solved the character set problems on the forum, and we ran that way for a month.

Except... there were some very small lingering issues.



The lingering issues were a rare and non-fatal database warning about "Illegal mix of collations", and the fact that at least one non-English signature (DC member fenixproductions' signature) stopped working (turned to ????).
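For anyone wondering how text can turn into question marks: below is a minimal sketch of one plausible mechanism, not a confirmed diagnosis of what happened to that signature. It assumes that somewhere in the chain the text was converted to latin1, which has no slot for many non-English letters, so each of those gets replaced with "?". The sample string and the helper name `to_latin1_lossy` are made up for illustration; the real signature text is not shown in this thread.

```python
def to_latin1_lossy(text: str) -> str:
    """Convert text to latin1 and back, replacing anything latin1 lacks with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


# Hypothetical sample: the Polish city name "Łódź", written with escapes.
sample = "\u0141\u00f3d\u017a"

# "ó" (U+00F3) exists in latin1 and survives; "Ł" and "ź" do not.
lossy = to_latin1_lossy(sample)
print(lossy)
```

Text made up entirely of characters outside latin1 would come out as nothing but question marks, which matches the symptom.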

Eventually, what we did this weekend to solve the remaining problems was to drop the database again and re-import it, this time forcing all tables and columns to UTF-8.  The source of the remaining issues was that the data was in UTF-8 format while the tables were still marked as latin1 (latin1_swedish_ci).  So all is good now, and every player in the chain agrees that the data is pure UTF-8.
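One way to "force" the tables and columns to UTF-8 on re-import is to rewrite the character set declarations in the dump file before loading it. The sketch below is only an illustration under that assumption, not necessarily the exact procedure used here. The function name `force_utf8_declarations` and the sample table are hypothetical, and the choice of `utf8_general_ci` as the target collation is likewise an assumption.

```python
import re


def force_utf8_declarations(dump_sql: str) -> str:
    """Rewrite latin1 charset/collation declarations in dump text to utf8.

    Caveat: this is a plain text substitution, so it would also alter post
    content that happens to contain these exact phrases.
    """
    # Table-level default, e.g. ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"
    out = re.sub(r"\bDEFAULT CHARSET=latin1\b", "DEFAULT CHARSET=utf8", dump_sql)
    # Column-level character set, e.g. "text CHARACTER SET latin1"
    out = re.sub(r"\bCHARACTER SET latin1\b", "CHARACTER SET utf8", out)
    # Collation, written either as "COLLATE name" or "COLLATE=name"
    out = re.sub(
        r"\bCOLLATE([ =])latin1_swedish_ci\b", r"COLLATE\1utf8_general_ci", out
    )
    return out


# Hypothetical fragment of a dump file.
sample_dump = (
    "CREATE TABLE example_messages (\n"
    "  body text CHARACTER SET latin1 COLLATE latin1_swedish_ci\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"
)
converted = force_utf8_declarations(sample_dump)
print(converted)
```

This only changes how the tables are declared. It relies on the dump's data already being UTF-8, which is what turned out to be the case for us.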



Now although it may seem like a straightforward solution in hindsight, what really threw me for a loop and caused so much consternation and struggle is that after we decided last week that we needed to get to the bottom of the remaining problems, we couldn't figure out WHY the data was being served up from MySQL in UTF-8 format when it had always been latin1.  And we were really nervous about trying to convert everything to UTF-8.  SMF has a script to convert data to UTF-8, and we actually tried running it on our backup forum database as our first attempt at a fix; it failed badly enough and subtly enough to make me really paranoid about converting to UTF-8.

It was this fear that some post data might not convert properly to UTF-8 using the scripts that made me resist the idea of switching over to UTF-8 and focus more on getting the forum database back to its original character set.

In the end though, what *appears* to have happened is that when the database was exported/dumped, MySQL was automatically, without any clues, CONVERTING it to UTF-8 in the export dump.  So when we imported it back in, even with explicit latin1_swedish_ci tables, it ended up storing the newly converted UTF-8 data in those tables without our even realizing it.  After satisfying myself that all of the forum data actually had been converted to UTF-8 smoothly, we decided to embrace UTF-8, and that is what we are using now and going forward.
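To make that chain of events concrete, here is a small simulation of what *appears* to have happened, using plain byte strings instead of a real database. It assumes the dump was written as UTF-8 and that the import stored those bytes unchanged in the latin1-labeled column. The sample word and the `render` helper are made up for illustration.

```python
original = "caf\u00e9"  # "café", a hypothetical bit of post text

# Old server: a latin1 column holds one byte per character.
stored_old = original.encode("latin-1")

# Export: the dump is written as UTF-8, so "é" becomes two bytes.
dump_bytes = stored_old.decode("latin-1").encode("utf-8")

# Import (assumption): the UTF-8 bytes are stored as-is in the latin1 column.
stored_new = dump_bytes


def render(stored: bytes, page_charset: str) -> str:
    """What a browser shows when the stored bytes are sent with this charset label."""
    return stored.decode(page_charset)


# Old page label: the two UTF-8 bytes show up as two wrong characters.
print(render(stored_new, "iso-8859-1"))
# UTF-8 page label: the text looks right again.
print(render(stored_new, "utf-8"))
```

That is why switching the forum software to UTF-8 mode made most things look fine, even though the tables themselves were still declared as latin1.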



And with yet another international character set experience under my belt, I can safely say: There is no god.

These character set issues and interactions are truly a nightmare of epic proportions and we can only hope that if there is life on other planets they have figured out how to use a single and SIMPLE (1 byte) universal character set, and that they invade us soon and impose their language upon us.  I welcome our new 1-byte-character-set overlords.

Ath

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 3,612
Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #5 on: February 13, 2011, 08:42 AM »
I welcome our new 1-byte-character-set overlords.

Assuming their bytes are sized 16 of our bits, all would be just fine ;)

Thanks for the detailed explanation.

f0dder

Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #6 on: February 13, 2011, 09:15 AM »
I welcome our new 1-byte-character-set overlords.
Assuming their bytes are sized 16 of our bits, all would be just fine ;)
Actually, it wouldn't - the 16-bit UCS-2 (as opposed to Windows' UTF-16) unfortunately doesn't cover every possible glyph. I've been considering whether it wouldn't be simpler if we just nuked every country with stuff that can't be represented in UCS-2, but there are probably mathematicians or physicists who disagree :) (and humanists, but who cares about them?)
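A quick sketch of the difference, with helper names made up for illustration: UCS-2 is limited to code points up to U+FFFF, while UTF-16 reaches the rest by spending two 16-bit units (a surrogate pair) on one character.

```python
def fits_in_ucs2(ch: str) -> bool:
    """True if a single character lies in the 16-bit range UCS-2 can address."""
    return ord(ch) <= 0xFFFF


def utf16_code_units(ch: str) -> int:
    """Number of 16-bit units UTF-16 needs for this character."""
    return len(ch.encode("utf-16-le")) // 2


euro = "\u20ac"        # EURO SIGN, inside the 16-bit range
clef = "\U0001d11e"    # MUSICAL SYMBOL G CLEF, outside it

print(fits_in_ucs2(euro), utf16_code_units(euro))
print(fits_in_ucs2(clef), utf16_code_units(clef))
```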
- carpe noctem

mouser

Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #7 on: February 13, 2011, 09:20 AM »
I think we should hold a planet-wide lottery -- everyone fills out a ballot writing in their choice of one written language and one spoken language.  then we pick one ticket out of a hat, and we all convert to that and ban the rest.

Renegade

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 13,288
  • Tell me something you don't know...
Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #8 on: February 13, 2011, 10:01 AM »
I think we should hold a planet-wide lottery -- everyone fills out a ballot writing in their choice of one written language and one spoken language.  then we pick one ticket out of a hat, and we all convert to that and ban the rest.


So, I take it that you're up for learning Mandarin or Urdu? :D
Slow Down Music - Where I commit thought crimes...

Freedom is the right to be wrong, not the right to do wrong. - John Diefenbaker

Ath

Re: Expect some final server hiccups this weekend (2/12/11) - final fixes
« Reply #9 on: February 13, 2011, 10:17 AM »
So, I take it that you're up for learning Mandarin or Urdu? :D

It's all Greek to me :tellme: