Author Topic: The New line character \n\n (Read 10724 times)

Gothi[c] · « **on:** September 01, 2008, 11:56 AM »

I stumbled on the Newline article on Wikipedia today during some random browsing, and I never expected it to be so fascinating.

It covers some interesting history, including how different operating systems use different control characters to represent a new line, each for its own reasons.

There is also some confusion as to whether newlines terminate or separate lines. If a newline is considered a separator, there will be no newline after the last line of a file. The general convention on most systems is to add a newline even after the last line, i.e., to treat newline as a line terminator. Some programs have problems processing the last line of a file if it isn't newline terminated. Conversely, programs that expect newline to be used as a separator will interpret a final newline as starting a new (empty) line. This can result in a different line count being reported for the file, but is otherwise generally harmless.

Wow... I never even thought about it that way. I guess it can be interpreted both as a separator and as a terminator. I can see how some interoperability problems could occur. This is one of those things that, when I really think about it, makes me realize that it's a small miracle that software as we have it today works at all, when programs can't even agree on what a new line is, or how to treat it.
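To make the terminator vs. separator difference concrete, here is a minimal Python sketch. The function names `count_as_terminator` and `count_as_separator` are made up for illustration and are not from the article; they just show how the same text can yield two different line counts.

```python
def count_as_terminator(text: str) -> int:
    """Count lines treating "\n" as something that ends a line."""
    lines = text.count("\n")
    # A trailing fragment with no final newline still counts as a line here.
    if text and not text.endswith("\n"):
        lines += 1
    return lines


def count_as_separator(text: str) -> int:
    """Count lines treating "\n" as something that sits between lines."""
    # There is always one more line than there are separators.
    return len(text.split("\n"))


with_final_newline = "alpha\nbeta\n"
without_final_newline = "alpha\nbeta"

for sample in (with_final_newline, without_final_newline):
    print(repr(sample), count_as_terminator(sample), count_as_separator(sample))
```

The two counts agree when the file has no final newline and differ by one when it does, which matches the "different line count" effect the article describes.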

It doesn't stop there; the problem persists on our internets:

Most textual Internet protocols (including HTTP, SMTP, FTP, IRC and many others) mandate the use of ASCII CR+LF (0x0D 0x0A) on the protocol level, but recommend that tolerant applications recognize lone LF as well. In practice, there are many applications that erroneously use the C newline character '\n' instead (see section Newline in programming languages below). This leads to problems when trying to communicate with systems adhering to a stricter interpretation of the standards; one such system is the qmail MTA that actively refuses to accept messages from systems that send bare LF instead of the required CR+LF.
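As a rough illustration of the bare-LF problem, the Python sketch below detects LF bytes that are not preceded by CR and rewrites them as CR+LF before data goes out on the wire. The helper names `has_bare_lf` and `to_crlf` are hypothetical, and this is only one possible way a client might normalize its output, not what any particular mail or HTTP library does.

```python
import re

# Matches an LF (0x0A) that is not immediately preceded by a CR (0x0D).
BARE_LF = re.compile(rb"(?<!\r)\n")


def has_bare_lf(data: bytes) -> bool:
    """Return True if the data contains an LF without a CR in front of it."""
    return BARE_LF.search(data) is not None


def to_crlf(data: bytes) -> bytes:
    """Rewrite every bare LF as CR+LF; existing CR+LF pairs are left alone."""
    return BARE_LF.sub(b"\r\n", data)


message = b"Subject: hello\nFirst line\r\nSecond line\n"
print(has_bare_lf(message))
print(to_crlf(message))
```

Working on bytes rather than text avoids any newline translation the runtime might apply on its own.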

jgpaiva · « **Reply #1 on:** September 02, 2008, 04:39 PM »

Hey, just found this on the blog.
Good find, Gothi[c]!
I didn't know about that CR+LF rule; I think it'll be useful in the future.

[edit]
On the same page, read about the origin of CR+LF:

The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the teletype machine and follow its conventions. The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even after teletypes were replaced by computer terminals with higher baud rates, many operating systems still supported automatic sending of these fill characters, for compatibility with cheaper terminals that required multiple character times to scroll the display.
-http://en.wikipedia.org/wiki/Newline

[/edit]

lanux128 · « **Reply #2 on:** September 02, 2008, 11:01 PM »

Now this sheds some light on why some subtitles (*.sub, *.srt) display well on my PC but not on my media player.
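If line endings really are the culprit (which is only a guess here), a quick way to check is to count which newline styles a file actually contains. The sketch below is a hypothetical helper, `line_ending_counts`, that tallies CR+LF, bare LF, and bare CR in raw bytes; the file name in the usage comment is a placeholder.

```python
import re
from collections import Counter

# CR+LF is listed first so a pair is counted once, not as a CR plus an LF.
NEWLINE = re.compile(rb"\r\n|\r|\n")
NAMES = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def line_ending_counts(data: bytes) -> Counter:
    """Tally each newline style found in the raw bytes."""
    return Counter(NAMES[m.group()] for m in NEWLINE.finditer(data))


sample = b"1\r\nfirst cue\n2\rsecond cue\r\n"
print(line_ending_counts(sample))

# Usage on a real file (placeholder name), read in binary mode so nothing is translated:
# with open("movie.srt", "rb") as f:
#     print(line_ending_counts(f.read()))
```

A file that reports a mix of styles, or a style the player's platform doesn't expect, would be a candidate for conversion.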
