Tuesday, March 22, 2005

Tab Wars


EPISODE II
It is a period of civil unrest in programming circles as the interesting and highly irrelevant conflict over programming style rages on. While all citizens agree on the importance of indentation, turmoil has arisen with the emergence of the TAB CHARACTER.

Striking from a secret base on ASCII 9, the Tab stealthily convinces amateur and professional programmers alike to use its alluring single-character-indent to create indented code, yet its highly display-dependent behaviour could spell certain doom for GOOD PROGRAMMING PRACTICES.

It is now up to a small band of intrepid programmers to rebel against the Tab to restore order to the University...
Hehehe... OK it's time to go public on this! (You're real crayzeh Matt!)

Obviously the debate is not new. Wikipedia even makes reference to it in the Tab article:
In computer programming, the use of tabs for code formatting and indentation is an ongoing debate. Programmers are generally divided into two camps - those who use hard tabs in their code, and those who configure their editors to insert actual space characters when they press the tab key. When tabs are replaced to spaces in this way they are referred to as soft tabs.
The reason I am currently in a ranting mood is that I have just gotten the verification on my programming project back, and I failed every test because my output did not match the expected output. Even after I spent an hour and a half in the lab trying to get my output to match, I could not. Finally, it turned out the "expected" output was using hard tabs (ASCII 9) while my program was using soft tabs (a run of spaces, ASCII 32). What made it altogether confusing was that some of the tabs looked like spaces, some of them looked like tabs (only when selected), and in general the output, displayed in three different environments, looked completely different in each.
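
A minimal sketch of how a mismatch like this can be made visible, assuming Python (not the language of the project above). The helper name `show_whitespace` and the two sample strings are made up for illustration; the idea is simply to print tabs and spaces as visible markers before comparing.

```python
def show_whitespace(text: str) -> str:
    """Return text with tabs and spaces replaced by visible markers."""
    # Replace tabs first: the "<TAB>" marker itself contains no spaces.
    return text.replace("\t", "<TAB>").replace(" ", ".")


# Hypothetical sample data: one line separated by a hard tab (ASCII 9),
# one separated by four spaces (ASCII 32).
expected = "Name\tScore"
actual = "Name    Score"

print(expected == actual)         # False
print(show_whitespace(expected))  # Name<TAB>Score
print(show_whitespace(actual))    # Name....Score
```

Depending on the tab stops in use, the two strings can look identical on screen, but the comparison fails and the markers show why.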

Rant over, now time for the arguments. This is a long-running debate I've been having with Tim - so feel free to comment any arguments you may have (I won't post your emails without permission).
I simply find so-called "hard tabs" to be very annoying (and feel I have a right to complain since I've been editing an awful lot of other people's hard-tabbed code as of late). Essentially, one will notice that every OS and in fact every program can choose to display tabs differently. Some display them as four spaces, some display as eight. Even more annoying is when they simply choose a number of pixels, and it doesn't line up with any text columns at all.

ASCII 9 is the only displayable ASCII character (besides line-break) which does not take up exactly one char-width in a monospaced font. Therefore it ruins the column spacing (ie. programs you wish to keep within 80 columns will be difficult to count) - furthermore this makes it impossible to get lines with different numbers of tabs to line up. The bottom line is: hard-tabs give programmers no control over how their code appears.
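
The alignment problem can be reproduced with a small sketch, assuming Python and its built-in `str.expandtabs`, which expands each tab to the next multiple of a given tab width. The two sample lines and the helper `comment_columns` are invented for illustration.

```python
def comment_columns(lines, tab_width):
    """Return the column where "/*" starts in each line at a given tab width."""
    return [line.expandtabs(tab_width).index("/*") for line in lines]


# Hypothetical code lines whose trailing comments were aligned with hard tabs
# by someone whose editor showed tabs as 8 columns.
lines = [
    "x = 1;\t\t/* short */",
    "total = x + 1;\t/* longer */",
]

print(comment_columns(lines, 8))  # [16, 16] - comments line up
print(comment_columns(lines, 4))  # [12, 16] - same bytes, no longer aligned
```

The file has not changed between the two calls; only the display setting has.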

With soft-tabs (ie. spaces) it is possible to have complete control over appearance. All characters are the same width and display is identical on all monospaced displays. Surely any style-control freak wants this advantage.

There seem to be only two advantages of hard-tabs - smaller file size, and ease of typing. Smaller source code files are hardly relevant today. And most modern editors will (or should) automatically type a predetermined number of spaces when the Tab key is hit. Personally, when working with tab characters, I find it disturbing when the tabs at the end of a line (eg. between code and comment) move around as I'm typing.

The only argument I find remotely valid is the one that tabs make code more "dynamic" ie. the indent length is not fixed after code has been typed - the user can alter the display dynamically. The simple response to this is - I have never seen a program which allows you to choose the internal representation of tab size. Therefore it is not up to the user, but the user's program, as to how tabs are represented. Don't like.

And finally, let's see what Wiki has to say on the Indent style page:
Many early programs used tab characters for indentation, for simplicity and to save on source file size. Unix peripherals would generally have tabs equivalent to eight characters, while Macintosh environments would set them to four, creating confusion when code was transferred back and forth. Modern programming editors are now often able to set arbitrary indentation sizes, and will insert the appropriate combination of spaces and tabs. They can also help with the cross-platform tab confusion by being configured to insert only spaces.
Tab:
There are many arguments for and against using hard tabs in code. What can be said without doubt is that one early benefit of tabs, i.e. compression (see above), is now less relevant as storage is so cheap, and sophisticated compression algorithms can provide much greater benefits.

Tabs versus Spaces: An Eternal Holy War

So go ahead, rant your brains out, refute, agree... it will be interesting to see if you join the rebellion or empire.

"The section now illuminated is the Floating Point Unit, one of my personal favorite units."
- Professor Frink

15 Comments:

At 9:00 pm, Blogger Eat_My_Shortz said...

Oh, and I'd like to add some languages like Haskell have a strictness with indentation. If tabs are used, programmers cannot be sure how the language itself is interpreting the indent.

 
At 9:03 pm, Blogger Andrew said...

Great post.

I like the concept of soft tabs better.

 
At 10:27 pm, Blogger Toria/Deb said...

You say this "The bottom line is: hard-tabs give programmers no control over how their code appears." and the guy on here http://www.jwz.org/doc/tabs-vs-spaces.html

says this " I just care that two people editing the same file use the same interpretations, and that it's possible to look at a file and know what interpretation of the TAB character was used, because otherwise it's just impossible to read."
and even though I don't really understand the true basis of the argument, I'd say to heck with hard tabs. :) Mind you, I may be a leeetttllleeee biased ;)

 
At 10:48 pm, Blogger Eat_My_Shortz said...

Yay andrew. Thx.
Toria - umm although ur not a programmer (hehe) thx for ur comment too. I'm not sure which way that quote is going, it actually seems to be FOR hard-tabs.

I guess I was a little vocal before. It's just that every time I try to use hard-tabs it pisses me off to no end. As far as I'm concerned the world would be a better place if there was no such thing as ASCII 9.

 
At 11:18 pm, Blogger Tim Cuthbertson said...

Long live ASCII 9!

"I have never seen a program which allows you to choose the internal representation of tab size."
That's a lie, that is! Both text editors I use (notepad doesn't count) allow the internal tab space to be set in number of characters. And I know for a fact that you've used one of these, being dev-cpp.

Some people like big indents (6 spaces), some like small indents (2 spaces). With hard tabs this preference can be set individually by the user. With soft tabs we have to go with whatever the author happens to like. Also, it's far easier to convert hard tabs to soft tabs, but much more troublesome to convert spaces into tabs (lest we accidentally convert spaces that are in the code themselves).
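
The asymmetry between the two conversions can be sketched as follows, assuming Python. Both function names and the tab width of 4 are assumptions made for illustration. Going from tabs to spaces is a single built-in call, while going back has to restrict itself to leading indentation so that spaces inside the code are left alone.

```python
def tabs_to_spaces(line: str, width: int = 4) -> str:
    """Hard tabs to soft tabs: one built-in call does it."""
    return line.expandtabs(width)


def leading_spaces_to_tabs(line: str, width: int = 4) -> str:
    """Soft tabs to hard tabs, touching only the leading indentation."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    # Whole groups of `width` spaces become tabs; any remainder stays as spaces.
    return "\t" * (indent // width) + " " * (indent % width) + stripped


source = '        print("a    b")'

print(repr(leading_spaces_to_tabs(source)))  # '\t\tprint("a    b")'
# A naive global replace also rewrites the spaces inside the string literal:
print(repr(source.replace("    ", "\t")))    # '\t\tprint("a\tb")'
```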

Tabs give more semantic meaning to an indented document. In a word processor should you keep making line breaks until you get to the end of a page, or should you use a page break? You should use a page break, because it is a single entity which is designed for that purpose. The same goes for tabs.

BTW - good post. It sure got me fired up ;)

 
At 11:38 pm, Blogger Eat_My_Shortz said...

Ahh u can always rely on Tim to come up with an explanation which makes u stop and think...

>I know for a fact that you've used one of these, being dev-cpp.
Erm... have i? Damn.

>With soft tabs we have to go with whatever the author happens to like.
Yes that's also true, but I'd argue that's because the author set it up nice.
By that logic one could also argue for a system whereby variables are given indices and then the IDE reads from a list of indices to give variables a name, such that the current reader of the code can choose the variable names that suit them.
The point is - the author makes the choices. THEY ARE THE AUTHOR.

>Also, it's far easier to convert hard tabs to soft tabs, but much more troublesome to convert spaces into tabs
Can't argue with that (the fact it's easier).

Here's a good analogy - cordial comes neat but u then mix with water to make something actually good. It's easier to make neat cordial into mixed cordial but bloody hard to go back the other way.
But which is more useful? Do u actually want to go back?

>Tabs give more semantic meaning to an indented document.
Stop using that word "semantic". It doesn't mean that!

Ahh I see your point with page breaks. Some editors leave a gap and others don't.
Well - word processors which don't should be shot.
Text editors do not, but they read ASCII and are a totally different format.

 
At 12:31 am, Blogger Tim Cuthbertson said...

Well I like the word semantic, so sue me ;P

"The point is - the author makes the choices. THEY ARE THE AUTHOR."
This is true, and your case of the variable naming is a good one. But variable names give meaning, whereas indentation is merely a visual element. And people have preferences about visual elements. Would you want to have to stick to the author's syntax colour-coding scheme? Editors are configurable to our individual needs and preferences, and I don't see why indentation should have to conform to the author's preference.

Also, we all have variable naming conventions that we stick to. If we could apply these automatically to others' code I'm sure we would, but in that case it's simply too difficult to automate.

Kind of true about the cordial thing, although you're missing the fact that I would be drinking the straight cordial in this case, as would many others. I never convert tabs to spaces, as I have no objection to the TAB character. So since some people do drink it straight, won't you please think of the sugar-hyped kiddies and let them decide whether they want it diluted or not?

Oh and you made a point earlier about Haskell depending on indentation. The point is, as long as you are consistent (ie you don't use tab for some lines and spaces for others) it is irrelevant whether tabs or spaces are used.

hehe this is fun.. I should be in bed, but pffft ;)

 
At 12:54 am, Blogger Eat_My_Shortz said...

>hehe this is fun..
Yess.... at last a blogwar!
And a friendly one too!

>Would you want to have to stick to the author's syntax colour-coding scheme?
Yeah true. But what about brace style? I know I like BSD/Allman and you like K&R. (This battle I am not prepared to fight since I am not so patriotic about it ;)) But the point is - every author chooses their style too. That's NOT like variable names. It's like visual style ;)

>The point is, as long as you are consistent...
Man, I have seen a lot of people's projects tonight. We're talking about (let's say ____ is tab and '.' is space)
____.____..Code.____./*.Comment____.*/
EEP
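
Lines like that can be flagged mechanically. A minimal sketch, assuming Python; the helper name `indentation_kinds` is hypothetical, and the sample string is the line above with ____ written as a tab and '.' as a space.

```python
def indentation_kinds(line: str) -> set:
    """Return which whitespace characters appear in a line's leading indent."""
    indent = line[: len(line) - len(line.lstrip(" \t"))]
    kinds = set()
    if "\t" in indent:
        kinds.add("tab")
    if " " in indent:
        kinds.add("space")
    return kinds


# The example line from the comment, transcribed into real characters.
messy = "\t \t  Code \t /* Comment\t */"

print(indentation_kinds(messy) == {"tab", "space"})  # True: mixed indent
print(indentation_kinds("    Code"))                 # {'space'}
```

Any line whose result contains both kinds mixes tabs and spaces in its indent, which is the case both sides of this argument would want to avoid.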

>you're missing the fact that I
>would be drinking the straight
>cordial in this case, as would many
>others.
No, no. That's exactly my point.
IT'S BAD FOR YOU AND YOU'RE ALL NUTS!

 
At 1:33 am, Blogger Toria/Deb said...

LOL @ U 2 and your "cordial" comparisons. My take is that you've both got valid, equal arguments and there's no "right or wrong" here. I think, honestly, it's a matter of adopting a standard of practice, which doesn't seem to have happened quite yet. According to a programmer friend, here's his take on it: "Tabs are fine, and are used by professional and long term programmers. They're only dangerous when you don't have your software client configured properly and it uses a different encoding for a TAB and inserts a different, more subjective meaning into the code." I, myself, don't know enough about it to form an opinion.

As far as Cordial straight, *blech* gag me green! But hey, I'm not biased right? ;)

Good luck in the next 2-3 days and see you on the other side. Yayyyyy, Easter and chocolate and bunnies and daffodils and tulips and soft, cuddly toy bunnies coming up. May the kid in "all" of us emerge at Easter. Happy Easter to all!

 
At 2:02 am, Anonymous Anonymous said...

Oh fsck the /bunnies

>The point is - the author makes the choices. THEY ARE THE AUTHOR.
Yes and you are (for some reason or other) trying to read the code. Some code layouts are confusing and hard to read - others may also find your style hard to read.

That's why reformatter utilities exist and group projects often have strict formatting and code style rules so the developers involved don't have arguments/heated debate.

How's that for a nice neutral answer ;)

 
At 2:06 am, Blogger Eat_My_Shortz said...

I get the feeling I'm losing the argument, not from the debate point of view, but just from sheer numbers (he seems neutral on here, but on MSN it's a different story :D)

Anyway, reformatters sound like a good idea. Especially if you get one of the dodgy spyware ones which read your code and send it secretly to the author....

Anyway, I am looking forward to this argument resurfacing in 3rd year when we actually have to have a project! Rest assured I will make sure the code style on my team is top-notch. (I'll try not to be too fussy about tabs, but the other shonky code practices will have to go!)

 
At 2:10 am, Anonymous Anonymous said...

>Anyway, reformatters sound like a good idea. Especially if you get one of the dodgy spyware ones which read your code and send it secretly to the author....

Now that's a new idea. I'll have to check the source to KDevelop now ;o

 
At 2:12 am, Blogger Eat_My_Shortz said...

Well... it's just a little paranoid fear I have... that programs you ask to read your code are going to do more than you think ...

 
At 2:24 am, Anonymous Anonymous said...

News flash.
GCC, KDevelop, Autoconf, Automake, Libtool and many more are under intense scrutiny for possible Remote Code Transmission (RCT).
No comment from Linus Torvalds yet. Microsoft declined to comment on any specifics of VB or Visual Studio .NET while hinting that developers should check their product EULA documents carefully and "some active intellectual property and patent methods are incorporated in the compiler toolchain".

 
At 3:02 am, Blogger Eat_My_Shortz said...

What does that mean?
And are you having me on?

(Hey! This is my blog! I decide who should be had on!)

 
