Wednesday, March 30, 2005

The latest programming language: "English"

Back in primary school, I recall one of my (few) programmatically-inclined friends telling me he had this cool program which allowed him to simply tell it "I want a golf game, and the user should be able to select from a number of clubs. When you hit the ball with the club it should have physics applied to it," etc. At the time, both of us knew he was joking around. But now it looks like his dream may become a reality. :o

Cheque out this article:
(And thanks to Toria for pointing this out)
Tool turns English to code. That's right. Some tool has gone and done it.

Researchers at MIT have come up with a piece of software, named "Metafor", which takes simple English sentences, given as part of a moderated conversation with the software itself, and converts them into code structures in languages such as Java, Python and Lisp (yay).

Check out this screenshot for a pretty cool demo.

Although it's still got a long way to go, it does seem to be able to generate some simple data structures (there is probably some way to go with algorithms!). The article states that it can do about 20% of the program for you.

The interviewee, Hugo Liu, stated: "Natural language is so semantically rich and flexible that if it could be computationalized as a programming language, maybe everyone could write programs."
I'm thinking... maybe it *isn't* such a good idea! Can you imagine all of the horribly set-up data structures if we let the average redneck English-talker in on the vast secrets of programming?

It seems to be designed to teach beginning programmers how to set up data structures. Since most of us are taught to think about structures and even algorithms in English first before coding, this may be a useful logical step. So I'm saying, it looks promising, if only as a teaching aid. I, however, will not give up my semantically structured programming languages for the promise of human code.

One phrase does come to mind: Very very very very high level.

It does not yet have its own wiki. I think I need a break. ;)

In, err, somewhat lower-level programming bloginess, check out this piece of code I found on the L33t programming language website.
#define print(x) main(){printf(x);return 0;} /*
>+++++[<++>-]<[>++++
# +++<-]>++.+.[-]>+++++[<++>-]<.[-][
#
# This polyglot prints "HI" when run in
# Brainf***, C, COW, Perl, Python, Gammaplex, l33t, and ruby
#
# */
print ("HI\n")
#/*
# @X"H"Xr X"I"Xr RE
# moOMoOMoOMoOMoOMoOMOOmOoMoOMoOmoOMOomoo
# mOoMOOmoOMoOMoOMoOMoOMoOMoOMoOmOoMOomoo
# moOMoOMoOMooMoOMooMOOMOomoomoOMoOMoOMoO
# MoOMoOMOOmOoMoOMoOmoOMOomoomOoMooMOOMOo
# moo 5 0 7 99999998 1 7 0 1 8 9999998 1 91
# ] */


Note: Where mine says "Brainf***", the original code featured the F word, which I do not like to include on my blog. Also, the second line should be on the end of the first line, but it was too long to display. I'm not sure if these changes ruined it for any of the languages. If it doesn't work, you know which word to put back in.

This is a polyglot, which sounds like some sort of multiple-throat-infection, but is actually a program which runs in multiple languages.
This one is pretty cool - it works in eight different languages, four of which (Brainf***, COW, Gammaplex and L33t) are esoteric.

You can pretty easily see how it works. The C code, for example, consists of just the #define on the first line and the print ("HI\n") call further down, which the macro expands into a complete main() function. Everything else sits inside C-style comments (/* */), so C doesn't have to worry about all the other crap.
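
To make that concrete, here is a rough sketch in Python (one of the polyglot's own languages) of what is left for the C compiler once the block comments are gone. The names EXCERPT, strip_c_comments and visible_to_c are made up for this illustration, the excerpt is a shortened copy of the listing above, and the regex is a deliberately naive stand-in for a real C preprocessor: it knows nothing about string literals or // comments.

```python
import re

# Shortened copy of the polyglot (most of the comment body is left out).
EXCERPT = r'''#define print(x) main(){printf(x);return 0;} /*
>+++++[<++>-]<[>++++
# This polyglot prints "HI" when run in
# */
print ("HI\n")
#/*
# moo 5 0 7 99999998 1 7 0 1 8 9999998 1 91
# ] */
'''


def strip_c_comments(source):
    """Replace every /* ... */ block comment with a single space.

    Naive on purpose: assumes no comment markers hide inside string literals.
    """
    return re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)


def visible_to_c(source):
    """Return the non-empty lines that survive comment stripping."""
    lines = (line.rstrip() for line in strip_c_comments(source).splitlines())
    return [line for line in lines if line]


if __name__ == "__main__":
    for line in visible_to_c(EXCERPT):
        print(line)
```

Only three lines survive: the #define, the print ("HI\n") call, and a lone #, which C accepts as an empty preprocessor directive and ignores.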

Of course, including the esoteric languages probably made this easier to write, since Brainf***, l33t and COW do not care about any characters besides their own (><+-,.[] in Brainf***, numerals in l33t, and variants on the words "Moo", "Mmm" and "Oom" in COW). Therefore they pretty much have separate programs written and treat the rest as comments.
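
Here is a minimal sketch of the "everything else is a comment" idea, again in Python. It is a bare-bones Brainf*** interpreter of my own (run_bf, BF_COMMANDS and EXCERPT are illustrative names, not anything from the L33t site), assuming the usual conventions of 8-bit wrapping cells and a 0 on end of input. It runs over a shortened copy of the listing with the COW and l33t lines left out. Notice the trailing [-][ on the third line: it zeroes the cell and opens a loop that is skipped all the way to the ] on the last line, so the commas in the English comment (which would otherwise be read-input commands) never execute.

```python
# The eight Brainf*** commands; every other character is ignored.
BF_COMMANDS = set("><+-,.[]")

# Shortened copy of the polyglot (COW and l33t lines left out).
EXCERPT = r'''#define print(x) main(){printf(x);return 0;} /*
>+++++[<++>-]<[>++++
# +++<-]>++.+.[-]>+++++[<++>-]<.[-][
# Brainf***, C, COW, Perl, Python, Gammaplex, l33t, and ruby
# */
print ("HI\n")
# ] */
'''


def run_bf(source, stdin=""):
    """Run source as Brainf*** and return everything it prints."""
    # Anything that is not one of the eight commands is a comment.
    code = [ch for ch in source if ch in BF_COMMANDS]

    # Pair up the brackets once so loops can jump directly.
    jump, stack = {}, []
    for pos, ch in enumerate(code):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jump[start], jump[pos] = pos, start

    tape = [0] * 30000
    ptr = pc = 0
    pending = list(stdin)
    out = []
    while pc < len(code):
        ch = code[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = ord(pending.pop(0)) if pending else 0
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]  # go round again
        pc += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(run_bf(EXCERPT)))  # prints 'HI\n'
```

The C macro, the print call and the English text contribute no commands that ever run, so the Brainf*** "program" is really just the second and third lines.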

I've verified the code for C, Brainf*** and COW. L33t is a very very difficult language - I've mostly verified it (by hand) but I can't quite work out one of the instructions - (by my calculations it would all be fine if the word "l33t" wasn't in the program - perhaps it isn't meant to be there?). Also l33t does not seem to generate the \n (newline) character while the other languages I've checked out do.

There are more exciting polyglots on this wiki page. And an even cooler one here which actually works in eight real languages - COBOL, Pascal, Fortran, C, Postscript, Shell script {shudder}, Perl, and even x86 machine language! Czech it out!

Lisa: "Dad you can't judge a place you've never been to."
Bart: "Yeah, that's what they do in Russia."

2 Comments:

At 11:00 am, Blogger Eat_My_Shortz said...

Wow! I just tried running the x86 machine code version and it worked! No compilation/assembly needed, this is an actual binary!

DOS/Windows users:
1. Download the ZIP file - that is formatted to the CRLF (DOS) standard.
2. Unzip and get out the text file, then rename it so that it ends in ".com". (COM is a raw-formatted executable used by DOS).
3. Run it from the command line.

Unix/Linux users:
1. Download the TEXT file - that is formatted to the LF (UNIX) standard.
2. Set execute permission.
3. Run it from the command line.

Isn't that cool!?

 
At 11:36 am, Blogger Toria/Deb said...

Totally, awesome, amazingly cool, indeed. And to think that silly old me, just cruising 'ole /. came across a actually "interesting" article ;) hee hee, glad you liked it and Whoopee, you blogged about it :) :) :) Now, all you programmer types, get working!

 
