Getting rid of context dependency in programming languages

An example to illustrate the problem: there are only three kinds of parentheses ("()", "[]", "{}") and two quotation marks (" and ') in pretty much all of today's languages. This is a totally unnecessary limitation now that we have Unicode and UTF-8. Sure, there is a limited number of keys on the keyboard, but there are ways to get around that (see below).


In the 70s, when C and the C-style languages were developed, one had only 7-bit characters, and in the 80s and 90s one had 8-bit characters, but Microsoft/Unix/Apple couldn't agree on a common standard, so it made sense to keep to the characters that were common to the different character sets.

The way one dealt with the limited number of characters was to let them be context dependent. (I think the philosophy was that it could work just like natural languages, where words are context dependent.)

Letting characters and symbols be context dependent doesn't add anything for the user; it just requires the parsing tools (compilers, optimizers, syntax-coloring tools, ...) to be more complicated (typically by having an intermediate representation of the language).
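A minimal sketch of what context dependence costs the parser (assumption: run under Node.js; `countStrings` is a hypothetical helper I made up for illustration). Because the opening and closing quote are the very same character, a scanner can't tell them apart by looking at the character alone; it has to carry state:

```javascript
// The same '"' character both opens and closes a string, so a naive
// scanner must track whether it is currently "inside" a string.
function countStrings(src) {
  let inString = false;
  let count = 0;
  for (const ch of src) {
    if (ch === '"') {
      inString = !inString;   // same character means open OR close,
      if (!inString) {        // depending entirely on current state
        count++;
      }
    }
  }
  return count;
}

console.log(countStrings('say "hi" and "bye"'));
```

With distinct open/close quote characters (as Unicode could provide), the scanner could recognize each quote's role without any state at all.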


At Uppsala University, in Uppsala where I live, they have a compilers course which I haven't taken yet, so I am a bit on thin ice here, but:

I think one might be able to skip the intermediate representation in the compiler/optimizer and other tools, and thereby make the tools much simpler.

Drawbacks of using a limited number of characters (and letting characters be context dependent):

Complexity of the tools

Lots of modern programming languages don't have any optimization at all. JavaScript has. (PHP didn't have any when I last checked, but I've read that they might have fixed it.) Ruby and Python don't have any optimization. A test one can make is for($i=0; $i<BIGNUMBER; $i++){ $a=$a+$i; } (although I think JavaScript fails this).
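The loop test above can be sketched in JavaScript like this (assumptions: run under Node.js, and `BIG` is an arbitrary stand-in for BIGNUMBER). An optimizing runtime should finish this orders of magnitude faster than a plain interpreter:

```javascript
// Stand-in for BIGNUMBER from the text; chosen arbitrarily.
const BIG = 10_000_000;

// Time the naive summation loop.
const start = process.hrtime.bigint();
let a = 0;
for (let i = 0; i < BIG; i++) {
  a = a + i;
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(`sum = ${a}, time = ${elapsedMs.toFixed(1)} ms`);
```

The final sum should equal BIG*(BIG-1)/2, so the loop can't simply be skipped by the runtime; the interesting number is the elapsed time across different interpreters.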


I also read somewhere that the optimization in the JVM and in V8 is basically thanks to the same guy. My conclusion from this is that the lack of optimization in interpreted languages is simply because it's hard.

C/C++ and Java have optimization, so good for them, but it is via a compiler. (And in my mind they have other drawbacks (every language has drawbacks) (I do like C/C++ on some occasions, so don't get me wrong).)

Limited functionality:

Security problems

Ex: JSON hijacking (a security problem in JavaScript). (The problem is that the "{}" parentheses are used both for blocks of code and also for declaring objects.)
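A minimal sketch of that ambiguity (assumption: run under Node.js). The exact same characters `{ a: 1 }` parse as a code block in statement position but as an object literal in expression position, which is the context dependence at the root of the JSON-hijacking class of problems:

```javascript
// In statement position, "{ a: 1 }" is a block containing a label "a:"
// followed by the expression statement 1 — not an object at all.
const asStatement = eval("{ a: 1 }");     // completion value is 1

// Wrapped in parentheses it sits in expression position, so the very
// same characters now parse as an object literal.
const asExpression = eval("({ a: 1 })");  // an object { a: 1 }

console.log(asStatement);     // 1
console.log(asExpression.a);  // 1
```

If blocks and object literals used distinct bracket characters, a bare JSON response could never be reinterpreted as executable code this way.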

Drawbacks of using more (UTF-8) characters, and ways to deal with them

The drawback of using more characters is of course at the input: the keyboard is limited. I can think of two ways to deal with this:


Magnus Andersson 1EaH9tedmQSM7oJLtp2wsGixTZ7JfKqp2t