This summer I’ve been doing something about the fact that I don’t get
as much chance to work with C as I would like. One of the problems
I’ve had is that I just don’t have enough programming problems that
would require the badassery of C. 99% of the time it is much easier
for me to dash off a program in Python and be done quickly. Reflecting
on why I tend not to use C, I realized that Python has a
list of features that are so extremely useful that C is hard to
justify. Giving this some thought, I realized that I have been missing
the point of C. No, C does not come with the features I love about
Python, but nor does it come with the "features" I don’t necessarily
care for in PHP. What it does come with is a blank slate, the minimum
components to express yourself in software (that is portable). And
here’s the important part - if something is missing in C that you
think is important, you are free to implement it and #include it
for the rest of your life. Just the way you like it.
With this new spirit I’ve been merrily attacking problems in C which I normally would expediently solve with Python. I’ve been adding my own strategies for dealing with dynamic memory, strings, data structures, etc. I find C is already extremely good at dealing with Linux calls, file system operations, unvarnished IO, etc. And once into seriously hard stuff (regexp, hard math, etc) C has libraries that are often the last word on the topic.
When I started this agenda, I had hesitated because there was one thing about Python that I loved above all other features in all programming languages I’d ever seen and that was semantic whitespace. I was thinking that it would be a real downer to have to start using all those horribly inscrutable curly braces like sloppy programmers use. I say sloppy programmers because if you are not sloppy, as any Python programmer will tell you, you do not need curly braces to delimit blocks. Basically this horrible syntax element, "{…}", rewards you with the license to write sloppy code. No thanks, and no thanks.
I realized that Python is a C program. There is no reason why I couldn’t use C to write a conversion parser that took correctly indented braceless clean code and output compilable C code. This was a great test of C done the Right Way, i.e. my way. Your way may differ. That’s the beauty of C.
Although the details were quite a puzzle, I finally got the core functioning. The main test was to strip down a version of the program to my version of C, convert it with the unstripped version, and then compile that result. Here is what my code looks like for the function that scans a long character array containing the code and prints a valid C program along the way (i.e. this function is an example of input as well as the primary functionality).
The extension for this style is ".cno" as in "C, no braces" or "snow", the white style of C.
int scan_subsection(char *s, int l)
    int n, i, line_end, line_start, in_indent=1, ilevel=0, plevel=0;
    ss *thess= NULL; // A simple stack.
    for (n= 0;n<l;++n)
        if (in_indent)
            if (s[n] != ' ')
                in_indent= 0;
                line_start= n;
            else
                ++ilevel;
        if (s[n] == '\n')
            if (!in_indent)
                line_end= n;
                in_indent= 1;
                if (ilevel > plevel)
                    thess= ss_push(plevel,thess);
                    printf(" \173 \057\057 %d\n",ilevel);
                else if (ilevel < plevel)
                    printf("\n");
                    while (thess && ilevel < plevel)
                        spacesx(ss_val(thess));
                        printf("\175 \057\057 + %d\n",plevel);
                        plevel= ss_val(thess);
                        thess= ss_pop(thess);
                    if (plevel != ilevel)
                        printf("Indentation error at %d!\n",line_start);
                else
                    printf("\n");
                spacesx(ilevel);
                for (i= line_start;i<line_end;++i)
                    printf("%c",s[i]);
                plevel= ilevel;
                ilevel= 0;
    while (thess)
        printf("\n");
        spacesx(ss_val(thess));
        printf("\175 \057\057 %d wrap-up\n",ss_val(thess));
        thess= ss_pop(thess);
    free(thess);
    return -1;
Obviously syntax highlighting doesn’t work (yet). Here is the output, i.e. hopefully normal C.
int scan_subsection(char *s, int l) { // 4
    int n, i, line_end, line_start, in_indent=1, ilevel=0, plevel=0;
    ss *thess= NULL; // A simple stack.
    for (n= 0;n<l;++n) { // 8
        if (in_indent) { // 12
            if (s[n] != ' ') { // 16
                in_indent= 0;
                line_start= n;
            } // + 16
            else { // 16
                ++ilevel;
            } // + 16
        } // + 12
        if (s[n] == '\n') { // 12
            if (!in_indent) { // 16
                line_end= n;
                in_indent= 1;
                if (ilevel > plevel) { // 20
                    thess= ss_push(plevel,thess);
                    printf(" \173 \057\057 %d\n",ilevel);
                } // + 20
                else if (ilevel < plevel) { // 20
                    printf("\n");
                    while (thess && ilevel < plevel) { // 24
                        spacesx(ss_val(thess));
                        printf("\175 \057\057 + %d\n",plevel);
                        plevel= ss_val(thess);
                        thess= ss_pop(thess);
                    } // + 24
                    if (plevel != ilevel) { // 24
                        printf("Indentation error at %d!\n",line_start);
                    } // + 24
                } // + 20
                else { // 20
                    printf("\n");
                } // + 20
                spacesx(ilevel);
                for (i=line_start;i<line_end;++i) { // 20
                    printf("%c",s[i]);
                } // + 20
                plevel= ilevel;
                ilevel= 0;
            } // + 16
        } // + 12
    } // + 8
    while (thess) { // 8
        printf("\n");
        spacesx(ss_val(thess));
        printf("\175 \057\057 %d wrap-up\n",ss_val(thess));
        thess= ss_pop(thess);
    } // + 8
    free(thess);
    return -1;
} // + 4
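A note on the odd-looking printf strings in the listing: \173, \175 and \057 are octal character escapes, which on an ASCII system stand for '{', '}' and '/'. The post does not say why they are used instead of the literal characters, so the snippet below only illustrates what the two format strings produce. The function names format_open and format_close are made up for this illustration and are not part of the converter.

```c
#include <stdio.h>

/* Hypothetical helper: formats what the converter appends when a block
   opens. Assuming ASCII, \173 is '{' and \057 is '/', so the result is
   " { // <level>\n". */
int format_open(char *buf, size_t size, int level)
{
    return snprintf(buf, size, " \173 \057\057 %d\n", level);
}

/* Hypothetical helper: formats the closing line. Assuming ASCII, \175
   is '}', so the result is "} // + <level>\n". */
int format_close(char *buf, size_t size, int level)
{
    return snprintf(buf, size, "\175 \057\057 + %d\n", level);
}
```

Calling format_open with level 8 fills the buffer with the same " { // 8" suffix that appears after the for line in the output above.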
It turns out that it was pretty hard to get (bootstrap) a program that could repeat the process. I had a version that would convert a cno version to a C program that actually compiled but that executable could not replicate the trick. The problem turned out to be an indentation error in the original C which should never be allowed to happen! That’s a major point of this way of doing things!
You can see that I named the function scan_subsection. This is all
very early work and I’ll probably change that name but I point it out
because I originally thought to use recursion to do this job. This is
also why the function still returns -1 (stop condition) even though
that is no longer used. Recursion seemed a reasonable and clever way
to use the call stack to store what level was being processed. The
problem is that you need so much context from other parts of the code
that it became too complex (for me) to do that way. I’m not saying it
can’t be done. Cleverer programmers than I could surely do it, but I
eventually just implemented my own stack to keep track of indent level
and the complexity eased up quite a bit.
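The stack helpers themselves (ss, ss_push, ss_val, ss_pop) and spacesx are not shown in the listing. What follows is a minimal sketch of how they could look, inferred only from the way scan_subsection calls them. The linked-list layout, the field names and the error handling are assumptions, and the real implementation may well differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout: a singly linked stack of indent levels. */
typedef struct ss {
    int val;            /* indent level saved at push time */
    struct ss *next;    /* element underneath, NULL at the bottom */
} ss;

/* Push v on top of stack s and return the new top. */
ss *ss_push(int v, ss *s)
{
    ss *top = malloc(sizeof *top);
    if (!top) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    top->val = v;
    top->next = s;
    return top;
}

/* Value on top of the stack; the caller must ensure s is not NULL. */
int ss_val(const ss *s)
{
    return s->val;
}

/* Free the top element and return the one underneath (NULL when empty). */
ss *ss_pop(ss *s)
{
    ss *next;
    if (!s)
        return NULL;
    next = s->next;
    free(s);
    return next;
}

/* Print n spaces to stdout. */
void spacesx(int n)
{
    while (n-- > 0)
        putchar(' ');
}
```

With this shape, each push records the indent level of the line that opened a block, and each pop hands that level back when the block closes, which is the bookkeeping the call stack would otherwise have done in the recursive version.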
This is just a rough prototype at the moment but it serves as a nice
proof of concept that C code does not need the suboptimal syntax it
was originally planned with. Next I’ll focus on the semi-colons which
should be optional in a multi-line program. Also comments need to be
nicely handled and there are a few C details that need special
treatment such as struct definition. Overall, I am quite pleased
with this small bit of progress.
UPDATE 2020-01-12
I am not alone!