This summer I’ve been doing something about the fact that I don’t get as much chance to work with C as I would like. One of the problems I’ve had is that I just don’t have enough programming problems that require the badassery of C. 99% of the time it is much easier for me to dash off a program in Python and be done quickly. Reflecting on why I tend not to use C, I realized that Python has a list of features so extremely useful that C is hard to justify. After giving this some more thought, I saw that I had been missing the point of C. No, C does not come with the features I love about Python, but neither does it come with the "features" I don’t necessarily care for in Python. What it does come with is a blank slate: the minimum components needed to express yourself in software that is portable. And here’s the important part: if something is missing in C that you think is important, you are free to implement it and #include it for the rest of your life. Just the way you like it.

In this new spirit I’ve been merrily attacking problems in C which I would normally solve expediently with Python. I’ve been adding my own strategies for dealing with dynamic memory, strings, data structures, etc. I find C is already extremely good at dealing with Linux system calls, file system operations, unvarnished IO, etc. And once you get into seriously hard stuff (regexp, hard math, etc.), C has libraries that are often the last word on the topic.

When I started this agenda, I had hesitated because there was one thing about Python that I loved above all other features in all programming languages I’d ever seen and that was semantic whitespace. I was thinking that it would be a real downer to have to start using all those horribly inscrutable curly braces like sloppy programmers use. I say sloppy programmers because if you are not sloppy, as any Python programmer will tell you, you do not need curly braces to delimit blocks. Basically this horrible syntax element, "{}", rewards you with the license to write sloppy code. No thanks, and no thanks.

I realized that Python is a C program. There is no reason why I couldn’t use C to write a conversion parser that took correctly indented braceless clean code and output compilable C code. This was a great test of C done the Right Way, i.e. my way. Your way may differ. That’s the beauty of C.

Although the details were quite a puzzle, I finally got the core functioning. The main test was to strip down a version of the program to my version of C, convert it with the unstripped version, and then compile that result. Here is what my code looks like for the function that scans a long character array containing the code and prints a valid C program along the way (i.e. this function is an example of input as well as the primary functionality).

The extension for this style is ".cno" as in "C, no braces" or "snow", the white style of C.

int scan_subsection(char *s, int l)
    int n, i, line_end, line_start, in_indent=1, ilevel=0, plevel=0;
    ss *thess= NULL; // A simple stack.
    for (n= 0;n<l;++n)
        if (in_indent)
            if (s[n] != ' ')
                in_indent= 0;
                line_start= n;
            else
                ++ilevel;
        if (s[n] == '\n')
            if (!in_indent)
                line_end= n;
                in_indent= 1;
                if (ilevel > plevel)
                    thess= ss_push(plevel,thess);
                    printf(" \173 \057\057 %d\n",ilevel);
                else if (ilevel < plevel)
                    printf("\n");
                    while (thess && ilevel < plevel)
                        spacesx(ss_val(thess));
                        printf("\175 \057\057 + %d\n",plevel);
                        plevel= ss_val(thess);
                        thess= ss_pop(thess);
                    if (plevel != ilevel)
                        printf("Indentation error at %d!\n",line_start);
                else
                    printf("\n");
                spacesx(ilevel);
                for (i= line_start;i<line_end;++i)
                    printf("%c",s[i]);
                plevel= ilevel;
                ilevel= 0;
    while (thess)
        printf("\n");
        spacesx(ss_val(thess));
        printf("\175 \057\057  %d wrap-up\n",ss_val(thess));
        thess= ss_pop(thess);
    free(thess);
    return -1;

Obviously syntax highlighting doesn’t work (yet). Here is the output, i.e. hopefully normal C.

int scan_subsection(char *s, int l) { // 4
    int n, i, line_end, line_start, in_indent=1, ilevel=0, plevel=0;
    ss *thess= NULL; // A simple stack.
    for (n= 0;n<l;++n) { // 8
        if (in_indent) { // 12
            if (s[n] != ' ') { // 16
                in_indent= 0;
                line_start= n;
            } // + 16
            else { // 16
                ++ilevel;
            } // + 16
        } // + 12
        if (s[n] == '\n') { // 12
            if (!in_indent) { // 16
                line_end= n;
                in_indent= 1;
                if (ilevel > plevel) { // 20
                    thess= ss_push(plevel,thess);
                    printf(" \173 \057\057 %d\n",ilevel);
                } // + 20
                else if (ilevel < plevel) { // 20
                    printf("\n");
                    while (thess && ilevel < plevel) { // 24
                        spacesx(ss_val(thess));
                        printf("\175 \057\057 + %d\n",plevel);
                        plevel= ss_val(thess);
                        thess= ss_pop(thess);
                    } // + 24
                    if (plevel != ilevel) { // 24
                        printf("Indentation error at %d!\n",line_start);
                    } // + 24
                } // + 20
                else { // 20
                    printf("\n");
                } // + 20
                spacesx(ilevel);
                for (i= line_start;i<line_end;++i) { // 20
                    printf("%c",s[i]);
                } // + 20
                plevel= ilevel;
                ilevel= 0;
            } // + 16
        } // + 12
    } // + 8
    while (thess) { // 8
        printf("\n");
        spacesx(ss_val(thess));
        printf("\175 \057\057  %d wrap-up\n",ss_val(thess));
        thess= ss_pop(thess);
    } // + 8
    free(thess);
    return -1;
} // + 4
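
The function above expects the whole source in one character array, but how that array gets filled is not shown here. One possible way, as a minimal sketch: the helper name slurp, its signature, and the 4096-byte starting size are all assumptions made for illustration, not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from fp into one malloc'd,
 * NUL-terminated buffer. The byte count is stored in *len.
 * Returns NULL on allocation or read failure; the caller frees the buffer. */
char *slurp(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    char *buf, *tmp;

    assert(fp != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    /* Always keep one byte spare for the terminating NUL. */
    while ((got = fread(buf + used, 1, cap - used - 1, fp)) > 0) {
        used += got;
        if (used == cap - 1) {          /* buffer full: double it */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

With something like this, a driver could call slurp(stdin, &n) and then hand the result to scan_subsection(buf, (int)n), so the converter works as a plain filter from a .cno file to a .c file.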

It turned out to be pretty hard to bootstrap a program that could repeat the process. I had a version that would convert a cno file to a C program that actually compiled, but the resulting executable could not replicate the trick. The problem turned out to be an indentation error in the original C, which should never be allowed to happen! That’s a major point of this way of doing things!

You can see that I named the function scan_subsection. This is all very early work and I’ll probably change that name but I point it out because I originally thought to use recursion to do this job. This is also why the function still returns -1 (stop condition) even though that is no longer used. Recursion seemed a reasonable and clever way to use the call stack to store what level was being processed. The problem is that you need so much context from other parts of the code that it became too complex (for me) to do that way. I’m not saying it can’t be done. Cleverer programmers than I could surely do it, but I eventually just implemented my own stack to keep track of indent level and the complexity eased up quite a bit.
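
The stack helpers called in the listing (ss_push, ss_val, ss_pop) and the spacesx function are not shown in this post. As a rough sketch of what they might look like, assuming a singly linked list of int indent levels and the call signatures visible in the listing; the struct layout and the error handling are assumptions made for illustration, and the real code may well differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed node layout: the post only shows the type name `ss`. */
typedef struct ss {
    int val;            /* indent level saved at this depth */
    struct ss *next;    /* node that was on top before this one */
} ss;

/* Push val onto stack s and return the new top. */
ss *ss_push(int val, ss *s)
{
    ss *node = malloc(sizeof *node);
    if (node == NULL) {
        perror("ss_push");
        exit(EXIT_FAILURE);
    }
    node->val = val;
    node->next = s;
    return node;
}

/* Value on top of the stack; the caller must ensure it is not empty. */
int ss_val(const ss *s)
{
    assert(s != NULL);
    return s->val;
}

/* Free the top node and return the new top (NULL once empty). */
ss *ss_pop(ss *s)
{
    ss *next;
    if (s == NULL)
        return NULL;
    next = s->next;
    free(s);
    return next;
}

/* Print n spaces, used to indent the generated closing braces. */
void spacesx(int n)
{
    while (n-- > 0)
        putchar(' ');
}
```

Because ss_push and ss_pop both return the new top, an empty stack is just a NULL pointer, which is why the loops in the listing can test `while (thess)` directly.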

This is just a rough prototype at the moment, but it serves as a nice proof of concept that C code does not need the suboptimal syntax it was originally designed with. Next I’ll focus on the semicolons, which should be optional in a multi-line program. Comments also need to be handled nicely, and there are a few C details that need special treatment, such as struct definitions. Overall, I am quite pleased with this small bit of progress.

UPDATE 2020-01-12

I am not alone!

[Image: jc.jpg]