"C is quirky, flawed, and an enormous success."

— Dennis Ritchie

These notes are not a complete tutorial or reference. They are a useful collection of important topics for someone who has programmed in C but might be rusty. The idea is that the material here can get you started with a programming project quickly.

Compiling

C is a compiled language, meaning that its source code needs to be translated (in its entirety) into something the computer can understand before it is actually run. The "compiler" does this.

The typical way to compile a program looks like so:

gcc -o typical typical.c

If you don’t specify a -o option (for output) your executable program will be named a.out which is not terrifically useful. It’s best to not do too much of that or you’ll have one a.out overwriting another.

In the old days compiling C programs on a Linux system was kind of a giant pain. These days things tend to work much more smoothly, but for reference I’ll include some notes on things to try to solve typical compile issues.

  • Use -D_GNU_SOURCE early and often. Modern Linux systems seem to come with a gcc that is aware that it’s a Linux system and does the right thing, but it wasn’t always so. If you want some more serious detail on programming in Linux specific environments, this seems like a good resource.

  • If include (.h) files are "lost" try an option like -I/usr/X11R6/include/X11/magick/ which can provide hints where to find include files.

  • Math not working even though you added a #include <math.h>? Maybe only some of math (undefined reference to "floor")? Try -lm which often fixes that. The header only declares the functions; on many systems the code for them lives in a separate library (libm) which -lm tells the linker to use.

  • Are you nuts and compiling something against Xlib? You might need something like this: -L/usr/X11R6/lib -lX11

  • It turns out that the order of your gcc options and arguments is important. This nice web page points out that external libraries should be to the right of the thing that calls them. This is why gcc square.c -o square -lGL -lGLEW -lglut works but gcc -lGL -lGLEW -lglut square.c -o square will produce tons of square.c:(.text+0x49): undefined reference to... errors. This drove me crazy until I figured it out. A little change in my Makefile and suddenly, everything was wildly broken.

If you’re curious about the resulting executable, it can be analyzed with readelf -a myprog. See this nice article on analyzing Linux executables for details and hints. Also, nm lists symbols from object files. In fact GNU Binutils is full of useful stuff.

Preprocessor

Including Libraries

#include "file_in_this_directory.h"
#include "/an/explicit/path.h"
#include <look_in_the_normal_place.h>

There are many useful functions in standard libraries. It looks like Wikipedia has a pretty good list of POSIX C libraries. This is a specification of libraries a sane system should provide. Here are some of the classic ones with some of the defined functions listed.

stdio.h

Includes the super important printf. It also includes stddef.h. Also fwrite, fread, fprintf, fputc, putc, putchar, ungetc, fflush, fopen, freopen, fclose, remove, rename, rewind, FILE

math.h

Pretty much anything involving the eponymous topic of math. Here are some useful ones: ceil (nearest whole above), exp, floor (nearest whole below), pow, sqrt. And most of the others: acos, asin, atan, atan2, cos, cosh, fabs, fmod, frexp, ldexp, log, log10, modf, sin, sinh, tan, tanh.

stddef.h

size_t, offsetof, NULL

stdlib.h

exit, abort, atexit, getenv, system, malloc, calloc, realloc, free, atoi, atol, atof, strtod, strtol, strtoul, rand, srand, qsort, bsearch. (Note that assert lives in assert.h and perror in stdio.h.) Here’s a notable use: char *u; u= getenv("USER");

ctype.h

isalnum, isalpha, isdigit, isxdigit, isgraph (visible character), isprint (printable character), isupper (case), islower, iscntrl, ispunct, isspace, tolower, toupper

string.h

strlen, strcpy, strncpy, memcpy, memmove, strcat, strncat, strcmp, strncmp, memcmp, strchr, strrchr, memchr, strcspn, strpbrk, strspn, strstr, strtok, strerror, memset

unistd.h

Includes the getopt function.

locale.h

setlocale, localeconv

time.h

asctime, ctime, clock, difftime, gmtime, localtime, mktime, time, strftime

signal.h

Defines functions and MACROS for handling signals. signal, raise also SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM

Preprocessor Macros

All kinds of mischief can be had with preprocessor tricks. Generally it seems that this should only be used to manage the software development aspect of the program and not the program’s actual functionality. Here’s an example of a preprocessor macro in use:

Preprocessor Macro Example
#include <stdio.h>
#define TYPE(T,V) T V;printf(#T"= %d\n", (int)sizeof V); /* sizeof yields a size_t, so cast it for %d. */
int main(void){
    TYPE(char,a_char)
    TYPE(short int,a_short_int)
    TYPE(int,an_int)
    TYPE(unsigned int,an_unsigned_int)
    TYPE(long,a_long)
    TYPE(unsigned long,an_unsigned_long)
    TYPE(float,a_float)
    TYPE(double, a_double)
    TYPE(long double,a_long_double)
return 0;}

To see what this preprocessor macro does, see below.

Other preprocessor tricks would be stuff like general constants that should be flexible depending on how someone might want to compile the program:

#define PRECISION .0001

Often there is a big maze of include files and it’s easy to have one place include a library and then another place try to do it too resulting in some kind of clash. The following checks to see if the special library has been loaded and if not, it loads it. Subsequent uses of this will be ignored.

#ifndef SPECIAL
#include "special.h"
#define SPECIAL
#endif

Preprocessor macros can be useful for debugging messages too.

#define VERBOSE 4
...
if (VERBOSE > 2) {printf("A level 2 message.");}

Just set the value to 0 to turn off verbose messages. This allows the programmer to set up a bunch of diagnostic print statements that can be turned on or off easily.

Debugging With Preprocessor Tricks

While there is no shortage of tricksy ways to use the preprocessor for debugging, it seems to come down to sticking to a few guidelines.

  • Let the compiler see the debugging code so that any future warnings are caught and you’re not taken by surprise when you turn the debugging on 20 years in the future.

  • Keep it simple.

#define DEBUG
#ifdef DEBUG
 #define D
#else
 #define D for(;0;)
#endif

Idiosyncratic C Operators

x++

Increments variable x by 1 after using it in this spot.

++x

Increments variable x by 1 before using it in this spot.

x--

Decrements variable x by 1 after using it in this spot.

--x

Decrements variable x by 1 before using it in this spot.

{test}?{true}:{false}

An "if" statement for saving punch card chad.

x,y

Evaluate and discard x, evaluate and retain y.

x=y=z=3

All of the variables x,y,z are 3. Assignment is an expression. This works right to left: z is assigned first, then the value of that expression is assigned to y, and so on.

Bitwise

&

bitwise AND

|

bitwise inclusive OR

^

bitwise exclusive OR

<<

bit shift left

>>

bit shift right

~

bitwise NOT

Main Structure Of A C Program

A C program is a collection of functions, routines that possibly take some input and possibly return some output. All C programs that run must have one and only one function called main.

Here is a typical structure showing how the main function can be passed the command line arguments. This program is useful for diagnosing exactly what the C program is receiving from the executing shell. It also shows the polite return code (0 is usually success and 1 is usually failure while other numbers can signify fancy modes of failure or other things).

Accessing Command Line Arguments
/* Comments look like this! */

#include <stdio.h>

int main(int argc, char *argv[]) {
    int i;
    for (i=argc-1;i>=0;i--) { printf("Argument #%d:%s\n",i,argv[i]); } /* Last to first; argv[argc] is NULL so start below it. */
    return 0;
}

Or here’s a less readable version:

#include <stdio.h>
int main(int argc, char *argv[]){ while (argc--) printf("%s\n", *argv++); }
Note
To only show the arguments and not the program name (element 0), just make both of the postfix modifiers (-- and ++) into prefix modifiers.

If you don’t care about the command line arguments use something like:

int main(void) { /*code goes here*/; return 0; }

Types

C is a statically typed language, meaning that it carves out memory for various purposes based on very explicit declarations of the resources which will be used. Important types:

int

Integer, i.e. non fraction whole numbers.

float

Numbers that can represent a continuous value (to the accuracy a binary representation ultimately provides).

char

A character.

enum

An enumeration. Used to create a type with a constrained set of possible values. enum lightswitch {Off, On}; Here lightswitch can be either "Off" or "On" which is the same as 0 and 1. If you wanted different values, use something like enum lightswitch {Off=-1, On=1};

union

Define with something like: union Lights { int Switch; float Dimmer;}; This allows one thing (Lights) to hold either an int value if it’s just a switch or a float value if it’s a dimmer. It’d be good to keep a separate variable around to store which you’re using at any given time or confusion will result.

struct

A structure. Used to create custom types that hold collections of things. struct point { int x; int y; int z;}; To declare a variable of this type you need to do struct point LastKnown;

Custom

Sometimes you want to make some complex named type have a simple name. To do that use the typedef statement: typedef short int twobyter; Now declaring something as twobyter X= 0 is the same as saying short int X=0; This can be a handy trick when setting up arbitrary data structures that may find utility handling different payloads. Just typedef the data component of the complex structure and tailor that to your needs at the beginning of the program.

The various types require different amounts of storage in memory. It is best to choose the most economical type which satisfies requirements. This is a nice feature to be able to optimize in this way, however, since it is not optional it is also one of those pains that makes C programming a bit tedious. Here is the output of the preprocessor example above which shows the size in bytes of various storage types on my machine.

char= 1
short int= 2
int= 4
unsigned int= 4
long= 4
unsigned long= 4
float= 4
double= 8
long double= 12

printf Format Codes

The return value for printf is the number of characters written which can be handy.

The format specifier has this form:

% [flags] [field_width] [.precision] [length_modifier] conversion_character

Here are the modifier flags:

-

left justified

+

always mark a + or - for signed numbers

<space>

use - for negative numbers and space for positive

0

pad leading zeros to specified width

#

Modifies style of each type. For example, for x types, there will be a prefix 0x, for X prefix 0X. For [gG] trailing zeros will be included. For [eEfFgG] all output will have a decimal point.

Field width is the minimum number of characters that will be output; if the converted value is shorter than that, it is padded out to the width.

Precision is the number of digits after the decimal point in numbers with decimals and the number of total digits in others. In [gG] it is the number of significant digits.

The length modifier is h (short or unsigned short), l (long), or L (long double).

The conversion character is one of the following:

d or i

int signed base10

o

int unsigned octal

u

int unsigned base10

x and X

int unsigned hex specifying the x’s case

c

single char (the argument is passed as an int and converted to unsigned char)

e and E

double or float in scientific notation specifying the e’s case

f

double or float base10

g and G

either like e or f depending on size

n

the argument is a pointer to which the number of characters converted thus far is assigned. Nothing is output.

s

output a string, i.e. a pointer to a char. Characters are output until a \0 is encountered or the number of the precision specifier has been reached.

p

output an implementation representation of a pointer - use for debugging?

%

output an actual "%"

Bit Masks

Basically you can store a set of many boolean variables in one C variable. Since you’re going to need a minimum of 8 bits to do about anything, the theory goes that if you’re just needing to store 1 bit, you might as well have that state variable serve multiple purposes. Despite sounding horrible, this actually produces code that is surprisingly readable.

The basic technique is to set the various flags with bit shifts to store them in the right places. Then when you want to create a state collection, just "or" them together with |. When you want to check to see if a flag is set in a collection, just use & to get that back out. Here’s an illustrative example:

/* An example of how to use bitmasks. */
#include <stdio.h>
#define LIGHTS_ON        ( 1 << 0 )
#define BRAKE_LIGHTS_ON  ( 1 << 1 )
#define WIPERS_ON        ( 1 << 2 )
#define HORN_ON          ( 1 << 3 )

int main(int argc, char *argv[]) {
    unsigned int car_status;
    car_status= LIGHTS_ON | WIPERS_ON | BRAKE_LIGHTS_ON;
    if (car_status & LIGHTS_ON) {
        printf("Lights on.\n");}
    if (car_status & BRAKE_LIGHTS_ON) {
        printf("Brake lights on.\n");}
    if (car_status & WIPERS_ON) {
        printf("Wipers on.\n");}
    if (car_status & HORN_ON) {
        printf("Horn on.\n");}
    return 0; }

This program will produce this result:

Lights on.
Brake lights on.
Wipers on.

Note that this technique is not especially type safe since a function expecting a well crafted collection of bits can be sent any old value that works and the compiler won’t notice. This is one reason that in C++ bool types are more robust. But bitmasks can be useful and they definitely pop up a lot in various libraries; understanding how they work is important.

Pointers

Objects in C can be handled by their names (which imply their contents), but a far more powerful and flexible technique is to work with them by only specifying the address where the data of interest is. The reason for this is that it’s computationally expensive to shuffle things around in memory if you don’t really need to. It’s better to leave the bulk of the thing alone and just refer to it where it is needed. It’s a bit like money. You could trade gold specie for the things you want, but for most transactions, it’s easier to leave the gold in a vault somewhere and just trade promissory notes referring to it. (Assuming a gold standard) writing a check is like referring to a reference (bank notes) to actual money (the gold). This is like a C pointer’s ability to point to a pointer.

So if you have a variable called big_thing with a lot of data in it, you can do things with that variable by name, but sometimes it is more effective to just refer to the location where that thing lives. It’s quite like addresses in real life: you don’t have to specify the exact nature of a house at a particular address or if it’s a strip mall or whatever, just the address is sufficient to deal with it for many purposes.

Important ideas with pointers:

  • Pointers are a data type that holds exactly one memory address. What that address actually is should seldom ever be of concern.

  • Pointers can point to other pointers.

    int x;

    Defines an integer variable called x.

    int *ptr2x;

    Reads "Define the thing ptr2x points to as an integer." This (*) is technically called the "indirection operator".

    ptr2x= &x;

    Reads "Set ptr2x to the address of the object defined by x."

    p->n= 0;

    Sets to zero the subcomponent n in the structure that pointer p points to. This is technically called the "indirect member access operator".

Void Pointers

Untyped pointers can be created with the void type.

void *anyptr;

To dereference such a pointer, it must be type cast with something. In this example the contents of two different kinds of variables, ib and fb, are set from the dereferencing of the same pointer.

#include <stdio.h>
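int main(void) {
    int ib,ia= 666;
    float fb,fa= 3.14;
    void *anyptr;
    anyptr= &ia;
    ib= *((int*)anyptr);    /* Cast back to int* before dereferencing. */
    printf("ib now is: %d\n",ib);
    anyptr= &fa;
    fb= *((float*)anyptr);  /* Same pointer, now treated as float*. */
    printf("fb now is: %.2f\n",fb);
    return 0;
}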

Arrays

A[i] is the same as (*((A)+(i)))

So these are equivalent.

A[4]= 'x';
*(A+4)= 'x';

You can load the array at definition.

float origin[3]= {0,0,0};
char mystring[]= {'x','e','d','\0'}; /* The '\0' makes it a "string". */
char mystring[]= {"xed"}; /* Equivalent. */

You can’t define the array and then later set it with {0,0,0} or something like that; arrays can not be assigned to after definition. Just keep in mind why strcpy and memcpy exist. Here’s a way to use memset to zero out an array.

int colpos[MAXNUMFIELDS];
memset(colpos,0,sizeof(colpos));

Here are other ways to zero an array at definition. The first is a GNU extension; the second is standard C.

int colpos[MAXNUMFIELDS]={[0 ... MAXNUMFIELDS-1]=0}; // GNU range designator; gcc specific.
int colpos[MAXNUMFIELDS]={0};                        // Standard C; unlisted elements become 0.

Brackets are actually a postfix operator for manipulating the array specified by the operand. (Confusing? Yes.)

Elements of arrays are stored in successive pointer address locations.

&origin[1]-&origin[0] == 1

Find the length of an array:

int length= sizeof origin / sizeof origin[0];
Note
In case of confusion, note that in a function's parameter list *argv[] is the same as **argv.

Chars and Strings

An array of objects of the char type has some special syntactical properties in C. This is to facilitate the handling of "strings".

char alphabet[26];
char theFword[4] = {'f', 'u', 'n', '\0'};
char string[6] = "twine";
char gray[] = {'g', 'r', 'a', 'y', '\0'};
char salmon[] = "salmon";

If you want fancy string capabilities, you might need a custom library to do what you want. Here is an interesting one.

Branching

Basically computers compute by making logical decisions. In C, the main decision making feature is the if statement:

if (test_expression) {statement_block} else {statement_block}

The else if construction allows a single choice to be made from a series of possible conditions.

if (te1) {sb1} else if (te2) {sb2} else if (te3) {sb3} ... else {sb}

For if statements, the test expression can be anything that reduces to an integer which equals 0 (which is false) or something else (which is true).

A fancier form of branching can be done with the switch and case statements. Here’s how it works:

An Example Of switch/case And getopt
#include <stdio.h>
#include <unistd.h>

int main (int argc, char **argv ){
static char optstring[]="a:b:c"; int o;
while ( (o = getopt(argc, argv, optstring)) != -1)
 switch(o) {
     case 'a': { printf("Option argument for `a` is: %s\n",optarg); break; }
     case 'b': { printf("Option argument for `b` is: %s\n",optarg); break; }
     case 'c': {  printf("Option `c` has no argument.\n"); break; }
     default: { printf("Option `%c` is unknown or missing its argument.\n", optopt); } /* getopt returns '?' and puts the offending letter in optopt. */
 }
return 0;}
Note
For a more comprehensive example of option parsing, see the Option Parsing section.

Looping

Interesting software is a result of many logical decisions being repeatedly performed in interesting ways. The main way to achieve multiple iterations of an action in the C idiom is with the for loop:

for ({initial};{test_before_each_iteration};{eval_after_each}){thing_to_do}

Here’s a more interesting example:

for (hi=100,lo=60;hi>=lo;hi--,lo++){converge(hi,lo);}
Note
You need to define the variables that appear in the for statement prior to using it if you are compiling as C89. C99 and later let you declare them inside the for statement itself, e.g. for (int i=0;i<n;i++), and older versions of gcc need -std=c99 to accept that. The less compiler magic, the better IMO.

The other two important loop structures are similar with a subtle difference. These are the while loops. The most basic works like so:

while ({test_expression}) {do_this_stuff}

If before any attempt to execute the body of the loop the test expression is 0 or NULL then the loop is skipped and control is passed on.

If you want the test evaluated after the loop body code is run (which implies the loop body will always run at least once) use this form:

do {do_this_stuff} while ({test_expression});

Exiting loops

continue

This statement jumps control to the end of the current loop body statement as if it had completed an iteration and was now ready for more. It allows for short circuiting some code that might otherwise be performed on every iteration.

break

This statement jumps control just past the end of the current loop body statement as if the last iteration had just occurred and finished. This statement basically says that this looping structure is completely finished, not just this iteration.

return [expression]

This is the way to break out of a function. The optional expression is passed back to the calling function by value (so use pointers where that’d be unpleasant). If the function was defined as void then don’t include an expression. A function can have several return points depending on the situation. If you have stdlib included, you can return EXIT_SUCCESS or EXIT_FAILURE.

Dynamic Memory

Anytime you are working with an amount of data that you can not explicitly define an upper bound on ahead of time, you probably need to use dynamic memory. The main mechanism of dynamic memory is the malloc() function (include stdlib.h) which runs around looking for enough contiguous memory to reserve for some run time defined purpose. Once malloc() finds the memory you’ve requested, it returns a pointer to that location so you can start doing stuff with it. The format for using malloc() is a bit fussy:

p= (struct Thing *) malloc (sizeof (struct Thing));

Here the sizeof operator yields exactly the size (in bytes) that an instance of struct Thing needs. That memory is reserved and the pointer that is returned is cast (forced) by the first parentheses to point to memory that is configured as a struct Thing.

When your program is finished with some memory that has been allocated, it’s polite (or maybe even critical) that it be returned to the system for use. The way to do that is with the free() function which takes a pointer to the memory you want recycled.

Note
If you’re really interested in C’s low level memory management, here is an interesting guide to writing your own malloc and friends.

Simple Stack Implementation

Before C can be made into anything useful, you really need to create some tools to make certain tasks easier to implement. One theme that comes up over and over in more substantial programming tasks is the need to hold an arbitrary bunch of data somewhere. Since C requires very explicit declarations of all memory used, this can be challenging to always attend to it. It is therefore useful to create some templates that can get you into more interesting parts of the problem quickly.

Here is an implementation of a simple stack system. The stack is fed data with a Push() function; that is, data is placed on top of the stack (a FILO, or equivalently LIFO, structure). Data is retrieved (and removed) from the top of the stack with a Pop() function. Note that the type definition cargo_type allows the stack to carry whatever kind of data you want simply by redefining it.

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

/* Custom type definitions. */
typedef int cargo_type;
struct linkbox { cargo_type cargo; struct linkbox* next;};
typedef struct linkbox lbox;

/* Function prototypes show inputs and outputs so subsequent */
/* mentions of them aren't confusing (seemingly undefined)   */
/* to the compiler.                                          */
void Push(cargo_type v, lbox** p2mylist);
cargo_type Pop(lbox** p2mylist);
cargo_type Iter(lbox** current);
int dice(int sides);

int main(void){
    int i,m;
    srand(time(NULL)); m= dice(20);
    lbox *mylist=NULL;
    for (i=0;i<m;i++){
        Push( dice(6), &mylist); }
    lbox *index= mylist;
    int sum=0, n=0;
    while (index) {
        sum += Iter(&index);
        n++;
        /*printf("Iter:%d\n",Iter(&index));*/ }
    printf("Average:%f\n",(float)sum/n);
    while (mylist) {
        printf("Popping:%d\n",Pop(&mylist)); }
    return 0;}

int dice(int sides){
    return rand() % sides + 1;}

cargo_type Iter(lbox** c){
    cargo_type t= (*c)->cargo;
    *c= (*c)->next;
    return t;}

void Push(cargo_type v, lbox** p2mylist){
    lbox* latestbox;
    latestbox= (lbox *) malloc(sizeof(lbox));
    if (!latestbox) {perror("malloc"); exit(EXIT_FAILURE);}
    latestbox->cargo= v;
    printf("Pushing:%d\n",v);
    latestbox->next= *p2mylist;
    *p2mylist= latestbox;
    return;}

cargo_type Pop(lbox** p2mylist){
    cargo_type t= (*p2mylist)->cargo;
    lbox *dead= *p2mylist;
    *p2mylist= (*p2mylist)->next;
    free(dead);
    return t;}

[Figure: Linked List Example Memory Layout]

This program is also an example of passing function arguments by reference. It needs a pointer, so the pointer is pointed to by another pointer which gets sent to the function. When the transporter pointer is dereferenced, the original pointer that was supposed to show up at the function is ready to go. The reason this is necessary is that C function arguments are copied over and if you copy a pointer, it’s a different pointer (even if it points to the same place). If you inserted a new node between a function copy of the pointer to the list and the list, then you’d lose track of the (complete) list when the function variable’s memory was freed on function exit.

File Operations

Once you can allocate the memory you need, you often need to use the file system to read actual data to fill that memory. Using files is a fundamental operation that has its quirks in C. The following example reads, character by character, a file called ./fileio.c and prints it to the output, and writes it to a file called /tmp/fileio-copy.c. The hard to memorize bits are including stdio.h and creating a FILE pointer. Also, opening and closing the file require fopen and fclose.

Simple Example Reading And Writing Files
#include <stdio.h>
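int main(void) {
    FILE *fpi,*fpo;
    int curchar; /* int, not char, so that EOF can be told apart from real data. */
    fpi= fopen("./fileio.c","r");
    if (!fpi) {perror("./fileio.c"); return 1;}
    fpo= fopen("/tmp/fileio-copy.c","w");
    if (!fpo) {perror("/tmp/fileio-copy.c"); fclose(fpi); return 1;}
    curchar= fgetc(fpi);
    while (curchar != EOF) {
        printf("%c",curchar);
        fputc(curchar,fpo);
        curchar= fgetc(fpi);
    }
    fclose(fpi);
    fclose(fpo);
    return 0;
}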

While I’ve shown fgetc and fputc, other possibilities include fprintf and fscanf. Also fread and fwrite (for binary).

It’s also worth pointing out that a more C styled way to do the main read loop would probably be something like:

while  ( (curchar= fgetc(fpi)) != EOF ) {...}

I think that one of the best ways to read in data is fgets. Here’s a pretty solid way to do that using dynamic buffers that grow if needed using realloc.

./revtac ./revtac.c | ./revtac
/* Here's an example of using realloc to grow the buffer to as much as
 * needed when bringing in data. This particular example will take the
 * specified file, or standard input, and render it backwards. Imagine
 * rev and tac combined. Running this twice should cancel.
 * $ md5sum revtac.c <(./revtac ./revtac.c | ./revtac) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
    FILE *fp;
    if (argc-1) fp= fopen(argv[1],"r");
    else fp= stdin;
    if (!fp) {perror("Could not open file."); exit(EXIT_FAILURE);}
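    char *str= malloc(4096), *s= str;
    if (!str) {perror("malloc"); exit(EXIT_FAILURE);}
    int len= 0;
    while (fgets(s,4096,fp)) {
        len += strlen(s);
        char *bigger= realloc(str, len+4096); /* Keep str valid if realloc fails. */
        if (!bigger) {perror("realloc"); free(str); exit(EXIT_FAILURE);}
        str= bigger;
        s= str+len; /* Next read appends at the end of what we have. */
    }
    fclose(fp);
    int n;
    for (n=len;n;n--){ /* Walk backwards from the last character read. */
        printf("%c",str[n-1]);
    }
    free(str);
    return(EXIT_SUCCESS);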
}

The previous example had two limitations. First, because it needed to know the end of the input before beginning its output, it loaded the entire contents of the input into memory. This is not ideal for very big jobs where sequential processing can be applied. Second, it only handled one file. Proper Unix utilities should be able to accept data on standard input and/or as one or more files to open. The quintessential utility that reliably does this is cat. To show how to create a program which can use an arbitrary number of input sources like cat and address each line as it comes, I have rewritten cat from scratch. Note that I am not Richard Stallman and I’m not claiming this is the most robust cat implementation ever, but if you need a program that does about the same thing as cat but with a bit of C code thrown in, this can be a better place to start than the source code for the real cat (which is also reasonable).

alleycat.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINELEN 666

void process_line(char *line) {
    // PUT THE ESSENTIAL LOGIC FOR THIS PROGRAM HERE!!
    printf("%d %s",(int)strlen(line),line); // For example: line length and line.
}

void process_file(FILE *fp){
    char *str=malloc(MAXLINELEN), *rbuf=str;
    int len=0, bl=0;
    if (str == NULL) {perror("Out of memory");exit(1);}
    while (fgets(rbuf,MAXLINELEN,fp)) {
        bl=strlen(rbuf); // Read buffer length.
        if (bl > 0 && rbuf[bl-1] == '\n') { // End of buffer really is the EOL.
            process_line(str);
            rbuf=str; // Reuse the buffer; reset read position to beginning.
            len=0;
        } // End if EOL found.
        // Read buffer filled before line was completely input.
        // Allocate more memory for this line.
        else { // Add more mem and read some more of this line.
            len+=bl;
            str=realloc(str, len+MAXLINELEN); // Tack on some more memory.
            if (str == NULL) {perror("Out of memory");exit(1);}
            rbuf=str+len; // Slide the read buffer down to append position.
        } // End else add mem to this line.
    } // End while still bytes to be read from the file.
    if (len > 0) process_line(str); // Final line had no trailing newline.
    fclose(fp);
    free(str);
} // End function process_file

int main(const int argc, char *argv[]) {
    FILE *fp;
    int optind=0;
    if (argc == 1) { // Use standard input if no files are specified.
        fp=stdin;
        process_file(fp);
    }
    else {
        while (optind<argc-1) { // Go through each file specified as an argument .
            optind++;
            if (strcmp(argv[optind],"-") == 0) fp=stdin; // Dash as filename means use stdin here.
            else fp=fopen(argv[optind],"r");
            if (fp) process_file(fp); // File pointer, fp, now correctly ascertained.
            else fprintf(stderr,"Could not open file:%s\n",argv[optind]);
        }
    }
    return(EXIT_SUCCESS);
}

Change the process_line function to do whatever it is you need to do to the data.

Option Parsing

When running programs from the command line, the main function can be supplied with a list of optional parameters passed from the calling program or shell. To properly parse this in a sensible way, C has some nice functions that help keep things consistent and error free. Here is an example of a complete option parsing routine which handles long options. Long options are like --help, --verbose etc., and tend to be popular with GNU utilities.

Example Of getopt_long
#include <stdio.h>
#include <getopt.h>
#include <stdlib.h>

int main (const int argc, char **argv) {
    int help= 0; int i=0; int j=10; float k= 0;
    int o;
    while (1) {
        static struct option long_options[] = {
            {"help"  , no_argument,       NULL, 'h'}, /* Bools work well in C++. */
            {"ivalue", required_argument, NULL, 'i'}, /* Integer arg required. */
            {"jvalue", optional_argument, NULL, 'j'}, /* Integer arg optional. */
            {"kvalue", required_argument, NULL, 'k'}, /* Float arg required. */
            {0, 0, 0, 0} /* must be filled with zeros */
        };

        /* NULL as the last argument: the long option index is not needed here. */
        o = getopt_long(argc, argv, "hi:j::k:", long_options, NULL);
        if (o == -1) break; /* Detect the end of the options. */
        switch (o) {
            case 'h': help= 1; printf("Help=%d\n",help); break;
            case 'i': i= atoi(optarg); break;
            case 'j': if (optarg){ j= atoi(optarg); } else { j=99; } break;
            case 'k': k= atof(optarg); break;
            default: fprintf(stderr,"Unknown Option.\n"); return EXIT_FAILURE;
        } /* End switch construct */
    } /* End while loop */

    /* State of variables initialized by options.  */
    printf("i=%d, j=%d, k=%f, help=%d\n",i,j,k,help);
    printf("Option Index: %d\n", optind); /* optind is defined by getopt.h */
    /* Print any remaining command line arguments (not options). */
    if (optind < argc) { printf ("Non-option ARGV-elements: \n"); }
    while (optind < argc) { printf("%s \n", argv[optind++]); }
    return 0;
} /* End main */
Note
In the example program above the option -j (aka --jvalue) is defined as having an optional argument. Optional arguments are ambiguous, so to supply one you must attach it to the option, like -j99 or --jvalue=99. If you try -j 99 or --jvalue 99 then the 99 is treated as a separate non-option argument and the option behaves as if no argument was given.

Useful Tricks

Random Numbers

To get a random number between 1 and 100 do something like this:

rand_from_time.c
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

int main (void) {
    srand(time(NULL));
    int mystery= rand() % 100 + 1;
    printf ("Random number from 1 to 100: %d\n", mystery);
    return (0); }

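You need srand() to seed the random number generator. The rand() function returns integers between 0 and RAND_MAX inclusive. If you need a random number between 0 and 1, divide in floating point: rand()/((double)RAND_MAX+1). The cast matters because the all-integer expression rand()/(RAND_MAX+1) always evaluates to 0, and RAND_MAX+1 can overflow an int.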

Warning
Seeding srand() with time(NULL) is ok in many situations, but remember that the seed can be reverse engineered. This means you don’t want to write a real-money gambling game that is randomized in this way. Also, if you run the program several times in quick succession, time(NULL) may return the same second each time, and the "random" output will repeat itself.

If you are using a proper operating system (like Linux or a fruit-based computer) there is a managed resource that collects entropy for use by various processes in establishing randomness. This source of randomness is presented as a file by the kernel and automagically filled with pretty high quality random numbers (see man 4 random for gory details). Here is a way to get random numbers using a seed pulled from this source:

rand_from_os.c
#include <math.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main (int argc, char *argv[]) {
    FILE *urandom;
    unsigned int seed;
    urandom = fopen ("/dev/urandom", "r");
    if (urandom == NULL) {
        fprintf (stderr, "Cannot open /dev/urandom!\n");
        exit (EXIT_FAILURE); }
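    if (fread (&seed, sizeof (seed), 1, urandom) != 1) { /* Make sure a full seed was read. */
        fprintf (stderr, "Cannot read /dev/urandom!\n");
        exit (EXIT_FAILURE); }
    fclose (urandom);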
    srand (seed);
    printf ("Random number from 1 to 100: %d\n",
            (int) floor(rand() * 100.0 / ((double) RAND_MAX + 1) )+ 1);
    exit (EXIT_SUCCESS); }

A good illustration of the difference can be seen by running each program many times in quick succession. Over 10,000 runs, any particular number from 1 to 100 should pop up roughly 100 times. You can see that producing random numbers from the OS’s seed does roughly that. The time based one, however, does a terrible job. Most of the time it produces zero hits for a particular preselected number ("88" in the following example), and occasionally it produces hundreds.

$ for x in `seq 10000`;do ./rand_from_os | grep ' 88$' ; done | wc -l
97
$ for x in `seq 10000`;do ./rand_from_os | grep ' 88$' ; done | wc -l
94
$ for x in `seq 10000`;do ./rand_from_time | grep ' 88$' ; done | wc -l
0
$ for x in `seq 10000`;do ./rand_from_time | grep ' 88$' ; done | wc -l
512

This is because the whole loop finishes in a few seconds, so time(NULL) only takes a handful of distinct values, and every run that starts within the same second gets the same seed and prints the same number. Ironically, this problem is worse on higher performance machines.

Debugging

Print Error Messages

Something like this:

fprintf(stderr,"Prints to standard error.\n");

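With #include <stdio.h> assumed, you can also use perror(). It prints your string, a colon, and the system’s description of the current errno value, followed by a newline, so pass a short prefix rather than a full message.

perror("fopen");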

Core Dump Analysis

What if you get the dreaded Segmentation fault? This means the program tried to access memory it is not allowed to touch at run time. Many errors are caught at compile time, but sometimes your program looks fine to the compiler and does a silly thing once you actually fire it up. Besides mystical intuition, the best methodical way to analyze the problem is to have the system create a memory dump (a core file) at the time of the error and then use a debugger to look through this file to figure out what went wrong. The core file is much easier to read if the program carries debugging symbols, so compile like this:

gcc -g -o sketchy sketchy.c

Or if you’re definitely going to use gdb:

gcc -ggdb -o sketchy sketchy.c

If it still has a seg fault and you’re not getting a (core dumped) message appended to it, try changing your environment with:

ulimit -c unlimited

This removes any restriction on the size of core files allowed by the shell.

Note
When you’re done playing with core files, you might want to do ulimit -c 0 so that segmentation faults don’t generally produce core files. Normally, it’s a pain to have these files mysteriously lying around every time something crashes.

gdb

Assuming you have a core dump called core, run gdb like this:

gdb sketchy core

The core should load and allow you to investigate it. It might just tell you about the error and where it occurred.

Or if you don’t need a core dump, you can just run gdb sketchy and type run to run the program and see if your error happens in a more interesting and verbose way. Here are some of the important commands to be aware of when using gdb.

Table 1. Useful gdb Commands

Command                  Description
<enter>                  repeat the previous command
help                     very sensible help
run                      continuous run - can be followed by args (see set args)
start                    start execution and stop at the beginning of main, args ok
step                     proceed execution to next source code line, entering any function called
next                     like step but treat each function call as a single line
finish                   execute until the current stack frame returns (generally stops at the end of the current function)
print <var>              print the current value of the specified variable
set args <arg1..argN>    set the arguments passed to programs started with the run command
show args                query what arg list was set
bt                       backtrace (a nested list of function calls), good for finding where your program seg faulted
break <n>                set a break point at source code line number n
cont                     (also c) continue from stop at break point
shell                    run a shell sub process ($SHELL if set, otherwise sh)
layout next              switch the TUI split screen display to the next layout (source, assembly, registers)
refresh                  refresh the TUI screen (think Ctl-L)
skip function <name>     skip the named function when stepping, the current function if none given
until <line>             run until the specified line number is reached
quit                     leave gdb

Keywords

These words are all reserved in C. This is the original C89 list, and later standards add a few more, such as inline and restrict in C99. Don’t use any of them as names:

auto, break, case, char, const, continue, default, do, double, else, enum, extern, float, for, goto, if, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void, volatile, while

The fact that this list is so amazingly short is the good news in C! Enjoy!