# 3. Basics of C

Every language has rules: a grammar that dictates how it is spoken and written, and a script to write it with. Similarly, programming languages have a grammar, usually a context-free grammar expressed in BNF (Backus-Naur Form), along with a set of valid characters and a set of keywords. The rule set of a programming language is very small compared to that of a natural language. Also, when you use a natural language to talk to someone or write something, the other person can understand your intent even when you bend the rules, but in programming you cannot violate the rules. Compilers and interpreters cannot deduce your intent by reading your code. They are not intelligent. Make a mistake and the compiler will refuse to listen to you no matter what you do. Therefore, it is essential to understand these rules clearly and correctly.

Note that the C language is governed by the ISO specification ISO/IEC 9899:2011. As much as I would like to refer to the specification itself, it is expensive and I do not expect every reader to buy it. I will therefore use n1570.pdf, mentioned in the previous chapter. Sections of this document will be referred to as $$\S(\text{iso.section number})$$.

## 3.1. The C Character Set

The following characters form the C character set, which you are allowed to use in your programs:

[a-z] [A-Z] [0-9] ~ ! # % ^ & * ( ) - = [ ] \ ; ' , . / _ + { } | : " < > ?

This means that, along with other symbols, you can use all the letters of the English alphabet (both uppercase and lowercase) and the Arabic numerals. However, English is not the only language in the world, and in some non-English-speaking countries there are keyboards on which certain characters of the above set are not present. The inventors of C were wise enough to foresee this and provided a facility in the form of trigraph sequences. Here is the table of trigraph sequences.

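| Trigraph | Equivalent | Trigraph | Equivalent | Trigraph | Equivalent |
|---|---|---|---|---|---|
| `??=` | `#` | `??'` | `^` | `??!` | `\|` |
| `??(` | `[` | `??)` | `]` | `??<` | `{` |
| `??>` | `}` | `??/` | `\` | `??-` | `~` |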

## 3.2. Keywords

The following are the reserved keywords of the C programming language, which you are not allowed to use for anything other than their intended purpose:

Keywords of C
auto enum restrict unsigned
break extern return void
case float short volatile
char for signed while
const goto sizeof _Bool
continue if static _Complex
default inline struct _Imaginary
do int switch
double long typedef
else register union

The following keywords were added in the C11 specification:

_Alignas _Alignof _Atomic _Generic _Noreturn _Static_assert _Thread_local

Each of these keywords serves a specific purpose. You will learn about all of them as you progress through the book. Next we look at identifiers.

## 3.3. Identifiers

The names which we give to our variables are known as identifiers $$\S(\text{iso.6.4.2})$$: something with which we identify. You have already seen what is allowed in C's character set, but not all of those characters are allowed in an identifier. Only English letters (both lowercase and uppercase), the digits zero to nine and the underscore (_) are allowed. A name must begin with a letter or an underscore; a digit must not be the first character. For example, x, _myVar, varX and yourId78 are all valid names. However, take care with names starting with an underscore, as they are mostly reserved for the implementation and used by library authors. Invalid identifier examples are 9x, my$ and your age. If an identifier contains extended characters (that is, characters beyond those mentioned above, such as Chinese, European or Japanese characters), they are represented as universal character names, and certain of these cannot be the first character.

The first 31 characters of an external identifier (and the first 63 of an internal one) are guaranteed to be significant on all platforms; these are the minimum limits specified in $$\S(\text{iso.5.2.4.1})$$.

## 3.4. Programming

Now it is time for some programming. Let us revisit our first program and try to understand what it does. Here is the code once again for quick reference:

// My first program
// Description: This program does nothing.

#include <stdio.h>

int main(int argc, char* argv[])
{
    return 0;
}


You can now issue the command $ gcc nothing.c, where nothing.c is the filename under which you saved the source code. Note that $ is the prompt, not part of the command itself. If you then run ls, you will find that gcc has produced a file named a.out. You can run this program by typing ./a.out, and nothing will appear to happen. But if you type echo $? you will find that 0 is printed on screen, which is the value returned by our program. As you can see, this program does almost nothing, but it is a complete program and we can learn a lot about C from it.

The first two lines are comments. Whenever the C compiler encounters // it ignores the rest of the line, that is, it does not compile it. This type of single-line comment was introduced in the C99 standard, so a really old compiler may give you an error message about it. The other kind of comment is written between /* and */, and anything between these two markers is ignored in the same way. However, be careful of something like /* some comment */ more comment */. Such comments do not nest: the comment ends at the first */, so the remaining text produces error messages and your program fails to compile.

Comments are an integral part of programming. They are used to describe various things, and you can write whatever you want in them. They may also be used to generate documentation with tools like doxygen. Typically comments tell what the program is doing, and sometimes how, when the logic is really complex. One should be generous while commenting code.

#include is a preprocessor directive. It looks for the file named between the angle brackets in the compiler's include search path. For now you can assume that /usr/include is in that path. If a file named stdio.h is found there, its content is pasted into our program at that point. If you really want to see what happens, type $ gcc -E nothing.c. You will see lots of text scrolling on your screen. The -E switch tells gcc to only preprocess the file, not compile it, and to send the result to standard output (we will learn more about this later), which happens to be your monitor in this case.

The next line is int main(int argc, char* argv[]). This is a very special function. Every complete executable C program has one main function, unless you do assembly hacking (shared objects or DLLs do not have main even though they are C programs). This function is where the program starts. The first word, int, is a keyword which stands for integer; it signifies the return type of the function. main is the name of the function. Inside the parentheses you see int argc, which tells how many arguments were passed to the program, and char* argv[], which is an array of pointers to characters; we will see these later. For now it is enough to know that it holds all the arguments to the program.

Next is a brace. Scope in C is determined by braces. Something declared outside any brace has global scope (we will see this later), and something declared inside the first level of braces has function or local scope. Something declared inside a second or deeper level of braces has the scope of that particular block. Scope here means that a variable declared in a block ceases to exist when the closing brace of that block is reached. However, we do not have to worry about that yet as we do not have any variables. Just note that the corresponding closing brace marks the end of the main function.

The next line is return 0;. This means that whoever called main() receives 0 as the return value. In this case the receiver is the shell or operating system which invoked the program. The semicolon is called the statement terminator; it is also used in Java and C++, for example. Its job is to end one statement so that the compiler can move on to the next.

However, the program shown does not do much. Let us write a program which has some more functionality so that we can explore more of C. Here is a program which takes two integers as input from the user and presents their sum as output:

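// My second program
// Description: It adds two numbers

#include <stdio.h>

int main()
{
    int x=0, y=0, sum=0;

    // Prompt for and read the first integer
    printf("Please enter an integer: ");
    scanf("%d", &x);

    // Prompt for and read the second integer
    printf("Please enter another integer: ");
    scanf("%d", &y);

    sum = x + y;

    printf("%d + %d = %d\n", x, y, sum);

    return 0;
}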


and the output is:

shiv@shiv:~/book/code$ ./addition
Please enter an integer: 7
Please enter another integer: 8
7 + 8 = 15
shiv@shiv:~/book/code$


Note that shiv@shiv:~/book/code$ is the prompt. The Makefile is also updated:

check-syntax:
	gcc -o nul -Wall -S $(CHK_SOURCES)

nothing: nothing.c
	gcc nothing.c -o nothing



## 3.8. Complex Types

For complex types there is a system header, complex.h, which internally includes various other headers. I am giving you a summary here. It defines the following macros:

complex: Expands to _Complex.

_Complex_I: Expands to a constant expression of type const float _Complex with the value of the imaginary unit.

imaginary: Expands to _Imaginary.

_Imaginary_I: Expands to a constant expression of type const float _Imaginary with the value of the imaginary unit.

I: Expands to either _Imaginary_I or _Complex_I. If _Imaginary_I is not defined, I expands to _Complex_I.

Complex types are declared as given below:

1. float complex fCompZ;
2. double complex dCompZ;
3. long double complex ldCompZ;

Now I will present a summary of the library functions provided by complex.h:

//cabs, cabsf, cabsl - these compute and return absolute value
//of a complex number z

double cabs(double complex z);
float cabsf(float complex z);
long double cabsl(long double complex z);

//carg, cargf, cargl - these compute and return argument of a complex
//number z. The return value lies in the interval [-pi, +pi] radians.

double carg(double complex z);
float cargf(float complex z);
long double cargl(long double complex z);

//cimag, cimagf, cimagl - these compute imaginary part of a complex
//number z and return that as a real number.

double cimag(double complex z);
float cimagf(float complex z);
long double cimagl(long double complex z);

//creal, crealf, creall - these compute real part of a complex
//number z and return the computed value.

double creal(double complex z);
float crealf(float complex z);
long double creall(long double complex z);

//conj, conjf, conjl - these functions compute the complex conjugate
//of z, by reversing the sign of its imaginary part and return the
//computed value.

double complex conj(double complex z);
float complex conjf(float complex z);
long double complex conjl(long double complex z);

//cproj, cprojf, cprojl - these functions compute a projection of z
// onto the Riemann sphere: z projects to z, except that all complex
//infinities (even those with one infinite part and one NaN (not a
//number) part) project to positive infinity on the real axis. If z
//has an infinite part, then cproj( z) shall be equivalent to:
//INFINITY + I * copysign(0.0, cimag(z))
//These functions return the computed value.

double complex cproj(double complex z);
float complex cprojf(float complex z);
long double complex cprojl(long double complex z);

//cexp, cexpf, cexpl - these functions shall compute the complex
//exponent of z, defined as e^z and return the computed value

double complex cexp(double complex z);
float complex cexpf(float complex z);
long double complex cexpl(long double complex z);

//clog, clogf, clogl - these functions compute the complex
//natural (base e) logarithm of z, with a branch cut along
//the negative real axis and return complex natural logarithm
//value, in a range of a strip mathematically unbounded along
//real axis and in the interval -ipi to +ipi along the
//imaginary axis.

double complex clog(double complex z);
float complex clogf(float complex z);
long double complex clogl(long double complex z);

//csqrt, csqrtf, csqrtl - these functions compute the complex
//square root of z, with a branch cut along the negative real
//axis and return the computed value in the range of the right
//half-plane (including the imaginary axis)

double complex csqrt(double complex z);
float complex csqrtf(float complex z);
long double complex csqrtl(long double complex z);

//cpow, cpowf, cpowl - these functions compute the complex
//power function x^y, with a branch cut for the first
//parameter along the negative real axis and return the
//computed value.

double complex cpow(double complex x, double complex y);
float complex cpowf(float complex x, float complex y);
long double complex cpowl(long double complex x,
long double complex y);

//csin, csinf, csinl - these functions compute the complex
//sine of z and return the computed value.

double complex csin(double complex z);
float complex csinf(float complex z);
long double complex csinl(long double complex z);

//ccos, ccosf, ccosl - these functions compute the complex
//cosine of z and return the computed value.

double complex ccos(double complex z);
float complex ccosf(float complex z);
long double complex ccosl(long double complex z);

//ctan, ctanf, ctanl - these functions compute the complex
//tangent of z and return the computed value.

double complex ctan(double complex z);
float complex ctanf(float complex z);
long double complex ctanl(long double complex z);

//casin, casinf, casinl - these functions compute the complex
//arc sine of z, with branch cuts outside the interval
//[-1, +1] along the real axis and return the computed value
//in the range of a strip mathematically unbounded along the
//imaginary axis and in the interval -0.5pi to +0.5pi radian
//inclusive along the real axis.

double complex casin(double complex z);
float complex casinf(float complex z);
long double complex casinl(long double complex z);

//cacos, cacosf, cacosl - these functions compute the complex
//arc cosine of z, with branch cuts outside the interval
//[-1, +1] along the real axis and return the computed value
//in the range of a strip mathematically unbounded along the
//imaginary axis and in the interval 0 to +pi radian
//inclusive along the real axis.

double complex cacos(double complex z);
float complex cacosf(float complex z);
long double complex cacosl(long double complex z);

//catan, catanf, catanl - these functions compute the complex
//arc tangent of z, with branch cuts outside the interval
//[-i, +i] along the imaginary axis and return the computed value
//in the range of a strip mathematically unbounded along the
//imaginary axis and in the interval -0.5pi to +0.5pi radian
//inclusive along the real axis.

double complex catan(double complex z);
float complex catanf(float complex z);
long double complex catanl(long double complex z);

//csinh, csinhf, csinhl - these functions compute the complex
//hyperbolic sine of z and return the computed value.

double complex csinh(double complex z);
float complex csinhf(float complex z);
long double complex csinhl(long double complex z);

//ccosh, ccoshf, ccoshl - these functions shall compute the
//complex hyperbolic cosine of z and return the computed
//value

double complex ccosh(double complex z);
float complex ccoshf(float complex z);
long double complex ccoshl(long double complex z);

//ctanh, ctanhf, ctanhl - these functions compute the
//complex hyperbolic tangent of z and return the computed
//value.

double complex ctanh(double complex z);
float complex ctanhf(float complex z);
long double complex ctanhl(long double complex z);

//casinh, casinhf, casinhl - these functions compute the
//complex arc hyperbolic sine of z, with branch cuts
//outside the interval [-i, +i] along the imaginary axis and
//return the complex arc hyperbolic sine value, in the range
//of a strip mathematically unbounded along the real axis
//and in the interval [-i0.5pi, +i0.5pi] along the imaginary
//axis.

double complex casinh(double complex z);
float complex casinhf(float complex z);
long double complex casinhl(long double complex z);

//cacosh, cacoshf, cacoshl - these functions compute the
//complex arc hyperbolic cosine of z, with a branch cut at
//values less than 1 along the real axis and return the complex
//arc hyperbolic cosine value, in the range of a half-strip
//of non-negative values along the real axis and in the
//interval [-ipi, +ipi] along the imaginary axis.

double complex cacosh(double complex z);
float complex cacoshf(float complex z);
long double complex cacoshl(long double complex z);

//catanh, catanhf, catanhl - these functions shall compute the
//complex arc hyperbolic tangent of z, with branch cuts outside
//the interval [-1, +1] along the real axis and return the
//complex arc hyperbolic tangent value, in the range of a strip
//mathematically unbounded along the real axis and in the
//interval [-i0.5pi, +i0.5pi] along the imaginary axis.

double complex catanh(double complex z);
float complex catanhf(float complex z);
long double complex catanhl(long double complex z);


Here is a small demo program which uses a few of these functions:

// Complex Number Program
// Description: Demo of complex data type

#include <stdio.h>
#include <complex.h>

int main()
{
    // I is the imaginary unit macro from complex.h
    double complex z = 4.0 + 3.0 * I;

    printf("Absolute value of z is %lf\n", cabs(z));

    double complex zConj = conj(z);
    printf("Imaginary part of conjugate is now %lf\n", cimag(zConj));

    return 0;
}


and the output is:

Absolute value of z is 5.000000
Imaginary part of conjugate is now -3.000000


Note that in the Makefile you must compile it with $ gcc complex.c -o complex -lm. Note the -lm part. It tells the linker to look for the definitions of these functions in the C math library. Without it the program may fail to link. At this point I encourage you to further explore the different functions presented in the summary.

There are even more integer data types. I am sorry, but I am unwrapping the layers one by one. These types are defined in stdint.h, and inttypes.h adds matching format macros for them. The types are int8_t, int16_t, int32_t, uint8_t, uint16_t and uint32_t, along with int64_t and uint64_t on platforms that provide them. The numbers tell you how many bits each data type occupies. The types without a leading u are signed and the ones with it are unsigned. On common platforms the good old %d or %i works for the signed types up to 32 bits, but the portable way is to use macros such as PRId32 and PRIu32 from inttypes.h, which also has macros for octal and hexadecimal output. Have a look at the headers and try to decipher them.

## 3.9. Void and Enum Types

A few types remain. The void type comprises an empty set of values; it is an incomplete type that cannot be completed. You cannot declare an array of void. Its main use is through pointers: a pointer to any object type can be converted to a pointer to void and back again, which makes a void pointer a generic pointer type. It is a low-level facility and should be used sparingly. Standard C does not define the size of void; GCC, as an extension, treats sizeof(void) as one byte. You never declare a variable of type void. Besides generic pointers and casts, void is used to mark functions that return nothing or take no parameters.

enum comprises a set of named integer constant values. Each distinct enumeration constitutes a different enumerated type. In C, enums are very much equivalent to integers: you can perform all the operations of an integer on an enumeration member. An enumeration is a set of values. Numbering starts from zero by default and increments by one unless a value is given explicitly. Consider the following example:

// Description: Demo of enum

#include <stdio.h>

int main()
{
    typedef enum {zero, one, two} enum1;
    typedef enum {alpha=-5, beta, gamma, theta=4, delta, omega} enum2;

    printf("zero = %d, one = %d, two=%d\n", zero, one, two);
    printf("alpha = %d, beta = %d, gamma=%d, theta=%d, delta=%d, omega=%d\n",
           alpha, beta, gamma, theta, delta, omega);

    return 0;
}


and the output is:

zero = 0, one = 1, two=2
alpha = -5, beta = -4, gamma=-3, theta=4, delta=5, omega=6


## 3.10. Constants

We have seen some variables; now let us see some constants. There are five categories of constants: character, integer, floating-point, string, and enumeration constants. We will see enumeration constants later; first we look at the remaining four. There are certain rules about constants. Commas and spaces are not allowed inside a constant, except in character and string constants. A constant's value must not exceed the range of its data type. Numeric constants can be preceded by a minus (-) sign.

Given below is an example:

// Integer constants
// Description: Demo of integer constants

#include <stdio.h>

int main()
{
    int decimal = 7;
    int octal = 06;
    int hex = 0xb;

    printf("%d %o %x\n", decimal, octal, hex);

    return 0;
}


and the output is:

7 6 b


As you can see there are three different categories of integer constants: decimal constants (base 10), octal constants (base 8) and hexadecimal constants (base 16). You must have noticed that a zero is prefixed to an octal constant and a zero and an x to a hexadecimal constant. The %d format specifier is already known to you for signed decimals. Now you know two more, %o and %x, for unsigned octal and unsigned hexadecimal respectively. For unsigned decimal integers it is %u. There is one more format specifier which you may encounter for signed decimals, and that is %i.

Note that there is nothing for binary constants. I leave it as an exercise for you to convert a number in any base shown above to binary and print it, and also the reverse: take an input in binary and convert it to these three bases. Later I will show you this program.

Now let us move to floating-point constants. Again, I will explain using an example:

// Floating-point constants
// Description: Demo of floating-point constants

#include <stdio.h>

int main()
{
    float f = 7.5384589234;
    double d = 13.894578834538578234784;
    long double ld = 759.8263478234729402354028358208358230829304;

    printf("%f %lf, %Lf\n", f, d, ld);

    return 0;
}


and the output is:

7.538459 13.894579, 759.826348


We will learn to change the precision later, when we deal with format specifiers along with printf and the rest of the input/output family. Here too you learn three format specifiers. Others are %e or %E for scientific notation of the float family, and %g or %G, which uses the shorter of the %e and %f forms.

Now we move on to character and string type constants and as usual with a small program.

// Character constants
// Description: Demo of character constants

#include <stdio.h>

int main()
{
    char c = 'S';
    char* str = "Shiv S, Dayal";

    printf("%c %s\n", c, str);

    return 0;
}


and the output is:

S Shiv S, Dayal


As I said, commas and blanks are not allowed in numeric constants, but as you can see both are allowed in character and string constants. Also, the string is accessed through a character pointer, that is, a pointer to the memory location where its first character is stored. In this case the pointer str itself lives on the stack, while the characters of the string literal are stored in static memory that exists for the whole run of the program and should not be modified. The compiler knows how much memory the literal occupies. A string is terminated by the null character, written '\0'. By this mechanism the program knows where the string ends. It is treated in the next section as well.

A very interesting thing to note is that char is considered an integral type, so you can perform addition and other arithmetic on char values. Till now you have learnt many format specifiers and have seen that they all start with %. Think about how you would print % itself on stdout. It is printed as %%. It was simple, wasn't it? C programs commonly use the ASCII table, a 7-bit character table with values ranging from 0 to 127. There are also escape sequences, which are worth a look.

## 3.11. Escape Sequences

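| Character | Escape Sequence | ASCII Value |
|---|---|---|
| null | `\0` | 000 |
| bell (alert) | `\a` | 007 |
| backspace | `\b` | 008 |
| horizontal tab | `\t` | 009 |
| newline (line feed) | `\n` | 010 |
| vertical tab | `\v` | 011 |
| form feed | `\f` | 012 |
| carriage return | `\r` | 013 |
| quotation mark (") | `\"` | 034 |
| apostrophe (') | `\'` | 039 |
| question mark | `\?` | 063 |
| backslash | `\\` | 092 |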

Note that the backslash escape sequence consists of two backslashes with no space between them. Now we will talk about these one by one. \0, also known as the null character (NUL), is the string-terminating character, as said previously, and must be present in a string for it to terminate. For example, in our character constant program the str string is "Shiv S, Dayal". So how many characters are there, 13? Wrong, 14! The null character is hidden. Even if we say str = ""; it will contain one character, and that is this null character. Many standard C functions rely on the presence of this terminator, and a missing one causes a lot of mess.

The bell escape sequence, \a, produces an audible or visible alert. Let us write a program and see it in effect.

// Bell Program
// Description: Demo of bell escape sequence

#include <stdio.h>

int main()
{
    printf("hello\a");

    getchar();

    return 0;
}


The output of this program will be hello on stdout and an audible or visible bell as per settings of your shell. Notice the getchar() function which waits for input and reads a character from stdin. Next is backspace escape sequence. Let us see a program for its demo as well:

// Backspace Program
// Description: Demo of backspace escape sequence

#include <stdio.h>

int main()
{
    printf("h\b*e\b*l\b*l\b*o\b*\n");
    printf("\b");

    getchar();

    return 0;
}


and the output is:

*****


It is hello with each letter replaced by *. A minor modification to this program, replacing each character with some other character as soon as a key is pressed, will turn it into a password program. The backspace escape sequence moves the cursor to the previous position on the current line. If the active position of the cursor is already the initial position, the C99 standard does not specify the behavior of the display device. On my system the cursor remains at the initial position. Check what happens on yours. The second printf call exercises this behavior.

Next we are going to deal with newline and horizontal tab escape sequences together as combined together they are used to format output in a beautiful fashion. The program is listed below:

// Newline and Horizontal Tab Program
// Description: Demo of newline and horizontal tab escape sequence

#include <stdio.h>

int main()
{
    printf("Before tab\tAftertab\n");
    printf("\nAfter newline\n");

    getchar();

    return 0;
}


and the output is:

Before tab      Aftertab

After newline

Here I leave you to experiment with other escape sequences. Feel free to explore them. Try various combinations. Let your creative juices flow.