Final Paper
Tips
String Manipulation
When working with strings in C, always account for the null character by stopping the loop at len - 1
(excluding) to avoid overwriting the null character. For example,
String Literal
A string literal refers to a string written between two "
characters, such as "Hello world!"
. And it is stored in a read-only memory region (not the stack).
The common between str1
and str2
is that both of themselves are on the stack. The difference between the two is that str1
points to a read-only region in the memory (but str1
itself is a pointer on the stack), while str2
contains a copy of the string on the stack.
To create a copy of the string literal on the stack using
char
arrays, we have two methods:
char str[]
or
char str[num]
, wherenum
is an integer number specifying the string's lengthAnd it is only when we define a pointer that points to the read-only memory region can't we modify its content.
Illegal Memory Access
Illegal Memory Access needs to satisfy two requirements:
the memory address is Illegal (
NULL
and the Examples for Illegal Memory Access in below)you try to access the memory address
In this the code below:
If list[0]
is NULL, as long as we don't access its memory address (read/write), we are not doing anything illegal.
We are allowed to have pointers pointing to arbitrary region in memory, but as long as we don’t access those memory, we are not doing anything illegal.
Memory allocation
The code below is a classic memory allocation error:
Here a
should be a pointer pointing to an array of long
, but here you assume it should be pointing to an array of long *
. The implementation above is wrong. The correct one should be as follows:
Examples for Illegal Memory Access
When you return an address on the stack that has been "cleared"
When you try to write to (access) the address stored in an unintialized pointer.
When you did not allocate enough memory on the heap for the variable that a pointer points to. And you try to "access" the pointer by trying to write value into it.
When you access some address like decimal number 10, etc
Multidimensional Array
Array Name Decay
Since an array in C consists only of a contiguous region of memory that stores the elements of the array, the address of an array is the same as the address of the first element of the array. (An element here doesn't need to be of a basic data type, like long
, double
. It can be a pointer variable also!) The following five statements would print out exactly the same values.
From this example, by observing each pair of the new lines that have the same meaning. We can see the essence of array decay is: If we have a 2-D array matrix
, then matrix
will decay to &matrix[0]
. Similarly, matrix[0]
will decay to &matrix[0][0]
.
Add an
&
operator can be considered as adding one "array" degree (e.g. 1D array to 2D array.long
to 1D array)
Contiguous Memory Allocation
In the code below, 10 is the num_of_rows
. What we have done here is to allocate a chunk of memory with size num_of_cols * 10
once. After that, we iteratively point the remaining 9 pointers to the correct position.
To free this kind of 2-D array, just use
Why we cannot use
free(buckets)
here?Accoding to the Linux Programmer's Manual,
void free(void *ptr)
should follow:The
free()
function frees the memory space pointed to byptr
, which must have been returned by a previous call tomalloc()
,calloc()
, orrealloc()
. Otherwise, or iffree(ptr)
has already been called before, undefined behavior occurs. Ifptr
is NULL, no operation is performed.In our case, due to array-decay,
buckets
is not a "heap-object" returned bymalloc()
,calloc()
, orrealloc()
. Rather, it is an "stack-object", so we cannot passbuckets
directly intofree()
. If so, we will get warnings from the compiler.
Pointer Arithmetic
We can perform arithmetic operations on pointers, but not in the way you expect. Suppose we have a pointer:
Suppose that x
is stored in memory address 1000, after Line 4, ptr
would have the value of 1000. After the line ptr += 1
, using normal arithmetic operation, we would think that ptr
will have a value of 1001. However, the semantics for arithmetic operations differ for pointers. The +
operation for ptr
causes the ptr
variable to move forward by the size of the variable pointed to by the pointer. In this example, ptr
points to long
, assuming that long
is 8 bytes, after ptr += 1
, ptr
will have the value of 1008.
Macro
Be careful with situations like this:
Therefore, we should always use brackets around the arguments of a macro, i.e., SQUARE(x) (x) * (x)
is safe.
Time Complexity
Quickly Judge Method
To judge the time complexity quickly, we can only pay attention to the terminating condition and the loop changing condition to decide the time complexity of the loop. For example,
In this problem, the time complexity for the outer loop is just . And for inner loop, it's just . So, the overall time complexity will be .
Note that this method works at most of the cases except for
the edge cases. e.g.
for (i=0;i<n;i*=2)
, here it's an infinite loop andwhen the initialization involves terms of input
n
. e.g.for(i=n;i<n+1;i+=1)
.
Comparing Rate of Growth
Given two functions and , how do we determine which one has a higher rate of growth? We say that grows faster than if we can find a , such that for all and for some constant .
For instance, which one grows faster? or ? Pick , we have . Pick , we have . Pick , we have now, and we can see that for any , , so we can conclude that grows faster than .
Some ready results
or its equivalent has the time complexity .
, its time complexity is . An example is the Tower of Hanoi
, its time complexity is
and have the same time complexity as . An example is the Permutations
and have the same time complexity as
Useful Calculation Tips
When you expand the geometric sequence, suppose the common ratio is and the first term is . Then the sum can be expressed as , where is the number of terms in this geometric sequence.
Knowing the last term in the geometric sequence, a quick way to get the number of terms in the geometric sequence is to do the logarithmic operation. e.g. Suppose the last term is and the common ratio is , then
Call Stack Diagram
For the following code:
Its call stack diagram looks as follows:
The most important thing to note is that inside baz()
to call qux()
, we are actually passing the address of the num
and ptr
which are in the main()
! So the pointer should point to these two variables in the main()
, instead of pointing to the temp variable in the baz()
!
Sorting
Below is the table summarises the running time for different sorting algorithms under different cases:
Bubble Sort
() (not optimized)
Insertion Sort
Selection Sort
Below is the table that summarises the number of swaps each sorting algorithm needs under different cases:
Bubble Sort
0
Insertion Sort
0
Selection Sort
Note that for Selection Sort, if the input is a sorted array, every time we will swap tje element with itself! So, it needs
Comparison between Insertion Sort, Bubble Sort and Selection Sort
Insertion Sort
Fast for nearly sorted lists: Insertion sort performs well on almost sorted arrays, requiring minimal comparisons and shifts.
Efficient for small datasets: Insertion sort is simple and efficient for small arrays due to low overhead.
Adaptive nature: It naturally adapts to the degree of disorder in the input, reducing unnecessary operations for sorted or nearly sorted inputs.
Bubble Sort
Detects sorted arrays quickly (with optimization): If implemented with an early-exit condition, Bubble Sort can terminate early when no swaps are made in a pass.
Simple to implement: Bubble Sort is conceptually easy to understand and implement for basic sorting tasks.
Stable sorting: It preserves the relative order of equal elements, making it useful in certain situations.
Selection Sort
Good for minimizing swaps: Selection Sort is ideal when the cost of swapping elements is significant since it performs exactly swaps.
Predictable performance: The number of comparisons and swaps is independent of the input's order, making its behavior predictable.
Useful for memory-constrained environments: Selection Sort requires minimal extra memory as it sorts in place.
Counting Sort
Efficient for small range of integers: Counting sort works exceptionally well when the range of input elements (the difference between the maximum and minimum values) is small relative to the number of elements.
Linear time complexity for small integer ranges: When the range of values is not large, Counting Sort can achieve O(n + k) time complexity, where nnn is the number of elements and kkk is the range of the input.
Stable sorting: Counting Sort is stable, meaning it preserves the relative order of elements with equal values.
Non-comparison-based: It doesn’t rely on comparisons between elements, making it faster than comparison-based sorts (like Bubble, Insertion, or Selection Sort) for certain types of data.
Struct
Using typedef
on struct
frees us from typing the word struct
every time. We can do so with either:
or
In either case, we can just use module
like any other type:
Standard I/O
printf
printf()
takes in a variable number of arguments. For the first argument, it should be a format string containing one or more format specifiers, like %s
. And the general format for the format modifier is:
The specifier controls the interpretation of the argument. s
for string, c
for character, d
for integer (base 10), f
for floating-point number, p
for pointer (base 16). We can additionally prepend this with length modifier. ld
for long
integer, lld
for long long
, and lf
for double
.
To format the output, we can prepend it with a number to indicate its field width, or minimum space used when printing. E.g., %3d
will pad the number printed with space if the number printed is less than 3 digits. Adding a flag 0 in front, %03d
, will pad the number with 0s if the number printed is less than 3 digits. For floating-point numbers, we can additionally control the precision, or the number of digits printed after the decimal point. e.g. %3.4lf
will print a double to four decimal points. The first 3 indicates that if the whole floating point number (integer + floating parts + 1 for the .
) is less than length 3, white spaces will be padded at front. Otherwise, nothing will be padded.
Some examples:
Mismatch Number of Arguments
Since printf
expects a variable number of arguments, you can pass it fewer arguments than expected and the code would still compile (with warnings). If you push ahead and run it anyway, printf
will start to fetch arguments from the stack, pretending that it is there, causing weird things to happen.
Printing User Input
We should also never do this:
The reason is that we have no control over what the user would type as input: the user may type "%s" into the standard input, so the variable str
now points to %s
, which printf
treats as a format modifier, and output the content of the stack! This is a huge security risk and is known as the externally-controlled format string vulnerability.
We should always print a string using:
scanf
Like printf
, scanf
takes in one or more arguments, with the first argument being a format string containing one or more format specifiers. The format specifier for scanf
is simpler and has the following pattern:
Buffer Overflow
A buffer overflow is a specific kind of undefined behavior resulting from a program that tries to write more data to an (array) variable than this variable can hold.
Automatically add null character at the end
When scanf()
is finished parsing into a string, it appends the null character ('\0'
) automatically, and there must be space left for it. e.g. if we define a char str[40]
, we should use field with as 39 in scanf (scanf("%39s", str)
)
fgets
fgets()
does a simple thing, it reads up to a given maximum number of characters, but stops at a newline, which is read as well. In other words: It reads a line of input.
There are two very nice things about this function for what we want to do:
The parameter for the maximum length accounts for the necessary
0
byte, so we can just pass the size of our variable.The return value is either a pointer to
str
orNULL
if, for any reason, nothing was read.
To read in a 6-letter module code (e.g., "CS1010") from the standard input. We can use the following:
C Operator Precedence
Miscs
(Variable and Proof) For
long
variables, it can be negative! Always try to think about negative number when trying to form the counter example.(Variable and Proof) The keyword "if and only if" needs to prove both forwardly and reversely!
(Loop Invariant) To find the correct loop invariant, think more about what the code is doing rather than giving out the exact expression directly because sometimes the loop invariant even cannot be expressed in the exact expression. e.g. in one of the final papers' question, the strong and correct loop invariant is "the minimum element of the array is in
a[l .. r]
"(Recursion) For the recursion question, pay attention to how many times of recursion calls we will do every time. And, we need to think all cases to decide how we can modify e.g. to make the recursion finite.
(Memory Leak) Whenever you call a
malloc()
, always be careful not to point the pointer to else where. This will cause memory leak!(Uninitialized Variables) Uninitialized variables can cause unpredictable behavior and lead to errors that are difficult to diagnose.
(Time Complexity) Given that the running time of a recursive program is . The meaning of means the running time of its base case. For the nqueens problem, should be .
(Time Complexity) The time complexity to print a string with length
n
is .(Compilation Error vs. Runtime Error) Errors that occur during compiling is called "compilation error". In constrast, errors that occured during execution of a program is called a "run-time error". e.g. When you access the "out of bound" index of an array, it won't generate compilation error, but it may generate runtime error or undefined behavior.
(
const
behavior) An example is:const long* a
meansa
is a pointer to const, and any attempt to write viaa
will error out. e.g.*a=10
is invalid. This is different fromlong* const a
, which declares a const pointera
that cannot be reassigned to point to another place. e.g.a=<pointer>
is invalid.(Insertion Sort) Insertion sort is fast when it comes to sorting an almost sorted list.
Valuable Problems from Past Year
Below are the problems from past year that I think are important:
Here the number refers to the exact pages in the corresponding past year papers' comments version.
Last updated