Before giving any formal definition of Recursion, I will like to discuss a scenario…
Browing your files on local disk is a very common task. Let us assume you have to write a procedure that performs an action (like calculating the size) on every file on your C:
The algorithm will be something like follows:
Apply the action on all files in the root directory. Then pick up its sub-diectory & apply the action on all its files. Then pick up the sub-sub directory & continue the procedure until there’s no more nested sub-directories.
The next step is to backtrack to the parent directory & perform the entire action again on another sub-directory… ( I hope u understand what’s going on)
The entire procedure goes on until all of the files have been processed.
This is the ideal scenario where Recursion can kick in.
The entire concept of recursion is as simple as this:
Recursion can be applied to any problem that can be divided into sub-problems that are similar in nature to the original problem.
Coming back to the above problem, the original problem was to Browse the root directory.
It could be divided into sub-problems as performing the action on root directory files plus Browsing the sub-directory files…
This is Recursion.
Let’s take another scenario. Lets assume you have to calculate the factorial of a number (say 5).
The technical definition of a factorial is (remember factorials apply only to integers >=0):
fact(n) = n*fact(n-1); n>0
= 1 ; n=0
This again is a recursive definition. The original problem is divided into sub-problem (n*fact(n-1)) that is exactly similar to the original problem….
That’s Recursion in its simplest terms…
Recursive Conditions
The last example of Recursion brought us to some very important conclusions. To implement any recursive definition, it should satisfy some conditions.
In a nut-shell, any such definition should invariably satisfy the following three conditions:
1. A recursive function should contain a call to itself (thus, the task should be divisible into sub-tasks, similar in nature to the original one).
2. There must be a condition which at some point of time becomes true, after which no further recursive calls are made.
3. Every recursive call should take us a step closer to satisfying that condition.
In the case of factorial definition, the three conditions are satisfied as follws:
1. Calculating factorial of a number first requires calculating factorial of a number smaller than it.
2. The condition for no further recursive calls is
fact(n) = 1; n=0
So 5! = 5 * 4!
= 5 * 4 * 3!
= 5 * 4 * 3 * 2!
= 5 * 4 * 3 * 2 * 1!
= 5 * 4 * 3 * 2 * 1 * 0!
= 5 * 4 * 3 * 2 * 1 * 1
Notice that in the last step, when n=0, instead of making further calls to itself, the procedure ends returning a value & making no further calls.
3. The third condition is obvious, after every call, the number is decremented… which ultimately takes us to the condition when n becomes 0 (satisfying it so that no further calls are made).
The third point above also gives an idea, why factorials of negative numbers don’t exist. They can be decremented to infinity in each step, & so the condition of n becoming 0 will never be satisfied.
Detailed Example
Now, lets move from the theoretical to the practical aspects of Recursion.
Lets define a recursive function fact() to calculate factorial of a given positive number.
The function would be as follows:
{syntaxhighlighter brush: cpp;fontsize: 100; first-line: 1; }int fact(int n)
{
if (n==0)
return(1);
else
return(n*fact(n-1));
}{/syntaxhighlighter}
The above function implements the already discussed recursive definition of factorial.
This brings us to another aspect. You may ask, why should I use Recursion when the same task could be performed with a simple loop…
fact=1; for(i=0;i<n;i++) fact=fact*i;
Agreed….
In theory, every problem that is defined recursively can be implemented using loops.
This means that there is always a work around to Recursion.
But there are situations where a recursive procedure makes more sense, and is easy to implement.
I will gve two examples to support this statement:
1. First consider the above problem of Browing the directory structure.
Its not hard to imagine the complexity had the problem been solved with loops.
Implementing it with recursion is as easy as follows:
(The listing is given in VB for understanding purposes as collections in VB make Browing folders easier. Anyone having a problem with this code can request a c equivalent of the same)
Sub scanfolder(thisFolder as Folder) Dim allFolders as Folders Set allFolders=thisFolder.SubFolders 'Extract all sub folders For Each Folder in allFolders ' Apply the process here scanfolder(thisFolder.path) Next End Sub
2. We all have studied ‘Trees’. These data structures play a very crucial role in any data management.
I assume u all have studied the ‘Tree Traversal’ algorithms. (If u dont, u can ask me to reproduce them here).
If u recall, they are implemented via stack if we dont use recursion & are quite complex at the first instance (especially the Post-Order traversal).
So, here’s the Recursive implementation of Post-Order tree traversal in C.
{syntaxhighlighter brush: cpp;fontsize: 100; first-line: 1; }void PostOrder (Node xx)
{
if (xx = NULL)
return;
if (xx->left != NULL)
PostOrder(xx->left);
if (xx->right !=NULL)
PostOrder(xx->right);
‘Apply Process u want to node xx here
}{/syntaxhighlighter}
Recursion – Finer Points
And now, I will discuss some finer points about Recursion.
First, a recursive definition is always less efficient than when the same procedure is implemented with loops. This is because the overhead related with the function calling itself again & again.
Second, it’s very easy to goof-up recursive implementations. That’s primarily because one of the above three points a recursive definition should satify are violated.
The second & third points are very important in this context.
Remember to include a condition that terminates further recursive calls. Also see, that each recursive call takes a step further to satisfy that condition.
Violating any of these would result in any infinite loop, ultimately causing a stack overflow (i believe everyone of us has encountered this error, when developing our first recursive procedure ).
Just for a quick recall, I would reiterate how the above function satisfied all the three conditions for Recursion…
1. Clearly, it calls itself in its body… So its recursive.
2. The condition to terminate further calls is:
‘When no more sub-folders of current folder exist, dont make further calls & return’.
As u can see, this statement is implemented via the for loop in the above code.
3. All of us know, that folders do not have sub-folders endlessly. At some point, a folder will have no further sub-folder. So, in each step, we are coming close to that situation. This is where there will be no further calls….
Closing Remarks
Some final points from my side…
The above example underlines an advantage of Recursion. Suppose u had to implement it via loop without Recursion.
The major problem u would have had was to handle the allFolders collection.
In principle, u would have two problems.
1. You never not know how many levels of nesting folders, a client’s computer have. So, u would have problem deciding how many times to run the loop, & also how to keep the track of nesting. This would leave u no choice, but dynamic memory management for the allFolders collection, which we all know can get quite messy.
With Recursion, this is not a problem, as u dont have to decide either loop count or the level of nesting. Everytime a new recursive call is made, a new collection alFolders is created. So a new data structure is created on every call & there’s no need of dynamic memory management.
This point would be further clear, if u consider the example of tree traversal presented earlier.
Without recursion, u would have to use a stack, which had to be resized dynamically.
With Recursion, there’s no such thing (Internally, all recursive calls occur on memory stack u know!!!)
2. You would have problem in backtracking. With Recrsion, backtracking occurs automatically.
I am making a couple of further points, u can safely ignore if u r not that well accustomed to Recursion.
You may ask, how do i pass information between successive recursive calls.
Simple, either use parametres…
or for another approach, u can declare Static variables in the function & manipulate it accordingly to pas* information back & forth between the recursive calls…
As i discussed, every recursive definition can be implemented non-recursively.
Remember, u would always have to use a stack anytime u do so…
A Question:
Someone asked me the following question regarding the Directory Traversing problem:
‘As this program is using recursion, then what is the role of For Loop here..??
Please explain.’
The Answer:
As I mentioned, the original problem should be divisible into sub-problems of similar nature. Now if only one sub-problem is generated in each step (like in factorial), there’s no for loop…. But if multiple sub-problems can be generated, a loop is required to handle all of them.
Take for example
Sub scanfolder(thisFolder as Folder)
Dim allFolders as Folders
Set allFolders=thisFolder.SubFolders ‘Extract all sub folders
For Each Folder in allFolders
‘ Apply the process here
scanfolder(thisFolder.path)
Next
End Sub
Suppose ur directory strcuture is as follows:
C:
C:\Sample
C:\Sample\Child1
C:\Sample\Child2
C:\Sample\Child1\MyFolder
Lets consider, how the above program segment would work.
You intially invoke the scanfolder as follows
First Call:
scanfolder(C:\Sample)…. Right???
Now
Dim allFolders as Folders
declares a collection. If u sre not familiar with collections, then at this point u can imagine it to be an array like structure. ok…
Next,
Set allFolders=thisFolder.SubFolders ‘Extract all sub folders
This statement fills allFolders (an array as we might consider it) with the paths of subfolders of the ‘Sample’ folder. This is done by VB internally, we are not concerned here how it is done.
After this step, we get
allFolders(0) = C:\Sample\Child1
allFolders(2) = C:\Sample\Child2
Now, as i discussed, we need to apply the same procedure on each of these childs.
The For loop simply selects each of these childs one after the another, & then we call the recursive procedure on that child.
So, in the first iteration of this loop, a new function call is made as follows:
Second Call:
scanfolder(C:\Sample\Child1)
Now, remember, for each new function call, a new set of variables for that function is generated.
So for this call, a new collection
allFolders
is again created, which is completely independent of the previous collection.
The statement
Set allFolders=thisFolder.SubFolders ‘Extract all sub folders
now as*igns
allFolders(0)= C:\Sample\Child1\MyFolder
Next, in the for loop, scanFolder is again recursively called as:
Third Call:
scanfolder(C:\Sample\Child1\MyFolder)
In this version, again a new collection allFolders is created.
The statement
Set allFolders=thisFolder.SubFolders ‘Extract all sub folders
doest not as*ign anyhting to allFolders as there is no sub-folder of MyFolder in our example.
allFolders remains Null.
So, the For loop is not executed even once in this case & there are no further recursive calls.
Thus, the third call marked above terminates, & the control returns back to the second call made earlier.
Back to Second Call:
Recall, the allFolders in second call was:
allFolders(0)= C:\Sample\Child1\MyFolder
So, the For loop had to execute only once, & terminates now.
This finishes off the second recursive call & the control returns back to the first call.
Back to First Call:
Recall, the allFolders in first call was:
allFolders(0) = C:\Sample\Child1
allFolders(2) = C:\Sample\Child2
The for loop has already executed once for allFolders(0).
In the next iteration, it runs for allFolders(1), & so a new Call is made to scanfolder as follows:
scanfolder(C:\Sample\Child2)
Fourth Call:
Child2 has no sub-folder. So
Set allFolders=thisFolder.SubFolders ‘Extract all sub folders
still keeps allFolders=Null.
This means that the for loop is not executed even once & so, the function terminates to go back to its parent calling function, which was the Firat Call in our case…
Back to First Call again:
Now, the for loop in this call also terminates as it has traversed the entire allFolders collection.
Thus, finally, at this point of time, all recursive calls have terminated,
And the control returns back to the function, which originally called the scanfolder function…
Hope, its clear now to everybody…