Lesson 4 - Polymorphism

by Zoran Horvat

In previous lessons we developed types derived from a base type and demonstrated the pointer substitution principle. With that powerful tool at hand, we were free to call functions defined on the base type and supply derived type instances to them, and everything still worked, as in this piece of code:

``````struct Ellipse *ellipse = (struct Ellipse*)malloc(sizeof(struct Ellipse));
Ellipse_Constructor1(ellipse, 1.0F, 2.0F, 3.0F, 4.0F);

Shape_set_Name((struct Shape*)ellipse, "Ellipse");

Shape_PrintOut((struct Shape*)ellipse);
``````

This code demonstrates pointer substitution in calls to the Shape_set_Name and Shape_PrintOut functions, which both expect a pointer to the base type Shape, although a pointer to a subtype was passed to them. What matters more is that the Shape_PrintOut function operates on the base type only and produces the following output:

```
Ellipse's location is (1.00, 2.00).
```

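The output says nothing about the radiuses, which the Ellipse type adds on top of its base part:

``````struct Ellipse
{
struct Shape _base;
float radiusX; /* radius fields added by Ellipse; names assumed here */
float radiusY;
};
``````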

The base type's Shape_PrintOut function cannot use the added fields because they are not part of the base type. In order to make a printout function which puts these fields to the output, we have to define that function on the derived type. We want to use the base type's PrintOut function to print the base part of the instance, and then to append more text to express the derived type's content. But there is a problem, as Shape_PrintOut puts a full stop and a new line character at the end of its output. The solution is to redesign the Shape type by adding another printout function which receives the desired suffix as its argument:

``````void Shape_PrintOut(struct Shape *_this)
{
Shape_PrintOut1(_this, ".\n");
}

void Shape_PrintOut1(struct Shape *_this, const char *suffix)
{

if (_this->name != NULL)
printf("%s", _this->name);
else
printf("<null>");

printf("'s location is (%.2f, %.2f)", _this->locationX, _this->locationY);

if (suffix != NULL)
printf("%s", suffix);

}
``````
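To make the benefit of the suffix parameter concrete, here is a minimal, self-contained sketch. It is not the lesson's code: the DemoShape type and both function names are hypothetical, and the text is formatted into a caller-supplied buffer with snprintf instead of being printed, purely so that the result is easy to inspect.

```c
#include <stdio.h>

/* Hypothetical stand-in for the lesson's Shape type (illustration only). */
struct DemoShape
{
    const char *name;
    float locationX;
    float locationY;
};

/* Formats the base description plus an optional suffix into buf.
   Returns what snprintf returns: the length the full text needs. */
int DemoShape_Format(const struct DemoShape *shape, const char *suffix,
                     char *buf, size_t size)
{
    return snprintf(buf, size, "%s's location is (%.2f, %.2f)%s",
                    shape->name != NULL ? shape->name : "<null>",
                    shape->locationX, shape->locationY,
                    suffix != NULL ? suffix : "");
}

/* Passes NULL as the suffix, appends extra text, and only then ends the line. */
int DemoShape_FormatWithNote(const struct DemoShape *shape, const char *note,
                             char *buf, size_t size)
{
    int used = DemoShape_Format(shape, NULL, buf, size);
    if (used < 0 || (size_t)used >= size)
        return used;   /* error or truncated: nothing more can be appended */
    return used + snprintf(buf + used, size - (size_t)used, "; %s.\n", note);
}
```

Calling DemoShape_Format with ".\n" reproduces the original one-line output, while DemoShape_FormatWithNote shows how a caller can slip its own text in before the full stop.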

Similar redesign can be applied to C++ version of our code:

``````// Partial listing of shape.hpp
#ifndef SHAPE_HPP
#define SHAPE_HPP

class Shape
{
public:
...
void PrintOut();
void PrintOut(const char *suffix);
...
};

#endif
``````
``````// Partial listing of shape.cpp
#include <iostream>
#include <stdlib.h>
#include <string.h>
#include "shape.hpp"
...
void Shape::PrintOut()
{
PrintOut(".\n");
}

void Shape::PrintOut(const char *suffix)
{

cout.flags(ios::fixed);
cout.precision(2);

if (name != NULL)
cout << name;
else
cout << "<null>";

cout << "'s location is (" << locationX << ", " << locationY << ")";

if (suffix != NULL)
cout << suffix;

}
...
``````

C++ offers one more feature that complements method overloading - optional (default) arguments. The PrintOut function can then be declared only once and still cover both of its flavors:

``````// Partial listing of shape.hpp
#ifndef SHAPE_HPP
#define SHAPE_HPP

class Shape
{
public:
...
void PrintOut(const char *suffix=".\n");
};
#endif
``````

Definition of the method with suffix parameter in shape.cpp file remains the same as before while parameterless definition is completely removed. This shorthand notation in C++ encourages design approach which we have just applied.

On a related note, C# 4.0 introduces optional arguments, which were not supported in earlier versions. Thus in versions before C# 4.0, we would have to provide two implementations of the PrintOut method and then to call the one with suffix argument from inside the other one - just as we did in plain C. After introduction of the optional arguments, we can code one PrintOut function just as it was coded in C++, so that default value for the suffix is ".\n". But this would not be in the spirit of .NET programming. We should not specify \n for new line character because it is not known in advance which are the new line characters on system on which program will run. Instead, suggested way is to use Environment static class and its NewLine property to discover new line characters on system at hand. However, this property is not a constant and cannot be used as default value for the method argument. For all these reasons, we are forced to write two methods in C# as we did in C:

``````// Partial listing of Shape.cs
using System;

namespace Geometry
{
public class Shape
{
...
public void PrintOut()
{
PrintOut("." + Environment.NewLine);
}
public void PrintOut(string suffix)
{
Console.Write("{0}'s location is ({1:0.00}, {2:0.00}){3}",
Name, locationX, locationY, suffix);
}
...
}
}
``````

If we turn back to our shape printing problem and implementation in C, we can try to be consistent and to provide the same two implementations of PrintOut method in derived types - one without parameters and another with the suffix parameter. Here are the functions declarations in the Ellipse type:

``````/* Partial listing of ellipse.h */
#include "shape.h"

#ifndef ELLIPSE_H
#define ELLIPSE_H

struct Ellipse
{
struct Shape _base;
float radiusX; /* radius field names assumed here */
float radiusY;
};

...
void Ellipse_PrintOut(struct Ellipse *_this);
void Ellipse_PrintOut1(struct Ellipse *_this, const char *suffix);

#endif
``````
``````/* Partial listing of ellipse.c */
#include "ellipse.h"
#include <stdio.h>

...
void Ellipse_PrintOut(struct Ellipse *_this)
{
Ellipse_PrintOut1(_this, ".\n");
}

void Ellipse_PrintOut1(struct Ellipse *_this, const char *suffix)
{

Shape_PrintOut1((struct Shape*)_this, NULL); /* No suffix after base class's printout */

printf("; radiuses are (%.2f, %.2f)", _this->radiusX, _this->radiusY); /* field names assumed */

if (suffix != NULL)
printf("%s", suffix);

}
``````

Much the same implementation can be provided for the corresponding functions in the Rectangle type, only the radiuses are replaced with width and height (declarations in the header file are omitted as they bring nothing new to the picture):

``````/* Partial listing of rectangle.c */
#include "rectangle.h"
#include <stdio.h>

...
void Rectangle_PrintOut(struct Rectangle *_this)
{
Rectangle_PrintOut1(_this, ".\n");
}

void Rectangle_PrintOut1(struct Rectangle *_this, const char *suffix)
{

Shape_PrintOut1((struct Shape*)_this, NULL);

printf("; Width=%.2f, Height=%.2f", _this->width, _this->height);

if (suffix != NULL)
printf("%s", suffix);

}
``````

Dynamic Dispatch

Up to this point we have been dealing with object-oriented design, but now another obstacle arises which cannot be solved by traditional means. We will depict the problem on an example of a function which just prints out details of all shapes from one array:

``````void PrintShapes(struct Shape *shapes[], int count)
{
int i;
for (i = 0; i < count; i++)
Shape_PrintOut(shapes[i]);
}
``````

This function is as useless as it is short, as can be seen if we try to actually use it:

``````struct Ellipse *ellipse = NULL;
struct Rectangle *rectangle = NULL;
struct Shape *shapes[2];

ellipse = (struct Ellipse*)malloc(sizeof(struct Ellipse));
Ellipse_Constructor1(ellipse, 1.0F, 2.0F, 3.0F, 4.0F);
Shape_set_Name((struct Shape*)ellipse, "Ellipse");

rectangle = (struct Rectangle*)malloc(sizeof(struct Rectangle));
Rectangle_Constructor1(rectangle, 3.0F, 4.0F, 5.0F, 6.0F);
Shape_set_Name((struct Shape*)rectangle, "Rectangle");

shapes[0] = (struct Shape*)ellipse;
shapes[1] = (struct Shape*)rectangle;

PrintShapes(shapes, 2);
``````

This code produces output:

```
Ellipse's location is (1.00, 2.00).
Rectangle's location is (3.00, 4.00).
```

Shape_PrintOut did exactly what it always does. The call inside PrintShapes is bound to the base type's function at compile time, so it prints only the base part of each instance and ignores the rest.

Solution to this problem, which is regularly used in object-oriented languages, is to build a so-called virtual method table to contain actual addresses of methods associated with the object. Every object must contain a pointer to its corresponding virtual method table. Only then, correct pointer to every particular function will be discoverable when function should be called on an object. Let's see what it looks like when applied to our geometric shapes. We will first define a simple framework behind the virtual method table implementations:

``````/* Listing of vtable.h */
#ifndef VTABLE_H
#define VTABLE_H

typedef void* vtable_entry;
typedef vtable_entry* vtable_ptr;

void vtable_Initialize();

#endif
``````
``````/* Listing of vtable.c */
#include "vtable.h"
#include "shape.h"

void vtable_Initialize()
{
}
``````

Function vtable_Initialize is a placeholder: this is where the virtual method tables will be initialized. It will be called at the very beginning of the main function, so that the tables are ready before any object is created. Now we can implement the virtual method table in the Shape type:

``````/* Partial listing of shape.h */
#include "vtable.h"

#ifndef SHAPE_H
#define SHAPE_H

#define VT_SHAPE_PRINTOUT 0
#define VT_SHAPE_PRINTOUT1 1
#define VT_SHAPE_END 2

vtable_entry vtable_Shape[VT_SHAPE_END];

struct Shape
{
vtable_ptr vtable;
char *name;
float locationX;
float locationY;
};

typedef void (*vcall_Shape_PrintOut)(struct Shape*);
...
``````

The new elements are definitions of symbols: VT_SHAPE_PRINTOUT and VT_SHAPE_PRINTOUT1 point to the fixed locations of the PrintOut and PrintOut1 functions in the shape's virtual method table. The next symbol defined, VT_SHAPE_END, simply points to the end of the virtual method table and is used only to define the table's dimension. Only one step is required before finishing modifications to the Shape type - initializing the virtual method table pointer in instances of the type:

``````/* Partial listing of shape.c */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "shape.h"

void Shape_Constructor(struct Shape *_this)
{

_this->vtable = vtable_Shape;

_this->locationX = 0.0F;
_this->locationY = 0.0F;
_this->name = NULL;

}

void Shape_Constructor1(struct Shape *_this, float locationX, float locationY)
{

_this->vtable = vtable_Shape;

_this->locationX = locationX;
_this->locationY = locationY;
_this->name = NULL;

}
...
``````

New constructor implementations are adding valid virtual method table pointer to each new object of the type. Now we are ready to override PrintOut and PrintOut1 functions in Ellipse and Rectangle types so that derived types can provide their own implementations for these virtual methods:

``````/* Listing of ellipse.h */
#include "shape.h"

#ifndef ELLIPSE_H
#define ELLIPSE_H

vtable_entry vtable_Shape_Ellipse[VT_SHAPE_END];

struct Ellipse
{
struct Shape _base;
float radiusX; /* radius field names assumed here */
float radiusY;
};
...
#endif
``````

Observe that, apart from declaring the virtual method table for Ellipse type, everything else remains the same. Only constructors need to be changed to initialize specific virtual method table pointer in instances of Ellipse type:

``````/* Partial listing of ellipse.c */
#include "ellipse.h"
#include <stdio.h>

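void Ellipse_Constructor(struct Ellipse *_this)
{

Shape_Constructor((struct Shape*)_this);

_this->_base.vtable = vtable_Shape_Ellipse;

_this->radiusX = 0.0F; /* radius field names assumed here */
_this->radiusY = 0.0F;

}

void Ellipse_Constructor1(struct Ellipse *_this, float locationX, float locationY,
float radiusX, float radiusY)
{

Shape_Constructor1((struct Shape*)_this, locationX, locationY);

_this->_base.vtable = vtable_Shape_Ellipse;

_this->radiusX = radiusX;
_this->radiusY = radiusY;

}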
``````

The pointer to the ellipse-specific virtual method table is set after the base type's constructor has finished - in that way the derived type effectively overrides the virtual method table.

The same process can be repeated in the Rectangle subtype. We will define its own vtable for methods inherited from Shape:

``````/* Partial listing of rectangle.h */
#include "shape.h"
#include "vtable.h"

#ifndef RECTANGLE_H
#define RECTANGLE_H

vtable_entry vtable_Shape_Rectangle[VT_SHAPE_END];

struct Rectangle
{
struct Shape _base;
float width;
float height;
};
...
#endif
``````
``````/* Partial listing of rectangle.c */
#include "rectangle.h"
#include <stdio.h>

void Rectangle_Constructor(struct Rectangle *_this)
{

Shape_Constructor((struct Shape*)_this);

_this->_base.vtable = vtable_Shape_Rectangle;

_this->height = 0.0F;
_this->width = 0.0F;

}

void Rectangle_Constructor1(struct Rectangle *_this, float locationX, float locationY,
float width, float height)
{

Shape_Constructor1((struct Shape*)_this, locationX, locationY);

_this->_base.vtable = vtable_Shape_Rectangle;

_this->width = width;
_this->height = height;

}
...
``````

The pointer to the new virtual method table is now initialized in all Rectangle instances. It only remains to populate the tables of all three types with pointers to their method implementations:

``````/* Listing of vtable.c */
#include "vtable.h"
#include "shape.h"
#include "ellipse.h"
#include "rectangle.h"

void vtable_Initialize()
{

vtable_Shape[VT_SHAPE_PRINTOUT] = Shape_PrintOut;
vtable_Shape[VT_SHAPE_PRINTOUT1] = Shape_PrintOut1;

vtable_Shape_Ellipse[VT_SHAPE_PRINTOUT] = Ellipse_PrintOut;
vtable_Shape_Ellipse[VT_SHAPE_PRINTOUT1] = Ellipse_PrintOut1;

vtable_Shape_Rectangle[VT_SHAPE_PRINTOUT] = Rectangle_PrintOut;
vtable_Shape_Rectangle[VT_SHAPE_PRINTOUT1] = Rectangle_PrintOut1;

}
``````
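As an aside, the tables do not necessarily have to be filled at run time. The following sketch shows one possible alternative, which is not the layout used in this lesson: each table is a constant structure of typed function pointers, filled by a static initializer, so that no setup call and no casts at the call site are needed. All names in it (Figure, Box, FigureVTable and the functions) are hypothetical, and text is formatted into a buffer only to keep the result easy to check.

```c
#include <stdio.h>

struct Figure;   /* hypothetical base type, standing in for Shape */

/* One typed slot per virtual function instead of an array of void*. */
struct FigureVTable
{
    int (*describe)(const struct Figure *self, char *buf, size_t size);
};

struct Figure
{
    const struct FigureVTable *vtable;
    float x;
    float y;
};

struct Box
{
    struct Figure base;   /* base part first, so a Box* can be viewed as Figure* */
    float width;
    float height;
};

static int Figure_describe(const struct Figure *self, char *buf, size_t size)
{
    return snprintf(buf, size, "at (%.2f, %.2f)", self->x, self->y);
}

static int Box_describe(const struct Figure *self, char *buf, size_t size)
{
    const struct Box *box = (const struct Box *)self;
    return snprintf(buf, size, "at (%.2f, %.2f); Width=%.2f, Height=%.2f",
                    self->x, self->y, box->width, box->height);
}

/* Tables are filled by static initializers; no run-time setup call is needed. */
static const struct FigureVTable Figure_vtable = { Figure_describe };
static const struct FigureVTable Box_vtable = { Box_describe };

void Figure_init(struct Figure *f, float x, float y)
{
    f->vtable = &Figure_vtable;
    f->x = x;
    f->y = y;
}

void Box_init(struct Box *b, float x, float y, float width, float height)
{
    Figure_init(&b->base, x, y);
    b->base.vtable = &Box_vtable;   /* override the table pointer after base init */
    b->width = width;
    b->height = height;
}

/* Dynamic dispatch: the call goes through whatever table the object points to. */
int Figure_Describe(const struct Figure *f, char *buf, size_t size)
{
    return f->vtable->describe(f, buf, size);
}
```

The trade-off is that every virtual function needs its own named slot in the table structure, whereas the array of generic entries used in this lesson can be extended by derived types simply by adding indices.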

We will now provide the whole of main.c source file which includes redesigned PrintShapes function and a call made to vtable_Initialize global function which initializes virtual method tables for our types. PrintShapes itself is redesigned to consult virtual method table of each object before making a call.

``````/* Listing of main.c */
#include <stdlib.h>
#include "ellipse.h"
#include "rectangle.h"
#include "shape.h"
#include "vtable.h"
#include <stdio.h>

void PrintShapes(struct Shape *shapes[], int count)
{
int i;
for (i = 0; i < count; i++)
{
vcall_Shape_PrintOut f = (vcall_Shape_PrintOut)shapes[i]->vtable[VT_SHAPE_PRINTOUT];
f(shapes[i]);
}
}

int main(void)
{

struct Ellipse *ellipse = NULL;
struct Rectangle *rectangle = NULL;
struct Shape *shapes[2];

vtable_Initialize();

ellipse = (struct Ellipse*)malloc(sizeof(struct Ellipse));
Ellipse_Constructor1(ellipse, 1.0F, 2.0F, 3.0F, 4.0F);
Shape_set_Name((struct Shape*)ellipse, "Ellipse");

rectangle = (struct Rectangle*)malloc(sizeof(struct Rectangle));
Rectangle_Constructor1(rectangle, 3.0F, 4.0F, 5.0F, 6.0F);
Shape_set_Name((struct Shape*)rectangle, "Rectangle");

shapes[0] = (struct Shape*)ellipse;
shapes[1] = (struct Shape*)rectangle;

PrintShapes(shapes, 2);

Ellipse_Destructor(ellipse);
free(ellipse);

Rectangle_Destructor(rectangle);
free(rectangle);

}
``````

After we have implemented virtual methods up to the finest detail, it should come as no surprise that this source code produces full output when PrintShapes function is called:

```
Ellipse's location is (1.00, 2.00); radiuses are (3.00, 4.00).
Rectangle's location is (3.00, 4.00); Width=5.00, Height=6.00.
```

This is the ultimate demonstration of the power of virtual methods. A function which is declared to operate on instances of the base type is dynamically bound to methods of derived types. Moreover, each instance is bound to the implementation of virtual methods that belongs to its own type. But that is not all - if a derived type is satisfied with the base type's implementation of a virtual method, then it is free to put the pointer to the base type's method into its own vtable and thus decide not to override the method.
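The last point, a derived type that keeps one base implementation and overrides another, can be sketched in a few lines. This is only an illustration under assumed names (DemoBase, DemoSquare and the slot constants are hypothetical). It also stores the entries as generic function pointers rather than void*, which is a slightly different choice from the vtable_entry type used above.

```c
typedef void (*vfunc)(void);   /* generic function-pointer slot */

enum { SLOT_NAME = 0, SLOT_AREA = 1, SLOT_END = 2 };

struct DemoBase { const vfunc *vtable; float side; };
struct DemoSquare { struct DemoBase base; };

typedef const char *(*name_fn)(const struct DemoBase *);
typedef float (*area_fn)(const struct DemoBase *);

static const char *DemoBase_name(const struct DemoBase *self)
{
    (void)self;
    return "generic shape";
}

static float DemoBase_area(const struct DemoBase *self)
{
    (void)self;
    return 0.0F;
}

static float DemoSquare_area(const struct DemoBase *self)
{
    return self->side * self->side;
}

static const vfunc DemoBase_vtable[SLOT_END] =
    { (vfunc)DemoBase_name, (vfunc)DemoBase_area };

/* The derived table overrides only the area slot; the name slot simply
   repeats the pointer to the base implementation. */
static const vfunc DemoSquare_vtable[SLOT_END] =
    { (vfunc)DemoBase_name, (vfunc)DemoSquare_area };

void DemoBase_init(struct DemoBase *b, float side)
{
    b->vtable = DemoBase_vtable;
    b->side = side;
}

void DemoSquare_init(struct DemoSquare *s, float side)
{
    DemoBase_init(&s->base, side);
    s->base.vtable = DemoSquare_vtable;
}

/* Callers cast each slot back to its real signature before calling it. */
const char *Demo_name(const struct DemoBase *b)
{
    return ((name_fn)b->vtable[SLOT_NAME])(b);
}

float Demo_area(const struct DemoBase *b)
{
    return ((area_fn)b->vtable[SLOT_AREA])(b);
}
```

For a DemoSquare initialized with side 3, Demo_name still reaches the base function while Demo_area reaches the override.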

Virtual Methods in Object-Oriented Languages

Implementing virtual methods in C++ is very easy. In the base class we simply put the virtual keyword at the beginning of method declaration and all work regarding base class is finished:

``````// Partial listing of shape.hpp
#ifndef SHAPE_HPP
#define SHAPE_HPP

class Shape
{
public:
...
virtual void PrintOut(const char *suffix=".\n");
...
};

#endif
``````

When compiler encounters a virtual function, it immediately creates a virtual method table for that class and puts that and any other methods marked virtual into the table. Moreover, all derived classes will also be generated with their versions of vtable, which is then populated with function pointers accordingly. It can be shown on example of the Ellipse class:

``````// Listing of ellipse.hpp
#include "shape.hpp"

#ifndef ELLIPSE_HPP
#define ELLIPSE_HPP

class Ellipse: public Shape
{
public:
...
void PrintOut(const char *suffix=".\n");
...
};

#endif
``````

Derived class simply provides a method with the same name, arguments and return type as the virtual method in its base class. As soon as such method is declared, compiler adds pointer to it to derived class's version of the vtable. By doing so, derived class has overridden base class's implementation of the method. Exactly the same thing is done in the Rectangle class to provide yet another override of the PrintOut method:

``````// Listing of rectangle.hpp
#include "shape.hpp"

#ifndef RECTANGLE_HPP
#define RECTANGLE_HPP

class Rectangle: public Shape
{
public:
...
void PrintOut(const char *suffix=".\n");
...
};

#endif
``````

Implementation of method overrides is similar to that in C programming language. For example, PrintOut override in Ellipse class may look like this:

``````// Partial listing of ellipse.cpp
#include <iostream>
#include "ellipse.hpp"

using namespace std;

void Ellipse::PrintOut(const char *suffix)
{

Shape::PrintOut(NULL);

cout.flags(ios::fixed);
cout.precision(2);

cout << "; radiuses are (" << radiusX << ", " << radiusY << ")"; // member names assumed

if (suffix != NULL)
cout << suffix;

}
``````

Ellipse class is referencing base class implementation of the method by stating the class name in this line:

``````Shape::PrintOut(NULL);
``````

This is sufficient to delegate the call to base implementation of the method.

Implementation in C# is equally simple. Declaring the method virtual is exactly the same as in C++. Only overriding is done a little bit differently as we will shortly see. Here are implementations of the classes:

``````// Partial listing of Shape.cs
using System;

namespace Geometry
{
public class Shape
{
...
public virtual void PrintOut()
{
PrintOut("." + Environment.NewLine);
}

public virtual void PrintOut(string suffix)
{
Console.Write("{0}'s location is ({1:0.00}, {2:0.00}){3}",
Name, locationX, locationY, suffix);
}
...
}
}
``````

Putting a prefix virtual in method declaration has the same impact as it had in C++. Compiler will create virtual method table for the Shape class and for all its derived classes. Since we have implemented PrintOut function as two overloaded methods, both have to be declared virtual. (This is not a syntactical necessity, though - code will compile even if only one is virtual and the other one not, but it would be hard to justify such design decision.) Overriding a method in C# is the same as in C++, only such method must be marked with another keyword: override. Here is the source code for Ellipse class:

``````// Partial listing of ellipse.cs

using System;

namespace Geometry
{
public class Ellipse: Shape
{
...
public override void PrintOut()
{
PrintOut("." + Environment.NewLine);
}

public override void PrintOut(string suffix)
{
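base.PrintOut(null);
Console.Write("; radiuses are ({0:0.00}, {1:0.00}){2}",
radiusX, radiusY, suffix); // field names assumed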
}
...
}
}
``````

Apart from the override keyword, another difference compared to C++ is that base class is referred using the base keyword.

These few lines in C#, and those in C++, demonstrate the ease with which we can implement dynamic dispatch in modern object-oriented languages. All those piles of code developed in C to maintain and later resolve pointers to virtual functions are not required any more. Of course, this only means that the programmer does not have to code them explicitly. They are still present under the hood, implemented on our behalf by the compiler. A pointer to the vtable is silently added to all objects of classes that possess virtual methods. The virtual method tables are set up before any of our code can use them (that was the job of the vtable_Initialize global function in C), so pointers to virtual methods are in place before the first instance of the class is created. The compiler and the runtime libraries take care that all moving parts are in their proper positions when execution comes to the point of calling a virtual method.

Adding Virtual Methods to the Subtype

Suppose that we want to declare a new method in the Ellipse type, called MakeCircular. Its task would be to make the radiuses equal, e.g. by reducing the larger one to match the smaller one. We want this method to be virtual so that it can be overridden by types deriving from Ellipse. Here is the listing which demonstrates adding the virtual method to the Ellipse type, with its entry appended at the end of the inherited table of methods defined by Shape:

``````/* Listing of ellipse.h */
#include "shape.h"

#ifndef ELLIPSE_H
#define ELLIPSE_H

#define VT_ELLIPSE_MAKECIRCULAR VT_SHAPE_END
#define VT_ELLIPSE_END (VT_SHAPE_END + 1)

vtable_entry vtable_Shape_Ellipse[VT_ELLIPSE_END];

struct Ellipse
{
struct Shape _base;
float radiusX; /* radius field names assumed here */
float radiusY;
};

typedef void (*vcall_Ellipse_MakeCircular)(struct Ellipse*);

...
void Ellipse_MakeCircular(struct Ellipse *_this);
...
#endif
``````

We have used constants defined by the Shape type to discover index at which pointer to MakeCircular method would be stored in the extended virtual method table. This modification requires vtable_Initialize global function to be changed as well, so that new entry in the table is properly initialized:

``````/* Listing of vtable.c */
#include "vtable.h"
#include "shape.h"
#include "ellipse.h"
#include "rectangle.h"

void vtable_Initialize()
{

vtable_Shape[VT_SHAPE_PRINTOUT] = Shape_PrintOut;
vtable_Shape[VT_SHAPE_PRINTOUT1] = Shape_PrintOut1;

vtable_Shape_Ellipse[VT_SHAPE_PRINTOUT] = Ellipse_PrintOut;
vtable_Shape_Ellipse[VT_SHAPE_PRINTOUT1] = Ellipse_PrintOut1;
vtable_Shape_Ellipse[VT_ELLIPSE_MAKECIRCULAR] = Ellipse_MakeCircular;

vtable_Shape_Rectangle[VT_SHAPE_PRINTOUT] = Rectangle_PrintOut;
vtable_Shape_Rectangle[VT_SHAPE_PRINTOUT1] = Rectangle_PrintOut1;

}
``````

And that is all that was required to add virtual functions to the derived type. Calling this virtual function on an object is the same as it was with the base type methods:

``````struct Ellipse *ellipse = (struct Ellipse*)malloc(sizeof(struct Ellipse));
Ellipse_Constructor1(ellipse, 1.0F, 2.0F, 3.0F, 4.0F);
Shape_set_Name((struct Shape*)ellipse, "Ellipse");

vcall_Ellipse_MakeCircular f = (vcall_Ellipse_MakeCircular)ellipse->_base.vtable[VT_ELLIPSE_MAKECIRCULAR];
f(ellipse);

Ellipse_Destructor(ellipse);
free(ellipse);
``````

Should the type derived from Ellipse provide its own override for this method, dynamic dispatch mechanism would be there to call the overridden implementation.
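A minimal sketch of such an override follows. It is self-contained and deliberately simplified: the table has a single slot, entries are stored as generic function pointers, and the types DemoEllipse and DemoWideEllipse, as well as the behavior of the override, are assumptions made for illustration only.

```c
typedef void (*vfunc)(void);   /* generic function-pointer slot */

enum { VT_MAKECIRCULAR = 0, VT_END = 1 };   /* one slot keeps the sketch short */

struct DemoEllipse { const vfunc *vtable; float radiusX; float radiusY; };
struct DemoWideEllipse { struct DemoEllipse base; };

typedef void (*make_circular_fn)(struct DemoEllipse *);

/* Base behavior described in the text: shrink the larger radius. */
static void DemoEllipse_MakeCircular(struct DemoEllipse *e)
{
    float r = e->radiusX < e->radiusY ? e->radiusX : e->radiusY;
    e->radiusX = r;
    e->radiusY = r;
}

/* Hypothetical override: grow the smaller radius instead. */
static void DemoWideEllipse_MakeCircular(struct DemoEllipse *e)
{
    float r = e->radiusX > e->radiusY ? e->radiusX : e->radiusY;
    e->radiusX = r;
    e->radiusY = r;
}

static const vfunc DemoEllipse_vtable[VT_END] =
    { (vfunc)DemoEllipse_MakeCircular };
static const vfunc DemoWideEllipse_vtable[VT_END] =
    { (vfunc)DemoWideEllipse_MakeCircular };

void DemoEllipse_init(struct DemoEllipse *e, float radiusX, float radiusY)
{
    e->vtable = DemoEllipse_vtable;
    e->radiusX = radiusX;
    e->radiusY = radiusY;
}

void DemoWideEllipse_init(struct DemoWideEllipse *w, float radiusX, float radiusY)
{
    DemoEllipse_init(&w->base, radiusX, radiusY);
    w->base.vtable = DemoWideEllipse_vtable;   /* replace the table after base init */
}

/* The caller only knows about DemoEllipse; the table decides what runs. */
void Demo_MakeCircular(struct DemoEllipse *e)
{
    ((make_circular_fn)e->vtable[VT_MAKECIRCULAR])(e);
}
```

With radiuses 3 and 4, Demo_MakeCircular leaves a plain DemoEllipse with both radiuses equal to 3, and a DemoWideEllipse with both equal to 4, although the calling code is identical.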

Virtual Getters and Setters

In terms of virtual functions, getter and setter methods are treated as any other. If these methods are declared as virtual, then their addresses are added to virtual method table and derived types can override them. C# brings in the properties, as shorthand notation for getter and/or setter methods. We have already seen that C# property X with get and set sections is internally replaced with get_X and set_X methods. Then it should not be a problem to accept that properties can be declared virtual as if they were true methods. Compiler is there to find place for property getter and setter in the virtual method table.

As an example, suppose that Shape class allows derived classes to override Name property. To do so, it is sufficient to declare the Name property virtual:

``````// Partial listing of Shape.cs
using System;

namespace Geometry
{
public class Shape
{
...
public virtual string Name
{
get
{
return name ?? string.Empty;
}
set
{
name = value;
}
}
...
}
}
``````

Overriding the Name property then goes like defining it again in the derived class, only with override keyword added to the declaration:

``````// Partial listing of ellipse.cs
using System;

namespace Geometry
{
public class Ellipse: Shape
{
...
public override string Name
{
get
{
return base.Name;
}
set
{
if (string.IsNullOrEmpty(value))
base.Name = "Ellipse";
else
base.Name = value;
}
}
...
}
}
``````
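The same idea can be expressed in plain C by giving the getter and the setter their own slots in the virtual method table. The sketch below is a simplified illustration, not code from this series: the NamedShape and NamedEllipse types are hypothetical, the name is stored as a borrowed pointer rather than a copied string, and entries are kept as generic function pointers.

```c
#include <stddef.h>

typedef void (*vfunc)(void);   /* generic function-pointer slot */

enum { VT_GET_NAME = 0, VT_SET_NAME = 1, VT_END = 2 };

struct NamedShape { const vfunc *vtable; const char *name; };
struct NamedEllipse { struct NamedShape base; };

typedef const char *(*get_name_fn)(const struct NamedShape *);
typedef void (*set_name_fn)(struct NamedShape *, const char *);

static const char *NamedShape_get_Name(const struct NamedShape *s)
{
    return s->name != NULL ? s->name : "";
}

static void NamedShape_set_Name(struct NamedShape *s, const char *value)
{
    s->name = value;   /* borrowed pointer; the caller keeps the string alive */
}

/* Overridden setter: falls back to a default, then delegates to the base setter. */
static void NamedEllipse_set_Name(struct NamedShape *s, const char *value)
{
    if (value == NULL || value[0] == '\0')
        value = "Ellipse";
    NamedShape_set_Name(s, value);
}

static const vfunc NamedShape_vtable[VT_END] =
    { (vfunc)NamedShape_get_Name, (vfunc)NamedShape_set_Name };

/* The getter is inherited unchanged; only the setter slot is replaced. */
static const vfunc NamedEllipse_vtable[VT_END] =
    { (vfunc)NamedShape_get_Name, (vfunc)NamedEllipse_set_Name };

void NamedShape_init(struct NamedShape *s)
{
    s->vtable = NamedShape_vtable;
    s->name = NULL;
}

void NamedEllipse_init(struct NamedEllipse *e)
{
    NamedShape_init(&e->base);
    e->base.vtable = NamedEllipse_vtable;
}

const char *Named_get_Name(const struct NamedShape *s)
{
    return ((get_name_fn)s->vtable[VT_GET_NAME])(s);
}

void Named_set_Name(struct NamedShape *s, const char *value)
{
    ((set_name_fn)s->vtable[VT_SET_NAME])(s, value);
}
```

Setting a NULL or empty name through Named_set_Name yields "Ellipse" for a NamedEllipse and an empty string for a plain NamedShape, which mirrors the behavior of the C# property override above.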

Virtual Destructors

Consider the following function applied to geometric shapes, which is supposed to deallocate objects from the specified array:

``````void ReleaseShapes(struct Shape *shapes[], int count)
{

int i;

for (i = 0; i < count; i++)
{
Shape_Destructor(shapes[i]);
free(shapes[i]);
}

}
``````

This code iterates through the array of shapes, calls destructor on each of them and then releases the instance. Problem occurs if shapes array contains instances of derived types. In that case, derived destructor would not be called. It is obvious that destructors should be declared as virtual, so that derived destructor can be dynamically called when needed. With this change in place, we are assured that correct destructor is called, whichever object is passed to the function. Rules applied to the destructor are the same as for any other virtual method.
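Here is a hedged sketch of what a virtual destructor could look like in C. It does not reuse the lesson's types: Resource and BufferResource are hypothetical, entries are stored as generic function pointers, and the two global counters exist only so that the effect of the calls can be observed.

```c
#include <stdlib.h>

typedef void (*vfunc)(void);   /* generic function-pointer slot */

enum { VT_DESTRUCTOR = 0, VT_END = 1 };

struct Resource { const vfunc *vtable; };
struct BufferResource { struct Resource base; char *buffer; };

typedef void (*destructor_fn)(struct Resource *);

/* Counters exist only so the effect can be observed in this sketch. */
int base_destructor_calls = 0;
int derived_destructor_calls = 0;

static void Resource_Destructor(struct Resource *self)
{
    (void)self;
    base_destructor_calls++;
}

/* The derived destructor releases its own part, then chains to the base one. */
static void BufferResource_Destructor(struct Resource *self)
{
    struct BufferResource *derived = (struct BufferResource *)self;
    free(derived->buffer);
    derived->buffer = NULL;
    derived_destructor_calls++;
    Resource_Destructor(self);
}

static const vfunc Resource_vtable[VT_END] =
    { (vfunc)Resource_Destructor };
static const vfunc BufferResource_vtable[VT_END] =
    { (vfunc)BufferResource_Destructor };

struct Resource *Resource_New(void)
{
    struct Resource *r = malloc(sizeof *r);
    if (r != NULL)
        r->vtable = Resource_vtable;
    return r;
}

struct Resource *BufferResource_New(size_t size)
{
    struct BufferResource *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->base.vtable = BufferResource_vtable;
    b->buffer = malloc(size);   /* may be NULL; free(NULL) is harmless */
    return &b->base;
}

/* Looks up the destructor in each object's table instead of naming it. */
void ReleaseAll(struct Resource *items[], int count)
{
    int i;
    for (i = 0; i < count; i++)
    {
        destructor_fn destroy = (destructor_fn)items[i]->vtable[VT_DESTRUCTOR];
        destroy(items[i]);
        free(items[i]);
    }
}
```

Releasing one Resource and one BufferResource through ReleaseAll runs the derived destructor once and the base destructor twice, because the derived destructor finishes by calling the base one.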

Things are getting a little bit more relaxed with C#. In that language, and in all .NET languages as well, destructors are virtual by default. There is nothing to be done by the programmer, as runtime ensures that proper destructor (finalizer in C# terms) will be called in all cases.