Posts in this Series:
Linked Lists
Stacks
Queues
Trees (this post)

To conclude the data structures shown in this series, we will show the tree. The tree will store organized data – when you add an item to a tree, it will be stored in a way that it can be easily retrieved in a sorted way.

The first node added to the tree acts as a root node and it has two pointers, one to its left node and one to its right node. The tree is traversed recursively, checking the left and right nodes, until the nodes have no more children.

Its node structure is like this:

[sourcecode language=”csharp” padlinenumbers=”true”]
public class TreeNode<T> where T : IComparable<T>
{

public TreeNode()
{
Left = null;
Right = null;
}

public T Data { get; set; }
public TreeNode<T> Left { get; set; }
public TreeNode<T> Right { get; set; }
}
[/sourcecode]

The treenode’s data must implement IComparable<T>, so it can be compared with other tree elements. The tree is a sorted data structure, so we must know how one element compares to the other. Besides that, it has two pointers, one to its left child and another to its right child.

Once we have the node data structure, we can start adding operations to the tree. The first one is to insert data in the tree:

 

[sourcecode language=”csharp”]
public void Add(T data)
{
var treenode = new TreeNode&lt;T&gt;() { Data = data };
if (_root == null)
{
_root = treenode;
_count++;
return;
}
_currentNode = _root;
while (true)
{
var comparison = treenode.Data.CompareTo(_currentNode.Data);
if (comparison == 0)
throw new InvalidOperationException(&quot;Data already in the tree&quot;);
else if (comparison &lt; 0)
{
if (_currentNode.Left != null)
_currentNode = _currentNode.Left;
else
{
_currentNode.Left = treenode;
_count++;
break;
}
}
else
{
if (_currentNode.Right != null)
_currentNode = _currentNode.Right;
else
{
_currentNode.Right = treenode;
_count++;
break;
}
}
}
}
[/sourcecode]

For the insertion, we have two different cases:

  • if the root node is null, the tree is empty, so the new node becomes the root
  • If the tree is not null, we have to find the insertion point, navigating through the tree – if the inserted data is smaller than the current node’s data, we go to its left child; if it’s larger, we go to its right node. If it’s the same, the data is already inserted and we show an error. We navigate through the tree until we find a node that has no more left or right nodes (depending on the comparison) and we insert the new node in the empty node.

Removing data from the tree is more complex and all can be summed in three cases:

  • The node to be removed has no children. In this case, we set its parent’s node pointing to it to null

image

  • The node to be removed has only one child (left or right). In this case, we set its parent’s left or right node to point to the child node of the deleted node.

image

  • The node to be removed has both the left and right nodes. In this case, we must find the predecessor of this node (the first one that is smaller than it and has no children at the right) and replace the deleted node with the predecessor and update the link from the parent of the predecessor to the predecessor right child (if it exist).

image

The code for deleting a node is:

[sourcecode language=”csharp” padlinenumbers=”true”]
public bool Remove(T obj)
{
var nodeAndParent = FindParentAndNode(obj);
if (nodeAndParent == null)
return false;
var parent = nodeAndParent.Item1;
var currentNode = nodeAndParent.Item2;

if (currentNode.Left == null &amp;&amp; currentNode.Right == null)
{
if (parent == null)
_root = null;
else if (parent.Left == currentNode)
parent.Left = null;
else
parent.Right = null;
_count–;
return true;
}

if (currentNode.Left == null)
{
if (parent == null)
_root = currentNode.Right;
else if (parent.Left == currentNode)
parent.Left = currentNode.Right;
else
parent.Right = currentNode.Right;
_count–;
return true;
}

if (currentNode.Right == null)
{
if (parent == null)
_root = currentNode.Left;
else if (parent.Left == currentNode)
parent.Left = currentNode.Left;
else
parent.Right = currentNode.Left;
_count–;
return true;
}

var parentAndPredecessor = FindParentAndPredecessor(currentNode);
var predParent = parentAndPredecessor.Item1;
var pred = parentAndPredecessor.Item2;
if (predParent == currentNode)
{
if (parent.Left == currentNode)
{
parent.Left = pred;
pred.Right = currentNode.Right;
}
else
{
parent.Right = pred;
pred.Right = currentNode.Right;
}
}
else
{
predParent.Right = pred.Left;
currentNode.Data = pred.Data;
}
return false;
}

[/sourcecode]

At the beginning, we find the node to be deleted and its parent. The FindParentAndNode function traverses the tree and returns the node to be deleted and its parent as a tuple:

[sourcecode language=”csharp”]
public Tuple&lt;TreeNode&lt;T&gt;, TreeNode&lt;T&gt;&gt; FindParentAndNode(T obj)
{
TreeNode&lt;T&gt; currentNode = _root;
TreeNode&lt;T&gt; parentNode = null;
while (true)
{
int comparison = obj.CompareTo(currentNode.Data);
if (comparison &lt; 0)
{
if (currentNode.Left != null)
{
parentNode = currentNode;
currentNode = currentNode.Left;
}
else
return null;
}
else if (comparison &gt; 0)
{
if (currentNode.Right != null)
{
parentNode = currentNode;
currentNode = currentNode.Right;
}
else
return null;
}
else
return Tuple.Create(parentNode, currentNode);
}
}
[/sourcecode]

After finding the node and its parent, when then check for any of the three cases: if it’s a leaf node (no left and right children), then we remove the link from its parent to it. It it has one of the children, we set the link from its parent to its children. If it has both children, then we find its predecessor and the predecessor’s parent and replace the links from the predecessor parent to the deleted node children and replace the data from the deleted node with the predecessor’s data. The FindParentAndPredecessor function is:

[sourcecode language=”csharp”]
private Tuple&lt;TreeNode&lt;T&gt;,TreeNode&lt;T&gt;&gt; FindParentAndPredecessor(TreeNode&lt;T&gt; currentNode)
{
var parent = currentNode;
var pred = currentNode.Left;
while (pred.Right != null)
{
parent = pred;
pred = parent.Right;
}
return Tuple.Create(parent, pred);
}
[/sourcecode]

Now we have all the operational methods in place. Now we need the methods to retrieve data. The first one is the Exists method, that searches an item and returns true if it exists:

[sourcecode language=”csharp”]
public bool Exists(T obj)
{
if (_root == null)
return false;
TreeNode&lt;T&gt; currentNode = _root;
while (true)
{
int comparison = obj.CompareTo(currentNode.Data);
if (comparison &lt; 0)
{
if (currentNode.Left != null)
currentNode = currentNode.Left;
else
return false;
}
else if (comparison &gt; 0)
{
if (currentNode.Right != null)
currentNode = currentNode.Right;
else
return false;
}
else
return true;
}
}
[/sourcecode]

This method will traverse the tree comparing the nodes’ data: if the data of the searched node is greater than the current node’s data, then we go to the right branch. If it is smaller, we go to the left branch. If we reach a node with no corresponding branch, the item is not in the tree. If we find a node where the data is equal to the searched data, we return true.

The other method is GetItems, that returns the sorted items:

[sourcecode language=”csharp”]
public IEnumerable&lt;T&gt; GetItems()
{
if (_root == null)
yield break;
foreach (var data in VisitNode(_root))
yield return data;
}

private IEnumerable&lt;T&gt; VisitNode(TreeNode&lt;T&gt; node)
{
if (node.Left != null)
foreach (var data in VisitNode(node.Left))
yield return data;
yield return node.Data;
if (node.Right != null)
foreach (var data in VisitNode(node.Right))
yield return data;
}
[/sourcecode]

To get the ordered items data, we use a recursive function that does what is called “InOrder Traversal”: for each node in the tree, it traverses the left node, emits the data for the current node and then traverses the right node.

That way we have finished our journey to knowing the binary tree. This is a data structure more complex that the ones we have seen before, but it is also extensively used.

image

The source code for this series of articles is in https://github.com/bsonnino/DataStructures

Posts in this Series:
Linked Lists
Stacks
Queues (this post)
Trees

In the last post, I’ve talked about stacks, a LIFO (Last In – First out) data structure. In this post, I will talk about queues, a FIFO (First in – First out) data structure. A queue is like any box office queue, where the first in line are served first.

The queue has these operations:

  • Enqueue – adds an item to the end of the queue
  • Dequeue – Removes an item from the start of the queue

Like the stack, you can’t add or remove items from the middle of the queue. The node structure is the same as the nodes used in the linked list and in the stack. The Enqueue method is like this:

[sourcecode language=”csharp” padlinenumbers=”true”]
public void Enqueue(T obj)
{
NodeList&lt;T&gt; node = new NodeList&lt;T&gt;() { Data = obj };
_count++;
if (_last == null)
{
_last = _top = node;
return;
}
_last.Next = node;
_last = node;
}
[/sourcecode]

Note that there is a subtle change from the stack code: we are using an extra pointer for the last node, so we can get fast inserts and deletes in the queue.

The Dequeue method is:

[sourcecode language=”csharp”]
public T Dequeue()
{
if (_top != null)
{
NodeList&lt;T&gt; result = _top;
_top = _top.Next;
_count–;
return result.Data;
}
throw (new InvalidOperationException("The queue is empty"));
}

[/sourcecode]

We remove the top node from the queue. With these two operations, the queue is ready. We can also do the same way as the stack and use a linked list to implement it:

[sourcecode language=”csharp”]
public class QueueAsList&lt;T&gt;
{
private readonly LinkedList&lt;T&gt;_linkedList = new LinkedList&lt;T&gt;();

public void Enqueue(T obj)
{
_linkedList.Insert(obj);
}

public T Dequeue()
{
if (_linkedList.Count == 0)
throw (new InvalidOperationException("The queue is empty"));
var result = _linkedList[0];
_linkedList.Remove(result);
return result;
}

public int Count =&gt; _linkedList.Count;

public void Clear()
{
_linkedList.Clear();
}
}
[/sourcecode]

Now we have our implementations of the queue and we can move to the next article, the Tree.

image

The source code for this series of articles is in https://github.com/bsonnino/DataStructures

Posts in this Series:
Linked Lists
Stacks (this post)
Queues
Trees

In the last post I’ve talked about one of the most basic data structures, the linked list. In this post we will start talking about stacks.

A stack is a LIFO (Last In –First out) data structure, it’s like a pile of plates: you start stacking them and, when you need one, you remove the top of the pile, the last you’ve put.

Its structure is very similar to the linked list, but you have these operations on it:

  • Push – add a new item to the stack
  • Pop – remove the top item from the stack
  • Peek – take a look at the top of the stack but don’t remove the item

As you can see, you can only operate on the top of the stack, you cannot add or remove an item in the middle of the stack. As we’ve seen in the last article, creating a generic stack is pretty much the same as creating a non generic one, so we will only develop a generic one in this article.

The Push method will add a new item in the top of the stack, thus setting the _top field to point to the newly created node:

[sourcecode language=”csharp” padlinenumbers=”true”]
private NodeList<T> _top;
private int _count;

public void Push(T obj)
{
var node = new NodeList<T>;
{
Data = obj,
Next = _top
};
_top = node;
_count++;
}
[/sourcecode]

The Pop method checks the top of the stack. If it’s null (empty stack), it throws an InvalidOperationException. If the stack isn’t empty, the top and count are updated and the old top is returned:

[sourcecode language=”csharp”]
public T Pop()
{
if (_top == null)
throw (new InvalidOperationException("The stack is empty"));
var result = _top;
_top = _top.Next;
_count–;
return result.Data;
}
[/sourcecode]

Peek is very simple, it only checks if there is any data in the top of the stack and returns its data:

[sourcecode language=”csharp”]
public T Peek()
{
if (_top != null)
return _top.Data;
throw (new InvalidOperationException("The stack is empty"));
}
[/sourcecode]

Our stack is ready. If you take a closer look, you will see that the stack can be implemented with a Linked List: Push will insert an item at the top (with Insert), Pop will get the top item, remove it and return it. Peek will get the top item and return it:

[sourcecode language=”csharp”]
public class StackAsList<T>;
{

private readonly LinkedList<T> _linkedList = new LinkedList<T>();

public void Push(T obj)
{
_linkedList.Insert(obj);
}

public T Peek()
{
if (_linkedList.Count == 0)
throw (new InvalidOperationException("The stack is empty"));
return _linkedList[_linkedList.Count-1];
}

public T Pop()
{
if (_linkedList.Count == 0)
throw (new InvalidOperationException("The stack is empty"));
var result = _linkedList[_linkedList.Count – 1];
_linkedList.Remove(result);
return result;
}

public int Count => _linkedList.Count;

public void Clear()
{
_linkedList.Clear();
}

}
[/sourcecode]

Pretty simple, no? Now we have two implementations of the stack, in the next article we will see the Queue.

image

The source code for this series of articles is in https://github.com/bsonnino/DataStructures

Posts in this Series:
Linked Lists (this post)
Stacks
Queues
Trees

Although we use data structures daily in our programs, sometimes we forget the theory behind them. So, I decided to create this series of posts to give a refresher on the data structures theory.

This is is by no means a try to replace the current data structures (for sure, the .NET ones will be better optimized than these ones), but just a way to remember what’s going on when we use them.

They will be entirely developed in C# and I will create a WPF program to show examples of their use. So, let’s start with the first one: Linked Lists.

Linked lists are lists of elements that are linked by a pointer to the next element. Each element has a structure like this:

[sourcecode language=”csharp” padlinenumbers=”true”]
class NodeList
{
public NodeList()
{
Next = null;
}

public object Data { get; set; }
public NodeList Next { get; set; }
}
[/sourcecode]

 

Each node stores its data and a pointer to the next node. The last node in the list points to null. The linked list would be something like this:

Singly-linked-list.svg

(Source – Wikipedia)

The only thing that the list must store is its top node. Then, all traversing is done using the Next pointers of each node. So, a basic linked list would be like this:

[sourcecode language=”csharp” htmlscript=”true”]
class LinkedList
{
private NodeList _top = null;
}
[/sourcecode]

This list wouldn’t do much, we need some operations. To insert some data in the list, we could use the Insert method:

[sourcecode language=”csharp” htmlscript=”true”]
public int Insert(object obj)
{
var node = new NodeList { Data = obj };
if (_top == null)
{
_top = node;
_last = node;
}
else
{
_last.Next= node;
_last = node;
}
_count++;
return _count;
}
[/sourcecode]

I added the fields _last, that stores the last item in the list, so there is no need to traverse the list for every inserted node and _count, that stores the number of items in the list. To clear the list, we only  have to set the top node to null:

[sourcecode language=”csharp” htmlscript=”true”]
public void Clear()
{
_top = null;
_count = 0;
}
[/sourcecode]

To get the number of Items on the list, we could call the Count property:

[sourcecode language=”csharp”]
public int Count => _count;
[/sourcecode]

We can also insert some data at a fixed position:

[sourcecode language=”csharp”]
public int InsertAt(int position, object obj)
{
if (position < 0 || position >= _count)
throw (new IndexOutOfRangeException());
var node = new NodeList { Data = obj };
if (position == 0)
{
node.Next = _top;
_top = node;
}
else
{
var current = GetNodeAt(position – 1);
node.Next = current.Next;
current.Next = node;
}
_count++;
return _count;
}
[/sourcecode]

To insert a new node in any position, we first check for the special case to insert in the first position. If it is, we replace the top item with the inserted node and set the Next property to point to the old top. If it’s another position, we traverse the tree using the GetNodeAt, to get the item before the item to be inserted and replace the next pointer of the inserted node, pointing to where the found item was pointing and replacing the next pointer of the found item, pointing to the inserted node. The GetNodeAt function is:

[sourcecode language=”csharp”]
private NodeList GetNodeAt(int position)
{
if (position < 0 || position >= _count)
throw (new IndexOutOfRangeException());
var current = _top;
for(int i = 0; i < position;i++)
{
current = current.Next;
}
return current;
}
[/sourcecode]

With GetNodeAt, we can create an indexed property to get the data in the nth node:

[sourcecode language=”csharp”]
public object this[int index] => GetNodeAt(index).Data;
[/sourcecode]

 

To remove a node, we just need to set the next pointer of the node before to the node after the deleted one:

[sourcecode language=”csharp”]
public bool Remove(object obj)
{
var currentItem = Find(obj);
if (currentItem == -1)
return false;
if (currentItem == 0)
{
_top = _top.Next;
}
else
{
var previousNode = GetNodeAt(currentItem – 1);
previousNode.Next = previousNode.Next.Next;
if (previousNode.Next == null)
_last = previousNode;
}
_count–;
return true;
}
[/sourcecode]

 

We find the item related to the object we want to remove and then set the next node to the value pointed to the node to be deleted. The Find method is:

[sourcecode language=”csharp”]
public int Find(object obj)
{
if (Count == 0)
return -1;
NodeList current = _top;
int currentNo = 0;
do
{
if (current.Data.Equals(obj))
{
return currentNo;
}
current = current.Next;
currentNo++;
} while (current != null);
return -1;
}
[/sourcecode]

We can also create a RemoveAt method:

[sourcecode language=”csharp”]
public bool RemoveAt(int position)
{
if (position < 0 || position >= _count)
throw (new IndexOutOfRangeException());
if (position == 0)
{
_top = _top.Next;
}
else
{
var previousNode = GetNodeAt(position – 1);
previousNode.Next = previousNode.Next.Next;
if (previousNode.Next == null)
_last = previousNode;
}
_count–;
return true;
}
[/sourcecode]

We must still create one last method to retrieve all elements in the list:

[sourcecode language=”csharp”]
public IEnumerable GetItems()
{
if (Count == 0)
yield break;
_currentItem = _top;
while (_currentItem != null)
{
yield return _currentItem.Data;
_currentItem = _currentItem.Next;
}
}
[/sourcecode]

The LinkedList class is finished, but it has one issue, here: the Data property is an Object and this brings some issues: boxing and unboxing of objects, there is no verifying of data integrity (you can add any type of object, and the objects don’t need to be related). So, the next step is to introduce a Generic Linked List, so we can solve these issues. The new NodeList is:

[sourcecode language=”csharp”]
class NodeList<T>
{
public NodeList()
{
Next = null;
}

public T Data { get; set; }
public NodeList<T> Next { get; set; }
}
[/sourcecode]

And the new LinkedList is:

[sourcecode language=”csharp”]
public class LinkedList<T>
{
private NodeList<T> _top;
private NodeList<T> _last;
private int _count;
private NodeList<T> _currentItem;

public int Insert(T obj)
{
var node = new NodeList<T> { Data = obj };
if (_top == null)
{
_top = node;
_last = node;
}
else
{
_last.Next = node;
_last = node;
}
_count++;
return _count;
}

private NodeList<T> GetNodeAt(int position)
{
if (position < 0 || position >= _count)
throw (new IndexOutOfRangeException());
var current = _top;
for (int i = 0; i < position; i++)
{
current = current.Next;
}
return current;
}

public int InsertAt(int position, T obj)
{
if (position < 0)
throw (new IndexOutOfRangeException());
var node = new NodeList<T> { Data = obj };
if (position == 0)
{
node.Next = _top;
_top = node;
}
else
{
var current = GetNodeAt(position – 1);
node.Next = current.Next;
current.Next = node;
}
_count++;
return _count;
}

public void Clear()
{
_top = null;
_count = 0;
}

public int Count => _count;

public T this[int index] => GetNodeAt(index).Data;

public int Find(T obj)
{
if (Count == 0)
return -1;
NodeList<T> current = _top;
int currentNo = 0;
do
{
if (current.Data.Equals(obj))
{
return currentNo;
}
current = current.Next;
currentNo++;
} while (current != null);
return -1;
}

public bool Remove(T obj)
{
var currentItem = Find(obj);
if (currentItem == -1)
return false;
if (currentItem == 0)
{
_top = _top.Next;
}
else
{
var previousNode = GetNodeAt(currentItem – 1);
previousNode.Next = previousNode.Next.Next;
if (previousNode.Next == null)
_last = previousNode;
}
_count–;
return true;
}

public bool RemoveAt(int position)
{
if (position < 0 || position >= _count)
throw (new IndexOutOfRangeException());
if (position == 0)
{
_top = _top.Next;
}
else
{
var previousNode = GetNodeAt(position – 1);
previousNode.Next = previousNode.Next.Next;
if (previousNode.Next == null)
_last = previousNode;
}
_count–;
return true;
}

public IEnumerable<T> GetItems()
{
if (Count == 0)
yield break;
_currentItem = _top;
while (_currentItem != null)
{
yield return _currentItem.Data;
_currentItem = _currentItem.Next;
}
}
}
[/sourcecode]

As you can see, there was not much trouble to change from a LinkedList that stores objects to a generic Linked List.

Our Linked List is ready to be used, and in the next article we will see another data structure, the Stack.

image

The source code for this series of articles is in https://github.com/bsonnino/DataStructures