Overview

Instructor: Partha Biswas (partha DOT biswas AT tufts.edu)

Teaching assistant: Michael Shah (Michael DOT Shah AT tufts.edu)

Description and Objective: This course is intended as an introduction to data structures, algorithms, and more advanced programming techniques. Students will be able to solve real-world problems by reasoning about data structure choices, choose appropriate implementations, and analyze the costs associated with those choices. Students will learn to write, debug, and test large programs systematically. We hope to achieve these goal by presenting higher level concepts in lecture and hands-on computer practices in the lab. The programming assignments will be in C++.

Topics Covered: The major topics within the course include: Abstraction, Problems Solving, Software Design, Sequences, Sets, Finite Maps, Linked Lists, Templates, Stacks, Queues, Trees, Heaps, Sorting Algorithms, Graphs, and Hashing, with exposure to complexity and algorithm analysis.

Prerequisites: COMP 11 or equivalent. Knowledge of C++. Students who have had only Java or Python programming experience should contact the professor.

Examples: All code samples shown in the class are posted here.

 

Posted in Information | Leave a comment

Final review

Asymptotic complexity

Upper bound function

Lower bound function

Order function

Solving recurrence equations

Sorting

Insertion sort – algorithm, complexity and benefits

Selection sort – algorithm, complexity and benefits

Quicksort – algorithm, complexity and benefits

Mergesort – algorithm complexity and benefits

Splay trees

Splay operations

  • Zig-zag
  • Zig-zig
  • Zig

Complexity of operations in splay trees

Algorithms for find(key), min(), max(), remove(key)

Heaps

Relationships between priority queues, selection sort, tournament sort and the heap data structure.

Operations: min(), insert(key), removeMin(), bottomupHeap(), heapsort()

In-place heap sort

Graphs

Representations

Graph traversals (DFS, BFS)

Hashing

Hashcoding and compression function

Examples of bad and good hash-coding

Examples of bad and good compression function

Collision, load-factor

C++ templates and STL

Converting existing programs into parameterized programs

What STL to use when?

Problems

  1. What are the minimum and maximum number of elements in a heap of height h?
  2. Given a min-heap, how would you find the max element? What is the complexity of such an algorithm?
  3. Insert the following numbers into a heap for an in-place sort in descending order: 3, 5, 2, 9, 20
  4. Algorithms for inserting and deleting an entry in a heap?
  5. Algorithms for inserting, deleting and finding a key in a splay tree?
  6. Discuss the sorting algorithms covered in the class.
  7. Give an efficient algorithm to find the first non-repeated character in a string.
  8. Give an efficient algorithm to find the first repeated character in a string.
  9. Given a directed graph, give an algorithm to detect whether there is a cycle.
  10. Given an undirected graph, give an algorithm to detect whether there is a cycle.
  11. Given a directed acyclic graph, give an algorithm to topologically sort the nodes.

 

Posted in Data-structures | Comments Off

Mergesort and Quicksort

Mergesort (not an in-place sort)

mergesort(a, i, j)
{
      if (i >= j) return;
      mid = (i + j)/2;
      mergesort(a, i, mid);
      mergesort(a, mid + 1, j);
      merge(a, i, j);
}

merge(a, i, j)
{
      start = i;
      mid = (i + j)/2;
      k = mid + 1;
      l = i;
      while (i <= mid && k <= j) {
            if (a[i] <= a[k]) {
                  b[l++] = a[i++];
            } else {
                  b[l++] = a[k++];
            }
      }
      if (i > mid) {
          for (; k <= j;) b[l++] = a[k++];
      } else {
          // k > j
          for (; i <= mid;) b[l++] = a[i++];
      }
     // copy back b to a
     for (l = start; l <= j; l++) {
          a[l] = b[l];
     }
}

 

Quicksort (an in-place sort)

quicksort(a, i, j)
{
      if (i >= j) return;
      pindex = random number between i and j;
      pivot = a[pindex];
      // swap pivot and a[j]
      a[pindex] = a[j];
      a[j] = p;
      m = i – 1; // m starts from the left
      n = j; // n starts from the right
      do {
            do { m++; } while (a[m] < pivot);
            do { n--; } while (a[n] > pivot && n > i);
            if (m < n) {
                  // swap a[m] and a[n];
            }
      } while (m < n); // till m and n cross each other
      // m >= n and therefore a[m] must be greater than pivot.
      // hence it is fine to now swap with the pivot in a[j]
      a[j] = a[m];
      // put the pivot in the middle
      a[m] = pivot;
      quicksort(a, i, m-1);
      quicksort(a, m+1, j);
}

 

Posted in Data-structures | Comments Off

Trees and Graphs

Here are definitions of some of the common terms you would often encounter in the context of trees and graphs.

Trees

Tree: A set of nodes and edges connecting the nodes such that there is exactly one path between any 2 nodes.

Rooted tree: A tree where a distinguished node is identified as a root. The nodes share a parent-child relationship. Every node except the root has exactly one parent. The root node does not have any parent.

Path: Connected sequence of edges.

Leaf: A node with no children.

Siblings: Nodes with the same parent.

Ancestor of a node n: Nodes on the path from n to the root (including n itself and the root).

Length of a path: Number of edges on the path.

Depth of a node: Length of path from the node to the root. Depth of root = 0.

Height of a node: Length of path from the node to its deepest descendent. height of a leaf node = 0. Height of tree = Height of root.

Subtree rooted at a node: Tree formed by the node and its descendents.

Binary tree: A tree with no node having more than 2 children.

Graphs

Graph: A set of nodes (or vertices) and edges connecting the nodes. A tree is a particular kind of graph.

Graph representation: G = (V,E), an ordered pair, where V and E are a set of nodes and edges respectively.

Digraph (Directed Graph): A graph where every edge is directed. An edge, e = (u, v) is an ordered pair of nodes, where u and v are referred to as source/origin and sink/destination respectively.

Undirected graph: A graph where an edge e = (u,v) is an unordered pair, i.e., e = (u, v) = (v, u).

Path: A sequence of nodes with an adjacent pair connected by an edge. In a digraph, the edges on the path must be aligned with the direction of the path.

Length of path: Number of edges on the path.

Strongly connected graph: A graph where there is a path from every node to every other node. For an undirected graph, a strongly connected graph is simply called a connected graph.

Degree of a node: Number of edges incident on the node. For a directed graph, indegree of a node refers to the number of edges directed toward the node, and outdegree of a node refers to the number of edges directed away from the node.

Posted in Data-structures | Comments Off

Midterm 1 Solution

Mike’s solutions to midterm 1 attached below.
Please contact Mike if you have questions about the midterm

Average: 88%

Everyone did very well, good job!

Midterm_1_solution

Posted in Exam | Tagged | Comments Off

Midterm review

Linked Lists:

1. Write an algorithm to detect a cycle in a linked list.

2. Remove consecutive duplicate entries from a given linked list.

3. Get the k-th to the last element in the linked list (when the size of the list is not known/tracked).

4. Check if the linked list is a palindrome.

5. Reverse a singly-linked list non-recursively.

Stacks/Queues:

1. Design a stack such that getMinimum() is O(1).

2. Reverse the elements of stack only using stack operations.

Recursion:

1. Write a recursive algorithm to perform x^n.

2. Write a recursive algorithm for binary search.

Complexity analysis:

1. Compare merge sort algorithms with various partitioning options.

2. Write down the lower bound, upper bound and order functions for f(n) = 0.5 n^3 + 2 n^2 + log n

Binary Search Trees:

1. Given an inorder traversal and a postorder traversal, write an algorithm to reconstruct the tree.

2. Find the nearest match of a given value in a binary search tree.

3. Write a function to check whether two BSTs are structurally identical.

4. Write a function to find the size of a BST.

5. Write a function to compute the sum of all elements in a BST.

 

 

 

Posted in Review | Comments Off

Growth of functions

In algorithmic analysis, we learn to characterize the complexity in terms of an upper bound (Big-oh), a lower bound (Big-omega) and an order function (Big-theta). Here are the definitions:

1. Upper bound function (i.e., T(n) is upper-bounded by f(n)): We say,

T(n) = O(f(n)) iff

there exist constants c, N > 0, such that T(n) ≤ c f(n) for all n > N.

2. Lower bound function (i.e., T(n) is lower-bounded by f(n)): We say,

T(n) = Ω(f(n)) iff

there exist constants c, N > 0, such that T(n) ≥ c f(n) for all n > N.

3. Order function (i.e., T(n) is of the order of f(n)): We say,

T(n) = Θ(f(n)) iff

there exist constants c1, c2, N > 0, such that c1 f(n)T(n) ≤ c2 f(n) for all n > N.

A deep understanding of the asymptotic complexities involves getting a good sense of how different growth functions behave as n approaches infinity. In this article, we present a  visualization of the growth of commonly occurring functions.

Here is a summary of the common functions in the increasing order of complexity:

T(n) Nomenclature
O(1) Constant
O(log n) Logarithmic
O(n) Linear
O(n log n) n-log-n
O(n2) Quadratic
O(n3) Cubic
O(2n) Exponential

Comparing the growth of polynomial functions:

Note that the higher order polynomial functions grow much slower in the beginning till (n = 1).

comparePolynomials

Comparing the growth of linear, quadratic and cubic:

Note here that for n > 1, n = O(n2) and n2 = O(n3) (in other words, n is upper-bounded by n2 and n2 is upper-bounded by n3).

Since lower-bound function is a converse of upper-bound function, we can also say, for n > 1, n2 = Ω(n) and n3 = Ω(n2) (in other words, n2 is lower-bounded by n and n3 is lower-bounded by n2).

compareLinearQuadraticCubic

Comparing the growth of logarithmic with linear:

Here, we can say that for any n > 1, log(n) = O(n) (or, log(n) is upper-bounded by n).

compareLogLinear

Comparing the growth of n log(n) and n2:

Here, we can say that for any n > 1, n log(n) = O(n2) (or, n log(n) is upper-bounded by n2).

compareNlogNQuadratic

Comparing the growth of n2 and 2n:

Here, we can say that for any n > 1, n2 = O(2n) (or, n2 is upper-bounded by 2n).

compareQuadraticExponential

 

Posted in Data-structures | Comments Off

Debugging with Emacs

Compile (say, files f1.cpp and f2.cpp) with the following command from Emacs:

M-x compile <return>

Compile command: g++ -o myprogram -g f1.cpp f2.cpp

The presence of ‘-g’ flag tells the compiler to generate the debug symbols in the generated executable called “myprogram“.

Now, invoke debugger from Emacs using the following command:

M-x gdb <return>

Run gdb (like this): gdb myprogram

[For Emacs running in windows mode, try using "--annotate=3" to set the annotate level to 3]

This brings you to gdb prompt, where you can run the program by typing ‘r’ as follows:

gdb> r

If you intend to debug your program, you will likely have a particular place to stop and inspect the variables under question. In that case, setting a breakpoint might be a good place to start before running the program.

Here is a table of gdb commands that you could run from Emacs (after generating the gdb prompt):

 

C-x <space> [Emacs Command] Go to any line in your source code and invoke this command to set a breakpoint. (This is the most convenient way to set a breakpoint from Emacs.)
gdb> b main Set a breakpoint in the beginning of main().
gdb> r Run till a breakpoint or run to completion.
gdb> c Continue from the current breakpoint.
gdb> s Step into a function.
gdb> n Step over the current line. When you hit “return”, the gdb executes the previous command, i.e., you keep executing the program line by line.
gdb> where Inspect the call stack.
gdb> p <varname> Prints the content of a variable called <varname>.
gdb> fin Step out of the current function.
gdb> i b List all the breakpoints.
gdb> del <bpid> Remove breakpoint number <bpid>.
gdb> dis <bpid> Disable breakpoint number <bpid>.
gdb> ena <bpid> Enable breakpoint number <bpid>. (You can toggle enabling and disabling the breakpoints using ena and dis commands.)
gdb> b <bpid> <condition> Set conditional breakpoint for breakpoint number <bpid>. (A condition could be any logical expression like ‘(<varname> == 3)’.)
gdb> comm <bpid> Execute a set of commands every time the execution halts at the breakpoint number <bpid>.

 

Posted in Emacs | Comments Off

Interesting linked list problems

1. Write an algorithm to detect a cycle in a linked list.

2. Remove consecutive duplicate entries from a given linked list.

3. Get the k-th to the last element in the linked list (when the size of the list is not known/tracked).

4. Check if the linked list is a palindrome.

5. Reverse a singly-linked list non-recursively.

Posted in Data-structures, Review | Comments Off