Skip Sorted Set PA
Introduction
The objective of this assignment is to implement a contiguous implementation of a sorted set data structure. Section 1.2.4 in our textbook describes the SSet interface, which is very similar. Even though this set structure will store its members in sorted order, many of the operations will be asymptotically faster than their PA2 equivalents due to the use of the clever skip list structure.
Getting Started
First download and decompress one of the starter files:
Don't forget to download the appropriate Makefile for your platform from Piazza or the course website.
The Sorted Set ADT
For the purposes of this assignment, the sorted set interface (an abstract data type) should support the following operations:
- Adding and removing elements
- Testing for set membership
- Iteration over set members (in sorted order)
- Subset, superset, equality, and disjointedness testing
- Mathematical union, intersection, difference, and symmetric difference operations
In addition, you will need to implement standard data structure features:
- Allocation, clearing, and deallocation
- Size tracking
- Copying
For this assignment, your implementation of this set interface should store elements in a standard skip list as described in Chapter 4 of the textbook. Here is an example skip list containing seven integers:
Please refer to the textbook or lecture slides for details on the theory and implementation of skip lists.
Definitions
As in PA2, we provide a data_t
typedef in skip_set.h
:
typedef
int data_t;
We also need data type definitions for the skip set structure as well as for skip list nodes. These definitions are also in the provided skip_set.h
:
typedef
struct skip_node {
data_t data;
size_t height;
struct skip_node **next;
} skip_node_t;
typedef
struct skip_set {
skip_node_t *head;
size_t max_height;
size_t size;
} skip_set_t;
Finally, we will need an iterator type for our set. Iterators will allow us to traverse the set in a way that is independent of the underlying implementation. For the purposes of this assignment, an iterator will be a pointer to a skip list node; i.e., a skip_node_t*
value. As before, the definition is in skip_set.h
:
typedef skip_node_t* skip_set_iter_t;
IMPORTANT: This iterator definition is different that the one in PA2, where an interator was defined as a pointer to a data_t
value. Now it will be a pointer to a skip_node_t
struct. This will allow you to iterate over the skip list properly. If you wrote your Part 2 and Part 3 functions to use iterators, you should only need to make minimal modifications after you implement the iterator functions: change any iterator dereferences from
*it
(where it
is an interator) to
it->data
or
(*it).data
Of course, you will also need to change function calls as follows:
array_set_begin() => skip_set_begin()
array_set_end() => skip_set_end()
array_set_next() => skip_set_next()
WARNING: You should not modify skip_set.h
! Your code (in skip_set.c
) must compile against the original skip_set.h
without errors or warnings.
Pseudocode
The textbook's pseudocode uses a stack to help re-arrange the skip list on insertion. This works but is a bit overly complicated. Here is some simpler pseudocode for inserting in a skip list:
add(x):
if the set contains x, return
choose new random height new_height
if max_height < new_height:
expand the sentinel
w = allocate a new node of height new_height
u = sentinel
r = max_height - 1
while r >= 0 do
while u.next[r] != NULL and u.next[r].data < x do
u = u.next[r]
if r < new_height:
w.next[r] = u.next[r]
u.next[r] = w
r = r - 1
size = size + 1
Similarly, here is simplified pseudocode for searching in a skip list:
contains(x):
u = sentinel
r = max_height - 1
while r >= 0 do
while u.next[r] != NULL and u.next[r].data < x do
u = u.next[r]
r = r - 1
return u.next[r].data == x
Some of the iteration variables used above were chosen to match the textbook and aren't terribly descriptive, so here are some alternative names:
w new_node (skip_node_t*)
u cur_node (skip_node_t*)
r cur_height (int)
Part 1: Basic Set Operations
The following operations should all be O(1); i.e., their running time should not depend on the number of elements in the set.
-
void skip_set_init(skip_set_t *set)
Initializes a new empty set.
-
size_t skip_set_size(skip_set_t *set)
Returns the size of a set.
The following operations deal with adding, removing, and searching in the set, and they should all run in O(log n) time.
-
void skip_set_add(skip_set_t *set, data_t value)
Adds an element to a set. If the set already contains the given value, no changes take place.
IMPORTANT: The
skip_set_add
method is probably the most complex new method in PA3. Please study the pseudocode from the textbook carefully and consider how to port it to C code. You will need to perform all of the following tasks:-
Check to make sure the item does not already exist. (HINT: delegate this to "skip_set_contains")
-
Choose a random height for the new node. (HINT: repeatedly flip a coin using "(rand() % 2 == 0)")
-
Extend/replace the sentinel node if it is not already tall enough.
-
Insert the new node at all appropriate levels of the skip list.
-
-
void skip_set_remove(skip_set_t *set, data_t value)
Removes an element from a set. If the set does not contain the given value, no changes take place.
-
data_t skip_set_pop(skip_set_t *set)
Returns and removes an arbitrary element from a set. The behavior is undefined if the set contains zero elements.
-
bool skip_set_contains(skip_set_t *set, data_t value)
Returns true if the element is a member of the set; false otherwise.
The following operations clear all elements from the set. Because the skip list is a linked structure, these functions should run in O(n) time.
-
void skip_set_clear(skip_set_t *set)
Removes all elements from a set.
-
void skip_set_free(skip_set_t *set)
Deallocate all memory associated with a set.
Part 2: Iterators and Comparisons
The following operations handle iteration over a set. They should all be O(1).
-
skip_set_iter_t skip_set_begin(skip_set_t *set)
Returns an iterator for a set that represents the beginning of the set.
-
skip_set_iter_t skip_set_end(skip_set_t *set)
Returns an iterator for a set that represents the end of the set.
-
skip_set_iter_t skip_set_next(skip_set_iter_t iter)
Advances and returns a set iterator.
The following operations handle comparisons between two sets. Because the underlying storage structure for this assignment supports O(log n) lookups, these methods should run in O(n log n) time.
-
bool skip_set_is_subset(skip_set_t *set1, skip_set_t *set2)
Returns true if and only if
set1
is a subset ofset2
. -
bool skip_set_is_superset(skip_set_t *set1, skip_set_t *set2)
Returns true if and only if
set1
is a superset ofset2
. -
bool skip_set_is_equal(skip_set_t *set1, skip_set_t *set2)
Returns true if and only if
set1
andset2
contain the same items. -
bool skip_set_is_disjoint(skip_set_t *set1, skip_set_t *set2)
Returns true if and only if
set1
andset2
don't contain any of the same items.
Part 3: Mathematical Set Operations
The last few operations implement standard mathematical set operations. One thing that they all have in common is that they should populate a destination set with elements drawn from other sets. Thus, they all take a "dest
" pointer to a set. Each function should ensure that the destination set is empty by clearing it before performing whatever operations are appropriate for that function.
The first function creates a copy of a set, and should run in O(n log n) time.
-
void skip_set_copy(skip_set_t *dest, skip_set_t *src)
Make
dest
a copy ofsrc
; i.e.,dest
should contain all elements insrc
and no extra elements.
IMPORTANT: The sets should be independent after the copying! In other words, changes made to one of the sets after copying should not affect the other.
The following operations build new sets using two existing sets. Because the underlying storage structure for this assignment supports O(log n) lookups, these methods will require O(n log n) time.
-
void skip_set_union(skip_set_t *dest, skip_set_t *set1, skip_set_t *set2)
Calculates the union of
set1
andset2
, storing it indest
. -
void skip_set_intersect(skip_set_t *dest, skip_set_t *set1, skip_set_t *set2)
Calculates the intersection of
set1
andset2
, storing it indest
. -
void skip_set_diff(skip_set_t *dest, skip_set_t *set1, skip_set_t *set2)
Calculates the difference between
set1
andset2
, storing it indest
. -
void skip_set_sym_diff(skip_set_t *dest, skip_set_t *set1, skip_set_t *set2)
Calculates the symmetric difference between
set1
andset2
, storing it indest
.
Final Submission
DUE DATE: Friday, October 30, 2015 at 23:59 EDT (11:59pm)
Submit ONLY your completed skip_set.c
file on Canvas using the appropriate assignment submission. Make sure the file contains a comment field at the top with your name and a statement of compliance with the JMU honor code.
Please check your code carefully and make sure that it complies with the project specification. In addition, make sure that your code adheres to general good coding style guidelines. Fix any spelling mistakes, incorrect code formatting, and inconsistent whitespace before submitting. Make sure you include any appropriate documentation.