Hypercube Network Simulation
--------- ------- ----------
Preface
-------
Hypercube networks have been used to build high
performance parallel computers. They are closely
related to other kinds of networks, Butterfly, Benes,
and Fat Tree, that have also been so used.
In all such networks, nodes communicate through links
between the nodes. A node may be a computer with its
own memory. A link is just a wire that connects one
node to another and over which messages can be sent.
One problem that parallel computer networks must solve
is permuting data among the nodes, e.g. among the
computers. This means that each node has some data that
belongs in some other destination node, and each node is
the destination of only one such piece of data.
There are permutation patterns that can cause many
messages to pass through a single node, causing a
bottleneck, otherwise known as network congestion. It
is an interesting fact that if the permutation is
random, then for all practical purposes congestion
cannot happen. What do we mean by this? We mean that
the probability of congestion happening is so small
that it can be expected to happen only one in a
zillion times the age of the universe. So for
engineering purposes, it never happens.
This leads to the clever idea of decomposing a
permutation that might suffer from congestion into two
random permutations. Each node picks another node at
random, and sends the data first to that random node,
and then the random node forwards the data to its true
destination. For engineering purposes, the network will
never be congested, even if the permutation would have
congested the network were data sent directly to its
destination.
Problem
-------
To investigate these ideas future you have been asked
to build a simulation of a hypercube network. Here we
will give a precise description of this network.
The network contains N nodes, where N is a power of two,
so for some B, N = 2**B (2 to the B'th power). Each
node has a unique B bit unsigned integer address.
Each node is attached to B links, and these links are
numbered 0, 1, ..., B-1.
The bits of a node address are also numbered 0, 1, ...,
B-1, with the lowest order (units) bit being number 0,
and the highest order bit being number B-1.
Link j of node i is the same as link j of node
k = i ^ ( 1 << j),
in the notation of the C programming language, and
connects nodes i and k. The node address k is just the
node address i with the j'th bit complemented; and
conversely, i is just k with the j'th bit complemented.
In the C programming language ^ denotes `exclusive OR'
and `1 << j' = 2**j as `<<' denotes left shift. Sending
a message through link j complements the j'th bit of the
address of the node the message was at.
Thus one path through a 4 node hypercube would be:
3 -------> 2 -------> 0
link 0 link 1
where the 3 --> 2 part of the path is accomplished by
sending the message through link 0 of node 3, and the 2
--> 0 part by sending through link 1 of node 2. Writing
the node numbers in binary, it is a bit easier to see
what is going on, and the above path would be:
11 -------> 10 -------> 00
link 0 link 1
Messages in the hypercube network (at least in the one
we are simulating) are always sent so the link numbers
of the message path increase as time increases. Or in
other words, the lowest order wrong bit in the current
message location is fixed first.
Thus if we use binary addresses we have the path:
010011 -------> 010010 -------> 010110 -------> 110110
link 0 link 2 link 5
Each link can send messages in either direction.
The network operates in time cycles. During each cycle,
a message, but at most one, can be sent in a particular
direction on each link. Each end of each link has a
send queue of messages waiting to be sent on the link,
and each end also has a receive buffer for a single
message, the last message received on the link.
The network is initialized by each node's being given
exactly one message to be sent to some node, possibly to
itself. This is because we are simulating the permuta-
tion problem described above. If the message is to be
sent to another node, the message is put on one of the
sender's link send queues; otherwise the message has
already arrived at its destination and is discarded.
Then during each cycle the simulator performs the
following two steps in order.
1) Removes from each non-empty send queue in each
network node the queue's first message, and
copies the message to the receive buffer on
the other end of the queue's link.
2) For each node i, looking at the receive buffers of
the links attached to node i in order of their
link number (0, 1, ...), removes from each
non-empty receive buffer its message, and if the
message does not have node i as its final
destination, puts the message on the end of the
send queue of the link that will correct the low-
est order address bit that needs to be corrected
in the message. I.e., if d is the final destina-
tion, i^d is the exclusive OR of i and d, and b is
the bit number of the lowest order on bit in i^d,
puts the message on the end of the send queue for
link b. If the message has node i as its final
destination, the message is discarded.
A network simulation run is done when all send queues
are empty after a cycle (or before the first cycle).
Input
-----
The input consists of input for any number of runs. For
each run, the input consists of a command letter, the
number of bits in a node address (B above), and N
destinations for the messages sent by each of the N =
2**B nodes. The destinations are listed in order of the
node number of the sending node.
Thus the input
r 3
1 0 5 4 3 7 2 6
has the command letter `r', B = 3, N = 2**3 = 8, and
sends messages
from node 0 to node 1
from node 1 to node 0
from node 2 to node 5
from node 3 to node 4
from node 4 to node 3
from node 5 to node 7
from node 6 to node 2
from node 7 to node 6
For each run input, a simulation is run, and the output
is controlled by the command letter.
The command letter and integers in the input are separa-
ted by whitespace, which may consist of any combination
of single space and newline characters. Newlines may
appear anywhere between command letters and integers. B
must be at least 1 and at most 10. The input ends with
an end of file.
Output
------
The output for the possible command letters is
r At the end of the run, the following line is
printed:
RUN #: # cycles, # sends, # max queue length.
Here each # denotes an unsigned integer. Runs
are numbered 1, 2, 3, .... The total number of
messages sent across all links is printed. The
maximum length of any send queue at the end of
any cycle is printed.
q Just like `r' but also prints the following just
before the first cycle and just after each
cycle. First the line:
RUN # CYCLE # QUEUE LENGTH
is printed. Cycles are numbered 1, 2, ...; and
the cycle number 0 is printed for the printout
that occurs before the first cycle.
Then for each node a line is printed consisting
of B integers each in 4 columns. The first line
is for node 0, the last for node N-1. The first
integer in a line is for link 0, the last for
link B-1. The integer for link j and node i is
the length of the send queue for link j in node
i.
Example Input
------- -----
q 2
0 2 1 3
q 2
0 1 2 3
r 3
1 0 3 2 5 4 7 6
r 3
7 6 5 4 3 2 1 0
r 3
0 4 2 6 1 5 3 7
r 4
0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15
Example Output
------- ------
RUN 1 CYCLE 0 QUEUE LENGTHS:
0 0
1 0
1 0
0 0
RUN 1 CYCLE 1 QUEUE LENGTHS:
0 1
0 0
0 0
0 1
RUN 1 CYCLE 2 QUEUE LENGTHS:
0 0
0 0
0 0
0 0
RUN 1: 2 cycles, 4 sends, 1 max queue length.
RUN 2 CYCLE 0 QUEUE LENGTHS:
0 0
0 0
0 0
0 0
RUN 2: 0 cycles, 0 sends, 0 max queue length.
RUN 3: 1 cycles, 8 sends, 1 max queue length.
RUN 4: 3 cycles, 24 sends, 1 max queue length.
RUN 5: 2 cycles, 8 sends, 1 max queue length.
RUN 6: 4 cycles, 32 sends, 1 max queue length.
File: hypercube.txt
Author: Bob Walton
Date: Sat Aug 31 11:55:40 EDT 2002
The authors have placed this file in the public domain;
they make no warranty and accept no liability for this
file.
RCS Info (may not be true date or author):
$Author: hc3 $
$Date: 2002/08/31 15:59:21 $
$RCSfile: hypercube.txt,v $
$Revision: 1.11 $