Frequently Asked Questions
Hardware:
- How does ATOLL differ from other SANs?
- How much bandwidth do I get?
- What about latency?
- What sort of cable comes with ATOLL?
- Does the ATOLL card also work in 32-bit PCI
slots?
Software:
- Does ATOLL support Shared Memory
applications?
- Can ATOLL speed up applications using
TCP/IP-based communication?
Business:
- Why is ATOLL a lot cheaper than other
solutions?
Other:
- What can be done in the case of a node
failure?
- How can the routing of the
ATOLL network be changed if a link fails?
- How is congestion resolved
in the ATOLL network?
- How is a deadlock detected and,
if there is one, how can it be resolved?
- How does ATOLL differ from other SANs?
- The unique property of the ATOLL architecture is the
replication of single network components and their tight integration in
one single ASIC. ATOLL offers 4 NIs, an on-chip 8x8 crossbar and 4 link
interfaces. What's the point? Well, you don't have to spend money on
external switching hardware, and your applications can make use of 4
independent NIs on each node, an ideal base for clustering SMPs.
- How much bandwidth do I get?
- Each of the 4 bidirectional, byte-wide links of ATOLL runs
at 250 MHz. When operating in a 64-bit/66 MHz PCI slot, ATOLL is capable
of delivering up to 90% of the peak bandwidth to user applications,
resulting in impressive bandwidths of 200 Mbyte/s and more. With
multiple NIs in use (on dual-CPU nodes, each CPU can run a
parallel process with its own NI), the ATOLL card is able to fully
saturate a high-end PCI bus. This results in accumulated data rates of
400 Mbyte/s and more.
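The figures above can be checked with a little arithmetic. The sketch below assumes that a byte-wide link moves one byte per clock cycle in each direction, which the text implies but does not state outright. The function names are illustrative only.

```python
# Back-of-the-envelope bandwidth figures, using only the numbers quoted above.
# Assumption: one byte is transferred per link clock cycle and direction.
LINK_WIDTH_BYTES = 1     # byte-wide link
LINK_CLOCK_MHZ = 250     # link clock
USER_EFFICIENCY = 0.90   # "up to 90% of the peak bandwidth"
PCI_WIDTH_BYTES = 8      # 64-bit PCI
PCI_CLOCK_MHZ = 66       # 66 MHz PCI


def link_peak_mbyte_s() -> float:
    """Peak rate of one link in one direction, in Mbyte/s."""
    return LINK_WIDTH_BYTES * LINK_CLOCK_MHZ


def link_user_mbyte_s() -> float:
    """Best-case rate seen by a user application on one link, in Mbyte/s."""
    return link_peak_mbyte_s() * USER_EFFICIENCY


def pci_peak_mbyte_s() -> float:
    """Theoretical peak of a 64-bit/66 MHz PCI bus, in Mbyte/s."""
    return PCI_WIDTH_BYTES * PCI_CLOCK_MHZ
```

Under these assumptions one link peaks at 250 Mbyte/s, of which up to 225 Mbyte/s reaches the application, and the PCI bus peaks at 528 Mbyte/s. Two NIs at full rate already come close to what a real PCI bus sustains, which fits the quoted accumulated rate of 400 Mbyte/s and more.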
- What about latency?
- All message transfer is done in user-level mode without any
OS intervention. One-way latency to send a few bytes starts at about 4
µs, as system-level simulations indicate.
- What sort of cable comes with ATOLL?
- A 5 m ATOLL link cable consists of 40 LVDS signal-wires.
Standard SCSI-3 connectors are used, with 4 connectors per board
(double-stacked).
- Does the ATOLL card also work in 32-bit PCI slots?
- No. The ATOLL PCI card requires a 3.3V environment. Usually,
32-bit PCI interfaces do not support 3.3V signalling.
- Does ATOLL support Shared Memory applications?
- ATOLL is a Message Passing network. We do not support
remote memory operations with the first version, but this might change
with the next implementation. So if you would like to use the ATOLL
network together with Shared Memory applications, you will need a
software VSM implementation.
- Can ATOLL speed up applications using TCP/IP-based communication?
- The main overhead of TCP/IP-based communication is a
consequence of the OS intervention and costly kernel traps. Although
ATOLL makes the data transfer itself faster, the OS context switch will
still dominate the latency of this sort of communication, eliminating
most of the performance gain.
A solution to this problem might be a standardized user-level
communication API like VIA. We plan to implement VIA on top of the
ATOLL API.
- Why is ATOLL a lot cheaper than other solutions?
- Because we have taken special care to eliminate most of the
cost factors of today's solutions. ATOLL has no expensive on-board SRAM
(host main memory is a lot cheaper). And the whole functionality (PCI
interface, NIs, switch, links) is integrated into a single ASIC,
avoiding costly multi-chip implementations. Therefore, ATOLL can be
offered at an excellent price/performance level.
- What can be done in the case of a node failure?
- If a node fails (there are many different cases of this
failure type; here we discuss a complete failure with the power supply
off), the directly connected nodes can get an interrupt (if enabled),
because the ATOLL link cable provides a signal which says "the cable is
disconnected or the power supply of the connected node failed". The
interrupt routine can then immediately mask this link as not available
in the mask register of the crossbar (see the ATOLL Hardware Reference
Manual for register specification and function). Once this mask bit is
set, no more messages are allowed to request the crossbar for this
outgoing link.
After this fault detection and first reaction, many further strategies
are possible but are not yet implemented. Most of the basic mechanisms
have been simulated and checked for the required HW function, but there
is no support for these recovery functions in the ATOLL API or in the
ATOLL daemon software. The concepts for the fault-tolerant cluster will
be worked out in detail and implemented in a follow-up project within
the next two years.
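The first reaction described above, masking the failed link from the interrupt routine, can be pictured with a small software model. This is only a sketch: the class, the method names and the bit layout (one bit per outgoing link, bit set meaning "masked") are assumptions for illustration. The real register layout is defined in the ATOLL Hardware Reference Manual.

```python
class CrossbarModel:
    """Toy model of the crossbar link mask register (hypothetical layout)."""

    NUM_LINKS = 4  # ATOLL offers 4 link interfaces

    def __init__(self) -> None:
        # Assumed encoding: bit i set means outgoing link i is masked.
        self.link_mask = 0

    def on_link_down_interrupt(self, link: int) -> None:
        """Interrupt routine: mark the failed outgoing link as unavailable."""
        if not 0 <= link < self.NUM_LINKS:
            raise ValueError(f"no such link: {link}")
        self.link_mask |= 1 << link

    def may_request(self, link: int) -> bool:
        """True if a message may still request this outgoing link."""
        return ((self.link_mask >> link) & 1) == 0
```

After `on_link_down_interrupt(2)`, `may_request(2)` returns False while the other three links stay usable.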
- How can the routing of the ATOLL network be changed if a link fails?
- A fast local reaction is possible at the node which
detects the failure of the link. It can mask the crossbar port
immediately and will then get an interrupt for every message requesting
this link. The node can change the route by inserting additional
routing bytes in front of the message in order to define a partial new
route avoiding the failed link. This is very simple if the network is
a 2D grid or torus, as proposed by the User Manual.
The fault-detecting node can also send a message to the master node
(this is the task of the fault-tolerance daemon, not yet available) to
inform it about this fault and to request the master to change the
routing tables of all nodes using this link.
There may also be alternative paths in the local routing tables of all
nodes, which can be selected faster.
- How is congestion resolved in the ATOLL network?
- The normal case of congestion is the occupancy of a link
(with the corresponding output port of the crossbar) by a fairly long
message. The length of an ATOLL message has no hardware restriction,
but messages may be cut into smaller chunks at the API level. As long as
this long message flows through the crossbar and the link, no other
message to this port can proceed. It is automatically stopped by the
backpressure flow control when the input buffer fills up. The
congestion is resolved by waiting for the long message to release
the resource. After rearbitration the waiting message can proceed.
It is important to state here that receiving a message at the receiver
node must have priority over injecting new messages into the network. If
this can be guaranteed at the API level, all congestion will be resolved
by the built-in network mechanisms.
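The two rules in this answer, backpressure when the input buffer is full and receive priority over injection, can be illustrated with a toy model. It is a sketch under simplifying assumptions (whole messages instead of bytes, a single host port). None of the names come from the ATOLL API.

```python
from collections import deque


class HostPort:
    """Toy host port with a bounded receive buffer and a send queue."""

    def __init__(self, rx_capacity: int) -> None:
        self.rx_capacity = rx_capacity
        self.rx = deque()  # messages delivered by the network
        self.tx = deque()  # messages the host wants to inject

    def offer(self, msg) -> bool:
        """Network side: deliver a message; False means backpressure."""
        if len(self.rx) >= self.rx_capacity:
            return False
        self.rx.append(msg)
        return True

    def service(self):
        """Host side: one step, draining receives before injecting sends."""
        if self.rx:
            return ("recv", self.rx.popleft())
        if self.tx:
            return ("send", self.tx.popleft())
        return None
```

Because `service` always empties the receive buffer first, a sender blocked by backpressure is released as soon as the host runs, even when the host has messages of its own waiting to be injected.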
- How is a deadlock detected and, if there is one, how can it be resolved?
- First it should be stated that ATOLL can use a routing
strategy which is deadlock-free (dimension-order routing). If other
routing schemes are used, deadlocks may be possible.
ATOLL provides a special HW feature to detect deadlocks. We detect the
case that one message is using an output port of the crossbar but
makes no further progress while another message is waiting for the same
port to get arbitrated. After a specific (programmable) time has
elapsed, an interrupt is generated to signal the possibility of a
deadlock. The daemon should then communicate with its neighbours to
find the cycle of the deadlock. It helps a lot that the nodes with the
earliest timeout interrupt are the ones involved in the cycle.
After the cycle has been identified, one of the messages involved in the
cycle can be removed from the network at the blocked port by using the
debug mode to redirect the message. For efficiency, an unused
host port of this node can be used to receive the message. After the
deadlock has been resolved, this message can be injected back
into the network with the remaining routing path. Only a small number of
actions must be taken in debug mode (slow); the main part of receiving
and injecting is done in the normal operation mode (fast).
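The detection condition described above can be written down as a small per-port watchdog. This is a software sketch of the idea, not the hardware implementation: the class name, the cycle-based timer and the choice to keep signalling while the stall persists are all assumptions.

```python
class PortWatchdog:
    """Toy model of the deadlock-suspicion timer for one output port."""

    def __init__(self, timeout_cycles: int) -> None:
        self.timeout = timeout_cycles  # programmable threshold (assumed unit: cycles)
        self.stalled = 0               # consecutive cycles matching the condition

    def tick(self, port_owned: bool, owner_progressed: bool,
             waiter_present: bool) -> bool:
        """Advance one cycle; return True if an interrupt would be raised.

        The suspicious case is: a message owns the port, it made no
        progress this cycle, and another message waits for the same port.
        """
        if port_owned and waiter_present and not owner_progressed:
            self.stalled += 1
        else:
            self.stalled = 0
        return self.stalled >= self.timeout
```

With a timeout of 3 cycles, the third consecutive stalled cycle reports a possible deadlock, and any progress by the owning message resets the counter. A long message that is merely slow therefore does not trigger the interrupt as long as it keeps moving.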