
8. Troubleshooting

8.1 When compiling on a Linux system, I get errors involving bool. How do I fix these?

This is caused by conflicts between Linux headers and particular versions of gcc. To fix it, set the environment variable CXX to the name of a modern C++ compiler (e.g., GNU 2.95), or to "false" if you don't have one, then re-run ./configure, make, and make install in the top directory. If you do not have a modern C++ compiler, a Perl implementation of ASFinder is automatically used in place of the C++ implementation. However, the Perl implementation is much slower and is not suitable for realtime analysis.

8.2 The Perl applications complain about undefined symbols. How do I fix this?

An error message like this:

Can't load '/usr/local/Coral/lib/Coral.so' for module Coral: /usr/local/Coral/lib/Coral.so: Undefined symbol "__eprintf" at /usr/libdata/perl/5.00503/DynaLoader.pm line 169.

may be caused by using a different version of the C compiler or C libraries to compile Coral.so or ASFunc.so than was used to compile Perl. To fix it, edit perl.inc in the top directory to set the variable PERL_LD to the name of the C compiler that was used to compile Perl, and run make clean and make install in the top directory. If that compiler is no longer available, you may need to reinstall Perl or the C libraries.

8.3 Why do the cell timestamps jump backward in my POINT trace? Why do some applications report intervals with negative duration?

There appears to be a race condition in the card or driver between resetting the clock and beginning to capture data, which causes a single backward jump a few hundred microseconds into a trace. In applications that report intervals, this problem can show up as a negative duration in the first interval. It appears to affect both OC3 and OC12 POINT cards.

In CoralReef release 3.5, libcoral and most CoralReef applications will by default discard cells read before the clock reset, avoiding the problem. In earlier versions, you can use the application crl_time to find problems like this, and crl_cut to chop off the first few dozen bad cells.
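The idea of discarding cells read before the clock reset can be illustrated with a small sketch. This is not the libcoral, crl_time, or crl_cut implementation; the function name first_good_cell and the search_limit parameter are hypothetical, and the sketch assumes the timestamps have already been extracted from the trace as a list of seconds.

```python
def first_good_cell(timestamps, search_limit=100):
    """Return the index of the first cell after the clock reset.

    Looks for a backward timestamp jump among the first `search_limit`
    cells (a hypothetical bound) and returns the index of the cell that
    follows the last such jump. Returns 0 if no backward jump is found.
    """
    start = 0
    limit = min(len(timestamps), search_limit)
    for i in range(1, limit):
        if timestamps[i] < timestamps[i - 1]:
            # Timestamp went backward: everything before i predates the reset.
            start = i
    return start


# Illustrative data only: two cells captured before the clock reset.
cells = [0.000900, 0.000950, 0.000010, 0.000060, 0.000110]
good = cells[first_good_cell(cells):]
```

Slicing at the returned index keeps only cells whose timestamps increase monotonically from the start of the trace, so the first reported interval no longer has a negative duration.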

8.4 Why do the cell timestamps jump backward in my FATM trace?

There appears to be a race condition in the FATM card between incrementing the high order clock when the low order clock wraps, and capturing more cells. This occasionally causes the timestamps to jump backwards a little less than 0.00262144 seconds when the low clock wraps, and then forward again by a little more than 0.00262144 seconds when the high clock catches up. (The period of the low clock wrap is 0.00262144 seconds.)

By default, libcoral and most CoralReef applications will attempt to compensate for this problem by incrementing the high clock when the low clock wraps, and then ignoring the next high clock increment reported by the driver. However, crl_trace and crl_encode do not perform this correction, so trace files will contain raw uncorrected data. Most other applications perform the correction when they read the trace.
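The compensation described above can be sketched roughly as follows. This is an illustration of the idea only, not the libcoral code. It assumes each cell carries a raw (high, low) counter pair and that the low counter has 2^16 ticks per wrap; that tick count, the name LOW_TICKS, and the function correct_fatm_clock are assumptions made for this example.

```python
LOW_TICKS = 1 << 16  # assumed number of low-clock ticks per wrap


def correct_fatm_clock(raw):
    """Convert raw (high, low) pairs to monotonic tick counts.

    When the low clock wraps but the high clock has not yet advanced,
    add one to the high clock ourselves, then cancel the driver's next
    (late) high clock increment so it is not counted twice.
    """
    corrected = []
    offset = 0          # 1 while we are ahead of the driver's high clock
    prev = None
    for high, low in raw:
        if prev is not None:
            prev_high, prev_low = prev
            if high == prev_high and low < prev_low:
                # Low clock wrapped, high clock lagging: increment it here.
                offset += 1
            elif high == prev_high + 1 and low >= prev_low and offset > 0:
                # Driver caught up without a wrap: ignore its increment.
                offset -= 1
        corrected.append((high + offset) * LOW_TICKS + low)
        prev = (high, low)
    return corrected


# Illustrative data: the second cell is captured after the low clock
# wraps but before the high clock is incremented.
raw = [(5, 65000), (5, 100), (6, 200), (6, 65000), (7, 50)]
ticks = correct_fatm_clock(raw)
```

Without the correction, the second cell in this example would appear almost one full wrap period earlier than the first. With it, the tick counts increase monotonically.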

8.5 Why does the POINT card sometimes report -185273100 or -454761244 cells lost?

This was due to a bug in older versions of the CoralReef POINT driver. It was fixed in version 3.3.2.

8.6 Why does running some Coral applications appear to cause cell/packet loss when the applications are run for short durations?

Coral devices return their data to CoralReef in blocks containing a whole number of ATM cells (17545 cells per block for FATM/POINT cards, 16384 cells per block for DAG cards). Normally, they do not return any data until a block fills up. When capturing one cell per packet (as crl_flow and crl_rate do), this means 17545 or 16384 cells must go by before the Coral device will give them to CoralReef. When an interval ends in realtime, CoralReef cannot process the interval yet, because it must assume the card is still holding a block containing packets that belong to that interval. So it waits for the card to fill the block and hand it over. Only then can CoralReef look at the cell/packet timestamps and assign the delayed packets to the intervals to which they belong.

If the traffic rate is low, filling the block may take a long time, perhaps even multiple intervals. When the block finally fills, CoralReef reads it and processes data for packets belonging to all intervals contained in the block. For example, if you use 10s intervals and you see pauses of about 50s followed by output for 5 intervals all at once, it is because your traffic takes about 50s to fill a block (e.g. 16384 packets in about 50s, or about 328 packets per second).

The only time the card will return a block before it is filled is when CoralReef stops the card, which it does when the duration (-Cd) expires in realtime. So it will always be able to process data at the end of the duration, regardless of how full the last block is.

If you run applications like crl_flow repeatedly with a 10s duration instead of once with infinite duration and 10s intervals, you will get reports approximately every 10s. But since that requires stopping and restarting the card, you will also miss packets during the down time.
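The arithmetic behind the example can be checked with a short calculation. This is only a sketch of the reasoning in this answer; the function names are made up for illustration and are not part of CoralReef. It assumes one captured cell per packet and a constant traffic rate.

```python
FATM_POINT_BLOCK_CELLS = 17545  # block size for FATM/POINT cards
DAG_BLOCK_CELLS = 16384         # block size for DAG cards


def block_fill_seconds(block_cells, cells_per_second):
    """Seconds needed to fill one block at a constant cell rate."""
    if cells_per_second <= 0:
        raise ValueError("cells_per_second must be positive")
    return block_cells / cells_per_second


def intervals_per_block(block_cells, cells_per_second, interval_seconds):
    """Number of reporting intervals that elapse while one block fills."""
    return block_fill_seconds(block_cells, cells_per_second) / interval_seconds


# DAG card, 328 packets per second, 10s intervals (the example above).
fill = block_fill_seconds(DAG_BLOCK_CELLS, 328)
waiting = intervals_per_block(DAG_BLOCK_CELLS, 328, 10)
print(f"block fills in {fill:.1f}s, spanning {waiting:.1f} intervals")
```

At 328 packets per second a DAG block takes just under 50 seconds to fill, which is why output for roughly five 10s intervals appears all at once.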

8.7 Why do CoralReef applications crash the first time one is run on a DAG card?

This was due to an incompatibility between DAG drivers and CoralReef. We have submitted a patch to the DAG developers and expect it to be incorporated in DAG driver versions later than (but not including) 2.2.1d.

8.8 How do I fix this error on a DAG device: coral_open: /dev/dag0: ioctl FILLINFO: Inappropriate ioctl for device?

This error means CoralReef compatibility was not enabled when the DAG driver was built. To enable it, the DAG package must be configured with ./configure --with-coral=coral_dir.

8.9 Why does the report generator t2_report[++] report a large number of Autonomous Systems (ASes) as unknown?

ASFinder requires a BGP table to convert IP addresses to AS numbers, and this table must be kept current. See the answer to "How can I convert an IP address into an AS (Autonomous System) number?" and the documentation for ASFinder for more information.

8.10 Why doesn't my pcap filter expression match any packets on my VLAN?

Due to a bug in libpcap-0.6.2, you must specify the vlan iomode option on VLAN sources, or operations on layer 3 and above will not work. We hope this will be fixed in a future version of libpcap.

