The Auckland-II data set is a collection of long GPS-synchronized IP header traces captured with a DAG2 system at the University of Auckland Internet uplink by the WAND research group since November 1999.
| Trace | Run length | Size (gzip) | Download via HTTP (one file per direction) | Download via FTP (one file per direction) | MD5 checksums (one per direction) |
|---|---|---|---|---|---|
| 19991129-134258 | 38:29:08 | 1562MB | 134258-0 134258-1 | 134258-0 134258-1 | 1ecbc00e24f3ee416ec715d4f3b3bfa0 5a941b279b833e17ca4535ad9828e7a7 |
| 19991201-192548 | 24:02:58 | 877MB | 192548-0 192548-1 | 192548-0 192548-1 | 07879459effc6d02860d29b8aaca96dc 9a6731f2f39093efeb5cc09c55b30de6 |
| 19991207-125019 | 13:36:08 | 696MB | 125019-0 125019-1 | 125019-0 125019-1 | 75c56386fc7289b7557926589df938d8 475db0ac9cc5d46498fd8e5260c24326 |
| 19991208-125838 | 02:18:59 | 502MB | 125838-0 125838-1 | 125838-0 125838-1 | 4e2dd7cc09caa3f542a947404976412d f2c3e37c1c0b6fabcb8a1fbd60ccfee8 |
| 19991209-151701 | 24:02:10 | 927MB | 151701-0 151701-1 | 151701-0 151701-1 | 9237a7bcb22c2cb2bf3cb7fd7ac55c12 c56f491d1f070154de67497af9951b00 |
| 20000107-093027 | 21:50:41 | 742MB | 093027-0 093027-1 | 093027-0 093027-1 | 48309ba71faaa911caed6003dd73c4ab dc6c92af84142c07432833a727d4a6df |
| 20000112-111915 | 11:16:33 | 626MB | 111915-0 111915-1 | 111915-0 111915-1 | 68d0d0866d24872c1ace28b2c94357c5 779efdae9794e3c031c61c60c9be1267 |
| 20000114-125102 | 24:03:21 | 648MB | 125102-0 125102-1 | 125102-0 125102-1 | 80d8a2b5d78d5f5dd72a76f05909cabc 2bf9945b2c4362cb5e545e98bd4fce0e |
| 20000117-095016 | 21:56:58 | 832MB | 095016-0 095016-1 | 095016-0 095016-1 | 5e04eacb3af0dd491605bd8efb52d762 b63aa4d30c192758c4d99f047d38c2a6 |
| 20000125-143640 | 03:11:28 | 295MB | 143640-0 143640-1 | 143640-0 143640-1 | 3d184b13e1d70a5a9fe0ce1a912b9653 c619f3e263e4be373b1690fb48854629 |
| 20000126-205741 | 19:36:57 | 793MB | 205741-0 205741-1 | 205741-0 205741-1 | 0401bcb243e9d976fd8c08b612226c76 65577e3dd0892bba49f7342fc5f1dd40 |
| 20000128-160441 | 24:01:22 | 570MB | 160441-0 160441-1 | 160441-0 160441-1 | 2bb262052887d966556502eb7de73321 370495c09aa6424814f5a86275247a4b |
When downloading the traces, please read the disclaimers. When using these traces for research and publications, please credit NLANR MOAT and the WAND research group for the Auckland-II data set. The WAND group wishes to thank Nevil Brownlee and the ITSS operators group for their continuing support of the measurement point.
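To check a download against the MD5 checksums listed in the table, something along these lines should do. This is a minimal sketch: the function names and the file name in the usage note are our own illustrations, and we assume the published checksum covers the file exactly as downloaded (that is, the gzip file rather than its uncompressed contents).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """True if the file's MD5 matches the published checksum (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("19991129-134258-0.gz", "1ecbc00e24f3ee416ec715d4f3b3bfa0")` would check the first direction of the first trace, assuming that is the name under which you saved the file.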
The traces are in DAG format, which currently is a fixed 64-byte record format that includes 40 bytes of IP header (usually covering most, if not all, of the TCP/IP and UDP/IP headers). To get you started, a simple ASCII converter is available in d3h2asc.pl, although its timestamp output is inaccurate.
For more sophisticated analysis we recommend using the dagtools package available from the DAG software website. The tool dagdump is a simple converter into ASCII and serves as a good starting point when you wish to import DAG traces into a toolset of your own. Also, CAIDA's CoralReef package should have support for native DAG traces shortly.
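If you prefer to start from scratch, the fixed record size makes the traces easy to walk through. The sketch below reads 64-byte records from a gzip file and decodes the fixed part of an IPv4 header. Only the 64-byte record size and the 40 bytes of IP header come from the description above; the position of the IP header inside the record (`IP_OFFSET`) is an assumption made purely for illustration, as are the function names. Please check the DAG documentation or the dagtools sources for the actual layout.

```python
import gzip
import socket
import struct

RECORD_SIZE = 64       # from the text: fixed 64-byte records
IP_HEADER_BYTES = 40   # from the text: 40 bytes of IP header per record
# ASSUMPTION (not from the documentation): the IP bytes end the record.
IP_OFFSET = RECORD_SIZE - IP_HEADER_BYTES


def read_records(path):
    """Yield successive 64-byte records from a gzip-compressed trace file."""
    with gzip.open(path, "rb") as trace:
        while True:
            record = trace.read(RECORD_SIZE)
            if len(record) < RECORD_SIZE:
                break  # end of file; a trailing partial record is ignored
            yield record


def parse_ipv4(record, ip_offset=IP_OFFSET):
    """Decode the fixed 20-byte part of the IPv4 header found at ip_offset."""
    fields = struct.unpack_from("!BBHHHBBH4s4s", record, ip_offset)
    version_ihl, _tos, total_length, _ident, _frag, ttl, protocol, _cksum, src, dst = fields
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

Passing a different `ip_offset` is all that is needed once the true layout is known.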
The University of Auckland ITSS department operates an OC3c ATM link via Clear Communications (a major New Zealand ISP and competitor to Telecom NZ), which carries a variety of services off the main campus.
One particular ATM virtual channel (VPI=10, VCI=103) connects the university to the global Internet. It is the only connection (there are no backup links), which makes analysis of the traffic rather interesting, since all packets for all external connections must pass the measurement point. The encapsulation on this VC is Classical-IP-over-ATM. The connection has a packet peak rate of 2 Mbit/s set in each direction (4 Mbit/s when you aggregate the two trace files).
Among other traffic there is a LANE connection to an off-site campus. A couple of 2 Mbit/s PVCs are maintained to carry POTS services across the campus. In total, the link carried 30-50 Mbit/s the last time we measured.
The DAGs are driven by filtering software that captures the headers of CLIP traffic, so only the link to the Internet is recorded. The hardware and software take advantage of a Trimble Palisade GPS system, which delivers a 1PPS signal that is timestamped by the DAG2 system. Each 1 Hz timestamp is written as an artificial trace record, which is used in postprocessing to convert the DAG2 12.5 MHz timestamps into true DAG3 format timestamps (see documentation). The accuracy of the timestamps is believed to be within one microsecond of UTC or better. It is this synchronization accuracy that makes it possible to correlate the timestamps of the two directions of a trace, something still quite unusual when capturing from bidirectional links (ATM, PoS, ...). The dagmerge utility can take advantage of this when merging the two individual trace directions into a single data stream.
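To illustrate the idea, here is a rough sketch of both steps: turning a raw counter value into UTC with the help of the most recent 1PPS marker, and merging the two directions by timestamp. It is not how the DAG tools are implemented. The function names and the input shapes are hypothetical, the 12.5 MHz rate is taken as nominal, and counter wraparound and clock drift between pulses are ignored.

```python
import heapq

TICK_HZ = 12_500_000  # nominal DAG2 clock rate quoted in the text


def ticks_to_utc(ticks, pps_ticks, pps_utc_second, ticks_per_second=TICK_HZ):
    """Convert a raw tick count to UTC seconds using the latest 1PPS marker.

    pps_ticks is the counter value recorded at the GPS pulse and
    pps_utc_second the whole UTC second that the pulse marks. Both are
    hypothetical inputs; the real record layout is in the DAG documentation.
    """
    return pps_utc_second + (ticks - pps_ticks) / ticks_per_second


def merge_directions(direction0, direction1):
    """Merge two time-sorted iterables of (utc_time, payload) into one stream.

    Yields (utc_time, direction, payload); ties go to direction 0 first.
    """
    tagged0 = ((t, 0, p) for t, p in direction0)
    tagged1 = ((t, 1, p) for t, p in direction1)
    return heapq.merge(tagged0, tagged1, key=lambda item: (item[0], item[1]))
```

The merge is lazy, so both directions can be streamed from disk without holding a whole trace in memory. It only works because both directions share one UTC time base.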
The duration of a trace run is targeted at 24 hours, but varies due to hardware instabilities with the DAG2. We were just too busy getting the DAG3 cards to work to ever get around to fixing those seemingly minor problems.
Once a set of 3-4 trace runs has been collected, the traces are written to DDS-2 tape and shipped from Auckland to Hamilton via courier. You might guess that we are deliberately trying to avoid an ftp transfer (which would be visible in the next trace run), but the truth is rather simple: we are trying to keep costs low. The Internet traffic charges for the large traces would be enormous. Ok, so much for folklore.
Before and after the timestamp conversion, some integrity checks are run on the traces. This is not easy, as there is not much information redundancy in an IP header trace. We have developed a healthy paranoia for time warps, though: all traces are checked for timestamp consistency.
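A time-warp check can be as simple as the sketch below, which reports every timestamp that is earlier than its predecessor within one direction of a trace. This is our illustration of the idea, with a hypothetical function name, and not the actual checking code.

```python
def find_time_warps(timestamps):
    """Return (index, previous, current) for every timestamp that runs backwards."""
    warps = []
    previous = None
    for index, current in enumerate(timestamps):
        if previous is not None and current < previous:
            warps.append((index, previous, current))
        previous = current
    return warps
```

Equal consecutive timestamps are not flagged here; whether they should be depends on the resolution of the clock.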
For publication the traces have been sanitized. All non-IP traffic is discarded (one more time, just to be sure). Only ICMP, TCP and UDP traffic is left in the trace. For UDP packets and IP fragments, all user payload is zeroed. The IP addresses are mapped into network 10.X.X.X in a non-reversible way to preserve privacy. The same IP mapping database is used for all traces, so IP addresses that are identical in different traces are identical in the real world. A study of the behaviour of a particular machine across all the traces is thus possible.
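The actual mapping procedure is not described here, so the following is only one way such a scheme could look. Each real address gets the next unused address in 10.0.0.0/8 the first time it is seen, and a single table is reused for every trace. The class name and the sequential assignment are our assumptions. A pseudonym assigned this way carries no bits of the real address, so it cannot be reversed without the table.

```python
import ipaddress


class AddressMapper:
    """Assign each real IPv4 address a stable pseudonym inside 10.0.0.0/8.

    Hypothetical scheme: pseudonyms are handed out in order of first
    appearance, and one instance is shared across all traces.
    """

    def __init__(self):
        self._table = {}     # real address -> pseudonym
        self._next_host = 1  # next unused host number within 10.0.0.0/8

    def map(self, real_address):
        key = str(ipaddress.IPv4Address(real_address))
        if key not in self._table:
            if self._next_host >= 2**24 - 1:
                raise OverflowError("10.0.0.0/8 pseudonym space exhausted")
            base = int(ipaddress.IPv4Address("10.0.0.0"))
            self._table[key] = str(ipaddress.IPv4Address(base + self._next_host))
            self._next_host += 1
        return self._table[key]
```

Because the table persists, a host seen in one trace maps to the same 10.X.X.X address in every other trace, which is what allows a machine to be followed across the whole data set.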
Jörg Micheel, Ian Graham, Nevil Brownlee: The Auckland data set: an access link observed, submitted paper to the 14th ITC Specialist Seminar on Access Networks and Systems, April 25th-27th, 2001, Barcelona, Spain. Email me for a copy.
H Stele Martin, Anthony J McGregor, John G Cleary: Analysis of Internet Delay Times, Proceedings of the First Passive and Active Measurement Workshop, PAM2000, April 3rd-4th 2000, Hamilton, New Zealand, p.141ff.
Sarah K Joyce: Traffic on the Internet - A Study of Internet Games Traffic, BCMS 420 Honours project report, University of Waikato, Hamilton, New Zealand, 2000.
Vinay Ribeiro, Mark Coates, Rudolf Riedi, Shriram Sarvotham, Brent Hendricks and Richard Baraniuk: Multifractal Cross-Traffic Estimation. Proceedings of the 13th ITC Specialist Seminar on IP Traffic Measurement, Modelling and Management, September 18th-20th, 2000, Monterey, California, USA. Pages 15-1ff. The reference to the Auckland data set is in the actual talk.
Darryl Veitch, Lidong Huang, EMUlab, University of Melbourne, and Patrice Abry, Ecole Normale Superieure de Lyon, Laboratoire de Physique: Studies of Long Range Dependencies and Self-Similarity in Internet traffic patterns using Wavelet analysis. Darryl Veitch's home page.
Guoqiang Mao and Daryoush Habibi: Loss Performance Analysis for Heterogeneous ON-OFF Sources with Application to Connection Admission Control, submitted for publication in IEEE/ACM Transactions on Networking, October 2000.
If you are about to write a paper using the above data set, please kindly let us know.
Further information is available on the WAND website. The graphs there show packets, bandwidth and new connections for each trace. The pointers are:
http://wand.cs.waikato.ac.nz/wand/wits/ http://wand.cs.waikato.ac.nz/wand/wits/auck/2/
The traces online here are a subset of the auck/2 traces displayed there; they are the first ones on the list.
The Auckland monitor has recently been upgraded to a DAG3 system. The DAG3 will enable contiguous runs over several days or even weeks. This data set will be referred to as Auckland-IV.
Another data set, NZIX-II, provides a view of five days of traffic at a busy Internet exchange. Public access to the traces is here.