Data Transfer Nodes

Data transfer nodes (DTNs) are dedicated, high-performance servers used specifically for data transfers. These servers are optimized for large-scale transfers and are best located within a Science DMZ infrastructure.

For DTN hardware, LLNL's Livermore Computing Center has compiled a document detailing hardware recommendations that work well for an ESGF environment.

ICNWG will be working with Globus-equipped DTNs (GridFTP-based software) for data replication between the ESGF sites. LLNL has installed Globus Connect Server on its DTNs; information on how they configured the systems is available here.

ESnet has three high-performance data transfer hosts connected directly to the ESnet 100Gbps network backbone, which can be used to test disk-to-disk transfer performance against the Globus DTN endpoints. They are accessible to any university or science site:

anl-diskpt1.es.net - near Chicago, IL
bnl-diskpt1.es.net - near NYC, NY
lbl-diskpt1.es.net - Berkeley, CA

Globus Service Access

The test hosts are also available via the Globus Transfer service, and are configured for anonymous, read-only access:

anl-diskpt1.es.net is registered as the endpoint esnet#anl-diskpt1
bnl-diskpt1.es.net is registered as the endpoint esnet#bnl-diskpt1
lbl-diskpt1.es.net is registered as the endpoint esnet#lbl-diskpt1

Sample GridFTP test commands

If you don't have globus-url-copy installed, please refer to the GridFTP Quick Start Guide.

# Examples below use the Berkeley host; substitute the other hosts as needed.
# Make sure you can connect to the server
globus-url-copy -list ftp://lbl-diskpt1.es.net/data1/
# Copy a 1G file
globus-url-copy -vb -fast ftp://lbl-diskpt1.es.net/data1/1G.dat file:///tmp/test.out
# Copy a 1G file using 4 parallel streams
globus-url-copy -vb -fast -p 4 ftp://lbl-diskpt1.es.net/data1/1G.dat file:///tmp/test.out
# Write to /dev/null (tests the server's read path and the network only)
globus-url-copy -vb -fast -p 4 ftp://lbl-diskpt1.es.net/data1/1G.dat file:///dev/null
# Read from /dev/zero on the server
globus-url-copy -vb -fast -p 4 -len 1G ftp://lbl-diskpt1.es.net/dev/zero file:///tmp/t.out
# Use UDT instead of TCP
globus-url-copy -vb -udt ftp://lbl-diskpt1.es.net/data1/1G.dat file:///dev/null
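When running these tests by hand, it helps to capture elapsed time and effective throughput in one place. A minimal wrapper along these lines can do it; the function name and output format below are our own illustration, not part of the GridFTP toolkit, and the timer only has whole-second resolution:

```shell
# Hypothetical helper: run a transfer command, time it, and report MB/s
# for a known payload size.
time_transfer() {
  bytes=$1; shift                     # expected bytes transferred
  start=$(date +%s)
  "$@"                                # e.g. a globus-url-copy command
  end=$(date +%s)
  elapsed=$(( end - start ))
  [ "$elapsed" -eq 0 ] && elapsed=1   # avoid divide-by-zero on fast runs
  echo "$(( bytes / elapsed / 1048576 )) MB/s over ${elapsed}s"
}
```

For example (host shown is the Berkeley test host): time_transfer 1073741824 globus-url-copy -vb -fast ftp://lbl-diskpt1.es.net/data1/1G.dat file:///dev/null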
Each host has a high-performance disk array, mounted as /data1. The following test files are available on each server; they were generated from /dev/urandom, and each file's size is what its name suggests:

/data1/1M.dat, /data1/10M.dat, /data1/50M.dat, /data1/100M.dat,
/data1/1G.dat, /data1/10G.dat, /data1/50G.dat, /data1/100G.dat
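Because the file sizes are fixed, it is easy to sanity-check a measured result against the line rate. A back-of-the-envelope helper (the function is our own illustration, not part of the test hosts; it ignores protocol overhead and disk limits):

```shell
# Hypothetical helper: seconds to move N bytes at R gigabits per second.
expected_seconds() {
  bytes=$1; gbps=$2
  awk -v b="$bytes" -v r="$gbps" 'BEGIN { printf "%.1f\n", (b * 8) / (r * 1e9) }'
}
```

For instance, expected_seconds 107374182400 10 reports that 100G.dat should take roughly 86 seconds at a full 10 Gbps; if a transfer takes far longer, something in the path is the bottleneck.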
In addition, there are currently several data sets composed of multiple files in a directory structure, for testing multi-file transfers. Each data set contains directories a through y, and each of those directories contains directories a through y. Each leaf directory contains data files named for their place in the directory structure: for example, a-a-1M.dat is a 1,000,000-byte data file at the path 5GB-in-small-files/a/a/a-a-1M.dat. Note that the tiny-files test set is primarily for testing directory creation performance, as the amount of data transferred is trivial.

The data sets are:

/data1/5MB-in-tiny-files - 1KB, 2KB, and 5KB files in each leaf directory
/data1/5GB-in-small-files - 1MB, 2MB, and 5MB files in each leaf directory
/data1/50GB-in-medium-files - 10MB, 20MB, and 50MB files in each leaf directory
/data1/500GB-in-large-files - 100MB, 200MB, and 500MB files in each leaf directory
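The layout described above is easy to reproduce locally, which is handy for dry-running recursive transfers against your own DTN before pulling from ESnet. A sketch that rebuilds the tiny-files set (our reconstruction: leaf file names like a-a-1K.dat extrapolate the 1M naming example above):

```shell
# Sketch: rebuild the 5MB-in-tiny-files layout locally.
# Directories a..y, each containing directories a..y; each leaf holds
# 1KB, 2KB, and 5KB files named for their place in the tree.
base=${BASE:-/tmp/5MB-in-tiny-files}
letters="a b c d e f g h i j k l m n o p q r s t u v w x y"
for d1 in $letters; do
  for d2 in $letters; do
    dir="$base/$d1/$d2"
    mkdir -p "$dir"
    for kb in 1 2 5; do
      # dd prints its stats on stderr; silence them
      dd if=/dev/urandom of="$dir/$d1-$d2-${kb}K.dat" \
         bs=1024 count="$kb" 2>/dev/null
    done
  done
done
```

This yields 625 leaf directories and 1,875 files totaling about 5MB, matching the many-small-entries shape that stresses directory creation.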
Sample commands for copying the complete data sets (these use the Berkeley DTN - substitute the other DTNs as needed):

# Destination paths below are examples - adjust for your system.
# Copy using one stream only to test single-stream disk-to-disk performance
globus-url-copy -vb -p 1 -fast -r \
   ftp://lbl-diskpt1.es.net/data1/5GB-in-small-files/ file:///tmp/5GB-in-small-files/
globus-url-copy -vb -p 1 -fast -r \
   ftp://lbl-diskpt1.es.net/data1/50GB-in-medium-files/ file:///tmp/50GB-in-medium-files/
# Copy using 4 parallel streams
globus-url-copy -vb -p 4 -fast -r \
   ftp://lbl-diskpt1.es.net/data1/5GB-in-small-files/ file:///tmp/5GB-in-small-files/
globus-url-copy -vb -p 4 -fast -r \
   ftp://lbl-diskpt1.es.net/data1/50GB-in-medium-files/ file:///tmp/50GB-in-medium-files/
# Copy the big data set using 8 parallel streams
# (make sure your performance is good before doing this one!)
globus-url-copy -vb -p 8 -fast -r \
   ftp://lbl-diskpt1.es.net/data1/500GB-in-large-files/ file:///tmp/500GB-in-large-files/

More information about these hosts can be found on ESnet's fasterdata site.