Novel Translational Methodologies

Description of Facilities and Resources

Oak Ridge National Laboratory and the UT-ORNL Joint Institute for Computational Sciences

 

1. Oak Ridge National Laboratory
Computer Facilities. The Oak Ridge National Laboratory (ORNL) hosts three computing facilities: the Oak Ridge Leadership Computing Facility (OLCF), managed for DOE; the National Institute for Computational Sciences (NICS) computing facility, operated for the National Science Foundation (NSF); and the National Climate-Computing Research Center (NCRC), formed as a collaboration between ORNL and the National Oceanic and Atmospheric Administration (NOAA) to explore a variety of research topics in climate sciences. Each of these facilities has a professional, experienced operational and engineering staff comprising groups in high-performance computing (HPC) operations, technology integration, user services, scientific computing, and application performance tools. The ORNL computer facility staff provides continuous operation of the centers and immediate problem resolution. On evenings and weekends, operators provide first-line problem resolution for users, with additional user support staff and system administrators on call for more difficult problems.

1.1 Primary Systems
Titan is a Cray XK7 system consisting of 18,688 AMD sixteen-core Opteron™ processors providing a peak performance of more than 3.3 petaflops (PF) and 600 terabytes (TB) of memory. A total of 512 service input/output (I/O) nodes provide access to the 10-petabyte (PB) “Spider” Lustre parallel file system at more than 240 gigabytes per second (GB/s). External login nodes (decoupled from the XK7 system) provide a powerful compilation and interactive environment using dual-socket, twelve-core AMD Opteron processors and 256 GB of memory. Each of the 18,688 Titan compute nodes is paired with an NVIDIA Kepler graphics processing unit (GPU) designed to accelerate calculations. With a peak performance per Kepler accelerator of more than 1 TF, the aggregate performance of Titan exceeds 20 PF. Titan is the Department of Energy’s most powerful open science computer system and is available to the international science community through the INCITE program, jointly managed by DOE’s Leadership Computing Facilities at Argonne and Oak Ridge National Laboratories, and through the Office of Advanced Scientific Computing Research (ASCR) Leadership Computing Challenge (ALCC), managed by DOE and ASCR.
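The Titan figures quoted above can be cross-checked with a short calculation. The following is a minimal sketch, assuming that the aggregate peak is simply the sum of the CPU peak and the per-GPU peak across all nodes; because the text gives both values as "more than", the result is a lower bound rather than an official rating.

```python
# Titan figures as quoted in the text; both peak values are lower bounds.
NODES = 18_688          # one 16-core Opteron and one Kepler GPU per node
CORES_PER_CPU = 16
CPU_PEAK_PF = 3.3       # "more than 3.3 PF" for the CPU side
GPU_PEAK_TF = 1.0       # "more than 1 TF" per Kepler accelerator

# Total number of CPU cores in the compute partition.
cpu_cores = NODES * CORES_PER_CPU

# Assumption: aggregate peak = CPU peak + sum of the GPU peaks.
gpu_peak_pf = NODES * GPU_PEAK_TF / 1000
aggregate_peak_pf = CPU_PEAK_PF + gpu_peak_pf

print(f"CPU cores: {cpu_cores:,}")
print(f"Lower bound on aggregate peak: {aggregate_peak_pf:.3f} PF")
```

Under these assumptions the lower bound is already consistent with the quoted aggregate of more than 20 PF.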

Gaea consists of a pair of Cray XE6 systems. The smaller partition contains 2,624 socket G34 AMD 16-core Opteron processors, providing 41,984 compute cores, 84 TB of double data rate 3 (DDR3) memory, and a peak performance of 386 teraflops (TF). The larger partition contains 4,896 socket G34 AMD 16-core Interlagos Opteron processors, providing 78,336 compute cores, 156.7 TB of DDR3 memory, and a peak performance of 721 TF.
The aggregate system provides 1.106 PF of computing capability and 248 TB of memory. The Gaea compute partitions are supported by a series of external login nodes and two separate file systems. The FS file system is based on more than 2,000 SAS drives and provides more than 1 PB of formatted space as fast scratch for all compute partitions. The LTFS file system comprises more than 2,000 SATA drives and provides 4 PB of formatted capacity as a staging and archive file system. Gaea is the NOAA climate community’s most powerful computer system and is available to the climate research community through the Department of Commerce/NOAA.
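The per-partition core counts above follow directly from the socket counts. The sketch below recomputes them and sums the quoted partition peaks; the dictionary layout and function name are illustrative only, and the summed peak is built from rounded per-partition figures, so it agrees with the quoted aggregate only to within rounding.

```python
# Gaea partition figures as quoted in the text.
CORES_PER_SOCKET = 16
partitions = {
    "smaller": {"sockets": 2_624, "peak_tf": 386},
    "larger": {"sockets": 4_896, "peak_tf": 721},
}

def partition_cores(sockets, cores_per_socket=CORES_PER_SOCKET):
    """Compute cores in a partition built from identical multi-core sockets."""
    return sockets * cores_per_socket

cores = {name: partition_cores(p["sockets"]) for name, p in partitions.items()}
total_cores = sum(cores.values())
total_peak_tf = sum(p["peak_tf"] for p in partitions.values())

print(cores)
print(f"Total cores: {total_cores:,}; summed peak: {total_peak_tf} TF")
```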

Eos is a 736-node Cray XC30 cluster with a total of 47.104 TB of memory. The processor is the Intel® Xeon® E5-2670. Eos uses Cray’s Aries interconnect in a network topology called Dragonfly. Aries provides a higher bandwidth and lower latency interconnect than Gemini. Support for I/O on Eos is provided by 16 I/O service nodes. The system has 2 external login nodes.
The compute nodes are organized in blades. Each blade contains 4 nodes connected to a single Aries interconnect. Every node has 64 GB of DDR3 SDRAM and 2 sockets with 8 physical cores each. Intel’s Hyper-Threading (HT) technology allows each physical core to work as two logical cores, so each node can function as if it has 32 cores. Each of the two logical cores can store a program state, but they share most of their execution resources. Each application should be tested to see how HT impacts performance before HT is used. The best candidates for a performance boost with HT are codes that are heavily memory-bound. The default setting on Eos is to execute without HT, so users must invoke HT with the -j2 option to aprun.
In total, the Eos compute partition contains 11,776 traditional processor cores (23,552 logical cores with Intel Hyper-Threading enabled) and 47.104 TB of memory.
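The Eos totals can be derived from the per-node figures given above. The following sketch does that arithmetic and also assembles an aprun command line that enables Hyper-Threading with the -j2 option mentioned in the text. The helper names and the executable name ./my_app are hypothetical, and the choice to fill every logical core with one process is an assumption made for illustration; as noted above, each code should be tested before HT is adopted.

```python
# Eos figures as quoted in the text.
NODES = 736
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
MEMORY_PER_NODE_GB = 64
LOGICAL_PER_PHYSICAL = 2  # Hyper-Threading: two logical cores per physical core

def eos_totals(nodes=NODES):
    """Return (physical cores, logical cores, memory in decimal TB)."""
    physical = nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET
    logical = physical * LOGICAL_PER_PHYSICAL
    memory_tb = nodes * MEMORY_PER_NODE_GB / 1000
    return physical, logical, memory_tb

def aprun_command(nodes, use_ht, exe="./my_app"):
    """Build an aprun argument list that places one process per core.

    Hypothetical helper: -n is the total process count, -N the count per
    node, and -j2 turns on Hyper-Threading (off by default on Eos).
    """
    per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    if use_ht:
        per_node *= LOGICAL_PER_PHYSICAL
    cmd = ["aprun", "-n", str(nodes * per_node), "-N", str(per_node)]
    if use_ht:
        cmd.append("-j2")
    cmd.append(exe)
    return cmd

physical_cores, logical_cores, memory_tb = eos_totals()
print(physical_cores, logical_cores, memory_tb)
print(" ".join(aprun_command(nodes=4, use_ht=True)))
```

For a four-node job with HT enabled, the helper requests 128 processes at 32 per node.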

The ORNL Institutional Cluster (OIC) consists of two phases. The original OIC consists of a bladed architecture from Ciara Technologies called VXRACK. Each VXRACK contains two login nodes, three storage nodes, and 80 compute nodes. Each compute node has dual Intel 3.4 GHz Xeon EM64T processors, 4 GB of memory, and dual gigabit Ethernet interconnects. Each VXRACK and its associated login and storage nodes are called a block. There are a total of nine blocks of this type. Phase 2 blocks were acquired and brought online in 2008. They are SGI Altix machines. There are two types of blocks in this family.
 Thin nodes (3 blocks). Each Altix contains 1 login node, 1 storage node, and 28 compute nodes within 14 chassis. Each node has eight cores and 16 GB of memory. The login and storage nodes are XE240 boxes from SGI. The compute nodes are XE310 boxes from SGI.
 Fat nodes (2 blocks). Each Altix contains 1 login node, 1 storage node, and 20 compute nodes within 20 separate chassis. Each node has eight cores and 16 GB of memory. These XE240 nodes from SGI contain larger node-local scratch space and provide much higher I/O bandwidth to this scratch space because the space is a volume built from four disks.

1.2 The University of Tennessee and ORNL’s Joint Institute for Computational Sciences
The University of Tennessee (UT) and ORNL established the Joint Institute for Computational Sciences (JICS) in 1991 to encourage and facilitate the use of high-performance computing in the state of Tennessee. When UT joined Battelle Memorial Institute in April 2000 to manage ORNL for the Department of Energy (DOE), the vision for JICS expanded to encompass becoming a world-class center for research, education, and training in computational science and engineering. JICS advances scientific discovery and state-of-the-art engineering by
 taking full advantage of the computers at the petascale and beyond housed at ORNL and in the Oak Ridge Leadership Computing Facility (OLCF) and
 enhancing knowledge of computational modeling and simulation through educating a new generation of scientists and engineers well versed in the application of computational modeling and simulation to solving the world’s most challenging scientific and engineering problems.
 
JICS is staffed by joint faculty who hold dual appointments as faculty members in departments at UT and as staff members in ORNL research groups. The institute also employs professional research staff, postdoctoral fellows and students, and administrative staff.
The JICS facility represents a $10M investment by the state of Tennessee and features a state-of-the-art interactive distance learning center with seating for 66 people, conference rooms, informal and open meeting space, executive offices for distinguished scientists and directors, and incubator suites for students and visiting staff.
The JICS facility is a hub of computational and engineering interactions. Joint faculty, postdocs, students, and research staff share the building, which is designed specifically to provide intellectual and practical stimulation. The auditorium serves as the venue for invited lectures and seminars by representatives from academia, industry, and other laboratories, and the open lobby doubles as casual meeting space and the site for informal presentations and poster sessions.
JICS is home to the National Institute for Computational Sciences (NICS). The mission of NICS is to enable the scientific discoveries of researchers nationwide by providing leading-edge computational resources and education, outreach, and training for underrepresented groups. NICS hosted the recently decommissioned Cray supercomputer, Kraken, which was the first academic system to surpass a petaflop. Current NICS resources include:

Darter is a Cray XC30 system with an Aries interconnect and a Lustre storage system that together provide both high scalability and sustained performance. The Darter supercomputer has a peak performance of 240.9 Tflops (1 Tflops = 10^12 floating point operations per second).

Nautilus is an SGI Altix UV system consisting of one UV1000 (Nautilus), 4 UV10s (Harpoon nodes), and 3 login nodes (Arronax, Conseil, and Nedland). The UV1000 has 1024 cores (128 8-core Intel Nehalem EX processors), 4 terabytes of global shared memory, and 8 GPUs in a single system image. In addition, each UV10 provides 32 cores, 128 GB, and 2 GPUs. Nautilus currently has a CPU speed of 2.0 GHz and a peak performance of 8.2 Teraflops. The Lustre file system Medusa, with 1.3 PB capacity, is mounted on Nautilus.
The primary purpose of Nautilus is to enable data analysis and visualization of data from simulations, sensors, or experiments. Nautilus is intended for serial and parallel visualization and analysis applications that take advantage of large memories, multiple computing cores, and multiple graphics processors. Nautilus allows for both utilization of a large number of processors for distributed processing and the execution of legacy serial analysis algorithms for very large data processing by large numbers of users simultaneously.
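The quoted Nautilus peak can be reproduced from the core count and clock speed. The sketch below assumes 4 double-precision floating point operations per core per clock cycle, which is typical for Nehalem-class processors but is not stated in the text; it also totals the resources of the four UV10 nodes.

```python
# Nautilus UV1000 figures as quoted in the text.
PROCESSORS = 128
CORES_PER_PROCESSOR = 8
CLOCK_GHZ = 2.0
FLOPS_PER_CYCLE = 4  # assumption: not given in the text

uv1000_cores = PROCESSORS * CORES_PER_PROCESSOR
# cores x GHz x flops/cycle gives GFLOPS; divide by 1000 for TFLOPS.
uv1000_peak_tf = uv1000_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000

# The four UV10 (Harpoon) nodes: 32 cores, 128 GB, and 2 GPUs each.
UV10_NODES = 4
uv10_cores = UV10_NODES * 32
uv10_memory_gb = UV10_NODES * 128
uv10_gpus = UV10_NODES * 2

print(f"UV1000: {uv1000_cores} cores, {uv1000_peak_tf:.3f} TF peak")
print(f"UV10s: {uv10_cores} cores, {uv10_memory_gb} GB, {uv10_gpus} GPUs")
```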

Beacon is an energy efficient cluster that utilizes Intel® Xeon Phi™ coprocessors. It is funded by NSF through the Beacon project to port and optimize scientific codes to the coprocessors based on Intel's Many Integrated Core (MIC) architecture.

2. Infrastructure
Physical and Cyber Security. ORNL has a comprehensive physical security strategy including fenced perimeters, patrolled facilities, and authorization checks for physical access. An integrated cyber security plan encompasses all aspects of computing. Cyber security plans are risk-based. Separate systems of differing security requirements allow the appropriate level of protection for each system, while not hindering the science needs of the projects.
 
Network Connectivity. The ORNL campus is connected to every major research network at rates between 10 gigabits per second (Gb/s) and 100 Gb/s. Connectivity to these networks is provided via optical networking equipment owned and operated by UT-Battelle that runs over leased fiber-optic cable. This equipment is capable of simultaneously carrying either 192 10-Gb/s circuits or 96 40-Gb/s circuits and connects the OLCF to major networking hubs in Atlanta and Chicago. The connections into ORNL provide access to research and education networks including ESnet, XSEDE, and Internet2. To meet the increasingly demanding needs of data transfers between major facilities, ORNL participated in the Advanced Networking Initiative, which provides a native 100-Gb/s optical network fabric that includes ORNL, Argonne National Laboratory, Lawrence Berkeley National Laboratory, and other facilities in the northeast. This 100G fabric is now a production network.
 
The local-area network is a common physical infrastructure that supports separate logical networks, each with varying levels of security and performance. Each of these networks is protected from the outside world and from each other with access control lists and network intrusion detection. Line rate connectivity is provided between the networks and to the outside world via redundant paths and switching fabrics. A tiered security structure is designed into the network to mitigate many attacks and to contain others. 
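To give a sense of what the wide-area link rates mean for data movement, the following sketch estimates ideal transfer times. It assumes that the 10 and 100 figures are line rates in gigabits per second, the usual unit for network circuits, and that the link is fully dedicated to one transfer; the function name and the efficiency parameter are illustrative, and real transfers will be slower because of protocol overhead and storage limits.

```python
def transfer_seconds(size_bytes, rate_gbps, efficiency=1.0):
    """Ideal time to move size_bytes over a link of rate_gbps gigabits/s.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol and end-system overhead; 1.0 means the full line rate.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_bytes * 8
    return bits / (rate_gbps * 1e9 * efficiency)

ONE_TB = 1e12  # decimal terabyte, in bytes

for rate in (10, 100):
    seconds = transfer_seconds(ONE_TB, rate)
    print(f"1 TB at {rate} Gb/s: {seconds:.0f} s")
```

At the full line rate, a 1-TB dataset takes roughly 13 minutes at 10 Gb/s and under a minute and a half at 100 Gb/s.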

Visualization and Collaboration. ORNL has state-of-the-art visualization facilities that can be used on site or accessed remotely. 

The EVEREST facility is a scientific laboratory deployed and managed by the Oak Ridge Leadership Computing Facility (OLCF). The primary mission of this laboratory is to provide tools to be leveraged by scientists for analysis and visualization of simulation data generated on the OLCF supercomputers.
Three computing systems are currently provided in the laboratory. These consist of a distributed memory Linux cluster, a shared memory Linux node, and a shared memory Windows node. Access to the Linux computing resources requires an EVEREST account and an RSA SecurID. Access to the Windows computing resources requires a standard ORNL UCAMS account and does not require a specific EVEREST account.
Two tiled display walls are provided. The primary display wall spans 30.5’ x 8.5’ and consists of 18 1920×1080 stereoscopic Barco projection displays arranged in a 6 x 3 configuration. The secondary display wall consists of 16 1920×1080 Planar displays arranged in a 4 x 4 configuration, providing a standard 16:9 aspect ratio.
There are four additional peripheral video inputs located on pop-out boxes in the conference table. Each input supports both digital DVI and analog VGA. Users of the laboratory are welcome to control either wall using personal hardware that is brought into the laboratory. Power outlets are provided at the conference table.
The laboratory instruments are controlled using a touch panel interface located at the control desk. All computing resources can be routed to any available display wall. User hardware using the video input ports on the conference table can also be routed via the touch panel.

High Performance Storage and Archival Systems. To meet the needs of ORNL’s diverse computational platforms, a shared parallel file system capable of meeting the performance and scalability requirements of these platforms has been successfully deployed. This shared file system, based on Lustre, DataDirect Networks (DDN), and InfiniBand technologies, is known as Spider and provides centralized access to petascale datasets from all major on-site computational platforms. Delivering more than 240 GB/s of aggregate performance, scalability to more than 26,000 file system clients, and more than 10 petabytes (PB) of storage capacity, Spider is the world’s largest scale Lustre file system. Spider consists of 48 DDN 9900 storage arrays managing 13,440 1-TB SATA drives; 192 Dell dual-socket, quad-core I/O servers providing more than 14 TF in performance; and more than 3 TB of system memory. Metadata are stored on two LSI Engenio 7900s (XBB2) and are served by three Dell quad-socket, quad-core systems. ORNL systems are interconnected to Spider via an InfiniBand system area network, which consists of four 288-port Cisco 7024D IB switches and more than 3 miles of optical cables.
Archival data are stored on the center’s High Performance Storage System (HPSS), developed and operated by ORNL. HPSS is capable of archiving hundreds of petabytes of data and can be accessed by all major leadership computing platforms. Incoming data are written to disk and later migrated to tape for long-term archiving. This hierarchical infrastructure provides high-performance data transfers while leveraging cost-effective tape technologies. Robotic tape libraries provide tape storage. The center has five SL8500 tape libraries holding up to 10,000 cartridges each and is deploying a sixth SL8500 in 2013. The libraries house a total of 24 T10K-A tape drives (500-GB cartridges, uncompressed) and 32 T10K-B tape drives (1-TB cartridges, uncompressed). Each drive has a bandwidth of 120 MB/s.
ORNL’s HPSS disk storage is provided by DDN storage arrays with nearly a petabyte of capacity and over 12 GB/s of bandwidth. This infrastructure has allowed the archival system to scale to meet increasingly demanding capacity and bandwidth requirements.
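A few derived quantities help put the storage figures above in context. The sketch below computes the raw Spider disk capacity, the number of drives per storage array, and the combined streaming bandwidth of the tape drives. It assumes the quoted drive counts and per-drive rates; the raw disk figure is before any RAID or formatting overhead, which the text does not describe, and the tape figure assumes every drive streams at its full rate at the same time.

```python
# Spider disk figures as quoted in the text.
SPIDER_ARRAYS = 48
SPIDER_DRIVES = 13_440
DRIVE_CAPACITY_TB = 1

drives_per_array = SPIDER_DRIVES // SPIDER_ARRAYS
raw_capacity_pb = SPIDER_DRIVES * DRIVE_CAPACITY_TB / 1000

# HPSS tape figures as quoted in the text.
TAPE_DRIVES = {"T10K-A": 24, "T10K-B": 32}
DRIVE_BANDWIDTH_MBS = 120
LIBRARIES = 5
SLOTS_PER_LIBRARY = 10_000

total_tape_drives = sum(TAPE_DRIVES.values())
# Upper bound: all tape drives streaming concurrently at full rate.
tape_bandwidth_gbs = total_tape_drives * DRIVE_BANDWIDTH_MBS / 1000
cartridge_slots = LIBRARIES * SLOTS_PER_LIBRARY

print(f"Spider: {drives_per_array} drives per array, {raw_capacity_pb} PB raw")
print(f"Tape: {total_tape_drives} drives, {tape_bandwidth_gbs} GB/s aggregate")
print(f"Cartridge slots across libraries: {cartridge_slots:,}")
```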

Neutron Sciences Directorate Biology and Soft Matter Division: The Biology and Soft Matter Division (BSMD) operates an external user program for biological and soft matter research using neutron techniques at SNS and HFIR. Division personnel enable the research initiated by external users by acting as instrument responsible scientists and local contacts on a range of different beam lines. BSMD works closely with the Center for Structural Molecular Biology.

Diffraction, small-angle scattering, and reflectometry are ideal methods for studying structure and organization from the atomic to the micron length scales, and neutron spectroscopic methods characterize self and collective motions from picosecond to microsecond timescales. These techniques are applicable to the length and time scales intrinsic to soft matter and biological systems but, unlike most other methods, are uniquely sensitive to hydrogen, an atom abundantly present in biological and soft condensed materials. [Learn more: http://neutrons.ornl.gov/bsmd]

Center for Structural Molecular Biology: The Center for Structural Molecular Biology (CSMB) at ORNL is dedicated to developing instrumentation and methods for determining the three-dimensional structures of proteins, nucleic acids (DNA/RNA), and their higher order complexes. The tools of the CSMB will help researchers understand how these macromolecular systems are formed and how they interact with other systems in living cells. The focus of the CSMB is to bridge the information gap between cellular function and the molecular mechanisms that drive it. [Learn more: http://www.csmb.ornl.gov/]

For more specific detail on how we use neutrons to study biology-related scientific topics, visit these instruments’ websites:

o   BIO-SANS: http://neutrons.ornl.gov/biosans

o   IMAGINE: http://neutrons.ornl.gov/imagine

o   MaNDi: http://neutrons.ornl.gov/mandi

o   EQ-SANS: http://neutrons.ornl.gov/eqsans

o   Liquids Reflectometer: http://neutrons.ornl.gov/lr

o   Neutron Spin Echo: http://neutrons.ornl.gov/nse

o   BASIS: http://neutrons.ornl.gov/basis


Some interesting news resources on biology-related neutron science:

o   Neutrons used to understand Huntington’s Disease/Alzheimer’s:

    -   http://www.ornl.gov/ornl/news/news-releases/2014/f9cd8127-4246-42e6-a43c-f31a25af6e3d

    -   https://www.youtube.com/watch?v=4YesRs1CMS0

o   Neutrons reveal enzyme synthesis process:

    -   http://neutrons.ornl.gov/node/1613

    -   http://neutrons.ornl.gov/news/unlocking-enzyme-synthesis

o   Neutrons used to study cell function: http://neutrons.ornl.gov/news/biochemical-switch

o   Lignin Fibers for battery production: http://www.ornl.gov/ornl/news/features/2014/predicting-performance-

o   Neutrons in Soft Matter science: http://www.ornl.gov/ornl/news/features/2013/neutrons-in-soft-matter-science