OSiRIS is a pilot project funded by the NSF to evaluate a software-defined storage infrastructure for our primary Michigan research universities. OSiRIS will combine a number of innovative concepts to provide a distributed, multi-institutional storage infrastructure that will allow researchers at any of our three campuses to read, write, manage and share their data directly from their computing facility locations.
Our goal is to provide transparent, high-performance access to the same storage infrastructure from well-connected locations on any of our campuses. We intend to enable this via a combination of network discovery, monitoring, and management tools, and through the creative use of Ceph features.
By providing a single data infrastructure that supports computational access to data "in place", we can meet many of the data-intensive and collaboration challenges faced by our research communities and enable them to easily undertake research collaborations beyond the borders of their own universities.
At SC16 one of our project engineers had the opportunity to speak with an engineer from Puppet and after the show we had a conversation about OSiRIS for the Puppet blog.
We talked about the project background, goals, and milestones as well as detailing how we leverage Puppet and Foreman in OSiRIS. Read more at the blog post on puppet.com.
The OSiRIS project was featured in the University of Michigan Advanced Research Computing booth at Supercomputing in Salt Lake City this year.
For the week of the conference, November 14 - 18, OSiRIS deployed a 4th storage site in Salt Lake City at the Salt Palace convention center.
SCinet is a high performance advanced network created every year for the Supercomputing conference. SCinet comes together thanks to the hard work of volunteers and the support of industry vendors who donate equipment and services.
On the last day of Supercomputing a call went out to 100G booths to push as much traffic as we could across SCinet LAN and WAN links to set a new SC16 bandwidth record. Of course we had to participate!
To generate traffic we used OSiRIS 100Gb connected hosts at UM, WSU, and MSU as well as 100Gb hosts from ATLAS Great Lakes Tier 2.
The combined traffic of all participating exhibitors set a new record, pushing over 1.2 Terabytes of traffic over the show floor and more than 1 Terabit/s across WAN circuits.
The NSF-funded Data Logistics Toolkit (DLT) project was featured in the Indiana University and the University of Michigan Advanced Research Computing booths at Supercomputing in Salt Lake City this year.
As an instantiation of the DLT, the Earth Observation Depot Network (EODN) aims to enable open access, reduced latency, and fast downloads of valuable Earth Science satellite data for meteorological and atmospheric researchers. Data sources include remote sensing data from NASA’s Earth Science Program (EOS-DIS) and the United States Geological Survey (USGS) Landsat ground network, made available within EODN via a “harvester” workflow maintained at IU.
Ceph has the capability to ‘overlay’ a storage pool with a cache pool. Typically the cache pool resides on faster storage to speed up client read/write operations on frequently used data.
At SC16 we experimented with using a cache pool to overcome some of the speed handicaps inherent in having OSDs separated by high-latency links. The theory was that a cache pool local to clients at SC16 would speed up writes in cases where the data could be written without needing to flush the cache.
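For reference, setting up a writeback cache tier like this in Ceph takes only a few commands. The sketch below uses hypothetical pool names (`base_pool` for the backing pool, `sc16_cache` for the local NVMe pool) and illustrative size/ratio values, not our actual SC16 configuration:

```shell
# Attach the cache pool to the backing pool and enable writeback mode,
# so client writes land in the cache and are flushed to the base tier later.
ceph osd tier add base_pool sc16_cache
ceph osd tier cache-mode sc16_cache writeback

# Redirect client traffic for the base pool through the cache tier.
ceph osd tier set-overlay base_pool sc16_cache

# A hit set is required for the tiering agent to track object access.
ceph osd pool set sc16_cache hit_set_type bloom

# Bound the cache size and tell the agent when to start flushing
# dirty objects and evicting clean ones (example values).
ceph osd pool set sc16_cache target_max_bytes 1000000000000
ceph osd pool set sc16_cache cache_target_dirty_ratio 0.4
ceph osd pool set sc16_cache cache_target_full_ratio 0.8
```

In a setup like ours, the cache pool would also need a CRUSH rule restricting it to the local NVMe OSDs, so that cached writes never cross the high-latency WAN links.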
Our cache pool devices were LIQID NVMe drives in hardware purchased from 2CRSI. Each LIQID drive card holds 4 x 800GB devices, and each box contained two cards, for a total of 16 Ceph OSDs available for use as a cache pool.