OSiRIS is a pilot project funded by the NSF to evaluate a software-defined storage infrastructure for our primary Michigan research universities. OSiRIS will combine a number of innovative concepts to provide a distributed, multi-institutional storage infrastructure that will allow researchers at any of our three campuses to read, write, manage and share their data directly from their computing facility locations.
Our goal is to provide transparent, high-performance access to the same storage infrastructure from well-connected locations on any of our campuses. We intend to enable this via a combination of network discovery, monitoring and management tools, and through the creative use of Ceph features.
By providing a single data infrastructure that supports computational access to the data “in-place”, we can meet many of the data-intensive and collaboration challenges faced by our research communities and enable these communities to easily undertake research collaborations beyond the borders of their own universities.
Michael Thompson, Wayne State University
OSiRIS introduces a new variable to Ceph with respect to the placement of data containers (placement groups, or PGs) on data storage devices (OSDs). By default, Ceph stores redundant copies on randomly selected OSDs within failure domains defined by the CRUSH map. Typically a failure domain is a host, rack, etc., and PG replicas have fairly low latency between each other.
The OSiRIS project is structured such that PGs might be in different cities or even states, with much higher network latency between them. This certainly affects overall performance, but we do have some options to optimize for certain use cases. One of these options is setting our CRUSH rules to prefer one site or another for the primary OSD when allocating PG copies. Based on our testing, this is a great way to boost read I/O for certain use cases.
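As a rough sketch of the technique (the site bucket name, rule name, and IDs here are hypothetical, not our actual CRUSH map), a rule can draw the first — primary — replica from one site's subtree and the remaining replicas from elsewhere in the hierarchy:

```
# Hypothetical rule: primary replica from a site bucket named "um",
# remaining replicas from the default root. Names and IDs are
# illustrative only.
rule primary-at-um {
    id 10
    type replicated
    step take um
    step chooseleaf firstn 1 type host
    step emit
    step take default
    step chooseleaf firstn -1 type host
    step emit
}
```

Once compiled into the CRUSH map (e.g. with crushtool), a pool can be pointed at such a rule with `ceph osd pool set <pool> crush_rule primary-at-um`. Note that nothing in the second `take` step by itself prevents further replicas from also landing in the preferred site, so the layout of the CRUSH tree matters when writing rules like this.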
Motivated by Ceph usage in our OSiRIS project, University of Michigan has joined the Ceph Foundation as an Associate Member! We join other educational, government, and research organizations engaged in the Ceph Foundation at this membership level.
From the Foundation website: The Ceph Foundation exists to enable industry members to collaborate and pool resources to support the Ceph project community. The Foundation provides an open, collaborative, and neutral home for project stakeholders to coordinate their development and community investments in the Ceph ecosystem.
The Ceph Foundation is organized as a directed fund under the Linux Foundation. Thanks to U-M’s recent push to join the Cloud Native Computing Foundation the institution is already a member of the Linux Foundation and was enthusiastic to expand participation in open communities with the Ceph Foundation.
OSiRIS is an NSF funded collaboration between University of Michigan, Michigan State University, Wayne State University, and Indiana University. All of these institutions make valuable contributions to the project and without them our participation in the foundation would not be possible.
For more information on the Ceph Foundation: https://ceph.com/foundation
The International Conference for High Performance Computing, Networking, Storage, and Analysis
November 11–16, 2018
Kay Bailey Hutchison Convention Center, Dallas, Texas
Members of the OSiRIS team traveled to SC18 to set up in the University of Michigan and Michigan State University combined booth!
We set up a rack of equipment designated for use by OSiRIS, SLATE, and AGLT2 demos at SC. The rack was shipped as a unit from Michigan and was waiting for us to plug in and set up when we arrived at the conference.
SLATE (Services Layer At The Edge) aims to provide a platform for deploying scientific services across national and international infrastructures. At SC18 they demonstrated their containerized Xcache service using a 50 TB RADOS Block Device hosted by OSiRIS in a pool overlaid by our Ceph cache tier.
At Supercomputing 2018 the OSiRIS project configured a Ceph storage node with 60 OSDs to host cache tier pools. This gave us approximately 600 TB of raw caching space, or 200 TB with 3x replicated pools. For testing purposes we replicated pools on just a single host, but in a production setup a more resilient set of OSDs would be desirable.
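As a minimal sketch of how a cache tier like this is wired up (pool names, PG counts, and sizes below are illustrative, not our exact SC18 configuration), the Ceph commands look roughly like:

```shell
# Create a base pool and a cache pool (PG counts illustrative).
ceph osd pool create rbd-base 1024 1024
ceph osd pool create rbd-cache 1024 1024

# Attach the cache pool as a writeback tier overlaying the base pool.
ceph osd tier add rbd-base rbd-cache
ceph osd tier cache-mode rbd-cache writeback
ceph osd tier set-overlay rbd-base rbd-cache

# Bound the cache so it flushes and evicts before filling its OSDs
# (~200 TB here, matching 600 TB raw at 3x replication).
ceph osd pool set rbd-cache target_max_bytes 200000000000000

# Clients (such as the Xcache demo) then create images against the
# base pool; I/O is transparently served through the cache tier.
rbd create --size 50T rbd-base/xcache-demo
```

With the overlay set, clients address only the base pool and the tiering is invisible to them, which is what let us drop a cache tier in front of the SC18 demo pools without changing the client side.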