
OSiRIS is built on open source technology. The components we used or built are detailed below.

Ceph


Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability.

The OSiRIS Ceph deployment spans WSU, MSU, and UM. We have currently deployed approximately 970 OSDs on 8 TB and 10 TB disks, for a total of about 8 PB of raw storage.
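
As a rough sketch of how a multi-site deployment like this keeps replicas spread across institutions, a CRUSH rule can use the site level of the hierarchy as its failure domain. The rule, pool, and bucket-type names below are illustrative, not the actual OSiRIS CRUSH map:

```bash
# Illustrative only: replicate across three sites by using "datacenter"
# buckets as the failure domain (names are placeholders).
ceph osd crush rule create-replicated osiris-3site default datacenter
ceph osd pool create example-pool 2048 2048 replicated osiris-3site
ceph osd pool set example-pool size 3   # one copy per site
```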

For information on how we tune and configure Ceph, see Ceph configuration.

One of our project goals is to explore the limits of Ceph and of our distributed architecture, so we performed extensive simulated latency testing.

We also did real-world latency testing at Supercomputing 2016. To work around network latency we have also explored, and are actively using, Ceph's cache tiering functionality. We experimented with it at Supercomputing 2016 and Supercomputing 2018, and we have deployed a production cache tier at the Van Andel Institute in Grand Rapids, MI.
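
For reference, the basic shape of a Ceph cache tier setup looks like the sketch below. Pool names and the size threshold are placeholders, not our production values:

```bash
# Minimal cache tiering sketch (illustrative pool names).
ceph osd tier add base-pool cache-pool             # attach cache-pool in front of base-pool
ceph osd tier cache-mode cache-pool writeback      # absorb writes, flush to the base tier
ceph osd tier set-overlay base-pool cache-pool     # send client traffic through the cache tier
ceph osd pool set cache-pool hit_set_type bloom    # required so the tiering agent can track hits
ceph osd pool set cache-pool target_max_bytes 1099511627776   # e.g. flush/evict beyond ~1 TiB
```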

The Ceph CRUSH map also allows choosing which OSD acts as the primary in any given set of replicas, and the primary is where reads and writes are directed. Setting the primary to be close to the clients doing most of the reads can boost performance for those clients in a high-latency deployment. Our article details the configuration and performance benchmarks.
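
As a rough illustration of the mechanism (the OSD ids and weights here are made up, not our settings), primary affinity can be lowered on distant OSDs so they are rarely chosen as primary:

```bash
# Illustrative primary-affinity adjustments: 1.0 is the default,
# 0 means "avoid using this OSD as primary when possible".
ceph osd primary-affinity osd.12 0      # remote OSD: avoid as primary
ceph osd primary-affinity osd.45 1.0    # local OSD: preferred as primary
```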

COmanage

OSiRIS identity management and provisioning is handled by Internet2 COmanage. Plugins to provision user information from COmanage to LDAP and to Grouper are part of the COmanage release; for Ceph we created a new plugin. Each plugin is developed on its own git branch and merged into a master branch that reflects the COmanage version we currently run. We generally track the COmanage master branch.

More information and usage instructions for the COmanage Ceph Provisioner plugin are available in our documentation.

LdapUserPosixGroupProvisioner: A simple plugin that provisions a posixGroup with a gid matching every posixUser uid. It may eventually be made obsolete by including this feature in the core LdapProvisioner plugin, but in the meantime we needed something to do this.
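
To make the idea concrete, the sketch below shows the kind of paired entries this produces: a user entry and a per-user group whose gidNumber matches the user's uidNumber. The DNs, attribute values, and LDAP URI are hypothetical, not the actual OSiRIS directory layout:

```bash
# Hypothetical example of the user/group pairing (placeholders throughout).
ldapadd -H ldaps://ldap.example.org -D "cn=admin,dc=example,dc=org" -W <<'EOF'
dn: uid=jdoe,ou=People,dc=example,dc=org
objectClass: inetOrgPerson
objectClass: posixAccount
uid: jdoe
cn: Jane Doe
sn: Doe
uidNumber: 2001
gidNumber: 2001
homeDirectory: /home/jdoe

dn: cn=jdoe,ou=Groups,dc=example,dc=org
objectClass: posixGroup
cn: jdoe
gidNumber: 2001
memberUid: jdoe
EOF
```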

Stable code from all of these plugins is combined on the osiris_master git branch within our fork of the Internet2 COmanage repository. Other miscellaneous changes to COmanage are also included on this branch, but they are not essential for recreating our functionality. From time to time we have made pull requests to the upstream repository with small changes that are applicable to general use, and we may at some point make an effort to include our other plugins in the upstream release if there is interest.

Monitoring

For many system metrics we use collectd. Typical metrics include CPU, disk, and network (including OVS stats), as well as metrics specific to some of our services such as HAProxy and Ceph process details. To configure collectd we use a puppet-collectd module.
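
For a sense of the plugin set involved, a hand-written fragment along these lines covers the metrics mentioned above; in practice the equivalent settings are generated by the puppet-collectd module, and the file path and process names here are illustrative:

```bash
# Illustrative collectd fragment (normally rendered by puppet-collectd).
cat > /etc/collectd.d/osiris-example.conf <<'EOF'
LoadPlugin cpu
LoadPlugin disk
LoadPlugin interface
LoadPlugin processes
<Plugin processes>
  Process "ceph-osd"
  Process "haproxy"
</Plugin>
EOF
```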

Ceph metrics are gathered by a ceph-mgr plugin originally written by a U-M student working for our project. The plugin was contributed back to Ceph and has seen significant modification since then. Some details about the plugin are covered in our article, and you can also find more information in the Ceph documentation.

New deployments weighing options for Ceph metrics and monitoring may also want to look into Prometheus, which likewise has a ceph-mgr plugin for exporting stats.
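
Either exporter is turned on through the manager's module mechanism; the module names below are the upstream ones, and any further per-module settings (endpoints, credentials) are deployment-specific:

```bash
# Enable the metric-exporting manager modules, then confirm what is active.
ceph mgr module enable influx
ceph mgr module enable prometheus
ceph mgr module ls
```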

In our case we feed metrics from ceph-mgr and from collectd into InfluxDB. We run our own instance of the open source edition on NVMe storage.

We take the stats from InfluxDB and use Grafana to construct dashboards for monitoring Ceph status, system status, and so on.
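
A quick way to sanity-check that metrics are arriving before building Grafana panels is to query InfluxDB directly; the database names below are placeholders for whatever the ceph-mgr and collectd pipelines write into:

```bash
# Confirm measurements exist in the metric databases (illustrative names).
influx -database 'ceph' -execute 'SHOW MEASUREMENTS'
influx -database 'collectd' -execute 'SHOW MEASUREMENTS'
```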

Log collection and aggregation uses the “ELK” stack, with Filebeat shipping logs to our Elasticsearch cluster. We collect logs from syslog files, from Ceph log files, and from devices such as switches; these are all fed into Logstash.
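
A minimal Filebeat sketch for this kind of pipeline, reading syslog and Ceph logs and shipping them to Logstash, looks like the following; the hostname and paths are placeholders, not our production configuration:

```bash
# Illustrative Filebeat configuration (placeholder paths and hostname).
cat > /etc/filebeat/filebeat.yml <<'EOF'
filebeat.inputs:
  - type: log
    paths:
      - /var/log/messages
      - /var/log/ceph/*.log
output.logstash:
  hosts: ["logstash.example.org:5044"]
EOF
systemctl restart filebeat
```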

For log searching and visualization we use Kibana. Grafana can also use Elasticsearch data to generate plots, though it is not as convenient as a time-series database input.

More details about monitoring tools

Puppet Modules

We use Puppet to manage setup and configuration. The following puppet modules were created for OSiRIS or forked from other modules and modified for OSiRIS usage. Documentation on using them is included in each repository's README file.

More information on Puppet and other management tools we use

puppet-ceph: OSiRIS storage is provided by Ceph, and this puppet module is used to deploy and manage all Ceph components. It was recently updated to deploy BlueStore OSDs. Our version is forked from openstack/puppet-ceph.

puppet-ds389: OSiRIS backend directory services are provided by 389 Directory Server in a multi-master replicated configuration. This module is used to deploy and manage that configuration and the additional schema required for OSiRIS.

puppet-grouper: OSiRIS Posix groups are managed and provisioned to LDAP by Internet2 Grouper. Grouper could also be extended with additional provisioning targets to manage non-LDAP groups or to translate group memberships into other models such as S3 bucket ACL users, but we haven’t explored this. This puppet module manages the Grouper config used by OSiRIS but requires some pre-setup of Grouper.

puppet-shibboleth: Our web services are authenticated by Shibboleth using InCommon metadata. We use this puppet module to manage the configuration. It is forked from Aethylred/puppet-shibboleth.

puppet-pdsh: Pdsh is a utility for running shell commands via ssh on multiple nodes in parallel. Our module can be used as a puppet exported resource to define and gather node definitions and groupings for pdsh.
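
For illustration, typical pdsh invocations look like the following; the host ranges and group name are made up, and named groups only work once the corresponding group definitions (such as those gathered by our puppet-pdsh module) are in place:

```bash
# Run a command across several nodes in parallel (placeholder hosts/group).
pdsh -w osd[01-04].example.org uptime                      # explicit host list
pdsh -g cephosd 'systemctl is-active ceph-osd.target'      # named group, if group files are configured
```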

Many other internal components are managed by puppet modules also available from our GitHub repository. These include LLDP for network link information, RANCID for network config version control, and a Shibboleth auth module for our internal DokuWiki wiki. Further information on any module can be found in its repository README. We also leverage a large number of modules from Puppet Forge for basic system configuration.