From 8d6107241a3edf2dfbf224615d334f0fcf5e3a53 Mon Sep 17 00:00:00 2001
From: Zarquan
Date: Sun, 29 Oct 2023 09:43:15 +0000
Subject: [PATCH] Updating RSAP requests

---
 notes/zrq/20231029-01-rsap-request.txt | 245 +++++++++++++++++++++++++
 1 file changed, 245 insertions(+)
 create mode 100644 notes/zrq/20231029-01-rsap-request.txt

diff --git a/notes/zrq/20231029-01-rsap-request.txt b/notes/zrq/20231029-01-rsap-request.txt
new file mode 100644
index 00000000..641ea0c4
--- /dev/null
+++ b/notes/zrq/20231029-01-rsap-request.txt
@@ -0,0 +1,245 @@
+#
+#
+#
+# Copyright (c) 2023, ROE (http://www.roe.ac.uk/)
+#
+# This information is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This information is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+#
+#
+#zrq-notes-time
+#zrq-notes-indent
+#zrq-notes-crypto
+#zrq-notes-ansible
+#zrq-notes-osformat
+#zrq-notes-zeppelin
+#
+# AIMetrics: []
+#
+
+    Target:
+
+        2023/24/25 RSAP request
+
+    Result:
+
+        Work in progress ...
+
+# -----------------------------------------------------
+
+    Cambridge
+
+        Live DR4 service
+            700 cores
+            1 TiB RAM
+            Full DR4 dataset, 500 TiB
+            50 TiB Object Swift
+            500 TiB CephFS Manila
+            5 TiB Block Cinder
+            16 TiB Direct attached SSD Ephemeral
+
+        Dev DR4 service
+            700 cores
+            1 TiB RAM
+            Full DR4 dataset, 500 TiB (needed to re-generate table indexing)
+            50 TiB Object Swift
+            500 TiB CephFS Manila
+            5 TiB Block Cinder
+            16 TiB Direct attached SSD Ephemeral
+
+        Total for Cambridge
+            1400 cores
+            2 TiB RAM
+            100 TiB Object Swift
+            1000 TiB CephFS Manila
+            10 TiB Block Cinder
+            32 TiB Direct attached SSD Ephemeral
+
+    Somerville
+
+        Live/dev DR4 service
+            700 cores
+            1 TiB RAM
+            Full DR4 dataset, 500 TiB
+            50 TiB Object Swift
+            500 TiB CephFS Manila
+            5 TiB Block Cinder
+            16 TiB Direct attached SSD Ephemeral
+
+    STFC cloud RAL
+
+        Live/dev DR4 service
+            700 cores
+            1 TiB RAM
+            Full DR4 dataset, 500 TiB
+            50 TiB Object Swift
+            500 TiB CephFS Manila
+            5 TiB Block Cinder
+            16 TiB Direct attached SSD Ephemeral
+
+# -----------------------------------------------------
+
+    RSAP total
+
+        2800 cores
+        4 TiB RAM
+        200 TiB Object Swift
+        2000 TiB CephFS Manila
+        20 TiB Block Cinder
+        64 TiB Direct attached SSD Ephemeral
+
+# -----------------------------------------------------
+
+    Peak estimates:
+
+        This resource request is based on estimates for the peak load
+        expected when Gaia DR4 is released at the end of 2025.
+
+        Our current Gaia DR3 dataset is around 8 TiB.
+        The Gaia DR4 dataset is expected to be on the order of 500 TiB.
+
+        The current timeline for Gaia is to release Gaia DR4 in Q4 2025,
+        although we may start to receive pre-release copies of the data
+        for internal development and testing during 2024/25.
+
+        When DR4 is released in Q4 2025 we expect to have to curate
+        multiple copies of the 500 TiB dataset, and to handle a peak of
+        interest from users in response to the publicity surrounding
+        the event.
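+        As a sanity check on the per-site numbers and the RSAP total
+        above, a minimal sketch in plain Python that sums the four
+        service requests; the names and tuple layout are illustrative
+        only, not part of any deployment:
+
+            # Per service: (cores, RAM TiB, Swift TiB, Manila TiB,
+            #               Cinder TiB, ephemeral SSD TiB).
+            # Cambridge appears twice, for the live and dev services.
+            requests = {
+                "Cambridge live": (700, 1, 50, 500, 5, 16),
+                "Cambridge dev":  (700, 1, 50, 500, 5, 16),
+                "Somerville":     (700, 1, 50, 500, 5, 16),
+                "STFC cloud RAL": (700, 1, 50, 500, 5, 16),
+            }
+
+            # Transpose and sum each column across the four services.
+            totals = [sum(column) for column in zip(*requests.values())]
+
+            # Prints [2800, 4, 200, 2000, 20, 64], matching the
+            # RSAP total above.
+            print(totals)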
+        This means that our resource requirements for this 2024/2025
+        RSAP round include an estimate of the resources that will need
+        to be deployed by the end of 2024, ready to handle the peak
+        load leading up to the DR4 release at the end of 2025.
+
+        Resources requested in the 2025/2026 RSAP round will not be
+        deployed, tested and ready for use in time.
+
+        Once our Gaia DR4 platform has been deployed and the expected
+        peak load has passed, we may be able to release some of the
+        requested resources.
+
+    Multiple sites:
+
+        We are planning for our main live deployment to be at Cambridge,
+        along with a second system on the same platform for development
+        and data curation.
+
+        The secondary sites at RAL and Somerville will provide scale-out
+        capacity to handle the peak load expected during the DR4 release.
+
+        If the secondary sites at RAL and Somerville come online with no
+        problems, then we may be able to release some of the resources
+        used by the second system at Cambridge.
+
+        Our development plan for 2024 includes migrating from the
+        current monolithic deployment to a more flexible system
+        consisting of an initial website that handles user accounts and
+        login, backed by a number of notebook services.
+
+        This will enable us to scale out to use multiple physical sites
+        for the notebook services while presenting the user with what
+        appears to be a single integrated platform.
+
+    Multiple data copies:
+
+        Our development plan for 2024 includes working with mock copies
+        of the DR4 dataset to test our data ingest, indexing and
+        partitioning processes, and testing new deployment methods that
+        can scale out to cope with this amount of data.
+
+        During 2025 we expect to be ingesting pre-release versions of
+        the DR4 dataset and using them to run a beta-test version of
+        the system for a restricted set of users, while at the same
+        time running one or more development sites, and keeping the
+        existing live DR3 site running.
+
+    Booking system:
+
+        Our development plan for 2024 includes a booking system that
+        enables users to book sessions ahead of time, smoothing out the
+        peaks in demand and enabling us to predict the resources needed
+        in advance.
+
+        With this in place, we may be able to release resources at some
+        of the physical sites in response to changes in load.
+
+        It is hoped that the booking system can be integrated with
+        Blazar, the Openstack resource reservation system, if it
+        becomes available on the Openstack platforms.
+
+    Resource pinning:
+
+        Our current allocation at Cambridge is pinned to specific
+        hardware. This is needed to guarantee that we can release and
+        create sets of resources without running into problems.
+
+        In the past we found that if the Openstack system is configured
+        to make maximal use of all the available resources, leaving few
+        idle resources available for new allocations, then releasing
+        and creating a block of 10+ VMs may fail, because some of the
+        resources get assigned to other processes during the gap
+        between our release and create steps.
+
+        Whether this will be needed at RAL or Somerville will depend on
+        the configuration of their Openstack systems.
+
+        This issue may be solved by the deployment of Blazar, the
+        Openstack resource reservation system.
+
+    High memory hypervisors:
+
+        A number of the hypervisors at Cambridge are high memory nodes
+        with a higher memory:vcpu ratio.
+
+        These host the high memory VMs required by the Zeppelin head
+        nodes to support the in-memory processing used by libraries
+        such as HDBSCAN.
+        (*) cite Dennis's paper ?
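+        As an illustration of how the 'himem' flavors described in the
+        next section map onto these nodes, a minimal sketch using the
+        openstacksdk Python client; the cloud name, flavor name and the
+        8 GiB per vcpu ratio are illustrative assumptions, not our
+        actual configuration:
+
+            import openstack
+
+            # Connect using a named cloud from clouds.yaml; creating
+            # flavors needs an admin role.
+            conn = openstack.connect(cloud='cambridge')
+
+            # 4 vcpus at 8 GiB per vcpu gives a 32 GiB himem flavor.
+            flavor = conn.compute.create_flavor(
+                name='gaia.himem.4',
+                vcpus=4,
+                ram=4 * 8 * 1024,  # RAM is specified in MiB.
+                disk=20,           # Root disk in GiB.
+                )
+            print(flavor.id)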
+    Project specific flavors:
+
+        We have a set of project specific flavors at Cambridge, designed
+        to optimise the packing of VMs onto the pinned hypervisors.
+
+        This includes a set of 'himem' flavors designed to make use of
+        the high-memory hypervisors.
+
+        We will also need to create a set of 'high storage' flavors to
+        make use of the direct attached SSD storage when it becomes
+        available.
+
+    Direct attached storage:
+
+        This request is based on the assumption that direct attached
+        SSD storage will be able to solve the problems with high IO
+        wait that we have seen with our current deployment.
+        We planned to use the SSD resources requested in our 2023
+        allocation to test this theory, but the resources are not
+        available yet.
+
+        We hope to be able to resolve this question during work planned
+        for 2024.
+
+    CephFS vs object storage:
+
+        The current system uses the CephFS storage system at Cambridge,
+        which is based on spinning disc hard drives.
+
+        In theory it should be possible to serve the Gaia dataset via
+        the S3 interface of Ceph object storage using the Openstack
+        Swift interface.
+        However, we have encountered some issues with the Java S3
+        client libraries when using this to serve the Gaia DR3 dataset.
+        This is why this request specifies the majority of the storage,
+        500 TiB, on CephFS via Manila and only 50 TiB of object storage
+        via Swift at each site.
+
+            50 TiB Object Swift
+            500 TiB CephFS Manila
+
+        If these issues can be solved, and if the resulting S3
+        interface is fast enough to provide the bandwidth needed, then
+        we may be able to swap the bulk of the storage, 500 TiB, from
+        CephFS via Manila to object storage via Swift.
+
+            500 TiB Object Swift
+            50 TiB CephFS Manila
+
+        We hope to be able to resolve this question during work planned
+        for 2024.
+
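+        As a reference point for those 2024 tests, a minimal sketch of
+        reading one object from an S3 endpoint; this uses the Python
+        boto3 client rather than the Java libraries mentioned above,
+        and the endpoint URL, credentials, bucket and key are
+        illustrative placeholders:
+
+            import boto3
+            from botocore.config import Config
+
+            # Ceph RGW endpoints usually want path-style addressing
+            # rather than the virtual-host style used by AWS itself.
+            s3 = boto3.client(
+                's3',
+                endpoint_url='https://s3.example.ac.uk',
+                aws_access_key_id='EC2-ACCESS-KEY',
+                aws_secret_access_key='EC2-SECRET-KEY',
+                config=Config(s3={'addressing_style': 'path'}),
+                )
+
+            # Stream one partition of the dataset, as a simple
+            # bandwidth test.
+            response = s3.get_object(Bucket='gaia-dr3', Key='part-0001.parquet')
+            data = response['Body'].read()
+            print(len(data))
+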