Merge pull request #1230 from Zarquan/20231029-zrq-rsap-request
Updating RSAP requests
Zarquan authored Oct 29, 2023
2 parents 8465815 + 8d61072 commit 97833ec
245 changes: 245 additions & 0 deletions notes/zrq/20231029-01-rsap-request.txt
@@ -0,0 +1,245 @@
#
# <meta:header>
# <meta:licence>
# Copyright (c) 2023, ROE (http://www.roe.ac.uk/)
#
# This information is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This information is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# </meta:licence>
# </meta:header>
#
#zrq-notes-time
#zrq-notes-indent
#zrq-notes-crypto
#zrq-notes-ansible
#zrq-notes-osformat
#zrq-notes-zeppelin
#
# AIMetrics: []
#

Target:

2023/24/25 RSAP request

Result:

Work in progress ...

# -----------------------------------------------------

Cambridge

Live DR4 service
700 cores
1 TiB RAM
Full DR4 dataset, 500 TiB
50 TiB Object Swift
500 TiB CephFS Manila
5 TiB Block Cinder
16 TiB Direct attached SSD Ephemeral

Dev DR4 service
700 cores
1 TiB RAM
Full DR4 dataset, 500 TiB (needed to re-generate table indexing)
50 TiB Object Swift
500 TiB CephFS Manila
5 TiB Block Cinder
16 TiB Direct attached SSD Ephemeral

Total for Cambridge
1400 cores
2 TiB RAM
100 TiB Object Swift
1000 TiB CephFS Manila
10 TiB Block Cinder
32 TiB Direct attached SSD Ephemeral

Somerville

Live/dev DR4 service
700 cores
1 TiB RAM
Full DR4 dataset, 500 TiB
50 TiB Object Swift
500 TiB CephFS Manila
5 TiB Block Cinder
16 TiB Direct attached SSD Ephemeral

STFC cloud RAL

Live/dev DR4 service
700 cores
1 TiB RAM
Full DR4 dataset, 500 TiB
50 TiB Object Swift
500 TiB CephFS Manila
5 TiB Block Cinder
16 TiB Direct attached SSD Ephemeral

# -----------------------------------------------------

RSAP total

2800 cores
4 TiB RAM
200 TiB Object Swift
2000 TiB CephFS Manila
20 TiB Block Cinder
64 TiB Direct attached SSD Ephemeral

# -----------------------------------------------------

Peak estimates:

This resource request is based on estimates for the peak load
expected when Gaia DR4 is released at the end of 2025.

Our current Gaia DR3 dataset is around 8 TiB.
The Gaia DR4 dataset is expected to be on the order of 500 TiB.

The current timeline for Gaia is to release Gaia DR4 in Q4 2025,
although we may start to receive pre-release copies of the data
for internal development and testing during 2024/25.

When DR4 is released in Q4 2025 we are expecting to have to curate multiple
copies of the 500 TiB dataset, and handle a peak in user interest in
response to the publicity surrounding the event.

This means that our resource requirements for this 2024/2025 RSAP round
include an estimate of the resources that will need to be deployed by
the end of 2024, ready to handle the peak load leading up to the DR4
release at the end of 2025.

Resources requested in the 2025/2026 RSAP round would not be deployed,
tested and ready for use in time.

Once our Gaia DR4 platform has been deployed and the expected peak load
has passed, we may be able to release some of the requested resources.

Multiple sites:

We are planning for our main live deployment to be at Cambridge, along
with a second system on the same platform for development and data curation.

The secondary sites at RAL and Somerville will provide scale-out capacity to
handle peak load expected during the DR4 release.

If the secondary sites at RAL and Somerville come online with no problems,
then we may be able to release some of the resources allocated to the second
system at Cambridge.

Our development plan for 2024 includes migrating from the current monolithic
deployment to a more flexible system consisting of an initial website
that handles user accounts and login, backed by a number of notebook services.

This will enable us to scale out to use multiple physical sites for the
notebook services while presenting the user with what appears to be a single
integrated platform.
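
    As a rough illustration of the kind of dispatch logic the front-end website
    could use, a minimal Python sketch follows; the site names, endpoints and
    the load metric are all hypothetical placeholders, not a design decision.

        # Illustrative only: pick the least loaded notebook site for a new user
        # session, so the front end can spread sessions across physical sites.
        # Site names, endpoints and the load metric are hypothetical placeholders.

        NOTEBOOK_SITES = {
            'cambridge':  'https://notebooks.cambridge.example/',
            'somerville': 'https://notebooks.somerville.example/',
            'ral':        'https://notebooks.ral.example/',
            }

        def choose_site(current_load):
            """
            current_load maps a site name to the fraction of its capacity in use.
            """
            name = min(NOTEBOOK_SITES, key=lambda site: current_load.get(site, 1.0))
            return name, NOTEBOOK_SITES[name]

        site, endpoint = choose_site({'cambridge': 0.85, 'somerville': 0.40, 'ral': 0.55})
        print('route new session to', site, endpoint)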

Multiple data copies:

Our development plan for 2024 includes working with mock copies of the DR4
dataset to test our data ingest, indexing and partitioning processes,
and testing new deployment methods that can scale out to cope with this
amount of data.

During 2025 we expect to be ingesting pre-release versions of the DR4
dataset and using them to run a beta-test version of the system for a restricted
set of users, while at the same time running one or more development sites,
and keeping the existing live DR3 site running.

Booking system:

Our development plan for 2024 includes a booking system that enables users
to book sessions ahead of time, smoothing out the peaks in demand and enabling
us to predict the resources needed in advance.

With this in place, we may be able to release resources at some of the physical
sites in response to changes in load.

It is hoped that the booking system can be integrated with Blazar, the OpenStack
resource reservation system, if it becomes available on the OpenStack platforms.
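
    As a minimal sketch of the kind of capacity forecast such a booking system
    could drive, the Python fragment below sums advance bookings into hourly
    slots; the booking fields and numbers are hypothetical examples only.

        # Illustrative only: given a set of advance bookings, estimate how many
        # cores are booked in each hourly slot, so that resources can be requested
        # or released ahead of time. The booking data here is made up.

        from collections import Counter
        from datetime import datetime, timedelta

        bookings = [
            # (start, end, cores)
            (datetime(2025, 1, 6,  9), datetime(2025, 1, 6, 17),  64),
            (datetime(2025, 1, 6, 13), datetime(2025, 1, 6, 15), 128),
            (datetime(2025, 1, 7,  9), datetime(2025, 1, 7, 12),  32),
            ]

        def cores_per_hour(bookings):
            """
            Sum the cores booked in each one hour slot.
            """
            demand = Counter()
            for start, end, cores in bookings:
                slot = start.replace(minute=0, second=0, microsecond=0)
                while slot < end:
                    demand[slot] += cores
                    slot += timedelta(hours=1)
            return demand

        demand = cores_per_hour(bookings)
        print('peak cores booked:', max(demand.values()))
        for slot in sorted(demand):
            print(slot.isoformat(), demand[slot])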

Resource pinning:

Our current allocation at Cambridge is pinned to specific hardware.
This is needed to guarantee that we can release and create sets of resources
without running into problems.

In the past we found that if the OpenStack system is configured to make maximal
use of all the available resources, leaving few idle resources available for new
allocations, then releasing and creating a block of 10+ VMs may fail because some
of the resources get assigned to other processes during the gap between our
release and create steps.

Whether this will be needed at RAL or Somerville will depend on the configuration
of their OpenStack systems.

This issue may be solved by the deployment of Blazar, the OpenStack resource
reservation system.

High memory hypervisors:

A number of the hypervisors at Cambridge are high memory nodes with a higher
memory:vcpu ratio.

These host the high memory VMs required by the Zeppelin head nodes to support
the in-memory processing used by libraries such as HDBSCAN.
(*) cite Dennis's paper ?
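
    For reference, a minimal sketch of the kind of in-memory clustering run these
    nodes support, using the hdbscan Python library; the input data and parameter
    values here are made up.

        # Illustrative only: an in-memory HDBSCAN run of the kind hosted on the
        # high memory Zeppelin head nodes. Input data and parameters are made up.

        import numpy as np
        import hdbscan

        # A few hundred thousand rows held entirely in memory.
        rng = np.random.default_rng(42)
        data = rng.normal(size=(200_000, 3))

        clusterer = hdbscan.HDBSCAN(min_cluster_size=50, min_samples=10)
        labels = clusterer.fit_predict(data)

        # Noise points are labelled -1, so the cluster count is max label + 1.
        print('clusters found:', labels.max() + 1)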

Project specific flavors:

We have a set of project specific flavors at Cambridge, designed to optimise
the packing of VMs into the pinned hypervisors.

This includes a set of 'himem' flavors designed to make use of the high-memory
hypervisors.

We will also need to create a set of 'high storage' flavors to make use of the
direct attached SSD storage when it becomes available.
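
    As an illustration of what creating one of these flavors could look like
    using the openstacksdk Python API; the cloud name, flavor name and sizes
    are placeholders, not the actual Cambridge definitions.

        # Illustrative only: create a project specific 'himem' style flavor with
        # openstacksdk. Cloud name, flavor name and sizes are placeholders.

        import openstack

        conn = openstack.connect(cloud='example-cloud')

        flavor = conn.compute.create_flavor(
            name='gaia.himem.example',
            vcpus=16,
            ram=128 * 1024,    # MiB
            disk=40,           # GiB root disk
            ephemeral=500,     # GiB direct attached SSD, once available
            is_public=False,
            )

        print(flavor.id, flavor.name)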

Direct attached storage:

This request is based on the assumption that direct attached SSD storage will
be able to solve the problems with high IO wait that we have seen with our
current deployment.
We planned to use the SSD resources requested in our 2023 allocation to test
this theory, but the resources are not available yet.

We hope to be able to resolve this question during work planned for 2024.
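
    For illustration, a rough Python sketch of the kind of sequential read
    comparison we could run between CephFS and the direct attached SSD once it
    becomes available; the mount points are hypothetical, and a real test would
    use a dedicated tool such as fio and account for the page cache.

        # Illustrative only: compare sequential read throughput on two mount
        # points. Paths are hypothetical; a real test would use fio.

        import os
        import time

        def read_throughput(path, block_size=64 * 1024 * 1024):
            """
            Read a file sequentially and return the throughput in MiB/s.
            """
            size = os.path.getsize(path)
            start = time.monotonic()
            with open(path, 'rb', buffering=0) as source:
                while source.read(block_size):
                    pass
            elapsed = time.monotonic() - start
            return (size / (1024 * 1024)) / elapsed

        for path in ('/cephfs/test-10GiB.dat', '/mnt/ephemeral/test-10GiB.dat'):
            print(path, round(read_throughput(path)), 'MiB/s')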

CephFS vs object storage:

The current system uses the CephFS storage system at Cambridge, which is based
on spinning disc hard drives.

In theory it should be possible to serve the Gaia dataset via the S3 interface
of Ceph object storage using the OpenStack Swift interface.
However, we have encountered some issues with the Java S3 client libraries
when using this to serve the Gaia DR3 dataset, which is why this request
specifies the majority of the storage, 500 TiB, on CephFS via Manila and
only 50 TiB of object storage via Swift at each site.

50 TiB Object Swift
500 TiB CephFS Manila

If these issues can be solved, and if the resulting S3 interface is fast enough
to provide the bandwidth needed, then we may be able to swap the bulk of the
storage, 500 TiB, from CephFS via Manila to object storage via Swift.

500 TiB Object Swift
50 TiB CephFS Manila

We hope to be able to resolve this question during work planned for 2024.
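
    For reference, a short boto3 sketch of how reading part of a Gaia data file
    over the Ceph S3 compatible interface might look from Python; the endpoint,
    bucket, key, credentials and byte range are placeholders.

        # Illustrative only: a ranged GET from the Ceph S3 compatible interface
        # using boto3. Endpoint, bucket, key and credentials are placeholders.

        import boto3

        s3 = boto3.client(
            's3',
            endpoint_url='https://object-store.example/',
            aws_access_key_id='EXAMPLE-ACCESS-KEY',
            aws_secret_access_key='EXAMPLE-SECRET-KEY',
            )

        # Fetch the first 64 MiB of a (hypothetical) data file.
        response = s3.get_object(
            Bucket='gaia-dr3',
            Key='parquet/part-0000.parquet',
            Range='bytes=0-67108863',
            )

        data = response['Body'].read()
        print(len(data), 'bytes read')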
