Index orphaned replicas (#6626) #6627

nadove-ucsc · 2024-10-10T01:34:46Z

Connected issues: #6626

Checklist

Author

PR is a draft
Target branch is develop
Name of PR branch matches issues/<GitHub handle of author>/<issue#>-<slug>
On ZenHub, PR is connected to all issues it (partially) resolves
PR description links to connected issues
PR title matches¹ that of a connected issue _{or comment in PR explains why they're different}
PR title references all connected issues
For each connected issue, there is at least one commit whose title references that issue

¹ when the issue title describes a problem, the corresponding PR
title is Fix: followed by the issue title

Author (partiality)

Added p tag to titles of partial commits
This PR is labeled partial _{or completely resolves all connected issues}
This PR partially resolves each of the connected issues _{or does not have the partial label}

Author (chains)

This PR is blocked by previous PR in the chain _{or is not chained to another PR}
The blocking PR is labeled base _{or this PR is not chained to another PR}
This PR is labeled chained _{or is not chained to another PR}

Author (reindex, API changes)

Added r tag to commit title _{or the changes introduced by this PR will not require reindexing of any deployment}
This PR is labeled reindex:dev _{or the changes introduced by it will not require reindexing of dev}
This PR is labeled reindex:anvildev _{or the changes introduced by it will not require reindexing of anvildev}
This PR is labeled reindex:anvilprod _{or the changes introduced by it will not require reindexing of anvilprod}
This PR is labeled reindex:prod _{or the changes introduced by it will not require reindexing of prod}
This PR is labeled reindex:partial and its description documents the specific reindexing procedure for dev, anvildev, anvilprod and prod _{or requires a full reindex or carries none of the labels reindex:dev, reindex:anvildev, reindex:anvilprod and reindex:prod}
This PR and its connected issues are labeled API _{or this PR does not modify a REST API}
Added a (A) tag to commit title for backwards (in)compatible changes _{or this PR does not modify a REST API}
Updated REST API version number in app.py _{or this PR does not modify a REST API}

Author (upgrading deployments)

Ran make docker_images.json and committed the resulting changes _{or this PR does not modify azul_docker_images, or any other variables referenced in the definition of that variable}
Documented upgrading of deployments in UPGRADING.rst _{or this PR does not require upgrading deployments}
Added u tag to commit title _{or this PR does not require upgrading deployments}
This PR is labeled upgrade _{or does not require upgrading deployments}
This PR is labeled deploy:shared _{or does not modify docker_images.json, and does not require deploying the shared component for any other reason}
This PR is labeled deploy:gitlab _{or does not require deploying the gitlab component}
This PR is labeled deploy:runner _{or does not require deploying the runner image}

Author (hotfixes)

Added F tag to main commit title _{or this PR does not include permanent fix for a temporary hotfix}
Reverted the temporary hotfixes for any connected issues _{or the none of the stable branches (anvilprod and prod) have temporary hotfixes for any of the issues connected to this PR}

Author (before every review)

Rebased PR branch on develop, squashed old fixups
Ran make requirements_update _{or this PR does not modify requirements*.txt, common.mk, Makefile and Dockerfile}
Added R tag to commit title _{or this PR does not modify requirements*.txt}
This PR is labeled reqs _{or does not modify requirements*.txt}
make integration_test passes in personal deployment _{or this PR does not modify functionality that could affect the IT outcome}

Peer reviewer (after approval)

PR is not a draft
Ticket is in Review requested column
PR is awaiting requested review from system administrator
PR is assigned to only the system administrator

System administrator (after approval)

Actually approved the PR
Labeled connected issues as demo or no demo
Commented on connected issues about demo expectations _{or all connected issues are labeled no demo}
Decided if PR can be labeled no sandbox
A comment to this PR details the completed security design review
PR title is appropriate as title of merge commit
N reviews label is accurate
Moved connected issues to Approved column
PR is assigned to only the operator

Operator (before pushing merge the commit)

System administrator

Background migrations for dev.gitlab are complete _{or this PR is not labeled deploy:gitlab}
Background migrations for anvildev.gitlab are complete _{or this PR is not labeled deploy:gitlab}
PR is assigned to only the operator

Operator (before pushing merge the commit)

Operator (chain shortening)

Changed the target branch of the blocked PR to develop _{or this PR is not labeled base}
Removed the chained label from the blocked PR _{or this PR is not labeled base}
Removed the blocking relationship from the blocked PR _{or this PR is not labeled base}
Removed the base label from this PR _{or this PR is not labeled base}

Operator (after pushing the merge commit)

Operator (reindex)

Operator

Propagated the deploy:shared, deploy:gitlab, deploy:runner, API, reindex:partial, reindex:anvilprod and reindex:prod labels to the next promotion PRs _{or this PR carries none of these labels}
Propagated any specific instructions related to the deploy:shared, deploy:gitlab, deploy:runner, API, reindex:partial, reindex:anvilprod and reindex:prod labels, from the description of this PR to that of the next promotion PRs _{or this PR carries none of these labels}
PR is assigned to no one

Shorthand for review comments

L line is too long
W line wrapping is wrong
Q bad quotes
F other formatting problem

codecov · 2024-10-10T01:57:34Z

Codecov Report

Attention: Patch coverage is 89.62264% with 22 lines in your changes missing coverage. Please review.

Project coverage is 85.57%. Comparing base (cfe7acf) to head (9b6cf31).
Report is 11 commits behind head on develop.

Files with missing lines	Patch %	Lines
src/azul/plugins/repository/tdr_anvil/__init__.py	94.44%	7 Missing ⚠️
test/integration_test.py	0.00%	5 Missing ⚠️
src/azul/plugins/metadata/anvil/bundle.py	50.00%	3 Missing ⚠️
src/azul/plugins/repository/tdr_hca/__init__.py	50.00%	3 Missing ⚠️
src/azul/plugins/repository/canned/__init__.py	50.00%	2 Missing ⚠️
src/azul/plugins/__init__.py	50.00%	1 Missing ⚠️
src/azul/terra.py	85.71%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #6627      +/-   ##
===========================================
+ Coverage    85.50%   85.57%   +0.07%     
===========================================
  Files          155      155              
  Lines        20758    20874     +116     
===========================================
+ Hits         17749    17863     +114     
- Misses        3009     3011       +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coveralls · 2024-10-12T01:18:22Z

coverage: 85.593% (+0.07%) from 85.522%
when pulling 9b6cf31 on issues/nadove-ucsc/6626-index-orphaned-replicas
into cfe7acf on develop.

hannes-ucsc

No showstoppers, approved.

For #6691:

Index: test/integration_test.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/integration_test.py b/test/integration_test.py
--- a/test/integration_test.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/test/integration_test.py	(date 1731020545236)
@@ -1905,7 +1905,11 @@
         source = self._choose_source(catalog)
         # The plugin will raise an exception if the source lacks a prefix
         source = source.with_prefix(Prefix.of_everything)
-        bundle_fqids = self.repository_plugin(catalog).list_bundles(source, '')
+        # REVIEW: We had issues with this part of the test being surprisingly
+        #         slow. We should make sure that the removal of log statements
+        #         from list_bundles doesn't make it harder for use to diagnose
+        #         these types of issues. Maybe we should use the client here.
+        bundle_fqids = self.repository_plugin(catalog).list_bundles(source, prefix='')
         return self.random.choice(sorted(bundle_fqids))
 
     def _can_bundle(self,
Index: src/azul/plugins/repository/canned/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/plugins/repository/canned/__init__.py b/src/azul/plugins/repository/canned/__init__.py
--- a/src/azul/plugins/repository/canned/__init__.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/src/azul/plugins/repository/canned/__init__.py	(date 1731019797504)
@@ -26,9 +26,6 @@
 from furl import (
     furl,
 )
-from more_itertools import (
-    ilen,
-)
 
 from azul import (
     CatalogName,
@@ -165,11 +162,11 @@
 
     def count_bundles(self, source: SOURCE_SPEC) -> int:
         staging_area = self.staging_area(source.spec.name)
-        return ilen(
-            links_id
-            for links_id in staging_area.links
-            if source.prefix is None or links_id.startswith(source.prefix.common)
-        )
+        if source.prefix is None:
+            return len(staging_area.links)
+        else:
+            prefix = source.prefix.common
+            return sum(1 for links_id in staging_area.links if links_id.startswith(prefix))
 
     def list_bundles(self,
                      source: CannedSourceRef,
Index: src/azul/plugins/metadata/anvil/bundle.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/plugins/metadata/anvil/bundle.py b/src/azul/plugins/metadata/anvil/bundle.py
--- a/src/azul/plugins/metadata/anvil/bundle.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/src/azul/plugins/metadata/anvil/bundle.py	(date 1731025228121)
@@ -130,29 +130,27 @@
         pass
 
     def to_json(self) -> MutableJSON:
-        def serialize_entities(entities):
+        def to_json(entities):
             return {
                 str(entity_ref): entity
                 for entity_ref, entity in sorted(entities.items())
             }
 
         return {
-            'entities': serialize_entities(self.entities),
-            'orphans': serialize_entities(self.orphans),
+            'entities': to_json(self.entities),
+            'orphans': to_json(self.orphans),
             'links': [link.to_json() for link in sorted(self.links)]
         }
 
     @classmethod
-    def from_json(cls, fqid: BUNDLE_FQID, json_: JSON) -> Self:
-        def deserialize_entities(json_entities):
+    def from_json(cls, fqid: BUNDLE_FQID, bundle: JSON) -> Self:
+        def from_json(entities):
             return {
                 EntityReference.parse(entity_ref): entity
-                for entity_ref, entity in json_entities.items()
+                for entity_ref, entity in entities.items()
             }
 
-        return cls(
-            fqid=fqid,
-            entities=deserialize_entities(json_['entities']),
-            links=set(map(EntityLink.from_json, json_['links'])),
-            orphans=deserialize_entities(json_['orphans'])
-        )
+        return cls(fqid=fqid,
+                   entities=from_json(bundle['entities']),
+                   links=set(map(EntityLink.from_json, bundle['links'])),
+                   orphans=from_json(bundle['orphans']))
Index: src/azul/plugins/repository/tdr_anvil/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/plugins/repository/tdr_anvil/__init__.py b/src/azul/plugins/repository/tdr_anvil/__init__.py
--- a/src/azul/plugins/repository/tdr_anvil/__init__.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/src/azul/plugins/repository/tdr_anvil/__init__.py	(date 1731043074073)
@@ -11,7 +11,6 @@
     AbstractSet,
     Callable,
     Iterable,
-    Self,
     cast,
 )
 import uuid
@@ -80,63 +79,83 @@
 
 class BundleType(Enum):
     """
-    AnVIL snapshots have no inherent notion of a "bundle". During indexing, we
-    dynamically construct bundles by querying each table in the snapshot. This
-    class enumerates the tables that require special strategies for listing and
-    fetching their bundles.
+    Unlike HCA, AnVIL has no inherent notion of a "bundle". Its data model is
+    strictly relational: each row in a table represents an entity, each entity
+    has a primary key, and entities reference each other via a foreign keys.
+    During indexing, we dynamically construct bundles by querying each table in
+    the snapshot. This class enumerates the tables that require special
+    strategies for listing and fetching their bundles.
 
-    Primary bundles are defined by a biosample entity, termed the bundle entity.
-    Each primary bundle includes all of the bundle entity descendants and all of
-    those those entities' ancestors, which are discovered by iteratively
-    following foreign keys. Biosamples were chosen for this role based on a
-    desirable balance between the size and number of the resulting bundles as
-    well as the degree of overlap between them. The implementation of the graph
-    traversal is tightly coupled to this choice, and switching to a different
-    entity type would require re-implementing much of the Plugin code. Primary
-    bundles consist of at least one biosample (the bundle entity), exactly one
-    dataset, and zero or more other entities of assorted types. Primary bundles
+    Primary bundles are defined by a biosample entity, termed the *bundle
+    entity*. Each primary bundle includes all of the bundle entity's descendants
+    and all of those those entities' ancestors. Descendants and ancestors are
+    discovered by iteratively following foreign keys. Biosamples were chosen to
+    act as the bundle entity for primary bundles based on a desirable balance
+    between the size and number of the resulting bundles as well as the degree
+    of overlap between them. The implementation of the graph traversal is
+    tightly coupled to this choice, and switching to a different entity type
+    would require re-implementing much of the Plugin code. Primary bundles
+    consist of at least one biosample (the bundle entity), exactly one dataset
+    entity, and zero or more other entities of assorted types. Primary bundles
     never contain orphans because they are bijective to rows in the biosample
     table.
 
     Supplementary bundles consist of batches of file entities, which may include
-    supplementary files, which lack any foreign keys that associate them with
-    any other entity. Non-supplementary files in the bundle are classified as
-    orphans. The bundle also includes a dataset entity linked to the
+    supplementary files. The latter lack any foreign keys that would associate
+    them with any other entity. Normal (non-supplementary) files in the bundle
+    are classified as orphans.
+
+    REVIEW: That (above) sounds surprising and may need more explanation.
+
+    Each supplementary bundle also includes the dataset entity linked to the
     supplementary files.
 
-    Duos bundles consist of a single dataset entity. This "entity" includes only
+    DUOS bundles consist of a single dataset entity. This "entity" includes only
     the dataset description retrieved from DUOS, while a copy of the BigQuery
     row for this dataset is also included as an orphan. We chose this design
     because there is only one dataset per snapshot, which is referenced in all
     primary and supplementary bundles. Therefore, only one request to DUOS per
-    *snapshot* is necessary, but if `description` is retrieved at the same time
-    as the other dataset fields, we will make one request per *bundle* instead,
-    potentially overloading the DUOS service. Our solution is to retrieve
-    `description` only in a dedicated bundle format, once per snapshot, and
-    merge it with the other dataset fields during aggregation.
+    *snapshot* is necessary. If the DUOS `description` were retrieved at the
+    same time as the other fields of the dataset entity, we would make one
+    request per *bundle* instead, potentially overloading the DUOS service. Our
+    solution is to retrieve `description` only in a bundle of this dedicated
+    DUOS type, once per snapshot, and merge it with the other dataset fields
+    during aggregation.
 
     All other bundles are replica bundles. Replica bundles consist of a batch of
     rows from an arbitrary BigQuery table, which may or may not be described by
     the AnVIL schema. Replica bundles only include orphans and have no links.
+
+    REVIEW: Confusingly worded. I think what we mean is that the replicas are
+            stored in the `orphans` attribute. We may need to find a new name
+            for that attribute.
     """
     primary = 'anvil_biosample'
     supplementary = 'anvil_file'
     duos = 'anvil_dataset'
 
-    def is_batched(self: Self | str) -> bool:
+    # REVIEW: I'm getting type errors and PyCharm warnings with the original approach
+
+    @classmethod
+    def is_batched(cls, table_name: str) -> bool:
         """
-        >>> BundleType.primary.is_batched()
+        True if bundles for the table of the given name represent batches of
+        rows, or if each bundle represents a single row.
+
+        >>> BundleType.primary.is_batched
         False
 
         >>> BundleType.is_batched('anvil_activity')
         True
         """
-        if isinstance(self, str):
-            try:
-                self = BundleType(self)
-            except ValueError:
-                return True
-        return self not in (BundleType.primary, BundleType.duos)
+        return table_name not in (BundleType.primary.value, BundleType.duos.value)
+
+
+# REVIEW: The change from method to attribute may require more changes at the
+#         usage sites
+
+for bundle_type in BundleType:
+    bundle_type.is_batched = BundleType.is_batched(bundle_type.value)
 
 
 class TDRAnvilBundleFQIDJSON(SourcedBundleFQIDJSON):
@@ -245,28 +264,29 @@
         self._assert_source(source)
         bundles = []
         spec = source.spec
+
         if config.duos_service_url is not None:
+            # We intentionally omit the WHERE clause for datasets in order to
+            # verify our assumption that each snapshot only contains rows for a
+            # single dataset. This verification is performed independently and
+            # concurrently for every partition, but only one partition actually
+            # emits the bundle.
             row = one(self._run_sql(f'''
                 SELECT datarepo_row_id
                 FROM {backtick(self._full_table_name(spec, BundleType.duos.value))}
             '''))
             dataset_row_id = row['datarepo_row_id']
-            # We intentionally omit the WHERE clause for datasets in order
-            # to verify our assumption that each snapshot only contains rows
-            # for a single dataset. This verification is performed
-            # independently and concurrently for every partition, but only
-            # one partition actually emits the bundle.
             if dataset_row_id.startswith(prefix):
                 bundle_uuid = change_version(dataset_row_id,
                                              self.datarepo_row_uuid_version,
                                              self.bundle_uuid_version)
-                bundles.append(TDRAnvilBundleFQID(
-                    uuid=bundle_uuid,
-                    version=self._version,
-                    source=source,
-                    table_name=BundleType.duos.value,
-                    batch_prefix=None,
-                ))
+                bundle_fqid = TDRAnvilBundleFQID(uuid=bundle_uuid,
+                                                 version=self._version,
+                                                 source=source,
+                                                 table_name=BundleType.duos.value,
+                                                 batch_prefix=None)
+                bundles.append(bundle_fqid)
+
         for row in self._run_sql(f'''
             SELECT datarepo_row_id
             FROM {backtick(self._full_table_name(spec, BundleType.primary.value))}
@@ -275,24 +295,26 @@
             bundle_uuid = change_version(row['datarepo_row_id'],
                                          self.datarepo_row_uuid_version,
                                          self.bundle_uuid_version)
-            bundles.append(TDRAnvilBundleFQID(
-                uuid=bundle_uuid,
-                version=self._version,
-                source=source,
-                table_name=BundleType.primary.value,
-                batch_prefix=None,
-            ))
+            bundle_fqid = TDRAnvilBundleFQID(uuid=bundle_uuid,
+                                             version=self._version,
+                                             source=source,
+                                             table_name=BundleType.primary.value,
+                                             batch_prefix=None)
+            bundles.append(bundle_fqid)
+
         prefix_lengths_by_table = self._batch_tables(source.spec, prefix)
         for table_name, (batch_prefix_length, _) in prefix_lengths_by_table.items():
             batch_prefixes = Prefix(common=prefix,
                                     partition=batch_prefix_length - len(prefix)).partition_prefixes()
             for batch_prefix in batch_prefixes:
                 bundle_uuid = self._batch_uuid(spec, table_name, batch_prefix)
-                bundles.append(TDRAnvilBundleFQID(uuid=bundle_uuid,
-                                                  version=self._version,
-                                                  source=source,
-                                                  table_name=table_name,
-                                                  batch_prefix=batch_prefix))
+                bundle_fqid = TDRAnvilBundleFQID(uuid=bundle_uuid,
+                                                 version=self._version,
+                                                 source=source,
+                                                 table_name=table_name,
+                                                 batch_prefix=batch_prefix)
+                bundles.append(bundle_fqid)
+
         return bundles
 
     def _emulate_bundle(self, bundle_fqid: TDRAnvilBundleFQID) -> TDRAnvilBundle:
@@ -346,6 +368,11 @@
         table_names = sorted(filter(BundleType.is_batched, self.tdr.list_tables(source)))
         log.info('Calculating batch prefix lengths for partition %r of %d tables '
                  'in source %s', prefix, len(table_names), source)
+
+        # REVIEW: This needs a FIXME. The respective issue should have a
+        #         reproduction, maybe in the form of a diff removing the
+        #         workaround, and the resulting unit test failure.
+
         # The extraneous outer 'SELECT *' works around a bug in BigQuery emulator
         query = ' UNION ALL '.join(f'''(
             SELECT * FROM (
Index: src/azul/indexer/index_service.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/indexer/index_service.py b/src/azul/indexer/index_service.py
--- a/src/azul/indexer/index_service.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/src/azul/indexer/index_service.py	(date 1731044069661)
@@ -212,6 +212,9 @@
         for contributions, replicas in transforms:
             tallies.update(self.contribute(catalog, contributions))
             self.replicate(catalog, replicas)
+
+        # REVIEW: The addition of this conditional seems like an optimization
+        #         that is unrelated to the other changes in that commit
         if tallies:
             self.aggregate(tallies)
 
@@ -237,6 +240,9 @@
             tallies.update(self.contribute(catalog, contributions))
         # FIXME: Replica index does not support deletions
         #        https://github.com/DataBiosphere/azul/issues/5846
+
+        # REVIEW: Should this also be conditional like above?
+
         self.aggregate(tallies)
 
     def deep_transform(self,
Index: src/azul/plugins/metadata/anvil/indexer/transform.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/plugins/metadata/anvil/indexer/transform.py b/src/azul/plugins/metadata/anvil/indexer/transform.py
--- a/src/azul/plugins/metadata/anvil/indexer/transform.py	(revision 251c79e7791982fef83293ee40f83be8694466ea)
+++ b/src/azul/plugins/metadata/anvil/indexer/transform.py	(date 1731025001923)
@@ -169,6 +169,8 @@
             assert False, entity_type
 
     def estimate(self, partition: BundlePartition) -> int:
+        # REVIEW: I don't quite understand the pat after "but". *All* orphans will be replicated by one partition? 
+
         # Orphans are not considered when deciding whether to partition the
         # bundle, but if the bundle is partitioned then each orphan will be
         # replicated in a single partition
@@ -577,14 +579,16 @@
                   partition: BundlePartition
                   ) -> Iterable[Contribution | Replica]:
         yield from super().transform(partition)
+        # REVIEW: I think *to coalesce* is rarely seen in passive tense, as in
+        #         "The cells are coalesced" but rather "The cells coalesce"
         if config.enable_replicas:
-            # Replicas are only emitted by the file transformer for entities
-            # that are linked to at least one file. This excludes all orphans,
-            # and a small number of linked entities, usually from primary
-            # bundles don't include any files. Some of the replicas we emit here
-            # will be redundant with those emitted by the file transformer, but
-            # these will be coalesced by the index service before they are
-            # written to ElasticSearch.
+            # The file transformer only emits replicas for entities that are
+            # linked to at least one file. This excludes all orphans, and a
+            # small number of linked entities, usually from primary bundles that
+            # don't include any files. Some of the replicas we emit here will be
+            # redundant with those emitted by the file transformer, but these
+            # will be consolidated by the index service before they are written
+            # to ElasticSearch.
             dataset = self._only_dataset()
             for entity in chain(self.bundle.orphans, self.bundle.entities):
                 if partition.contains(UUID(entity.entity_id)):

hannes-ucsc · 2024-11-08T06:17:50Z

Security design review

src/azul/plugins/repository/tdr_anvil/__init__.py

nadove-ucsc added chained [process] PR needs to based of develop before merging reindex:anvildev [process] PR requires reindexing anvildev reindex:anvilprod [process] PR requires reindexing anvilprod labels Oct 10, 2024

github-actions bot added the orange [process] Done by the Azul team label Oct 10, 2024

nadove-ucsc changed the base branch from develop to issues/nadove-ucsc/6615-use-datasets-projects-as-hub-ids October 10, 2024 01:36

nadove-ucsc mentioned this pull request Oct 11, 2024

Use projects/datasets as hub IDs (#6615) #6616

Closed

nadove-ucsc force-pushed the issues/nadove-ucsc/6615-use-datasets-projects-as-hub-ids branch 3 times, most recently from daf9bfa to 7e3e665 Compare October 11, 2024 18:50

nadove-ucsc changed the base branch from issues/nadove-ucsc/6615-use-datasets-projects-as-hub-ids to develop October 11, 2024 21:53

nadove-ucsc added base [process] Another PR needs to be rebased before merging this one and removed chained [process] PR needs to based of develop before merging labels Oct 11, 2024

nadove-ucsc force-pushed the issues/nadove-ucsc/6626-index-orphaned-replicas branch 3 times, most recently from d0edc4b to d207786 Compare October 12, 2024 00:57

nadove-ucsc force-pushed the issues/nadove-ucsc/6626-index-orphaned-replicas branch 12 times, most recently from c8b0767 to a95de72 Compare October 17, 2024 01:33

nadove-ucsc added the reindex:dev [process] PR requires reindexing dev label Oct 17, 2024

nadove-ucsc marked this pull request as ready for review November 7, 2024 21:53

nadove-ucsc assigned hannes-ucsc Nov 7, 2024

hannes-ucsc previously approved these changes Nov 8, 2024

View reviewed changes

hannes-ucsc added 0 reviews [process] Lead didn't request any changes sandbox [process] Resolution is being verified in sandbox deployment labels Nov 8, 2024

hannes-ucsc assigned achave11-ucsc and unassigned hannes-ucsc Nov 8, 2024

achave11-ucsc removed their assignment Nov 8, 2024

nadove-ucsc dismissed hannes-ucsc’s stale review via 043298e November 8, 2024 20:16

nadove-ucsc force-pushed the issues/nadove-ucsc/6626-index-orphaned-replicas branch from 251c79e to 043298e Compare November 8, 2024 20:16

github-advanced-security bot found potential problems Nov 8, 2024

View reviewed changes

src/azul/plugins/repository/tdr_anvil/__init__.py Fixed Show fixed Hide fixed

nadove-ucsc force-pushed the issues/nadove-ucsc/6626-index-orphaned-replicas branch 3 times, most recently from 24a106a to 2c7de0b Compare November 8, 2024 20:59

nadove-ucsc requested a review from hannes-ucsc November 8, 2024 22:12

nadove-ucsc assigned hannes-ucsc Nov 8, 2024

nadove-ucsc added 3 commits November 8, 2024 16:40

[r] Index orphaned replicas (#6626)

104a73b

Add FIXME (#6647)

fc10972

[r] Include parent dataset/project in hub IDs (#6626)

9b6cf31

hannes-ucsc force-pushed the issues/nadove-ucsc/6626-index-orphaned-replicas branch from 2704c7b to 9b6cf31 Compare November 9, 2024 00:40

hannes-ucsc approved these changes Nov 9, 2024

View reviewed changes

hannes-ucsc mentioned this pull request Nov 9, 2024

Left-overs from orphan-related PRs #6691

Open

hannes-ucsc assigned achave11-ucsc and unassigned hannes-ucsc Nov 9, 2024

achave11-ucsc merged commit 0b3b6b4 into develop Nov 9, 2024
11 checks passed

achave11-ucsc removed the base [process] Another PR needs to be rebased before merging this one label Nov 9, 2024

achave11-ucsc deleted the issues/nadove-ucsc/6626-index-orphaned-replicas branch November 12, 2024 17:06

achave11-ucsc assigned dsotirho-ucsc and unassigned achave11-ucsc Nov 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Index orphaned replicas (#6626) #6627

Index orphaned replicas (#6626) #6627

nadove-ucsc commented Oct 10, 2024 •

edited by dsotirho-ucsc

Loading

codecov bot commented Oct 10, 2024 •

edited

Loading

coveralls commented Oct 12, 2024 •

edited

Loading

hannes-ucsc left a comment

hannes-ucsc commented Nov 8, 2024

Index orphaned replicas (#6626) #6627

Index orphaned replicas (#6626) #6627

Conversation

nadove-ucsc commented Oct 10, 2024 • edited by dsotirho-ucsc Loading

Checklist

Author

Author (partiality)

Author (chains)

Author (reindex, API changes)

Author (upgrading deployments)

Author (hotfixes)

Author (before every review)

Peer reviewer (after approval)

System administrator (after approval)

Operator (before pushing merge the commit)

System administrator

Operator (before pushing merge the commit)

Operator (chain shortening)

Operator (after pushing the merge commit)

Operator (reindex)

Operator

Shorthand for review comments

codecov bot commented Oct 10, 2024 • edited Loading

Codecov Report

coveralls commented Oct 12, 2024 • edited Loading

hannes-ucsc left a comment

Choose a reason for hiding this comment

hannes-ucsc commented Nov 8, 2024

Security design review

nadove-ucsc commented Oct 10, 2024 •

edited by dsotirho-ucsc

Loading

codecov bot commented Oct 10, 2024 •

edited

Loading

coveralls commented Oct 12, 2024 •

edited

Loading