-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persist repository properties in database #4179
Labels
enhancement
New feature or request
Comments
jhrozek
added a commit
to jhrozek/minder
that referenced
this issue
Aug 29, 2024
Adds a utility database function to search entities by ID. Because we store entity properties as JSON, we add a GIN index. Testing with 50.000 entities we get the following performance: ``` Nested Loop (cost=30.39..109.45 rows=5 width=87) (actual time=0.290..0.292 rows=1 loops=1) -> Bitmap Heap Scan on properties p (cost=30.10..67.90 rows=5 width=16) (actual time=0.216..0.218 rows=1 loops=1) " Recheck Cond: (value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb)" Filter: (key = 'upstream_id'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on idx_properties_value_gin (cost=0.00..30.10 rows=10 width=0) (actual time=0.174..0.175 rows=1 loops=1) " Index Cond: (value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb)" -> Index Scan using entity_instances_pkey on entity_instances ei (cost=0.29..8.31 rows=1 width=87) (actual time=0.069..0.069 rows=1 loops=1) Index Cond: (id = p.entity_id) Filter: (entity_type = 'repository'::entities) Planning Time: 2.365 ms Execution Time: 0.590 ms ``` Compared to not using the index: ``` Nested Loop (cost=0.29..3165.63 rows=5 width=87) (actual time=0.117..42.194 rows=1 loops=1) -> Seq Scan on properties p (cost=0.00..3124.07 rows=5 width=16) (actual time=0.032..42.108 rows=1 loops=1) " Filter: ((value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb) AND (key = 'upstream_id'::text))" Rows Removed by Filter: 100004 -> Index Scan using entity_instances_pkey on entity_instances ei (cost=0.29..8.31 rows=1 width=87) (actual time=0.082..0.082 rows=1 loops=1) Index Cond: (id = p.entity_id) Filter: (entity_type = 'repository'::entities) Planning Time: 1.104 ms Execution Time: 42.300 ms ``` Related: mindersec#4179
JAORMX
pushed a commit
that referenced
this issue
Aug 29, 2024
Adds a utility database function to search entities by ID. Because we store entity properties as JSON, we add a GIN index. Testing with 50.000 entities we get the following performance: ``` Nested Loop (cost=30.39..109.45 rows=5 width=87) (actual time=0.290..0.292 rows=1 loops=1) -> Bitmap Heap Scan on properties p (cost=30.10..67.90 rows=5 width=16) (actual time=0.216..0.218 rows=1 loops=1) " Recheck Cond: (value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb)" Filter: (key = 'upstream_id'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on idx_properties_value_gin (cost=0.00..30.10 rows=10 width=0) (actual time=0.174..0.175 rows=1 loops=1) " Index Cond: (value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb)" -> Index Scan using entity_instances_pkey on entity_instances ei (cost=0.29..8.31 rows=1 width=87) (actual time=0.069..0.069 rows=1 loops=1) Index Cond: (id = p.entity_id) Filter: (entity_type = 'repository'::entities) Planning Time: 2.365 ms Execution Time: 0.590 ms ``` Compared to not using the index: ``` Nested Loop (cost=0.29..3165.63 rows=5 width=87) (actual time=0.117..42.194 rows=1 loops=1) -> Seq Scan on properties p (cost=0.00..3124.07 rows=5 width=16) (actual time=0.032..42.108 rows=1 loops=1) " Filter: ((value @> '{""value"": ""MBymWMNbcv"", ""version"": ""v1""}'::jsonb) AND (key = 'upstream_id'::text))" Rows Removed by Filter: 100004 -> Index Scan using entity_instances_pkey on entity_instances ei (cost=0.29..8.31 rows=1 width=87) (actual time=0.082..0.082 rows=1 loops=1) Index Cond: (id = p.entity_id) Filter: (entity_type = 'repository'::entities) Planning Time: 1.104 ms Execution Time: 42.300 ms ``` Related: #4179
jhrozek
added a commit
to jhrozek/minder
that referenced
this issue
Aug 29, 2024
… repository service Related: mindersec#4179
10 tasks
jhrozek
added a commit
to jhrozek/minder
that referenced
this issue
Aug 29, 2024
… repository service Related: mindersec#4179
jhrozek
added a commit
that referenced
this issue
Aug 29, 2024
…4299) * Accept transaction from outside in the property service If we had called these methods in a transaction, we would have effectivelly created another parallel transaction which might cause entities to not be referenced correctly. * Add FilteredCopy and Merge to properties These will be useful later. * Add EntityInstance model This is just a shortcut to avoid passing the properties and project ID and entity etc around * Decouple entities from fetching properties This makes it easier to handle multiple entities inside the github provider as all that is needed is to implement the PropertyFetcher interface. It also makes moves the logic around formatting the name and what properties are operational into the provider to code that is per-entity. * Handle operational attributes minder saves attributes like hook ID which do not come from the upstream entity. We need to avoid deleting them on updates. * Add a method to refresh a repository + lazy migration into the github repository service Related: #4179 * Switch over webhook handling to use property entities Changes fetchRepo to refresh the repository being processed and return an EntityInstance model structure instead of a database repo Related: #4173 * Propagate 404 from the property wrappers as a special error code
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
No description provided.
The text was updated successfully, but these errors were encountered: