This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Land refactored/fixed room stats #5971

Merged
merged 113 commits into from
Sep 4, 2019
8374bcb
Newsfile
reivilibre Aug 19, 2019
8de9ebe
Tear out current room & user statistics (#5880)
reivilibre Aug 20, 2019
d7675e7
Add schema for Separated Statistics
reivilibre Aug 20, 2019
80a1c6e
Add storage function for storing stats deltas
reivilibre Aug 20, 2019
e4cbea6
Handle state deltas and turn them into stats deltas
reivilibre Aug 20, 2019
1819563
Ack, isort!
reivilibre Aug 20, 2019
b5573c0
Update synapse/storage/stats.py
reivilibre Aug 20, 2019
4a97eef
Update synapse/storage/stats.py
reivilibre Aug 20, 2019
6a19f7e
Add room and user statistics documentation.
reivilibre Aug 20, 2019
981c6cf
Sanitise accepted fields in `_update_stats_delta_txn`
reivilibre Aug 20, 2019
977310e
Clarify `_update_stats_delta_txn`
reivilibre Aug 20, 2019
eafa8d3
Unify name of 'stats regenerator' in schema comments.
reivilibre Aug 20, 2019
18a4c03
Remove needless defaults.
reivilibre Aug 20, 2019
7b657f1
Simplify table structure
reivilibre Aug 22, 2019
e8fc180
Fix up SQL schema delta
reivilibre Aug 22, 2019
79252d1
Fix up historical stats support.
reivilibre Aug 22, 2019
c3d2bf2
Allow schema deltas to be engine-specific
reivilibre Aug 27, 2019
1ecd1a6
Use engine-specific delta SQL files rather than delta written in Python.
reivilibre Aug 27, 2019
baeaf00
Merge branch 'develop' into rei/rss_target
reivilibre Aug 27, 2019
5043ef8
Merge branch 'rei/rss_target' into rei/rss_inc2
reivilibre Aug 27, 2019
4b7bf2e
Apply suggestions from code review
reivilibre Aug 27, 2019
81c5289
Clarify `_update_stats_delta_txn` by adding code comments and kwargs.
reivilibre Aug 27, 2019
544ba2c
Apply minor suggestions from review
reivilibre Aug 27, 2019
a6c1020
Lock tables in upsert fall-backs.
reivilibre Aug 27, 2019
736ac58
Code formatting (Black)
reivilibre Aug 27, 2019
09cbc3a
Switch to milliseconds in room/user stats for consistency.
reivilibre Aug 27, 2019
c775f31
Don't include the room & user stats docs in this PR.
reivilibre Aug 27, 2019
bc754cd
Merge branch 'rei/rss_inc2' into rei/rss_inc3
reivilibre Aug 27, 2019
3b09a37
Adapt to stats now working in milliseconds
reivilibre Aug 27, 2019
99c88ac
No-op if no membership change and thus simplify verbose dict updates.
reivilibre Aug 27, 2019
dd8e602
For user stats, handle other membership transitions properly.
reivilibre Aug 27, 2019
491eaf0
Remove obsolete `OldCollectionRequired` as old collection is obsolete.
reivilibre Aug 27, 2019
11c4e50
Rename `room_state` table to `room_stats_state`
reivilibre Aug 27, 2019
62b1250
Update `_purge_room_txn` to take account of separated stats tables
reivilibre Aug 27, 2019
07c267c
For user stats, handle other membership transitions properly.
reivilibre Aug 27, 2019
44d3c2e
Invalidate `get_earliest_token_for_stats` cache as required.
reivilibre Aug 27, 2019
10c1a23
Fix logic error.
reivilibre Aug 27, 2019
324f21b
Fix logic error.
reivilibre Aug 27, 2019
064143c
Use `DeferredLock` instead of `threading.Lock`
reivilibre Aug 27, 2019
1af7866
Clean up code with improved naming and hoist around functions.
reivilibre Aug 27, 2019
b9f1adc
Update synapse/storage/stats.py
reivilibre Aug 28, 2019
a344ad3
Code formatting (Black)
reivilibre Aug 28, 2019
cc66cf1
Merge pull request #5889 from matrix-org/rei/rss_inc2
reivilibre Aug 28, 2019
dfb22fe
Merge branch 'rei/rss_target' into rei/rss_inc3
reivilibre Aug 28, 2019
81aa6d5
Address code review comments
reivilibre Aug 28, 2019
3cdce28
Merge pull request #5890 from matrix-org/rei/rss_inc3
reivilibre Aug 28, 2019
bc2c284
Add `total_event_bytes` to room statistics schema.
reivilibre Aug 28, 2019
a13ad21
Add incremental counting for rooms' total events and total event bytes.
reivilibre Aug 28, 2019
d7a692f
Update total_events and total_event_bytes on new events.
reivilibre Aug 28, 2019
b06f294
Track new users in user statistics.
reivilibre Aug 28, 2019
73d552a
Hoist up None check to prevent trying to iterate over NoneType.keys()
reivilibre Aug 28, 2019
3b69bf3
Upsert fixes
reivilibre Aug 28, 2019
4444b9a
Code formatting (Black)
reivilibre Aug 29, 2019
39dbee2
Count total_events and total_event_bytes within the loop.
reivilibre Aug 29, 2019
f7ececb
Merge branch 'develop' into rei/rss_target
reivilibre Aug 29, 2019
7c0224d
Merge branch 'rei/rss_target' into rei/rss_inc6
reivilibre Aug 29, 2019
6048103
Merge branch 'rei/rss_target' into rei/rss_inc5
reivilibre Aug 29, 2019
9dbf42a
Merge pull request #5923 from matrix-org/rei/rss_inc5
reivilibre Aug 30, 2019
4c13f2b
Merge branch 'develop' into rei/rss_target
reivilibre Aug 30, 2019
6c582d7
Merge branch 'rei/rss_target' into rei/rss_inc6
reivilibre Aug 30, 2019
757205d
Convert `chain` to `list` as `chain` is only once iterable.
reivilibre Aug 30, 2019
44b0367
Add stats regenerator
reivilibre Aug 30, 2019
893729a
Code formatting
reivilibre Aug 30, 2019
8c02602
Merge pull request #5924 from matrix-org/rei/rss_inc6
reivilibre Aug 30, 2019
1d6cf15
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
440c60e
Some fixes that have become necessary due to changes in other PRs
reivilibre Aug 30, 2019
065042c
Code formatting and typo pointed out by Erik.
reivilibre Aug 30, 2019
7dc387e
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
0f2e59f
Fix that became apparent after unit testing
reivilibre Aug 30, 2019
bf6d45f
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
b379a11
`users` table's ID field is actually called `name`.
reivilibre Aug 30, 2019
425d445
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
4ecc62b
Whoops, took out a line there...
reivilibre Aug 30, 2019
97b2035
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
d39c09c
Ambiguous `room_id`
reivilibre Aug 30, 2019
eba432e
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
50c321d
Adapt to use renamed `room_state`
reivilibre Aug 30, 2019
98a8928
Merge branch 'rei/rss_inc7' into rei/rss_inc8
reivilibre Aug 30, 2019
b928909
Fix incremental processor when there are no deltas.
reivilibre Aug 30, 2019
ab11c0a
Whoopsies; these things come in order…
reivilibre Aug 30, 2019
7b977bd
Fixes to counting and stats deltas
reivilibre Aug 30, 2019
6f5e543
Various fixes
reivilibre Aug 30, 2019
d49457b
Add stats tests
reivilibre Aug 30, 2019
fc5d118
Add stats docs
reivilibre Aug 30, 2019
21593fe
Linting
reivilibre Aug 30, 2019
fca3a9c
Fix to use milliseconds
reivilibre Aug 30, 2019
ffc30b8
Merge branch 'develop' into rei/rss_target
reivilibre Aug 30, 2019
e893214
Merge pull request #5941 from matrix-org/rei/rss_inc7
reivilibre Aug 30, 2019
84532a4
Merge branch 'rei/rss_target' of github.com:matrix-org/synapse into r…
erikjohnston Sep 2, 2019
02f759e
Renamve get_room_state
erikjohnston Sep 2, 2019
745f2da
Merge pull request #5946 from matrix-org/rei/rss_inc8
erikjohnston Sep 2, 2019
e65c8a9
Remove user stats tracking
erikjohnston Sep 2, 2019
d85dc0d
Fixup
erikjohnston Sep 2, 2019
0d13f46
Revert "Remove user stats tracking"
erikjohnston Sep 2, 2019
7a2cbf8
dfslkdfj
erikjohnston Sep 2, 2019
604d152
Fixup initial room stats bg update
erikjohnston Sep 2, 2019
31b767f
Fix up initial user bg
erikjohnston Sep 2, 2019
0f69162
Merge branch 'develop' of github.com:matrix-org/synapse into erikj/ro…
erikjohnston Sep 2, 2019
34fdcdb
Add bg updates
erikjohnston Sep 2, 2019
a3ee427
Make updates atomic
erikjohnston Sep 3, 2019
a7270ee
Only track total joined rooms for users
erikjohnston Sep 3, 2019
916aae3
Remove per slice stats from current
erikjohnston Sep 3, 2019
95e5477
Track federated state
erikjohnston Sep 3, 2019
6d533f3
Correctly delete old room_stats_state
erikjohnston Sep 3, 2019
27b1d3d
Add guests_can_join to room_stats
erikjohnston Sep 3, 2019
d41e3c3
Track user sending stuff stats
erikjohnston Sep 3, 2019
48d6e05
Fix typo
erikjohnston Sep 3, 2019
277da96
Fix sql
erikjohnston Sep 3, 2019
8ea0f5c
Only local users
erikjohnston Sep 3, 2019
bb15072
Remove needless measure blocks
erikjohnston Sep 3, 2019
a2c11cd
Fix PEP8
erikjohnston Sep 3, 2019
db19f69
Update docs
erikjohnston Sep 4, 2019
648fb54
Fixup changelog
erikjohnston Sep 4, 2019
1 change: 1 addition & 0 deletions changelog.d/5971.bugfix
@@ -0,0 +1 @@
Fix room and user stats tracking.
62 changes: 62 additions & 0 deletions docs/room_and_user_statistics.md
@@ -0,0 +1,62 @@
Room and User Statistics
========================

Synapse maintains room and user statistics (as well as a cache of room state)
in various tables. These can be used for administrative purposes and are also
used when generating the public room directory.


# Synapse Developer Documentation

## High-Level Concepts

### Definitions

* **subject**: Something we are tracking stats about – currently a room or user.
* **current row**: An entry for a subject in the appropriate current statistics
table. Each subject can have only one.
* **historical row**: An entry for a subject in the appropriate historical
statistics table. Each subject can have any number of these.

### Overview

Stats are maintained as time series. There are two kinds of column:

* absolute columns – where the value is correct for the time given by `end_ts`
  in the stats row. (Imagine a line graph for these values.)
  * They can also be thought of as 'gauges' in Prometheus, if you are familiar
    with those.
* per-slice columns – where the value is the number of occurrences within the
  time slice given by `(end_ts − bucket_size)…end_ts` or `start_ts…end_ts`.
  (Imagine a histogram for these values.)

Stats are maintained in two tables (for each type): current and historical.

Current stats correspond to the present values. Each subject can only have one
entry.

Historical stats correspond to values in the past. Subjects may have multiple
entries.
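
As a rough illustration of the two kinds of column, the sketch below models a
single stats row in Python. The class and field names (`StatsRow`,
`joined_members`, `events_sent`) are hypothetical and are not the actual table
schema; only `end_ts` and `bucket_size` come from the text above, and times are
assumed to be in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class StatsRow:
    """Hypothetical stats row; field names are illustrative only."""

    end_ts: int          # end of the time slice, in milliseconds
    bucket_size: int     # length of the time slice, in milliseconds
    joined_members: int  # absolute column: the value as of end_ts
    events_sent: int     # per-slice column: occurrences within the slice

    def slice_bounds(self):
        """Return the (start, end) of the time slice this row covers."""
        return (self.end_ts - self.bucket_size, self.end_ts)


DAY_MS = 86400 * 1000
row = StatsRow(end_ts=3 * DAY_MS, bucket_size=DAY_MS, joined_members=42, events_sent=7)
start, end = row.slice_bounds()
print(start, end)
```

Here `joined_members` would be plotted as a point on a line graph at `end_ts`,
while `events_sent` would be drawn as a bar spanning `start` to `end`.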

## Concepts around the management of stats

### Current rows

Current rows contain the most up-to-date statistics for a subject.
They only contain absolute columns.

### Historical rows

Historical rows can always be considered to be valid for the time slice and
end time specified.

* historical rows will not exist for every time slice – they will be omitted
  if there were no changes. In this case, the following assumptions can be
  made to interpolate/recreate missing rows:
  - absolute fields have the same values as in the preceding row
  - per-slice fields are zero (`0`)
* historical rows will not be retained forever – rows older than a configurable
  time will be purged.
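
The interpolation rules above can be sketched as follows. This is a minimal
example under stated assumptions, not the code from this PR: rows are plain
dicts sorted by `end_ts`, consecutive slices are exactly `bucket_size` apart,
and the function and field names (`fill_missing_rows`, `joined_members`,
`events_sent`) are hypothetical.

```python
def fill_missing_rows(rows, absolute_fields, per_slice_fields, bucket_size):
    """Recreate omitted historical rows between the known rows.

    `rows` must be sorted by `end_ts`. Rows before the first known row or
    after the last known row are not generated.
    """
    filled = []
    previous = None
    for row in rows:
        if previous is not None:
            ts = previous["end_ts"] + bucket_size
            while ts < row["end_ts"]:
                gap = {"end_ts": ts}
                for field in absolute_fields:
                    # Absolute fields carry over from the preceding row.
                    gap[field] = previous[field]
                for field in per_slice_fields:
                    # Nothing happened in an omitted slice.
                    gap[field] = 0
                filled.append(gap)
                ts += bucket_size
        filled.append(row)
        previous = row
    return filled


known = [
    {"end_ts": 1000, "joined_members": 5, "events_sent": 3},
    {"end_ts": 4000, "joined_members": 6, "events_sent": 1},
]
complete = fill_missing_rows(
    known, ["joined_members"], ["events_sent"], bucket_size=1000
)
for r in complete:
    print(r)
```

The two recreated rows (at `end_ts` 2000 and 3000) keep `joined_members` at 5
and report `events_sent` as 0.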

#### Purge

The purging of historical rows is not yet implemented.
13 changes: 5 additions & 8 deletions synapse/config/stats.py
@@ -27,19 +27,16 @@ class StatsConfig(Config):

     def read_config(self, config, **kwargs):
         self.stats_enabled = True
-        self.stats_bucket_size = 86400
+        self.stats_bucket_size = 86400 * 1000
         self.stats_retention = sys.maxsize
         stats_config = config.get("stats", None)
         if stats_config:
             self.stats_enabled = stats_config.get("enabled", self.stats_enabled)
-            self.stats_bucket_size = (
-                self.parse_duration(stats_config.get("bucket_size", "1d")) / 1000
+            self.stats_bucket_size = self.parse_duration(
+                stats_config.get("bucket_size", "1d")
             )
-            self.stats_retention = (
-                self.parse_duration(
-                    stats_config.get("retention", "%ds" % (sys.maxsize,))
-                )
-                / 1000
+            self.stats_retention = self.parse_duration(
+                stats_config.get("retention", "%ds" % (sys.maxsize,))
             )

     def generate_config_section(self, config_dir_path, server_name, **kwargs):
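
The `synapse/config/stats.py` hunk keeps `stats_bucket_size` and
`stats_retention` in milliseconds instead of dividing by 1000. The sketch below
shows why a single unit matters when timestamps are also in milliseconds.
Both helpers are assumptions made for illustration: `parse_duration_ms` is a
simplified stand-in, not Synapse's `Config.parse_duration`, and `bucket_start`
is a hypothetical way of mapping a timestamp to its time slice.

```python
UNITS_MS = {
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
}


def parse_duration_ms(value):
    """Simplified stand-in parser: '1d' -> 86400000, '30s' -> 30000."""
    if isinstance(value, int):
        return value
    unit = value[-1]
    if unit in UNITS_MS:
        return int(value[:-1]) * UNITS_MS[unit]
    # No recognised suffix: treat the whole string as milliseconds.
    return int(value)


def bucket_start(ts_ms, bucket_size_ms):
    """Round a millisecond timestamp down to the start of its time slice."""
    return (ts_ms // bucket_size_ms) * bucket_size_ms


bucket_size = parse_duration_ms("1d")
ts = 18143 * bucket_size + 5000  # five seconds into a one-day slice
print(bucket_size, bucket_start(ts, bucket_size))
```

With the bucket size in seconds and the timestamp in milliseconds, the same
rounding would produce slices 1000 times shorter than intended.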