Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Commit

Permalink
Fix and refactor room and user stats (#5971)
Browse files Browse the repository at this point in the history
Previously the stats were not being correctly populated.
  • Loading branch information
erikjohnston authored Sep 4, 2019
1 parent ea128a3 commit 6e834e9
Show file tree
Hide file tree
Showing 11 changed files with 1,642 additions and 641 deletions.
1 change: 1 addition & 0 deletions changelog.d/5971.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix room and user stats tracking.
62 changes: 62 additions & 0 deletions docs/room_and_user_statistics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
Room and User Statistics
========================

Synapse maintains room and user statistics (as well as a cache of room state),
in various tables. These can be used for administrative purposes but are also
used when generating the public room directory.


# Synapse Developer Documentation

## High-Level Concepts

### Definitions

* **subject**: Something we are tracking stats about – currently a room or user.
* **current row**: An entry for a subject in the appropriate current statistics
table. Each subject can have only one.
* **historical row**: An entry for a subject in the appropriate historical
statistics table. Each subject can have any number of these.

### Overview

Stats are maintained as time series. There are two kinds of column:

* absolute columns – where the value is correct for the time given by `end_ts`
in the stats row. (Imagine a line graph for these values)
* They can also be thought of as 'gauges' in Prometheus, if you are familiar.
* per-slice columns – where the value corresponds to how many of the occurrences
occurred within the time slice given by `(end_ts − bucket_size)…end_ts`
or `start_ts…end_ts`. (Imagine a histogram for these values)

Stats are maintained in two tables (for each type): current and historical.

Current stats correspond to the present values. Each subject can only have one
entry.

Historical stats correspond to values in the past. Subjects may have multiple
entries.

## Concepts around the management of stats

### Current rows

Current rows contain the most up-to-date statistics for a room.
They only contain absolute columns

### Historical rows

Historical rows can always be considered to be valid for the time slice and
end time specified.

* historical rows will not exist for every time slice – they will be omitted
if there were no changes. In this case, the following assumptions can be
made to interpolate/recreate missing rows:
- absolute fields have the same values as in the preceding row
- per-slice fields are zero (`0`)
* historical rows will not be retained forever – rows older than a configurable
time will be purged.

#### Purge

The purging of historical rows is not yet implemented.
13 changes: 5 additions & 8 deletions synapse/config/stats.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,19 +27,16 @@ class StatsConfig(Config):

def read_config(self, config, **kwargs):
self.stats_enabled = True
self.stats_bucket_size = 86400
self.stats_bucket_size = 86400 * 1000
self.stats_retention = sys.maxsize
stats_config = config.get("stats", None)
if stats_config:
self.stats_enabled = stats_config.get("enabled", self.stats_enabled)
self.stats_bucket_size = (
self.parse_duration(stats_config.get("bucket_size", "1d")) / 1000
self.stats_bucket_size = self.parse_duration(
stats_config.get("bucket_size", "1d")
)
self.stats_retention = (
self.parse_duration(
stats_config.get("retention", "%ds" % (sys.maxsize,))
)
/ 1000
self.stats_retention = self.parse_duration(
stats_config.get("retention", "%ds" % (sys.maxsize,))
)

def generate_config_section(self, config_dir_path, server_name, **kwargs):
Expand Down
Loading

0 comments on commit 6e834e9

Please sign in to comment.