add Sharding #26
Comments
The link in the release message for this points to #22. |
The Discord Documentation regarding Gateways has been updated. Disgo is also stable. To clear up a few oversights in the original message.
Upon further reading of What is a Shard?, it's clear that the hierarchy of a Shard Manager would be
So a "shard manager" is the equivalent of a "sessions manager" which is currently referred to by This documentation concludes that sharding in multiple ways is possible, but that you can't "ignore" (drop) a shard without also ignoring the incoming data of multiple guilds. Thus, a user who simply wants to use that module to shard without additional work likely expects that module to work in the following manner.
As opposed to a Shard Manager (Load Balancer Application) that subscribes to every event — as if it weren't sharded — and sends each event to another application that handles them. In each case, an abstraction (shard manager) operates upon the
The remaining implementation of "sharding" in Disgo consists of literally two lines; and a tweak to
The remaining complexity comes from the ability to track it (or manage it). What determines what numbers to use in the Note that adhering to the actual Discord Requirement to Shard is as straightforward as providing those values in the
An application using the same bot token within a cluster must maintain knowledge of every application that uses the same bot (token). If someone wishes to use a central service (e.g., Redis) to count requests, each call from a separate application must notify that central service of its request and vice-versa. The user must therefore implement code in each application that rate limits by receiving the count from that central service before making a request. This is simplified by the
In contrast, an individual shard has no requirement to maintain knowledge of other shards. However, if someone wishes to use a central service (e.g., a Load Balancer) to shard across multiple applications (or provide information among them), each Session must be tracked from that service and vice-versa. The user must therefore implement code in each application to do that. This is simplified by a
To summarize #26 (comment). A The
The The Disgo Shard Manager module aims to define a |
discord/discord-api-docs#5717 applies to #26 update manual fieldalignment
The new commit allows the SessionStartLimit object to be used. With this information, it's clear that there is no limit to the number of shards a bot is able to create. A bot doesn't have to be in 2500 guilds to shard; that is only the level at which sharding is required. Therefore, a bot in a single server can shard, but it can only send 1 Identify Send Event every 5 seconds. The above information shows that it is possible for
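To make the Identify limit concrete, here is a minimal sketch of how shards map to Identify buckets. It assumes the rule from the Discord sharding documentation (rate limit key = shard_id % max_concurrency, one Identify per key every 5 seconds). The function names are hypothetical and are not part of Disgo.

```go
package main

import "time"

// identifyInterval is the spacing required between Identify payloads
// that share a rate limit key.
const identifyInterval = 5 * time.Second

// identifyBucket returns the rate limit key of a shard:
// shard_id % max_concurrency. maxConcurrency must be at least 1.
func identifyBucket(shardID, maxConcurrency int) int {
	return shardID % maxConcurrency
}

// identifyDelay returns how long a shard waits after the first Identify,
// assuming (hypothetically) that shards are started in ascending order and
// that every bucket sends one Identify per interval.
func identifyDelay(shardID, maxConcurrency int) time.Duration {
	return time.Duration(shardID/maxConcurrency) * identifyInterval
}
```

With max_concurrency at 1, every shard lands in bucket 0, so shard 3 identifies 15 seconds after shard 0. With max_concurrency at 16, shards 0 through 15 can identify together and shard 17 waits one interval.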
The rate limit implementation for the Gateway must be reclarified by Discord: discord/discord-api-docs#5844. |
i got it figured out. pr soon |
Connection vs. Session vs. Shard
Suppose that you create a WebSocket Connection to the Discord API. Can that connection send multiple Identify payloads with shards specified? The documentation states:
This Ready event always has a new Session ID (tested), which indicates the WebSocket Connection is tied to the WebSocket Session. Therefore, a single WebSocket Connection cannot send multiple Identify payloads using the shards field. Furthermore, an Identify payload has no manner of specifying a session ID: This is significant because it means that one WebSocket Connection is always tied to one WebSocket Session (Identify payload), which can only be tied to a single shard. To recap:
Here is what the documentation says about these assertions:
It's saying that you use a single shard per WebSocket Connection: If WebSocket Connection is tied to a Discord Session, then one shard is tied to a session. And you can't change that with a Resume payload. So then a session is always tied to one shard. But which object contains the other?
This statement means you can have 500 active sessions but only five unique shards. Each shard determines what guild event data is sent to each session. However, each session still only receives data based on a single shard.
This statement reiterates the above statement: 500 sessions can have the same shard, or 500 sessions can have different shards. [1] So while a shard could contain a session, it's better to say that a session has a shard field
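For reference, the way a shard "determines what guild event data is sent" can be sketched with the formula from the Discord sharding documentation, shard_id = (guild_id >> 22) % num_shards. The function name below is hypothetical; only the formula comes from the documentation.

```go
package main

// shardIDForGuild returns the shard whose sessions receive events for the
// given guild: (guild_id >> 22) % num_shards. numShards must be at least 1.
func shardIDForGuild(guildID, numShards uint64) uint64 {
	return (guildID >> 22) % numShards
}
```

Every session that identifies with the same [shard_id, num_shards] pair receives events for the same set of guilds, which is why 500 sessions can share one shard.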
Discord only lets you shard using an active-active strategy since you can't directly create a single shard with a higher number of guilds or events than other shards. Discord expects you to handle all events of one shard on one instance (without using additional network utilities). In other words, you can't create 70 sessions PER SHARD and then 70 instances PER SHARD; each handling a single event. Suppose you want to use an architecture that handles different events per instance. In that case, you must create a Shard Instance(s) that handles all guild event data by forwarding those events to a respective Handler Instance (e.g., Guild Member Chunk, Message Create).
Shard Manager
So the current session implementation needs revising if we want to implement a plug-and-play "shard manager".
vs.
[2] This option requires all methods of a The ShardManager could be a SessionManager, which has many implications. Is the ShardManager a SessionManager that is stored in the bot?
Session Manager
The The
But most importantly, the
So in the last ShardManager commit, I created an opt-in session manager that kept track of all your sessions in a future release. This would prevent users (developers) from doing this themselves, touching variables in states entirely different from what the developer expected. [3] And reading that back, this is a good idea: Make people reference sessions by ID, not by pointer (via So when a user (developer) connects a Session to the Gateway, store that in the bot
But should I make it opt-in or mandatory? Its memory cost is negligible as it only involves storing three maps containing string keys (Session IDs) that reference pointers. Its processing cost is negligible as it is only invoked when a session is manipulated during a connection or disconnection. So I will make it mandatory in the
Location
The Shard Manager is only used to modify a session's shard field and facilitate connection and/or mass manipulation of sessions like a session would. The Session Manager is a collection of all sessions created by a bot. These two are different things, so keep both.
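As a rough illustration of the Session Manager idea (reference sessions by ID, not by pointer), here is a minimal sketch. It uses a single map for brevity rather than the three maps mentioned above, and every type and method name is hypothetical rather than Disgo's actual API.

```go
package main

import "sync"

// Session is a hypothetical stand-in for a Gateway session.
type Session struct {
	ID    string
	Shard *[2]int // [shard_id, num_shards]; nil when the session is not sharded
}

// SessionManager stores every connected session of a bot, keyed by Session ID.
type SessionManager struct {
	mu       sync.RWMutex
	sessions map[string]*Session
}

// NewSessionManager returns an empty manager that is ready for use.
func NewSessionManager() *SessionManager {
	return &SessionManager{sessions: make(map[string]*Session)}
}

// Add stores a session; intended to be called when the session connects.
func (m *SessionManager) Add(s *Session) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.sessions[s.ID] = s
}

// Get looks a session up by its ID.
func (m *SessionManager) Get(id string) (*Session, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	s, ok := m.sessions[id]
	return s, ok
}

// Remove deletes a session; intended to be called when it disconnects.
func (m *SessionManager) Remove(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.sessions, id)
}

// Len reports how many sessions are currently tracked.
func (m *SessionManager) Len() int {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return len(m.sessions)
}
```

The manager is only touched on connect and disconnect, which matches the negligible processing cost described above.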
The Shard Manager must also remain a field of the bot because the bot determines how a shard is routed. To be clear, the number of guilds in a shard is determined by how many guilds the bot is in. So when a session is connected using a bot, it must call the bot's Shard Manager to invoke special operations. The alternative (using shard manager as a parameter of
[1] Session Shard Field
Rate Limit
If the rate limit applies per connection, then it shouldn't be managed by the bot, but rather the WebSocket Connection (which The following tests determine the Gateway Rate Limit implementation.
Test Per Connection Gateway Rate Limit (59s Time Limit)
If this test passes, the Gateway Rate Limit applies per WebSocket Connection. This test passed for me. Otherwise, the following test should be performed.
Test Per Session Gateway Rate Limit (59s Time Limit)
If this test passes, the Gateway Rate Limit applies per WebSocket Session. This test was not necessary, but also passes. Otherwise, the Gateway Rate Limit applies per bot (token).
Result
Test Per Connection Gateway Rate Limit passed, indicating that the Gateway Rate Limit is per connection (as confirmed by @devsnek in discord/discord-api-docs#5898 (comment)). So the current rate limit implementation must be refactored accordingly.
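A per-connection limiter along these lines could look like the sketch below. It assumes the documented Gateway limit of 120 send events per 60 seconds per connection, and it assumes a simple fixed window, which may not match how Discord actually counts. All names are hypothetical and this is not the refactored Disgo implementation.

```go
package main

import (
	"sync"
	"time"
)

const (
	gatewayLimit  = 120              // send events allowed per window
	gatewayWindow = 60 * time.Second // window length
)

// connLimiter is owned by one WebSocket connection, so two connections
// never share a counter.
type connLimiter struct {
	mu          sync.Mutex
	opened      time.Time     // when the connection was opened
	windowStart time.Duration // start of the current window, relative to opened
	used        int           // events sent in the current window
}

// newConnLimiter creates a limiter for a connection opened now.
func newConnLimiter() *connLimiter {
	return &connLimiter{opened: time.Now()}
}

// Allow reports whether one more event may be sent right now.
func (l *connLimiter) Allow() bool {
	return l.allowAt(time.Since(l.opened))
}

// allowAt is the clock-independent core: elapsed is the time since the
// connection was opened.
func (l *connLimiter) allowAt(elapsed time.Duration) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if elapsed-l.windowStart >= gatewayWindow {
		// The previous window expired; start a new one at this event.
		l.windowStart = elapsed
		l.used = 0
	}
	if l.used >= gatewayLimit {
		return false
	}
	l.used++
	return true
}
```

Because the limiter lives on the connection rather than the bot, a second connection for the same token starts with a fresh budget.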
from #26 update dasgo fix missing GUILD_MEMBERS_CHUNK intents
from #26 fix static check lint issues
implements #26
update dasgo
fix missing GUILD_MEMBERS_CHUNK intents
fix static check lint issues
* edit concept information
* fix enable intent functions
* refactor shardmanager interface
* implement SessionManager
* update Gateway Rate Limit Handling
* add Gateway tests for Session Manager, Rate Limits
* remove TestGatewayGlobalRateLimit test
* implement Shard Manager
implement InstanceShardManager
* add Shard integration test workflow
* generate Send Event Shard Manager functions
* update bundle
* fix shard test workflow
* fix data race in test
* remove app ID write on ready event
fix data race
Implemented in #64. |
Problem
Sharding can't be implemented (tested) without a bot in 2,500 guilds.
We do not have a bot that is in 2,500 guilds and there is no point in going through the effort to achieve that with a simple test bot. As an example, the DGO (Discord Go Test Bot) has been soliciting invitations for 6 years (among 3.2K stars) and STILL does NOT inherently provide sharding, due to an inability to shard. However, we plan to add an optional shard manager to the Disgo library.
Solution
If sharding is as easy to implement as stated in bwmarrin/discordgo#265, then the easiest solution is to have someone else create the shard manager. Otherwise, one of the contributors may create a bot that warrants sharding, at which point creating a shard manager would be warranted and testable.
Another alternative is to have a bot developer who wants (or needs) sharding to provide us their token for testing purposes. However, it's unlikely for a developer to do that.
Another alternative is to have Discord provide us with the ability to shard (by upgrading the `max_concurrency` and `shards`) for the test bot. However, this is historically the most unlikely to occur.
Implementation
Once we have a way to test sharding, it can actually be implemented. Based on the Discord Sharding Documentation, this warrants a `ShardManager` interface. In other words, the `Sessions` field of the client (which is currently unused by Disgo) can be replaced by the Shard Manager.
Sending up to `max_concurrency` Identify commands per 5 seconds is already implemented.
The instantiation of the Identify command (in the `session.go` `initial` function) can be modified to set the Identify.Shard value to the value determined by the ShardManager.
A test can be made to ensure that automatic sharding works, but if this test can NOT run with the test token it is rather useless. For this reason, it might not be included in the CI/CD pipeline, and only used when an issue with sharding is identified.
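To illustrate setting the shard value on Identify, here is a minimal sketch. It assumes only what the Discord documentation states about the payload: the `shard` field is an array of two integers, [shard_id, num_shards]. The trimmed struct, the `ShardManager` interface shape, and the helper names are hypothetical and do not reflect Disgo's actual types.

```go
package main

import "encoding/json"

// identifyData is a hypothetical, trimmed-down Identify payload body.
// A real Identify payload carries more fields (for example, properties).
type identifyData struct {
	Token   string  `json:"token"`
	Intents int     `json:"intents"`
	Shard   *[2]int `json:"shard,omitempty"` // [shard_id, num_shards]
}

// ShardManager is a hypothetical interface that decides which shard a
// session identifies with.
type ShardManager interface {
	Shard() (shardID, numShards int)
}

// staticShard is the simplest possible manager: a fixed shard assignment.
type staticShard struct{ id, total int }

func (s staticShard) Shard() (int, int) { return s.id, s.total }

// newIdentify builds the Identify body. A nil manager leaves the shard
// field unset, which keeps the non-sharded behavior.
func newIdentify(token string, intents int, sm ShardManager) identifyData {
	d := identifyData{Token: token, Intents: intents}
	if sm != nil {
		id, total := sm.Shard()
		d.Shard = &[2]int{id, total}
	}
	return d
}

// encodeIdentify marshals the body to JSON.
func encodeIdentify(d identifyData) (string, error) {
	b, err := json.Marshal(d)
	return string(b), err
}
```

Calling `newIdentify("t", 513, staticShard{id: 1, total: 4})` and encoding the result gives a body containing `"shard":[1,4]`; passing a nil manager omits the field entirely.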