Controversial posts and comments #2515

ghost · 2022-10-26T07:40:42Z

Posts and comments ordered by most total votes but with a score close to zero. I guess this should only be available on instances with downvotes enabled.

dessalines · 2022-10-27T20:38:24Z

I don't have time to do this but someone else could.

iByteABit256 · 2023-06-16T16:09:12Z

Can I give this a try?

dcormier · 2023-06-16T17:32:18Z

@iByteABit256 I was just digging into this. 🫠 I came back to propose a calculation for "controversialness". You can have this one. I'll share where I was at, anyway.

I was thinking something like this (but implemented in SQL, similar to the existing hot_rank SQL function).

fn controversy_rank(upvotes: u32, downvotes: u32, score: i32) -> u32 {
    (upvotes + downvotes) / if score == 0 { 1 } else { score.unsigned_abs() }
}

Some examples of how this would work with various inputs can be seen here.
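As a rough illustration of the proposal above, here is a minimal sketch that runs the same formula over a few vote counts. The sample inputs are made up for illustration, and `score` is assumed to be `upvotes - downvotes`.

```rust
// The formula proposed above: total votes divided by the absolute score
// (or by 1 when the score is exactly zero).
fn controversy_rank(upvotes: u32, downvotes: u32, score: i32) -> u32 {
    (upvotes + downvotes) / if score == 0 { 1 } else { score.unsigned_abs() }
}

fn main() {
    // (upvotes, downvotes) pairs chosen purely for illustration.
    for (up, down) in [(1u32, 1u32), (100, 100), (50, 45), (45, 50), (100, 0), (0, 100)] {
        // Assumption: score is simply upvotes minus downvotes.
        let score = up as i32 - down as i32;
        println!("{up:>3} up, {down:>3} down -> {}", controversy_rank(up, down, score));
    }
}
```

With these inputs, evenly split votes rank highest (2 and 200), 50/45 and 45/50 both give 19, and one-sided votes (100/0 or 0/100) give 1.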

iByteABit256 · 2023-06-16T18:08:34Z

Not bad, but it has a flaw that small changes in like/dislike ratio can lead to huge changes in "controversialness".

For example, a 98/100 ratio isn't that different from 99/100, but it would have half the score.
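To make that concrete, here is a small sketch applying the total-votes-over-score formula to those two ratios. Reading "98/100" as 98 upvotes against 100 downvotes is an assumption.

```rust
// The earlier proposal: total votes divided by |score| (or 1 if score is 0).
fn controversy_rank(upvotes: u32, downvotes: u32, score: i32) -> u32 {
    (upvotes + downvotes) / if score == 0 { 1 } else { score.unsigned_abs() }
}

fn main() {
    // 98 up / 100 down: 198 total votes, |score| = 2.
    let a = controversy_rank(98, 100, -2);
    // 99 up / 100 down: 199 total votes, |score| = 1.
    let b = controversy_rank(99, 100, -1);
    println!("{a} vs {b}"); // prints "99 vs 199"
}
```

One extra upvote roughly doubles the rank, because the divisor drops from 2 to 1.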

My thinking was something like this:

fn controversy_rank(upvotes: u32, downvotes: u32, score: i32) -> u32 {
  if downvotes != 0 { upvotes / downvotes * score.unsigned_abs() } else { 0 }
}

That seems more intuitive to me and gives more balanced scores. What do you think?

dcormier · 2023-06-16T18:47:55Z

> it has a flaw that small changes in like/dislike ratio can lead to huge changes in "controversialness".

Does it matter? Will that value be shown, or used for anything other than sorting the comments?

> My thinking was something like this:
>
> fn controversy_rank(upvotes: u32, downvotes: u32, score: i32) -> u32 {
>   if downvotes != 0 { upvotes / downvotes * score.unsigned_abs() } else { 0 }
> }

The results for that are surprising. 100 upvotes and 100 downvotes results in 0 controversialness, the same as if something has 0 upvotes and 100 downvotes. Similarly, I would expect these to have the same level of controversialness, but they don't:

    assert_eq!(5, controversy_rank(50, 45, 5));
    assert_eq!(0, controversy_rank(45, 50, -5));

iByteABit256 · 2023-06-16T19:13:54Z

You're right, it needs some work. Also, what I was thinking for the multiplier was actually (upvotes + downvotes) to represent activity, since a 50-50 post with 2 total votes is much less controversial than a 50-50 post with 1000 votes.

Your way definitely gives good enough results though. I just want to explore it a bit before implementing it.

dcormier · 2023-06-16T19:21:13Z

Yeah, that's what I was thinking, too. The total number of votes should be significant, here.

It's definitely worth exploring.

Here's something to show the output better and let you fiddle with the math more. I originally was just using a spreadsheet to try different approaches.

iByteABit256 · 2023-06-16T19:39:13Z

Printing it out as a table made it much clearer. I think it's good enough to keep.

All of the high scores are highly controversial, and the amount of activity clearly scales with it.

dcormier · 2023-06-16T20:39:18Z

I agree. It seems good. I'd like to see some more people chime in with opinions, but maybe that'll come with a PR. At the very least, it's something that can be moved forward with.

Edit: Playing with the output visualization more because I was bored and it was pleasing. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=130155f2c33aa262c403427b8235dd82

iByteABit256 · 2023-06-16T21:40:40Z

That's proof enough haha, really cool!

dcormier · 2023-06-16T21:44:36Z

Looking at that output makes me think it might be worthwhile to subsort by something like activity (descending), or some other existing sort type. Things that aren't controversial get to a pretty flat curve fairly quickly, and otherwise may result in inconsistent ordering (if that's important in any way).
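A minimal sketch of that subsort idea, assuming total vote count as a stand-in for "activity". The `Item` struct and `sort_controversial` function are hypothetical names for illustration, not anything from Lemmy, and the real sort would live in SQL.

```rust
use std::cmp::Reverse;

// Hypothetical item type, for illustration only.
#[derive(Debug)]
struct Item {
    id: u32,
    upvotes: u32,
    downvotes: u32,
}

// Total votes divided by |score| (or by 1 when the score is zero).
fn controversy_rank(upvotes: u32, downvotes: u32) -> u32 {
    let abs_score = upvotes.abs_diff(downvotes);
    (upvotes + downvotes) / abs_score.max(1)
}

// Sort by controversy rank (descending), breaking ties by total votes
// (descending). Assumption: total votes approximates "activity".
fn sort_controversial(items: &mut [Item]) {
    items.sort_by_key(|i| {
        (
            Reverse(controversy_rank(i.upvotes, i.downvotes)),
            Reverse(i.upvotes + i.downvotes),
        )
    });
}

fn main() {
    let mut items = vec![
        Item { id: 1, upvotes: 10, downvotes: 0 },  // rank 1
        Item { id: 2, upvotes: 60, downvotes: 50 }, // rank 11
        Item { id: 3, upvotes: 100, downvotes: 0 }, // rank 1
        Item { id: 4, upvotes: 0, downvotes: 50 },  // rank 1
    ];
    sort_controversial(&mut items);
    let ids: Vec<u32> = items.iter().map(|i| i.id).collect();
    println!("{ids:?}"); // prints "[2, 3, 4, 1]"
}
```

The three one-sided items all land on rank 1, so without the second key their relative order would depend only on input order.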

jamesmcm · 2023-06-16T21:56:09Z

It might also be helpful if the "controversy score" were visible in the UI when sorting this way too.

qznc · 2023-06-17T07:26:57Z

To throw in another idea:

min(upvotes, downvotes)

Its primary advantage is that it is simpler, and so easier for users to understand.
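For comparison with the earlier proposals, a sketch of this idea as a Rust function. The name `controversy_rank_min` is made up for illustration.

```rust
// The size of the smaller camp: high only when both up- and downvotes are numerous.
fn controversy_rank_min(upvotes: u32, downvotes: u32) -> u32 {
    upvotes.min(downvotes)
}

fn main() {
    // Sample vote counts chosen purely for illustration.
    for (up, down) in [(100u32, 100u32), (50, 45), (1000, 45), (100, 0)] {
        println!("{up:>4} up, {down:>3} down -> {}", controversy_rank_min(up, down));
    }
}
```

One consequence of the simplicity is that 50/45 and 1000/45 both rank 45, so a lopsided post can tie with a nearly balanced one.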

qznc · 2023-06-17T07:32:15Z

Btw reddit's algorithm in 2015 is here.

ghost · 2023-06-17T07:51:19Z

That seems like it has worked well in the past.

cpdef double controversy(long ups, long downs):
    """The controversy sort."""
    if downs <= 0 or ups <= 0:
        return 0

    magnitude = ups + downs
    balance = float(downs) / ups if ups > downs else float(ups) / downs

    return magnitude ** balance
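For anyone comparing it against the Rust snippets earlier in the thread, here is a rough Rust port of that function. It is a sketch that mirrors the Cython code above, not anything taken from Lemmy, and the sample inputs in `main` are made up.

```rust
// Rust port of the Cython function above: magnitude raised to the power of balance.
fn controversy(ups: i64, downs: i64) -> f64 {
    // Anything without votes on both sides is not controversial at all.
    if downs <= 0 || ups <= 0 {
        return 0.0;
    }

    let magnitude = (ups + downs) as f64;
    // Ratio of the smaller side to the larger side, always in (0, 1].
    let balance = if ups > downs {
        downs as f64 / ups as f64
    } else {
        ups as f64 / downs as f64
    };

    magnitude.powf(balance)
}

fn main() {
    // Sample vote counts chosen purely for illustration.
    for (ups, downs) in [(1i64, 1i64), (100, 100), (50, 45), (45, 50), (100, 1), (0, 100)] {
        println!("{ups:>3} up, {downs:>3} down -> {:.3}", controversy(ups, downs));
    }
}
```

Because `balance` is computed as smaller over larger, swapping upvotes and downvotes gives the same result, and an even split simply returns the total vote count.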

iByteABit256 · 2023-06-17T16:42:36Z

Here is a comparison between @dcormier's, @qznc's and Reddit's method.

Reddit's looks like the most correct overall. @dcormier's looks almost as good and is much more performant, since it doesn't involve float arithmetic and powers. @qznc's is the most performant, but the results are noticeably worse, judging from this.

Edit: Added an alteration of my own method after realising the main problem with it and how Reddit solved it.

Edit: Changed the debug build to a release build and computed the absolute value manually instead of using Rust's abs(); the manual version seemed to be much faster. The results of the first 3 all seem good enough. Time seems to be slightly better on the ratio method, but take that with a grain of salt. After all, this is going to be implemented in SQL, not Rust.

dcormier · 2023-06-19T14:19:28Z

That's not a very effective way of benchmarking in this case, unfortunately. The results are wildly different from run to run, and even within the same run. I.e., not only do the numbers change quite a bit from run to run, but two algorithms that had similar times in one run might have disparate times in another. Using cargo bench (requires nightly) or Criterion.rs would show differences more clearly.

Regardless, it's probably not worth benchmarking that in Rust. The existing hot_rank function used to produce the value to sort by when sorting on hot lives in SQL, not Rust. I would expect this function to end up being similar.

The Reddit method produces a gentler curve, which is nice.

iByteABit256 · 2023-06-19T17:18:34Z

I had a pretty lucky streak when I first wrote it, but yeah, unfortunately it seems completely indeterminate now that I've tried it again a few times.

ghost added the enhancement New feature or request label Oct 26, 2022

dessalines transferred this issue from LemmyNet/lemmy-ui Oct 27, 2022

Nutomic added the extra: good first issue Good for newcomers label Jan 15, 2023

This was referenced Jun 19, 2023

Controversal Posts and Comments (UI Side) LemmyNet/lemmy-ui#1404

Closed

Add controversial ranking #3205

Merged

lionirdeadman added the area: sorting label Jul 23, 2023

dessalines closed this as completed in #3205 Jul 26, 2023

dessalines mentioned this issue Jun 20, 2024

Change how number of votes affects controversy rank #4852

Closed

5 tasks
