I made a Lemmy instance with a custom algorithm that keeps only the 20% most unique (=interesting?) posts. It does this by calculating a similarity score between every post on my instance and all posts that came before it. The 80% of posts that score highest, i.e. are most similar to earlier posts, get removed instantly.
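To make the idea concrete, here is a minimal sketch of that kind of filter. It is not the instance's actual algorithm, which is unpublished: the similarity measure (Jaccard overlap of word sets) and the function names `tokens`, `jaccard`, `similarity_scores` and `keep_most_unique` are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase word set used as a crude fingerprint of a post (assumed measure)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_scores(posts):
    """For each post, the highest similarity to any post that came before it."""
    seen = []
    scores = []
    for text in posts:
        current = tokens(text)
        # The first post has nothing to compare against, so it scores 0.0.
        scores.append(max((jaccard(current, old) for old in seen), default=0.0))
        seen.append(current)
    return scores


def keep_most_unique(posts, keep_fraction=0.2):
    """Return the indices of the least similar posts, in original order."""
    scores = similarity_scores(posts)
    keep_count = max(1, round(len(posts) * keep_fraction))
    ranked = sorted(range(len(posts)), key=lambda i: scores[i])
    return sorted(ranked[:keep_count])


posts = [
    "cats are great pets",
    "cats are great pets",
    "cats are great pets indeed",
    "rust compiler release notes",
    "cats are great",
]
kept = keep_most_unique(posts, keep_fraction=0.4)
```

This sketch ranks a whole batch at once. Removing posts instantly, as described above, would instead need a cutoff derived from previously seen scores (for example a running percentile), which is not shown here.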
The idea is that this lets me cut through the noise that’s running through the communities, similar to what xkcd-signal attempted to do 20 years ago.
The instance is mostly meant for reading, not posting. So it has a very open federation policy (for now).
If anything, this is experimental. So please let me know what you think! You can see the type of stuff that gets removed in the modlog (https://lemmy.coffee/modlog).
Interesting. One of my instance’s guiding philosophies is “Quality over Quantity”. I’ve taken different steps toward achieving that (defederating from the Reddit repost instances, disallowing pretty much all content bots, manually/locally modding duplicate posts, etc.).
Do you plan to publish your algorithm/filter? Would be interested in seeing if it could be tuned and possibly reduce some of the workload for me.
In an ideal world sure. But I’d have to think about that some more, because in principle I don’t want people to game it :)
Lemmy’s license is AGPL, so you would need to at least publish changes to Lemmy itself 😉
(I don’t know if e.g. the code for the algorithm is separate, in order to have a closed source algorithm with an open source Lemmy fork)
I made no changes to the Lemmy codebase; it’s all done through an auto-moderating bot that auto-removes posts that don’t meet the standard :)
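For anyone curious what the removal step of such a bot could look like from the outside, here is a rough sketch, not the bot's actual code. It assumes a moderator account's JWT and the post removal endpoint of Lemmy's HTTP API (`/api/v3/post/remove`, with Bearer-token auth as in 0.19-era releases); check the API docs for your Lemmy version. The helper name `build_remove_request` is made up for this example.

```python
import json
import urllib.request


def build_remove_request(instance_url, jwt, post_id, reason):
    """Build (but do not send) a moderator request that removes one post.

    Assumption: the endpoint path and field names follow Lemmy's v3 HTTP API.
    """
    body = json.dumps({"post_id": post_id, "removed": True, "reason": reason})
    return urllib.request.Request(
        url=instance_url.rstrip("/") + "/api/v3/post/remove",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + jwt,
        },
        method="POST",
    )


# Placeholder instance and token; nothing is sent over the network here.
request = build_remove_request(
    "https://example.invalid/", "moderator-jwt", 42, "too similar to earlier posts"
)
# Sending it would be: urllib.request.urlopen(request)
```

Because the bot only talks to the public API, the Lemmy server itself stays unmodified.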