I'm so excited to present my work on #Fediverse-wide hashtag federation, search, and subscription at #ActivityPubConf.

The recording will be available later, but if you're interested in enabling a globally consistent view of hashtags, or know a thing or two about DHTs or #ActivityPub relays, you can have a look at the paper: git.orlives.de/schmittlauch/pa

Please contact me with any questions, remarks, or other feedback!

So here's a TL;DR:

The problem with the current state of hashtags in the fediverse is that users get a fragmented view of posts, depending on their instance. Some posts containing a hashtag never reach your own instance, so you won't see them. This is bad for decentralisation and coordination.

I plan to throw some additional P2P stuff at the fediverse: all instances distribute the responsibility for relaying or storing posts (just their IDs) among themselves using a Distributed Hash Table. ->

@schmittlauch Wow, much respect for thinking about this. It's the largest problem IMHO since the beginning.

I still think the solution is separate software, so that platforms don't have to implement anything more complex than a few extra search REST API calls. I have not read your (long!) paper yet.

My current preferred solution is a network of index servers that share data over a P2P network or similar. Platforms communicate with them over REST APIs.

@schmittlauch The index servers wouldn't need the whole content of the posts, only 1) the ID and 2) the hashtags. This would allow platforms to make search requests that return IDs and then fetch the actual content by ID. This would keep the specialized index servers lightweight to run and would not introduce additional complex P2P development requirements for the platforms themselves. Implementing AP alone is hard enough.
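
To make the "ID plus hashtags only" idea concrete, here is a minimal sketch in Python. It is an assumption-laden illustration, not an existing index server: the class name `HashtagIndex`, its methods, and the example URLs are all hypothetical, and a real server would sit behind a REST API and persist its data.

```python
from collections import defaultdict


class HashtagIndex:
    """Hypothetical in-memory index: keeps post IDs per hashtag, never post content."""

    def __init__(self):
        # Maps a normalized hashtag to the set of post IDs that carry it.
        self._ids_by_tag = defaultdict(set)

    @staticmethod
    def _normalize(hashtag):
        # Treat "#ActivityPub" and "activitypub" as the same tag.
        return hashtag.lstrip("#").lower()

    def add(self, post_id, hashtags):
        """Record that the post with this ID (e.g. its URL) carries these hashtags."""
        for tag in hashtags:
            self._ids_by_tag[self._normalize(tag)].add(post_id)

    def search(self, hashtag):
        """Return the sorted post IDs for a hashtag; the caller fetches the content itself."""
        return sorted(self._ids_by_tag.get(self._normalize(hashtag), set()))
```

A platform would call something like `search("fediverse")`, get back a list of IDs, and then fetch each post from its origin instance, so the index never has to store or serve the posts themselves.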

@jaywink I also propose separating this into a transparent application proxy component; lain even suggested implementing it as a relay.
Regarding "index servers", it depends on what you mean by that: if the index servers are supposed to crawl the Fediverse themselves, then good luck keeping up with the load: at the scale of :birdsite: that'd be ~140,000 posts/second. Furthermore, the indexers might not even know about every server.

Thus I propose that, as an opt-in, instances actually push ->

@jaywink their published posts to the responsible indexing server. Though it's not really an indexing server but just a relay server, which will itself forward the post to a longer-term indexing/storage server.
To distribute the load and avoid a central point of authority or failure, I make each server responsible for only a subset of hashtags.
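
One common way to give each server a subset of hashtags is consistent hashing on a ring, which is the basic idea behind many DHTs. The sketch below assumes that approach purely for illustration; the design in the paper may differ, and `HashtagRing`, `responsible_for`, and the server names are hypothetical.

```python
import bisect
import hashlib


def _key(value):
    """Map a string to a position on the ring (first 8 bytes of its SHA-256 digest)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashtagRing:
    """Hypothetical consistent-hashing ring that assigns hashtags to relay servers."""

    def __init__(self, servers):
        # Place every server on the ring at the hash of its name.
        self._ring = sorted((_key(server), server) for server in servers)
        self._keys = [position for position, _ in self._ring]

    def responsible_for(self, hashtag):
        """Return the first server clockwise from the hashtag's ring position."""
        position = _key(hashtag.lstrip("#").lower())
        # Wrap around to the first server when the hashtag hashes past the last one.
        index = bisect.bisect_right(self._keys, position) % len(self._ring)
        return self._ring[index][1]
```

With a ring like `HashtagRing(["relay-a.example", "relay-b.example", "relay-c.example"])`, every instance computes the same owner for a given hashtag without asking a central authority, and a server leaving the ring only moves the hashtags that server was responsible for.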

I'll let you know once the recording of my talk is released, in case you need an easier start than a 24-page paper ;)


@schmittlauch Great, sounds good 👍 Yeah, no, I totally didn't mean crawling; I meant push from opt-in servers, just like the current relays work (well, the diaspora one at least; not sure how the AP ones work).

Just to understand: would the network of hashtag servers allow anyone to host one as part of the network? So you don't mean that some organization is in charge of hosting it?
