The recording will be available later, but if you're interested in enabling a globally consistent view of hashtags, or know stuff about DHTs or #ActivityPub relays, you can have a look at the paper: https://git.orlives.de/schmittlauch/paper_hashtag_federation/src/branch/master/paper_hashtag_federation.pdf
Please contact me about any questions, remarks or other feedback!
So here's a TL;DR:
@schmittlauch Wow, much respect for thinking about this. It's the largest problem IMHO since the beginning.
I still think the solution is separate software, so that platforms don't have to implement anything more complex than extra search REST API calls. I have not read your (long!) paper yet.
My current preferred solution is a network of index servers that share data over a p2p network or similar. Platforms communicate with them over REST APIs.
@schmittlauch the index servers wouldn't need the whole content of the posts, only 1) ID and 2) hashtags. This would allow platforms to make search requests that return IDs, and then they can fetch the actual content by ID. This would make running the specialized index servers lightweight and would not introduce additional complex p2p development requirements for the platforms themselves. Implementing AP alone is hard enough.
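A minimal sketch of the "only ID and hashtags" idea, assuming an in-memory store; the class name `HashtagIndex` and its methods are hypothetical and not taken from any existing relay or index server. A real index server would put something like this behind a REST endpoint, and the platform would fetch the actual post content by ID afterwards.

```python
from collections import defaultdict


class HashtagIndex:
    """Hypothetical lightweight index: keeps post IDs per hashtag, no post content."""

    def __init__(self):
        # Maps a normalized hashtag to the list of post IDs that carry it.
        self._ids_by_tag = defaultdict(list)

    @staticmethod
    def _normalize(hashtag):
        # Assumption: hashtags are matched case-insensitively, without the leading '#'.
        return hashtag.lstrip("#").lower()

    def add(self, post_id, hashtags):
        """Register a post by its ID under each of its hashtags."""
        for tag in hashtags:
            ids = self._ids_by_tag[self._normalize(tag)]
            if post_id not in ids:
                ids.append(post_id)

    def search(self, hashtag):
        """Return the IDs of all known posts with this hashtag."""
        return list(self._ids_by_tag.get(self._normalize(hashtag), []))


index = HashtagIndex()
index.add("https://a.example/posts/1", ["#ActivityPub", "#DHT"])
index.add("https://b.example/posts/7", ["#activitypub"])
matching_ids = index.search("#ActivityPub")
```

Here `matching_ids` holds only the two post IDs; resolving them to content stays the platform's job.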
@jaywink I also propose the separation into a transparent application proxy component; lain even suggested implementing it as a relay.
Regarding "index servers", it depends on what you mean by that: if the index servers are supposed to crawl the Fediverse themselves, then good luck keeping up with the load: At the scale of that'd be ~140,000 posts/second. Furthermore, the indexers might not even know about every server.
Thus I propose that, as an opt-in, instances actually push ->
@jaywink their published posts to the responsible indexing server. It's not really an indexing server, though, but just a relay server, which itself forwards the post to a longer-term indexing/storage server.
To distribute the load and avoid a central point of authority or failure, I make each server responsible for only a subset of the hashtags.
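One way to picture "each server is responsible for a subset of hashtags" is a hash ring, sketched below. This is only an illustration under assumptions: the paper's actual DHT design may differ, and `HashtagRing`, `responsible_server` and the server names are hypothetical. Every participant that knows the same server list computes the same responsible relay for a hashtag, so no central authority is needed for the lookup.

```python
import bisect
import hashlib


def _ring_position(key):
    # Map any string to a position on the ring using the first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashtagRing:
    """Hypothetical mapping from hashtags to the relay server responsible for them."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # Sort the servers by their position on the ring.
        self._ring = sorted((_ring_position(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_server(self, hashtag):
        # Assumption: hashtags are case-insensitive and compared without '#'.
        pos = _ring_position(hashtag.lstrip("#").lower())
        # The first server at or after the hashtag's position is responsible;
        # the modulo wraps around to the first server at the end of the ring.
        i = bisect.bisect_left(self._positions, pos) % len(self._ring)
        return self._ring[i][1]


ring = HashtagRing(["relay-a.example", "relay-b.example", "relay-c.example"])
target = ring.responsible_server("#ActivityPub")
```

An instance that opted in would then push a post tagged #ActivityPub to `target`.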
I'll let you know once the recording of my talk is released, in case you need an easier start than a 24-page paper ;)
@schmittlauch Great, sounds good 👍 Yeah no totally didn't mean crawling, I mean push from opt-in servers, just like the current relays work (well, the diaspora one at least, not sure how the AP ones work).
Just to understand, the network of hashtag servers would support anyone hosting one as part of the network? So you don't mean that some organization is in charge of hosting it?
@schmittlauch Science papers are a bit difficult for me - I tend to have a really hard time focusing after a page. So looking forward to a tl/dr ;)
@schmittlauch BTW just reading your talk slides.
One error: the #diaspora relay allows subscribers to choose either "all" or "a list of tags". Your talk indicated only the former.
Slides look good, though most of the science speak goes over my head ;) I think the important thing to consider is that whatever the complexity of the relay/indexers/distributed nodes, the API towards implementing social platforms MUST be simple and easy to understand.