Show HN: Answer Overflow – Indexing Discord content into the web
333 points| rhyssullivan1 | 2 years ago |answeroverflow.com
I'm Rhys, and I develop Answer Overflow, a search engine for Discord channels. Answer Overflow indexes content from channels into Google, making it discoverable on the web.
I'm sharing this again after seeing a lot of discussion during the Reddit blackout about the inaccessibility of information sent in Discord servers.
Answer Overflow is a verified bot in over 100 communities, fully complies with the Discord ToS, and is open source! https://github.com/AnswerOverflow/AnswerOverflow
Check out some of the communities here!
T3 Community - https://www.answeroverflow.com/c/966627436387266600
C# - https://www.answeroverflow.com/c/143867839282020352
Reactiflux - https://www.answeroverflow.com/c/143867839282020352
All - https://www.answeroverflow.com/browse
Please let me know what feedback you have, thanks for checking it out!
apignotti|2 years ago
I really don't understand how the need for indexing and search was overlooked.
rhyssullivan1|2 years ago
Discord started as "your private place for your friends to talk" at a time when there were a lot of privacy issues with other communication methods.
Then, as it grew beyond being a private place for friends, it would have been good for indexing to be added. But indexing a normal text channel is really hard, since you don't know where a conversation starts or stops, so you don't know what to submit to a sitemap.
Now we've got large public communities and forum channels, so it's possible they'll roll out their own version soon. It does still slightly go against how their product was originally created, though, so there may be some hesitation about adding it without knowing what the community reaction will be.
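To illustrate why forum channels make indexing easier: each thread has a clear start, so it can map to exactly one sitemap entry. Below is a minimal sketch, assuming each thread is a dict with an `id` and a `last_message_date`. The `/m/<id>` URL pattern, the field names, and the `build_sitemap` helper are hypothetical and are not taken from Answer Overflow's code.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(threads, base_url):
    """Return sitemap XML with one <url> entry per forum thread.

    `threads` is an iterable of dicts with hypothetical keys
    `id` and `last_message_date` (an ISO 8601 date string).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for thread in threads:
        url = ET.SubElement(urlset, "url")
        # Hypothetical URL scheme: one page per thread.
        ET.SubElement(url, "loc").text = f"{base_url}/m/{thread['id']}"
        ET.SubElement(url, "lastmod").text = thread["last_message_date"]
    return ET.tostring(urlset, encoding="unicode")
```

A plain text channel has no equivalent unit to list, which is the difficulty described above.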
esafak|2 years ago
Kiro|2 years ago
madeofpalk|2 years ago
starlevel003|2 years ago
It wasn't overlooked. The point is to make it difficult for outside users to access information unless they sign up.
thunky|2 years ago
thrashh|2 years ago
A forum is totally different.
And even then, forums weren’t designed to be archived from the start. People just wrote web crawlers and search engines.
(I know Discord has some forum-like functionality now but the point stands.)
chillfox|2 years ago
The lack of good search really prevents the hostility towards new users that you often see on Reddit/forums, where every question is instantly answered by a one-liner "use the search" reply.
Discord communities are some of the most friendly and welcoming communities I have ever encountered on the internet. I think a large part of it is the chat nature and inability to easily pull up old comments.
andybak|2 years ago
The one user whom I contacted said they had never clicked the green consent button.
EDIT - turns out those posts were only visible to me when I was logged in to both sites (which makes sense).
It wasn't obvious this was the case and checking incognito shows things correctly.
rhyssullivan1|2 years ago
easygenes|2 years ago
[1] https://maggieappleton.com/cozy-web
TeMPOraL|2 years ago
What's happening is that these "communities" demand that you commit first, and refuse to provide value to passive participants. If that sounds reasonable to some, let me point out that the entire value of the Internet is built on doing the opposite. Wikipedia, Reddit, StackOverflow, everything that you can find through a search engine - those are all resources made available by people and groups that, for various reasons, decided to share knowledge instead of hoarding it, and to invite passive participation instead of demanding active commitment. The good days of the Internet, the ones people mourn, back before it got fully commercialized? They were built on the sentiment of openly sharing information, giving it away "pay it forward" style - not gate-keeping it in webs of trust, and/or demanding that people pay with effort.
Maybe I'm too old, but I hate the "cozy web" with a passion.
rhyssullivan1|2 years ago
There are lots that have support channels though, for programming libraries, games, etc., and having all of that content locked away can be really damaging.
One of the interesting things I've noticed is when a community for a more niche game / programming library joins Answer Overflow, they often shoot up to being top performers on the site which is great to see.
Along with that, not all channels are indexed, mainly just help channels. What's nice with this is it keeps that cozy feeling of a private place to talk, while helping more people find a community they will enjoy and keeping information accessible.
Long term, I'd like to implement anti-abuse tools for communities so they can understand what kinds of people join their server from Answer Overflow. For example, if it turns out that 90% of the people who join are abusive, then it'd make sense for them to turn off indexing.
You could possibly make the argument that, for the long-term health of some communities, having indexed content helps to keep the community active.
philippejara|2 years ago
leobg|2 years ago
Question in one message. Then two unrelated messages. Then a partial answer by somebody. And so on.
It’s even worse than indexing a PDF. Just breaking stuff into paragraphs and generating embeddings isn’t going to cut it.
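To make the interleaving problem concrete, here is a naive sketch of one heuristic: follow explicit reply references where they exist, and otherwise fall back to time gaps. The message shape (`id`, `ts`, optional `reply_to`) and the `group_messages` helper are assumptions for illustration, not how any existing indexer is known to work.

```python
from datetime import datetime, timedelta


def group_messages(messages, max_gap=timedelta(minutes=30)):
    """Group chat messages into rough conversations.

    Each message is a dict with hypothetical keys `id`, `ts` (datetime)
    and an optional `reply_to` (id of the message being answered).
    """
    conversation_of = {}  # message id -> index into `conversations`
    conversations = []
    previous = None
    for msg in sorted(messages, key=lambda m: m["ts"]):
        parent = msg.get("reply_to")
        if parent in conversation_of:
            # An explicit reply joins the conversation of its parent.
            index = conversation_of[parent]
        elif previous is not None and msg["ts"] - previous["ts"] <= max_gap:
            # No reply link: assume it continues the most recent message.
            index = conversation_of[previous["id"]]
        else:
            # A long silence starts a new conversation.
            index = len(conversations)
            conversations.append([])
        conversations[index].append(msg)
        conversation_of[msg["id"]] = index
        previous = msg
    return conversations
```

Unrelated messages sent within the time window and without a reply link still end up in the wrong group, which is exactly the failure mode described above.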
sprremix|2 years ago
Some communities I'm in have #support channels which only support threads. So you create a thread, add a title and a body message and people can reply to your thread by clicking on it. There's no way to post individual messages; only comments in threads.
Thread overview: https://i.imgur.com/jfvrRtG.png
Opening a thread: https://i.imgur.com/pqGrARI.png
This solves your context problem. I'm still not sure this is the direction we want to go in, though. It just proves to me that Discord is not the right tool for the problem at hand.
mdaniel|2 years ago
rhyssullivan1|2 years ago
- Answer Overflow works on a consent basis for displaying messages (https://docs.answeroverflow.com/user-settings/displaying-mes...), while Linen does all the messages in a community. The consent system Answer Overflow has helps a lot with respecting user privacy while also getting content indexed.
- Linen appears to be building out a competitor to Slack & Discord while Answer Overflow is focused on building on top of those platforms, so we've got very different roadmaps. From what I can gather from the Linen roadmap, they're implementing things like voice chat, private channels, etc., whereas with Answer Overflow some of the things I'm focused on are answer automation, tracking outdated answers, analytics for where to improve your docs, etc.
- Answer Overflow is pretty much only focused on Discord servers. It wouldn't be too hard to support both Slack and Discord, but focusing on Discord for now helps with our goal of being the best indexing tool specifically for Discord
- Global search (https://www.answeroverflow.com/search), you can search all Answer Overflow communities at the same time
The team at Linen have built out a great product though and it's cool watching them succeed with it!
bitshiftfaced|2 years ago
rhyssullivan1|2 years ago
- The API grants you essentially a sublicense to the data, since Answer Overflow is a bot going through the official API and following the ToS properly, that should cover it for any potential issues
- Answer Overflow gets consent from users to use their messages https://docs.answeroverflow.com/user-settings/displaying-mes...
jaygreco|2 years ago
I’ve been wanting to set something like this up for the nullbits server for a while. When I picked discord instead of a forum, I wasn’t counting on the growth we saw. There’s a lot of friction for new folks who aren’t yet on discord, and there’s a lot of knowledge in the server that’s locked behind discord.
Just set everything up! My only feedback is that enabling indexing for all of our text channels took a while doing them all individually, but that’s kind of on me for not enabling forums for help requests until now.
rhyssullivan1|2 years ago
If you have any other feedback, please send it to me on Discord so I make sure I see it - thanks!
freediver|2 years ago
Unless a general-purpose web search engine introduces a special Discord 'tab', like the Images/News/Videos tabs that already exist, there is no way for a search engine to assign relevance to anything said on Discord, because there is no authority or link-graph-based credibility for any message. In other words, a mention of 'blue widgets' on Discord is competing with millions of web pages mentioning 'blue widgets', which all have some kind of built-in relevance. If the idea is that this will be achieved through people linking to an aggregator like this website, then perhaps, but the approach does suffer from the chicken-and-egg problem.
andybak|2 years ago
But also either answeroverflow.com will gain some domain authority over time, or the communities will be hosted on domains that already have some.
jcq3|2 years ago
mid-kid|2 years ago
wanderingbit|2 years ago
Once I do that I'd like to DM you with some questions mid-kid.
Nice job on getting so much implemented and open for users!
rhyssullivan1|2 years ago
Alifatisk|2 years ago
returnInfinity|2 years ago
Good luck!
arp242|2 years ago
(And to be honest, I think they would be justified too; I initially assumed it was related to Stack Overflow based on the title, but it turns out it's not. This is the sort of confusion trademarks are intended to prevent.)
rhyssullivan1|2 years ago
> Do name your application with something unique. Including one of the terms, "Stack" or "Exchange" or "Overflow" in your product name is generally okay.
It's a different enough product that I feel comfortable with it - Stack Overflow is only for programming while Answer Overflow is for all topics. Along with that Overflow is a pretty generic word and if you wanted to get super technical with it, the context I'm using the word in is "I have so many answers they're overflowing" while theirs is a reference to a programming term.
We'll see, and I'm not a lawyer, but given that their trademark guidelines allow it, I feel comfortable.
cheschire|2 years ago
https://stackoverflow.com/legal/trademark-guidance
dancemethis|2 years ago
retox|2 years ago
bsenftner|2 years ago
ilrwbwrkhv|2 years ago
isnhp|2 years ago
informal007|2 years ago
tudorw|2 years ago
berkle4455|2 years ago
Walled gardens are going to get a whole lot stricter.