alphaXiv: Open research discussion on top of arXiv

[+] amadeuspagel|1 year ago|reply

Great idea.

- The frontpage should directly show the list of papers, like with HN. You shouldn't have to click on "trending" first. (When you are logged in, you see a list of featured papers on the homepage, which isn't as engaging as the "trending" page. Again, compare HN: Same homepage whether you're logged in or not.)

- Ranking shouldn't be based on comment activity, which ranks controversial papers, rather papers should be voted on like comments.

- It's slightly confusing that usernames allow spaces. It will also make it harder to implement some kind of @ functionality in the comments.

- Use HTML rather than PDF. Something that could be trivial with HTML, like clicking on an image to show a bigger version, requires you to awkwardly zoom in with PDF. With HTML, you would also have one column, which would fit better with the split paper/comments view.

[+] impendia|1 year ago|reply

> Use HTML rather than PDF.

The PDF is the original paper, as it appears on arXiv, so using PDF is natural.

In general academics prefer PDF to HTML. In part, this is just because our tooling produces PDFs, so this is easiest. But also, we tend to prefer that the formatting be semi-canonical, so that "the bottom of page 7" or "three lines after Theorem 1.2" are meaningful things to say and ask questions about.

That said, the arXiv is rolling out an experimental LaTeX-to-HTML converter for those who prefer HTML, for those who usually prefer PDF but may be just browsing on their phone at the time, or for those who have accessibility issues with PDFs. I just checked this out for one of my own papers; it is not perfect, but it is pretty good, especially given that I did absolutely nothing to ensure that our work would look good in this format:

https://arxiv.org/html/2404.00541v1

So it looks like we're converging towards having the best of both worlds.

[+] throw_pm23|1 year ago|reply

Counterpoint: please don't do any of the above and keep arxiv as it is. It is too valuable to mess up, it is one of the few things on the internet that have not been ruined yet, and the "comment activity" can happen in the articles themselves at the scale of years, decades, and centuries.

[+] Retr0id|1 year ago|reply

> rather papers should be voted on like comments.

I don't think this is an inherently better approach, but maybe there should be an option for different ranking mechanisms. You could also rank by things like cite-frequency, cite-recency, "cite pagerank", etc.
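To make the "cite pagerank" option concrete, here is a minimal sketch of PageRank run over a citation graph. It is an illustration only, not anything alphaXiv is known to implement; the function name `cite_pagerank`, the input shape (paper id mapped to the ids it cites), and the damping value 0.85 are all assumptions.

```python
def cite_pagerank(citations, damping=0.85, iterations=50):
    """Rank papers by PageRank over a citation graph.

    citations: dict mapping a paper id to the list of paper ids it cites
    (hypothetical input format, chosen for this sketch).
    """
    # Collect every paper that cites or is cited.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # A paper passes its rank to the papers it cites.
                share = damping * rank[p] / len(cited)
                for c in cited:
                    new_rank[c] += share
            else:
                # A paper citing nothing spreads its rank evenly.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank
```

Sorting papers by the returned score in descending order gives one possible ranking; cite-frequency or cite-recency would simply swap in a different score.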

[+] sestep|1 year ago|reply

Tiny note: Stack Exchange also allows spaces in display names, and they make @ functionality work regardless: https://meta.stackexchange.com/a/43020/297476

Agreed that it makes it more complicated though.
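One way to handle mentions when names contain spaces is to match the text after each "@" against the known display names, longest first. This is a rough sketch under that assumption, not Stack Exchange's actual algorithm; `find_mentions` and its arguments are hypothetical names.

```python
def find_mentions(text, display_names):
    """Return the display names mentioned with '@' in text, in order.

    display_names is an assumed list of known users, e.g. the
    participants of the current thread.
    """
    # Try longer names first so "Ada Lovelace" wins over "Ada".
    candidates = sorted(display_names, key=len, reverse=True)
    lowered = text.lower()
    mentions = []
    pos = lowered.find("@")
    while pos != -1:
        rest = lowered[pos + 1:]
        for name in candidates:
            if rest.startswith(name.lower()):
                end = len(name)
                # Require the name to end at a word boundary.
                if end == len(rest) or not rest[end].isalnum():
                    mentions.append(name)
                    break
        pos = lowered.find("@", pos + 1)
    return mentions
```

Restricting `display_names` to people already in the thread keeps the candidate list short and reduces ambiguous matches.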

[+] rehaanahmad|1 year ago|reply

Great idea, we'll look into making the home page the trending page soon.

Regarding HTML, our original site actually only supported HTML (because it was easier to build an annotator for an HTML page). The issue is that a good ~25% of these papers don't render properly, which pisses off a lot of academics. Academics spend a lot of time making their papers look nice for PDF, so when someone comes along and refactors their entire paper in HTML, not everyone is a fan.

That being said, I do think long term HTML makes a lot of sense for papers. It allows researchers to embed videos and other content (think, robotics papers!). At some point we do want to incorporate HTML papers back into the site (perhaps as a toggle).

[+] ZeroSolstice|1 year ago|reply

> The frontpage should directly show the list of papers, like with HN.

I disagree. There are numerous times where I have browsed the comments on a HN post where people haven't read the article and are just responding to the comment thread. The workflow for this seems a bit different in that a person would have already read a paper and wanted to read through existing discussions or respond to discussion. With that, having the search front and center would follow as the next steps for a person who read a paper and wanted to "search" for discussions related to that paper in particular.

HN is more about aimless browsing, which is a bit different from researching a specific area or topic.

[+] diggan|1 year ago|reply

> - Ranking shouldn't be based on comment activity, which ranks controversial papers, rather papers should be voted on like comments.

How about not ranking things at all? I don't feel like things like this should be a popularity/"like" contest; instead, let the content of the papers and comments speak for itself. Yes, there will be some chaff to sort through when reading, but humanity will manage.

Just sort things by updated/created/timestamp and all the content will be equal.

[+] gradus_ad|1 year ago|reply

> Ranking shouldn't be based on comment activity, which ranks controversial papers

But don't we want people's attention drawn to controversial/conversation generating papers? The whole point of the platform is to drive conversation

[+] runningmike|1 year ago|reply

Many people on earth have names with spaces. So good that a username can reflect a real name a person has.

[+] cgshep|1 year ago|reply

Tenured prof here. Every paper of mine goes on Arxiv with no exceptions, published under CC BY-NC-ND licenses. Some of us are working hard to overcome the system (e.g. look at the IACR's efforts). Unfortunately, academics are still hindered by institutional inertia; in fact, many prefer the status quo, usually those who rely on prestige over actual quality to advance their careers.

[+] michaelmior|1 year ago|reply

> usually those who rely on prestige over actual quality to advance their careers

Unfortunately for those of us pre-tenure, it's difficult to balance these, as I'm sure you are aware. We're evaluated by people who may have the best intentions, but don't work directly in our field. They then determine whether we keep our jobs. It's difficult not to consider prestige as a factor when you know those evaluating you will.

[+] gigatexal|1 year ago|reply

Thank you, thank you, thank you! I've no skin in the game (not an academic and a math idiot but I've a hole in my heart for Aaron Swartz and what he stood for) but I love that there are professors like yourself that believe in the free sharing of knowledge.

[+] Ar-Curunir|1 year ago|reply

What do you mean by the IACR’s efforts here? In the crypto community it’s very much the norm to put everything on eprint, and it is very rare to find a crypto paper not on there

[+] godelski|1 year ago|reply

  > Unfortunately, academics are still hindered by institutional inertia

As an ABD this has been a real pain point for me. Maybe I came into academia thinking what mattered most was the research. But now I'm the stereotypical PhD who passionately despises academia for its lack of being academic. I'm happy to have competition, but at the end of the day are we all not on the same team?

How the hell did we create a system where it is the norm that an advisor does not read a thesis, read papers, or mentor? For that to be the norm among a committee? When I've had issues getting works through review (even when they have high citations due to arxiv) I don't understand why it's acceptable for a response to be "keep trying" instead of "here, I read the paper and reviewer responses, let me help"[0]. It seems inefficient that we throw students into the deep end and watch them sink or learn to swim. I think there'd be a lot fewer dejected PhD students if there was a stronger focus on academics, mentorship, and collaboration over churning out ̶w̶i̶d̶g̶e̶t̶s̶ papers.

I think what pisses me off the most is thinking that research significance and success can be measured __purely__ through metrics like citation counts, H-indices, i10's, awards, etc. I'm not saying those are useless, but that we can evaluate without looking at the content? (as you say, actual quality of work) It's like we learned about Goodhart's Law and decided it was a feature not a bug.

(I know this is not always the case and there are many amazing advisors, but I'd be impressed if someone didn't know this is happening at least somewhere within their department.)

[0] If it takes a village to raise a child, it takes a department to mint a PhD. These types of things should come from committees, not just advisors. Our annual meetings and review shouldn't just be going through the motions.

[+] parpfish|1 year ago|reply

> Tenured prof here.

Yeah, but every pre-tenure or postdoc is like “I can’t fight the system right now, I need to publish enough to still have a job two years from now”

[+] chipdart|1 year ago|reply

> (...) in fact, many prefer the status quo, usually those who rely on prestige over actual quality to advance their careers.

Your comment doesn't read like one from anyone with any relationship with academia. If you had, you'd know that the issue is not a vacuous "prestige" but funding being dependent on hard metrics such as impact factor, and in some cases with metrics being collected exclusively from a set of established peer-reviewed journals that must be whitelisted.

And ArXiv is not one of them.

This means that a big share of academia has their professional future, as well as their institution's ability to raise funding, dependent on them publishing in a small set of non-open peer-reviewed journals.

Reading your post, you make it sound like anyone can just upload a random PDF to a random file server and call it a publication. That ain't it. If you fail to understand the problem, you certainly ain't the solution.

[+] parpfish|1 year ago|reply

Just had an idea that may help the moderation AND encourage higher levels of discourse — comments are not published immediately.

When I was doing peer reviews, it would often take a day or more to read a paper, think it through, and then write up something thoughtful and constructive.

If you introduce a mechanism to delay comments (eg, holding all messages for 24-72 hours before publishing or only releasing new comments on Monday mornings) it would:

- encourage commenters to write longer thoughtful responses rather than short quick comment threads

- reduce back and forth flame wars

- ease the burden on moderators and give them time to do batches of work

- see if multiple commenters come to the same conclusions/critiques to minimize bandwagon effects
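The hold-before-publish idea above can be sketched as a simple visibility filter. This is a minimal illustration, assuming a fixed 24-hour hold (the low end of the suggested 24-72 hours); `Comment`, `HOLD`, and `visible_comments` are hypothetical names, not part of any existing site.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed hold period; the proposal suggests anywhere from 24 to 72 hours.
HOLD = timedelta(hours=24)


@dataclass
class Comment:
    author: str
    body: str
    submitted_at: datetime


def visible_comments(comments, now, hold=HOLD):
    """Return comments whose hold period has elapsed, oldest first."""
    ready = [c for c in comments if now - c.submitted_at >= hold]
    return sorted(ready, key=lambda c: c.submitted_at)
```

Moderators would work on the comments still inside the hold window, which is what gives them time to review in batches.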

[+] w-m|1 year ago|reply

Hey alphaxiv, you won’t let me claim some of my preprints, because there’s no match with the email address. Which there can’t be, as we’re only listing generic first.last@org addresses in the papers. Tried the claiming process twice, nothing happened. Not all papers are on Orcid, so that doesn’t help.

I think it’ll be hard growing a discussion platform, if there’s barriers of entry like that to even populate your profile.

[+] phreeza|1 year ago|reply

How would you propose making claiming possible without the risk of hijacking/misrepresentation?

[+] rehaanahmad|1 year ago|reply

Thanks for reaching out, I am one of the students working on this. We are adding google scholar support soon. If your paper isn't on Scholar or ORCID, you will need to submit a claim that our team reviews. There isn't really any other option, arXiv doesn't allow us to view the author's submission email automatically (although we are in the process of becoming an arXiv labs project soon).

[+] auggierose|1 year ago|reply

Upload a new version of your paper on arxiv, this time with an email address that works.

[+] Y_Y|1 year ago|reply

So you've put a fake email address on your papers? As in, one that you can't receive from? Why?

[+] tc4v|1 year ago|reply

I know you don't have lifetime access to an institutional email address, but using a fake address is so counterproductive. You're only going to claim the paper once, and you should do it while you have access to your email. Then you update your account with a new address.

[+] godelski|1 year ago|reply

Nice idea but I dislike the implementation. Honestly, I very much like OpenReview and question why we don't use that?

OpenReview has:

- preprints

- versioning

- reviewing, with threads and latex support

- ability to link websites, repos, datasets, etc

- bibtex generation

But it's not as popular as arxiv, though very popular in review (conferences often do not use full features)

One thing I dislike about this is that it is open to all. Arxiv doesn't have a hard filter (you just need someone with an account to vet you, which stakes their reputation), but the existence of a filter is critical.

I don't want a place to engage with the public. We have Reddit, hacker news, Twitter, mastodon, and countless other places. I want a place for academics to talk to academics [0], researchers to researchers. There is a serious lack of spaces where serious low level in-the-weeds discussions can happen. Even fucking GitHub and hugging face are swamped by people asking dumb questions on research projects like how to install pytorch, fine tune a model, or where the source code is.

I'm really happy to include a lot of people, but I think we also need spaces where experts know they're talking to peers. Without that you have to assume you're not talking to peers because they outnumber us a few thousand to one. So that doesn't encourage engaging in research or technical discussions, it encourages talking down to peers and misinterpreting.

[0] a degree is not what makes you "an academic"

[+] rehaanahmad|1 year ago|reply

One of the co-creators of this site. A lot of great suggestions I'm reading so far, a lot of them are currently in the works (zooming in/out, infra issues for slow loading times on some papers, google scholar claiming papers).

For some more context, we are a group of 3 students with a background in AI research, and this site was initially built as an internal tool to discuss ai papers at Stanford. We've been dealing with a lot of growing pains/infra issues over the past month that we are in the process of hashing out. From there we would love to make a more concerted effort to share this in areas outside of AI. Happy to hear your thoughts here, or more formally via [email protected].

I do want to highlight, our site has a team of reviewers/moderators and having folks from different subject areas is critical to making sure the site doesn't end up a cesspool, apply here: https://docs.google.com/forms/d/11ve-4cL0axTDcqnHF66zX6greFV....

[+] karmakurtisaani|1 year ago|reply

I remember seeing this idea some years ago. I think it was called qrxiv.org or something like that, but can't find it anymore. I hope this one has better luck, getting the users in the fragmented space of preprints can be a challenge.

[+] fuglede_|1 year ago|reply

There's also https://scirate.com/ which occasionally has active discussions but, at least in my field, there's far from critical mass, and discussions only happen when someone kick starts and advertises a thread.

[+] rsp1984|1 year ago|reply

I launched gotit.pub [1] last year. It's very much the same thing.

[1] https://gotit.pub

[+] karencarits|1 year ago|reply

There is also https://pubpeer.com/

I worry that fragmentation of this space might not be beneficial, so it would be nice if these services could collaborate in some way, perhaps using activitypub or something

[+] cgshep|1 year ago|reply

> Use HTML rather than PDF.

Tenured prof here. Academics don't use HTML, despite its obvious advantages. The incentive system is deeply broken. No big-name journal or conference will accept a well-formatted HTML over their proprietary Latex/Word format. Latex to PDF converters generally suck.

[+] elashri|1 year ago|reply

Arxiv already provides an HTML version of the articles [1]. The authors do not have to provide an HTML version; it is converted by arxiv, e.g. [2]

[1] https://info.arxiv.org/about/accessible_HTML.html

[2] https://arxiv.org/html/2409.00838v1

[+] sundarurfriend|1 year ago|reply

I wish for either:

1) Zoom buttons just for the paper - the article text is often tiny, and zooming in with the browser messes up the page layout and makes the page practically unusable.

OR

2) A simple direct button to download the PDF directly. This would alleviate the zoom problem since I can view it in my local PDF reader with the best settings for me. Having to go to arxiv to download the PDF for every paper would be a nuisance over time though, so a button in the top bar would make the experience a lot better.

[+] AlexDragusin|1 year ago|reply

For me it always downloads the PDF, because I have disabled the View PDF in browser option in my browser settings (on Edge: toggle "Always download PDF files" ON). Consider this as a solution.

Edit: The above is applicable to arxiv itself, I got confused, the alphaxiv.org opens the PDF in a framed way with no option to download, indeed.

[+] rehaanahmad|1 year ago|reply

Zoom is in the works! We are adding this in the coming week!

[+] tinyhouse|1 year ago|reply

We obviously had this for many years with OpenReview, which has a different purpose, but having something for every paper is indeed needed. I have trouble opening some links, guessing it's still under heavy development. Looks nice!

[+] codegladiator|1 year ago|reply

This is great. Already loving the discussions/comments I see there.

[+] cs702|1 year ago|reply

How are the creators going to prevent gaming?

I ask because every system I've ever come across for discussing and ranking content without human moderation is always, sooner or later, gamed.

[+] rehaanahmad|1 year ago|reply

We have a team of enthusiastic reviewers/moderators in a couple sub-categories. We plan on growing this team out as the site continues to grow. If you'd like to be a reviewer: https://docs.google.com/forms/d/11ve-4cL0axTDcqnHF66zX6greFV...

[+] teleforce|1 year ago|reply

It's a shame that arXiv is no longer what it used to be: a very useful pre-print server for work awaiting actual publication. It now looks like a pseudo-journal masquerading as a pre-print server, since arXiv apparently has editorial and review teams that reject papers based on their 'expertises' [1].

Perhaps they think they are reputable now just because Perelman's proof papers were published there and they want to maintain their 'reputation' [2]. The irony is that Perelman would most probably not publish on arXiv in its current pseudo-journal state.

[1] Editorial Advisory Council:

https://info.arxiv.org/about/people/editorial_advisory_counc...

[2] Reclusive mathematician rejected honors for solving 100-year-old math problem, but he relied on Cornell's arXiv to publish:

https://news.cornell.edu/stories/2006/09/proof-100-year-old-...

[+] gr__or|1 year ago|reply

I am very non-eager to help any further platform grow that has not been built on-top of sth like atproto (the BlueSky protocol), to prevent silos and the monopolist landlords that come with those.

Great idea though, would love to use sth like this, if it existed on a federated protocol.

[+] scarlehoff|1 year ago|reply

I believe this site is missing a very important thing, direct links to the different categories with a list of papers. This is at least how I (and I believe many others) browse arXiv. I open it up in the morning, scroll through a few categories and open a few papers that look interesting to me.

I could see myself using alphaxiv for that, and then, if there's a comment section, I might even read it, and, who knows, leave a comment. But there's no way I'm going to be changing the address or going to some other site to search for papers just to see whether there are some comments.

ps: I see the extension adds a "discussion" link to arxiv, it is a pity that it is only available for Chrome.

[+] eigenket|1 year ago|reply

It sounds like what you want is scirate. As far as I understand from this post this new thing is just scirate but lacking the interface you're talking about here.

[+] forgotpwd16|1 year ago|reply

Kinda related: Hypothesis (and Diigo, IIRC, in the past) has an extension/bookmarklet that can provide an annotation/comment overlay on any web page or PDF. I guess what is needed for arXiv discussion is this overlay but smarter, that is, one that knows a paper's PDF and web view are the same and that the abstract page is connected to them.

[+] chfritz|1 year ago|reply

Why limit this to arxiv papers? Why not any paper published online, e.g., via https://bibbase.org? btw, very cool that you seem to have overcome the initial inertia of getting something like this going. The idea is not new, but it's a marketplace dynamic that is hard to bootstrap.

[+] abhayhegde|1 year ago|reply

Great platform for invigorating research discussions! But seeing only AI based (or broadly CS based) research as featured papers is a bit discouraging. Perhaps there isn't enough critical mass for other topics yet.

[+] john-titor|1 year ago|reply

Tried to sign up with my corporate email (life sciences, 100k+ employees worldwide with a big research arm). Says the institution is not known to the service. What's the process to get it known?

[+] rehaanahmad|1 year ago|reply

Email me at [email protected], I'll add it asap!

178 comments