jasonpriem | 1 year ago | on: Ask HN: Who is hiring? (November 2024)
jasonpriem's comments
jasonpriem | 3 years ago | on: OpenAlex: The Promising Alternative to Microsoft Academic Graph
Thanks for the suggestion about the data dump. A lot of that weight is abstracts, which come in at over 30GB just by themselves. But it's true that the JSON format has some redundancies. For now we think those are worth it, because the denormalized schema is very compatible with the API and easy for beginners to get started with. Plus you only have to download it once (for free! HT to AWS Open Data sponsorship), and after that the updates are very light.
We'll certainly consider offering a smaller, normalized format in the future though, if we get more requests for it.
jasonpriem | 3 years ago | on: OpenAlex: The Promising Alternative to Microsoft Academic Graph
We did have some good conversations with folks at Meta before they closed up shop, but didn't end up using any of their data.
jasonpriem | 3 years ago | on: OpenAlex: The Promising Alternative to Microsoft Academic Graph
With the API, folks can outsource the heavy data engineering to us for free, and just do the fun parts themselves. We want to make building real-world apps on the global research graph fun and easy, the kind of thing you can do as a hackathon project, instead of with a six-figure grant.
That said, I agree that it's absolutely essential the entire dataset be easy to download and mirror as well. It's called OpenAlex because it's _open_, soup to nuts (the "Alex" part is homage to the ancient Library of Alexandria). All the data is open, the code is open, and our governance is as open as we can make it. [1]
jasonpriem | 3 years ago | on: OpenAlex: The Promising Alternative to Microsoft Academic Graph
jasonpriem | 9 years ago | on: Unpaywall: Browser extension to find free copies of academic papers
[1] http://www.nature.com/nature/journal/v542/n7642/full/nature2...
jasonpriem | 9 years ago | on: 100 Awesome Women in the Open-Source Community You Should Know
One thing I missed in this writeup was more explanation of their methods. For instance, why were they only able to make gender guesses for 2 million out of 7 million users? That's unusually low for name-based gender identification. I'm guessing this is because many GitHub accounts didn't have first names, but it would be great to actually see that spelled out.
I'd also love to see the percentage of women they found out of those 2 million. Otherwise it's "Top 100 out of the ???? women on GitHub." Hopefully this will be addressed in the followup posts they promised. I'll be looking forward to them.
[disclosure: I'm a PI on http://depsy.org, which is funded by the National Science Foundation. And one of the gals on this list is my co-PI]
jasonpriem | 10 years ago | on: The unsung heroes of scientific software
And as you say, the growing popularity of GitHub gives us all kinds of cool data even when there's no central package manager for the language. In fact, we're mining imports of every Python and R project on GitHub right now to build out the dependency network beyond the (much, much smaller) CRAN and PyPI networks.
The idea with Depsy has been to launch quickly with two languages, so people could see what it looks like, then iterate and add more as we get feedback. So we'll count your comment as +1 for C and C++ :)
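For anyone curious what "mining imports" can look like on the Python side, here is a minimal sketch using only the standard library. It is an illustration, not Depsy's actual code: the function names `top_level_imports` and `dependency_edges` are made up for this example, and it assumes each project's source is already available as a string.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the top-level package names imported by a Python source string."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" counts as a dependency on "os".
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they point inside the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def dependency_edges(projects: dict[str, str]) -> set[tuple[str, str]]:
    """Build (project, imported_package) edges from a hypothetical name -> source mapping."""
    edges: set[tuple[str, str]] = set()
    for project, source in projects.items():
        for package in top_level_imports(source):
            edges.add((project, package))
    return edges
```

A real pipeline would also need to filter out standard-library modules and map import names to package names (they don't always match), which this sketch leaves out.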
jasonpriem | 10 years ago | on: The unsung heroes of scientific software
jasonpriem | 11 years ago | on: Academic Urban Legends
jasonpriem | 11 years ago | on: Show HN: Coredemia – share and discuss research papers
It's a great idea and I hope someone finds the way to make it work.
We're a small nonprofit on a mission inspired by the ancient Library of Alexandria: to create a universal database of research information. We gather, organize, and serve info on 280M papers, 100M authors, billions of citations and more. We’re passionately open: our code is open source, and our data is free and public domain. We value kindness, creativity, and getting stuff done. Our work supports millions of users every day, and we’re growing fast.
We're hiring a senior frontend engineer and product owner. Salary $250k-300k, great perks.
Apply: https://ourresearch.breezy.hr/p/b31ea361225401-senior-fronte...