mkhorton's comments

mkhorton | 3 years ago | on: The origin of the strong form of superconductivity

Not for superconductivity specifically, but for a broad range of properties of crystals, this is what the Materials Project[0] does.

Materials Project is funded by the US Department of Energy and uses supercomputing to simulate hundreds of thousands of different crystal structures on the quantum mechanical level to try and find those which have useful properties for practical applications.

This line of research is broadly called “materials discovery”, “materials design” (often “high throughput”) or even “materials genomics” depending on who you ask. These terms are provided in case anyone wants to search and read more about it.

[0] https://materialsproject.org

mkhorton | 3 years ago | on: The Materials Project

This would be a much bigger conversation, but the SQLite efforts are very cool.

Short answer to your question is that the API load should be fine (I regularly download large subsets of the database myself via the API for research purposes), although there are good and bad ways of writing API queries. We have some tutorials, workshops, etc. available to help newcomers to our API write good queries.
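
As a rough illustration of the "good vs. bad query" point, here is a minimal sketch of one common pattern: asking for many IDs per request and only the fields you need, rather than one request per material with every field. The parameter names (`material_ids`, `fields`) and the batch size are hypothetical placeholders and not the actual Materials Project API; check the API docs for the real names and limits.

```python
from __future__ import annotations

from typing import Iterator, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive batches of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_queries(
    ids: Sequence[str], fields: Sequence[str], batch_size: int = 100
) -> list[dict[str, str]]:
    """Return one parameter dict per batch of IDs.

    The keys "material_ids" and "fields" are hypothetical parameter names
    used for illustration only.
    """
    return [
        {"material_ids": ",".join(batch), "fields": ",".join(fields)}
        for batch in chunked(ids, batch_size)
    ]


# Placeholder IDs, not real database entries.
example_ids = [f"id-{i}" for i in range(250)]
queries = build_queries(example_ids, ["formula", "band_gap"], batch_size=100)
print(len(queries))  # 250 IDs in batches of 100 gives 3 requests instead of 250
```

The idea is simply that fewer, narrower requests are kinder to the server than many broad ones.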

We also have an email address set up ([email protected]) where people can give us a heads up if they are concerned about putting an undue load on our servers; as much as we try to have reasonable automatic limits set, sometimes we have had issues! API traffic continues to grow too, which in some ways is a nice problem to have, but does mean this is a moving target.

mkhorton | 3 years ago | on: The Materials Project

Absolutely, yes. Materials Project runs a service called "MPComplete" where people can submit structures to "help complete the database." There's an API, or we're working on a new drag-and-drop interface on the website to quickly upload a CIF or similar.

By all means email me at [email protected] if you're interested and I can sort it out.

mkhorton | 3 years ago | on: The Materials Project

In the short (~decade) term, we do tape backups of calculation data in Berkeley, and offload data to an independently-funded European project (NOMAD), to ensure data is in at least two locations. Likewise, our production databases are automatically backed up in the cloud, but we also keep a local mirror on a bare metal server. In the longer 2^6-year time frame or further out still, I would just be flattered if the data is at all still useful for people. I think it's fair to say our community has a lot of challenges to face before we get to that point.

We don't seed any torrents ourselves and only support API access (mainly because we're a small team and have to focus our effort), but with the open license I hope the data can live on wherever/however it can.

mkhorton | 3 years ago | on: The Materials Project

> Is there any way I could use this to see if there was merit in that idea?

It likely can't give you an instant answer, but it can be a good starting point for a research project. For example, Materials Project has information about the dielectric properties of materials, as well as datasets for electronic conductivities, vibrational (phonon) properties and the like. So you would start by searching the dataset for the properties of interest to get a shortlist of candidate materials, and then do more focused studies based on those.
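
To make the shortlist step concrete, here is a minimal sketch that assumes you have already downloaded some records into memory. The record fields, thresholds and values below are made-up placeholders for illustration, not real Materials Project data or field names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """A hypothetical, simplified view of one material entry."""
    name: str
    band_gap_ev: Optional[float]
    dielectric_constant: Optional[float]


def shortlist(
    records: list[Record], min_gap_ev: float, min_dielectric: float
) -> list[Record]:
    """Keep records that have both properties and meet both thresholds,
    sorted by dielectric constant, highest first."""
    keep = [
        r for r in records
        if r.band_gap_ev is not None
        and r.dielectric_constant is not None
        and r.band_gap_ev >= min_gap_ev
        and r.dielectric_constant >= min_dielectric
    ]
    return sorted(keep, key=lambda r: r.dielectric_constant, reverse=True)


# Placeholder values only; these are not real calculated properties.
RECORDS = [
    Record("candidate-A", 3.2, 25.0),
    Record("candidate-B", 0.4, 80.0),   # fails the band gap threshold
    Record("candidate-C", 4.1, None),   # property not available
    Record("candidate-D", 2.9, 40.0),
]

candidates = shortlist(RECORDS, min_gap_ev=2.0, min_dielectric=20.0)
print([r.name for r in candidates])
```

Entries with missing properties are dropped rather than guessed at, which is usually what you want for a first pass.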

Note that the Materials Project does also have known materials in its database that are currently used extensively in real-world devices too, so it can also be used to provide additional information about those materials. In this way, if you're looking for an improvement on an existing material, you can start with a known-good material and see if similar materials might exist that offer an improvement on your property of interest.

mkhorton | 3 years ago | on: The Materials Project

It is aimed at inorganic materials in general, and many of the calculations are bootstrapped from existing experimental crystal databases.

However, this is not to say there aren't some biases. A lot of the Materials Project collaborators work on battery research, so there is some bias towards battery materials. But people have used MP to search for new photocatalysts, for example (or carbon capture materials, new phosphors, thermoelectrics for solid-state refrigeration, lead-free piezoelectrics, transparent conductors, etc.; the list goes on).

mkhorton | 3 years ago | on: The Materials Project

> How do projects like this deal with papers published based on falsified data? Do they reproduce any of the source data themselves?

I can't speak to this specific instance, but Materials Project does try to pay close attention to questions of reproducibility and provenance. Materials Project runs open-source repos[0] so that its methods can be verified, individual calculations are available via an API[1] and we also partner with NOMAD[2] to make larger files and calculation artifacts available for direct download. This is in addition to documenting methods via peer-reviewed papers, online docs, etc.

This is not to say that issues of reproducibility don't still exist, or that we ourselves couldn't be doing better. It's a big problem in the community.

[0] https://github.com/materialsproject [1] https://api.materialsproject.org/docs [2] https://www.nomad-coe.eu

mkhorton | 3 years ago | on: The Materials Project

I would agree with your comment, but I think it's fair to ask this question. Discovering new materials can have many unintended consequences, especially if they contain elements that are not earth abundant or have high costs (environmental, personal) associated with their extraction.

mkhorton | 3 years ago | on: The Materials Project

Yes, this is almost exclusively a computational resource, with the exception of experimental data contributed by third parties. Most of our compute comes from the lovely people at NERSC[0].

All our predictions are benchmarked against experimental data wherever possible, but it's always a balancing act between things that can be calculated reliably and at scale, and the latest-and-greatest methods which give the most accurate predictions.

[0] https://www.nersc.gov

mkhorton | 3 years ago | on: The Materials Project

We have a mechanism for uploading experimental data (MPContribs[0]), which can then be linked back to the Materials Project's "material detail pages" for a given material. This also provides a public API for bulk download of this data. We hope this will help make relevant experimental data more discoverable.

[0] https://contribs.materialsproject.org

mkhorton | 3 years ago | on: The Materials Project

There are a few differences, but broadly MatWeb is more useful for manufacturing and has a broader range of materials available (including plastics, extensive metallic alloys, etc.) and real world properties. These are materials you might purchase and use today.

In contrast, the Materials Project provides computed, predicted information on inorganic crystals (typically ideal, stoichiometric crystals) that might be used for many different device applications like solar, optoelectronics, batteries, etc. Many of these crystals will not be available to purchase and will need to be grown in a laboratory, and Materials Project is therefore much more focused on active research into new materials.

mkhorton | 3 years ago | on: The Materials Project

Hi everyone, fun to see The Materials Project make the front page! I work on this, happy to answer any questions.

mkhorton | 5 years ago | on: Ask HN: Who is hiring? (July 2020)

Materials Project, Lawrence Berkeley National Laboratory | Web Developer | Berkeley, CA, USA | Onsite | https://materialsproject.org https://lbl.gov

Mission: We are a group of academic researchers who create and curate the Materials Project, the world's leading database of crystalline materials that is freely available for people to query to find materials for applications such as energy, batteries, solar, water splitting, optoelectronics and more. Our user base is growing exponentially (now >120k) and includes a wide range of people, from students who are just encountering materials science for the first time, to academic researchers and industry users. We’re now in the process of building a new frontend for the website to meet some key needs that have arisen as the project has grown, as well as to share some of the latest data we’ve been generating which will require deep thought in how best to make this data accessible and understandable to the broadest possible audience. If this sounds exciting to you, please get in touch. The Materials Project was founded in 2011.

Technologies: This is a good time to start working with us since we're at the early stages of designing our new frontend, and you will have an opportunity to help us shape what that looks like. We've settled on React and TypeScript for our core technologies, and are committed to modern best practices where possible. Due to the large number of Python developers in our team, we will also be making heavy use of the Plotly Dash framework, and extending this using custom React components, so some Python familiarity will also be useful. All the code we write is open source <3 you can find our code at https://github.com/materialsproject

Team: You will be joining a small team of four core developers, along with a larger research group of many postdocs and graduate students here at LBL, and also interacting with our collaborators worldwide. COVID statement: This is an on-site job; however, we are currently working remotely and have been given guidance to expect this to continue until the end of September.

The official job ad, further details on how to apply, and our equal employment opportunity statement are all available here: https://lbl.referrals.selectminds.com/jobs/web-developer-the...

Please note that this ad is a re-post from June, and we are currently interviewing candidates. However if the job ad link is still active then that means we are still accepting applications. We look forward to hearing from you!

mkhorton | 5 years ago | on: Crystallography Open Database

> I love the API for the materials project!

Ah, so happy it's useful to you!

We're working on a new API internally too (based on FastAPI) that will hopefully bring better documentation along with it, so stay tuned for improvements.

> On the experimental side, how does it compare to ICSD?

We have pretty good coverage of ICSD and other experimental databases, and we continue to process and calculate new materials as they're discovered. We also calculate ordered approximations of disordered structures, but this is an area where we could improve.

We also provide a capability where users can upload crystal structures we don't have and we calculate those too (with credit going to the original uploader).

mkhorton | 5 years ago | on: Crystallography Open Database

For crystallography specifically, there's ourselves (Materials Project), OQMD, AFLOW, Materials Cloud, JARVIS, and a number of more specialized (but no less important) databases. There are also a number of commercial offerings.

Best practices are incredibly difficult. We're trying to establish a common API currently (https://github.com/Materials-Consortia/optimade) that can be adopted by all database providers. How the data is stored behind the scenes is something that ends up being very specific to how the data is generated and what its applications are. We're definitely better as a community than we were ten years ago, but there's a lot of work to be done here.
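
For a flavor of what a common API buys you, here is a minimal sketch that builds an OPTIMADE-style structures query. The `filter`, `response_fields` and `page_limit` query parameters and the `elements HAS ALL ...` filter syntax follow my reading of the OPTIMADE specification; the base URL is a placeholder, and the helper function is hypothetical rather than part of any client library.

```python
from __future__ import annotations

from typing import Optional, Sequence
from urllib.parse import quote, urlencode


def optimade_structures_url(
    base_url: str,
    elements: Sequence[str],
    nelements: Optional[int] = None,
    response_fields: Sequence[str] = (),
    page_limit: int = 10,
) -> str:
    """Build a URL for the /v1/structures endpoint of an OPTIMADE provider."""
    # Filter strings quote each element symbol, e.g. elements HAS ALL "Si", "O"
    quoted = ", ".join(f'"{el}"' for el in elements)
    filter_parts = [f"elements HAS ALL {quoted}"]
    if nelements is not None:
        filter_parts.append(f"nelements={nelements}")

    params = {"filter": " AND ".join(filter_parts), "page_limit": page_limit}
    if response_fields:
        params["response_fields"] = ",".join(response_fields)

    # quote_via=quote percent-encodes spaces as %20 instead of "+"
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# example.org is a placeholder, not a real OPTIMADE provider.
url = optimade_structures_url(
    "https://example.org/optimade/",
    elements=["Si", "O"],
    nelements=2,
    response_fields=["chemical_formula_reduced", "nelements"],
    page_limit=5,
)
print(url)
```

In principle the same query can then be pointed at any provider that implements the specification by swapping the base URL.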

In terms of scientific databases outside crystallography/materials science, Nature's Scientific Data is a good open-access journal to peruse: https://www.nature.com/sdata/
