A noob question, since the original article doesn't go into details: what exactly is being simulated here? I was under the impression that we can't even reliably fold a single protein due to the sheer complexity of the task. So how do we simulate the zillions that are bouncing around in a single cell? And if we don't simulate it at that level, how are we confident that it is correct?
This is exactly the field I want to enter! I really want to work on the tooling side for atomic simulation (I think I have a design that could complete each timestep in ~10 µs and that doesn't lose speed as it scales). I think it would be cool to automatically extract parameters for coarser-grained models.
Thank you for your contributions. You are quite literally saving lives.
Thank you! and it's awesome you can contribute to the subject!
UConn coordinated a ton of work during the past two decades on mechanistic cell models. Mostly ODEs, PDEs, and stochastic ODEs. See The Virtual Cell at https://vcell.org.
Yes, this. A lot of work in this field is missing from that timeline. Just circa 2010-2020: Les Loew's VCell 3D PDE approaches, Faeder et al.'s BioNetGen / ODE work, Luthey-Schulten's grid-based cell models, the Pittsburgh Supercomputing Center's 3D Monte Carlo MCell, the image-based deep learning models at the Allen Institute for Cell Science...
What a strange web page. Scrolling is thoroughly broken.
[+] [-] hobofan|8 months ago|reply
> By 2021, these engineered bacteria could be simulated in unprecedented detail. Every gene, every major protein, and nearly every metabolic reaction in JCVI-syn3A.
I think the crux is here:
> Even after years of study, 91 of JCVI-syn3A's genes remain unannotated, of which roughly one-third are essential. Deleting any single one kills the cell, yet we have no idea what they do – representing some of biology's most fundamental unsolved puzzles.
---
I think minimal cells and virtual cells are especially exciting as they open up a path to create fully controlled experimental environments for biochemistry from the ground up.
Right now sooo much time in biochemistry goes into working around the limitations of what already happens to be present in an organism. E.g. we may know 5% of the mechanisms that go on in a cell, but the remaining 95% may still brick your experiment, and without knowing about them you essentially have to shrug and trial-and-error your way through them.
In contrast, in a synthetic minimal cell we could start out with an organism where we know 95% of the mechanisms that are going on, and then study new mechanisms one gene at a time, steadily building up to bigger and bigger mechanisms.
Strangely, it seems to me that a lot of effort is going into being able to simulate full cells that contain unknown mechanisms, rather than into using those capabilities to create hypotheses that uncover the unknown mechanisms. Yes, that probably expedites the path towards simulating much bigger human cells, but it ultimately still leaves us in the dark on most fronts.
[+] [-] TeMPOraL|8 months ago|reply
I imagine it's much easier to create and test hypotheses about the unknown mechanisms when you can view them in the context of a larger system, with reasonable performance, allowing you to metaphorically "grab them in your palm" and tweak on the fly. We work better when we explore things, instead of immediately taking on problems that are at the limit of our computational tools, requiring individual brains (and tons of paperwork) to make up the difference.
In this sense, researching the nano-scale basics and aiming to simulate micro-scale cellular systems are actually aligned: as long as they're not cutting too many corners, the latter creates space for the former to be done efficiently.
[+] [-] suddenlybananas|8 months ago|reply
Seems like the result of this general trend in science towards brute prediction, abandoning the goal of explanation or understanding.
[+] [-] moralestapia|8 months ago|reply
This is exactly what I'm an expert at; I even coined a term in the field [1] :).
Since I started doing this 15 years ago (and I know the field predates me by a lot), there has always been this feeling that we are so close to a big breakthrough in biological simulation, but at the same time, progress has been kind of "slow". I think the reason is that pushing the envelope forward in this field requires mastering three (maybe four) different disciplines, your pick of [Bio, Chem, CS, Math, Physics]. Very few people reach this level of simultaneous understanding of all these pieces.
I'm not trying to gatekeep the field, though; much of the progress here (including many of the papers mentioned in TFA) is work coming from PhD students. Anyone could jump into this, but you really need to sit down and try to make sense of it for a while, as in years. A PhD gives one the perfect opportunity for that.
Anyway, I hope this thing keeps moving forward; it's one of the ultimate goals of Biology and it would be extremely beneficial to the world.
1: https://www.frontiersin.org/journals/plant-science/articles/...
[+] [-] ulnarkressty|8 months ago|reply
[+] [-] RivieraKid|8 months ago|reply
Isn't it simply because it's a fundamentally hard problem that may not even be solvable? Simulating a 50-amino-acid protein in water for 1 ms on a top supercomputer using molecular dynamics would take about a week.
Can the current approach lead to models that are even remotely as useful as a full molecular dynamics simulation? The current approach requires us to first discover the hard stuff, the myriad tiny mechanisms happening in the cell.
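To put the "1 ms in about a week" figure in perspective, here is a rough back-of-envelope sketch. It assumes a 2 fs integration timestep, which is a common choice for all-atom MD but is my assumption, not something stated above; the function names are made up for illustration.

```python
# Back-of-envelope check of the "1 ms of MD in about a week" figure.
# Assumption (not from the comment above): a 2 fs integration timestep.

FEMTOSECOND = 1e-15  # seconds


def md_steps(simulated_seconds: float, timestep_fs: float = 2.0) -> float:
    """Number of integration steps needed to cover the simulated time."""
    return simulated_seconds / (timestep_fs * FEMTOSECOND)


def wall_seconds_per_step(simulated_seconds: float, wall_seconds: float,
                          timestep_fs: float = 2.0) -> float:
    """Wall-clock budget per step implied by a given total run time."""
    return wall_seconds / md_steps(simulated_seconds, timestep_fs)


ONE_WEEK = 7 * 24 * 3600  # 604800 seconds

steps = md_steps(1e-3)                            # 1 ms of simulated time
per_step = wall_seconds_per_step(1e-3, ONE_WEEK)  # seconds of wall time per step

print(f"{steps:.1e} steps, {per_step * 1e6:.2f} microseconds of wall time per step")
```

Under that assumption, 1 ms of simulated time is about 5 × 10^11 sequential steps, so finishing in a week means roughly 1.2 microseconds of wall time per step. The steps cannot be run in parallel with each other, which is a large part of why this is hard.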
[+] [-] smj-edison|8 months ago|reply
I'm planning to go to college for electrical engineering (ASIC design), but swap out some of my requirements to focus on particle physics. The college I got into also has an undergraduate MD lab that I got invited to.
Do you have any tips on what skills you've found most valuable as you've done simulation?
[+] [-] _factor|8 months ago|reply
Are there any good local (open-source, ideally) tools and/or libraries one can experiment with? I have access to a couple of HPC clusters and would love to learn more.
[+] [-] udara|8 months ago|reply
You're so right that it feels so difficult to make sense of because of how cross-disciplinary it is. I hope more people invest and work on this stuff as well. I'm hoping to learn more over the years!
[+] [-] nextos|8 months ago|reply
It's interesting how high-throughput perturbation assays have led to data-driven whole cell models. But these are not yet good at making robust predictions.
The future is probably hybrid neuro-symbolic models.
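For anyone who hasn't seen the mechanistic side of such models, here is a minimal sketch of a stochastic simulation (Gillespie's direct method) for a single molecular species that is produced at a constant rate and degraded in proportion to its copy number. The function name, parameters, and rate values are all made up for illustration; real whole-cell models couple thousands of such reactions.

```python
import random


def gillespie_birth_death(k_make, k_decay, n0, t_end, seed=0):
    """Simulate one trajectory of a birth-death process.

    Reactions: 0 -> X at rate k_make; X -> 0 at rate k_decay * n.
    Returns the event times and the copy number after each event.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make            # propensity of production
        a_decay = k_decay * n      # propensity of degradation
        a_total = a_make + a_decay
        if a_total == 0.0:
            break                  # nothing can happen any more
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, weighted by its propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Hypothetical rates: the matching ODE dn/dt = k_make - k_decay * n
# has its steady state at k_make / k_decay = 100 copies.
times, counts = gillespie_birth_death(k_make=10.0, k_decay=0.1, n0=0, t_end=200.0)
print(f"{len(times) - 1} events, final copy number {counts[-1]}")
```

The deterministic ODE gives only the mean behaviour; the stochastic version also shows the copy-number noise around it, which matters when a cell holds only a handful of a given molecule.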
[+] [-] donovanr|8 months ago|reply
It's nice to see the idea of virtual cells make a comeback now, though the meaning seems to have shifted to transcriptomics-based transformer / GPU-powered models (which have issues[0]). It's a fun field / problem, but I think it will make better progress if we take advantage of all the varied computational work that has come before.
[0] Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all https://arxiv.org/abs/2410.13956
[+] [-] paulfharrison|8 months ago|reply
I recently went to a two-day workshop on whole cell modelling. I'm still trying to work out how much of the exercise is fantasy. I get that some of the chemistry is well enough understood to simulate from the ground up, but there's so much more to it.
The oddest thing to me is the level of satisfaction in merely being able to run the model. I would think the model has to be very, very fast, because of all the work that needs to be done with it to fit it to data and fully understand its behavior.
[+] [-] maltee|8 months ago|reply