This kind of system will suffer from the ratchet problem. A single bug that negatively impacts state is no longer fixable by rebooting. Instead, you have to format/reinstall. Unintended state becomes permanent, and upgrade paths become constrained to the point where you must do intentional damage. I'd also be leery of fragmentation.
The project is falling into the "simplification trap", where all it's doing is moving inherent complexity around (which eventually requires users to do a bunch of work-arounds) rather than eliminating it (because inherent complexity can't be eliminated; only needless complications can be eliminated).
In theory, this sort of setup would be nice in a perfect world, but in the real world of buggy software and faulty hardware and cosmic rays and failing network connections, it's a disaster waiting to happen.
Well said. The first thing I thought when I saw this project was: sounds like cache-invalidation hell.
Now a second and genuine follow up and wildly generalized question:
What if, given that information isn't lost (no-hiding theorem), death in the biological platforms is a reset for consciousness to continue its evolution after a renewal of any state pollution?
You could base such a system on software transactional memory and persistent data structures (like in Clojure) and, in case of a bug, just revert the state to any particular point in time by changing one pointer.
Your app crashed? No problem - Rewind 5 minutes earlier and don't do the thing that crashed it.
There's a whole unexplored universe of possibilities when we do away with traditional OS design. I'm especially interested in how it would work with Intel Optane.
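To make the "revert by changing one pointer" idea concrete, here is a minimal sketch. It is not how Phantom or Clojure implement it: the `History` class and its method names are made up for illustration, and each version is a full read-only copy rather than a structurally shared persistent data structure.

```python
from types import MappingProxyType


class History:
    """Hypothetical state store: every commit is kept, 'current' is an index."""

    def __init__(self, initial):
        # Each version is wrapped read-only so old states can never be mutated.
        self._states = [MappingProxyType(dict(initial))]
        self._head = 0

    @property
    def state(self):
        return self._states[self._head]

    def commit(self, **changes):
        # Build a new version instead of modifying the current one.
        new_state = MappingProxyType({**self.state, **changes})
        # Drop any "future" versions left over from an earlier rewind.
        del self._states[self._head + 1:]
        self._states.append(new_state)
        self._head += 1

    def rewind(self, steps):
        # Reverting only moves the head pointer; no data is copied or rebuilt.
        self._head = max(0, self._head - steps)


history = History({"doc": "v1"})
history.commit(doc="v2")
history.commit(doc="v3", crashed=True)  # the change that broke things
history.rewind(1)                       # back to the last good version
print(dict(history.state))
```

A real implementation would share structure between versions so that keeping history stays cheap, which is the part Clojure's persistent collections provide.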
Rebooting does not fix any bugs; it just resets the state. I'm not sure this is really the better solution, because many bugs never get fixed thanks to "Have you tried turning it off and on again?"
This is what happened to my iOS install. Due to some problem (is it an exploit hidden somewhere among incoming messages?) my Messages app takes 5+ seconds to open every time, and there is absolutely nothing I can do except set up a whole new iPhone from scratch (and hope the problematic message won't get synced from iCloud anyway).
On rooted Android I could clear corrupted app data.
Yes, I have a similar feeling with Smalltalk (i.e. Squeak or Cincom).
A corrupted image is difficult to fix, and you end up saving the code to a "parcel"/package and reloading it into a clean image.
I think the idealism lies in the ability to reset/reboot. How much system behavior is undefined? From a security perspective, it's nearly impossible to tell exactly what malware did when so much data is ephemeral.
Isn't that what snapshots are for? If rebooting breaks, then roll back to a previous snapshot of the kernel or core components but keep all the user's data intact.
I like the ideas in Phantom. Even considering the negative comments, it is refreshing to see people having a go at OS design that isn't just blindly making yet another UNIX clone, because apparently we can't get enough of them.
I'm not sure this actually solves the real problem that people have. Sure, reboots are annoying and pretty much all current OSes could be improved in this regard. Interestingly, back in the 90s SunOS supported live kernel upgrades: at the same time the kernel on disk was updated, the running system was also patched so it would act as if rebooted, but without any disruption to the live system.
However, in the current days where rebooting is seen as normal for an OS upgrade, but most of the time people just put their computer to sleep, the main problem is that all the network interfaces are effectively useless when the system wakes up because most connections will have long since timed out due to lack of connection acknowledgement packets. In such a case, systems will have to tear-down and re-create a lot of state anyway.
The actual issue this seems to solve, that of saving memory state and restoring it again, is already solved for most use cases except OS upgrades by using sleep mode. But the hard case of network connections is still unsolved for this system, and by solving the problems it does solve with yet another VM-based environment, it'll probably be doomed to obscurity unless common existing applications and virtual machines can easily be made to run on top of it.
> all the network interfaces are effectively useless when the system wakes up
This is, and probably always will be, a problem with networked software. You should program as if TCP connections could drop at any time, for any reason. Browser tabs get backgrounded. Laptops go to sleep. Cell phones roam, and go into tunnels. Servers have temporary net splits.
Most high level tasks can be retried safely. Nonces and things can be used to safely retry almost anything else.
I miss the simplicity of IRC. The way Slack and Discord smoothly transition between online and offline states is graceful and intuitive. That is how almost all software should behave.
It's even more essential in IoT and mobile devices, where a reboot is often necessary, sometimes just to restore a broken connection, such as a modem's. Nobody works on stable software now; everyone tries to ship as fast as possible. That's not necessarily a bad thing, but it is certainly orthogonal to persistence.
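As a rough illustration of "nonces can be used to safely retry", here is a small sketch. Everything in it is hypothetical (`FlakyServer`, `deposit_with_retry`, the in-memory dedup table); the point is only that the client picks one key per logical operation and reuses it on every retry, so the server can recognise a repeat.

```python
import uuid


class FlakyServer:
    """Hypothetical server that applies each request key at most once."""

    def __init__(self):
        self.balance = 0
        self._seen = {}              # request key -> cached response
        self.drop_next_reply = True  # simulate one lost reply

    def deposit(self, key, amount):
        if key not in self._seen:
            # First time this key is seen: apply the operation and remember it.
            self.balance += amount
            self._seen[key] = self.balance
        if self.drop_next_reply:
            # The work was done, but the client never hears about it.
            self.drop_next_reply = False
            raise ConnectionError("connection dropped before the reply arrived")
        return self._seen[key]


def deposit_with_retry(server, amount, attempts=3):
    key = str(uuid.uuid4())  # one key per logical operation, reused on retries
    for _ in range(attempts):
        try:
            return server.deposit(key, amount)
        except ConnectionError:
            continue  # safe to retry: the server deduplicates on the key
    raise ConnectionError("gave up after retries")


server = FlakyServer()
print(deposit_with_retry(server, 10), server.balance)
```

Without the key, the retry after the dropped reply would deposit twice. With it, the second attempt just returns the cached result.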
The plan is (was?) running any JVM stuff on their VM.
Also, IIRC Dmitry was thinking about embedded systems applications at least back in 2011 when he was giving a talk about PhantomOS at HighLoad++ conference in Moscow.
Nowadays, if you can make your hobby OS run WebAssembly I think you'd be able to get a lot of functionality out of it, even if it can't run Office x86 binaries.
> Its primary goal is to provide environment for programs that survive OS reboot. Such an environment greatly simplifies software development
Simplifies? I can't even imagine such a program. To me a program is something that starts and ends and can also re-start with a fresh state in case something goes wrong.
What about some middle ground? I’d like my IDE to remain constant between reboots, and to some degree my browser already does (if it crashes, it reopens all current tabs on start, which is effectively the same).
I would have thought some clever hacks on an existing kernel would be reliable enough rather than an entirely new OS just for one feature, though.
I studied some persistent object store databases in the past, which seemed like an incredibly good idea, and I think the AS/400 used such a database and was very popular.
It removed the need for a filesystem or any of the usual patterns for retrieving data like SQL so a whole class of programming that people think of as "normal" today just vaporised.
Persistent programs seem like a logical-ish next step but I wish the first step could have been taken because it was very nice to program in.
Not related to Phantom OS, but the OS running the Apollo 11 guidance computers had this same feature. In fact, the famous 1202 program alarms right before landing are related: a low priority task was overwhelming the system and the OS restarted to get a clean slate, continuing the high priority tasks where they left off:
“The software rebooted and reinitialized the computer, and then restarted selected programs at a point in their execution flow near where they had been when the restart occurred.” [0]
Did it? It seems more like the guidance application was saving checkpoint data, rather than this being an OS feature. Though the distinction between OS and applications might be muddy here. In fact, a persistent OS design would be catastrophic, as restoring all jobs would just cause resource exhaustion again.
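For what application-level checkpointing looks like in general, here is a minimal sketch. It has nothing to do with the actual Apollo software: the function names, the JSON file format and the `fail_at` crash simulation are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    # Write to a temporary file, then rename, so a crash mid-write
    # never leaves a half-written checkpoint behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default  # no checkpoint yet: start from scratch


def run_job(path, total_steps, fail_at=None):
    """Sum 0..total_steps-1, checkpointing after every step."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if state["step"] == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
    return state["total"]


with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "job.json")
    try:
        run_job(checkpoint, 5, fail_at=3)  # dies partway through
    except RuntimeError:
        pass
    print(run_job(checkpoint, 5))          # resumes from step 3
```

The application decides what goes into the checkpoint and which work is worth resuming, which is the difference from an OS that blindly restores everything.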
I get similar benefits from running an application in a VM. If I don’t want to shut down the application, I just save the running state of the VM. But if I need to recover from a crash or other state error, I can still reboot the VM.
How do you handle backup and restore on a system like this? If the snapshots are at the OS level, and there is cross-app commingling of data objects, how can you restore the state of a particular application or object (formerly "file") without breaking everything? How do I restore a single document or a single contact in my address book?
I wonder how much wear on the system disk is caused unnecessarily by taking continuous snapshots.
I think that you would want to limit what is actually persisted to the data that needs to be persisted: that which can't be recreated quickly, purely from other objects.
Isn't the ideal of an OS not to need to reboot ever? This project is about intentionally rebooting?
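A tiny sketch of that split between source data and derived data, assuming a made-up `Document` class: only the text is written out, and the word index is rebuilt lazily after a restore.

```python
import json


class Document:
    """Hypothetical object with one persistent field and one derived cache."""

    def __init__(self, text):
        self.text = text         # source of truth: must be persisted
        self._word_index = None  # derived: cheap to rebuild from text

    @property
    def word_index(self):
        # Rebuild the index on first use after creation or restore.
        if self._word_index is None:
            index = {}
            for position, word in enumerate(self.text.split()):
                index.setdefault(word, []).append(position)
            self._word_index = index
        return self._word_index

    def to_json(self):
        # Only the non-recreatable data is written out.
        return json.dumps({"text": self.text})

    @classmethod
    def from_json(cls, payload):
        return cls(json.loads(payload)["text"])


doc = Document("persist the data not the cache")
_ = doc.word_index                      # populate the cache
restored = Document.from_json(doc.to_json())
print(restored.word_index["the"])
```

In an OS that snapshots everything, this distinction would have to be expressed some other way, for example by marking objects as transient.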
I just had the question of 'whatever happened to ksplice?' and immediately found my answer of 'oh oracle' no wonder nobody talks about ksplice anymore.
I'm going to self-plug here work I'm a part of that does persistent processes, although the motivation is different. I do think this paper does a good job of informing the reader on why these sorts of features would be really cool, which this project does not necessarily dive into. But it requires widening the API and allowing developers to choose how they persist.
I remember reading about another, pretty old, microkernel OS which had persistent processes. It was probably a capability based OS. Anyone know what that was?
Ooh, interesting. Probably unrelated, but I was thinking a GAI would use one of these at its core, e.g. "it can't die", for the self/state aspect.
Another criterion is code that modifies itself without recompiling.
This also considers a power source like a nuclear battery or something that will last a long time or alternate in the low power state/ambient or whatever energy.
New operating systems are really needed for virtual and augmented reality, especially with higher-resolution, more comfortable devices that people use for work. I think the 3D file system representation will finally become normal.
It will be an exciting time for user interface development.
Possibly we could see networked collaboration/"multiplayer" at the OS level.
[+] [-] kstenerud|4 years ago|reply
[+] [-] sebastianconcpt|4 years ago|reply
[+] [-] ajuc|4 years ago|reply
[+] [-] stormking|4 years ago|reply
[+] [-] joshuajomiller|4 years ago|reply
[+] [-] tomaskafka|4 years ago|reply
[+] [-] daitangio|4 years ago|reply
Static image is indeed a nice feature I like.
[+] [-] checker659|4 years ago|reply
[+] [-] resters|4 years ago|reply
[+] [-] encryptluks2|4 years ago|reply
[+] [-] formerly_proven|4 years ago|reply
[+] [-] pjmlp|4 years ago|reply
[+] [-] AnIdiotOnTheNet|4 years ago|reply
[0] By which of course we mean Linux.
[+] [-] ralferoo|4 years ago|reply
[+] [-] josephg|4 years ago|reply
[+] [-] xvilka|4 years ago|reply
[+] [-] timka|4 years ago|reply
[+] [-] kingcharles|4 years ago|reply
[+] [-] qwerty456127|4 years ago|reply
[+] [-] hsbauauvhabzb|4 years ago|reply
[+] [-] WalterGR|4 years ago|reply
[+] [-] t43562|4 years ago|reply
[+] [-] elteto|4 years ago|reply
[0] https://www.hq.nasa.gov/alsj/a11/a11.1201-pa.html
[+] [-] garaetjjte|4 years ago|reply
(more details on Apollo 11 problem: https://www.doneyles.com/LM/Tales.html)
[+] [-] xupybd|4 years ago|reply
[+] [-] dang|4 years ago|reply
Phantom OS, a Russian OS where “everything is an object” - https://news.ycombinator.com/item?id=19672610 - April 2019 (23 comments)
[+] [-] jl6|4 years ago|reply
[+] [-] strlen|4 years ago|reply
[+] [-] webmaven|4 years ago|reply
[+] [-] offmycloud|4 years ago|reply
[+] [-] Findecanor|4 years ago|reply
[+] [-] sleepingadmin|4 years ago|reply
[+] [-] bigodanktime|4 years ago|reply
https://www.rcs.uwaterloo.ca/pubs/sosp21-aurora.pdf
[+] [-] mahesh-hegde|4 years ago|reply
[+] [-] Findecanor|4 years ago|reply
But there have been several others also based on similar ideas, such as Mungi.
In recent years, there is Twizzler, intended for NVRAM. <https://www.youtube.com/watch?v=0Ix5DYKxzLI>
[+] [-] jcun4128|4 years ago|reply
[+] [-] ilaksh|4 years ago|reply