fock | 4 months ago

I guess for IMS/CICS/TPF/... the IBM mainframe is a perfectly fine appliance compared to the alternatives. While not exactly transaction processors, SAP HANA, Oracle Exadata and co. all market themselves towards the same customer groups; SAP even sells full banking systems for medium-sized banks.

Your point that TCO is lower than a well-executed alternative seems very dubious to me though. Maybe lower than cloud, and certainly lower than whatever crap F100 consultants sold you, but running database unloads with basic ETL for a few dozen terabytes per month and getting an MSU bill in the millions is just ridiculous. The thing that probably lowers the TCO is that EVERY mainframe dev/ops person in existence is essentially a FinOps expert, formed by decades of cloud-style billing. Also, experience on a platform where your transaction processing historically has KB-range size limits, data set names are max. 44 chars (with 8-char qualifiers), files (which you allocate by cylinders) don't expand by default, and whatever else you miss from your '80s computing experience naturally leads to people creating relatively efficient software.

In general even large customers seem to agree with me on that (see Amadeus throwing out TPF years ago), with even banks mostly outrunning the milking machine called IBM. What is and will be left is governments, captured by inertia and corruption (at the top) and kept alive by underpaid lifelong experts (at the bottom) who have never seen anything else.

> during the AWS outage this week.

Also the reliability promises around mainframes are "interesting" from what I've seen so far. The (IBM) mainframe today is a distributed system (many LPARs/VMs and software making use of it) which people are encouraged to run at maximum load. Now when one LPAR goes down (and might pull down your distributed storage subsystem) and you don't act fast to drop the load, you end up in a situation not at all unlike what AWS experienced this week: critical systems are limping on, while the remaining workload has random latency spikes which your customers (mostly Unix systems...) are definitely going to notice...

The non-IBM-way of running VMs on a Linux box and calling it a mainframe just seems like a scam if sold for anything but decommissioning. So I guess those vendors are left with governments at this point.

rbanffy|4 months ago

> The (IBM) mainframe today is a distributed system (many LPARs/VMs and software making use of it)

Not really. While you can partition the machine, you can also have one very large partition and much smaller ones for isolated environments. It also has multiple redundancy paths for pretty much everything, so you can just treat it as a machine where hardware never fails. It’s a lot more flexible than a rack of 2U servers or some blade chassis. It is designed to run at 100% capacity with failover spares built in. This is all transparent to the software. You don’t need to know a CPU core failed or some memory died - that’s all managed by the software. You’ll only notice a couple transactions failed and were retried. You are right in that mainframe operations are very different from Linux servers, and that a good mainframe operator knows a lot about how to write performant software.

fock|4 months ago

And incidentally all documentation recommends not extending your LPARs beyond what is available on a single CPC "node" (see [0]-2-23 for a nice (and honest...) block diagram). If you extend your LPAR across all CPCs I doubt that many of the HA and hot-swap features continue to work (also there are bugs...). E.g. you won't hot-swap memory when it's all utilized:

> Removing a CPC drawer often results in removing active memory. With the flexible memory option, removing the affected memory and reallocating its use elsewhere in the system is possible.

So while you can have single-system images on a relatively large multi-node setup, I doubt many people are doing that (at the place I know, no LPARs have TB of memory...). Also, in the given price range you can easily get single-system images for Linux too: https://www.servethehome.com/inventec-96-dimm-cxl-expansion-...

If you don't need the single-system images, VMware and Xen advertise literally the same features on a blade chassis, minus redundant hardware per blade, which is not really necessary when you just migrate the whole VM...

Also if you define the whole chassis as having 120% capacity, running it at 100% capacity becomes trivial too. And this is exactly what IBM is doing by keeping spare CPUs and memory around in all correctly spec'ed setups: https://en.wikipedia.org/wiki/Redundant_array_of_independent...
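To put numbers on the 120% argument, here is a rough back-of-the-envelope sketch. The unit counts are made up for illustration and are not IBM configuration figures; `physical_utilization` and `load_after_failures` are hypothetical helper names, and the model simply assumes spares take over for failed units one for one.

```python
def physical_utilization(active_units: int, spare_units: int,
                         nominal_load: float = 1.0) -> float:
    """Fraction of ALL installed units that is busy when the active
    units run at nominal_load (1.0 == the advertised 100%)."""
    if active_units <= 0 or spare_units < 0:
        raise ValueError("need at least one active unit and no negative spares")
    return nominal_load * active_units / (active_units + spare_units)


def load_after_failures(active_units: int, spare_units: int,
                        failed_units: int, nominal_load: float = 1.0) -> float:
    """Load per surviving unit after failed_units have died.
    Assumption: spares replace failed units one for one, and the
    workload never uses more than active_units at a time."""
    surviving = active_units + spare_units - failed_units
    if surviving <= 0:
        raise ValueError("no units left")
    usable = min(active_units, surviving)
    return nominal_load * active_units / usable


if __name__ == "__main__":
    # 5 active + 1 spare == a box "defined as having 120% capacity"
    print(f"{physical_utilization(5, 1):.3f}")      # share of installed hardware busy at "100%"
    print(f"{load_after_failures(10, 2, 1):.3f}")   # one failure, absorbed by a spare
    print(f"{load_after_failures(10, 2, 3):.3f}")   # more failures than spares: overload
```

Under these assumptions, "100%" on a 5+1 box is roughly 83% of the installed hardware, and the headroom only holds while failures do not outnumber the spares.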

You are right though that the hardware was and is pretty cool and that this kind of building for reliability has largely died out. Also, up until ARM/Epyc arrived, maximum capacity was above average, but that is gone too. Together with the market segment likely not buying for performance, I doubt many people today are running workloads which "require" a mainframe...

[0] https://www.redbooks.ibm.com/redbooks/pdfs/sg248951.pdf