Doesn't seem practical. It might be useful as a learning-framework for MPI / Supercomputer programming... but it wouldn't be a tool that I'd use personally.
As another commenter said: the primary use of this NanoPi is the ability to emulate a "real" super-computer and really use MPI and such. MPI is a different architecture than a massive node (like a 96-core Thunder X ARM), and you need to practice a bit with it to become proficient.
I wonder if you could run distributed QEMU[1] on it and present it as a single (very "NUMA-ish") virtual machine? I know - node to node latency would kill you, but it could be fun to try.
> Doesn't seem practical. It might be useful as a learning-framework for MPI / Supercomputer programming... but it wouldn't be a tool that I'd use personally.
I read somewhere that some real supercomputer systems programmers actually use toy clusters of Raspberry Pi's to test their scheduling software. It helps speed up their development cycle because they can do initial testing on their desktops.
OTOH if you want to see how your massively parallel algorithm behaves on a 96-node cluster / network, such a box is just $500, and is portable and can work offline.
The GFlops comparisons were more or less a lark, especially the ones comparing energy efficiency with a supercomputer from the 90s. This 96-core rig produces 1 GFlop per watt; compare that to an i9-9900K (250 GFlop) with a Z390 chipset and 1 stick of DDR4 (95W + 7W + 2.5W = 104.5W), which does ~2.4 GFlop per watt.*
* this is back of napkin, real world results will vary
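For anyone who wants to redo the napkin math above, here is a minimal Python sketch. The wattages and the 250 GFlop figure are the rough numbers quoted in the comment, not measurements, and the function name `gflops_per_watt` is made up for illustration.

```python
# Back-of-napkin efficiency comparison using the figures quoted above.
# All inputs are the commenter's rough numbers, not measured values.

def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return throughput in GFlops divided by power draw in watts."""
    return gflops / watts

# Desktop build: i9-9900K + Z390 chipset + one DDR4 stick (watts as quoted).
desktop_watts = 95 + 7 + 2.5
desktop_eff = gflops_per_watt(250, desktop_watts)

# Cluster: the comment gives only the ratio (about 1 GFlop per watt).
cluster_eff = 1.0

print(f"desktop power: {desktop_watts} W")
print(f"desktop efficiency: {desktop_eff:.2f} GFlops/W")
print(f"desktop vs cluster: {desktop_eff / cluster_eff:.1f}x")
```

Swapping in measured wall-socket power and a real benchmark score would give a fairer comparison than these nominal figures.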
The real question is "Will it be powerful enough even though I use a Desktop operating system and a software stack designed for programmer comfort rather than efficiency to control <simple-ish device>".
> The NanoPi Fire3 is a high performance ARM Board developed by FriendlyElec for Hobbyists, Makers and Hackers for IOT projects. It features Samsung's Cortex-A53 Octa Core S5P6818@1.4GHz SoC and 1GB 32bit DDR3 RAM
Who needs such a powerful CPU with so little RAM? The reason I still haven't bought any Pi is that all of them have 2 GiB of RAM or less, and I'm not interested in buying anything with less than 4.
I've been trying to do something similar with 4 Orange Pi Zero Plus boards (this blog was one of my main inspirations). While I know it's not practical, it's fun to design the case and the stand, work out how everything needs to connect, and route it all together. In the end I hope to host a distributed personal website and an MQTT server on it for any IoT tinkering I want to do!
Nice! distcc-based compilation might be something to try on this. :) One thing I noticed is that the heatsink fins are oriented in the wrong direction. Air should be going through the fins, not across the side of them. But I guess any air movement is enough to cool this.
Here is a simple study of distcc, pump mode, and the make -j# option on low-end hardware. It seems that the network could be a bottleneck. The compilation time would probably drop to about 1/4. But I think using -j# is the best advice.
The only supercomputer they compare it to is 27 years old, and it uses Gigabit Ethernet as its interconnect. I think they have a much looser definition of 'Supercomputer' than most people.
I wonder what topology this has. It definitely seems reminiscent of older supercomputers like the Thinking Machines CM-2, which used a hypercube (the famous CM-5 used a fat tree).
[+] [-] dragontamer|7 years ago|reply
A practical baseline for anyone interested in ARM compute would be the Thunder X CPU (cloud rental: https://www.packet.com/cloud/servers/c1-large-arm/): 48 cores per socket, two sockets for a 96-core server.
[+] [-] rwmj|7 years ago|reply
[1] https://events.linuxfoundation.org/wp-content/uploads/2017/1...
[+] [-] eiaoa|7 years ago|reply
I read somewhere that some real supercomputer systems programmers actually use toy clusters of Raspberry Pi's to test their scheduling software. It helps speed up their development cycle because they can do initial testing on their desktops.
Edit: I think this is what I was thinking of: https://www.youtube.com/watch?v=78H-4KqVvrg
http://www.bitscope.com/blog/FM/?p=GF13L
[+] [-] marmaduke|7 years ago|reply
Wouldn’t containers be an easier way to do that?
[+] [-] nine_k|7 years ago|reply
[+] [-] patrioticaction|7 years ago|reply
[+] [-] walterbell|7 years ago|reply
[+] [-] sannee|7 years ago|reply
[+] [-] ElBarto|7 years ago|reply
Cue the many forum questions: "I'm planning to use a Raspberry Pi to control a <simple-ish device>. Will it be powerful enough?"
[+] [-] adrianN|7 years ago|reply
[+] [-] geezerjay|7 years ago|reply
[+] [-] qwerty456127|7 years ago|reply
[+] [-] giancarlostoro|7 years ago|reply
https://www.pine64.org/?page_id=61454
There are pricier x86 options too (> $100), such as the UDOO boards, if you really want an SBC with much more RAM.
[+] [-] gnulinux|7 years ago|reply
What do you need that much RAM for? What do you plan to run on this machine?
[+] [-] epanchin|7 years ago|reply
[+] [-] nightcracker|7 years ago|reply
[+] [-] sheepybloke|7 years ago|reply
[+] [-] floatboth|7 years ago|reply
[+] [-] megous|7 years ago|reply
[+] [-] otherlife35|7 years ago|reply
https://forums.gentoo.org/viewtopic-t-1056580-start-0.html
[+] [-] mschaef|7 years ago|reply
[+] [-] geezerjay|7 years ago|reply
[+] [-] fluxty|7 years ago|reply
[+] [-] aepiepaey|7 years ago|reply
There are two 8-port ethernet switches.
With 12 nodes, this leaves 4 unused ports (2 on each switch).
From the pictures you can see that the box itself has two jacks, both of which are likely connected to one switch each.
The switches don't seem to support link aggregation, so the topology is likely to look like this:
and if you connect both the switches to the same external switch, you'd get something like:
[+] [-] albertgoeswoof|7 years ago|reply
[+] [-] zamadatix|7 years ago|reply