an-unknown's comments

an-unknown | 10 months ago | on: Running Clojure in WASM with GraalVM

Not sure why people always say it's so hard to build GraalVM ... all you need are roughly two prerequisites and one build command. The prerequisites are a "Labs JDK", which is essentially a slightly modified OpenJDK with a more up-to-date JVMCI (the JIT interface used by Graal), and the build tool "mx".

Since you want to build completely from source, you start by installing OpenJDK. Then you clone the Labs JDK repo [0] and build it just as you would build any other OpenJDK. Once you have the Labs JDK, you don't need the plain OpenJDK anymore; it's only necessary to build the Labs JDK. If you use a normal OpenJDK instead of the Labs JDK for Graal, the Graal build will most likely complain about a "too old JVMCI" and fail. Don't do that.

Next you clone mx [1] and graal [2] into some folder and add the mx folder to PATH. You also need Python and Ninja installed, and maybe something else which I can't remember anymore (but you'd quickly figure it out if the build fails). Once you have that, you go to graal/vm and run the relevant "mx build" command. You specify the path to the Labs JDK via the "--java-home" CLI option, and you decide which components to include by adding the relevant projects to the build command line. I can't remember what exactly happens with just "mx build", but chances are it only gives you a bare GraalVM without anything else, which means no SubstrateVM ("native-image") either; by adding projects on the command line, you can include whatever languages/features are available. And that's it. After some time (depending on how beefy your computer is), you get the final GraalVM distribution in some folder, with a nice symlink to find it.

It's not exactly well documented, but you can figure it out from the CI scripts in the git repos of Graal and the Labs JDK. The "mx build" command is where you decide which languages and features to include; if you want to include languages from external repositories, you have to clone them next to the graal and mx folders and add the relevant projects to the mx build command.

[0] https://github.com/graalvm/labs-openjdk

[1] https://github.com/graalvm/mx

[2] https://github.com/oracle/graal

an-unknown | 1 year ago | on: Users don't care about your tech stack

> Yesterday File Pilot (no affiliation) hit the HN frontpage. File Pilot is written from scratch and it has a ton of functionality packed in a 1.8mb download. As somebody on Twitter pointed out, a debug build of "hello world!" in Rust clocks in at 3.7mb. (No shade on Rust)

While the difference is huge in your example, it doesn't sound too bad at first glance, because that hello world just includes some Rust standard libraries, so it's a bit bigger, right? But I remember a post here on HN about some fancy "terminal emulator" with GPU acceleration, written in Rust. Its binary size was over 100MB ... for a terminal emulator which didn't pass vttest and couldn't even do half of the things xterm can. Meanwhile, xterm takes about 12MB including all its dependencies, which are shared by many programs; the xterm binary itself accounts for just about 850kB of those 12MB. That is where binary size starts to hurt, especially if you have multiple such insanely bloated programs installed on your system.

> If you want to make something that starts instantly you can't use electron or java.

Of course you can make something that starts instantly and is written in Java. That's why AOT compilation for Java is a thing now, with SubstrateVM (aka "GraalVM native-image"), precisely to eliminate startup overhead.

an-unknown | 1 year ago | on: RT64: N64 graphics renderer in emulators and native ports

I think there is some confusion about "ubershaders" in the context of emulators in particular. Old Nintendo consoles like the N64 or the GameCube/Wii didn't have programmable shaders. Instead, they had a mostly fixed-function pipeline, but you could configure some of its stages to fake "programmable" shaders, at least to some degree. Now the problem is, you have no idea what any particular game is going to do until the moment it writes a specific configuration value into a specific GPU register, which instantly configures the GPU to do whatever the game wants from that moment onwards. There literally is no "shader" stored in the ROM; it's just code configuring (parts of) the GPU directly.

That's not how any modern GPU works, though. Instead, you have to emulate this semi-fixed-function pipeline with shaders. Emulators try to generate shader code for the current GPU configuration and compile it, but that takes time and can only be done after the configuration has been observed for the first time. This is where "ubershaders" enter the scene: an ubershader is a single huge shader which implements the complete configurable semi-fixed-function pipeline, so you pass the configuration registers to the shader and it acts accordingly. Unfortunately, such shaders are huge and slow, so you don't want to use them unless necessary. The idea is to keep an ubershader as a fallback, use it whenever you see a new configuration, compile the real shader in the background and cache it, and switch to the compiled shader once it's available, to recover performance. A few years ago, the developers of the Dolphin emulator (GameCube/Wii) wrote an extensive blog post about how this works: https://de.dolphin-emu.org/blog/2017/07/30/ubershaders/
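The dispatch logic behind that hybrid strategy can be sketched in a few lines (a toy model with made-up names, not Dolphin's actual code): render with the cached specialized shader when available, otherwise fall back to the ubershader while a background compile is queued.

```python
class ShaderManager:
    """Toy model of the hybrid ubershader strategy; all names are made up."""

    def __init__(self, compile_fn, ubershader):
        self.compile_fn = compile_fn  # expensive: config -> specialized shader
        self.ubershader = ubershader  # generic shader interpreting the config registers
        self.cache = {}               # config -> compiled specialized shader
        self.pending = []             # configs waiting for background compilation

    def shader_for(self, config):
        if config in self.cache:
            return self.cache[config]    # fast path: specialized shader is ready
        if config not in self.pending:
            self.pending.append(config)  # schedule a background compile
        return self.ubershader           # render this frame without stuttering

    def drain_compile_queue(self):
        # In a real emulator this runs on worker threads, off the render loop.
        for config in self.pending:
            self.cache[config] = self.compile_fn(config)
        self.pending.clear()
```

The first frame with an unseen register configuration renders via the slow but always-correct ubershader; once the background compile has finished, later frames use the specialized shader.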

Only with the 3DS/Wii U did Nintendo consoles finally get "real" programmable shaders, in which case you "just" have to translate them to whatever you need on your host system. You still won't know which shaders you'll see until you observe the transfer of the compiled shader code to the emulated GPU. After all, the shader code is compiled ahead of time to GPU instructions, usually during the build process of the game itself; at least for Nintendo consoles, there are SDK tools to do this. This, of course, means there is no compilation happening on the console itself, so there is no stutter caused by shader compilation either, unlike in an emulation of such a console, which has to translate and recompile those shaders on the fly.

> How come this was never a problem for older [...] emulators?

Older emulators had highly inaccurate and/or slow GPU emulation, so this wasn't really a problem for a long time. Only once GPU emulation became accurate enough, using dynamically generated shaders for high performance, did shader compilation stutter become a real problem.

an-unknown | 1 year ago | on: F8 – an 8 bit architecture designed for C and memory efficiency [video]

> It lacks features like lambda calculus, closures, and coroutines—powerful and proven paradigms that are essential in modern programming languages. These limitations make it harder to fully embrace contemporary programming practices.

And what features exactly would you propose a future CPU should have to support such language constructs? It's not like a CPU is necessarily built to "support C": a lot of code these days is written in Java/JavaScript/Python/..., but as it turns out, roughly any sane CPU can be used as a target for a C compiler. Many extensions of current CPUs are not necessarily used by an average C compiler; think of various audio/video/AI/vector/... extensions. Yet all of them can be used from C code, as well as from any software designed to make use of them. If there is a useful CPU extension which benefits, say, the JVM or V8, you can be sure these VMs will use it, regardless of whether or not it is useful for C.

> Intel tried to introduce hardware assisted garbage collection, which unfortunately failed miserably because C doesn't need it, and we are still having to cope with garbage collection entirely in software.

Meanwhile, IBM did in fact successfully add hardware-assisted GC for the JVM on their Z series mainframes. IBM can do that, since they literally sell CPUs purely for Java workloads. On a "normal" general-purpose CPU, such an "only useful for Java" GC extension would be completely useless if you plan to run, say, only JavaScript or PHP code on it. The problem with such extensions is that every language needs ever so slightly different semantics for its GC, and as a result it's an active research topic how to generalize this into a "GC assist" instruction that is useful for many different language VMs. Right now such extensions are being prototyped for RISC-V, in case you missed it. IIRC for GC in particular, research was going in the direction of adding a generalized graph traversal extension, since that's the one thing most language VMs can use somehow.
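To make the "generalized graph traversal" point concrete, this is the kind of loop such an extension would accelerate: the mark phase of a tracing GC is essentially a worklist-driven graph traversal, and that shape is shared by most language VMs (an illustrative sketch only, not any real ISA proposal):

```python
def mark(roots, successors):
    """Worklist-based mark phase: visit every object reachable from the
    roots. `successors(obj)` yields the objects referenced by `obj`."""
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue          # already visited
        marked.add(obj)
        worklist.extend(successors(obj))  # follow outgoing references
    return marked
```

The only language-specific part is `successors` (how references are laid out in an object), which is exactly why a fixed-function "GC instruction" is hard to generalize while a traversal primitive is not.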

C is in no way "holding back" CPU designs, but being able to run C code efficiently on any CPU architecture which hopes to become relevant is certainly a requirement, since a lot of software today is (still) written in C (and C++), including the OS and the browser you used to write your comment.

Just to be clear: this topic here is about tiny microcontrollers. The only relevant languages for such microcontrollers are C/C++/assembly. Nobody cares if it can do hardware assisted GC or if it can do closures/coroutines/... or something.

an-unknown | 1 year ago | on: 1972 Unix V2 "Beta" Resurrected

Actually, I have the VT240 firmware ROM dumps; that's where I got the original font from. The problem is that the VT240, at least, is a rather sophisticated machine, with a T-11 CPU, an additional MCU, and a graphics accelerator chip. There is an extensive service manual available, with schematics and everything, but properly emulating the whole firmware plus all relevant peripherals is non-trivial and a significant amount of work. The result is then a rather slow virtual terminal.

There is a basic and totally incomplete version of a VT240 in MAME though, which is good enough to test certain behavior, but it completely lacks the graphics part, so you can't use it to check graphics behavior like DRCS and so on.

EDIT: I also know for sure that there is a firmware emulation of the VT102 available somewhere.

an-unknown | 1 year ago | on: 1972 Unix V2 "Beta" Resurrected

The only problem with real VTs is that you have to be careful not to get one where the CRT has severe burn-in, like in the eBay listing. Sure, some VTs (like the VT240 or VT525) are a separate main box + CRT, but then you're missing the "VT aesthetics". The VT525 is probably the easiest one to get which also uses (old) standard interfaces like VGA for the monitor and PS/2 for the keyboard, so you don't need an original keyboard/CRT. At least for me, severe burn-in, insane prices, and the general decay of some of the devices offered on eBay are the reason why I don't have a real VT (yet).

The alternative is to use a decent VT emulator attached to roughly any monitor. By "decent" I certainly don't mean projects like cool-retro-term, but rather something like this, which I started to develop some time ago and which I'm using as my main terminal emulator now: https://github.com/unknown-technologies/vt240

an-unknown | 1 year ago | on: 1972 Unix V2 "Beta" Resurrected

> After that, I've had great fun playing with RT-11 [...]

If you want to play around with RT-11 again, I made a small PDP-11/03 emulator + VT240 terminal emulator running in the browser. It's still incomplete, but you can play around with it here: https://lsi-11.unknown-tech.eu/ (source code: https://github.com/unknown-technologies/weblsi-11)

The PDP-11/03 emulator itself is good enough that it can run the RT-11 installer to create the disk image you see in the browser version. The VT240 emulator is good enough that the standalone Linux version can be used as terminal emulator for daily work. Once I have time, I plan to make a proper blog post describing how it all works / what the challenges were and post it as Show HN eventually.

an-unknown | 1 year ago | on: Majora's Mask decompilation project reaches 100% completion

> For video games the binary is the covered work, not its disassembly.

For the law it doesn't matter much whether you look at binary code in a hex editor or at disassembly, since disassembly is just a 1:1 translation of the binary code. Otherwise it would be sufficient to, say, gzip-compress the binary and distribute that without fearing any copyright claims, since the result would be different and no longer covered, which is obviously not the case. The same applies to decompilation results. That means: for clean-room implementations, you cannot look at any of it.

> And disassembly is covered by the DMCA and explicitly allowed for this type of purpose, so that’s not illegal at least.

You can, under certain circumstances, disassemble code for interoperability purposes. This explicitly does NOT cover the case of "I want to make a 1:1 clone of the whole software" which is what these decompilation projects are about. After all, the point of the "matching decompilation" projects is that if you compile the source code, you get the exact same binary again. And for the non-matching decompilation projects, you get at least very similar code after compilation.

For the DMCA exception, think of it more like this: you are working on GIMP and you want to add support for reading/writing Photoshop files. You can look at Photoshop code to understand how these files are read and written, then derive the file structures and implement the relevant I/O code for GIMP. You can NOT look at any Photoshop code that is not absolutely required for this task, nor can you look at Photoshop code, build your own clone of Photoshop using that knowledge, and call the result GIMP. And even that would still not be a "decompilation project" comparable to these game decompilation projects. I hope you can see how the "clean room" claims for such game decompilation projects are pure nonsense.

> Why does using a computer to aid that process make things questionable?

This essentially boils down to "if I didn't see the original code anyway, how am I supposed to have 'copied' it?" which you can't easily do if the computer did in fact see the original code.

Obligatory EFF link for completeness: https://www.eff.org/issues/coders/reverse-engineering-faq

an-unknown | 1 year ago | on: My MEGA65 is finally here

Simple: a real 3.5" floppy disk drive has moving parts and various things that age and eventually break. For example, I have an old device with a broken floppy disk drive which can't even read a real floppy anymore. With the metal floppy "emulator disk" you mentioned, the FDD itself still has to be fully functional in order to read that "emulator disk".

A floppy emulator board which reads SD/CF cards or USB sticks doesn't have that problem at all, since it's purely solid-state electronics connected directly to the FDD's electronic interface in place of the real FDD. Usually you can put thousands of floppy disk images onto such a memory card/stick and select which image is "inserted" into the emulated floppy disk drive ⇒ there is simply no need for the "emulator disk" technology you mentioned anymore.

an-unknown | 1 year ago | on: Veles: Open-source tool for binary data analysis

Since this is an old tool and it fails to deal with large files, I made a completely new version from scratch a while ago, using a completely different rendering approach, more like how you'd render a volumetric data set such as an MRI scan. It loads the file and processes it into a 256x256x256 volumetric data set, which is then rendered using shaders. As a result, the file size doesn't matter for rendering; only the loading time depends on it. Unlike the original Veles, it also doesn't need any subsampling for huge files, but you do need a powerful enough graphics card.
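One common way to build such a volume from a file (my sketch of the general idea, not the tool's actual code) is to treat each overlapping triple of consecutive bytes as a voxel coordinate and count hits; the real renderer would then upload the dense 256x256x256 volume to the GPU and draw it with shaders:

```python
from collections import Counter

def byte_triple_volume(data: bytes):
    """Sparse 256x256x256 histogram: each overlapping triple of
    consecutive bytes (b[i], b[i+1], b[i+2]) addresses one voxel.
    Structured data (text, code, tables) forms visually distinct clouds."""
    counts = Counter()
    for i in range(len(data) - 2):
        counts[(data[i], data[i + 1], data[i + 2])] += 1
    return counts
```

Because the volume has a fixed size regardless of input length, rendering cost stays constant; only this preprocessing pass scales with the file size.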

The source code is on github and unlike the original Veles, it doesn't have countless dependencies and build problems on modern systems: https://github.com/hackyourlife/veles

an-unknown | 1 year ago | on: Writing memory safe JIT compilers

> TruffleRuby even had the extremely big brain idea of running a Truffle interpreter for C for native extensions […]

TruffleC was a research project and the first attempt at running C code on Truffle that I'm aware of. It directly interpreted C source code, and while that works for small self-contained programs, you quickly run into a lot of problems as soon as you want to run larger real-world programs. You need everything, including the C library, available as pure C code, and you have to deal with the fact that a lot of C code relies on undefined or implementation-defined behavior. In addition, your C parser has to fully adhere to the C standard, and once you want to support C++ too, because a lot of code is written in C++, you have to start over from scratch. I don't know if TruffleC was ever released as open source.

The next / current attempt is Sulong, which uses LLVM to compile C/C++/Rust/… to LLVM IR ("bitcode") and then directly interprets that bitcode. It's a lot better, because you don't have to write your own complete C/C++/… parser/compiler, but bitcode still has various limitations. Essentially, as soon as the program uses handwritten assembly somewhere, or does low-level things like setjmp/longjmp, things get hairy pretty quickly. Bitcode itself is also platform dependent (think of constants/macros/… that get expanded during compilation), you still need all code/libraries as bitcode, and every language uses an ever so slightly different set of IR nodes and requires a different runtime library, so you have to support each explicitly; even then you can't make it fully memory safe, because typical programs would just break. In addition, the optimization level you choose when compiling the source program can result in very different bitcode with very different IR nodes, some of which were not supported for a long time (e.g., everything related to vectorization). Sulong can load libraries and expose them via the Truffle FFI, and it can be used for C extensions in GraalPython and TruffleRuby AFAIK. It's open source [1] and part of GraalVM, so you can play around with it.

Another research project was to directly interpret AMD64 machine code and emulate a Linux userspace environment, because that solves all the problems with inline assembly and language compatibility. Although it works, it has an entirely different set of problems: Graal/Truffle is simply not made for this type of code, and as a result the performance is significantly worse than Sulong's. You also end up re-implementing the Linux syscall interface in your interpreter, you have to deal with all the low-level memory features available on Linux, like mmap/mprotect/..., which have to behave exactly as on a real Linux system, and you can't easily export subroutines via the Truffle FFI in a way that also works with foreign language objects. It does work with various guest languages like C/C++/Rust/Go/… without modifying the interpreter, as long as the program is available as a native Linux/AMD64 executable and doesn't use any unimplemented features. This project is also available as open source [2], but its focus has somewhat shifted to using the interpreter for execution-trace-based program analysis.

Things that aren't supported by any of these projects, AFAIK, are full multithreading and multiprocessing, full IPC support, and so on. Sulong partially solves this by calling into the native C library loaded in the VM for subroutines that aren't available as bitcode and by aborting on certain unsupported calls like fork/clone, but then you obviously lose the advantage of having everything in the interpreter.

The conclusion is: however you try to interpret C/C++/… code, get ready for a world of pain and incompatibilities if you intend to run real-world programs.

[1] https://github.com/oracle/graal/tree/master/sulong

[2] https://github.com/pekd/tracer/tree/master/vmx86

an-unknown | 1 year ago | on: Microsoft Recall should make you consider Linux

> […] like any audio hardware that ain’t totally mainstream is not good.

Funnily enough, it totally depends on the hardware. If your audio hardware is supported by Core Audio on Mac, chances are it will also work on Linux without problems. If your audio hardware is older, chances are it will not work on Windows due to the lack of modern drivers but will work on Linux without problems. This also applies to a lot of other older hardware like scanners/printers/... or even built-in peripherals in older laptops.

But if it's something strange (e.g., MOTU MIDI interfaces come to mind which use a non-standard USB protocol), you'll have to hack your own kernel driver which you probably don't want to do. Of course if you are adventurous and do hack your own kernel driver, even this will work, and you'll probably even find someone else's driver code on github.

What's also interesting about hardware support on Linux is that a lot of things which require extra drivers on Windows just work out of the box on Linux, especially if it's in any way relevant for servers like various network cards or if it's one of the many "standard" USB UART chips.

an-unknown | 1 year ago | on: Microsoft Recall should make you consider Linux

Games are the easier part. There are a few examples of software which is only available for Windows and Mac or even Windows only, like professional audio software or various CAD tools. If you need one of those, you can't really avoid Windows / Mac and usually there is no proper replacement for such software that works on Linux. Of course you can have a dedicated Windows or Mac machine just for those programs, but that's still not ideal.

an-unknown | 1 year ago | on: Microsoft Recall should make you consider Linux

> […] Windows-specific apps that they use, such as games.

As if many Windows games didn't work on Linux via Proton, to the point that Valve's Steam Deck runs Linux and is "good enough" most of the time. Compatibility purely depends on the game, and more often than not incompatibilities are caused by anti-cheat mechanisms.

And about jumping to Linux: we had Windows computers in my family, originally with Windows 7, later upgraded to Windows 8.1, and once 8.1 was EOL, they were reinstalled with Linux (with KDE as the desktop environment). Since these computers were mainly used for email, web browsing, and some basic "office activities" like occasionally writing a simple document, there was exactly no issue with it. KDE is also similar enough to a Windows desktop that it wasn't hard for anyone to learn the few relevant things that are different. I'd be quite surprised if this were different for the majority of current Windows users.

an-unknown | 1 year ago | on: Show HN: Brawshot – Basic temporal denoise for videos in BRAW format

I implemented it just now. It turns out that when performing the subtraction on the raw data before applying the color-space conversion LUT, it's also necessary to add the mean of the noise image's pixel values, because otherwise some bias is lost and the entire image's brightness changes. With this in place, it seems to work. Interestingly enough, this even results in noticeable noise reduction when the noise image was not recorded in the same environment but only with the same camera settings!

When performing the subtraction after applying the LUT (that is, result = apply_lut(raw) - apply_lut(raw_noise) instead of result = apply_lut(raw - raw_noise + mean(raw_noise))), the result is quite different for reasons that I don't fully understand yet, but the noise is reduced in the same way. This will need some further investigation.
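In sketch form, the working variant described above looks like this (pure Python over flat pixel lists, with a hypothetical `lut` callable standing in for the real color-space conversion):

```python
def denoise_frame(raw, raw_noise, lut):
    """Subtract a reference noise frame from the raw sensor data before
    the LUT, re-adding the noise frame's mean so the overall brightness
    (the bias level) is preserved."""
    bias = sum(raw_noise) / len(raw_noise)
    return [lut(r - n + bias) for r, n in zip(raw, raw_noise)]
```

Since the LUT is non-linear, `lut(a) - lut(b)` is in general not the same as `lut(a - b + bias)`, which is presumably at least part of why the two variants produce different results.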

an-unknown | 1 year ago | on: Show HN: Brawshot – Basic temporal denoise for videos in BRAW format

You are indeed right. I'm not sure how to efficiently implement a median here though, because a single frame is already around 120MB. I assume I could/should compute the median over only a few frames to limit memory consumption and feed the result into the moving (arithmetic) average?
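A sketch of that idea, assuming frames arrive as equally sized flat pixel lists (names hypothetical): keep a short window of frames, take the per-pixel median over the window, and feed only the median result into the running average, so memory stays bounded by the window size rather than the clip length.

```python
from collections import deque
from statistics import median

def windowed_median(frames_iter, window=5):
    """Yield a per-pixel median frame once `window` frames are buffered;
    memory use is bounded by `window` frames, not the whole clip."""
    buf = deque(maxlen=window)  # oldest frame drops out automatically
    for frame in frames_iter:
        buf.append(frame)
        if len(buf) == window:
            yield [median(px) for px in zip(*buf)]
```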

I'm also not entirely sure whether computing the average of the non-linear "raw" sensor data, which is what I've done so far, is a good or a bad idea, or how it interacts with the endpoints of the value range.

an-unknown | 1 year ago | on: Show HN: A Golang CP/M emulator

This project looks interesting. I saw you use stty to configure the console, but there is a native "UNIX way" to do this via tcgetattr/tcsetattr; you'll have to figure out how to call these C functions elegantly from Go, or whether some Go package already wraps them. On Windows you'll have to configure the console via the win32 API calls GetConsoleMode/SetConsoleMode, and again you'll have to figure out how to do this from Go ⇒ you'll need a compile-time switch between "UNIX-like" OSes and Windows to support both. The "UNIX-like" version should work on Linux, macOS, and various other UNIX-like systems.

You could also improve the debugging experience with full execution-trace recording (recording all executed instructions, memory accesses, etc. to a trace file), which would not only give you detailed information about what exactly went wrong when something goes wrong, but also let you debug your own Z80 code directly at the assembly level.

an-unknown | 1 year ago | on: Arch Linux RISC-V

Did you run into a boot loop because of some SDMMC controller problem? If so: this is caused by hardware changes in newer RPi 4 boards combined with old ALARM rootfs images which don't support them yet; it's fixed if you update the rootfs. To do that, you have to play around with chroot + qemu-user to run a standard "pacman -Syu" on the RPi rootfs before the first boot on the real RPi 4. Afterwards it should boot properly.

That being said, if you want to use, e.g., the official RPi 7" touch display via DSI, you should also switch from the default upstream kernel to the RPi kernel, unless you like messing around with DTBs and debugging strange problems.
