
Code Golfing in Commodore BASIC

54 points| Two9A | 2 years ago |imrannazar.com

40 comments

[+] wazoox|2 years ago|reply
Back in the 80s when type-in program magazines were common, in France we had the wonderful "Hebdogiciel" with a perpetually running BASIC programming contest called "deulignes" -- which means "twolines".

"Deulignes" programs could target any platform, but must only take 2 lines of BASIC (most implementations allow only a limited line length, often 255 characters).

Some programs were really impressive; I remember one complete breakout implementation in MSX-BASIC for instance. People actually made whole (small) games in 2 lines of BASIC!

Here's an example page : https://archive.org/details/hebdogiciel-french-098/page/n15/...

[+] qsort|2 years ago|reply
In a roundabout way BASIC on those machines is like modern high-level languages.

Slow and inefficient for sure, but most of the magic is happening directly at the hardware level (sprites, memory-mapped IO, etc.), so there's a surprisingly large amount of stuff you can do with very acceptable performance.

[+] wkjagt|2 years ago|reply
Do you think you'd be able to find that 2-line breakout? I'm currently doing a breakout implementation on my Commodore 64. In assembly though, so definitely more than two lines ;-) Nevertheless, a two-line breakout in BASIC would probably give lots of pointers on how to make things more compact.
[+] wiz21c|2 years ago|reply
+1 for mentioning the best (objectively :-) ) computer magazine of all time (in French, that is).

And Carali's drawings...

[+] p0w3n3d|2 years ago|reply
that's a nice amount of code there.
[+] bump-ladel|2 years ago|reply
If you enjoy this, then you should definitely check out 8-Bit Show And Tell’s YouTube channel. The presenter, Robin, regularly does deep dives into code optimisation and fixes on Commodore 64 and other machines.

https://www.youtube.com/watch?v=jhQgHW2VI0o

[+] afro88|2 years ago|reply
BASIC defaults to the tape device if you leave off the device number in LOAD/SAVE commands. So you can save another byte or two by saving to tape instead.
[+] dep_b|2 years ago|reply
In a C64 BASIC program keywords like SAVE and PRINT can be abbreviated:

https://www.c64-wiki.com/wiki/BASIC_keyword_abbreviation

That would shave off some more precious bytes!

[+] masswerk|2 years ago|reply
This is how the program is actually stored:

  10SAVE"4",8:PRINT4

  0801  0F 08               link to next line at $080F
  0803  0A 00               line number (16-bit binary): 10
  0805  94                  token SAVE
  0806  22 34 22 2C 38 3A   ascii «"4",8:»
  080C  99                  token PRINT
  080D  34                  ascii «4»
  080E  00                  -EOL-
  080F  00 00               -EOP- (link = null)
As we may see, "SAVE" has already been compressed to a single byte (0x94), as has "PRINT" (0x99). Moreover, the line number is a 16-bit binary integer, meaning that the number of decimal digits in the listing has no effect on the in-memory format.
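To make the layout concrete, here is a minimal Python sketch that rebuilds the bytes of the dump above. It assumes the program starts at $0801 and takes only the two token values shown in the dump; `encode_line` and `encode_program` are hypothetical helper names, not anything from the ROM.

```python
import struct

# Token values taken from the dump above; the rest of the table is omitted.
TOKENS = {"SAVE": 0x94, "PRINT": 0x99}
LOAD_ADDR = 0x0801  # start of the BASIC program text, as in the dump


def encode_line(number, body, addr):
    """Return the stored bytes of one already-tokenized line placed at addr."""
    # link (2 bytes) + line number (2 bytes) + body + end-of-line zero
    next_addr = addr + 2 + 2 + len(body) + 1
    return struct.pack("<HH", next_addr, number) + body + b"\x00"


def encode_program(lines):
    """lines: list of (line_number, tokenized_body_bytes) pairs."""
    out = b""
    addr = LOAD_ADDR
    for number, body in lines:
        chunk = encode_line(number, body, addr)
        out += chunk
        addr += len(chunk)
    return out + b"\x00\x00"  # end of program: null link


# 10 SAVE"4",8:PRINT4 with the two keywords already replaced by their tokens
body = bytes([TOKENS["SAVE"]]) + b'"4",8:' + bytes([TOKENS["PRINT"]]) + b"4"
program = encode_program([(10, body)])
print(program.hex(" ").upper())
```

The printed bytes should line up with the dump, and changing the line number from 10 to any other valid value leaves the length unchanged.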

BTW, abbreviations of BASIC keywords work because of how upper-case/shifted letters are encoded in the PETSCII character set: they have their sign-bit set. (So normal letters are all smaller than 0x80, and shifted characters are >= 0x80. We may also note that codes >= 0x80 are used exclusively for tokens in the stored BASIC text, discriminating them from any other text.) Now, the tokenizing routine uses a keyword table that also relies on a set sign-bit: it marks the last character of each keyword stored in the table. The routine computes the difference between each letter of an input word and the corresponding letter of a table entry. If the difference is exactly 0x80 (the sign-bit), this means (a) we have arrived at the end of the word stored in the table, and (b) all the letters up to this point matched (otherwise, we would have already exited the loop in order to test the next keyword). We have a match! The routine then adds 0x80 to the table index of that keyword, and voilà, there is your BASIC token.

Notably, when we're dealing with single-byte values, a difference of 0x80 does not tell us which of the two bytes holds the bigger value: 8-bit subtraction wraps around, so the result is 0x80 either way. For our tokenizing routine, this means it only "knows" that one character has the sign-bit set while the other has not (but is otherwise the same); it does not "know" which of the two that is. Therefore, setting the sign-bit on an input character fools the routine into assuming it has already gone over the entire keyword and hit the sign-bit set on the last character of the table entry. We achieve this by shifting the character in the input text. And, voilà, there is your abbreviated BASIC keyword.

(We can also see why the length of the input keyword doesn't contribute to the storage format: it is compressed to a token, which is 0x80 + the table index of the keyword, anyway. We may also see why "iN" matches "input#" but not "input": the longer version has to come first in the table in order to match at all, and so it is also the first to be recognized by the erroneous match.)
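The matching loop described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a port of the ROM routine: the `KEYWORDS` list is what I believe to be the first 26 entries of the BASIC V2 keyword table (it does agree with the 0x94 and 0x99 tokens in the dump), unshifted letters are modeled as plain ASCII upper-case bytes, and `table_entry`, `shifted` and `match_keyword` are hypothetical names. Strings, REM, DATA and the "?" shorthand are ignored.

```python
# Assumed excerpt: the first 26 Commodore BASIC V2 keywords in table order.
KEYWORDS = [
    "END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM", "READ", "LET",
    "GOTO", "RUN", "IF", "RESTORE", "GOSUB", "RETURN", "REM", "STOP", "ON",
    "WAIT", "LOAD", "SAVE", "VERIFY", "DEF", "POKE", "PRINT#", "PRINT",
]


def table_entry(word):
    """Keyword as stored in the table: the last character has bit 7 set."""
    raw = word.encode("ascii")
    return raw[:-1] + bytes([raw[-1] | 0x80])


def shifted(letter):
    """A shifted letter as typed: the same code with bit 7 set."""
    return bytes([ord(letter) | 0x80])


def match_keyword(text):
    """Return (token, consumed_length) for a keyword at the start of text, else None."""
    for index, word in enumerate(KEYWORDS):
        for pos, stored in enumerate(table_entry(word)):
            if pos >= len(text):
                break  # input ran out: no match for this keyword
            diff = (text[pos] - stored) & 0xFF  # 8-bit subtraction, wraps around
            if diff == 0:
                continue  # same character, keep comparing
            if diff == 0x80:
                # Only bit 7 differs: treated as the end of the keyword.
                return 0x80 + index, pos + 1
            break  # genuine mismatch: try the next keyword
    return None


print(match_keyword(b"SAVE"))              # full keyword
print(match_keyword(b"S" + shifted("A")))  # abbreviation sA
print(match_keyword(b"I" + shifted("N")))  # abbreviation iN
```

Both the full keyword and the abbreviation end on a difference of 0x80, which is why the sketch cannot tell them apart, and why "iN" lands on the INPUT# entry that sits ahead of INPUT in the table.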

[+] Hackbraten|2 years ago|reply
Doesn’t BASIC tokenize those abbreviations to the exact same in-memory bytes as the full keywords?
[+] lifthrasiir|2 years ago|reply
I don't know anything about the C64 or C64 BASIC, but would it be possible to intentionally write a shorter binary which breaks the interpreter and does what we want instead? For example, jump directly into the middle of a kernel ROM routine (akin to ROP in modern days), or use a bad address in the "next line" offset, etc.
[+] wkjagt|2 years ago|reply
In Commodore BASIC there's already SYS, which lets you jump to an arbitrary address anywhere in the 64k address space, including ROM. You can even include raw bytes in a BASIC program and have the CPU execute them as machine code.
[+] einr|2 years ago|reply
Using line number 1 instead of 10 seems like an easy 1 byte save.
[+] pgeorgi|2 years ago|reply
They're stored as a 16-bit little-endian word, so unless it's used for GOTO/GOSUB (whose targets are stored in PETSCII) the line number makes no difference.
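A quick Python sketch of that difference, assuming 0x89 is the GOTO token and that PETSCII digits share their codes with ASCII; `stored_line_number` and `stored_goto` are hypothetical helper names used only for illustration.

```python
import struct

GOTO = 0x89  # assumed token value for GOTO in Commodore BASIC V2


def stored_line_number(n):
    """The line number in a line header is always two bytes, little-endian."""
    return struct.pack("<H", n)


def stored_goto(target):
    """A GOTO target is kept as its decimal digits, one byte per digit."""
    return bytes([GOTO]) + str(target).encode("ascii")


# Columns: line number, header bytes used, bytes used by GOTO <line number>
for n in (1, 10, 1000):
    print(n, len(stored_line_number(n)), len(stored_goto(n)))
```

The header size stays at two bytes for every line number, while the GOTO statement grows with each extra digit of its target.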
[+] p0w3n3d|2 years ago|reply
Remember when people didn't have FDDs but cassette drives instead, because FDDs were too expensive? Pepperidge Farm remembers.