top | item 38746696

(no title)

sdkgames | 2 years ago

>The program is smaller than even the 'compressed' form of its output,

>and thus represents a new departure in text compression standards.

7z and gzip disagree with this statement.

discuss

order

o11c|2 years ago

It's only "compressed" if it was made by compress(1) in France. Otherwise it's just sparkling entropy coding.

For reference:

    1490 xmas-with-leading-comment.c
     913 xmas-without-leading-comment.c
    2357 xmas.out
    1297 xmas.out.9.Z
    1038 xmas.out.10.Z # actually better than with more bits!
    1048 xmas.out.11.Z # compression with 11..16 bits have the same size
     319 xmas.out.1.gz # compression levels 1..2 has same size
     317 xmas.out.3.gz
     307 xmas.out.4.gz # compression levels 4..9 have same size
Note that despite looking hard I haven't found a version of `compress` that supports `-H`, which is referenced and decompressable by gzip. I'm not sure how common it was in the wild.

arp242|2 years ago

  % gcc xmas.c
  xmas.c:2:1: warning: return type defaults to 'int' [-Wimplicit-int]
      2 | main(t,_,a)
        | ^~~~
  xmas.c: In function 'main':
  xmas.c:2:1: warning: type of 't' defaults to 'int' [-Wimplicit-int]
  xmas.c:2:1: warning: type of '_' defaults to 'int' [-Wimplicit-int]


  % wc -c <xmas.c
  913
  % ./a.out | wc -c
  2359
  % ./a.out | compress | wc -c
  1048

jrockway|2 years ago

zstd -19 compresses the text of the song to 309 bytes.

To be a fair comparison, though, you'd have to write zstd -d in 604 bytes. I suppose to be REALLY fair, though, you have to count the bytes of code in the compiler itself. A convenient enough implementation of compression could index into the C compiler binary to find the bytes it needs. (For example, my GCC contains "first", "second", and "third" in the binary, which a sufficiently clever implementation could make use of. "Exit on the first error occurred.", "Append a second underscore if the name already contains an underscore.", "Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not.", etc. I didn't check but I doubt turtle doves or maids-a-milking come up that often in the description of warning flags.)

theideaofcoffee|2 years ago

This particular entry to the IOCCC was for 1988, those two algorithms didn’t come about for a few more years after that, gzip in the early nineties and 7z later that decade. The note is probably correct when comparing against the state of the art at the time.

acqq|2 years ago

You are right. Here is the source of "compress" that existed at that time. It compresses the produced song to 1048 bytes, not less.

https://www.nic.funet.fi/index/minix/compress.c

The program that produces the song, without the introductory comment, is 913 bytes, as presented in the article. Removing whitespaces it uses just 800 bytes and produces the song which is 2359 chars here. The whole C is:

    main(t,_,a)char*a;{return!0<t?t<3?main(-79,-13,a+main(-87,1-_,
    main(-86,0,a+1)+a)):1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==
    2?_<13?main(2,_+1,"%s%d%d\n"):9:16:t<0?t<-72?main(_,t,
    "@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l,+,/n{n+,/+#n+,/#;\
    #q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/+k#;\
    q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# ){nl]!/n{n#'; \
    r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#\
    \
    n'wk nw' iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c ;;\
    {nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;\
    #'rdq#w! nr'/ ') }+}{rl#'{n' ')# }'+}##(!!/")
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1):
    0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
    "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);}
It compiles and links even without #include.

natch|2 years ago

Since the word 'compressed' is in quotes they are probably suggesting that they mean when processed by the 'compress' command as available on UNIX at the time, as opposed to some other compression available at the time.

ecesena|2 years ago

I just checked, gzip is from 1992, 7z from 1999.

edub|2 years ago

The program was written in 1988. I ran the text through LZSS which was published in 1982, so was available before 1988. I used a 1989 public domain version by Haruhiko Okumura, which is after 1988, but I don't believe it is optimized to improve upon the compression level of the 1982 algorithm.

It took it from 2357 bytes to 534 bytes, which is smaller than the Xmas.c program which I counted as 917 bytes, but another poster counted 913 bytes.

adrianmonk|2 years ago

"New departure" simply doesn't mean "record-breaking compression ratio".