(no title)
eadmund | 4 months ago
Is it[1]? My version below accumulates alphabetical characters until it encounters a non-alphabetical one, then increments the count for the accumulated word and resets the accumulator.
(let (c accumulator (counts (make-hash-table :test #'equal)))
(handler-case
(loop
(setq c (read-char))
(if (find c "ABCDEFGHIJKLMNOPQRSTUVWXYZ" :test #'char-equal)
(push (char-downcase c) accumulator)
(when accumulator
(incf (gethash (coerce (reverse accumulator) 'string) counts 0))
(setq accumulator nil))))
(end-of-file ()
(when accumulator
(incf (gethash (coerce (reverse accumulator) 'string) counts 0))
(setq accumulator nil))
(maphash #'(lambda (word count)
(push (list count word) accumulator))
counts)
(format t "~{~&~{~a ~a~}~%~}" (reverse (last (sort accumulator #'< :key #'car)
(let ((n (second sb-ext:*posix-argv*)))
(if n (parse-integer n) 100))))))))
It’s not exactly pretty or idiomatic, but its 19 lines appear to get the job done.1: Well, technically it is, because there is SPLIT-SEQUENCE: https://github.com/sharplispers/split-sequence
kragen|4 months ago
It does look a lot like what I was thinking would be necessary. About 9 of the 19 lines are concerned with splitting the input into words. Also, I think you have omitted the secondary key sort (alphabetical ascending), although that's only about one more line of code, something like
Because the lines of code are longer, it's about 3× as much code as the verbose Perl version.In SBCL on my phone it's consistently slower than Perl on my test file (the King James Bible), but only slightly: 2.11 seconds to Perl's 2.05–2.07. It's pretty surprising that they are so close.
eadmund|4 months ago
Were I trying to optimise this, I would test to see if a hash table of alphabetical characters is better, or just checking (or (and (char>= c #\A) (char<= c #\Z)) (and (char>= c #\a) (char<= c #\z))). The accumulator would probably be better as an adjustable array with a fill pointer allocated once, filled with VECTOR-PUSH-EXTEND and reset each time. It might be better to use DO, initializing C and declaring its type.
Also worth giving it a shot with (optimize (speed 3) (safety 0)) just to see if it makes a difference.
Yes, definitely more verbose. Perl is good at this sort of task!