Tidy Data

A huge amount of effort is spent cleaning data to get it ready for data analysis,
but there has been little research on how to make data cleaning as easy and effective
as possible. This paper tackles a small, but important, subset of data cleaning: data
“tidying”.

— Wickham

Tidy Data is a must-read paper.

color-theme-retro-green on Emacs 24

I wonder if anyone is using color-theme anymore; it seems like a waste to just bail on all of the great themes in there. Today I wanted to at least get color-theme-retro-green working, and here is what it took against the 6.5.5 release from Marmalade:

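(defun gcr/plist-to-alist (ls)
  "Convert the plist LS to an alist. Primarily for old color-theme themes."
  (let ((result nil))
    (while ls
      ;; `push' works on the local variable; `add-to-list' is meant for
      ;; global list variables and fails under lexical binding.
      (push (cons (car ls) (cadr ls)) result)
      (setq ls (cddr ls)))
    ;; Restore the original plist order.
    (nreverse result)))
(defalias 'plist-to-alist 'gcr/plist-to-alist)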

Make a change in color-theme.el’s color-theme-retro-green to initialize face and faces with an empty list:

  ;; Build a list of faces without parameters
  (let ((old-faces (face-list))
        (faces '())
        (face '())
        (foreground (or color "green")))

I didn’t find a GitHub project to submit a patch to, so I emailed the owner.

One Emacs SML Workflow

Being partial to the full-REPL-reboot style of development (à la DrRacket) for most situations, I wanted the same thing in Emacs with sml-mode. The benefit is that you know all of your files are saved and that your environment is in a fresh, known state. I came up with this:

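(defun gcr/sml-eval-buffer ()
  "Intelligently evaluate an SML buffer in a fresh REPL."
  (interactive)
  ;; Save everything first so the REPL loads what is on disk.
  (gcr/save-all-file-buffers)
  ;; Stop any running SML process.
  (let ((sml-process (get-process "sml")))
    (when sml-process
      (quit-process sml-process)))
  ;; Give the process a moment to exit before killing its buffer.
  (sleep-for 0.25)
  (let ((sml-buffer (get-buffer "*sml*")))
    (when sml-buffer
      (kill-buffer sml-buffer)))
  ;; Load the current file into the REPL and switch to it.
  (sml-prog-proc-load-file buffer-file-name t))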

I was then delighted (though not surprised) to find a nearly identical approach by wenjun.yan:

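(defun isml ()
  "If an SML REPL exists, restart it; otherwise create a new one."
  (interactive)
  (when (get-buffer "*sml*")
    (with-current-buffer "*sml*"
      ;; `process-live-p' expects a process object, so look it up from
      ;; the REPL buffer rather than passing the name "sml".
      (let ((proc (get-buffer-process (current-buffer))))
        (when (and proc (process-live-p proc))
          (comint-send-eof))))
    (sleep-for 0.2))
  (sml-run "sml" ""))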

My urge to attain Emacs Comint mastery only grows.

Deleting trailing whitespace for auto savers

real-auto-save is a great package if you like that sort of thing. For example, I like every file to always be saved without me worrying about doing it myself, so I stick with the default of saving every 10 seconds. A really nice function to call on write-file-hooks is delete-trailing-whitespace, but with 10-second saves this means spaces get eaten while you are in the middle of typing, which is clearly unacceptable!

Here is an attempt at a tweaked cleanup function that cleans up every line in the file except the line your cursor is on:

(defun gcr/delete-trailing-whitespace ()
  "Apply `delete-trailing-whitespace' to everything but the current line."
  (interactive)
  (let ((current-line-start (line-beginning-position))
        (current-line-end (line-end-position)))
    ;; Clean the text after the current line first: deleting whitespace
    ;; earlier in the buffer would shift the positions computed above.
    (delete-trailing-whitespace current-line-end (point-max))
    (delete-trailing-whitespace (point-min) current-line-start)))
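
To wire this in, something like the following might work. It is only a sketch under two assumptions: that gcr/delete-trailing-whitespace is already defined as above, and that real-auto-save goes through the normal save machinery so that before-save-hook runs on each automatic save. It also swaps in before-save-hook for write-file-hooks, since the latter has long been marked obsolete.

```elisp
;; Assumption: real-auto-save saves buffers via the regular save path,
;; so `before-save-hook' fires on every automatic save.
;; `before-save-hook' is used here in place of the obsolete
;; `write-file-hooks' mentioned in the text.
(add-hook 'before-save-hook #'gcr/delete-trailing-whitespace)
```

Since hooks ignore their functions' return values here, nothing else needs to change in the function itself.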

Cask for the truly impatient

Thanks to some kind Emacsers I’m now in the modern age using Cask, and what ease it brings to using Emacs! It is truly a joy; anyone avoiding Emacs for fear of difficulty pulling in packages can let go of their hesitation. It is as easy as writing one config file, installing the packages, and adding a couple of lines to your Emacs init script. Here are the basic steps:

  • Clone the cask repo.
  • Add the bin dir to your path.
  • Create a file named Cask, add it to your VCS, and create a link to it from your .emacs.d directory.
  • Add a repo and packages to the file.
  • From your .emacs.d directory, run ‘cask’.
  • Add the cask load and init to your init file.
  • Start Emacs.
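
For the two file-editing steps above, a minimal sketch might look like this. The package names are only examples taken from this page, and the ~/.cask path assumes that is where the repo was cloned; adjust both to taste. This reflects the classic Cask setup, and newer Cask releases may prefer a different initialization.

```elisp
;;; --- Contents of the file named Cask ---
;; Package archives to pull from.
(source gnu)
(source melpa)

;; Packages to install (example choices, not a required set).
(depends-on "sml-mode")
(depends-on "real-auto-save")

;;; --- Lines for the Emacs init file ---
;; Assumption: the cask repo was cloned to ~/.cask.
(require 'cask "~/.cask/cask.el")
(cask-initialize)
```

The two parts live in separate files; they are shown together here only to keep the sketch in one place.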

Excellent work by that team.