Tidy Data

A huge amount of effort is spent cleaning data to get it ready for data analysis,
but there has been little research on how to make data cleaning as easy and effective
as possible. This paper tackles a small, but important, subset of data cleaning: data
“tidying”.

— Wickham
Tidy Data is a must-read paper.

Wrapping up the 2009-2010 School Year

This past May, I completed the Simulation and the Parallel and Distributed Systems classes that I was attending. Taking two classes while working full time was challenging, but the pure fun of it all more than made up for the challenge! I will have fond memories of that semester for a long time.
I can’t wait to get started with Applied Mathematical Analysis next week.

Matthias Felleisen and the PLT Team win the ACM Karl Karlstrom Award

Presented annually to an outstanding educator who is: appointed to a recognized educational baccalaureate institution; recognized for advancing new teaching methodologies, or effecting new curriculum development or expansion in Computer Science and Engineering; or making a significant contribution to the educational mission of the ACM. Those who have been teaching for ten years or less will be given special consideration. A prize of $5,000 is supplied by the Prentice-Hall Publishing Company.

Via Matthias via ACM via Shriram.

Lambda Calculus Modeled with PLT Redex

Here is material on understanding Lambda Calculus using PLT Redex:

In this zip directory you can find file lc-with-Redex.doc which is a short intro in Lambda Calculus and contains info on how to use the material that goes with the essay. I have included a very condensed note on a system even without lambda, which is called ‘combinatory logic’. My essay is addressed to Schemers and uses PLT Scheme, particularly PLT’s redex library. It has become somewhat more verbose than I had in mind originally. If you have any ideas how to condense and simplify further without losing too much content and accuracy, I welcome your suggestions. Comments about inaccuracies and other faults are welcome too, of course. My essay does not contain any new views. It is a compilation of views taken from books that have inspired me. They are mentioned in the essay.
If the format of the above link does not suit you, give me a ping and I’ll try to send you the material in a format that suits you.
With thanks to Douglas R. Hofstadter, Daniel P Friedman, Matthias Felleisen, Robby Findler, Casey Klein and others, in all feasible orders.

— Jos
I have not read it yet; it is on my long list.
(via plt)

Stanford Programming Paradigms Course Videos

Programming Paradigms (CS107) introduces several programming languages, including C, Assembly, C++, Concurrent Programming, Scheme, and Python. The class aims to teach students how to write code for each of these individual languages and to understand the programming paradigms behind these languages.

The videos are available here.
(via reddit)

Fractal Imaging

What is fractal imaging? Well, it’s more than just the algorithmic generation of ferns (like the generated image above) from non-linear equation systems. It’s a way of looking at ordinary (bitmap) images of all kinds. The hypothesis is that any given image (of any kind) is the end-result of iterating on some particular (unknown) system of non-linear equations, and that if one only knew what those equations are, one could regenerate the image algorithmically (from a set of equations) on demand. The implications are far-reaching. This means:
1. Instead of storing a bitmap of the image, you can just store the equations from which it can be generated. (This is often a 100-to-1 storage reduction.)
2. The image is now scale-free. That is, you can generate it at any scale — enlarge it as much as you wish — without losing fidelity. (Imagine being able to blow up an image onscreen without it becoming all blocky and pixelated.)

Kas Thomas
Here is the book, Fractal Imaging, referenced by the article.
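
To make the "store the equations, not the bitmap" idea concrete, here is a minimal Python sketch of the fern case mentioned in the quote. It is an illustration under stated assumptions, not the method from the article or the book: it uses the classic Barnsley fern coefficients (whose maps are affine, not the general non-linear systems the quote refers to), and the names `FERN_MAPS`, `fern_points`, and `rasterize` are made up for this example. It only covers the easy direction, generating an image from known equations; finding the equations for an arbitrary bitmap is the hard part and is not attempted here.

```python
import random

# Classic Barnsley fern: four affine maps, each given as
# (a, b, c, d, e, f, probability), applied as
#   (x, y) -> (a*x + b*y + e, c*x + d*y + f)
FERN_MAPS = [
    (0.00, 0.00, 0.00, 0.16, 0.00, 0.00, 0.01),
    (0.85, 0.04, -0.04, 0.85, 0.00, 1.60, 0.85),
    (0.20, -0.26, 0.23, 0.22, 0.00, 1.60, 0.07),
    (-0.15, 0.28, 0.26, 0.24, 0.00, 0.44, 0.07),
]

# Approximate bounding box of the fern in its own coordinates.
X_MIN, X_MAX = -2.1820, 2.6558
Y_MIN, Y_MAX = 0.0, 9.9983


def fern_points(n, seed=0):
    """Generate n points of the fern by random iteration (the "chaos game")."""
    rng = random.Random(seed)
    weights = [m[6] for m in FERN_MAPS]
    x, y = 0.0, 0.0
    points = []
    for _ in range(n):
        a, b, c, d, e, f, _prob = rng.choices(FERN_MAPS, weights=weights)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points


def rasterize(points, width, height):
    """Plot points onto a width x height grid of 0/1 cells (row 0 is the top)."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        col = int((x - X_MIN) / (X_MAX - X_MIN) * width)
        row = int((Y_MAX - y) / (Y_MAX - Y_MIN) * height)
        # Clamp so points on the edge of the bounding box stay inside the grid.
        col = max(0, min(width - 1, col))
        row = max(0, min(height - 1, row))
        grid[row][col] = 1
    return grid


if __name__ == "__main__":
    # The same 28 numbers drive both a small and a large rendering.
    small = rasterize(fern_points(20_000), 40, 80)
    for row in small[::4]:
        print("".join("#" if cell else " " for cell in row))
```

The whole "image" here is the table of 28 numbers in `FERN_MAPS`. Calling `rasterize` with a larger `width` and `height` (and more points) gives a more detailed picture from the same table, which is the scale-free property the quote describes.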