2014-09-05

Utterly Pointless Update

A quick update on my previous post, Utterly Pointless. I showed the same problem to my brother, who at first reached the same conclusion I did, but later asked why the first and third arguments have to be of the same type. His argument was that, since the type parameters are polymorphic, they could take on different types in different contexts. If we deconstruct the definition of f and uncurry it, we arrive at the equality

f(x,y,z) = const(y,f(z))

where const is a function that returns its first argument and ignores its second argument. Since we can call f with a first argument of type elephant, or a first argument of type giraffe, why do x and z have to both be elephants or both be giraffes?

While I am by no means well-acquainted with Haskell's type inference system, my guess is that when Haskell is trying to construct a type for f from scratch, it assumes that f must have a single fixed type, and lets the polymorphism enter only after it has decided what relationships must hold between the various types in f's signature.

After my father briefly lectured me about category theory, and I was still unable to find a reason why the first and third arguments should have to be the same type, I decided to explicitly write out my brother's more general type for f, namely

f :: a -> b -> (c -> b)

Surprisingly, Haskell was completely okay with this signature for the same definition, and let me pass first and third arguments of different types, whereas it complained when I tried to do so without the explicit signature. My theory is this: Haskell has no way of knowing that it may assume the more general type, so it starts out the same way I did in my previous post. The more general signature is perfectly valid, and can be verified to be valid, but it cannot be constructed from scratch. If you give a mathematician an attempted proof of Goldbach's conjecture, they can probably tell you whether it is a valid and complete proof, but no mathematician has yet managed to construct one. I know that this analogy gets the Curry-Howard isomorphism completely backwards, but I hope it at least gets my point across.
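To make the comparison concrete, here is a small sketch with both versions side by side (the name g for the annotated version is mine). During inference the recursive occurrence is used at the same type as the definition itself, which is what ties the first and third arguments together; with the signature written out, GHC only has to check it, and the recursive use may be instantiated at a different type.

```haskell
module Main where

-- Without a signature, GHC infers f :: a -> b -> (a -> b):
-- the recursive occurrence of f is taken at the same type as the
-- definition, so the first and third argument types coincide.
f :: a -> b -> (a -> b)
f _ = (. f) . const

-- The same definition with the more general signature made explicit.
-- GHC merely checks it, and the recursive use of g is free to be
-- instantiated at different types.
g :: a -> b -> (c -> b)
g _ = (. g) . const

main :: IO ()
main = do
  print (f "elephant" (1 :: Int) "elephant") -- first and third: same type
  print (g "elephant" (2 :: Int) ())         -- first and third: different types
```

Both definitions return their second argument and ignore the other two.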

If anyone has insight into what is happening here inside the type inference system and the type checking system, or about any aspect of this seeming paradox, please leave a comment. As usual, the comments section is also open to comments/criticisms/heckling.

EOF

2014-09-04

Utterly Pointless

While reading a Haskell book that made mention of Haskell code golf, I got a bit distracted and did a quick search for "haskell golf." Besides several results for the "Krueger-Haskell Golf Course" in Wisconsin, I found an exercise to write the shortest function of type a -> b -> (a -> b). For those of you not familiar with curried functions (or the variety used in Haskell in particular), this means that if a function f has that type (written as f :: a -> b -> (a -> b)), then applying f to a value of type a produces another function, which takes a value of type b and produces a third function, which simply takes a value of type a and returns a value of type b. In other words, if we wanted type a to be a String and type b to be an Int, then we would have something like

f("Fish!") = g
g(2) = h
h("No fish...") = 42

if that doesn't make it even more confusing. Since Haskell is functional, lambdas are important; they are written as backslashes in Haskell because keyboards with lambda keys are harder and harder to find, the Greek keyboard-fabrication industry not having done very well as of late. I will try to explain everything as I go along, but do your own research if you would like to understand anything I am saying before, while, or after I say it.
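To make the fish example concrete, here is one hand-written (and entirely hypothetical) inhabitant of that type; this particular one simply hands back the Int it was given, and the string values are arbitrary:

```haskell
-- One possible function of type String -> Int -> (String -> Int):
-- ignore both strings, keep the Int.
f :: String -> Int -> (String -> Int)
f = \_ -> \n -> \_ -> n

g :: Int -> (String -> Int)
g = f "Fish!"

h :: String -> Int
h = g 2

-- h "No fish..." evaluates to 2
```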

Of the entries posted thus far, this was the shortest:

f _ = (. f) . const

with the whitespace added in to make it more readable.

I had learned enough Haskell to understand every individual part of this (except for const, which wasn't hard to look up), and yet the whole was far from clear to me. (Feel free to treat this as an exercise if you wish, in which case you should stop reading until you work out why this function has the type specified. Otherwise, read on.)

We will start, naturally, at the end of the expression. const is a function that takes any two values and always produces the first. In curried terms, it takes a value and produces a function that returns that value for any input it is given. In lambda notation this would be expressed as

const = λ x → λ y → x

or, written in a way that is more consistent with Haskell,

const = \x -> \_ -> x

That underscore is a catch-all that avoids naming a variable that will never actually be used, similar to a '.' in regular expressions. Speaking of which, '.' is Haskell's syntax for function composition, meaning that (f . g) x == f (g x).
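Both behaviors are easy to check; a couple of throwaway examples:

```haskell
main :: IO ()
main = do
  print (const 7 "ignored")             -- const keeps 7, discards the string
  putStrLn ((show . (+ 1)) (41 :: Int)) -- (+ 1) runs first, then show: prints 42
```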

Now that we have understood that much, we can turn back to our Franken-function. By the definition of const, we know that it is of type a -> b -> a, since its return value has the same type as its first argument, and we don't actually care what its second argument's type is. Function composition (.) has an even more complicated type, namely (b -> c) -> (a -> b) -> (a -> c). This merely says that it takes two functions, where the first function's input type is the second's output type, and glues them together to produce a function that applies one, and then applies the other to the result of the first. Due to the way it is defined, both in Haskell and in mathematics, function composition applies the innermost, or rightmost, function first, and the outermost, or leftmost, function last. In other words, whatever is to the right of a . is applied first, and whatever is on its immediate left is applied second.
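The same type can be seen by defining composition by hand (compose is my own name for it, to avoid clashing with the built-in (.)):

```haskell
-- Composition written out explicitly, with its full type:
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

-- The rightmost function is applied first:
-- compose show (+ 1) 41  ==  show 42  ==  "42"
```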

Let's turn back to our problem. In order to make it possible to determine the type of our function f, we first need to name its type so we can perform some type inference. Since all functions ultimately take one value (which might be an integer, a string, a list, an elephant, or even another function) and produce another value (which might be an integer, a string, a list, a giraffe, or even another function), we can say that our function f takes a value of type x and returns a value of type y. We would write this as

f :: x -> y

Right now, we don't know or care what x and y actually represent; we are just using them to refer to the argument and return types of f. We could just as easily have called them a and b, or elephant and giraffe, but this is certainly more concise than saying "the argument type of f" and "the return type of f". Because parentheses have the highest precedence, we will examine the term inside the parentheses first. (. f) might look like someone forgot a function to put to the left of the composition operator (let's just call it 'dot' for brevity), but the infix expression foo . bar is really just shorthand for (.) foo bar. Because Haskell tends to curry everything, the partially applied expression (. f) is really just another function, waiting for a function on its left side so it can be a complete expression. In other words, back to our lambda expressions, (. f) == \g -> g . f. If we look at the type of this expression, we can see that it must be (. f) :: (y -> z) -> (x -> z), since the composition is already half-done and f is of type x -> y
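The section (. f) can be tried out on a concrete, stand-in f (the names here are mine, purely for illustration):

```haskell
double :: Int -> Int
double = (* 2)

-- (. double) == \g -> g . double, so supplying show on the left
-- completes the composition:
showDoubled :: Int -> String
showDoubled = (. double) show   -- same as show . double

-- showDoubled 21 == show (double 21) == "42"
```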

Now we can finally look outside the parentheses and see what is going on. If we refer to the expression (. f) as g, we can save a lot of trouble. Our RHS now looks like g . const. Remembering the type expression for const, we can work out what the expression as a whole must be. From the type of g, we can eventually see that, in this case, the exact type of the composition operator ('dot' sounds a bit silly) must be

(.) :: ((y -> z) -> (x -> z)) -> (a -> b -> a) -> ?

Hmm, we seem to have reached an obstacle. Nothing to worry about, though, as long as we can match up the type expression for (.) with the weird expression we have here.

(.) :: ((y -> z) -> (x -> z)) -> (a -> (y -> z)) -> (a -> (x -> z))

If we look at these two expressions, it isn't too hard to figure out that y == b and z == a. So our overall RHS has a type of a -> (x -> a). Now, as to the input type of f: the underscore after f on the LHS tells us that we discard the first argument, so nothing new constrains its type. But we already declared f :: x -> y at the start, so the first argument's type is the one we named x. We now know that

f :: x -> a -> (x -> a)

but the type variables were never really special anyway, so we can just rename them as we please, so long as we are consistent about the renaming. If we replace every 'x' with an 'a', and every 'a' with a 'b' (in parallel, not in sequence, of course), we will get the definitive type of our Franken-function:

f :: a -> b -> (a -> b)

which was what we wanted in the first place.
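As a sanity check, the golfed definition together with its derived signature can be exercised directly; it ignores its first and third arguments and returns the second:

```haskell
f :: a -> b -> (a -> b)
f _ = (. f) . const

main :: IO ()
main = print (f "Fish!" (42 :: Int) "No fish...")  -- prints 42
```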

Thanks for letting me take you on a magical journey through the Krueger-Haskell Golf Course. Please return any borrowed clubs to the front desk.

(If you're wondering about the title, point-less or point-free is a style of writing Haskell that omits all named references to arguments, where such named references are known as 'points.')

EOF

Counting to 100

Today, while reading Chapter 6 of The Unix Programming Environment by Kernighan and Pike (Programming with Standard IO), I wanted to test a program that took a long-ish file as input. Naturally, I needed to have a long-ish file, and I don't generally keep them lying around, as I am near-obsessive about the organization of my files and folders and trash anything I won't need again. I opened up a new document in vim and decided that I wanted 100 lines of text to properly test the program. Okay then.

Like a true programmer, I wanted to make sure that no lines were being dropped, so I intended to number them. Also like a true programmer, I was too lazy to type out 100 line numbers by hand. Normally in such a scenario I would try to learn [far more than] enough perl or awk or some other basic scripting language to perform the task, but I had been reading about IO in Haskell only the day before.

Haskell, for those who are not familiar, is a functional programming language that can be described as "like lisp, but without the parentheses," or "like Haskell, but with a less recursive comparison." Being a functional programming language, Haskell is great for all sorts of things, with many mathematical applications. Granted, without optimization, Haskell can be much slower than other, less functional languages, but it can be useful sometimes.

As the task I faced involved counting to 100, the mathematical nature of Haskell lent itself to this problem. I wrote out the following line of code

main = sequence $ map print [1..100]

and entered

:.!runhaskell

into vim. This produced the numbers from 1 to 100, all on their own line, with some extra output at the end:

[(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),()]

but that was quickly dealt with. While I mostly view this application of programming knowledge as an achievement, I also feel somewhat of a false programmer, because my laziness actually expedited the result instead of impeding it by leading me on a wild goose-chase of discovery and mosquito-nuking.
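For what it's worth, the trailing bracketed blob is the list of unit values, one () per print, that sequence collects into its result; mapM_ (or sequence_) runs the same actions but throws the results away:

```haskell
-- Same 100 lines of output, but main :: IO (), so there is no
-- leftover [(),(),...] result to worry about.
main :: IO ()
main = mapM_ print [1 .. 100 :: Int]
```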

This is my first post on my first real blog, so I welcome any comments/criticisms/heckling either related or wildly unrelated to this post or blog.

EOF