Trying out the Go programming language

I’ve been playing around with Go, and so far I like it a lot. Go’s greatest innovation is that a single tool, the go command that ships with the compiler, is the build tool, the package manager, the style checker, the linter, and the runner.

Here are a few Go libraries/programs I made:

Go’s language tour, documentation, and standard library are all excellent. I was able to be productive in Go very soon after finishing the first half of the language tour.

The quality of third-party libraries is quite high as well; I am optimistic that this is the result of Go’s careful design, and not just because it’s a fairly new language. In particular, I found go-imap to be a spectacularly easy IMAP library to use and extend with new functionality (it is far better than Python’s imaplib and JavaMail).

One thing about Go that’s still unclear to me is which Web framework is the best (and most Go-idiomatic) for writing REST APIs. I haven’t yet had a chance to try out many of them, and the Libraries Written in Go page is purely descriptive, not normative.
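As a baseline for comparing frameworks, here is a minimal sketch of a JSON endpoint that uses only the standard library's net/http and encoding/json. The Event type, the /events route, the sample data, and the newMux and fetch helpers are all made up for illustration; this is not a recommendation of any particular framework, just a picture of what the framework-free version looks like.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Event is a hypothetical resource type used only for this example.
type Event struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// newMux returns a handler that serves a fixed list of events at /events.
func newMux() http.Handler {
	events := []Event{{Name: "launch", Tags: []string{"go"}}}

	mux := http.NewServeMux()
	mux.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
		// Only GET is supported in this sketch; everything else gets a 405.
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(events); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
	return mux
}

// fetch sends one in-memory request to h and returns the status code and body.
func fetch(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := fetch(newMux(), http.MethodGet, "/events")
	fmt.Println(code, body)
	// To serve over a real socket instead: http.ListenAndServe(":8080", newMux())
}
```

The handler is exercised in memory through net/http/httptest, so the example runs without opening a port. Routing by path parameter, content negotiation, and middleware are the parts a framework would normally add on top of this.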

Blend Labs Tech Talk at Stanford on HBase + Scala + Hadoop

Eugene Marinelli and I just gave a Stanford ACM tech talk about our use of HBase, Scala, and Hadoop at Blend Labs. The goal of the talk was to give back some of the code and patterns we’ve developed for working with high-level, modeled objects in Scala, storing (serializing and deserializing) the objects in HBase, and performing Hadoop MapReduce analysis on large datasets. We’ve posted the slides and code online.

Discovered HPaste, a great Scala DSL for HBase

I recently came across HPaste, a Scala DSL for HBase. Once you’ve declared a schema like the one below, you can run HBase operations and MapReduce jobs on your data much more easily than with the standard HBase Java library.

object EventSchema extends Schema {
 
  implicit val conf = new org.apache.hadoop.hbase.HBaseConfiguration
 
  class EventTable extends HbaseTable[EventTable, String, EventRow](
    tableName = "event",
    rowKeyClass = classOf[String]
  ) {
    def rowBuilder(result: DeserializedResult) = new EventRow(this, result)
 
    val msg       = family[String, String, Any]("msg")
    val msgSender = column(msg, "sender", classOf[String])
    val msgBody   = column(msg, "body", classOf[String])
    val msgDate   = column(msg, "date", classOf[String])
 
    val info      = family[String, String, Any]("info")
    val name      = column(info, "name", classOf[String])
    val startDate = column(info, "startDate", classOf[String])
    val endDate   = column(info, "endDate", classOf[String])
    val body      = column(info, "body", classOf[String])
    val source    = column(info, "source", classOf[String])
    val tags      = column(info, "tags", classOf[String])
  }
 
  class EventRow(table: EventTable, result: DeserializedResult) extends HRow[EventTable, String](result, table)
 
  val EventTable = table(new EventTable)
}

HPaste stops short of being an ORM (like ActiveRecord or DataMapper) on purpose, but I’ve found myself making a number of ad-hoc singletons to wrap common operations against HBase. I’d find it very useful to have an ORM that sits on top of HBase and lets you define the operations and the de/serialization you want for your types. Time permitting, I’ll factor out what I have and post it on GitHub. (I also made a Ruby ORM for HBase back in 2008, but I think the way people use HBase has become more sophisticated since then, and that ORM design is no longer suitable for heavy use.)

From Quora: Will Linux incorporate tcpcrypt?

Someone on Quora just asked: “Will Linux incorporate tcpcrypt?” I posted a response over there:

I have been working on and off with tcpcrypt for about a year. I believe that if someone puts in the time to polish the Linux kernel implementation, it’d be a likely candidate for inclusion. Andrea Bittau (the lead tcpcrypt guy) told me he would like to work on the kernel implementation himself sometime in the future. Andrea, Mark Handley, and David Mazieres are also working on the Internet Draft.

For now, the userspace implementation works well (and supports Linux, Mac OS X, FreeBSD, and Windows). It has a library so that endpoints can see the tcpcrypt session ID (to perform their own authentication), at https://github.com/sorbo/tcpcrypt/blob/master/user/include/tcpcrypt/tcpcrypt.h, and I made an Apache module that passes the session ID to Web apps, at https://github.com/sqs/mod_tcpcrypt.

On the advantages of tcpcrypt over TLS:

If your system runs tcpcryptd, then existing applications will use tcpcrypt. If the destination host doesn’t support tcpcrypt, then the channel falls back to normal TCP with no communication round-trip overhead. If both sides support tcpcrypt, then the applications get encryption for free (no code changes and minimal overhead). Of course, this only protects against passive attackers; to protect against active attacks, the apps would have to be modified to authenticate using the tcpcrypt session ID. Still, it’s better than cleartext. This, I think, is the key benefit tcpcrypt has over TLS: it makes encrypted-by-default really simple to bring about.

See also the tcpcrypt Internet-Draft and the tcpcrypt Wikipedia page.

What I’ve finished reading, Jan-May 2011

I enjoyed all of them–or else I wouldn’t have finished reading them.