Tuesday, August 23, 2005

Google Talk is Live

Google has launched Google Talk.
It's a Jabber service, and Gmail users can use their email address and password to log in to talk.google.com.

What's next?

Tuesday, August 16, 2005

Closures in Ruby and Python

Edit: This post came up in a recent (2008-04-27) post on reddit. For an account of where the term "closure" came from, look here.

While reading the Django/RoR comparison by Sam over at magpiebrain, I saw the "Ruby has closures, Python doesn't" argument mentioned. I admit to not being really familiar with the concept, but I believe I understand what it is. The thing is, I don't see what's important about it.

A link was given in the post to an interview with Yukihiro Matsumoto where "Matz" is asked about what a closure is and how it's beneficial. Here's the most relevant portion of the interview:

Bill Venners: OK, but what is the benefit of having the context? The distinction that makes Ruby's closure a real closure is that it captures the context, the local variables and so on. What benefit do I get from having the context in addition to the code that I don't get by just being able to pass a chunk of code around as an object?

Yukihiro Matsumoto: Actually, to tell the truth, the first reason is to respect the history of Lisp. Lisp provided real closures, and I wanted to follow that.

Bill Venners: One difference I can see is that data is actually shared between the closure objects and the method. I imagine I could always pass any needed context data into a regular, non-closure, block as parameters, but then the block would just have a copy of the context, not the real thing. It's not sharing the context. Sharing is what's going on in a closure that's different from a plain old function object.

Yukihiro Matsumoto: Yes, and that sharing allows you to do some interesting code demos, but I think it's not that useful in the daily lives of programmers. It doesn't matter that much. The plain copy, like it's done in Java's inner classes for example, works in most cases. But in Ruby closures, I wanted to respect the Lisp culture.


I respect cool implementations of higher-order concepts as much as the next guy, but I can't help but feel cheated by that response. I've seen some bickering between Python and Ruby clerics that try to demonstrate why closures are important and how Python doesn't have them; nothing I saw gave a practical example.

Anyone have any?

Edit: I may have found a practical example, but I'm not sure what's so special about it.

Here's a wiki entry about the benefits of closures in Ruby, and here are the Pythonized versions that work the same way. This is just a special case of closures; I'm still looking for something completely different.
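For what it's worth, here is a minimal sketch of the "sharing the context" idea from the interview, written in Python. The function names are made up for illustration, and `nonlocal` is a Python 3 keyword that did not exist when this post was written; at the time you would have had to keep the count in a mutable container such as a list.

```python
def make_counter():
    """Return a function that shares `count` with this enclosing call."""
    count = 0

    def increment():
        nonlocal count  # rebind the enclosing variable (Python 3 only)
        count += 1
        return count

    return increment


def make_multiplier(factor):
    """Merely reading a captured variable needs no special declaration."""
    return lambda x: x * factor


counter = make_counter()
first, second = counter(), counter()  # both calls update the same `count`
other = make_counter()()              # a fresh closure gets its own `count`
double = make_multiplier(2)
```

The counter is the part that depends on real sharing: each call changes state that lives in the enclosing scope, not in a copy passed in as a parameter.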

Tuesday, August 09, 2005

Pythonic file I/O?

A coworker of mine has recently been using Python on and off for basic administrative tasks and ran into a problem this week with modifying the contents of a file without rewriting the entire file. He wasn't trying to modify the file size, just change a number of bytes that were somewhere in the file. I had forgotten about mmap, but is that the best Python module for file modification?

It has two main limitations: 1) There's no way to insert into or delete from a random location in the file. 2) An mmap object maps a file in part or in total, but the mapping must start at the beginning of the file. This is a problem when modifying large files that can't be loaded entirely into memory; a solution would be to allow the mmap to begin a file mapping at an arbitrary position.

I understand there are some technical details to make this work. Completely rewriting a file may still be needed in some cases, but all of this should be implemented in a slick module and hidden from the developer. As it stands, it's just not very pythonic.
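As a rough sketch of the case my coworker had (change some bytes, keep the file size the same), something like the following should do it with mmap. The `patch_bytes` helper and the demo file are hypothetical, and the code assumes a modern Python 3 where mmap objects work as context managers.

```python
import mmap
import os
import tempfile


def patch_bytes(path, offset, new_bytes):
    """Overwrite len(new_bytes) bytes at `offset` without changing the file size."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mm:
            end = offset + len(new_bytes)
            if offset < 0 or end > len(mm):
                raise ValueError("patch would run past the end of the file")
            mm[offset:end] = new_bytes  # slice assignment must keep the length
            mm.flush()


# Demo on a throwaway file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello world")
patch_bytes(demo_path, 6, b"WORLD")
with open(demo_path, "rb") as f:
    patched = f.read()
os.remove(demo_path)
```

For a one-off patch, opening the file in "r+b" mode and calling `f.seek(offset)` followed by `f.write(new_bytes)` does the same job without mapping anything. Newer Python versions also accept an `offset` argument to `mmap.mmap` (it has to be a multiple of `mmap.ALLOCATIONGRANULARITY`), which addresses the second limitation above.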

Friday, August 05, 2005

Python Rocks

I was thinking of just getting to sleep after my last post, but started fiddling with a little Python module to handle parsing words from a source into a vocabulary. Well, in the time since my last post, minus a quick browse of Planet Python and some email checking, I've written a basic module that will accept text input, trash all special characters (e.g., (){}!?), use the stemmer, and drop the output in a list.

List comprehensions are great.
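A sketch of roughly what that module does is below. This is not the actual module: the `stem` function is a hypothetical placeholder that strips a few suffixes, standing in for the real stemmer, and `parse_words` is a made-up name.

```python
import re


def stem(word):
    """Placeholder stemmer (hypothetical): strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def parse_words(text):
    """Lowercase, replace non-letters with spaces, stem, and return a list."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [stem(word) for word in cleaned.split()]


words = parse_words("Parsing (words) from {sources}!?")
```

The list comprehension on the last line of `parse_words` is where the stemming happens, one word at a time.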

Stemming, information filtering

I've started work on an information filtering system to assist in returning only relevant documents. I'll be using my own implementation of probabilistic latent semantic analysis (PLSA) at first. A couple of ideas for how to extend it have already entered my head, and a buddy of mine has started playing around with a probabilistic model based on least squares.

I'm doing all of this in Python and have already found a decent word stemmer algorithm that's been written in Python. Not much to it, but it seems to work well. The next step is building a vocabulary; I've found a few with some Google searches, but I'm going to go ahead and write a Python module to ingest text files, clean out punctuation and digits, stem the remaining words, and write the result to a master vocabulary.
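The plan above could be sketched along these lines. Everything here is an assumption for illustration: `update_vocabulary` is a made-up name, the master vocabulary is assumed to be a plain text file with one word per line, and `stem` is a pass-through stand-in for the real stemmer.

```python
import os
import string
import tempfile


def stem(word):
    """Stand-in for the real stemmer (hypothetical): returns the word unchanged."""
    return word


def update_vocabulary(text_paths, vocab_path):
    """Merge the stemmed words from each text file into a sorted vocabulary file."""
    vocab = set()
    if os.path.exists(vocab_path):
        with open(vocab_path, encoding="utf-8") as f:
            vocab.update(line.strip() for line in f if line.strip())
    # Map punctuation and digits to spaces so "rocks,2005" splits cleanly.
    junk = string.punctuation + string.digits
    table = str.maketrans(junk, " " * len(junk))
    for path in text_paths:
        with open(path, encoding="utf-8") as f:
            for word in f.read().lower().translate(table).split():
                vocab.add(stem(word))
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(sorted(vocab)) + "\n")
    return sorted(vocab)


# Demo with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    doc = os.path.join(tmp, "doc.txt")
    with open(doc, "w", encoding="utf-8") as f:
        f.write("Python rocks, 2005! Python rocks.")
    master = update_vocabulary([doc], os.path.join(tmp, "vocab.txt"))
```

Keeping the vocabulary in a set means repeated words and repeated runs over the same files don't create duplicates.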

Wednesday, August 03, 2005

For your browsing pleasure

I came across these two posts recently: The Multiple Self and A Spooky Mind Hack.

They're good reading. If you start The Multiple Self and get tired of it before finishing, go read the second post.

Overall, very good stuff, though I do think there are a few inaccuracies in the definition of the "emotional net". The gist of it is right on.

Monday, August 01, 2005

Ahhh, Django...

I've been testing out Django this past week and have mostly finished the first two tutorials. There's a lot to become familiar with, but I'm amazed at how powerful the framework is. Though I haven't done much with Rails other than research it, I'm glad that the community is shaping up to allow for exchange of ideas and friendly competition.

This is also a good thing for Python, as I think there were a decent number of Pythoners wishing to do some web application work in Python but ended up using Java (*shudder*) or Rails. Not to discount the work that many put in on templating engines and basic frameworks for Python, but it's nice to use one framework that's a complete package.

dynamictyping.com is coming along; I've just been sidetracked of late, getting into AI solutions for information retrieval.

Probabilistic latent semantic analysis is the topic of the day.