Thursday, February 23, 2012

NeoBlog - My Neo4j Challenge Entry

It's been an interesting start to the year. Towards the end of January I purchased a copy of Seven Databases in Seven Weeks to give myself a boost in the world of nosql databases. Within a week I had discovered the neo4j challenge. It seemed too good an opportunity to miss, so I embarked on writing an application for the competition. This is my write-up of how it went.
Some Design

I decided early on that my focus was:
  1. Learning about neo4j
  2. Learning about writing web apps in python
  3. Submitting an entry

Getting Started

With this in mind I spent a week playing around with neo4j via Nigel Small's excellent py2neo library. I started off by modelling the London Underground in neo4j and experimenting with finding routes around it. Here's a tweet I made with a photo of part of the network. This was a great learning activity: I found a bug in the py2neo library, which I fixed, and Nigel was good enough to pull the fix into his repo. You can see the commit here. This was my first real contribution to an open source project, which I was pretty pleased with.
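As an illustration of the route-finding idea, here is a minimal sketch in plain python. It does not use neo4j or py2neo: stations are kept in an in-memory dictionary and a breadth-first search finds the route with the fewest stops. The station list, `build_graph` and `find_route` are assumptions made up for this example, not the model or code described above.

```python
from collections import deque

# A small, hand-picked fragment of the network, used purely for illustration.
CONNECTIONS = [
    ("Bond Street", "Oxford Circus"),
    ("Oxford Circus", "Tottenham Court Road"),
    ("Oxford Circus", "Green Park"),
    ("Bond Street", "Green Park"),
    ("Green Park", "Victoria"),
    ("Green Park", "Westminster"),
]


def build_graph(connections):
    """Turn a list of station pairs into an undirected adjacency map."""
    graph = {}
    for a, b in connections:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_route(graph, start, goal):
    """Return the route with the fewest stops as a list, or None if there is none."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        # Sorting makes the search order (and so the result) deterministic.
        for neighbour in sorted(graph[station]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(CONNECTIONS)
route = find_route(graph, "Tottenham Court Road", "Victoria")
print(" -> ".join(route))
```

In a graph database the stations would be nodes and the connections relationships, and the search itself would normally be left to the database rather than written by hand.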


The Blog Idea

Despite having lots of fun playing, I couldn't get it working quite the way I wanted, so I decided to keep it on the back burner and try out another idea I had for the competition, one I knew wasn't going to take much work. The idea was for each node to be a post. Instead of using tags to connect similar posts, you would just connect them with edges. It seemed simple enough - so off I went.


The Doing

I'd not done any real web applications in python before - just a few toy django applications. But django (and rails for that matter) always feels a bit heavyweight for my liking (that's a topic for another post), so I was looking forward to using flask. The application took shape quite quickly. I spent a few hours adding users and admin pages - but I felt this began to detract from the aim. My intention from the start was to keep the application simple: I felt that an application that would be shared for other people to clone from should be as small as possible. I wanted others to be able to understand the application in under ten minutes. By removing admin pages and users I managed to get rid of about half the code, until I was down to an application that could do 3 things:

  • Add a post
  • Link a post to another one
  • View all posts
Not especially groundbreaking or shippable - but OK, I believe, as an example.
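To make those three operations concrete, here is a minimal sketch of the data model in plain python: each post is a node and each link is an undirected edge between two posts. This is an in-memory stand-in, not the flask and neo4j code from the entry, and the `Blog` class and its method names are hypothetical.

```python
import itertools


class Blog:
    """In-memory stand-in for the graph: posts are nodes, links are edges."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.posts = {}     # post id -> {"title": ..., "body": ...}
        self.links = set()  # undirected edges stored as frozenset({id_a, id_b})

    def add_post(self, title, body):
        """Create a post node and return its id."""
        post_id = next(self._ids)
        self.posts[post_id] = {"title": title, "body": body}
        return post_id

    def link_posts(self, id_a, id_b):
        """Connect two existing posts with an edge."""
        if id_a == id_b:
            raise ValueError("a post cannot be linked to itself")
        for post_id in (id_a, id_b):
            if post_id not in self.posts:
                raise KeyError(post_id)
        self.links.add(frozenset((id_a, id_b)))

    def all_posts(self):
        """Return (post id, title, sorted ids of linked posts) for every post."""
        result = []
        for post_id, post in sorted(self.posts.items()):
            linked = sorted(
                other
                for edge in self.links if post_id in edge
                for other in edge if other != post_id
            )
            result.append((post_id, post["title"], linked))
        return result


blog = Blog()
first = blog.add_post("Hello neo4j", "My first post.")
second = blog.add_post("Graphs instead of tags", "Posts joined by edges.")
third = blog.add_post("Flask notes", "Keeping the app small.")
blog.link_posts(first, second)
blog.link_posts(second, third)
for row in blog.all_posts():
    print(row)
```

Storing each edge as a frozenset means linking A to B and B to A gives the same edge, which matches the idea of links replacing tags rather than pointing in one direction.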

What would I do differently next time?

When I started, I wasn't totally sure how everything was going to end up, so I decided to play safe and use a language I was familiar with. Looking back, I wish I had taken the chance and written it in clojure. I think this would have been an ideal opportunity to play more with it.

Something that didn't occur to me until after I deployed is that when two people link the same pair of posts, two edges are created. I think an edge should instead have a weighting, and each user that creates a connection should add to the weight. You could then display similar posts in order of similarity. This idea plays into some other work I'm doing - but I really should add it to this application at some point.
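A minimal sketch of that weighting idea, again in plain python rather than neo4j: one edge per pair of posts, with a weight equal to the number of distinct users who made the link. The `WeightedLinks` class, its method names, and the choice to count each user only once per edge are all assumptions for illustration.

```python
from collections import defaultdict


class WeightedLinks:
    """One edge per pair of posts; the weight is how many users made the link."""

    def __init__(self):
        # edge (frozenset of two post ids) -> set of users who created that link
        self._voters = defaultdict(set)

    def link(self, user, post_a, post_b):
        """Record that `user` connected the two posts."""
        if post_a == post_b:
            raise ValueError("a post cannot be linked to itself")
        self._voters[frozenset((post_a, post_b))].add(user)

    def weight(self, post_a, post_b):
        """Number of distinct users who linked this pair (0 if nobody has)."""
        return len(self._voters.get(frozenset((post_a, post_b)), ()))

    def similar_posts(self, post):
        """Linked posts as (post id, weight), heaviest edge first."""
        scored = [
            (next(iter(edge - {post})), len(users))
            for edge, users in self._voters.items()
            if post in edge
        ]
        # Ties are broken by post id so the order is stable.
        return sorted(scored, key=lambda item: (-item[1], item[0]))


links = WeightedLinks()
links.link("alice", "neo4j-intro", "graph-modelling")
links.link("bob", "graph-modelling", "neo4j-intro")  # same pair, other direction
links.link("alice", "neo4j-intro", "flask-notes")
links.link("alice", "neo4j-intro", "flask-notes")    # repeat by the same user
print(links.similar_posts("neo4j-intro"))
```

Keeping a set of users per edge, rather than a bare counter, stops one person from inflating a weight by linking the same two posts repeatedly. A plain counter would be simpler if that doesn't matter.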

Summary

I had lots of fun doing this. I learned a bit more about writing web apps, I learned a bit more about git, and I learned how to use neo4j. All in all, not bad for a few days' work.

References:

Neo4j Challenge

Neo Blog Entry

Neo Blog Source Code