Monthly Archives: February 2009

Times Open

Update: The Times has written a short blog post about the event.

This past Friday I attended the Times Open, a day-long seminar hosted by the New York Times for developers interested in working with their newly released APIs. Tim O’Reilly was the keynote speaker, and he gave an interesting presentation on the future of the newspaper (with slides). A few of the notes I took (many of which I also posted on Twitter) –

  • “The future is here. It’s just not evenly distributed yet.” – William Gibson
  • “The central fact of web 2.0 is harnessing collective intelligence… Web 2.0 is about finding meaning in user-generated data, and turning that meaning into real services that you can deliver in real time.”
  • The network as platform means that competitive advantage goes to systems that harness network effects so that they get better as more people use them.
  • ‘The breakthrough on Facebook that really got attention was the social graph.’
  • NYTimes needs to do more conversation on Twitter
  • “‘The bestowal of status’ is a lot of what publishers do,” O’Reilly included. This reminded me of Clay saying he had a ‘magic wand’ when he was guest blogging on Boing Boing and could link to things
  • Times People is currently a ghost town – just one more social network – How do APIs enable your content to be syndicated out?
  • Fascinated by big comment streams – 414 comments might be manageable, but the Joe Biden thing had 35,000 comments: “What do you do with 35,000 comments besides count them?”
  • We are moving out of the world in which people typing on keyboards will drive collective intelligence applications. Increasingly applications are driven by new kinds of sensors. (every laptop with an accelerometer becoming an earthquake sensor through an app that works similarly to SETI@home)
  • “It’s remarkable how much of our future is going to be driven by information exhaust from the devices we carry around with us.”
  • Lessons from Twitter: do one thing and do it well; let others build on what you do, even if it appears to compete with you; when users innovate, support their behaviors in your platform (@, #, $)
  • *If you’re really building a platform, your customers and partners may be building new features before you do.*
  • Digg as parasitic on all news media, without a business model, devaluing old media
  • Quote from Alan Kay: “The best way to predict the future is to invent it” and “if there’s some feature you want, don’t wait for the Times to do it; hack it.”

Who are you?

After the keynote, the NYT developers presented the following APIs –

The one that seemed to have the most potential and significant uses was the Newswire API – it is “an up-to-the-minute stream of links and metadata for items published on NYTimes.com. [...] Better than RSS, the Times Newswire API offers chronologically ordered cross-site results, including rich metadata.” I also thought ShifD had potential as a basis for building interesting things – it’s a tool for coherently shifting content and information between devices and contexts, and Ted Roden, one of the developers, gave a cool demonstration of a quick little app that can track personal expense data in a spreadsheet using SMS messages. The Times People API also seemed promising, and it is built so that its content-sharing capability can be incorporated into that of other social networks. Derek Gottfrid, Senior Software Architect at the NYTimes, put it well: “Our goal is not to own the social graph – we actually have a pretty good news and information site.” Hopefully this becomes a trend in social networking websites – they should focus on applications or content (like Twitter does with communication, Flickr does with photos, or Times People does with articles) and allow themselves to be integrated into coherent user experiences. Facebook, in contrast, offers only limited integration between its own functionality and that of external sites, and keeps the user in a ‘walled garden’ cut off from the rest of the internet. I’m formulating a separate post on this topic for later.

The final presentation was from Jacob Harris, who works on the Interactive Newsroom Technologies Team. He described interactive news as “kind of like pornography – you know it when you see it, but it’s kind of hard to define” (haha), and listed three essential components: data, story, and interaction. He presented a few recent interactive pieces, including the presidential election map used in the 2008 election (which was the best and most informative available online), a confidential government document that had been leaked to the NYT and posted online with associated metadata and enhanced browsing ability, and an easily accessible database of the prisoners at Guantanamo Bay.

I had an idea about considering NYTimes.com as one larger interactive experience, with articles and other media as the individual pieces of data, so I asked about it. Jacob said that his department ‘tried to stay small and do one off things, rather than deal with rearranging the homepage and interacting with 80 committees,’ which makes complete sense. But it reminded me that I had once heard of Google making tiny, pixel-level changes to the home page and using the behavior of huge numbers of people to determine which design was best. I also remembered a presentation I heard at a NY Tech meetup several months ago from someone who worked at the Huffington Post – they have a real-time traffic monitoring system to determine which articles on their main page are the most popular, allowing them to rearrange their content layout dynamically. I wonder if the NYT could use similar strategies to optimize their website?

IMG_5410
As a side note, Adam Harvey (above left), another first-year student from ITP, was at the event. Nathan (above right), whose blog I was reading last semester when I was starting to work with Processing and Scala, was there too.

And, as a former aspiring architect, I must comment that Renzo Piano’s New York Times Building was both well designed and well executed. The lobby was spacious and inviting, the (climbable) rods along the sides filtered the light nicely and cast interesting shadows, the finishes were attractive, the spaces were pleasant and flowed nicely, and the elevators (with the floor buttons in the waiting hallways) were very cool.

IMG_5903

Photos are from everyplace and Times Open on Flickr, thanks!

Mobile Tech for Social Change Barcamp

cross-posted at textonic.org

Last Saturday I attended the Mobile Tech for Social Change barcamp. From their website –

Mobile Tech 4 Social Change Barcamps are local events for people passionate about using mobile technology for social impact and to make the world a better place. Each event includes interactive discussions, hands-on-demos, and collaborations about ways to use, deploy, develop and promote mobile technology in health, advocacy, economic development, environment, human rights, citizen media, to name a few areas. Participants for Mobile Tech 4 Social Change barcamps include nonprofits, mobile app developers, researchers, donors, intermediary organizations, and mobile operators.

The event began with a talk (via Skype!) from Ethan Zuckerman, who I had seen speak previously in my Applications class last semester. He’s been involved in various service projects in Africa and is a co-founder of Global Voices Online, a community brought together by ‘bridgebloggers’ who translate posts between languages and cultures. His most salient and useful point was that mobile technology is most powerful when it is paired with another medium, such as FM radio.
mobiletech27

I went to three breakout sessions. The first was given by a few people from the Innovations Team at UNICEF and a couple of students from Columbia working on the Malawi child malnutrition monitoring project described in my previous post. I had seen some of what they presented before, but got to play with the RapidAndroid version of the RapidSMS software (which runs on a G1 mobile phone), and I saw some sample database inputs and SMS form instructions. In addition, I learned that while initial SMS error rates have been high in the pilot studies, the system will respond asking the user to resend the message, and this feedback loop is effective at teaching users to send correctly formatted SMS messages. My group in the Design for UNICEF class will continue with our Mechanical Turk project – there will still be unparseable messages or messages that don’t get resent – but it will be good to keep this in mind as we develop.

In between sessions I saw a demo from an MIT PhD student named Nadav Aharony at the Viral Communications group at the Media Lab. He was working on a not-yet-released general platform for development of Wi-Fi/Bluetooth peer-to-peer mobile applications. He had built a demo application that would let people associate their phones with a particular group of phones, and then automatically sync content on these phones over an ad-hoc network. The most interesting use case he suggested: if a protester takes a photo with the device, and there is a risk that the device might be confiscated, the photo will automatically be downloaded by the others in the group immediately after being taken, so even if the original device is lost the data is not.

The second breakout session was led by Josh Nesbit, a current undergrad at Stanford graduating this year. He presented an SMS-based project he did in Malawi for hospitals and the surrounding villages that used FrontlineSMS, an alternative SMS platform (that isn’t necessarily comparable in aim to RapidSMS). More information on that project is available here.

The last breakout session was for mobile developers, and we had an interesting conversation about developing for Android. Overall I enjoyed the day and found it useful, and I’m looking forward to going to the next m4change barcamp.

mobiletech12
Photos are from Meredith Whitefield on Flickr, thanks!

Programming A to Z – Assignment #5 Markov v vokraM

The class preceding our fifth assignment was on Markov chains, which involve statistical models of text that can be used to predict what character (or word, or some other unit) would be likely to follow a preceding series of n characters (or words or other units). (An interesting paper on such an algorithm from the assigned reading can be found here.) Our assignment, with a few suggestions, was simply to:

Modify, augment, or replace the in-class Markov chain example.

As presented by Adam, the MarkovFilter example looked at each series of, say, three characters to build a model of what the next character is most likely to be, and then used this model to generate new lines of text by starting with an initial series of three characters used by the actual text, choosing the next character based on the first three characters, choosing the character after that based on the new most recent three characters (the latter two from the first set of three, and the new fourth character), and so on.
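A minimal sketch of that forward approach, for illustration only. This is not the in-class MarkovFilter code: the class name MarkovSketch, its method names, and the sample text are hypothetical, and it assumes the next character is sampled in proportion to how often it followed the current context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for the in-class example; names and structure are assumptions.
class MarkovSketch {
    // Maps each n-character context to every character that followed it in the text.
    // Repeated followers are kept, so frequent followers are sampled more often.
    static Map<String, List<Character>> buildModel(String text, int n) {
        Map<String, List<Character>> model = new HashMap<>();
        for (int i = 0; i + n < text.length(); i++) {
            String context = text.substring(i, i + n);
            model.computeIfAbsent(context, k -> new ArrayList<>()).add(text.charAt(i + n));
        }
        return model;
    }

    // Starts from a seed context and appends one sampled character at a time.
    static String generate(Map<String, List<Character>> model, String seed, int length, Random rng) {
        int n = seed.length();
        StringBuilder out = new StringBuilder(seed);
        while (out.length() < length) {
            String context = out.substring(out.length() - n);
            List<Character> followers = model.get(context);
            if (followers == null) {
                break; // this context never had a follower in the source text
            }
            char next = followers.get(rng.nextInt(followers.size()));
            out.append(next);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "the theory of the thing is that the then and the there agree";
        Map<String, List<Character>> model = buildModel(text, 3);
        System.out.println(generate(model, "the", 40, new Random()));
    }
}
```

Each run prints a different line, since the followers are drawn at random.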

It occurred to me that perhaps it was unnecessary that the algorithm examine the text as we read it, from left to right. Instead, I wanted to rewrite MarkovFilter.java to work backwards – starting with the last three characters of a line and working backwards instead of forwards, looking at the set of three characters at the (temporary) beginning of the line, choosing a new first character for the line from a modified statistical model about which character is most likely to precede them, and repeating until the line is of the desired length. VokramFilter.java represents this reversed Markov filter, and the zip file of all the classes is here.
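Roughly, the reversed idea can be sketched as follows. This is not the VokramFilter.java from the zip file: the class VokramSketch and its methods are hypothetical names, and the sketch assumes each new first character is sampled in proportion to how often it preceded the current three-character context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of a backward-looking model; names and structure are assumptions.
class VokramSketch {
    // Maps each n-character context to every character that preceded it in the text.
    static Map<String, List<Character>> buildBackwardModel(String text, int n) {
        Map<String, List<Character>> model = new HashMap<>();
        for (int i = 1; i + n <= text.length(); i++) {
            String context = text.substring(i, i + n);
            model.computeIfAbsent(context, k -> new ArrayList<>()).add(text.charAt(i - 1));
        }
        return model;
    }

    // Starts from an ending and prepends one sampled character at a time.
    static String generateBackward(Map<String, List<Character>> model, String ending, int length, Random rng) {
        int n = ending.length();
        StringBuilder out = new StringBuilder(ending);
        while (out.length() < length) {
            String context = out.substring(0, n); // the (temporary) beginning of the line
            List<Character> predecessors = model.get(context);
            if (predecessors == null) {
                break; // nothing ever preceded this context in the source text
            }
            char previous = predecessors.get(rng.nextInt(predecessors.size()));
            out.insert(0, previous);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "the theory of the thing is that the then and the there agree";
        Map<String, List<Character>> model = buildBackwardModel(text, 3);
        System.out.println(generateBackward(model, "ree", 40, new Random()));
    }
}
```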

The text this generated seemed approximately as similar to English as that generated using the forward-looking method, and I considered for a while how to be sure this is the case. I suspected that the operation performed by backward-looking Vokram analysis was equivalent to reversing the text that was input to a forward-looking Markov algorithm and then reversing the result, and it seemed like that operation would do the same thing as the Markov algorithm on its own. Yet I couldn’t quite work out a more thorough proof of that intuition, and will see if Adam has any insights. I considered doing a comparative analysis of several large texts using both MarkovFilter and VokramFilter (by comparing the Markov analysis of a text generated from applying VokramFilter to an original text to a Markov analysis of that actual original text), but didn’t have a chance.
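One way to probe the first half of that intuition empirically is to check that a backward model of a text contains the same counts as a forward model of the reversed text, once each context is reversed too. The sketch below does only that. All names are hypothetical, it is independent of MarkovFilter and VokramFilter, and it says nothing about whether either direction produces text that is equally English-like.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical check; it compares follower and predecessor counts, not generated text.
class VokramEquivalenceCheck {
    // For each n-character context, counts how often each character follows it.
    static Map<String, Map<Character, Integer>> forwardCounts(String text, int n) {
        Map<String, Map<Character, Integer>> counts = new HashMap<>();
        for (int i = 0; i + n < text.length(); i++) {
            counts.computeIfAbsent(text.substring(i, i + n), k -> new HashMap<>())
                  .merge(text.charAt(i + n), 1, Integer::sum);
        }
        return counts;
    }

    // For each n-character context, counts how often each character precedes it.
    static Map<String, Map<Character, Integer>> backwardCounts(String text, int n) {
        Map<String, Map<Character, Integer>> counts = new HashMap<>();
        for (int i = 1; i + n <= text.length(); i++) {
            counts.computeIfAbsent(text.substring(i, i + n), k -> new HashMap<>())
                  .merge(text.charAt(i - 1), 1, Integer::sum);
        }
        return counts;
    }

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // True when the backward model of the text matches the forward model of the
    // reversed text, comparing each context against its reversed counterpart.
    static boolean modelsAgree(String text, int n) {
        Map<String, Map<Character, Integer>> backward = backwardCounts(text, n);
        Map<String, Map<Character, Integer>> forwardOfReversed = forwardCounts(reverse(text), n);
        if (backward.size() != forwardOfReversed.size()) {
            return false;
        }
        for (Map.Entry<String, Map<Character, Integer>> entry : backward.entrySet()) {
            if (!entry.getValue().equals(forwardOfReversed.get(reverse(entry.getKey())))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(modelsAgree("the quick brown fox jumps over the lazy dog", 3));
    }
}
```

The counts agree because every place where a character precedes a context in the original text is a place where that character follows the reversed context in the reversed text. That covers the "reverse, run forward, reverse again" half of the argument; whether the result is statistically the same as plain forward generation is a separate question that this check does not answer.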

Programming A to Z – Assignment #4 Concordance Sorting

The class prior to the fourth assignment covered concordances, data structures for word counts, and other related topics. I completed one of the more challenging suggested alternative tasks for the assignment:

Investigate Java’s Collections class. See if you can figure out how to use Collections.sort() to sort the output of ConcordanceFilter.java—first in alphabetical order, then ordered by word count. (See the official Sun tutorial.)

The Java documentation was relatively clear about what I needed to do using the Collections class and Comparator interface to get it working, and Google answered any remaining questions I had about syntax. There are a few files that I edited to run it (including the AlphabeticComparator and WordCountComparator classes), and you can download a zip file of the assignment here. When run, ConcordanceFilter.java will search for a word within a text and output each line on which that word occurred, then output those lines again in alphabetical order, and then output those lines a third time ordered by word count, with the line containing the fewest words first. For example:

$ java ConcordanceFilter place <lovecraft_dreams.txt
All contexts
remote place beyond the horizon, showing the ruin and antiquity of the city,
over a bridge to a place where the houses grew thinner and thinner. And it was
Contexts sorted alphabetically
over a bridge to a place where the houses grew thinner and thinner. And it was
remote place beyond the horizon, showing the ruin and antiquity of the city,
Contexts sorted by word count
remote place beyond the horizon, showing the ruin and antiquity of the city,
over a bridge to a place where the houses grew thinner and thinner. And it was
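
The two orderings above can be sketched with Collections.sort() and a pair of comparators. The comparator names match the classes mentioned earlier, but the bodies here are my own guess at how such classes might look, not the code from the zip file; the wrapper class ContextSortSketch and the helper sorted() are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; the comparator bodies are assumptions, not the assignment's code.
class ContextSortSketch {
    // Orders lines alphabetically, ignoring case.
    static class AlphabeticComparator implements Comparator<String> {
        public int compare(String a, String b) {
            return a.compareToIgnoreCase(b);
        }
    }

    // Orders lines by their number of whitespace-separated words, fewest first.
    static class WordCountComparator implements Comparator<String> {
        static int wordCount(String line) {
            String trimmed = line.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }

        public int compare(String a, String b) {
            return Integer.compare(wordCount(a), wordCount(b));
        }
    }

    // Sorts a copy so the original order of contexts is left untouched.
    static List<String> sorted(List<String> lines, Comparator<String> order) {
        List<String> copy = new ArrayList<>(lines);
        Collections.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> contexts = Arrays.asList(
                "remote place beyond the horizon, showing the ruin and antiquity of the city,",
                "over a bridge to a place where the houses grew thinner and thinner. And it was");
        System.out.println(sorted(contexts, new AlphabeticComparator()));
        System.out.println(sorted(contexts, new WordCountComparator()));
    }
}
```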

Design for UNICEF – RapidSMS and Mechanical Turk

cross-posted at textonic.org

This is my first post about Clay Shirky’s Spring 2009 class Design for UNICEF (syllabus). Our task is to design, build, and deploy solutions to improve the lives of people in Africa under the age of thirty. We’ve spent the first part of the semester in small groups iterating through a huge number of potential ideas, but now things have begun to solidify.

I’m excited to be working in a group with Thomas Robertson, Lina Maria Giraldo, Amanda Syarfuan, and Yaminie Patodia. Our project in a sentence: We plan to extend UNICEF’s existing RapidSMS platform and RapidSMS-based projects to use Amazon’s Mechanical Turk online task marketplace to provide automated correction of malformatted SMS database inputs.

RapidSMS (link 1, link 2) is a project developed by Evan Wheeler, Adam Mckaig and others in UNICEF’s Division of Communication. It is designed to be an extensible platform for sending and receiving SMS text messages using a computer server. Mobile phone penetration in Africa is relatively high and growing quickly, and SMS is a powerful tool that can be applied to a wide variety of UNICEF-type projects. It is particularly useful for quickly aggregating large amounts of data from the field; where previous methods required the tedious process of faxing in and compiling paper forms, mobile phones can be used to submit that data quickly via text message, and this ultimately allows coordinators to make better decisions about the allocation of limited resources. RapidSMS allows for automatic insertion of SMS messages into a centralized database, as well as the export of this data in human- and machine-readable formats (such as graphs and Excel files). It has already been deployed for a food supply distribution project in Ethiopia (link) and a child malnutrition monitoring project in Malawi (link).

One challenge for such SMS-based database input systems is the problem of malformed text inputs – users won’t always know the proper message format or might be in a hurry and mis-type a key. It’s practically impossible to design a system that can handle every possible input, so as a result valuable information gets thrown out, even though it is present in the messages. An actual person might be able to successfully parse many of these malformed messages and determine which pieces of information from the SMS go in which database fields; UNICEF workers, however, generally have more pressing tasks while in the field.

We plan to extend the open-source RapidSMS system so that it automatically sends these malformed SMS database inputs to Amazon’s Mechanical Turk for conversion into proper database inputs. (Mechanical Turk is an online marketplace that pairs tasks that are simple for a person, yet too hard for a computer, with people who want to do them for money, often just a few cents.) This RapidSMS extension could then be integrated into the existing projects mentioned above, making them more scalable and more effective.
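
As a rough illustration of the triage step this implies, here is a sketch that separates messages a parser can handle from those a person would need to look at. Everything in it is an assumption: the "REPORT id reading" message format, the class SmsTriageSketch, and its methods are invented for the example and are not part of RapidSMS. It also makes no calls to Mechanical Turk; the review list just stands in for whatever would be handed off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage step; the message format and all names are invented for illustration.
class SmsTriageSketch {
    // Assumed format: a keyword, a numeric ID, and a numeric reading, e.g. "REPORT 1042 11.5".
    static final Pattern REPORT = Pattern.compile(
            "^\\s*REPORT\\s+(\\d{1,9})\\s+(\\d{1,4}(?:\\.\\d{1,2})?)\\s*$",
            Pattern.CASE_INSENSITIVE);

    static final class Report {
        final int id;
        final double reading;

        Report(int id, double reading) {
            this.id = id;
            this.reading = reading;
        }
    }

    // Returns the parsed report, or empty when the message does not match the format.
    static Optional<Report> parse(String sms) {
        Matcher m = REPORT.matcher(sms);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Report(Integer.parseInt(m.group(1)), Double.parseDouble(m.group(2))));
    }

    // Collects the messages that failed to parse and would need a person to interpret them.
    static List<String> needsHumanReview(List<String> inbox) {
        List<String> queue = new ArrayList<>();
        for (String sms : inbox) {
            if (!parse(sms).isPresent()) {
                queue.add(sms);
            }
        }
        return queue;
    }

    public static void main(String[] args) {
        List<String> inbox = Arrays.asList(
                "REPORT 1042 11.5",
                "report 1042 eleven point five",
                "1042 REPORT 11.5");
        System.out.println(needsHumanReview(inbox));
    }
}
```

The second and third sample messages carry the same information as the first, which is exactly the case where a strict parser throws data away and a person could recover it.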

I’ll post more as the project progresses throughout the semester, and please leave a comment with any feedback!