Open Data

Cool open-source models?

I'm looking to develop my idea of open models, which I motivated here and started to describe here.

I wrote the post in March 2012, but the need for such a platform has only become more obvious. I’m lucky to be working with a super fantastic Python guy on this, and the details are under wraps, but let’s just say it’s exciting. So I’m looking to showcase a few good models to start with, preferably in Python, but the critical ingredient is that they’re open source.

They don’t have to be great, because the point is to see their flaws and possibly to improve them. For example, I put in a FOIA request a couple of days ago to get the current teacher value-added model from New York City. A friend of mine, Marc Joffe, has an open source municipal credit rating model. The idea here is to get the model, not necessarily the data (although even better if it can be attached to data and updated regularly).

Open Models (part 2)

In my first post about open models, I argued that something needs to be done but I didn’t really say what.

This morning I want to outline how I see an open model platform working, although I won’t be able to resist mentioning a few more reasons we urgently need this kind of thing to happen. The idea is for the platform to have easy interfaces both for modelers and for users. I’ll tackle these one at a time.

Modeler

Say I’m a modeler. The platform asks for the data, and I either upload the data or give it a URL that tells the platform where the data is located. Next, I specify the extent to which the data needs to stay anonymous (hopefully not at all, but sometimes, as in the case of medical data, I need to place security around the data). Finally, I specify which parameters in my model were obvious “choices” (like tuning parameters, prior strengths, or thresholds I chose for cleaning data).

User

Now say I’m a user. I can also change the model more fundamentally.
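To make the modeler and user steps above a little more concrete, here is a minimal Python sketch of what a submission might look like. It assumes nothing about how the actual platform is built: every name in it (`ModelSubmission`, `Anonymity`, `with_choices`, the example URL, and the parameter names) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field, replace
from enum import Enum
from typing import Dict, Optional


class Anonymity(Enum):
    """How strictly the data must be protected (hypothetical levels)."""
    NONE = "none"              # data can be shown openly
    RESTRICTED = "restricted"  # e.g. medical data that needs security around it


@dataclass(frozen=True)
class ModelSubmission:
    """What a modeler hands to the platform (illustrative, not a real API)."""
    name: str
    data_url: Optional[str] = None  # where the data lives, if not uploaded directly
    anonymity: Anonymity = Anonymity.NONE
    # Parameters the modeler flags as explicit "choices":
    # tuning parameters, prior strengths, data-cleaning thresholds.
    choices: Dict[str, float] = field(default_factory=dict)


def with_choices(submission: ModelSubmission, **overrides: float) -> ModelSubmission:
    """Return a copy of the submission with some flagged choices changed by a user."""
    unknown = set(overrides) - set(submission.choices)
    if unknown:
        raise KeyError(f"not flagged as choices: {sorted(unknown)}")
    return replace(submission, choices={**submission.choices, **overrides})


# A modeler registers a model and flags two choices (made-up values).
original = ModelSubmission(
    name="example-model",
    data_url="https://example.org/data.csv",
    choices={"outlier_threshold": 3.0, "prior_strength": 1.0},
)

# A user tries a different cleaning threshold without touching the original.
variant = with_choices(original, outlier_threshold=2.5)
```

Keeping the flagged choices in one place is what would let a user see how sensitive the results are to the modeler's decisions. More fundamental changes to the model would need more than this sketch covers.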

Open data and formats

There is no doubt that open data is a topic with an important place in today’s digital ecosystem.

In my view, it follows on somewhat from the rise of open source. Digital publishing today is driven by free software, whether blog engines, CMSs, other software … I try to draw a parallel between these two things because open data has something to learn from free software: to learn from the mistakes the latter made, which must not be repeated (if that is still possible). First of all, the very essence of free software, just as with open data, is the community it gathers around itself. For the data to be easy to use, it is imperative that the format used has a solid structural foundation.

