
#AGU Day 1

(posted 13 Dec 2010)


##Let’s get some whining out of the way

First off: My back is killing me from all this walking and perhaps the air mattress I slept on. Like father like son, I guess. Also, I need to walk more. Oh, and Aleve don’t impress me much.

##Impressions of San Francisco

The weather is amazing, and BART is pretty rad. However: the crazy homeless people that are allowed to flourish under SF’s indiscriminate, treat-the-symptoms safety nets (and gorgeous weather)? Not so much. One consequence of living somewhere that’s unliveable for most of the year is that panhandling is actually way more difficult than any bona fide job out there. I’m also not used to the “You’re not allowed to do this” signs or the scarcity of coffee shops (not as bad as Austin, though).

##Posters!

I really like the posters. To be honest, my ADD keeps me from being able to concentrate on an entire talk, so posters are way more my speed.

I had two favorite posters:

###Cloud Base Height and Wind Speed Retrieval through Digital Camera Based Stereo Vision
####Fernando Janeiro, Frank Wagner, Pedro Ramos

I thought this was really cool. These guys used off-the-shelf digital cameras to take simultaneous pictures of cloud features (Fernando said the shots were synchronized to within a millisecond, though that strikes me as overkill), and then used image processing to triangulate the cloud positions in 3-space. By doing this over time, they were also able to calculate wind speeds. I thought it was really clever, and the fact that doing something like this is comparatively cheap definitely appeals to me. In fact, had I the time and the money, I would love to try replicating what they’ve done.
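Just to make the geometry concrete, here’s a back-of-the-envelope sketch of the idealized parallel-camera stereo relation Z = fB/d. All the numbers are made up, and their actual pipeline is surely more careful about calibration and feature matching; wind speed then falls out of tracking the same feature’s 3-D position across frames.

```python
import numpy as np

def cloud_range_and_height(focal_px, baseline_m, disparity_px, elevation_deg):
    """Range to a cloud feature and its height above the cameras,
    assuming an idealized parallel-camera stereo geometry."""
    rng = focal_px * baseline_m / disparity_px        # similar triangles: Z = fB/d
    height = rng * np.sin(np.radians(elevation_deg))  # project range onto vertical
    return rng, height

# Made-up numbers: 2000 px focal length, 50 m baseline, 20 px disparity,
# cameras tilted 60 degrees above the horizon.
rng, height = cloud_range_and_height(2000.0, 50.0, 20.0, 60.0)
print(rng, height)   # 5000 m range, ~4330 m cloud height -- plausible
```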

One limitation of their method is that they use USB directly to interact with the cameras, which caps their camera spacing (and hence the stereo baseline) at about 50 m. This, in turn, hurts the accuracy of their setup. One way I thought of to potentially deal with this would be to hook each camera up to some sort of small networked server (perhaps running on a BeagleBoard), and control the cameras over TCP or something similar. I have to admit, my first thought was dnode, though there are of course many other (more conventional) ways of doing this. I also wouldn’t mind seeing the image processing done with Python and scikits.image instead of MATLAB. ;) At any rate: great science, really cool idea.
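For what it’s worth, here’s a minimal sketch of what I mean, assuming Python on each board and the gphoto2 command-line tool for actually firing the camera (both of those are my assumptions, not anything from the poster). Note that naive TCP triggering alone won’t get you back to millisecond synchronization; you’d want NTP-disciplined clocks or a hardware trigger for that.

```python
import socketserver
import subprocess

class TriggerHandler(socketserver.StreamRequestHandler):
    """Fire the locally attached camera when a client says TRIGGER."""
    def handle(self):
        if self.rfile.readline().strip() == b"TRIGGER":
            # Here via the gphoto2 CLI; any camera-control library would do.
            subprocess.run(["gphoto2", "--capture-image"], check=True)
            self.wfile.write(b"OK\n")

if __name__ == "__main__":
    # One of these runs next to each camera; a central machine opens a
    # connection to every node and sends TRIGGER to all of them at once.
    with socketserver.TCPServer(("0.0.0.0", 9000), TriggerHandler) as server:
        server.serve_forever()
```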

###Ultra-Deep Subduction of Continental Material: Results from Coupled Thermodynamic-Thermomechanical Numerical Modeling!!
####Sergio Zlotnik and Juan Carlos Afonso

  1. Craziest, catchiest graphics out of any poster today. By far.

  2. I’m a sucker for 3-D simulation stuff like this. It turns out they’re using a “Lagrangian particle-in-cell finite element scheme.” I wonder if Dr. Lee could explain this to me? (There’s a toy sketch of the particle-in-cell idea just after this list.)

  3. There’s a series of screenshots of (a simulation of) a miles-long chunk of rock dangling and then breaking off into the earth’s magma on this poster. How intense is that?!
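In the meantime, here’s my (possibly wrong) understanding of the basic particle-in-cell idea as a toy 1-D sketch: material properties ride on Lagrangian particles, velocities live on a grid, and each timestep interpolates grid velocity to the particles, advects them, then projects their properties back to the grid for the solver. Every number here is invented, and the scheme on the poster is obviously far more elaborate.

```python
import numpy as np

# Toy 1-D particle-in-cell step: properties ride on particles,
# velocity lives on a grid.
nx, n_particles, dt = 50, 500, 0.1
grid_x = np.linspace(0.0, 1.0, nx)
grid_v = 0.01 * np.sin(2 * np.pi * grid_x)         # grid velocity field
part_x = np.random.rand(n_particles)               # particle positions
part_rho = np.where(part_x < 0.5, 3300.0, 2700.0)  # e.g. mantle vs. crust

for step in range(100):
    part_v = np.interp(part_x, grid_x, grid_v)        # 1. grid -> particle interp
    part_x = np.clip(part_x + part_v * dt, 0.0, 1.0)  # 2. Lagrangian advection

# 3. Project particle properties back to grid cells (crude bin average);
#    this is the field a finite element solver would then work with.
bins = np.digitize(part_x, grid_x)
grid_rho = np.array([part_rho[bins == i].mean() if np.any(bins == i) else np.nan
                     for i in range(1, nx + 1)])
print(grid_rho[:5])
```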

##My Thoughts On The Informatics Section

Besides learning that “informatics” is a word, I was also interested in this section because of my (limited) knowledge of web stuff from hanging out with the node.js and startup crowds.

So, first, here’s my impression of the current Big Problem in geophysical informatics: you have mountains of data. For example, the GRIB files used for things like Puff (the volcanic ash transport model) can be upwards of a gigabyte apiece, and one usually has tons of them.

  • How do you store them?
  • How do you sort them?
  • How do you catalog them?
  • How do you retrieve them?
  • How do you make heads or tails of any of this stuff??

As evidenced by all the posters on the subject, it’s a currently unsolved problem. There are various bits and pieces to work with, of course. For example, we have CDF and NetCDF, GRIB, the HDF formats, shapefiles, and all sorts of XML schemas. However, while there are plenty of ad hoc solutions in place for dealing with the organization, collection, etc. of these files, there doesn’t seem to be any canonical methodology beyond “use these formats.” Part of this is almost certainly because making a one-size-fits-all solution for highly disparate data isn’t a simple thing to do! In fact, it may not even be a good idea!
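To be fair, the format side of things is pretty pleasant already. For instance, just opening one of these files and seeing what’s inside is straightforward with the netCDF4 Python bindings (the filename and variable name below are hypothetical stand-ins):

```python
from netCDF4 import Dataset

# "winds.nc" and "u_wind" are made-up names for illustration.
with Dataset("winds.nc") as nc:
    print(nc.dimensions)                      # e.g. time, level, lat, lon
    print(list(nc.variables))                 # what's actually in the file
    u = nc.variables["u_wind"][0, 0, :, :]    # first time step, first level
    print(u.shape, float(u.min()), float(u.max()))
```

It’s everything around the files (cataloging, search, retrieval) that still feels ad hoc.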

I’m not about to claim that I know something these guys don’t, but I also have the feeling that people in various web3.0-ish businesses might know a thing or two about storing and handling large datasets. After all, social networking means tracking a social graph in one sense or another, and doing search means extracting gargantuan sets of associations out of a dataset that’s basically the size of the entire internet. I guess my ideas are:

  1. Can any of the big NoSQL databases help with the problem? How are things like MongoDB and Redis being used already? (There’s a toy sketch of the MongoDB idea after this list.)
  2. Why doesn’t anybody use JSON or YAML for scientific data?
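Here’s roughly what I’m imagining for the first question: keep the big binary files on disk, but index their metadata in something like MongoDB so they can be searched and cataloged. Everything here (host, database, field names, paths) is made up for illustration, using the pymongo bindings:

```python
from pymongo import MongoClient

# Hypothetical sketch: index GRIB/NetCDF file *metadata* in MongoDB so the
# big binary files stay on disk but can still be searched and cataloged.
client = MongoClient("localhost", 27017)
catalog = client.geodata.files

catalog.insert_one({
    "path": "/data/grib/gfs_20101213_00z.grb",   # made-up example path
    "format": "GRIB",
    "model": "GFS",
    "run_time": "2010-12-13T00:00:00Z",
    "variables": ["u-wind", "v-wind", "geopotential"],
    "bbox": {"lat": [45.0, 75.0], "lon": [-180.0, -130.0]},
})

# Later: "give me every GFS run that has wind fields."
for doc in catalog.find({"model": "GFS", "variables": "u-wind"}):
    print(doc["path"], doc["run_time"])
```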

I also secretly (or not-so-secretly) want to see node.js used by the scientific community (we’ll need some GSL bindings or something first, of course). >:)