Python as a “Platform” for the Atmospheric Sciences

Editor’s note: In this post, Tommy Zhang, a Ph.D. student working on tropical climate, shares his experiences using Python in a WRF modeling and analysis workflow.

In the atmospheric sciences, we already have many computational tools. If you want to do numerical discretization, use a method from Numerical Recipes in Fortran or C. If you want to do some analysis, NCL’s library is superb. If you want to slice and dice the data, GrADS’s 4-D data structure will do the job.

We are not short of tools. Why should we add Python to our already long list?

Let me use my own workflow as a case study to answer this question. Recently, I needed to run the WRF model on a cluster. The process involved: 1) downloading GCM data, 2) editing the WPS configuration file, 3) running WPS to pre-process the GCM data, 4) editing the WRF configuration file, 5) running part of WRF to produce initial and boundary conditions, and 6) running another part of WRF as the simulation itself.
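The six steps above can be sketched as a small Python driver. This is only a sketch: the script names, namelist filenames, and executables below are hypothetical placeholders, and in a real setup you would substitute your own commands (e.g., `geogrid.exe`, `ungrib.exe`, `metgrid.exe` for the WPS step).

```python
import subprocess

# Hypothetical commands -- substitute your own scripts and paths.
STEPS = [
    ("download GCM data",        ["./get_gcm_data.sh"]),
    ("edit WPS namelist",        ["cp", "namelist.wps.template", "namelist.wps"]),
    ("run WPS",                  ["./geogrid.exe"]),   # plus ungrib.exe, metgrid.exe
    ("edit WRF namelist",        ["cp", "namelist.input.template", "namelist.input"]),
    ("run real.exe (IC/BC)",     ["./real.exe"]),
    ("run wrf.exe (simulation)", ["./wrf.exe"]),
]

def run_workflow(steps=STEPS, dry_run=False):
    """Execute each step in order; stop immediately if one fails."""
    completed = []
    for name, cmd in steps:
        print("Step: %s -> %s" % (name, " ".join(cmd)))
        if not dry_run:
            subprocess.check_call(cmd)  # raises CalledProcessError on failure
        completed.append(name)
    return completed

if __name__ == "__main__":
    run_workflow(dry_run=True)  # preview the plan without running anything
```

Because each step is a named entry in a list, changing the setup ten days later means editing one line, not re-reading a monolithic script.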

I could write a shell script to handle the whole workflow, and it would work: kick it off, drink some coffee, and take a break. However, ten days later, when I analyse the results and find myself dissatisfied, I have to go back to the script to change the setup and redo the work. Even though I know everything is contained in that one long, long script, it’s painful to have to read it all again.

That’s one thing Python has saved me from. Python code is more modular: you can abstract your work at several levels (function, class, module), so it reads more logically. The strict formatting requirements, although peculiar at first glance, greatly increase the readability of the code. Reading Python is more like reading literature; it is much easier to get the idea of the code instead of struggling with the syntax.

In terms of functionality, whatever a shell script can do (e.g., execute system commands) can also be done in Python, but with greater cross-platform flexibility. First, a shell script relies heavily on the system’s own utilities, and different computers may have different versions of those utilities. If I write a script using the Linux date command and it runs successfully on one computer, another computer may still give me a bug (this really happened to me). Python bundles many utility packages of its own, so there is no need to call the OS utilities at all. Second, with some distributions of Python (I use EPD Python), it’s very easy to install the same Python on different computers and different operating systems. Finally, even if you limit yourself to Linux, there are several shells (e.g., csh, sh, bash) with different syntax, and some distributions do not have all of them installed. If it’s your own PC, the fix is easy: go and download one. If it’s a cluster, it’s not so easy: you have to persuade a possibly stubborn administrator. In short, Python is more cross-platform because it provides most of the shell’s utility itself, and it is easy to maintain the same version of Python across all kinds of platforms and operating systems.
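The date-command problem above is a good concrete example: GNU and BSD versions of `date` take different flags for date arithmetic, whereas Python’s standard-library `datetime` module behaves identically everywhere. A minimal sketch:

```python
from datetime import datetime, timedelta

# GNU date supports `date -d "2010-01-01 + 10 days"`, but BSD date
# (e.g., on OS X) uses different flags entirely. The portable Python
# equivalent works on every platform:
start = datetime(2010, 1, 1)
later = start + timedelta(days=10)
print(later.strftime("%Y-%m-%d"))  # -> 2010-01-11
```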

So Python wins big in helping me make model runs. How about the analysis of my results? A long-period simulation produces a huge amount of output data. WRF provides some tools (e.g., the Registry) to decrease the volume of output, but I want to do more, say some kind of average, interpolation, or integration. These tasks are usually handled by a post-processing system such as NCL. But do I have to use NCL, or can I use Python?

NCL has a better library for atmospheric science analysis than Python. However, as the title of the NCL manual says, NCL is its own “mini” language: interaction with the rest of the system is troublesome, and NCL’s string-processing abilities are very basic. Python, on the other hand, is good at managing files and has analysis tools in the SciPy/NumPy libraries, though their scope is still not comparable to NCL’s. (I guess the SciPy community doesn’t receive enough support from the atmospheric and climate science communities.)
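The averaging and interpolation mentioned above are exactly what NumPy handles well. Here is a minimal sketch, assuming a WRF output field has already been loaded into a 4-D NumPy array (e.g., via a netCDF reader); the array shape, grid-point indices, and pressure levels are illustrative only:

```python
import numpy as np

# Hypothetical 4-D WRF field, dimensions (time, level, lat, lon)
temp = np.random.rand(24, 30, 90, 120)

# Time average over the whole record
temp_mean = temp.mean(axis=0)          # shape (30, 90, 120)

# Zonal (longitude) average of the time mean
zonal_mean = temp_mean.mean(axis=-1)   # shape (30, 90)

# Linear interpolation of one vertical profile onto new pressure levels
levels = np.linspace(1000.0, 100.0, 30)    # hPa, decreasing with index
profile = temp_mean[:, 45, 60]             # profile at one grid point
new_levels = np.array([850.0, 500.0, 250.0])
# np.interp requires increasing x values, so reverse both arrays
interp = np.interp(new_levels, levels[::-1], profile[::-1])
```

This covers generic array operations; the WRF-specific diagnostics NCL provides (e.g., derived variables on model levels) are where the gap remains.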

So, based on my experience, I conclude that I should use Python to organise my ideas, direct the structure of the modeling workflow, and interact with the system. For post-processing calculations, I can try Python first, but may need to have Python execute NCL for specific calculations Python cannot do natively.
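Having Python execute NCL can be done through the subprocess module, since NCL accepts variable assignments on its command line (`ncl 'name=value' script.ncl`). The script name and parameters below are hypothetical; only the command-construction pattern is the point:

```python
import subprocess

def build_ncl_command(script, **params):
    """Build an NCL invocation from Python, passing parameters as
    command-line variable assignments (ncl 'name=value' script.ncl)."""
    cmd = ["ncl"]
    for name, value in params.items():
        if isinstance(value, str):
            cmd.append('%s="%s"' % (name, value))  # NCL strings need quotes
        else:
            cmd.append("%s=%s" % (name, value))
    cmd.append(script)
    return cmd

# Hypothetical script name and parameters:
cmd = build_ncl_command("vertical_interp.ncl",
                        infile="wrfout_d01.nc", plevel=500)
# subprocess.check_call(cmd)   # uncomment on a machine with NCL installed
```

This way Python stays the "conductor" of the workflow, delegating only the specialized calculation to NCL.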

Python is easy to read, easy to maintain, and easy to use. It’s powerful and could replace the shell completely, and with SciPy/NumPy/matplotlib it can be used directly in scientific research. My only regret is that Python doesn’t yet have all the analysis tools the atmospheric sciences need; it would be great if the community could create them. What I want to emphasize, though, is that rather than being a single tool, Python is more like a platform. It enables you to use the computer in both high-level and low-level ways, and since Python code is so easy to read and write, you can do all your computer-related work in a more concise and smarter way. The end result: higher productivity.

  • Damien Irving

    Thank you for this very useful post.

    You note that NCL has a better library for atmospheric science analysis than Python, which I completely agree with. However, have you used CDAT at all? I would imagine (I am primarily a CDAT user and therefore not very experienced with NCL) that the capabilities of the Python/CDAT libraries combined are not far short of NCL.

  • Felix Carrasco

    Hi! I’m just starting my Ph.D. and I will be working with WRF. It’s nice to read this, because I’d like to start learning Python and use it during my Ph.D. I just have one question: what kind of things do you think Python needs? Something like the statistical library in Matlab, or is it something else?

  • http://www.johnny-lin.com/ Johnny Lin

    Hi Felix!  Your question has a big answer.  If you’re interested in a list of the packages one should install for AOS computing, you should see one of the posts we have on installing Python (here’s the OS X one:  http://pyaos.johnny-lin.com/?p=190).  There are literally tens of thousands of packages for Python.  If you’re interested in statistics, you can even call the R statistical library from Python (via rpy2)!  You might want to check out our AOS-specific tutorials list for details on learning and using the language:  http://pyaos.johnny-lin.com/?page_id=217.  Come join the mailing list, and hope we can help! 

  • Gökhan Sever

    Hello, I am new to the WRF world as well. I was wondering how much of NCL’s WRF functionality [http://www.ncl.ucar.edu/Document/Functions/wrf.shtml] exists in the Python ecosystem? At first look, this simple 2-D plotting script [http://www.mmm.ucar.edu/wrf/OnLineTutorial/Graphics/NCL/Examples/wrf_Hill2d.ncl] can be mapped to Python easily, except for the wrf_user_intrp3d function. Any ideas if this is already implemented? Thanks.

  • http://www.johnny-lin.com/ Johnny Lin

    That’s a good question. My only experience with Python interpolators has been with those bundled with CDAT and NumPy/SciPy. You may also want to ask the PyAOS mailing list.

  • Juan Carlos Bazo

    Hi Johnny, is it possible to post an installation manual for UV-CDAT on OS X / Mavericks? Thanks

  • http://www.johnny-lin.com/ Johnny Lin

    Hi Juan! I think you’ll want to send an email to the UV-CDAT folks (e.g., their mailing list). They have binary installs for most OSes. Or, if you’d like to write an installing manual, I’d be happy to post it at PyAOS :). Thanks!

  • Juan Carlos Bazo

    Thanks Johnny!!

  • auvipy
  • http://www.johnny-lin.com/ Johnny Lin

    Very cool! I’ve added it to the Specialized AOS Packages listing. Thanks!