Michael Bertolacci, Research Fellow at the University of Wollongong
Some R packages: armspp and WoodburyMatrix (2020-11-05)

<p>Over the past year I’ve published two R packages that I’d like to highlight in this blog post.</p>
<!--more-->
<p>The first is armspp (<a href="https://github.com/mbertolacci/armspp">github</a>, <a href="https://CRAN.R-project.org/package=armspp">CRAN</a>), which provides an efficient Rcpp implementation of the Adaptive Rejection Metropolis Sampling (ARMS) algorithm. The algorithm can be called from R with a user-specified target log density, or it can be called from C++ directly through inclusion of the header-only implementation.</p>
<p>The second package is WoodburyMatrix (<a href="https://github.com/mbertolacci/WoodburyMatrix">github</a>, <a href="https://cran.r-project.org/package=WoodburyMatrix">CRAN</a>). It provides a hierarchy of classes and methods for manipulating matrices formed implicitly from the sums of the inverses of other matrices, a situation commonly encountered in spatial statistics and related fields. It makes it easy to use the Woodbury matrix identity and the matrix determinant lemma to allow computation (e.g., solving linear systems) without having to form the actual matrix.</p>
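<p>The identity the package exploits is easy to check numerically. The sketch below uses base R only (the matrices are made up for illustration, and the WoodburyMatrix package itself is not used) to verify that \((A + UCV)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}\):</p>

```r
# Numerical check of the Woodbury matrix identity:
#   (A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}
set.seed(1)
n <- 50
k <- 3
A <- diag(runif(n, 1, 2))        # a cheap-to-invert 'large' matrix
U <- matrix(rnorm(n * k), n, k)  # a low-rank update
C <- diag(k)
V <- t(U)

direct <- solve(A + U %*% C %*% V)
A_inv <- solve(A)  # trivial here, since A is diagonal
woodbury <- A_inv -
  A_inv %*% U %*% solve(solve(C) + V %*% A_inv %*% U) %*% V %*% A_inv

max(abs(direct - woodbury))  # agrees to numerical precision
```

<p>The point, of course, is that the right-hand side only ever inverts the easy matrix \(A\) and a small \(k \times k\) matrix.</p>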
<p>There is a more general idea here, which is to define S4 classes to encapsulate implicitly formed matrices. For example, one could make a matrix-product class to encapsulate the product of two matrices \(AB\) without explicitly calculating the product. Then the class could provide operations like <code class="language-plaintext highlighter-rouge">%*%</code> and <code class="language-plaintext highlighter-rouge">solve</code>, just as in the WoodburyMatrix package. This could be advantageous in a few situations, for example when \(A\) is dense and \(B\) is sparse (or vice versa), and the product may be dense. Or possibly the matrices could themselves be implicit. I don’t think such a package exists yet, so if I need it, I’ll write it and release it. But if you do so first, please tell me about it!</p>

Talk at MCM2019, and our rainfall paper published! (2019-06-15)

<p>Last week I gave a talk about our rainfall work at <a href="https://mcm2019.unsw.edu.au">MCM2019</a>, a fantastic conference on Monte Carlo methods held in Sydney, Australia. The conference was awesome, and the organisers deserve many kudos for their great work. I came away with lots of great Monte Carlo insights, and many ideas and techniques to try!</p>
<!--more-->
<p>The slides from my talk are <a href="/assets/2019-06-mcm-talk.pdf">available online here</a>.</p>
<p>In other good news, our paper on the same topic has been published in the Annals of Applied Statistics, <a href="https://projecteuclid.org/euclid.aoas/1560758424">available here</a> (ungated version <a href="/assets/bertolaccietal2019.pdf">here</a>).</p>

Sampling sparse Gaussian Markov Random Fields in R with the Matrix package (2018-03-22)

<p>Over the past few months I’ve been involved in a fun project with <a href="https://andrewzm.wordpress.com/">Andrew Zammit Mangion</a> and <a href="https://niasra.uow.edu.au/cei/people/UOW202822">Noel Cressie</a> at the <a href="https://www.uow.edu.au/index.html">University of Wollongong</a>. This project involves inference over a large spatial field using a model whose latent space is distributed as a multivariate Gaussian with a large, sparse precision matrix (it also involves me learning a lot from Andrew and Noel!). This is my first time working with sparse precision matrices, so I’ve been discovering many new things: what working in precision-space rather than covariance-space means, and how to draw samples from such models even when the number of data points is large. In this post I share a little of what I’ve learned, along with R code. A lot of what follows is derived from the <a href="https://folk.ntnu.no/hrue/GMRF-book/">excellent book</a> on this topic by Rue and Held.</p>
<!--more-->
<p>Let’s write</p>
\[\tilde{y} \sim N(\tilde{\mu}, Q^{-1}),\]
<p>where \(\tilde{y}\) and \(\tilde{\mu}\) are \(n\) element column vectors, and \(Q\) is a sparse \(n \times n\) precision matrix. The \(ij\)th entry of \(Q\), \(Q_{ij}\), has a simple interpretation:</p>
\[\mathrm{cor}(y_i, y_j \mid \tilde{y}_{-ij})
=
-\frac{Q_{ij}}{\sqrt{Q_{ii} Q_{jj}}},\]
<p>that is, the correlation between \(y_i\) and \(y_j\), conditional on all the other entries in \(\tilde{y}\), is proportional to the \(ij\)th entry of the precision matrix. Hence, if \(Q_{ij} = 0\), then \(y_i\) and \(y_j\) are conditionally independent given all the other entries, and the converse is also true. This is what leads to the interpretation of these systems as Gaussian Markov Random Fields (GMRFs): the ‘Gaussian’ part is (hopefully) obvious, while the ‘MRF’ part arises by constructing a graphical model with nodes labelled from 1 to \(n\), interpreting each non-zero entry \(Q_{ij}\) as an edge between nodes \(i\) and \(j\) (so they are neighbours), and noting that, conditional on its neighbours, a node is independent of its non-neighbours: the Markov property.</p>
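<p>This identity is easy to verify numerically on a small example. In the sketch below (the \(3 \times 3\) precision matrix is made up for illustration), the conditional correlation computed from the covariance matrix via the usual Schur complement matches \(-Q_{12} / \sqrt{Q_{11} Q_{22}}\):</p>

```r
# Check cor(y_1, y_2 | y_3) = -Q_12 / sqrt(Q_11 Q_22) on a small example
Q <- matrix(c(
   2.0, -0.5,  0.3,
  -0.5,  1.5, -0.2,
   0.3, -0.2,  1.0
), 3, 3)
Sigma <- solve(Q)

# Conditional covariance of (y_1, y_2) given y_3, via the Schur complement:
S <- Sigma[1:2, 1:2] -
  Sigma[1:2, 3, drop = FALSE] %*%
    solve(Sigma[3, 3]) %*%
    Sigma[3, 1:2, drop = FALSE]
conditional_correlation <- S[1, 2] / sqrt(S[1, 1] * S[2, 2])

c(conditional_correlation, -Q[1, 2] / sqrt(Q[1, 1] * Q[2, 2]))  # equal
```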
<p>A really simple example of a model that can be cast this way is an AR(1) process, where \(y_t = \rho y_{t - 1} + \epsilon_t\), with \(\epsilon_t\) i.i.d. standard normal. For this model, the conditional correlations are non-zero only for adjacent entries (for interior entries they equal \(\rho / (1 + \rho^2)\)), and zero otherwise. The precision matrix is</p>
\[Q = \begin{pmatrix}
1 & -\rho & & & & \\
-\rho & 1 + \rho^2 & -\rho & & & \\
& -\rho & 1 + \rho^2 & -\rho & & \\
& & \ddots & \ddots & \ddots & \\
& & & -\rho & 1 + \rho^2 & -\rho \\
& & & & -\rho & 1
\end{pmatrix},\]
<p>which is very sparse for large \(n\). The covariance matrix, by contrast, is not at all sparse.</p>
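<p>For concreteness, this matrix is easy to construct as a sparse matrix in R. The sketch below uses <code class="language-plaintext highlighter-rouge">Matrix::bandSparse</code> (the helper name <code class="language-plaintext highlighter-rouge">ar1_precision</code> is mine; this construction does not appear elsewhere in the post):</p>

```r
library(Matrix)

# Build the AR(1) precision matrix above as a sparse symmetric matrix
ar1_precision <- function(n, rho) {
  bandSparse(
    n,
    k = c(0, 1),  # main diagonal and first super-diagonal
    diagonals = list(
      c(1, rep(1 + rho ^ 2, n - 2), 1),  # main diagonal
      rep(-rho, n - 1)                   # off-diagonals
    ),
    symmetric = TRUE
  )
}

Q <- ar1_precision(100, 0.5)
nnzero(Q)  # 298 non-zero entries out of 10000
```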
<p>So that’s the interpretation of \(Q\), and one reason why working in precision space is valuable. How to sample \(\tilde{y}\)? The usual way, the one I was taught by <a href="http://handbooks.uwa.edu.au/unitdetails?code=STAT4063">my PhD supervisor</a>, is to construct the Cholesky decomposition, \(Q = L L^T\), draw \(\tilde{z} \sim N(0, I_n)\), and then set \(\tilde{y} = \tilde{\mu} + L^{-T} \tilde{z}\). This works because, as <a href="https://en.wikipedia.org/wiki/Multivariate_normal_distribution">Wikipedia</a> tells us, an affine transformation \(\tilde{c} + B\tilde{x}\) with \(\tilde{x} \sim N(\tilde{a}, \Sigma)\) has distribution \(N(\tilde{c} + B\tilde{a}, B \Sigma B^T)\), and in our case this means that \(\tilde{y}\) has mean \(\tilde{\mu}\) and covariance</p>
\[L^{-T} I_n L^{-1} = L^{-T} L^{-1} = (LL^T)^{-1} = Q^{-1}\text{.}\]
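<p>For a small dense example, this sampling recipe is just a couple of lines. One wrinkle: R’s <code class="language-plaintext highlighter-rouge">chol</code> returns the upper-triangular factor \(R\) with \(Q = R^T R\), so \(L = R^T\) and \(L^{-T} \tilde{z}\) is a back-substitution with \(R\). A sketch (the random \(Q\) here is made up for illustration):</p>

```r
set.seed(42)
n <- 4
mu <- rep(0, n)
Q <- crossprod(matrix(rnorm(n * n), n, n)) + diag(n)  # a random SPD precision

R <- chol(Q)               # upper triangular, Q = t(R) %*% R
z <- rnorm(n)
y <- mu + backsolve(R, z)  # one draw from N(mu, solve(Q))

# Check the covariance identity: R^{-1} t(R^{-1}) = Q^{-1}
R_inv <- backsolve(R, diag(n))
max(abs(R_inv %*% t(R_inv) - solve(Q)))  # effectively zero
```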
<p>It turns out that there are Cholesky decomposition algorithms that are efficient for sparse matrices, but there is a catch. Consider the following sparse 100x100 precision matrix with just 442 non-zero entries:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">library</span><span class="p">(</span><span class="n">Matrix</span><span class="p">)</span><span class="w">
</span><span class="n">str</span><span class="p">(</span><span class="n">Q_100</span><span class="p">)</span><span class="w">
</span><span class="c1">## Formal class 'dsCMatrix' [package "Matrix"] with 7 slots</span><span class="w">
</span><span class="c1">## ..@ i : int [1:442] 0 1 0 2 3 4 3 5 5 6 ...</span><span class="w">
</span><span class="c1">## ..@ p : int [1:101] 0 1 2 4 5 6 8 10 11 14 ...</span><span class="w">
</span><span class="c1">## ..@ Dim : int [1:2] 100 100</span><span class="w">
</span><span class="c1">## ..@ Dimnames:List of 2</span><span class="w">
</span><span class="c1">## .. ..$ : NULL</span><span class="w">
</span><span class="c1">## .. ..$ : NULL</span><span class="w">
</span><span class="c1">## ..@ x : num [1:442] 8 8 -0.96 8 5 ...</span><span class="w">
</span><span class="c1">## ..@ uplo : chr "U"</span><span class="w">
</span><span class="c1">## ..@ factors : list()</span><span class="w">
</span></code></pre></figure>
<p>where the length of the <code class="language-plaintext highlighter-rouge">@i</code> entry gives the number of non-zero values. Here I am using the <a href="https://cran.r-project.org/web/packages/Matrix/index.html">Matrix package</a>, which is very well engineered and has tons of useful sparse matrix classes and functions. We can use <code class="language-plaintext highlighter-rouge">image</code> to visualise the sparsity pattern:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">image</span><span class="p">(</span><span class="n">Q_100</span><span class="p">)</span></code></pre></figure>
<div style="text-align: center">
<img src="/assets/2018-03-22-sparse-gmrf-Q.png" width="480" height="440" />
</div>
<p>The direct Cholesky decomposition of this matrix is</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">chol_Q_100</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">chol</span><span class="p">(</span><span class="n">Q_100</span><span class="p">))</span><span class="w">
</span><span class="n">str</span><span class="p">(</span><span class="n">chol_Q_100</span><span class="p">)</span><span class="w">
</span><span class="c1">## Formal class 'dtCMatrix' [package "Matrix"] with 7 slots</span><span class="w">
</span><span class="c1">## ..@ i : int [1:2403] 0 2 8 11 24 33 40 55 91 1 ...</span><span class="w">
</span><span class="c1">## ..@ p : int [1:101] 0 9 18 29 35 44 51 62 71 84 ...</span><span class="w">
</span><span class="c1">## ..@ Dim : int [1:2] 100 100</span><span class="w">
</span><span class="c1">## ..@ Dimnames:List of 2</span><span class="w">
</span><span class="c1">## .. ..$ : NULL</span><span class="w">
</span><span class="c1">## .. ..$ : NULL</span><span class="w">
</span><span class="c1">## ..@ x : num [1:2403] 2.828 -0.339 -0.339 -0.339 -0.339 ...</span><span class="w">
</span><span class="c1">## ..@ uplo : chr "L"</span><span class="w">
</span><span class="c1">## ..@ diag : chr "N"</span><span class="w">
</span><span class="n">image</span><span class="p">(</span><span class="n">chol_Q_100</span><span class="p">)</span></code></pre></figure>
<div style="text-align: center">
<img src="/assets/2018-03-22-sparse-gmrf-Q-chol.png" width="480" height="440" />
</div>
<p>which has 2403 non-zero entries, more than five times as many as the original matrix. In general there is no guarantee that the Cholesky decomposition of a sparse matrix will itself be particularly sparse.</p>
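<p>This fill-in is easy to reproduce on a small self-contained example (a sketch: the random matrix below is a stand-in, since the construction of <code class="language-plaintext highlighter-rouge">Q_100</code> is not shown in this post):</p>

```r
library(Matrix)
set.seed(1)

# A sparse symmetric positive-definite matrix...
B <- rsparsematrix(100, 100, density = 0.02)
Q <- crossprod(B) + Diagonal(100)

# ...whose unpermuted Cholesky factor is typically much denser than Q itself
R <- chol(Q)  # upper-triangular factor, no fill-reducing permutation
c(Q = nnzero(Q), R = nnzero(R))
```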
<p>However, all is not lost. If one permutes the indices of \(\tilde{y}\), the precision matrix of the permuted vector is just \(Q\) with rows and columns permuted the same way. It turns out that this can often be done in such a way that the Cholesky decomposition of the permuted precision matrix is much sparser than that of the original matrix. Algorithms that find these permutations are called <a href="https://en.wikipedia.org/wiki/Minimum_degree_algorithm">minimum degree algorithms</a>. Finding an optimal permutation is NP-hard in general, but fast approximate algorithms exist, work well, and are available in the Matrix package:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">chol_Q_100_permuted</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">Cholesky</span><span class="p">(</span><span class="n">Q_100</span><span class="p">,</span><span class="w"> </span><span class="n">LDL</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w"> </span><span class="n">perm</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span><span class="n">str</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">)</span><span class="w">
</span><span class="c1">## Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots</span><span class="w">
</span><span class="c1">## ..@ x : num [1:932] 1.732 -0.268 -0.339 -0.268 2.22 ...</span><span class="w">
</span><span class="c1">## ..@ p : int [1:101] 0 4 9 15 21 29 38 46 54 62 ...</span><span class="w">
</span><span class="c1">## ..@ i : int [1:932] 0 1 5 6 1 2 5 6 7 2 ...</span><span class="w">
</span><span class="c1">## ..@ nz : int [1:100] 4 5 6 6 8 9 8 8 8 6 ...</span><span class="w">
</span><span class="c1">## ..@ nxt : int [1:102] 1 2 3 4 5 6 7 8 9 10 ...</span><span class="w">
</span><span class="c1">## ..@ prv : int [1:102] 101 0 1 2 3 4 5 6 7 8 ...</span><span class="w">
</span><span class="c1">## ..@ colcount: int [1:100] 4 5 6 6 8 9 8 8 8 6 ...</span><span class="w">
</span><span class="c1">## ..@ perm : int [1:100] 78 97 62 53 85 51 83 43 25 52 ...</span><span class="w">
</span><span class="c1">## ..@ type : int [1:4] 2 1 0 1</span><span class="w">
</span><span class="c1">## ..@ Dim : int [1:2] 100 100</span><span class="w">
</span><span class="n">P</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="s1">'pMatrix'</span><span class="p">)</span><span class="w">
</span><span class="n">image</span><span class="p">(</span><span class="n">P</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">Q_100</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">P</span><span class="p">),</span><span class="w"> </span><span class="n">main</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Q (permuted)'</span><span class="p">)</span><span class="w">
</span><span class="n">image</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">main</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Cholesky'</span><span class="p">)</span></code></pre></figure>
<div style="text-align: center">
<img src="/assets/2018-03-22-sparse-gmrf-Q-permuted.png" width="740" height="370" />
</div>
<p>The Cholesky of the permuted system is only twice as dense as the precision matrix, with 932 non-zero entries versus 442 in \(Q\). Mathematically, this permuted decomposition can be written as</p>
\[Q = P^T L L^T P\text{,}\]
<p>where \(P\) is a permutation matrix (for which, handily, \(P^{-1} = P^T\)). A simple rearrangement gives \( L L^T = P Q P^T \), showing that \(L\) factorises the permuted \(Q\). In the implementation in the <code class="language-plaintext highlighter-rouge">Cholesky</code> function in the Matrix package, the matrix \(P\) is found using heuristics in the <a href="http://faculty.cse.tamu.edu/davis/suitesparse.html">CHOLMOD</a> library, which seem to do a good job most of the time. Now, finally, returning to the problem of sampling using a sparse precision matrix, we can again draw \(\tilde{z} \sim N(0, I_n)\), and then set \(\tilde{y} = \tilde{\mu} + P^T L^{-T} \tilde{z}\), which works because the resulting samples have the covariance matrix</p>
\[P^T L^{-T} I_n L^{-1} P = P^T L^{-T} L^{-1} P = (P^T LL^T P)^{-1} = Q^{-1}\text{,}\]
<p>exactly as desired. In R this can be implemented (assuming \(\tilde{\mu} = 0\)) as</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">z</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rnorm</span><span class="p">(</span><span class="n">nrow</span><span class="p">(</span><span class="n">Q_100</span><span class="p">))</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as.vector</span><span class="p">(</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w">
</span><span class="n">z</span><span class="p">,</span><span class="w">
</span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Lt'</span><span class="w">
</span><span class="p">),</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Pt'</span><span class="p">))</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="w">
</span><span class="c1">## [1] -0.010436811 -0.104921003 0.416806878 0.014558426 -0.325958512 0.421416694</span><span class="w">
</span><span class="c1">## [7] 0.017556657 0.294846807 0.342143599 0.584348518 -0.745948125 0.502591827</span><span class="w">
</span><span class="c1">## [13] -0.211289349 -0.530267664 -0.492578588 0.255440512 -0.033373118 0.543754332</span><span class="w">
</span><span class="c1">## [19] -0.359336565 -0.244953719 -0.402822998 0.081855516 0.253129386 0.205448992</span><span class="w">
</span><span class="c1">## [25] 0.429277080 -0.570717950 -0.355061101 -0.367764418 0.547808516 0.006163957</span><span class="w">
</span><span class="c1">## [31] 0.547535317 0.263772641 0.081585252 0.358314917 -0.684981518 0.349907450</span><span class="w">
</span><span class="c1">## [37] 0.118787977 0.736998466 0.291061633 1.014721231 -0.654090497 0.076018863</span><span class="w">
</span><span class="c1">## [43] -0.242907011 -0.535462456 -0.604620123 0.067914043 0.794672200 0.012960978</span><span class="w">
</span><span class="c1">## [49] 0.760320360 -0.624262194 -0.009172130 0.125357591 0.708511268 -0.256838400</span><span class="w">
</span><span class="c1">## [55] 1.230920479 -0.025501688 -0.282647795 -0.516675265 -0.156191890 -0.030417522</span><span class="w">
</span><span class="c1">## [61] -0.778278611 -0.625331836 0.452920865 0.131189388 -0.380328115 0.390079796</span><span class="w">
</span><span class="c1">## [67] -0.076916683 -1.042158717 0.243373908 -0.364218763 0.440914464 -0.099416308</span><span class="w">
</span><span class="c1">## [73] 0.353931288 -0.197764896 0.289573501 -0.340746684 0.126392280 0.645720329</span><span class="w">
</span><span class="c1">## [79] 0.307557118 0.135445659 0.358892563 0.275572959 0.221368375 0.800978241</span><span class="w">
</span><span class="c1">## [85] -0.306959557 0.111877324 -0.245831320 0.281856754 -0.183687867 -0.132821530</span><span class="w">
</span><span class="c1">## [91] -0.241247800 0.068393000 -0.089732671 -0.191843241 -0.313567706 -0.186392786</span><span class="w">
</span><span class="c1">## [97] 0.656691494 -0.083198611 -0.160093445 0.377602280</span><span class="w">
</span></code></pre></figure>
<p>and the sample can be visualised as</p>
<div style="text-align: center">
<img src="/assets/2018-03-22-sparse-gmrf-Q-sample.png" width="480" height="345" />
</div>
<p>which shows a fair amount of correlated structure (at least it does to me).</p>
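<p>Putting the pieces together, the whole procedure fits in a short self-contained function (a sketch: the name <code class="language-plaintext highlighter-rouge">rgmrf_sample</code> is mine, and the AR(1) precision matrix is just a convenient test case):</p>

```r
library(Matrix)

# One sample from N(mu, solve(Q)) for sparse Q, via the permuted Cholesky:
# y = mu + t(P) %*% solve(t(L), z)
rgmrf_sample <- function(mu, Q) {
  ch <- Cholesky(Q, LDL = FALSE, perm = TRUE)
  z <- rnorm(nrow(Q))
  mu + as.vector(solve(ch, solve(ch, z, system = 'Lt'), system = 'Pt'))
}

# Test case: the AR(1) precision matrix from earlier in the post, n = 300
n <- 300
rho <- 0.9
Q <- bandSparse(
  n,
  k = c(0, 1),
  diagonals = list(c(1, rep(1 + rho ^ 2, n - 2), 1), rep(-rho, n - 1)),
  symmetric = TRUE
)
set.seed(1)
y <- rgmrf_sample(rep(0, n), Q)
```

<p>With \(\rho = 0.9\), successive elements of <code class="language-plaintext highlighter-rouge">y</code> should be strongly correlated, just as in the sample plotted above.</p>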
<p>As a programming aside, the <code class="language-plaintext highlighter-rouge">chol_Q_100_permuted</code> object produced by the <code class="language-plaintext highlighter-rouge">Cholesky</code> function is an S4 object of class <code class="language-plaintext highlighter-rouge">CHMfactor</code> that contains both the permutation matrix \(P\) and the decomposition \(L\). You can extract these like so:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">P</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="s1">'pMatrix'</span><span class="p">)</span><span class="w">
</span><span class="n">L</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="s1">'Matrix'</span><span class="p">)</span></code></pre></figure>
<p>and manipulate them directly, but it’s generally more efficient to use the <code class="language-plaintext highlighter-rouge">solve</code> method associated with the <code class="language-plaintext highlighter-rouge">CHMfactor</code> class:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># Calculates solve(Q_100, b), the solution to the original matrix system:</span><span class="w">
</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'A'</span><span class="p">)</span><span class="w">
</span><span class="c1"># Calculates L %*% b:</span><span class="w">
</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'L'</span><span class="p">)</span><span class="w">
</span><span class="c1"># Calculates t(L) %*% b:</span><span class="w">
</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Lt'</span><span class="p">)</span><span class="w">
</span><span class="c1"># Calculates P %*% b:</span><span class="w">
</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'P'</span><span class="p">)</span><span class="w">
</span><span class="c1"># Calculates t(P) %*% b:</span><span class="w">
</span><span class="n">solve</span><span class="p">(</span><span class="n">chol_Q_100_permuted</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">system</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Pt'</span><span class="p">)</span></code></pre></figure>
<p>These use fast CHOLMOD routines that are generally faster than extracting the raw internals, and as a bonus they avoid the extra copying associated with that extraction.</p>Over the past few months I’ve been involved in a fun project with Andrew Zammit Mangion and Noel Cressie at the University of Wollongong. This project involves inference over a large spatial field using a model with a latent space distributed as a multivariate Gaussian with a large and sparse precision matrix (it also involves me learning a lot from Andrew and Noel!). This is my first time working with sparse precision matrices, so I’ve been discovering many new things: what working in precision-space rather than covariance-space means, and how to draw samples from such models even when the number of data points is large. In this post I share a little of what I’ve learned, along with R code. A lot of what follows is derived from the excellent book on this topic by Rue and Held.New R package on Github - climatedata2017-07-06T00:00:00+00:002017-07-06T00:00:00+00:00/2017/07/06/climatedata<p>I’ve recently put an R package on Github, <a href="https://github.com/mbertolacci/climatedata/">climatedata</a>, that I’ve put together
as part of my PhD work. At present, the idea is to help the package user
download climate index data, either directly or as calculated from source data.</p>
<!--more-->
<p>At this point, there are three indices available to download:</p>
<ul>
<li>the Indian Ocean Dipole, via the Dipole Mode Index provided by <a href="http://www.jamstec.go.jp/frsgc/research/d1/iod/iod/dipole_mode_index.html">JAMSTEC</a>;</li>
<li>the Southern Oscillation Index, as provided by the <a href="http://www.bom.gov.au/climate/current/soi2.shtml">Australian Bureau of Meteorology</a>; and</li>
<li>the Southern Annular Mode, either the <a href="https://legacy.bas.ac.uk/met/gjma/sam.html">Marshall index</a>, or calculated from <a href="http://www.metoffice.gov.uk/hadobs/hadslp2/">HadSLP2</a>.</li>
</ul>
<p>I expect to add more functionality in dribs and drabs as time passes. The package is not yet submitted to CRAN, but it can be installed from Github using devtools:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>devtools::install_github('mbertolacci/climatedata')
</code></pre></div></div>
<p>Feel free to reach out to me if this package is useful to you or you have anything you’d like to add to it—or, even better, send a pull request!</p>

Attending NIPS 2016 (2016-11-29)

<p><img src="/assets/rainfall-component-probabilities.png" /></p>
<p>This is just a quick note to say I’ll be at <a href="https://nips.cc/">NIPS</a> in Barcelona this year presenting a poster on ‘Bayesian mixture models for multivariate time series with an application to Australian rainfall data’ as part of the <a href="https://sites.google.com/site/nipsts2016/">NIPS Time Series Workshop</a>.</p>
<!--more-->
<p>I’m super excited! Here’s a little preview of the poster:</p>
<p><img src="/assets/poster.png" /></p>