Last week I helped migrate a small Riak production cluster from 0.14.2 to 1.2. I’d like to share my experience with preparing and testing the process.
Upgrading straight from 0.14.2 to 1.2 is possible if you are willing to accept some additional risk. And, as always, you should be very cautious about your steps. When in doubt, ask on the mailing list, or take the detour via 1.1.4.
Basho does not recommend upgrading straight from 0.14.2 to 1.2, and instead suggests taking a detour via 1.1.4 first while following all steps of the rolling upgrade guide. According to the Basho guys (see postings on the riak-users mailing list), they didn’t test this upgrade path, but they also left some compatibility code around, so it should actually work. The main issue is that
riak-admin transfers isn’t going to work, because there were changes to the underlying cluster system in 1.0 and again in 1.2. The hint from the upgrade page to run
riak_core_node_watcher:service_up(riak_pipe, self()). on pre-1.0 nodes isn’t going to help either.
Please make absolutely sure that you test the upgrade process with your specific settings. And if you are in any doubt, do the upgrade as Basho recommends: 0.14.2 to 1.1.4 to 1.2.
The Riak cluster in question was a rather small one. Only basic read/write operations are used, no search, no map/reduce. The current workload on the cluster was also quite moderate. I thought that under these circumstances I should give the straight upgrade process a try, especially because the capability negotiation in Riak 1.2 should make the process a lot simpler.
The cluster is running on EC2 under Ubuntu 11.04 (natty). A staging cluster with the same configuration also runs on EC2, which made it pretty simple to test things first. While figuring out the steps and gaining confidence that everything would work out, I upgraded and downgraded the staging cluster a couple of times without any problems. Backing up the ring files (by default located at
/var/run/riak/ring) is a good idea in case you want to be able to downgrade. Since we were only using bitcask and the on-disk format of bitcask did not change, we skipped creating additional backups for the staging cluster.
The following steps can be used against your staging as well as your production cluster. But, as I already said, make sure you have tested everything a couple of times.
Let’s get started! First, pick the correct version of Riak from Basho’s download page. In our case this is the Ubuntu natty version of 1.2. Make sure that you create a backup of your ring data as well as of the current configuration. The backup of the ring is important for downgrading the node if necessary.
Stop Riak (
riak stop) before creating a backup of your ring and configuration files:
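A minimal sketch of such a backup, under a few assumptions: the function name and its three arguments are my own placeholders (nothing Riak ships), and you pass in whatever ring and config directories your installation actually uses.

```shell
# Archive the ring directory and the config directory of a stopped node.
# Arguments (all assumptions about your layout): ring dir, config dir,
# destination dir for the archives.
backup_riak_state() {
    ring_dir="$1"
    conf_dir="$2"
    dest_dir="$3"
    stamp="$(date +%Y%m%d-%H%M%S)"

    mkdir -p "$dest_dir" || return 1
    # -C keeps the paths inside the archive relative to the parent directory.
    tar -czf "$dest_dir/riak-ring-$stamp.tar.gz" \
        -C "$(dirname "$ring_dir")" "$(basename "$ring_dir")" || return 1
    tar -czf "$dest_dir/riak-etc-$stamp.tar.gz" \
        -C "$(dirname "$conf_dir")" "$(basename "$conf_dir")" || return 1
    # Print the archive paths so the caller can verify or copy them.
    echo "$dest_dir/riak-ring-$stamp.tar.gz"
    echo "$dest_dir/riak-etc-$stamp.tar.gz"
}
```

With the node stopped, a call could look like `backup_riak_state /var/run/riak/ring /etc/riak /root/riak-backup`, where `/etc/riak` and `/root/riak-backup` are assumed locations that you should adjust to your setup.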
Now install Riak 1.2:
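On Ubuntu this boils down to a `dpkg -i` on the downloaded package. The wrapper below is only a sketch: the function name and the `DRY_RUN` switch are my own additions, and the package path is whatever file you downloaded.

```shell
# Install a downloaded Riak .deb package.
# Set DRY_RUN=1 to print the command instead of executing it.
install_riak_package() {
    pkg="$1"
    if [ ! -f "$pkg" ]; then
        echo "package file not found: $pkg" >&2
        return 1
    fi
    # dpkg will ask what to do with config files that were modified locally.
    ${DRY_RUN:+echo} sudo dpkg -i "$pkg"
}
```

Running it with `DRY_RUN=1` first shows exactly what would be executed before you touch a production node.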
In our case, I chose to overwrite the config files with the package maintainer's latest version. Since the configuration of the staging and production systems is almost identical, I just copied a prepared version and changed the node name in
vm.args and the cluster name in
app.config. We also had to increase the
nofiles limit for the
riak user, since we activated the LevelDB backend which requires more available file handles.
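To verify that a raised limit is actually in effect, a small check like the following can be run in a shell of the user that starts Riak. It is a sketch: the function name is mine, and the value in the commented `limits.conf` lines is purely illustrative, so pick a limit that fits your LevelDB settings.

```shell
# Succeed if the current shell's open-file limit is at least $1.
check_nofile_limit() {
    wanted="$1"
    current="$(ulimit -n)"
    if [ "$current" = "unlimited" ]; then
        return 0
    fi
    if [ "$current" -ge "$wanted" ]; then
        return 0
    fi
    echo "open file limit is $current, want at least $wanted" >&2
    return 1
}

# Example /etc/security/limits.conf entries for the riak user
# (4096 is an illustrative value, not a figure from this article):
#   riak soft nofile 4096
#   riak hard nofile 4096
```

For example, `check_nofile_limit 4096` exits non-zero and prints the current limit if it is too low.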
Now it is time to start Riak again (
riak start) and check if the node is coming up properly. We tried
riak-admin wait-for-service riak_kv riak@HOST (as suggested by the rolling upgrade wiki page). For some reason it kept reporting that
riak_kv is not up due to
riak_pipe not being ready, but
riak_kv seemed to be operational nonetheless.
riak_pipe was introduced in Riak 1.0 and therefore disabled by the legacy mode of the cluster. This leads to the inaccurate statement that
riak_kv is not up.
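Since `wait-for-service` is misleading while the cluster is in legacy mode, one alternative is to poll a simpler liveness check until it succeeds. The helper below is a generic sketch (the function name and arguments are mine); it knows nothing about Riak and simply retries whatever command you hand it.

```shell
# Run a command repeatedly until it succeeds or the attempts are used up.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

Possible uses are `retry_until 30 2 riak ping` or, assuming the HTTP interface listens on the default port, `retry_until 30 2 curl -fsS http://127.0.0.1:8098/ping`. Both only tell you that the node responds, so a real read and write against the node (for example via `riak-admin test`) is still worth doing afterwards.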
You can check whether your newly upgraded node is using the correct capabilities by attaching to the Riak console (
riak attach) and running
riak_core_capability:all(). The output lists each capability together with the value the cluster has negotiated for it.
When you are done, hit
CTRL+D to detach from the console again.
riak-admin transfers does not work due to the cluster system changes between 0.14.2 and 1.2, but
riak-admin member-status and
riak-admin ring-status should work.
member-status should mark your new node as
valid and the old ones as
member-status information is your friend while running the upgrade process.
ring-status should confirm that you are running the cluster in legacy mode.
We were a bit confused about why the new Riak node was running compaction (bitcask merging). It turned out that this was normal: we had never hit the compaction triggers before, and the restart of Riak kicked it off. We decided to let the node finish this slightly IO-heavy operation before continuing with the next node.
After we upgraded the last node, we noticed that
riak-admin wait-for-service riak_kv riak@HOST was working as expected (eventually Riak noticed that the legacy mode was no longer required).
riak-admin member-status showed that all cluster members were valid, and
riak-admin transfers started to work again and reported some hinted handoff transfers.
We did not encounter any critical issues, and according to our application logs all operations against the cluster succeeded while the upgrade was in progress.
During the upgrade process we noticed that calls to the
/stats endpoint were failing (the monitoring system was trying to collect stats). I suspected that this was also due to the legacy mode of the system and
riak_pipe being disabled (@tsantero confirmed that, thanks).
While running riaknostic via
riak-admin diag on the upgraded Riak nodes I got:
The following preflists do not satisfy the n_val. After a bit of googling and asking on the IRC channel I found the answer: it means that some preflists place more than one replica on the same physical node. In our case this is most likely because of the small node count and a less solid ring claim algorithm in Riak 0.14.2.
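To make that condition concrete: a preflist satisfies the n_val only if its partitions are owned by at least n_val distinct nodes. The snippet below illustrates just that idea; it is not how riaknostic implements the check, and the function and node names are made up.

```shell
# Succeed if the given preflist owners cover at least N_VAL distinct nodes.
# Usage: preflist_satisfies_n_val N_VAL NODE [NODE...]
preflist_satisfies_n_val() {
    n_val="$1"
    shift
    # An empty preflist can never satisfy any n_val.
    [ "$#" -gt 0 ] || return 1
    distinct="$(printf '%s\n' "$@" | sort -u | wc -l)"
    [ "$distinct" -ge "$n_val" ]
}
```

With an n_val of 3, `preflist_satisfies_n_val 3 riak@a riak@b riak@c` succeeds, while `preflist_satisfies_n_val 3 riak@a riak@b riak@a` fails because two of the three replicas would end up on `riak@a`.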
@evanmcc noted on IRC that we need to be a bit careful when adding another node to our cluster. It turns out that, when you have a small cluster (<5 nodes) and 256 partitions, Riak does not forcefully apply the new ring claim algorithm, because this would be too expensive. So if you have a small cluster and you upgrade to 1.2, you have to keep in mind that adding a new node will cause a more expensive one-time rebalancing of your partitions. If you stage the node join,
riak-admin cluster plan will show you what is going to happen.
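A sketch of that staged workflow, assuming it is run on the new node and given the name of any existing member. The function name, the `DRY_RUN` switch and the node name in the example are my own placeholders; the `riak-admin cluster` subcommands are the ones that ship with 1.2.

```shell
# Stage a join and print the resulting plan without committing it.
# Run on the NEW node; $1 is any existing cluster member.
# Set DRY_RUN=1 to print the commands instead of running them.
stage_join_and_plan() {
    existing_member="$1"
    ${DRY_RUN:+echo} riak-admin cluster join "$existing_member" || return 1
    ${DRY_RUN:+echo} riak-admin cluster plan
}
```

For example, `stage_join_and_plan riak@10.0.0.1` stages the join and prints the planned ownership changes. Nothing moves until you run `riak-admin cluster commit`, and `riak-admin cluster clear` discards the staged changes if the plan looks too expensive for the moment.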
All in all the upgrade went pretty smoothly. With some testing up front and the very supportive help of Basho, the upgrade wasn’t a big problem. Thanks for that, Basho, you rock!