Reducing emoncms write load and a minimal python version of emoncms

Over the last few weeks I have been investigating how to reduce the write load of emoncms to explore if it might be possible to achieve long term logging on SD cards.

A brief summary

Through a mix of buffered writing of feed data to disk, reduction of the number of meta and average data files written to and use of the FAT filesystem, my home raspberrypi now has a write load of 0.4kb_wrtn/s down from 197kb_wrtn/s.. almost 500x less and the writes are all append only (there's no regular writing of the same part of a file again and again). The question is could this mean years of SD card logging rather than months?

Write buffering

A single Emoncms PHPFina (PHP Fixed Interval No averaging) or PHPTimeSeries datapoint uses between 4 and 9 bytes. The write load on the disk however is a bit more complicated than that. Most filesystems and disk's have a minimum IO size that is much larger than 4-9 bytes, on a FAT filesystem the minimum IO size is 512 bytes this means that if you try and write 4 bytes the operation will actually cause 512 bytes of write load. But its not just the datafile that gets written to, every file has inode meta data which can also result in a further 512 bytes of write load. A single 4 byte write can therefore cause 1kb of write load.

By buffering writes for as long as we can in memory and then writing in larger blocks its possible to reduce the write load significantly. The full investigation with a lot of benchmarking of different configurations including differences between FAT and Ext4 can be found here:

https://github.com/openenergymonitor/documentation/blob/master/BuildingBlocks/TimeSeries/writeloadinvestigation.md

A minimal version of emoncms in python

To make the investigation easier I simplified emoncms down to the core parts needed on a raspberrypi to get basic timeseries graphing: serial listener, node decoder, basic feed engine and a ui consisting of a nodelist and a single rawdata visualisation type and wrote the result in python making use of Jerome and Paul's (pb66) excellent work on the emonhub serial listener and node decoder.
Here's a diagram of the main components of the python app:


And some screenshots of what it looks like:

Node list:



Basic graphing:





The source code and an installation guide for this minimal python version of emoncms can be found here:

https://github.com/emoncms/development/tree/master/experimental/emon-py

Further development questions (copied from the end of the write load investigation page)

Is the reduced write load and longer SD card lifespan that might result from using the FAT filesystem worth the increased chance of data corruption from power failure that Ext4 might prevent?

It would be interesting to compare the performance of the FAT filesystem + 5 min application based commit time with the EXT4 filesystem with Journaling turned off and filesystem delayed allocation set to 5 min instead of write buffering in the application.


I posted on the EmonHub issues list about this last friday and I've been having a bit of a discussion there with Paul https://github.com/emonhub/emonhub/issues/48

I've also posted this on the forum's here:
http://openenergymonitor.org/emon/node/5387 To engage in discussion regarding this post, please post on our Community Forum.