OpenEnergyMonitor

The current emoncms feed storage implementation

Following on from the last blog post on emoncms.org server load and disk use, let's look at the current emoncms implementation of feed storage in a bit more depth before going on to look at how it can be improved.

Emoncms currently stores realtime feed data in a MySQL database, and every feed has its own MySQL table. A feed table contains two fields: a timestamp and a data value. Feed data usually arrives at a regular time interval, e.g. 5, 10 or 60s data. The time interval is set by the posting sensor node rather than by emoncms.

Calculating feed disk use
We can estimate the size of a feed table under the current emoncms implementation.

Let's say we want to store a year of 10s data. There are 31536000 seconds in a year and so 3153600 datapoints at a 10s data rate.

A single datapoint is made up of a timestamp which is stored as an unsigned integer, which takes up 4 bytes, and a float data value which also takes up 4 bytes.

3153600 datapoints x 8 bytes per datapoint (table row) = 25228800 bytes =~ 24 MB

In addition to the feed data we also have a table index which speeds up queries considerably. The worst case index size can be estimated with the equation detailed on this page:

http://dev.mysql.com/doc/refman/5.0/en/key-space.html

index row size = (key_length+4) / 0.67

The key we are using is the time field, which is 4 bytes, and so the index row size is (4 + 4) / 0.67 =~ 12 bytes

The index size for 3153600 datapoints is therefore approximately 3153600 * (4 + 4) / 0.67 =~ 36 MB

The total feed table size (data plus index) will therefore be approximately 24 MB + 36 MB = 60 MB.
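
The arithmetic above can be reproduced for other data rates with a short script. This is only a sketch of the estimate described in this post, not code from emoncms: the function name feed_table_size and its parameters are made up for illustration, and it assumes 4-byte timestamps, 4-byte float values and the worst-case index formula (key_length + 4) / 0.67 quoted above. Sizes are printed in units of 1024 x 1024 bytes, which is what gives the 24, 36 and 60 MB figures for 10s data.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds


def feed_table_size(interval_s, duration_s=SECONDS_PER_YEAR,
                    time_bytes=4, value_bytes=4,
                    pointer_bytes=4, fill_factor=0.67):
    """Estimate the size of one feed table (hypothetical helper).

    Returns (datapoints, data_bytes, index_bytes) using:
      row size       = time_bytes + value_bytes
      index row size = (key_length + 4) / 0.67, with key_length = time_bytes
    """
    datapoints = duration_s // interval_s
    data_bytes = datapoints * (time_bytes + value_bytes)
    index_bytes = datapoints * (time_bytes + pointer_bytes) / fill_factor
    return datapoints, data_bytes, index_bytes


if __name__ == "__main__":
    mb = 1024 * 1024
    # One year of data at the posting intervals mentioned earlier.
    for interval in (5, 10, 60):
        n, data, index = feed_table_size(interval)
        print(f"{interval:>2}s data: {n} datapoints, "
              f"data {data / mb:.0f} MB, index {index / mb:.0f} MB, "
              f"total {(data + index) / mb:.0f} MB")
```

Because every term scales with the number of datapoints, halving the posting interval doubles the estimated table size.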

Feed query speeds
As emoncms has developed, a fair bit of work has gone into improving the way realtime data is queried. At first the improvements seemed promising; see this documentation page for a detailed discussion of the query implementation and query speeds:


But growing server demand on emoncms.org and growing feed table sizes mean that these improvements have often only staved off an eventual slowdown.

The last idea I had was to use a PHP for-loop to request a single row at given intervals, which originally reduced query times by about 10x. I think this is no longer working well on emoncms.org: it still gives the 1.6s query time on my local installation of emoncms, but on emoncms.org I'm getting a mixture of short query times (around 500ms) and much longer query times (20s+) for the more-than-55-hour time window. I think the reason is that, when the server is under heavy load, the PHP for-loop has to wait for other MySQL queries to complete. I think another solution is needed.
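
To make the for-loop approach concrete, here is a rough sketch of the idea. It is not the emoncms code: the real implementation is PHP against MySQL, whereas this uses Python with an in-memory SQLite database as a stand-in so it can be run anywhere. The table name feed_1, the column names time and data, and the helpers make_feed and sample_feed are all assumptions made for illustration.

```python
import sqlite3


def make_feed(conn, start, end, interval):
    """Create a stand-in feed table and fill it with regular-interval data."""
    conn.execute("CREATE TABLE feed_1 (time INTEGER PRIMARY KEY, data REAL)")
    conn.executemany(
        "INSERT INTO feed_1 (time, data) VALUES (?, ?)",
        ((t, float(t % 1000)) for t in range(start, end, interval)),
    )


def sample_feed(conn, start, end, npoints):
    """Fetch roughly npoints rows by asking for one row per step.

    Each pass through the loop is a separate, small indexed query, so the
    total time depends on how quickly the server answers every one of them.
    """
    step = max(1, (end - start) // npoints)
    rows = []
    for t in range(start, end, step):
        row = conn.execute(
            "SELECT time, data FROM feed_1 "
            "WHERE time >= ? AND time < ? ORDER BY time LIMIT 1",
            (t, t + step),
        ).fetchone()
        if row is not None:  # skip windows with no data
            rows.append(row)
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_feed(conn, 0, 36000, 10)            # 10 hours of 10s data
    points = sample_feed(conn, 0, 36000, 100)
    print(len(points), "rows returned for the graph")
```

On an idle server each small query returns quickly, but the loop issues one query per plotted point, so any queuing behind other queries is multiplied by the number of points, which would be consistent with the mixed query times described above.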

In the next few blog posts I will look at some of the potential solutions to both disk use and query speeds.