Timestore timeseries database

The first and most developed solution to both the query speed problem and disk space problem is timestore.

http://www.mike-stirling.com/redmine/projects/timestore

Timestore is a lightweight time-series database developed by Mike Stirling. It uses a NoSQL approach to store an arbitrary number of time points without an index.

Query speeds
Timestore is fast. Here are the figures given by Mike Stirling on the documentation page:

From the resulting data set containing 1M points, spanning about 1 year at 30-second intervals:

Retrieve 100 points from the first hour: 2.6 ms
Retrieve 1000 points from the first hour (duplicates inserted automatically): 6.2 ms
Retrieve 100 points over the entire dataset (about a year's worth): 2.5 ms
Retrieve 1000 points over the entire dataset: 7.0 ms

Disk use

Timestore uses an 8-byte double as its default data type. The current emoncms MySQL database stores data values as 4-byte floats. It's easy to change the data type in timestore, so for a fair comparison we can change the default data type to a 4-byte float. For one year of data at 10-second intervals this gives:

Layer 1: 10s layer = 3153600 datapoints x 4 bytes = 12614400 bytes
Layer 2: 60 layer1 datapoints averaged = 52560 datapoints x 4 bytes = 210240 bytes
Layer 3: 10 layer2 datapoints averaged = 5256 datapoints x 4 bytes = 21024 bytes
Layer 4: 6 layer3 datapoints averaged = 876 datapoints x 4 bytes = 3504 bytes
Layer 5: 6 layer4 datapoints averaged = 146 datapoints x 4 bytes = 584 bytes
Layer 6: 4 layer5 datapoints averaged = 36 datapoints x 4 bytes = 144 bytes
Layer 7: 7 layer6 datapoints averaged = 5 datapoints x 4 bytes = 20 bytes

total size = 12849916 bytes or 12.25 MB
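
The layer arithmetic above can be reproduced with a short script. This is only a minimal sketch: it assumes a 365-day year, and it assumes that an incomplete averaging window at the end of a layer is simply dropped (integer division), which is what the datapoint counts in the list suggest. The constant and function names are my own, not anything from timestore.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
BASE_INTERVAL = 10                 # seconds between layer 1 datapoints
BYTES_PER_POINT = 4                # 4-byte float
DECIMATION = [60, 10, 6, 6, 4, 7]  # layer N averages this many layer N-1 datapoints


def layer_sizes():
    """Return (datapoints, bytes) for each layer, full resolution first."""
    points = SECONDS_PER_YEAR // BASE_INTERVAL
    layers = [(points, points * BYTES_PER_POINT)]
    for factor in DECIMATION:
        points //= factor  # assumption: incomplete averaging windows are dropped
        layers.append((points, points * BYTES_PER_POINT))
    return layers


layers = layer_sizes()
for number, (points, size) in enumerate(layers, start=1):
    print(f"Layer {number}: {points} datapoints x {BYTES_PER_POINT} bytes = {size} bytes")

total = sum(size for _, size in layers)
downsampled = total - layers[0][1]  # everything except the full-resolution layer
print(f"total size = {total} bytes ({total / 2**20:.2f} MB)")
print(f"downsampled layers = {downsampled} bytes ({100 * downsampled / layers[0][1]:.1f}% of layer 1)")
```

Changing BYTES_PER_POINT to 8 gives the figures for timestore's default double data type.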

The current emoncms data storage implementation uses 60 MB to hold the same data, because it saves both a timestamp and an associated index alongside each value. Timestore therefore has the potential to reduce disk use by about 80% for realtime data feeds.

Interestingly, all the downsampled layers created by timestore only come to about 0.2 MB. Before doing the calculation above I thought that adding the downsampled layers would add significantly to the disk space problem, but evidently they make a very small contribution compared with the full-resolution data layer.
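
The downsampled layers are also a plausible reason why the query times listed earlier barely change between one hour and a whole year: a query can be answered from whichever layer is closest to the resolution asked for. The sketch below illustrates the idea with the 10-second layer structure from the disk use calculation. The selection rule is a guess for illustration only and is not taken from the timestore source; choose_layer and points_scanned are hypothetical names.

```python
BASE_INTERVAL = 10                 # seconds between layer 1 datapoints
DECIMATION = [60, 10, 6, 6, 4, 7]  # averaging factor for layers 2 to 7


def layer_intervals():
    """Seconds between datapoints in each layer, finest first."""
    intervals = [BASE_INTERVAL]
    for factor in DECIMATION:
        intervals.append(intervals[-1] * factor)
    return intervals


def choose_layer(span_seconds, npoints):
    """Pick the coarsest layer whose spacing is no wider than span / npoints.

    Hypothetical rule: falls back to layer 1 when even the full-resolution
    data is too sparse (the "duplicates inserted" case).
    """
    wanted = span_seconds / npoints
    chosen = 0
    for index, interval in enumerate(layer_intervals()):
        if interval <= wanted:
            chosen = index
    return chosen + 1  # layers are numbered from 1 in the text


def points_scanned(span_seconds, npoints):
    """Stored datapoints inside the span in the layer chosen for the query."""
    interval = layer_intervals()[choose_layer(span_seconds, npoints) - 1]
    return span_seconds // interval
```

With this rule, 100 points over one hour reads 360 datapoints from layer 1, while 100 points over a full year reads 146 datapoints from layer 5, so the work per query stays small whatever the time span.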

Emoncms timestore development branch

I have made a start on integrating timestore into emoncms. There's still a lot to do to make it fully functional, but it works as a demo for now. Here's how to get it set up:

1) Download, make and start timestore

$ git clone http://mikestirling.co.uk/git/timestore.git
$ cd timestore
$ make
$ cd src
$ sudo ./timestore -d

Fetch the admin key

$ cd /var/lib/timestore
$ nano adminkey.txt

Copy the admin key, which looks something like this: POpP)@H=1[#MJYX<(i{YZ.0/Ni.5,g~<
The admin key is generated anew every time timestore is restarted, so it needs to be copied again after each restart.

2) Download and setup the emoncms timestore branch

Download a copy of the timestore development branch:

$ git clone -b timestore https://github.com/emoncms/emoncms.git timestore

Create a MySQL database for emoncms and enter the database settings into settings.php.

Add a line to settings.php with the timestore adminkey:
$timestore_adminkey = "POpP)@H=1[#MJYX<(i{YZ.0/Ni.5,g~<";
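
Because the key changes on every restart, copying it by hand gets tedious. Here is a minimal sketch of a helper that reads the key file and prints the settings.php line. It assumes the file contains nothing but the key followed by an optional newline, and that the script runs with permission to read /var/lib/timestore. It writes the key as a single-quoted PHP string so that characters such as $ or " in a random key are taken literally. read_adminkey and settings_line are made-up names.

```python
from pathlib import Path

ADMINKEY_PATH = Path("/var/lib/timestore/adminkey.txt")


def read_adminkey(path=ADMINKEY_PATH):
    """Return the admin key with any trailing newline removed.

    Assumption: the file holds only the key itself.
    """
    return Path(path).read_text().rstrip("\r\n")


def settings_line(key):
    """Format the key as a single-quoted PHP string assignment."""
    # Inside PHP single quotes only backslash and single quote need escaping.
    escaped = key.replace("\\", "\\\\").replace("'", "\\'")
    return f"$timestore_adminkey = '{escaped}';"
```

Calling print(settings_line(read_adminkey())) after starting timestore prints the line to paste into settings.php.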

Create a user and log in.

The development branch currently only implements timestore for realtime data, and the feed/data API is restricted to timestore data, which means that daily data does not work yet. The use of timestore for daily data still needs to be implemented.

The feed model methods implemented to use timestore so far are create, insert_data and get_data.

Try it out

Navigate to the feeds tab, click on the feed API helper, and create a new feed by entering:
http://localhost/timestore/feed/create.json?name=power&type=1

It should return {"success":true,"feedid":1}
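
If you would rather script this step than type the URL, the request and the reply check might look like the sketch below. It only builds the URL shown above and parses the JSON reply. The base URL is the one used in this post, the function names are my own, and a script calling the URL outside a logged-in browser session would presumably also need some form of authentication, which is not covered here.

```python
import json
import urllib.parse

BASE_URL = "http://localhost/timestore"  # install location used in this post


def create_url(name, datatype=1):
    """Build the feed/create.json URL for a new feed."""
    query = urllib.parse.urlencode({"name": name, "type": datatype})
    return f"{BASE_URL}/feed/create.json?{query}"


def parse_create_reply(body):
    """Return the new feed id, or raise if the reply does not report success."""
    reply = json.loads(body)
    if not reply.get("success"):
        raise RuntimeError(f"feed creation failed: {body}")
    return int(reply["feedid"])
```

For example, create_url("power") gives the URL above, and parse_create_reply('{"success":true,"feedid":1}') returns 1.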

Navigate back to feeds; you should now see your power feed in the list.
Navigate again to the API helper to fetch the insert data API URL.

Call the insert data API a few times over, say, a minute, so that we have at least 6 datapoints (one every 10 seconds). Vary the value to make it more interesting:
http://localhost/timestore/feed/insert.json?id=1&value=100.0
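
Calling the URL by hand every 10 seconds works, but a small loop does the same job. This is a rough sketch using only the Python standard library. The apikey parameter is an assumption about how emoncms authenticates requests made outside the browser session, and insert_url, http_get and insert_values are illustrative names.

```python
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost/timestore"  # install location used in this post


def insert_url(feed_id, value, apikey=None):
    """Build the feed/insert.json URL for one datapoint."""
    params = {"id": feed_id, "value": value}
    if apikey is not None:
        # Assumption: a write API key is passed as an "apikey" parameter.
        params["apikey"] = apikey
    return f"{BASE_URL}/feed/insert.json?{urllib.parse.urlencode(params)}"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode()


def insert_values(feed_id, values, apikey=None, interval=10,
                  fetch=http_get, sleep=time.sleep):
    """Insert each value in turn, waiting `interval` seconds between calls."""
    replies = []
    for position, value in enumerate(values):
        if position > 0:
            sleep(interval)
        replies.append(fetch(insert_url(feed_id, value, apikey)))
    return replies
```

For example, insert_values(1, [100.0, 250.0, 180.0, 90.0, 300.0, 120.0]) sends six values to feed 1 over about 50 seconds.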

Select the rawdata visualisation from the vis menu
http://localhost/timestore/vis/rawdata&feedid=1

Zoom to the last couple of minutes to see the data.


I met Mike Stirling a little over a month ago in Chester for a beer and a chat, after Mike originally got in contact to let me know about timestore. We discussed data storage, secure authentication, low-cost temperature sensing and openTRV, the project Mike is working on. I think there could be great benefit in making what we're developing here with openenergymonitor interoperable with what Mike and others are developing with openTRV, especially as we develop more building heating and building fabric performance monitoring tools. This could all develop into a really nice open source ecosystem of hardware and software tools for whole-building energy (both electric and heat) monitoring and control.

Check out Mike's blog here:

http://www.mike-stirling.com/
and http://www.earth.org.uk/open-source-programmable-thermostatic-radiator-valve.html