More direct file storage research

I was surprised to find how easy it was to use flat file storage for feed data using PHP file access commands and how fast this approach could be.

While reading up on indexes I realised that the timestamp column in a feed data table is effectively its own index: it is naturally sorted in ascending order and each row (datapoint) should be unique. A datapoint can then be searched for efficiently using binary search, which I remember covering in A-level computing. The feed data in MySQL had a B-tree index which, if I understand correctly, works on a similar principle but is implemented as a separate index layer that uses quite a bit of extra disk space.
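
As a rough illustration (not the emoncms code), a binary search over a flat binary feed file might look something like this in PHP, assuming each row is 8 bytes: a 4-byte unsigned integer timestamp followed by a 4-byte float value, written in ascending time order. The filename and example timestamp are illustrative:

<?php
// Sketch only: binary search for the row nearest a given timestamp in a flat
// binary feed file with 8-byte rows (4-byte unsigned int time + 4-byte float value).

function binarysearch($fh, $time, $npoints)
{
    $start = 0; $end = $npoints - 1;
    while ($start <= $end)
    {
        $mid = (int) floor(($start + $end) / 2);
        fseek($fh, $mid * 8);
        $row = unpack("Itime/fvalue", fread($fh, 8));

        if ($row['time'] == $time) return $mid;
        if ($row['time'] > $time) $end = $mid - 1; else $start = $mid + 1;
    }
    return max(0, $end);   // nearest row at or below the requested time if no exact match
}

// Example usage
$fh = fopen("feed_1.dat", "rb");
$npoints = floor(filesize("feed_1.dat") / 8);
$pos = binarysearch($fh, 1358988000, $npoints);
fseek($fh, $pos * 8);
print_r(unpack("Itime/fvalue", fread($fh, 8)));
fclose($fh);

Each lookup touches at most around log2(n) rows, so even a 9 million row feed needs only around 24 reads.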

I had a go at implementing the get_feed_data function used in emoncms to select a given number of datapoints over a time window for drawing graphs. An example of the standalone function can be found here:

https://github.com/emoncms/experimental/blob/master/storage/directfiles/get_feed_data.php

A development branch of emoncms that uses this flat file approach and includes this function can be found here (inserting data, input processing such as power to kWh/d and visualisation all work, but it's still quite conceptual):

https://github.com/emoncms/emoncms/tree/flatfilestore

The get_feed_data function as implemented above takes roughly 120-230ms on a Raspberry Pi to select 1000 datapoints over 1 to 300 days from a feed table with over 9 million rows.

That's much better than the 900-2700ms achieved at similar ranges with the current mysql implementation.

The feed table in mysql used 178Mb of disk space. The same feed, with no loss of data, stored without an index and accessed as above takes up 67Mb, so that's a considerable saving. Interestingly, a 67Mb feed can be compressed to 18.5Mb with tar.gz compression.

One of the issues with the above get_feed_data query is that it needs to know the data interval in order to use the "select a datapoint every x datapoints" approach. We could instead use binary search to find every datapoint, but this would be slower; it may still be worth benchmarking so the two can be compared.
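
A cut down sketch of that "datapoint every x datapoints" selection (not the actual get_feed_data implementation linked above), reusing the binarysearch() function from the earlier sketch, the same 8-byte row format, and assuming the data interval $interval in seconds is known:

<?php
// Sketch only: select ~$dp evenly spaced datapoints over a time window by
// skipping a fixed number of rows per step, given a known data interval.

function get_feed_data_sketch($fh, $npoints, $start, $end, $dp, $interval)
{
    $startpos = binarysearch($fh, $start, $npoints);     // row index of window start

    $rows_in_window = ($end - $start) / $interval;
    $skip = max(1, floor($rows_in_window / $dp));        // rows to jump per returned datapoint

    $data = array();
    for ($pos = $startpos; $pos < $npoints; $pos += $skip)
    {
        fseek($fh, $pos * 8);
        $row = unpack("Itime/fvalue", fread($fh, 8));
        if ($row['time'] > $end) break;
        $data[] = array($row['time'] * 1000, $row['value']);   // ms timestamps for the graphs
    }
    return $data;
}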

The other issue is that the datapoints selected may not be representative of the window they cover, as each is just a single datapoint at a particular point in time. This is the problem that the averaging approach used by Mike Stirling in Timestore and by Frank Oxener in EmonWeb solves.

Timestore is also a fair bit faster than the above get_feed_data function, returning 1000 datapoints in 45ms.

The advantage of the above approach is that it can fit into emoncms without having to change the current implementation too much: the feed data retains its timestamps and input processing is used in the same way.

Not storing timestamps, as timestore does, could also be an advantage as it helps keep data quality high: fixed-interval datapoints should be easier to use for comparisons and mathematical operations between feeds, it gives you higher certainty when querying the data, fetching data is faster, and disk use is potentially half that of the above approach if the values are stored as 4-byte floats (as above) rather than the default 8-byte double. This, coupled with averaged layers, provides data that is representative at all time scales and datapoint number requests.
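
To make the difference concrete, here is a minimal sketch of the fixed-interval idea: with no timestamps stored, a datapoint's position in the file follows directly from its time, so reads need no searching at all. The meta fields, filename and 4-byte float format here are illustrative, not timestore's actual on-disk format:

<?php
// Sketch only: fixed-interval storage where position = (time - start_time) / interval

$meta = array('start_time' => 1358988000, 'interval' => 10);   // 10 second data

function write_fixed($fh, $meta, $time, $value)
{
    $pos = floor(($time - $meta['start_time']) / $meta['interval']);
    fseek($fh, $pos * 4);                      // 4 bytes per value, no timestamp stored
    fwrite($fh, pack("f", $value));
}

function read_fixed($fh, $meta, $time)
{
    $pos = floor(($time - $meta['start_time']) / $meta['interval']);
    fseek($fh, $pos * 4);
    $d = unpack("fvalue", fread($fh, 4));
    return $d['value'];
}

$fh = fopen("feed_1_fixed.dat", "c+b");
write_fixed($fh, $meta, 1358988010, 230.4);
printf("%.1f\n", read_fixed($fh, $meta, 1358988010));   // prints 230.4
fclose($fh);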

My next step will therefore be to explore timestore further, first creating a script to export data from emoncms into timestore. The script needs to analyse the emoncms feed to work out the most common data interval. It needs to check for missing data; if a monitor went offline for an extended length of time it needs to give you the option to take this into account. It then needs to export and import into timestore as efficiently as possible.
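
A rough sketch of how the interval analysis part of such a script might work, assuming the old emoncms MySQL layout of one feed_<id> table with time and data columns (credentials and the feed id are placeholders):

<?php
// Sketch only: find the most common interval between consecutive datapoints

$mysqli = new mysqli("localhost", "username", "password", "emoncms");
$feedid = 1;

$result = $mysqli->query("SELECT time FROM feed_".$feedid." ORDER BY time ASC LIMIT 10000");

$lasttime = 0; $intervals = array();
while ($row = $result->fetch_array())
{
    if ($lasttime > 0)
    {
        $interval = $row[0] - $lasttime;                 // gap between consecutive datapoints
        if (!isset($intervals[$interval])) $intervals[$interval] = 0;
        $intervals[$interval]++;
    }
    $lasttime = $row[0];
}

arsort($intervals);                                      // most frequent interval first
echo "Most common interval: ".key($intervals)."s\n";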

In memory storage: PHP shared memory vs Redis vs MYSQL

Continuing on the theme of rethinking the data core of emoncms: as previously mentioned, for short term storage, storage to disk may not be necessary; instead we can store data in memory using an in-memory database. Here are some tests and benchmarks for in-memory storage.

To start with I created a baseline test using MySQL, updating a feed's last time and value in that feed's row of the feeds table. This took around 4800-5100ms to update a row 10000 times.
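
The baseline test was essentially a loop of UPDATE queries; a sketch along these lines (the credentials and the feeds table layout are assumptions based on a standard emoncms install):

<?php
// Sketch only: time 10000 updates of a feed's last time and value in MySQL

$mysqli = new mysqli("localhost", "username", "password", "emoncms");

$start = microtime(true);
for ($i = 0; $i < 10000; $i++)
{
    $time = time();
    $mysqli->query("UPDATE feeds SET time='$time', value='$i' WHERE id='1'");
}
echo "MYSQL: ".round((microtime(true) - $start) * 1000)."ms\n";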


We would expect Redis to do a lot better as it's in-memory: it isn't writing to disk each time, which is much slower than memory access. Redis did indeed perform faster, completing the same number of updates to a key-value pair in 1900-2350ms. I'm a little surprised though that it was only about 2.3x as fast and not much faster, but then there is a lot going on: Redis has its own server which needs to be accessed from the PHP client, and this is going to slow things down a bit. I tested both the phpredis client and Predis; phpredis, which is written in C, was between 500-1000ms faster than the Predis client.
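
A sketch along the lines of the Redis test, shown with the phpredis client (the Predis version differs only in how the client object is created):

<?php
// Sketch only: time 10000 updates of a feed's last time and value in Redis

$redis = new Redis();
$redis->connect("127.0.0.1", 6379);

$start = microtime(true);
for ($i = 0; $i < 10000; $i++)
{
    $redis->set("feed_1", json_encode(array('time' => time(), 'value' => $i)));
}
echo "Redis: ".round((microtime(true) - $start) * 1000)."ms\n";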


How fast can in-memory storage be? A regular program variable is also a form of in-memory storage, and a quick test suggests that it takes 21ms to write to a program variable 10000 times; much better than 2.3x faster, that's around 230x faster! The problem with in-program variables is that if they are written to in one script, say an instance serving input/post, they cannot be accessed by another instance serving feed/list; we need some form of storage that can be accessed across different instances of scripts.


The difference between 21ms and 1900-2350ms for Redis is intriguingly large, so I thought I would search for other ways of storing data in memory that would allow access between different application scripts and instances.

I came across the PHP shared memory functions, which work a bit like the flat file access functions but for memory. The results of a simple test are encouraging, showing a write time of 48ms for 10000 updates, so from a performance perspective PHP shared memory looks like a better way of doing things.
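
A sketch of this kind of test using PHP's shmop functions (requires the shmop extension; the key, block size and 8-byte time/value packing are arbitrary choices for the test):

<?php
// Sketch only: time 10000 writes of a time/value pair to a shared memory block

$shmid = shmop_open(0xff3, "c", 0644, 100);              // create/open a 100 byte block

$start = microtime(true);
for ($i = 0; $i < 10000; $i++)
{
    shmop_write($shmid, pack("If", time(), $i), 0);      // 4-byte time + 4-byte float at offset 0
}
echo "Shared memory: ".round((microtime(true) - $start) * 1000)."ms\n";

// Read back the last written time and value
print_r(unpack("Itime/fvalue", shmop_read($shmid, 0, 8)));
shmop_close($shmid);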


The issue though is implementation. MySQL made it really easy to select the feed rows that you wanted (by feed id, by feeds that belong to a user, or by feeds that are public). I'm a little unsure about how best to implement similar functionality in Redis, but it looks like it may be possible by storing each feed's metadata roughly like this: feed_1: {"time":1300,"value":20}.
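
One possible layout, sketched below with Predis: each feed's last time and value as a JSON string under feed_<id>, plus a set per user holding that user's feed ids so that "feeds belonging to a user" can still be looked up. The key names are illustrative, not a settled design:

<?php
// Sketch only: feed metadata as JSON strings plus a per-user set of feed ids

require 'Predis/Autoloader.php';
Predis\Autoloader::register();
$redis = new Predis\Client();

// Update a feed and register it against user 1
$redis->set('feed_1', json_encode(array('time' => 1300, 'value' => 20)));
$redis->sadd('user_1_feeds', 1);

// Fetch all feeds belonging to user 1
foreach ($redis->smembers('user_1_feeds') as $feedid)
{
    echo "feed_$feedid: ".$redis->get("feed_$feedid")."\n";
}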

Shared memory looks like it could be quite a bit more complicated to implement, but then it does appear to be much faster. Maybe the 2.3x speed improvement over MySQL offered by Redis is fast enough? It's probably also much faster in high-concurrency situations. I think more testing, and attempts at writing full implementations using each approach, are needed before a definitive answer can be reached.

Load stats for MYISAM vs INNODB for feed storage on the Raspberry Pi

Here are some historic load stats for a Raspberry Pi running here, with 36 feeds being written to.

First using the INNODB storage engine:

A load of 3.5 causes an issue where the time recorded for incoming data packets gets messed up, creating bunched-up datapoints:

Switching the storage engine over to MYISAM reduced the load to around 0.2 and the timing issue is no longer present:


To convert your Raspberry Pi emoncms InnoDB tables to MYISAM you can run the following script on your Raspberry Pi, which will go through each table converting them in turn:



<?php
  // Connect to the emoncms database (adjust the credentials to match your setup)
  $mysqli = new mysqli("localhost","root","raspberry","emoncms");

  // Convert every table in the database to the MYISAM storage engine
  $result = $mysqli->query("SHOW tables");
  while ($row = $result->fetch_array())
  {
    echo "ALTER TABLE `".$row[0]."` ENGINE=MYISAM\n";
    $mysqli->query("ALTER TABLE `".$row[0]."` ENGINE=MYISAM");
  }


Rethinking the data input and storage core of emoncms: benchmarks

Over the last few days I've been looking again at the core data input, storage and access part of emoncms. There is definitely a lot of opportunity to improve performance and there are a lot of options so I thought I would start to do some more systematic benchmarking.

So here are some initial benchmarks of feed data storage in different storage engines: mysql (MYISAM vs InnoDB), timestore and direct file access. I also thought I'd have a go at writing the current implementation of input processing in both python and nodejs in addition to php, to learn a bit more about these languages as they are being used and favoured by others in the community such as Jerome (python), Houseahedron (python) and Jean Claude Wippler of Jeelabs (nodejs). I'd like to see if there is any measurable difference in performance between these languages for the kind of application we are developing, and whether there are any other benefits: is it easier to do certain things, etc.

Housemon by Jean Claude Wippler is a good example of how a timeseries data storage and visualisation application can be implemented in a different way, using a mixture of direct file storage and a Redis in-memory database, with the server side of the application written in nodejs.

Intrigued by the idea of using direct file storage as Jean Claude Wippler does in Housemon, and following the approach used by Mike Stirling in timestore of using a fixed time interval to simplify and speed up searching, I had a go at writing a basic implementation using php file access, and the results are good.

Storage engine test

All tests were run on a Raspberry Pi, running off the standard SanDisk SDHC 4Gb SD card.

MYSQL

https://github.com/emoncms/experimental/blob/master/storage/MYSQL/mysql.php
  • InnoDB INSERT 1000 points 21s,25s,20s (Normalised to 100,000 ~ 2200s)
  • InnoDB INSERT 10000 points 167s,183s (Normalised to 100,000 ~ 1750s)
  • MYISAM INSERT 10000 points 15-17s (Normalised to 100,000 ~ 160s)
  • MYISAM INSERT 100000 points 165s
MYISAM | INNODB READ

Benchmark of the current emoncms mysql read function that selects a given number of datapoints over a time window.

MYISAM results on the left | INNODB results on the right

https://github.com/emoncms/experimental/blob/master/storage/MYSQL/mysql_read.php

10000 datapoint table:
  • 1000dp over 5 hours (average method) 232ms | 391ms
  • 1000dp over 24 hours (average method) 424ms | 675ms
1000000 datapoint table: (115 days @ 10s)
  • all 0.2 hours (all method) 40ms | 38ms
  • all 0.5 hours (all method) 58ms | 55ms
  • all over 1 hours (all method) 90ms | 82ms
  • all over 1.3 hours (all method) 108ms | 100ms
  • 1000dp over 3 hours (average method) 237ms | 272ms
  • 1000dp over 5 hours (average method) 280ms | 327ms
  • 1000dp over 24 hours (average method) 726 ms | 949ms
  • 1000dp over 48 hours (average method) 1303 ms | 1767ms
  • 1000dp over 52 hours (php loop method) 2875 ms | 2650ms
  • 1000dp over 100 hours (php loop method) 3124 ms | 2882ms
  • 1000dp over 200 hours (php loop method) 2934 ms | 2689ms
  • 1000dp over 400 hours (php loop method) 2973 ms | 2749ms
  • 1000dp over 2000 hours (php loop method) 2956 ms | 2762ms
  • 1000dp over 2600 hours (php loop method) 2969 ms | 2767ms
PHP loop method timing may be quite a bit longer if the server is under heavy load, as it involves making many separate mysql queries and each query needs to wait for other queries in the mysql process list to complete.
Timestore

Timestore is a promising solution, developed specifically for timeseries data, written by Mike Stirling.
Blog post on timestore: Timestore timeseries database

https://github.com/emoncms/experimental/blob/master/storage/timestore/timestore.php
  • 10000 inserts 52s
  • 100,000 inserts 524s
https://github.com/emoncms/experimental/blob/master/storage/timestore/timestore_read.php
  • Read 1000 datapoints over 5 hours: 45ms
  • Read 10 datapoints over 5 hours 20ms
Timestore includes layer averaging and multiple layers, so there is quite a bit more going on (functionality that would still need to be added to the direct file and mysql implementations above); the benchmarks are therefore not directly comparable.

Direct file
For some reason I did not think this method would work as well as the benchmarks show, but it's great that it does, because from an implementation point of view it's really simple and very flexible: it's easy to modify the code to do what you want. See the examples linked; a sketch of the write test is also included after the list below:
  • Direct file write 100,000: 6-7s
  • Direct file write 100,000 open and close each time: 27,24,26s
  • Direct file read 1000 datapoints over 5 hours of 10 second data in 85-88ms
  • Reads 1000 datapoints over 200 hours of 10 second data in 93ms
  • Reads 1000 datapoints over 2000 hours of 10 second data in 130ms
  • Reads 1000 datapoints over 2600 hours of 10 second data in 124ms
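
A sketch of the two write tests for reference: appending 100,000 rows with the file kept open, versus opening and closing the file for every row. The 8-byte row format (4-byte timestamp plus 4-byte float) and filenames are illustrative:

<?php
// Sketch only: direct file write benchmark, keep-open vs open/close per datapoint

$npoints = 100000;

// 1) File kept open for the whole run
$start = microtime(true);
$fh = fopen("feed_test_a.dat", "ab");
for ($i = 0; $i < $npoints; $i++) fwrite($fh, pack("If", $i * 10, $i));
fclose($fh);
echo "Keep open: ".round(microtime(true) - $start, 1)."s\n";

// 2) Open and close the file for every datapoint
$start = microtime(true);
for ($i = 0; $i < $npoints; $i++)
{
    $fh = fopen("feed_test_b.dat", "ab");
    fwrite($fh, pack("If", $i * 10, $i));
    fclose($fh);
}
echo "Open/close each time: ".round(microtime(true) - $start, 1)."s\n";
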
Redis
For short term storage, storage to disk may not be necessary; instead we can store data in memory using an in-memory database like redis. Benchmarks to add.

Blog post: Redis idea

Other ideas for storage format
Languages
What about the programming language? No benchmarks yet, but it is interesting to look at the difference in how the code looks. I found each language pretty straightforward to use, and online resources to get me past the bits I didn't know were readily available. The language links below show the core parts of the input processing stage of emoncms written in php, nodejs and python. I've also linked to emonweb, a port of emoncms (or more a build in its own right by now) by Frank Oxener in ruby on rails.
Servers
Emoncms.org stats
HouseMon

HouseMon by Jean Claude Wippler stores data in 3 forms: 
  • Raw log of the serial data received to file (compressed daily) 
  • Redis in-memory storage for last 48 hours which makes for quick access of most recent data. 
  • Archival storage via direct file access for data older than 48 hours, the archive is hourly aggregated data (hourly - unless a use case demands finer resolution at which point the archive can be rebuilt from the raw logs). 
http://jeelabs.org/2013/02/17/data-data-data/
http://jeelabs.org/2013/02/18/who-needs-a-database/

It's quite clear from some of the above tests that the housemon implementation is going to be fast in terms of data access speeds (with redis storing everything in memory for the last 48 hours) and efficient in terms of data storage (binary files, hourly data). The big difference is that full resolution data is not available after 48 hours, but Jean Claude Wippler argues that it is better to wait for a use case rather than implement higher resolution for higher resolution's sake, and that the logs can be used to rebuild archives at higher resolution if needed anyway.

Next steps

If you have a standard emoncms Raspberry Pi install, changing the mysql storage engine to MYISAM should bring immediate performance improvements, especially if you have a lot of feeds being recorded. I will try and put together a script to make this easier and also update the ready-to-go image.

The next development step, I think, is to integrate redis into emoncms by rebuilding the input processing implementation to use redis rather than go to disk to get the last feed and input values. Then it would be good to test both timestore and the integrated direct file storage in action on several parallel Raspberry Pis, keep benchmarking the differences and see where that gets us.

Idea for using redis in-memory database to improve emoncms performance

Most of the time taken to handle posting data to emoncms (input processing and feed updating) is taken up by requests to the mysql database. The php part of the code is usually pretty fast, especially with opcode caching such as APC enabled.

Reducing the number of MYSQL queries required is usually a sure way to improve the performance of the application. Back in March of this year I re-factored the input processing implementation, removing a lot of un-needed repeated queries, as described at the bottom of this page: http://emoncms.org/site/docs/developinputproc
The results were really good, see the pic of time spent serving queries here

I've been thinking about how to improve this further. MySQL queries are used a lot for getting the last value of a feed or input: they're used for all the +, -, x, / input processes, the power to kWh/d and histogram processes, in fact most of the processes. When a new datapoint is added to a feed data table, emoncms also updates the last time and value in the feeds table every time. This is then used by the input processors and also to provide the live view on the feeds page.

None of these input or feed last-updated time/value reads and writes need to be persistent beyond the very short term, so an in-memory database would be fine for them. Using an in-memory database like redis should be much faster than going to the hard disk or SD card, so hopefully implementing this will lead to a good performance improvement. It also has benefits for the longevity of the Raspberry Pi SD card as it reduces SD card writes. That's the idea anyway; here's a bit of basic testing of using redis, the next step is to try to implement this in emoncms.

Install redis and the redis php client on Ubuntu

sudo apt-get install redis-server

Install the PHP redis client https://github.com/nrk/predis. To install using PEAR, which we've already been using for the installation of the serial dio library used by the raspberrypi module, call the following:

sudo apt-get install php-pear php5-dev (if you don't already have pear installed)

pear channel-discover pear.nrk.io
pear install nrk/Predis

Trying redis out: 

<?php

require 'Predis/Autoloader.php';
Predis\Autoloader::register();

$redis = new Predis\Client();

// 1) Set redis in-memory key-value pair to hold feed meta and last value data
// ONCE SET COMMENT OUT THIS LINE TO SEE THAT feed_1 DATA IS PERSISTENT 
// BETWEEN CALLS TO THIS SCRIPT - AS LONG AS MEMORY IS NOT RESET.
$redis->set('feed_1',json_encode(array('name'=>"Solar Power",'time'=>time(),'value'=>1800)));

// 2) Fetch the feed_1 entry from in-memory db
echo $redis->get('feed_1');
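
Building on that basic test, here is a rough sketch of how an input process such as "power to kWh/d" might read and write the feed's last time and value in redis instead of querying MySQL. This is simplified (the daily reset is ignored) and the key layout is illustrative, not the final emoncms implementation:

<?php
// Sketch only: accumulate kWh from power using the last time/value held in redis

require 'Predis/Autoloader.php';
Predis\Autoloader::register();
$redis = new Predis\Client();

function power_to_kwhd($redis, $feedid, $time, $power)
{
    $last = json_decode($redis->get("feed_$feedid"));

    $kwhd = 0; $lasttime = $time;
    if ($last) { $kwhd = $last->value; $lasttime = $last->time; }

    // Energy accumulated since the last datapoint, in kWh (W x s / 3,600,000)
    $kwhd += ($power * ($time - $lasttime)) / 3600000;

    $redis->set("feed_$feedid", json_encode(array('time' => $time, 'value' => $kwhd)));
    return $kwhd;
}

echo power_to_kwhd($redis, 1, time(), 1800)."\n";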

Redis is also used by Jean Claude Wippler in HouseMon which is cool:
https://github.com/jcw/housemon
http://jeelabs.org/tag/housemon/
I'd like to learn more about how HouseMon works, take a little time to get a HouseMon install up and running and familiarise myself with the code and architecture as it looks really nice!

Building Energy Modelling: A simple javascript model

Putting what we covered in the last two posts together we can create a basic building energy model that covers building fabric heat loss and infiltration heat loss.

To start with, the model extends the very basic cube house example given in the first blog post by allowing for different building elements with different U-values: walls, roof, floor, windows. This is implemented in the same way as can be found in the full SAP model.


When calculating the heat loss from a building with multiple elements it's useful to break the equation for building fabric element heat loss (building fabric element heat loss = Area x U-value x temperature difference) down into three parts.
  1. Calculate the Area x U-Value for each element giving the heat loss in Watts per Kelvin (W/K) for that building element.
  2. Calculate the sum of the heat loss in W/K for all elements.
  3. Calculate the total heat loss in Watts for a given temperature difference.
Here's an example house with a fairly simple range of elements, also added is an average infiltration rate for a modern house as covered in the last blog post.



  • Solid floor uninsulated: 49 m2 x 0.7 = 34.3 W/K
  • Timber frame walls with 50mm of insulation: 156 m2 x 0.45 = 70.2 W/K
  • Roof with 100mm of loft insulation: 49 m2 x 0.25 = 12.3 W/K
  • Double glazed windows: 12 m2 x 2.0 = 24.0 W/K
  • Infiltration (average modern house, 1.5 air changes per hour): 1.5 x 0.33 x 294 m3 = 145.5 W/K
  • TOTAL: 286.3 W/K
  • Temperature difference: 9.0 C
  • Heat loss: 286.3 W/K x 9.0 C = 2577 W
  • Annual total heating demand (including internal and solar gains): 22570 kWh

Note: We still need to take into account solar and internal gains and seasonal temperature variation, as a minimum, before we get the actual demand on the heating system.

So that's all quite straightforward. The open source SAP implementation is written primarily in javascript with all the calculations happening on the client (in the web browser). The model combined with an interface updated in real-time makes for a dynamic experience, with everything being calculated and visualised on the fly.

Here's a javascript implementation of the above with just a simple console output for now:

Javascript code example
var elements = [
  {itemname: "Floor",   grossarea: 49.0,  openings: 0,    uvalue: 0.7},
  {itemname: "Walls",   grossarea: 168.0, openings: 12.0, uvalue: 0.45},
  {itemname: "Loft",    grossarea: 49.0,  openings: 0,    uvalue: 0.25},
  {itemname: "Windows", grossarea: 12.0,  openings: 0,    uvalue: 2.0}
];

var fabric_heat_loss_WK = 0;

// Sum Area x U-value (W/K) for each element, net of any openings
for (var z in elements) {
  elements[z].netarea = elements[z].grossarea - elements[z].openings;
  elements[z].axu = elements[z].netarea * elements[z].uvalue;
  fabric_heat_loss_WK += elements[z].axu;
}

var volume = 294;
var infiltration = 1.5; // Air changes per hour
var infiltration_WK = 0.33 * infiltration * volume;

var internal_temperature = 21;
var external_temperature = 12;

var total_heat_loss_WK = fabric_heat_loss_WK + infiltration_WK;

// Heat loss in Watts for the given temperature difference
var heatloss = total_heat_loss_WK * (internal_temperature - external_temperature);

console.log("Total heating requirement: " + heatloss.toFixed(0) + " W");
// 0.024 converts W to kWh per day (24h / 1000), x 365 days
console.log("Annual heating demand: " + (heatloss * 0.024 * 365).toFixed(0) + " kWh");

You can run this code directly on your computer from a linux terminal, without using a web browser, using nodejs http://nodejs.org/. To install nodejs on Ubuntu type:
sudo apt-get install nodejs

Create a file called bem01.js, copy the javascript above into it and save.
Locate the file in the terminal and then run it using:
nodejs bem01.js

it should output:
Total heating requirement: 2577 W 
Annual heating demand: 22570 kWh

Seasonal temperature variation
So far, to keep things simple, we have assumed constant external and internal temperatures. There are different ways to take temperature variation into account; one common method is degree days http://www.degreedays.net/introduction.

If I understand it correctly, the SAP model uses average indoor temperature minus average external temperature on a monthly basis to calculate total heat demand, and then applies a factor that reduces the percentage of the month the heating is on for if gains from solar and internal sources are significant.

Different sources of heat gain
What we have covered so far is the heat loss side of the equation, but the heating energy requirement is not yet what our heating system needs to provide. Instead it is the total heat energy going into the system that is our house, including heat from sources other than the main heating system. The other heat gains typically taken into account are solar gains and internal gains, which usually consist of cooking, lighting, appliances and metabolic gains (these are the sources covered in the SAP model).

The full energy balance equation for a steady state building energy model looks like this:

solar_gains + cooking + lighting + hotwater + appliances + metabolic + heating_system =
( fabric_heat_loss_WK + infiltration_WK) x (internal_temperature – external_temperature)

This is really the fundamental equation that describes a simple steady state building energy model; a large part of the SAP model is concerned with calculating estimates for all the variables that go into this equation.

Integrating Monitoring into a building energy model.
The above equation shows several variables that could be provided or inferred to some degree by monitoring.

Internal temperature could be provided by an array of temperature sensors throughout a building. External temperature could be provided by an external temperature sensor or pulled in from a local weather station.

Depending on the energy source, cooking, lighting, hot water, appliances and heating system input could be provided by either electricity monitoring or combined electricity and gas monitoring; the degree of utilisation would need to be taken into account.

Solar gains could be calculated from an irradiance sensor, or how about from normalised solar PV data?

Summary of building energy modelling blog posts and code
For the open source implementation of the SAP 2012 model see github here:
github.com/emoncms/sap

Building Energy Modelling part 1 - The Whole House Book

emonTH Prototype

I'm currently working on a little unit called the emonTH, a remote temperature and humidity monitoring node. We wanted a tidy looking, easy to deploy little unit for monitoring the environmental conditions in various rooms of our houses. The temperature and humidity data gathered can be fed into emoncms and used for building energy modelling, heating system optimisation etc.

The design so far has options for a DS18B20 temperature sensor or a DHT22 sensor for humidity & temperature. External sensors can be connected via a terminal block (not soldered in on the prototype). The enclosure can be wall mounted. The unit will be battery powered with the option of mini-USB power. We have estimated around 6-9 months of battery life. I hope we might be able to get a year or so of battery life with optimisation and slowing the readings down to once every few minutes.

I'm currently testing prototype #1. 

To keep power consumption down the ATmega328 microcontroller is put to sleep in between readings, and the sensors are powered from digital outputs and turned off altogether between readings; this should stop any self-heating effects (see forum thread). I'm planning to do some accuracy testing on the prototype soon.

emonTH first prototype with DHT22 and DS18B20

emonTH enclosure

As with the other OpenEnergyMonitor hardware the emonTH has an ATmega328 with the Arduino bootloader, so it's nice and easy to modify and upload new code (sketches). For the wireless there is an RFM12B module, to be compatible with our other hardware (RFM12Pi base station etc). Again, as with all our other hardware units, the schematic and CAD files will be open-sourced.

The emonTH uses a little module from Ciseco called the RFu328. This unit is an ATmega328 plus a radio (RFM12B or SRF) in the same small form factor as an XBee. We decided to use the RFu328 partly because it's nice and small and makes manufacture easier for us, and partly because it allows us to easily swap between using the RFM12B radio or the SRF while keeping the flexibility and ease of use of the ATmega328 with the Arduino Uno serial bootloader.


RFu328 with RFM12B

The little red circle on the image above indicates the only hardware change required when using an RFM12B radio on the RFu328. The SMT resistor is rotated 90 degrees, swapping over Dig 1 (SRF UART Tx) to Dig 3 (INT1) to be used as the RFM12B SPI interrupt. The RFu328 with the RFM12B requires a modified JeeLib Arduino library called RFu_JeeLib.
RFu328 with SRF & Chip Antenna

Building Energy Modelling: Ventilation and infiltration


Following on from the previous blog post, which described a simple example of heat loss via conduction through the building fabric, the second primary cause of heat loss is ventilation and infiltration: the movement of heated air from inside the house out into its surroundings.

I wrote the following as a start for the Emoncms SAP module documentation; it can be found under emoncms.org/sap/airchange.

The rate of air movement is typically measured in air-changes per hour. An air-change is when the full volume of air inside a house is replaced with a new volume of air. This happens surprisingly frequently.

The heat lost is equal to the energy stored in the warm air relative to the external temperature, which can be found with another fundamental physics equation, the equation for specific heat:

HLOSS = c x m x (TINTERNAL - TEXTERNAL)

Where:

c = Specific heat of air (1006 J/kg.K)
m = Mass of air that has moved out of the building per second

(HyperPhysics: Specific heat)

Example:

A house that measures 7 meters wide, 7 meters long and 6 meters high encloses a volume of: 294 m3. The house has an average air tightness for a modern house of around 1.5 air changes an hour and the internal temperature is 20C while the external temperature is 12C.

The first step is to work out m, the mass of air that has moved out of the building per second. We know the volume of air that has moved and the rate at which it moved, so we can calculate the mass from this.

mass of one air change = volume x air density
There are 1.5 air changes per hour or 1.5 / 3600 air changes per second. The mass of air that has moved per second is therefore:

m = (air-change / 3600) x volume x air density
The multiplication by air density and by the specific heat of air, and the division by 3600, are in many models bundled together into one constant to reduce the number of calculation steps:

m x c = air-change x volume x (density x c / 3600)
Where:

density x c / 3600 = 1.205 x 1005 / 3600 = 0.336
The heat loss from ventilation and infiltration becomes:

HLOSS = 0.33 x air-change x volume x (TINTERNAL - TEXTERNAL)


This is the form of the equation used in the SAP model in section 4 (emoncms.org/sap/4). 0.336 has been rounded down to 0.33 in accordance with the SAP value. The density and specific heat figures above come from: http://www.engineeringtoolbox.com/air-properties-d_156.html

Entering our example values in the simplified equation, we get:

HLOSS = 0.33 x 1.5 x 294 x (20 - 12) = 1164.2 Watts


Working out air-changes per hour

The hard part of the equation above is of course working out the air changes per hour of a building. The most accurate way to find it is to perform an air tightness test of the building, which involves de-pressurising the building with specialist fans attached to the front door.

The SAP model provides a method to estimate the air-change rate in the absence of a measured value. This method is detailed in full in section 2 (emoncms.org/sap/2) and takes into account factors from number of chimneys and flues to wind speed and the degree the building is sheltered from the wind.

Typical air-change per hour values

As a guide Pat Borer and Cindy Harris give the following values in the Whole House Book.

Old, un-draught-stripped house: 4 air changes per hour

Average modern house: 1 to 2 air changes per hour

Very tight, super-insulated house: 0.6 air changes per hour

How much energy does it take to heat a simple cube house?

Imagine a house that is a hollow cube of uniform material, no windows, no openings, no draughts, just a simple hollow cube.

Let's say this cube house is made of nothing but mineral insulation 100mm thick, with internal dimensions 7m wide, 7m long and 7m high.

Our cube house is situated in a climate with no wind or solar gain just a stable 12C outside air temperature year round.

How much energy would it take to keep this hypothetical house at a stable 21C?

As we heat the house, heat will flow from the hotter internal air through the walls to the colder external air via conduction, so the equation we need is the fundamental physics equation for heat conduction.

H = (kA / l) x (Tinternal – Texternal)

See the great hyperphysics site for more on the heat conduction equation and everything else physics.

The Wikipedia table on material thermal conductivity tells us that mineral insulation has a thermal conductivity of 0.04 W/mK. We can take the area of the material to be the internal area of our cube house (imagine folding the cube house out so that we just have one single wall of area A and thickness l). There is of course a difference between the internal area and the external area of our cube house, but let's come back to that later and take the internal area for now, which is:

7m x 7m x 6 surfaces = 294 m2

Putting the numbers into the heat conductivity equation we get:

H = (0.04 x 294 / 0.1) x (21 – 12) = 1058 Watts

And so we find we would need a heater of just over 1kW to keep our cube house at 21C.
1058W continuously would work out to about 25 kWh per day and around 9270 kWh/year.

Heat loss through building elements is one of the main cornerstones of a building energy model. But in models such as SAP it's not usually referred to as the heat conductivity equation, nor is the thermal conductivity of a material the usual starting point. Instead, models like SAP start with a building element's U-value and an equation that looks like this:

Heat loss = U-value x Area x Temperature Difference

For an element made of a single uniform material the U-value is simply the material's thermal conductivity k divided by its thickness. But building elements are only sometimes single uniform materials; a building element can also be an assembly of different materials, such as a timber stud wall with insulation, membranes and air inside. The physical process of heat transfer through the element may also be a mixture of conductive, convective and radiative heat transfer.

Coming from a physics background I found it useful to start with what I was familiar with, and I think it's useful to understand that in the case where a material is uniform, the heat loss through a building element equation is the same as the basic equation for heat conductivity, and the U-value is just the k/l part lumped together into one constant.

The U-value of our 100mm mineral insulation wall would therefore be: U-value = k / l = 0.04 / 0.1 = 0.4  W/m2.K.

If you have a composite of materials, say a layer of wood and then a layer of insulation, it's possible to calculate the overall U-value in the same way as we calculate the equivalent resistance of parallel resistors in electronics.
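
For example, taking illustrative values, a wall made of 25mm of softwood (k roughly 0.13 W/mK) over 100mm of mineral insulation (k = 0.04 W/mK) has a combined thermal resistance of 0.025/0.13 + 0.1/0.04 = 0.19 + 2.5 = 2.69 m2K/W, giving an overall U-value of 1/2.69 = 0.37 W/m2.K.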

In the next post I will go through an example of a slightly more complicated, but still very simple, house model made up of a series of different elements with different U-values.

For further reading on U-values see U-values definition and calculation by the RIBA.

Building Energy Modelling part 3 - Carbon Coop and Open Source SAP 2012

Continuing from the last post on building energy modelling: fast forward to late 2012, when I met Matt Fawcett of Carbon Coop and heard at length about all the exciting work they are doing around retrofit, see the blog post here: http://openenergymonitor.blogspot.co.uk/2013/05/carbon-coop.html

As I mentioned in that post, Carbon Coop and their technical partners URBED have put a lot of work into a method for assessing a household's suitability for retrofit work, working out a list of measures including full details and costings, and how a household can achieve 60-80% carbon reduction. Their assessment method is based on SAP 2012 and was implemented by Charlie Baker of URBED in a Mac Numbers spreadsheet.



Matt explained that to take things further they wanted to integrate monitoring with the assessments in order to reduce the assumptions used, and that they thought that, longer term, an open source online version of the retrofit assessment method would be key to making retrofit more accessible and open to a greater number of people.

At home I also wanted to move forward with this idea of using monitoring combined with a building energy model to understand the current building fabric performance at home and at the lab, and to get a better understanding of what the effect would be of adding insulation and draught proofing.

And so Matt and I started the process of converting the SAP 2012 pdf worksheet specification into an open source javascript web application, which as a first draft is now about 90% complete. It's implemented as an emoncms module so that, as it develops, we can easily pull in monitored data sets such as the actual average monthly internal temperature in the building, actual electricity consumption for internal gains and so on.



The module source code can be found on github here: github.com/emoncms/sap

Try it out on emoncms.org here: http://emoncms.org/sap (no need to login)

Most of the calculations can be found in the javascript file equations.js and associated functions in solar.js, windowgains.js and utilisationfactor.js. The interface pages can be found in the folder named compiled (which isn't a particularly good name any more, a remnant from earlier development). The file sap_view.php is what ties it all together loading the equations.js and the relevant page interface.

Having got this far with the implementation, and understanding the requirements better (what calculations are needed etc), it's becoming clearer that the current implementation really needs a round of re-factoring to make it easier to develop with going forward.

The SAP model lends itself well to being broken down into a series of sub-modules. So rather than have all the calculations in one file, the various parts of the calculation and their related interfaces could be broken out into separate modules with clear inputs and outputs, and the possibility of interchanging these sub-modules: you could decide to use the SAP internal temperature estimation sheet/model or bring that in from monitored data, for example.

So that's pretty much the state of development on this at the moment. The recent meetup with Houseahedron couldn't really have come with better timing, as Matt and I had just been chatting about where this could lead in the future and what would be really cool to have. We were saying how nice it would be if the building thermal data could be visualised in 3D, but thought that would be something a long way down the line; it was literally a couple of weeks later that the Houseahedron team got in contact saying they were going to be developing just this, all open source. With a larger team of us with different skill-sets working on this, I think this could turn out to be a really useful tool that integrates well with monitoring and allows us to better quantify the performance of buildings and the effect of implementing various measures. Exciting stuff!