Model Details
Developed by research scientists at CSIRO, 2023
LightGBM regression model
Star Rating – Version 13; Heating (MJ/m²) – Version 11; Cooling (MJ/m²) – Version 11
Intended Use
RapidRate™ is a tool developed by CSIRO using machine learning techniques that can quickly rate the energy efficiency of a dwelling from a relatively small number of inputs.
RapidRate™ generates an estimated Star Rating aligned with the Nationwide House Energy Rating Scheme (NatHERS), and estimates both heating and cooling energy loads. The model provides an estimate and a prediction interval, which indicates the accuracy at the individual dwelling level.
RapidRate™ is suited to energy assessments where the data, expertise, or time needed for a detailed assessment is limited. It offers benefits for a range of users:
- Homeowners and renters could use RapidRate™ to assess the energy efficiency of their own homes and identify areas for improvement.
- Urban energy modellers could deploy RapidRate™ to quickly assess urban-level energy consumption.
- Housing data providers could enrich their datasets by using RapidRate™ to generate energy efficiency attributes for individual dwellings.
- Financial institutions could use RapidRate™ to gain a better understanding of the energy efficiency of their housing portfolios.
Note: RapidRate™ scores are not suitable for National Construction Code certification.
Context
Heating and cooling measures are the annual thermal performance loads, measured in units of MJ/m². They estimate the annual energy required for heating and cooling a residence, based on assumptions about occupancy and thermal comfort.
Star Ratings (which range from 0 to 10) provide an overall measure of the thermal comfort of homes, based on how much heating and cooling is required to maintain a comfortable temperature and the geographical location of the residence.
Factors
Inputs include dwelling type (house or apartment); floor area; external wall area by orientation; window area by orientation (including area double-glazed); main wall, floor and roof materials (including insulation); and postcode.
Input | Description | Feature importance |
---|---|---|
Dwelling type | House or apartment | Low |
Postcode | Postcode of the dwelling | High |
Floor area – conditioned | Conditioned floor area in square meters | High |
Floor area – unconditioned | Unconditioned floor area in square meters | Med |
Floor area – garage | Garage floor area in square meters | Low/Med |
External wall area by orientation | External wall area in square meters (including any windows or doors set into the wall), by orientation | Low/Med |
Total external wall area | Sum of external wall areas at all orientations | High |
Window area by orientation | Window area in square meters | High |
Total window area | Sum of window areas at all orientations | High |
Window area % double glazed by orientation | Percentage of window area that is double glazed at each orientation | Low |
Total double glazed window area | Sum of window areas that are double glazed at all orientations | High |
Main external wall construction type | Main wall materials | Low |
Main floor construction type | Main floor materials | Med/High |
Main roof construction type | Main roof materials | Med (roof type none) |
Wall, floor, roof insulation levels | R value of insulation of wall, floor and roof | High |
Site exposure | How open or protected the area surrounding a dwelling is | Med |
Project type | Reflects the age of the dwelling and whether it has been renovated | High |
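The "total" inputs in the table are described as sums over the per-orientation inputs. The sketch below shows how those totals could be derived. It is illustrative only: the class name, field names and orientation keys are hypothetical and are not taken from RapidRate™, and it assumes the double-glazed total is the per-orientation window area weighted by its double-glazed percentage.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DwellingAreas:
    """Per-orientation areas in square meters (hypothetical structure)."""

    wall_area_m2: Dict[str, float]       # external wall area, keyed by orientation
    window_area_m2: Dict[str, float]     # window area, keyed by orientation
    double_glazed_pct: Dict[str, float]  # % of window area double glazed, by orientation

    def total_wall_area(self) -> float:
        # Sum of external wall areas at all orientations.
        return sum(self.wall_area_m2.values())

    def total_window_area(self) -> float:
        # Sum of window areas at all orientations.
        return sum(self.window_area_m2.values())

    def total_double_glazed_area(self) -> float:
        # Assumption: window area at each orientation times its double-glazed share.
        return sum(
            area * self.double_glazed_pct.get(orientation, 0.0) / 100.0
            for orientation, area in self.window_area_m2.items()
        )


# Made-up example values, not drawn from any certificate.
example = DwellingAreas(
    wall_area_m2={"N": 30.0, "E": 20.0, "S": 30.0, "W": 20.0},
    window_area_m2={"N": 8.0, "E": 2.0, "S": 4.0, "W": 2.0},
    double_glazed_pct={"N": 100.0, "E": 0.0, "S": 50.0, "W": 0.0},
)
print(example.total_wall_area(), example.total_window_area(), example.total_double_glazed_area())
```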
Training Data
The model is trained with data from the NatHERS Universal Certificates collected by CSIRO since June 2016. A Universal Certificate is the assessment pathway used by most new dwellings in Australia to comply with the energy efficiency requirements in Australia’s National Construction Code. Around 130,000 certificates are added to our database each year.
Evaluation Data
Test data is used to evaluate the model’s performance separately from the training data. The model, which has never encountered this test data before, is assessed for the accuracy of its predictions.
Metrics
Evaluation metrics include R², root mean square error, mean absolute error, and median absolute error. These indicate overall model performance.
- R², or R-squared, measures how well a regression model fits the data. If R² is 1, the model’s predictions are perfect. If R² is 0.8, the model accounts for 80% of the variance in the data, which suggests a strong fit, while an R² of 0 means the model does no better than always predicting the mean. R² usually lies between 0 and 1, but can be negative when the model performs worse than predicting the mean.
- Root mean square error (RMSE) is the square root of the average squared difference between predictions and actual values, so large errors are weighted more heavily. A low RMSE means the model’s predictions are close to the actual values, and a high RMSE means they are further away. RMSE values can range from 0 to 500 MJ/m² (Heating) / 212.5 MJ/m² (Cooling) / 5 Stars (Star Rating).
- Mean and median absolute error measure the typical distance between the model’s predictions and the actual values, with lower values indicating more accurate predictions. The mean is the average of the absolute errors, and the median is the middle value when the absolute errors are sorted. Mean and median absolute error values can range from 0 to 1000 MJ/m² (Heating) / 425 MJ/m² (Cooling) / 10 Stars (Star Rating).
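The four metrics above can be computed from paired actual and predicted values as in the following minimal sketch. It uses the standard textbook definitions and only the Python standard library. The function name and the sample values are made up for illustration and are not CSIRO's evaluation code or data.

```python
import math
import statistics


def regression_metrics(actual, predicted):
    """Return R², RMSE, mean absolute error and median absolute error."""
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    mean_actual = statistics.fmean(actual)
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        # Undefined (division by zero) if all actual values are identical.
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / len(errors)),
        "mean_abs_error": statistics.fmean(abs_errors),
        "median_abs_error": statistics.median(abs_errors),
    }


# Illustrative Star Rating values only.
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 9.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

In this toy case the single 1-star miss pulls the RMSE (about 0.61) above the mean absolute error (0.5), which is the same pattern seen in the tables where RMSE is consistently larger than the mean and median absolute errors.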
Quantitative Analyses
Please see the Metrics section for explanation of the metrics displayed here (R², RMSE, mean absolute error, median absolute error), and their potential ranges.
The following tables provide model accuracy measures for each model as a whole:
Model | residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
---|---|---|---|---|---|
Star Rating | 910247 | 0.79 | 0.48 | 0.33 | 0.2 |
Model | residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|
Cooling | 910247 | 0.94 | 6.6 | 4.2 | 3 |
Heating | 910247 | 0.95 | 16.5 | 8.1 | 5 |
The following tables provide model accuracy measures over subsets of the data, for profiling features of particular interest:
Project type
Model | Project type | residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
---|---|---|---|---|---|---|---|
Star Rating | Existing | 14869 | 1.6 | 0.84 | 0.59 | 0.43 | 0.3 |
Star Rating | New | 861264 | 95 | 0.69 | 0.46 | 0.32 | 0.2 |
Star Rating | Renovation | 34114 | 3.7 | 0.76 | 0.68 | 0.5 | 0.4 |
Model | Project type | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Heating | Existing | 14869 | 1.6 | 0.78 | 83.9 | 61.62 | 45 |
Heating | New | 861264 | 95 | 0.96 | 9.39 | 6.36 | 4 |
Heating | Renovation | 34114 | 3.7 | 0.80 | 39.43 | 26.6 | 18 |
Model | Project type | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Cooling | Existing | 14869 | 1.6 | 0.73 | 12.47 | 8.45 | 6 |
Cooling | New | 861264 | 95 | 0.94 | 6.26 | 4.08 | 3 |
Cooling | Renovation | 34114 | 3.7 | 0.83 | 8.83 | 5.61 | 4 |
National Construction Code climate zone
Each dwelling postcode was mapped to a National Construction Code (NCC) climate zone. Please see https://www.abcb.gov.au/resources/climate-zone-map for the climate zone map and climate zone descriptions.
Model | NCC climate zone | residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
---|---|---|---|---|---|---|---|
Star Rating | 1 | 18134 | 2.0 | 0.48 | 0.62 | 0.48 | 0.4 |
Star Rating | 2 | 102119 | 11 | 0.62 | 0.65 | 0.5 | 0.4 |
Star Rating | 3 | 1133 | 0.1 | 0.58 | 0.65 | 0.53 | 0.5 |
Star Rating | 4 | 17541 | 1.9 | 0.65 | 0.43 | 0.29 | 0.2 |
Star Rating | 5 | 256515 | 28 | 0.70 | 0.55 | 0.41 | 0.3 |
Star Rating | 6 | 440522 | 48 | 0.87 | 0.37 | 0.25 | 0.2 |
Star Rating | 7 | 73663 | 8 | 0.85 | 0.38 | 0.26 | 0.2 |
Star Rating | 8 | 620 | 0.1 | 0.39 | 0.68 | 0.51 | 0.4 |
Model | NCC climate zone | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Heating | 1 | 18134 | 2.0 | -11.01* | 4.78 | 1.34 | 0 |
Heating | 2 | 102119 | 11 | 0.66 | 5.64 | 3.95 | 3 |
Heating | 3 | 1133 | 0.1 | 0.68 | 15.04 | 12.21 | 11 |
Heating | 4 | 17541 | 1.9 | 0.94 | 14.4 | 9.46 | 7 |
Heating | 5 | 256515 | 28 | 0.70 | 8.19 | 5.95 | 5 |
Heating | 6 | 440522 | 48 | 0.93 | 19.17 | 9.37 | 5 |
Heating | 7 | 73663 | 8 | 0.91 | 24.32 | 13.53 | 8 |
Heating | 8 | 620 | 0.1 | -0.19* | 71.69 | 58.96 | 55 |
*R² values are usually between 0 and 1, but negative values can occur when looking at subsets of the data, and indicate that the model performs poorly for those subsets. Here, we see poor model performance for National Construction Code climate zones 1 and 8. The negative R² for NCC climate zone 8 is caused by the relatively small number of residences in that climate zone. For NCC climate zone 1, the large negative R² is driven by the low need for heating in this climate zone and the presence of a few outliers in the training dataset.
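The arithmetic behind a negative R² can be seen in a small worked example. The sketch below uses invented numbers, not values from the RapidRate™ dataset, and assumes the standard definition of R². When actual heating loads in a subset are nearly all zero, the spread of the actual values is tiny, so even small absolute errors exceed it and R² falls well below zero.

```python
import math


def r_squared(actual, predicted):
    """Standard R²: 1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical heating loads (MJ/m²) for a warm-climate subset: almost no heating needed.
actual = [0.0, 0.0, 0.0, 1.0]
predicted = [2.0, 2.0, 2.0, 2.0]  # every prediction is within 2 MJ/m² of the actual value

subset_r2 = r_squared(actual, predicted)
subset_rmse = rmse(actual, predicted)
print(subset_r2, subset_rmse)
```

Here the RMSE is under 2 MJ/m² while R² is strongly negative, which is why the absolute error columns are the more informative measure for such subsets.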
Model | NCC climate zone | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Cooling | 1 | 18134 | 2.0 | 0.95 | 21.03 | 14.8 | 11 |
Cooling | 2 | 102119 | 11 | 0.80 | 6.88 | 5.09 | 4 |
Cooling | 3 | 1133 | 0.1 | 0.41 | 31.26 | 23.87 | 18 |
Cooling | 4 | 17541 | 1.9 | 0.78 | 7.39 | 5.32 | 4 |
Cooling | 5 | 256515 | 28 | 0.67 | 5.4 | 3.82 | 3 |
Cooling | 6 | 440522 | 48 | 0.84 | 5.64 | 3.81 | 3 |
Cooling | 7 | 73663 | 8 | 0.77 | 5.62 | 3.56 | 2 |
Cooling | 8 | 620 | 0.1 | 0.30 | 11.25 | 6 | 4 |
State
Model | State | residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
---|---|---|---|---|---|---|---|
Star Rating | ACT | 10776 | 1.2 | 0.75 | 0.42 | 0.31 | 0.2 |
Star Rating | NSW | 351468 | 39 | 0.74 | 0.52 | 0.39 | 0.3 |
Star Rating | NT | 4162 | 0.5 | 0.57 | 0.53 | 0.41 | 0.3 |
Star Rating | QLD | 106773 | 12 | 0.60 | 0.65 | 0.5 | 0.4 |
Star Rating | SA | 26894 | 3.0 | 0.43 | 0.35 | 0.24 | 0.2 |
Star Rating | TAS | 21609 | 2.4 | 0.60 | 0.39 | 0.28 | 0.2 |
Star Rating | VIC | 358207 | 39 | 0.90 | 0.34 | 0.22 | 0.1 |
Star Rating | WA | 30358 | 3.3 | 0.47 | 0.63 | 0.43 | 0.3 |
Model | State | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Heating | ACT | 10776 | 1.2 | 0.76 | 17.8 | 12.49 | 9 |
Heating | NSW | 351468 | 39 | 0.90 | 8.69 | 6.1 | 4 |
Heating | NT | 4162 | 0.5 | 0.88 | 5.73 | 2.37 | 0 |
Heating | QLD | 106773 | 12 | 0.77 | 5.83 | 3.77 | 3 |
Heating | SA | 26894 | 3.0 | 0.96 | 10.16 | 7.13 | 6 |
Heating | TAS | 21609 | 2.4 | 0.79 | 17.74 | 11.41 | 9 |
Heating | VIC | 358207 | 39 | 0.92 | 22.72 | 10.95 | 6 |
Heating | WA | 30358 | 3.3 | 0.60 | 12.14 | 7.78 | 5 |
Model | State | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Cooling | ACT | 10776 | 1.2 | 0.68 | 6.65 | 4.43 | 3 |
Cooling | NSW | 351468 | 39 | 0.81 | 5.77 | 4.14 | 3 |
Cooling | NT | 4162 | 0.5 | 0.92 | 28.33 | 20.8 | 16 |
Cooling | QLD | 106773 | 12 | 0.93 | 8.92 | 6.13 | 4 |
Cooling | SA | 26894 | 3.0 | 0.78 | 6.84 | 5.02 | 4 |
Cooling | TAS | 21609 | 2.4 | 0.40 | 4.38 | 2.44 | 1 |
Cooling | VIC | 358207 | 39 | 0.76 | 5.31 | 3.45 | 2 |
Cooling | WA | 30358 | 3.3 | 0.88 | 9.72 | 5.24 | 3 |
Dwelling type
Model | Dwelling type | residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
---|---|---|---|---|---|---|---|
Star Rating | Apartment | 236283 | 26 | 0.74 | 0.58 | 0.44 | 0.3 |
Star Rating | House | 673964 | 74 | 0.81 | 0.43 | 0.29 | 0.2 |
Model | Dwelling type | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Heating | Apartment | 236283 | 26 | 0.85 | 10.1 | 6.8 | 5 |
Heating | House | 673964 | 74 | 0.95 | 17.62 | 8.44 | 5 |
Model | Dwelling type | residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
---|---|---|---|---|---|---|---|
Cooling | Apartment | 236283 | 26 | 0.77 | 6.03 | 4.18 | 3 |
Cooling | House | 673964 | 74 | 0.95 | 6.68 | 4.22 | 3 |
Accuracy over predicted value range
The models have different accuracies depending on the predicted value: they are more accurate where there is more training data, and less accurate where there is less training data. The following charts show the mean absolute error obtained at different predicted output values.
Caveats and Recommendations
- The training data comprises 95% homes built post-2016, 3.7% homes built pre-2016 and renovated post-2016, and 1.6% homes built pre-2016 with no post-2016 renovation. This results in reduced prediction accuracy on pre-2016 homes, and therefore the RapidRate model may not yet be fully representative of all Australian housing.
- The quality of the user input data affects the quality of the prediction.
- Monotonicity of the predicted output with respect to the model inputs has not been enforced. Varying an input value across a range may therefore not produce a smoothly increasing or decreasing prediction.
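One way to see the monotonicity caveat in practice is to sweep a single input across a range while holding the others fixed, then check whether the predictions only rise or only fall. The sketch below shows such a check. The predictor is a toy stand-in with invented thresholds and outputs, not the RapidRate™ model, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence


def sweep(predict: Callable[[float], float], values: Sequence[float]) -> List[float]:
    """Evaluate a single-input predictor at each value in the sweep."""
    return [predict(v) for v in values]


def is_monotonic(outputs: Sequence[float], tolerance: float = 0.0) -> bool:
    """True if outputs never reverse direction by more than `tolerance`."""
    diffs = [b - a for a, b in zip(outputs, outputs[1:])]
    non_decreasing = all(d >= -tolerance for d in diffs)
    non_increasing = all(d <= tolerance for d in diffs)
    return non_decreasing or non_increasing


def toy_star_rating(insulation_r_value: float) -> float:
    """Toy step-function predictor with a small dip (illustrative values only)."""
    if insulation_r_value < 1.0:
        return 4.8
    if insulation_r_value < 2.0:
        return 5.6
    if insulation_r_value < 3.0:
        return 5.5  # slight dip despite more insulation
    return 6.3


outputs = sweep(toy_star_rating, [0.5, 1.5, 2.5, 3.5])
print(outputs, is_monotonic(outputs), is_monotonic(outputs, tolerance=0.2))
```

The `tolerance` argument lets a user ignore reversals smaller than a chosen size, for example ones well inside the model's typical error.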
Last updated: 19/12/2024