RapidRate model

Model Details

Developed by research scientists at CSIRO, 2023

LightGBM regression model

Star Rating – Version 13; Heating (MJ/m²) – Version 11; Cooling (MJ/m²) – Version 11

Intended Use

RapidRate™ is a machine learning tool developed by CSIRO that quickly rates the energy efficiency of a dwelling from a relatively small number of inputs.

RapidRate™ generates an estimated Star Rating aligned with the Nationwide House Energy Rating Scheme (NatHERS), and estimates both heating and cooling energy loads. Each output is an estimate with a prediction interval, which indicates the accuracy at the individual dwelling level.

RapidRate™ is suited to energy assessments where data, expertise, or time for a detailed assessment is limited. It offers benefits for a range of users:

  • Homeowners and renters could use RapidRate™ to assess the energy efficiency of their own homes and identify areas for improvement.
  • Urban energy modellers could deploy RapidRate™ to quickly assess urban-level energy consumption.
  • Housing data providers could enrich their datasets by using RapidRate™ to generate energy efficiency attributes for individual dwellings.
  • Financial institutions could use RapidRate™ to gain a better understanding of the energy efficiency of their housing portfolios.

Note: RapidRate™ scores are not suitable for National Construction Code certification.

Context

Heating and cooling measures are the annual thermal performance loads, measured in units of MJ/m². They estimate the annual energy required for heating and cooling a residence, based on assumptions about occupancy and thermal comfort.

Star Ratings (which range from 0 to 10) provide an overall measure of the thermal comfort of homes, based on how much heating and cooling is required to maintain a comfortable temperature and the geographical location of the residence.

Factors

Inputs include dwelling type (house or apartment); floor area; external wall area by orientation; window area by orientation (including area double-glazed); main wall, floor and roof materials (including insulation); and postcode.

| Input | Description | Feature importance |
|---|---|---|
| Dwelling type | House or apartment | Low |
| Postcode | Postcode of the dwelling | High |
| Floor area – conditioned | Conditioned floor area in square meters | High |
| Floor area – unconditioned | Unconditioned floor area in square meters | Med |
| Floor area – garage | Garage floor area in square meters | Low/Med |
| External wall area by orientation | External wall area in square meters (including any windows or doors set into the wall), by orientation | Low/Med |
| Total external wall area | Sum of external wall areas at all orientations | High |
| Window area by orientation | Window area in square meters | High |
| Total window area | Sum of window areas at all orientations | High |
| Window area % double glazed by orientation | Percentage of window area that is double glazed at each orientation | Low |
| Total double glazed window area | Sum of window areas that are double glazed at all orientations | High |
| Main external wall construction type | Main wall materials | Low |
| Main floor construction type | Main floor materials | Med/High |
| Main roof construction type | Main roof materials | Med (roof type none) |
| Wall, floor, roof insulation levels | R value of insulation of wall, floor and roof | High |
| Site exposure | How open or protected the area surrounding a dwelling is | Med |
| Project type | Reflects the age of the dwelling and whether it has been renovated | High |
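
The "total" inputs in the table are sums over orientations. The following is a minimal sketch of how they could be derived from per-orientation values. The field names, the four-orientation layout and the function name are illustrative assumptions, not RapidRate's actual input schema, and the numbers are made up.

```python
# Hypothetical helper: derive the "total" features from per-orientation inputs.
# Orientation keys and output names are assumptions made for illustration.
def derive_totals(wall_area, window_area, pct_double_glazed):
    """Return total wall, window and double-glazed window areas in m²."""
    total_wall = sum(wall_area.values())
    total_window = sum(window_area.values())
    # Double-glazed area per orientation = window area × percentage / 100.
    total_double_glazed = sum(
        area * pct_double_glazed.get(orientation, 0.0) / 100.0
        for orientation, area in window_area.items()
    )
    return {
        "total_external_wall_area": total_wall,
        "total_window_area": total_window,
        "total_double_glazed_window_area": total_double_glazed,
    }


# Made-up example dwelling (areas in m², glazing in %).
example = derive_totals(
    wall_area={"N": 30.0, "E": 20.0, "S": 30.0, "W": 20.0},
    window_area={"N": 8.0, "E": 2.0, "S": 4.0, "W": 2.0},
    pct_double_glazed={"N": 100.0, "E": 0.0, "S": 50.0, "W": 0.0},
)
```

For this example dwelling the totals are 100 m² of external wall, 16 m² of window and 10 m² of double-glazed window.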

Training Data

The model is trained with data from the NatHERS Universal Certificates collected by CSIRO since June 2016. A Universal Certificate is the assessment pathway used by most new dwellings in Australia to comply with the energy efficiency requirements in Australia’s National Construction Code. Around 130,000 certificates are added to our database each year.

Evaluation Data

Test data is used to evaluate the model’s performance separately from the training data. The model, which has never encountered this test data before, is assessed for the accuracy of its predictions.

Metrics

Evaluation metrics include R², root mean square error, mean absolute error, and median absolute error. These indicate overall model performance.

  • R², or R-squared, measures how well a regression model fits the data. If R² is 1, the model’s predictions are perfect. An R² of 0.8 means the model accounts for 80% of the variance in the data, which suggests a strong fit, while an R² of 0 means the model does no better than always predicting the mean. R² usually lies between 0 and 1, but it can be negative when the model performs worse than predicting the mean, for example on a subset of the data.
  • Root mean square error (RMSE) is the square root of the average squared difference between predictions and actual values, so large errors are weighted more heavily. A low RMSE means the model’s predictions are close to the actual values; a high RMSE means they are further away. RMSE values can range from 0 to 500 MJ/m² (Heating) / 212.5 MJ/m² (Cooling) / 5 Stars (Star Rating).
  • Mean absolute error and median absolute error measure the typical distance between the model’s predictions and the actual values, with lower values indicating more accurate predictions. The mean is the average of the absolute errors, and the median is the middle value when the absolute errors are sorted. Mean and median absolute error values can range from 0 to 1000 MJ/m² (Heating) / 425 MJ/m² (Cooling) / 10 Stars (Star Rating).
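
The four metrics above can be computed from paired actual and predicted values as in the sketch below. It uses only the Python standard library; the function name and the sample values are illustrative assumptions and are not taken from the RapidRate evaluation.

```python
import math
import statistics


def regression_metrics(actual, predicted):
    """Compute R², RMSE, mean absolute error and median absolute error."""
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    mean_actual = statistics.fmean(actual)
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        # Undefined (division by zero) if all actual values are identical.
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / len(errors)),
        "mae": statistics.fmean(abs_errors),
        "median_ae": statistics.median(abs_errors),
    }


# Made-up Star Ratings for four dwellings.
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 9.0]
metrics = regression_metrics(actual, predicted)
```

In this made-up example the single one-star error pulls RMSE above the mean absolute error, which shows how RMSE penalises larger errors more heavily.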

Quantitative Analyses

Please see the Metrics section for explanation of the metrics displayed here (R², RMSE, mean absolute error, median absolute error), and their potential ranges.

The following tables provide model accuracy measures for the entire model:

| Model | Residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
|---|---|---|---|---|---|
| Star Rating | 910247 | 0.79 | 0.48 | 0.33 | 0.2 |

| Model | Residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|
| Cooling | 910247 | 0.94 | 6.6 | 4.2 | 3 |
| Heating | 910247 | 0.95 | 16.5 | 8.1 | 5 |

The following tables provide model accuracy measures over subsets of the data, for profiling features of particular interest:

Project type

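| Model | Project type | Residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
|---|---|---|---|---|---|---|---|
| Star Rating | Existing | 14869 | 1.6 | 0.84 | 0.59 | 0.43 | 0.3 |
| Star Rating | New | 861264 | 95 | 0.69 | 0.46 | 0.32 | 0.2 |
| Star Rating | Renovation | 34114 | 3.7 | 0.76 | 0.68 | 0.5 | 0.4 |

| Model | Project type | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Heating | Existing | 14869 | 1.6 | 0.78 | 83.9 | 61.62 | 45 |
| Heating | New | 861264 | 95 | 0.96 | 9.39 | 6.36 | 4 |
| Heating | Renovation | 34114 | 3.7 | 0.80 | 39.43 | 26.6 | 18 |

| Model | Project type | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Cooling | Existing | 14869 | 1.6 | 0.73 | 12.47 | 8.45 | 6 |
| Cooling | New | 861264 | 95 | 0.94 | 6.26 | 4.08 | 3 |
| Cooling | Renovation | 34114 | 3.7 | 0.83 | 8.83 | 5.61 | 4 |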

National Construction Code climate zone

Each dwelling postcode was mapped to a National Construction Code climate zone. Please see https://www.abcb.gov.au/resources/climate-zone-map for climate zone map and climate zone descriptions.

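| Model | NCC climate zone | Residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
|---|---|---|---|---|---|---|---|
| Star Rating | 1 | 18134 | 2.0 | 0.48 | 0.62 | 0.48 | 0.4 |
| Star Rating | 2 | 102119 | 11 | 0.62 | 0.65 | 0.5 | 0.4 |
| Star Rating | 3 | 1133 | 0.1 | 0.58 | 0.65 | 0.53 | 0.5 |
| Star Rating | 4 | 17541 | 1.9 | 0.65 | 0.43 | 0.29 | 0.2 |
| Star Rating | 5 | 256515 | 28 | 0.70 | 0.55 | 0.41 | 0.3 |
| Star Rating | 6 | 440522 | 48 | 0.87 | 0.37 | 0.25 | 0.2 |
| Star Rating | 7 | 73663 | 8 | 0.85 | 0.38 | 0.26 | 0.2 |
| Star Rating | 8 | 620 | 0.1 | 0.39 | 0.68 | 0.51 | 0.4 |

| Model | NCC climate zone | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Heating | 1 | 18134 | 2.0 | -11.01* | 4.78 | 1.34 | 0 |
| Heating | 2 | 102119 | 11 | 0.66 | 5.64 | 3.95 | 3 |
| Heating | 3 | 1133 | 0.1 | 0.68 | 15.04 | 12.21 | 11 |
| Heating | 4 | 17541 | 1.9 | 0.94 | 14.4 | 9.46 | 7 |
| Heating | 5 | 256515 | 28 | 0.70 | 8.19 | 5.95 | 5 |
| Heating | 6 | 440522 | 48 | 0.93 | 19.17 | 9.37 | 5 |
| Heating | 7 | 73663 | 8 | 0.91 | 24.32 | 13.53 | 8 |
| Heating | 8 | 620 | 0.1 | -0.19* | 71.69 | 58.96 | 55 |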

*R² values are usually between 0 and 1, but negative values can occur when looking at subsets of the data, and indicate that the model performs poorly for those subsets. Here, the heating model performs poorly for National Construction Code climate zones 1 and 8. The negative R² for climate zone 8 is caused by the relatively few residences in that climate zone. For climate zone 1, the large negative R² is driven by the low need for heating in this climate zone and the presence of a few outliers in the training dataset.
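
The sketch below illustrates how a negative R² can arise on a subset where the actual values barely vary, as with heating loads in a warm climate. The numbers are made up for illustration and are not RapidRate data. When the actual values are all close to their mean, the total sum of squares is small, so a single large error is enough to push R² well below zero.

```python
def r_squared(actual, predicted):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up heating loads (MJ/m²) for a warm-climate subset: almost no heating.
actual = [0.0, 1.0, 0.0, 2.0, 1.0]
# Four predictions are exact; one is an outlier that is 10 MJ/m² too high.
predicted = [0.0, 1.0, 0.0, 2.0, 11.0]

subset_r2 = r_squared(actual, predicted)
abs_errors = sorted(abs(a - p) for a, p in zip(actual, predicted))
median_abs_error = abs_errors[len(abs_errors) // 2]  # odd-length list
```

Here the median absolute error is 0 while R² is strongly negative, so the two metrics can tell very different stories on a low-variance subset.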

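| Model | NCC climate zone | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Cooling | 1 | 18134 | 2.0 | 0.95 | 21.03 | 14.8 | 11 |
| Cooling | 2 | 102119 | 11 | 0.80 | 6.88 | 5.09 | 4 |
| Cooling | 3 | 1133 | 0.1 | 0.41 | 31.26 | 23.87 | 18 |
| Cooling | 4 | 17541 | 1.9 | 0.78 | 7.39 | 5.32 | 4 |
| Cooling | 5 | 256515 | 28 | 0.67 | 5.4 | 3.82 | 3 |
| Cooling | 6 | 440522 | 48 | 0.84 | 5.64 | 3.81 | 3 |
| Cooling | 7 | 73663 | 8 | 0.77 | 5.62 | 3.56 | 2 |
| Cooling | 8 | 620 | 0.1 | 0.30 | 11.2564† | † | † |

†The RMSE, mean absolute error and median absolute error for climate zone 8 are run together in the source as “11.2564”; the individual values cannot be separated unambiguously.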

State

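| Model | State | Residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
|---|---|---|---|---|---|---|---|
| Star Rating | ACT | 10776 | 1.2 | 0.75 | 0.42 | 0.31 | 0.2 |
| Star Rating | NSW | 351468 | 39 | 0.74 | 0.52 | 0.39 | 0.3 |
| Star Rating | NT | 4162 | 0.5 | 0.57 | 0.53 | 0.41 | 0.3 |
| Star Rating | QLD | 106773 | 12 | 0.60 | 0.65 | 0.5 | 0.4 |
| Star Rating | SA | 26894 | 3.0 | 0.43 | 0.35 | 0.24 | 0.2 |
| Star Rating | TAS | 21609 | 2.4 | 0.60 | 0.39 | 0.28 | 0.2 |
| Star Rating | VIC | 358207 | 39 | 0.90 | 0.34 | 0.22 | 0.1 |
| Star Rating | WA | 30358 | 3.3 | 0.47 | 0.63 | 0.43 | 0.3 |

| Model | State | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Heating | ACT | 10776 | 1.2 | 0.76 | 17.8 | 12.49 | 9 |
| Heating | NSW | 351468 | 39 | 0.90 | 8.69 | 6.1 | 4 |
| Heating | NT | 4162 | 0.5 | 0.88 | 5.73 | 2.37 | 0 |
| Heating | QLD | 106773 | 12 | 0.77 | 5.83 | 3.77 | 3 |
| Heating | SA | 26894 | 3.0 | 0.96 | 10.16 | 7.13 | 6 |
| Heating | TAS | 21609 | 2.4 | 0.79 | 17.74 | 11.41 | 9 |
| Heating | VIC | 358207 | 39 | 0.92 | 22.72 | 10.95 | 6 |
| Heating | WA | 30358 | 3.3 | 0.60 | 12.14 | 7.78 | 5 |

| Model | State | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Cooling | ACT | 10776 | 1.2 | 0.68 | 6.65 | 4.43 | 3 |
| Cooling | NSW | 351468 | 39 | 0.81 | 5.77 | 4.14 | 3 |
| Cooling | NT | 4162 | 0.5 | 0.92 | 28.33 | 20.8 | 16 |
| Cooling | QLD | 106773 | 12 | 0.93 | 8.92 | 6.13 | 4 |
| Cooling | SA | 26894 | 3.0 | 0.78 | 6.84 | 5.02 | 4 |
| Cooling | TAS | 21609 | 2.4 | 0.40 | 4.38 | 2.44 | 1 |
| Cooling | VIC | 358207 | 39 | 0.76 | 5.31 | 3.45 | 2 |
| Cooling | WA | 30358 | 3.3 | 0.88 | 9.72 | 5.24 | 3 |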

Dwelling type

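| Model | Dwelling type | Residences | % residences | R² | RMSE (stars) | Mean absolute error (stars) | Median absolute error (stars) |
|---|---|---|---|---|---|---|---|
| Star Rating | Apartment | 236283 | 26 | 0.74 | 0.58 | 0.44 | 0.3 |
| Star Rating | House | 673964 | 74 | 0.81 | 0.43 | 0.29 | 0.2 |

| Model | Dwelling type | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Heating | Apartment | 236283 | 26 | 0.85 | 10.1 | 6.8 | 5 |
| Heating | House | 673964 | 74 | 0.95 | 17.62 | 8.44 | 5 |

| Model | Dwelling type | Residences | % residences | R² | RMSE (MJ/m²) | Mean absolute error (MJ/m²) | Median absolute error (MJ/m²) |
|---|---|---|---|---|---|---|---|
| Cooling | Apartment | 236283 | 26 | 0.77 | 6.03 | 4.18 | 3 |
| Cooling | House | 673964 | 74 | 0.95 | 6.68 | 4.22 | 3 |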

Accuracy over predicted value range

The models have different accuracies depending on the predicted value: they are more accurate where there is more training data, and less accurate where there is less training data. The following charts show the mean absolute error obtained at different predicted output values.

Caveats and Recommendations

  • The training data comprises 95% homes built post-2016, 3.7% homes built pre-2016 and renovated post-2016, and 1.6% homes built pre-2016 with no post-2016 renovation. This results in reduced prediction accuracy on pre-2016 homes, so the RapidRate model may not yet be fully representative of all Australian housing.
  • The quality of the user input data affects the quality of the prediction.
  • The model does not enforce monotonicity of the predicted output with respect to the model inputs. Varying an input value across a range may not produce a smoothly increasing or decreasing prediction.
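
The monotonicity caveat can be checked for a given dwelling by sweeping one input while holding the others fixed. The sketch below is a minimal illustration under stated assumptions: `toy_predict` is a made-up stand-in with step-like behaviour typical of tree ensembles, not the RapidRate model, and the input name `roof_insulation_r` is hypothetical.

```python
def sweep_input(predict, base_inputs, name, candidates):
    """Predict repeatedly, varying one input and holding the rest fixed."""
    return [predict({**base_inputs, name: value}) for value in candidates]


def is_monotonic(values, tolerance=0.0):
    """True if the sequence never decreases, or never increases."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    non_decreasing = all(d >= -tolerance for d in diffs)
    non_increasing = all(d <= tolerance for d in diffs)
    return non_decreasing or non_increasing


def toy_predict(inputs):
    """Stand-in predictor (an assumption, NOT RapidRate): steps up, then dips."""
    r_value = inputs["roof_insulation_r"]
    stars = 5.0
    if r_value >= 2.0:
        stars += 0.8
    if r_value >= 4.0:
        stars -= 0.1  # small dip that breaks monotonicity
    return stars


predictions = sweep_input(
    toy_predict,
    base_inputs={"roof_insulation_r": 0.0},
    name="roof_insulation_r",
    candidates=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
)
sweep_is_monotonic = is_monotonic(predictions)
```

A sweep like this only detects non-monotonic behaviour for the dwelling being tested. LightGBM also offers a `monotone_constraints` training parameter that enforces monotonicity per feature; this card does not say whether it would suit RapidRate.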

Last updated: 19/12/2024

Supported by data from the Nationwide House Energy Rating Scheme (NatHERS) www.nathers.gov.au