8 Limitations

Clustering is still a developing field, and its use comes with several limitations.
First, clustering quality depends directly on the reliability of the input data. Yet, as discussed in Section 6.1, some QBuildings data, notably the hot water demand, still require thorough validation against field data. This is all the more important because hot water demand has proved to be an essential criterion for differentiating districts.

Furthermore, the whole district concept takes the transformer to which the buildings are connected as its basic unit. A district is therefore made up of the catchment area of a transformer, together with all the buildings and endogenous resources it contains. However, the transformers used are synthetic (Gupta, Sossan and Paolone, 2020) and therefore not necessarily representative of reality. While this does not change why one area is considered similar to or different from another, it does prevent validation of the case study. Indeed, the synthetic network has most likely created differences that would not exist with real transformers. For instance, it is doubtful that a real transformer serves only two buildings, yet this case occurred with the synthetic transformers. Such artefacts increase the likelihood of extreme cases.
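As an illustration of how such suspicious districts could be screened, the following minimal sketch counts the buildings attached to each transformer and flags those below a threshold. The function name `small_districts`, the threshold `min_buildings`, and the building-to-transformer mapping are all hypothetical and are not taken from the thesis workflow or from QBuildings.

```python
from collections import Counter


def small_districts(building_to_transformer, min_buildings=3):
    """Return transformers serving fewer than `min_buildings` buildings.

    `building_to_transformer` maps a building id to a transformer id.
    Both the data layout and the threshold are assumptions for illustration.
    """
    counts = Counter(building_to_transformer.values())
    return {t: n for t, n in counts.items() if n < min_buildings}


# Made-up mapping: T1 serves two buildings, T2 three, T3 only one.
mapping = {
    "b1": "T1", "b2": "T1",
    "b3": "T2", "b4": "T2", "b5": "T2",
    "b6": "T3",
}
flagged = small_districts(mapping)
print(flagged)
```

Districts flagged in this way are candidates for the extreme cases mentioned above and could be inspected separately before interpreting the clusters.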

This last point leads to the most critical limitation: the difficulty of analysing and validating a clustering. As shown in Section 2, clustering results can vary considerably: simply allowing the choice of 25 or 50 components yields very different results. That is why stability is measured with the Rand index. However, a stability measure does not help in interpreting the result. As defined in the methodology, this interpretation was to be based on the results of the REHO optimisation of the districts. As discussed in Section 7.2, this could not be done because the current REHO model is incompatible with some non-residential buildings. Consequently, the original goal could not be tested, the results could not be assessed against it, and no definitive preferred methodology could be identified.
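To make the stability measure concrete, the sketch below computes the plain (unadjusted) Rand index between two labelings of the same districts: the fraction of district pairs on which both clusterings agree, either by grouping the pair together or by separating it. It is a minimal pure-Python illustration, not the implementation used in this work; the label vectors `run_a` and `run_b` are invented, and the thesis may rely on a library routine or on the adjusted variant.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Plain Rand index: share of sample pairs treated alike by both clusterings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two samples")
    agree = 0
    total = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]  # pair grouped together in run A?
        same_b = labels_b[i] == labels_b[j]  # pair grouped together in run B?
        agree += same_a == same_b            # both together or both apart
        total += 1
    return agree / total


# Hypothetical cluster labels for six districts from two clustering runs.
run_a = [0, 0, 1, 1, 2, 2]
run_b = [1, 1, 0, 0, 0, 2]
score = rand_index(run_a, run_b)
print(score)
```

A value of 1 means the two runs produce identical partitions up to a renaming of the clusters. The index quantifies agreement only and says nothing about whether either partition is meaningful, which is the interpretation gap described above.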

Finally, the clustering was performed on a single case study, the canton of Geneva, chosen for the amount of data available there (notably for geothermal energy). This raises two questions:

  • To what extent can the typical districts defined in this case study be extended to other regions? For example, Switzerland is a very mountainous country, whereas the canton of Geneva has no mountainous areas. The clustering could therefore not identify patterns corresponding to a mountainous district. Moreover, because the study covers only a restricted region, the meteorological data taken into account do not vary, which may have prevented the received irradiance from being as decisive as it could be.

  • How much geothermal data is available for other regions so that the clustering can be redone in other places?

References

Gupta, R., Sossan, F. and Paolone, M. (2020) ‘Countrywide PV hosting capacity and energy storage requirements for distribution networks: The case of Switzerland’, Applied Energy [Preprint]. doi:10.1016/j.apenergy.2020.116010.

© EPFL-IPESE 2022

Master thesis, Spring 2022

Joseph Loustau