Attribute | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|
area_envelope_m2 | float64 | Computed | area_envelope_m2 = area_roof_m2 + area_facade_m2 + area_footprint_m2 | Computed area_envelope_m2 | |
area_facade_cadastre_m2 | float64 | Computed | area_facade_cadastre_m2 = length_footprint_m x height_sb3d_m | Computed area_facade_cadastre_m2 | |
area_facade_m2 | float64 | Computed | if length_facade_m > length_footprint_m (area_facade_m2 = area_facade_cadastre_m2) else area_facade_m2 = area_facade_solar_m2 + (length_footprint_m - length_facade_m) * height_sb3d_m | Computed if length_facade_m > length_footprint_m (area_facade_m2), otherwise calculated differently | |
area_facade_solar_m2 | float64 | Exterior data | Sonnendach | Exterior data | |
area_footprint_cadastre_m2 | float64 | Exterior data | TLM_footprint | Exterior data | |
area_roof_m2 | float64 | Produced | area_roof_m2 = max(area_footprint_m2, area_roof_solar_m2) | Produced area_roof_m2 | |
area_roof_solar_m2 | float64 | Exterior data | Sonnendach | Exterior data | |
coord_X | float64 | Produced | representative_point() | Produced representative X coordinate | |
coord_Y | float64 | Produced | representative_point() | Produced representative Y coordinate | |
coord_Z0 | float64 | Computed | min(z) | Computed minimum Z coordinate | |
coord_Z1 | float64 | Computed | mean(z >= z0 + (z2 - z0) / 2) | Computed mean Z coordinate | |
coord_Z2 | float64 | Computed | max(z) | Computed maximum Z coordinate | |
geometry | geometry | Produced | RegBL, SB3D, TLM_footprint | RegBL, SB3D, TLM_footprint | Produced Geometry object, polygons |
height_overall_sb3d_m | float64 | Produced | height_overall_sb3d_m = coord_Z2 - coord_Z0 | Produced maximum height of the building | |
height_sb3d_m | float64 | Computed | height_sb3d_m = coord_Z1 - coord_Z0 | Computed mean height of the building | |
id_building | object | Produced | UUID from Qbuildings by database | Produced UUID from Qbuildings by database | |
id_building3D | object | Exterior data | SB3D | SB3D | Exterior data |
length_facade_m | float64 | Exterior data | Sonnendach | Sonnendach | Exterior data |
length_footprint_cadastre_m | float64 | Exterior data | TLM_footprint | TLM_footprint | Exterior data |
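The envelope formulas in the table above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the numeric values are hypothetical and are not taken from the repository; only the formulas come from the table.

```python
def facade_area(length_facade_m, length_footprint_m, height_sb3d_m, area_facade_solar_m2):
    """area_facade_m2, following the rule given in the table."""
    # Cadastre-based estimate: footprint perimeter times mean building height.
    area_facade_cadastre_m2 = length_footprint_m * height_sb3d_m
    if length_facade_m > length_footprint_m:
        return area_facade_cadastre_m2
    # Otherwise, the solar facades are completed by the perimeter they do not cover.
    return area_facade_solar_m2 + (length_footprint_m - length_facade_m) * height_sb3d_m


def envelope_area(area_footprint_m2, area_roof_solar_m2, area_facade_m2):
    """area_envelope_m2 = roof + facade + footprint."""
    area_roof_m2 = max(area_footprint_m2, area_roof_solar_m2)
    return area_roof_m2 + area_facade_m2 + area_footprint_m2


# Hypothetical building used only to exercise the formulas.
coord_Z0, coord_Z1 = 400.0, 412.0
height_sb3d_m = coord_Z1 - coord_Z0
area_facade_m2 = facade_area(30.0, 40.0, height_sb3d_m, 300.0)
area_envelope_m2 = envelope_area(100.0, 110.0, area_facade_m2)
```

With these example values, the facade length (30 m) is shorter than the footprint perimeter (40 m), so the second branch of the rule applies.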
2 Repository
The database is built by the Python repository GBuildings, which relies on:
- python >=3.9: major release of the Python programming language
- SQLAlchemy: Python library for interacting with SQL databases
- geopandas: Python library to manipulate geographic data (geospatial extension of pandas)
The project is composed of a model folder where the database is built and a data analysis folder where additional analysis on the data obtained from the model can be performed.
2.1 Model
The model has 5 folders:
- Aggregated/Processed/Smoothed: each of these three folders builds the database schema of the same name.
- Postprocessing: contains scripts that modify the database after it has been built.
- Scripts: groups everything else (run examples, utility functions, etc.).
2.1.1 Aggregated
run.py : main file that uses the following dependencies to work
Settings
Class to read and import database.ini into the class “settings”, for each project
Dictionary | |
---|---|
CRS | projection |
input_directory | location of the input layers |
building | cadastre + 3D + roofs + facades |
building_register | RegBL + link table |
output_directory | location for export |
task | list of the tasks to be done |
output_dict | name of the columns for each table, corresponding to the ones of the db |
meteo | Contains temperature and irradiance for each hour of the year |
Meteonorm
Class to read the meteo file as a function of the “mask” location
attribute | type |
---|---|
hy | int64 |
m | int64 |
dm | int64 |
h | int64 |
G_Gh | int64 |
G_Dh | int64 |
G_Gk | int64 |
G_Dk | int64 |
G_Bn | int64 |
Ta | float64 |
Building
Class that imports external layers describing building characteristics (cadastre, building3d, roof, facade) and aggregates them into a ‘footprint’. That footprint is then used as the reference geometry to retrieve all data that fall within it.
Register (RegBL)
Class to read different versions of RegBL/RCB, map them to a reference database (status, class, period, system, energy source) and integrate an association table (linking points with building, district and grid ID). It also computes the ERA and adds the height to the building layer.
Attribute | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|
area_era_m2 | float64 | Mixed | area_era_m2 = 0.93 x area_footprint_m2 x count_floor | RegBL; KHOURY, Jad. Assessment of Geneva multi-family building stock: main characteristics and regression models for energy reference area determination. 2016 | Either available in RegBL or estimated from RegBL footprint |
area_footprint_m2 | int64 | Exterior data | RegBL | ||
area_net_floor_m2 | float64 | Computed | area_net_floor_m2 = area_era_m2 / 1.245 | KHOURY, Jad. Assessment of Geneva multi-family building stock: main characteristics and regression models for energy reference area determination. 2016 | |
class | object | Exterior data | RegBL | RegBL building type classification | |
class_380/1 | object | Exterior data | SIA 380/1 | SIA building type classification | |
count_flat | float64 | Exterior data | RegBL | ||
count_floor | float64 | Exterior data | RegBL | ||
geometry | geometry | Exterior data | RegBL | Point | |
id_building | object | Produced | UUID from Qbuildings | ||
ID_egid | object | Exterior data | RegBL | ||
is_hotwater_system | bool | Computed | RegBL | ||
period | object | Exterior data | construction period | ||
period_renovation | object | Exterior data | |||
source_heating | object | Exterior data | RegBL | ||
source_hotwater | object | Exterior data | RegBL | ||
standard | object | Produced | |||
status | object | Exterior data | RegBL | ||
system_heating | object | Exterior data | RegBL | ||
system_hotwater | object | Exterior data | RegBL |
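The "Mixed" method for area_era_m2 can be illustrated as follows: use the RegBL value when it exists, otherwise fall back on the footprint-based estimate from the table. This is a minimal sketch; the function names and the `area_era_regbl_m2` argument are hypothetical, and only the coefficients 0.93 and 1.245 come from the table.

```python
def estimate_era(area_footprint_m2, count_floor, area_era_regbl_m2=None):
    """Energy reference area: RegBL value if available, else estimated."""
    if area_era_regbl_m2 is not None:
        return area_era_regbl_m2
    # Estimate from the RegBL footprint and the number of floors.
    return 0.93 * area_footprint_m2 * count_floor


def net_floor_area(area_era_m2):
    """area_net_floor_m2 = area_era_m2 / 1.245."""
    return area_era_m2 / 1.245


# Hypothetical building: 200 m2 footprint, 4 floors, no ERA in RegBL.
area_era_m2 = estimate_era(200.0, 4)
area_net_floor_m2 = net_floor_area(area_era_m2)
```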
SIA 380/1
SIA standard that describes the heating, electrical and hotwater needs per building type.
Attribute | Unit | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|---|
area_capita_380/1_m2/cap | m2/cap | float64 | Exterior data | - | SIA 380/1 | Estimated number of occupants for a given building type /m2 of ERA |
area_envelope_m2 | m2 | float64 | Computed | - | - | sum of the facades + roofs + footprint area |
area_era_m2 | m2 | float64 | Mixed | - | RegBL | - |
area_form_factor_380/1 | - | float64 | Computed | area_form_factor_380/1 = area_envelope_m2 / area_era_m2 | SIA 380/1 | useful for heating needs computation |
capita_cap | cap | float64 | Computed | capita_cap = area_era_m2 / area_capita_380/1_m2/cap | SIA 380/1 | Number of occupants |
capita_presence_380/1_h/d | h/d | float64 | Exterior data | - | SIA 380/1 | Occupancy for given building type, useful to compute heat gains from people |
class_380/1 | - | object | Exterior data | - | SIA 380/1 | Building type classification in SIA 380/1 |
electrical_factor_380/1 | - | float64 | Exterior data | - | SIA 380/1 | Factor of heat gain from appliances |
energy_el_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | - |
energy_el_380/1_kWh/y | kWh/y | float64 | Computed | energy_el_380/1_kWh/y = energy_el_380/1_kWh/(m2.y) * area_era_m2 | - | - |
energy_gain_cap_380/1_kWh/y | kWh/y | float64 | Computed | energy_gain_cap_380/1_kWh/y = power_gain_cap_380/1_W x capita_presence_380/1_h/d *365/1000 | - | - |
energy_gain_el_380/1_kWh/y | kWh/y | float64 | Computed | energy_gain_el_380/1_kWh/y = energy_el_380/1_kWh/y x electrical_factor_380/1 | - | - |
energy_heating_380/1_target_ratio | - | float64 | Exterior data | - | SIA 380/1 | Conversion from heating_limit to heating_target |
energy_heating_base_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | Base heating needs for a given building type |
energy_heating_delta_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | Variation around the base |
energy_heating_limit_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | Limit needs that the building should have |
energy_heating_target_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | - |
energy_hotwater_380/1_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | - | SIA 380/1 | hotwater needed for a given building type (not used) |
flow_fresh_air_380/1_m3/(h.m2) | m3/(h.m2) | float64 | Exterior data | - | SIA 380/1 | - |
id_building | - | object | Produced | - | - | - |
ID_egid | - | object | Exterior data | - | RegBL | - |
power_gain_cap_380/1_W | W | float64 | Exterior data | - | SIA 380/1 | - |
power_gain_cap_380/1_W/cap | W/cap | float64 | Computed | - | - | - |
power_gain_cap_380/1_W/m2 | W/m2 | float64 | Computed | - | - | - |
standard | - | object | Produced | - | - | tells if the building is new or renovated |
temperature_correction_380/1 | - | float64 | Computed | temperature_correction_380/1 = 1 + (9.4 - temperature_exterior_mean_C) * 0.06 | SIA 380/1 | Temperature correction used to compute the limit heating value |
temperature_exterior_mean_C | C | float64 | Exterior data | - | Meteonorm | - |
temperature_interior_380/1_C | C | float64 | Exterior data | - | SIA 380/1 | temperature interior for the given building type |
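A few of the computed SIA 380/1 attributes are simple ratios or products of other columns. The sketch below restates them as plain functions; the function names and input values are hypothetical, and only the formulas are taken from the table.

```python
def occupants(area_era_m2, area_capita_m2_per_cap):
    """capita_cap = area_era_m2 / area_capita_380/1_m2/cap."""
    return area_era_m2 / area_capita_m2_per_cap


def form_factor(area_envelope_m2, area_era_m2):
    """area_form_factor_380/1 = area_envelope_m2 / area_era_m2."""
    return area_envelope_m2 / area_era_m2


def temperature_correction(temperature_exterior_mean_C):
    """Correction factor: 1 + (9.4 - mean exterior temperature) * 0.06."""
    return 1 + (9.4 - temperature_exterior_mean_C) * 0.06


def annual_electricity_kWh(energy_el_kWh_m2_y, area_era_m2):
    """energy_el_380/1_kWh/y = specific demand times ERA."""
    return energy_el_kWh_m2_y * area_era_m2


# Hypothetical building: 800 m2 of ERA, 1000 m2 of envelope.
capita_cap = occupants(800.0, 40.0)
area_form_factor = form_factor(1000.0, 800.0)
correction = temperature_correction(9.4)
energy_el_kWh_y = annual_electricity_kWh(28.0, 800.0)
```

A mean exterior temperature of 9.4 °C gives a correction factor of 1, and colder locations give a factor above 1.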
SIA 2024
SIA standard by rooms. Used in GBuildings for day of use and simultaneity but could be used to double check the values.
Attribute | Unit | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|---|
annual_simultaneity | float64 | Exterior data | SIA2024 | Use for hotwater | ||
area_capita_2024_m2/cap | m2/cap | float64 | Exterior data | SIA2024 | ||
area_glass_fraction_% | % | float64 | Exterior data | SIA2024 | Glass fraction according to the net floor area | |
area_net_floor_m2 | m2 | float64 | Exterior data | RegBL | reference area for SIA 2024 | |
capita_2024_cap | cap | float64 | Computed | capita_2024_cap = area_net_floor_m2 / area_capita_2024_m2/cap | SIA2024 | |
class_380/1 | - | object | Exterior data | Use to convert into class 2024 | ||
day_of_use_2024_d/y | d/y | float64 | Exterior data | SIA2024 | Number of days the hot water is used | |
energy_el_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Computed | Sum of all electrical demands per building type | ||
energy_el_2024_kWh/y | kWh/y | float64 | Computed | energy_el_2024_kWh/y = energy_el_2024_kWh/(y.m2) x area_era_m2 | ||
energy_el_ap_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | SIA2024 | ||
energy_el_ap_2024_kWh/y | kWh/y | float64 | Exterior data | |||
energy_el_lt_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | |||
energy_el_lt_2024_kWh/y | kWh/y | float64 | Exterior data | |||
energy_el_vt_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | |||
energy_el_vt_2024_kWh/y | kWh/y | float64 | Exterior data | |||
energy_heating_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | SIA2024 | Heating needs | |
energy_heating_2024_kWh/y | kWh/y | float64 | Computed | energy_heating_2024_kWh/y = energy_heating_2024_kWh/(y.m2) x area_era_m2 | ||
energy_hotwater_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Exterior data | SIA2024 | Hotwater needs | |
energy_hotwater_2024_kWh/y | kWh/y | float64 | Computed | energy_hotwater_2024_kWh/y = energy_hotwater_2024_kWh/(y.m2) x area_era_m2 | ||
flow_fresh_air_2024_m3/(h.m2) | m3/(h.m2) | float64 | Exterior data | SIA2024 | ||
flow_hotwater_2024_l/(cap.d) | l/(cap.d) | float64 | Exterior data | SIA2024 | ||
flow_infiltration_air_2024_m3/(h.m2) | m3/(h.m2) | float64 | Exterior data | SIA2024 | ||
flow_infiltration_air_2024_m3/h | m3/h | float64 | Exterior data | SIA2024 | ||
flow_water_2024_l/(cap.d) | l/(cap.d) | float64 | Exterior data | SIA2024 | ||
id_building | - | object | Produced | |||
ID_class_2024 | object | Exterior data | SIA2024 | Id of rooms in the building according to SIA 2024 classification | ||
ID_egid | - | object | Exterior data | |||
power_cooling_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_el_2024_W | W | float64 | Computed | |||
power_el_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_el_ap_2024_W | W | float64 | Computed | |||
power_el_ap_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_el_lt_2024_W | W | float64 | Computed | |||
power_el_lt_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_el_vt_2024_W | W | float64 | Computed | |||
power_el_vt_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_gain_ap_2024_W | W | float64 | Computed | |||
power_gain_ap_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_gain_cap_2024_W | W | float64 | Computed | |||
power_gain_cap_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_gain_intern_2024_W | W | float64 | Computed | |||
power_gain_intern_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_gain_lt_2024_W | W | float64 | Computed | |||
power_gain_lt_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_guain_extern_2024_W | W | float64 | Computed | |||
power_guain_extern_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
power_heating_2024_W | W | float64 | Computed | |||
power_heating_2024_W/m2 | W/m2 | float64 | Exterior data | SIA2024 | ||
solar_gain_factor_2024_-/m2 | float64 | Computed | solar_gain_factor_2024_-/m2 = area_facade_m2 x area_glass_fraction_fraction_2024_%/100 /area_net_floor_m2 x power_gain_glass_transmission_facotr_2024 x power_gain_solar_reduction_factor_2024 | SIA2024 | Ratio of facade area to net floor area, times the glass fraction for this kind of room, times the transmission of the glazing (g-value) | |
standard | - | object | Exterior data | |||
standard_sia2024 | - | object | Exterior data |
Solar
Class to read the data from the Sonnendach and Sonnenfassade (different versions)
Roofs
Attribute | Unit | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|---|
area_roof_solar_m2 | m2 | float64 | Exterior data | NA | Sonnendach | Estimated area available |
azimuth | degree | float64 | Exterior data | NA | Sonnendach | orientation of the roof |
egid | int | Exterior data | NA | RegBL | EGID of the building to which the roof belongs | |
heating_need_kWh_y | kWh/y | float64 | Exterior data | NA | Sonnendach | Estimated heating need by EGID, done by the Sonnendach project |
hotwater_need_kWh_y | kWh/y | float64 | Exterior data | NA | Sonnendach | Estimated hotwater need by EGID, done by the Sonnendach project |
id_roof | int | Exterior data | NA | Sonnendach | UUID from the project | |
lenght_roof_m | m | float64 | Exterior data | NA | Sonnendach | Length from geometry |
mean_annual_irr_kWh_m2_y | kWh/(m2.y) | float64 | Exterior data | NA | Sonnendach | |
roof_annual_irr_kWh_y | kWh/y | float64 | Exterior data | NA | Sonnendach | Total irradiation received on the available solar area |
roof_type | object | Exterior data | NA | Sonnendach | ||
tilt | degree | float64 | Exterior data | NA | Sonnendach | slope of the roof |
uuid_swissbuildings3d | object | Exterior data | NA | Sonnendach | UUID from SB3D to which the roof belongs |
Facades
Attribute | Unit | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|---|
area_facade_solar_m2 | m2 | float64 | Exterior data | - | Sonnenfassade | Estimated area available |
azimuth | degree | float64 | Exterior data | - | Sonnenfassade | orientation of the facade |
egid | - | int | Exterior data | - | RegBL | EGID of the building to which the facade belongs |
facade_annual_irr_kWh_y | kWh/y | float64 | Exterior data | - | Sonnenfassade | Total irradiation received on the available solar area |
facade_type | - | object | Exterior data | - | Sonnenfassade | - |
heating_need_kWh_y | kWh/y | float64 | Exterior data | - | Sonnenfassade | Estimated heating need by EGID, done by the Sonnendach project |
hotwater_need_kWh_y | kWh/y | float64 | Exterior data | - | Sonnenfassade | Estimated hotwater need by EGID, done by the Sonnendach project |
id_facade | - | int | Exterior data | - | Sonnenfassade | UUID from the project |
lenght_facade_m | m | float64 | Exterior data | - | Sonnenfassade | Length from geometry |
mean_annual_irr_kWh_m2_y | kWh/(m2.y) | float64 | Exterior data | - | Sonnenfassade | - |
uuid_swissbuildings3d | - | object | Exterior data | - | Sonnenfassade | UUID from SB3D to which the facade belongs |
2.1.2 Processed
Settings
Similar to the one from Aggregated but this time, data are retrieved from the database and not external gpks.
Signature model
Use the data from the table “Aggregated”.buildings to compute the energy signature of the building.
Attribute | Unit | Type | Method | Formula | Reference | Comments |
---|---|---|---|---|---|---|
class | object | Produced | Girardin | Building type corresponding to Luc Girardin’s typification, translated from RegBL class | ||
duration_heating_signature_h/y | h/y | float64 | Computed | duration_heating_signature_h/y = sum(T_ext <= temperature_threshold_heating_C) | Number of hours during which heating is required | |
energy_cooling_signature_kWh/(m2.y) | kWh/(m2.y) | float64 | Computed | energy_cooling_signature_kWh/(m2.y) = sum(thermal_k1_cooling_W/(m2.K) x (T_ext - temperature_threshold_cooling_C)) | Girardin | |
energy_gain_intern_2024_kWh/(m2.y) | kWh/(m2.y) | float64 | Computed | energy_gain_intern_2024_kWh/(m2.y) = power_gain_intern_W_m2/1000 x duration_heating_signature_h/y | ||
energy_heat_loss_signature_kWh/(m2.y) | kWh/(m2.y) | float64 | Computed | energy_heat_loss_signature_kWh/(m2.y) = sum(thermal_k1_heating_W/(m2.y) * (T_ext - temperature_interior_C)) | ||
energy_heating_signature_kWh/(m2.y) | kWh/(m2.y) | float64 | Computed | energy_heating_signature_kWh/(m2.y) = sum(thermal_k1_heating_W/(m2.K) x (T_ext - temperature_threshold_heating_C)) | Girardin | Why not temperature_interior_C instead of threshold? Do we assume it is the same? |
energy_hotwater_signature_kWh/y | kWh/y | float64 | Computed | energy_hotwater_signature_kWh/y = flow_hotwater_l_cap_d x capita_cap x cp x rho x DT x day_of_use x annual_simultaneity | SIA 385/2 | Hot water signature mixing data from SIA 2024, SIA 385/2 and the hot water temperatures |
energy_loss_air_renewal_380/1_m3/(h.m2) | m3/(h.m2) | float64 | Computed | energy_loss_air_renewal_380/1_m3/(h.m2) = temperature_interior_C x flow_thermally_active_air_380/1_m3/(h.m2) x 364 x rhocp_a x 24/100 | SIA 380/1 | Ventilation loss according to formula from SIA 380/1, counting the air renewal |
energy_solar_irradiation_signature_kWh/m2 | kWh/m2 | float64 | Computed | energy_solar_irradiation_signature_kWh/m2 = sum(G_gh) when T_ext <= threshold_heating_C | Sum of irradiations when windows are closed | |
period | object | Produced | Girardin | Building period of construction, translated for Girardin’s typification | ||
power_cooling_signature_W/m2 | W/m2 | float64 | Computed | power_cooling_signature_W/m2 = thermal_k1_cooling_W/(m2.K) x temperature_cooling_dim_C + thermal_k2_cooling_W/m2 | Sizing cooling power | |
power_heating_signature_W/m2 | W/m2 | float64 | Computed | power_heating_signature_W/m2 = thermal_k1_heating_W/(m2.K) x temperature_heating_dim_C + thermal_k2_heating_W/m2 | Sizing heating power | |
power_hotwater_signature_W | W | float64 | Computed | power_hotwater_signature_W = energy_hotwater_signature_kWh/y x 0.0003196456 x 1000 | Polysun | Hot water sizing power, from Polysun profiles. The coefficient is the sum of the power divided by the max power |
rhocp_a | Wh/(m3.K) | float64 | Computed | rhocp_a = (1220 - 0.14 x z) / 3600 | SIA 380/1 | Volumetric thermal capacity |
solar_gain_factor_signature_-/m2 | float64 | Computed | solar_gain_factor_signature_-/m2 = -(energy_heating_signature_kWh/(m2.y) - energy_heat_loss_signature_kWh/(m2.y)) / energy_solar_irradiation_signature_kWh/m2 + energy_gain_intern_2024_kWh/(m2.y) | Part of the irradiation that covers for the heating needs | ||
temperature_cooling_dim_C | C | float64 | Mixed | max(T_ext) | Meteonorm | Sizing temperature for cooling technology |
temperature_cooling_return_C | C | float64 | Produced | Girardin | ||
temperature_cooling_supply_C | C | float64 | Produced | Girardin | ||
temperature_heating_dim_C | C | float64 | Mixed | min(T_ext) | Meteonorm | Sizing temperature for heating technology |
temperature_heating_supply_C | C | float64 | Produced | Girardin | ||
temperature_hotwater_return_C | C | float64 | Produced | Girardin | ||
temperature_hotwater_supply_C | C | float64 | Produced | Girardin | ||
temperature_threshold_cooling_K | C | float64 | Produced | Girardin | ||
temperature_threshold_heating_K | C | float64 | Produced | Girardin | ||
thermal_k1_cooling_W/(m2.y) | W/(m2.y) | float64 | Produced | Girardin | Heat transfer coefficient, for cooling | |
thermal_k1_heating_W/(m2.y) | W/(m2.y) | float64 | Produced | Girardin | Heat transfer coefficient, for heating | |
thermal_k2_cooling_W/m2 | W/m2 | float64 | Computed | thermal_k2_cooling_W/m2 = -thermal_k1_cooling_W/(m2.K) x temperature_threshold_cooling_C | ||
thermal_k2_heating_W/m2 | W/m2 | float64 | Computed | thermal_k2_heating_W/m2 = -thermal_k1_heating_W/(m2.K) x temperature_threshold_heating_C | ||
waste_kg/(cap.y) | kg/(cap.y) | float64 | Produced | Girardin | Solid waste production, only for residential buildings | |
waste_kg/y | kg/y | float64 | Computed | waste_kg/y = waste_kg/(cap.y) x capita_cap | ||
waste_kW | kW | float64 | Computed | waste_kW/y = waste_kg/(cap.y) x capita_cap x 16710.7 / 1000 /8760 | Girardin | Average power available over the year, from solid waste |
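The heating rows of the signature table can be read as a linear law q = k1 x T_ext + k2, evaluated over the hours where T_ext is at or below the heating threshold. The sketch below is an illustration under assumptions that are not stated in the table: the hourly temperatures are passed as a plain list, k1 is taken as negative for heating (which is what makes k2 = -k1 x threshold give a positive demand), and the hourly sum in Wh/m2 is divided by 1000 to obtain kWh/m2. The function name is hypothetical.

```python
def heating_signature(t_ext_hourly_C, k1_W_m2K, t_threshold_C):
    """Return (heating hours, annual energy in kWh/m2, sizing power in W/m2)."""
    # Intercept of the linear signature, as in thermal_k2_heating_W/m2.
    k2_W_m2 = -k1_W_m2K * t_threshold_C
    # Hours during which heating is required (T_ext <= threshold).
    heating_hours = [t for t in t_ext_hourly_C if t <= t_threshold_C]
    # One value per hour, so summing W/m2 gives Wh/m2; /1000 converts to kWh/m2.
    energy_kWh_m2 = sum(k1_W_m2K * t + k2_W_m2 for t in heating_hours) / 1000
    # Sizing power at the design temperature, taken here as min(T_ext).
    power_W_m2 = k1_W_m2K * min(t_ext_hourly_C) + k2_W_m2
    return len(heating_hours), energy_kWh_m2, power_W_m2


# Four hypothetical hourly temperatures instead of a full year.
duration_h, energy_kWh_m2, power_W_m2 = heating_signature(
    [-5.0, 5.0, 15.0, 20.0], k1_W_m2K=-2.0, t_threshold_C=15.0
)
```

In this toy series, three of the four hours are at or below the 15 °C threshold, and the sizing power is reached at -5 °C.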
2.1.3 Smoothed
This layer filters out buildings that contain incoherent data. There are many ways to define and treat outliers. Here, some buildings are first rejected using ratios between values from different data sets. Then, columns are normalised by the ERA, and values lying more than 3 standard deviations from the column mean are considered outliers.
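The second step (normalise by the ERA, then reject values more than 3 standard deviations from the mean) could look like the sketch below. It is not the repository's implementation: the function name, the use of plain dictionaries instead of a GeoDataFrame, the column name `energy_kWh_y` and the choice of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def filter_3_std(rows, columns, era_key="area_era_m2", n_std=3.0):
    """Keep the rows whose ERA-normalised values stay within n_std of the mean."""
    # Normalise each selected column by the energy reference area.
    normalised = [{c: r[c] / r[era_key] for c in columns} for r in rows]
    keep = [True] * len(rows)
    for c in columns:
        values = [n[c] for n in normalised]
        mu, sigma = mean(values), pstdev(values)
        for i, v in enumerate(values):
            # A row is rejected as soon as one column is an outlier.
            if sigma > 0 and abs(v - mu) > n_std * sigma:
                keep[i] = False
    return [r for r, k in zip(rows, keep) if k]


# 19 hypothetical buildings at 100 kWh/m2 and one at 1000 kWh/m2.
buildings = [{"area_era_m2": 100.0, "energy_kWh_y": 10_000.0} for _ in range(19)]
buildings.append({"area_era_m2": 100.0, "energy_kWh_y": 100_000.0})
kept = filter_3_std(buildings, ["energy_kWh_y"])
```

Note that with very few rows no value can lie 3 standard deviations away from the mean, so such a filter only becomes effective on reasonably large tables.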
2.2 Data Analysis
For now, only one data analysis is built on the GBuildings model: district_archetypes.
2.2.1 District archetypes
This project aims to propose a typification of districts, to help obtain results from REHO at a larger scale. It uses 2 clustering approaches (Kmedoids and GaussianMixture) and provides some tools to compare their results.
The folder is composed of 5 parts:
- clustering_factory: the clustering model.
- data: personal additional data that can be used for the clustering, from some pre-treatments (e.g. the geothermal capacities for each district in Geneva).
- plotting: library for plots of clustering results.
- postprocessing: methods to evaluate the clustering quality once it has been computed.
- scripts: scripts to run the model or the postprocessing part.
2.2.1.1 Clustering factory
Data recovery
The clustering retrieves the data from GBuildings.
cluster_class = ClusterDistrict(db='Suisse', db_schema="Processed", ...)
cluster_class.connect_database()
One of the tables defines what a district is. By default, it is the transformers table, that is, one district corresponds to one low-voltage transformer as estimated by Gupta, Sossan, and Paolone (2020), but another table can be added. For instance, to study the archetypes under administrative boundaries in the canton of Geneva, add the geo_girec table, link each building to a geo_girec district and request geo_girec as the reference table.
cluster_class.get_database_data(ref_table='geo_girec')
If not all the districts available in the database should be taken into account for the clustering, the user can filter them using any field of the reference table. For instance, the transformers can be filtered by canton, city, the high-voltage transformer to which they are connected, etc.
cluster_class.get_database_data(criteria={'id_canton': 2})
Finally, the user can add data from outside the database.
Data treatment
Data that are not part of the reference districts table can be added. In this case, they are grouped by district. Therefore, they should be added through a dictionary that specifies the name of each field and the method used to aggregate it.
crit = {'egid': "count", "roof_annual_irr_kWh_y": "sum", "area_roof_solar_m2": "sum"}
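To make the role of such a dictionary concrete, the sketch below groups building records by district and applies the named reducer to each field. It is only an illustration in pure Python: the `aggregate_by_district` function, the `REDUCERS` mapping and the `id_district` key are hypothetical and are not part of the ClusterDistrict class.

```python
from collections import defaultdict

# Mapping from the method names used in the dictionary to actual reducers.
REDUCERS = {"count": len, "sum": sum}


def aggregate_by_district(buildings, crit, district_key="id_district"):
    """Group buildings by district and aggregate each field as requested."""
    grouped = defaultdict(list)
    for building in buildings:
        grouped[building[district_key]].append(building)
    return {
        district: {
            field: REDUCERS[method]([b[field] for b in members])
            for field, method in crit.items()
        }
        for district, members in grouped.items()
    }


# Three hypothetical buildings spread over two districts.
buildings = [
    {"id_district": "A", "egid": 1, "area_roof_solar_m2": 50.0},
    {"id_district": "A", "egid": 2, "area_roof_solar_m2": 70.0},
    {"id_district": "B", "egid": 3, "area_roof_solar_m2": 30.0},
]
example_crit = {"egid": "count", "area_roof_solar_m2": "sum"}
per_district = aggregate_by_district(buildings, example_crit)
```

With pandas, a dictionary of this shape can be passed directly to `DataFrame.groupby(...).agg(...)`, which is one plausible way such a grouping could be done on the real tables.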
Moreover, several methods are available to treat the data before the clustering, detailed in the table hereafter.
Method | Function |
---|---|
filter_3_std | Filters out the districts where the value of any numeric attribute lies outside μ ± 3σ of the dataset |
add_id_class | Adds the building’s class as a criterion for the clustering |
normalise_era | Normalises the numeric attributes by the total ERA in the district |
split_dataset | Randomly divides the dataset into smaller ones in case the original one is too big |
geothermal | Considers the geothermal capacity as an additional criterion (only if previously added in data) |
The methods to be used should be passed to the class via a dictionary with the method’s name as the key and a boolean as the value.
methods = {"filter_3_std": True, "geothermal": False,
           "add_id_class": False, "normalise_era": True,
           "split_dataset": False}
crit = {'egid': "count", "roof_annual_irr_kWh_y": "sum", "area_roof_solar_m2": "sum"}
cluster_class = ClusterDistrict(methods, crit, db='Suisse', db_schema="Processed", ...)
Clustering algorithm
With the data prepared, the clustering can be performed. The user can choose which algorithm should be used (GaussianMixture or Kmedoids) and the number of iterations.
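As an illustration of why the number of iterations matters, the sketch below runs a basic alternating k-medoids several times from random starting medoids and keeps the run with the lowest total distance to the medoids. This is not the repository's code: the function names are hypothetical, and "lowest total distance" is used here as a simple stand-in for the selection criterion actually applied by the clustering factory.

```python
import math
import random


def assign(points, medoids):
    """Label each point with the index of its closest medoid."""
    return [min(range(len(medoids)), key=lambda j: math.dist(p, points[medoids[j]]))
            for p in points]


def k_medoids(points, k, rng, max_iter=100):
    """One run of alternating k-medoids from random initial medoids."""
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, label in enumerate(labels) if label == j]
            if not members:  # keep the previous medoid if a cluster is empty
                new_medoids.append(medoids[j])
                continue
            # The medoid is the member closest, in total, to the other members.
            new_medoids.append(min(
                members,
                key=lambda c: sum(math.dist(points[c], points[i]) for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = assign(points, medoids)
    cost = sum(math.dist(p, points[medoids[label]]) for p, label in zip(points, labels))
    return medoids, labels, cost


def best_of_runs(points, k, iterations, seed=0):
    """Repeat the clustering and keep the run with the lowest cost."""
    rng = random.Random(seed)
    return min((k_medoids(points, k, rng) for _ in range(iterations)),
               key=lambda run: run[2])


# Two well-separated hypothetical groups of districts in a 2-D feature space.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = best_of_runs(points, k=2, iterations=20)
```

A single run can stop in a poor local optimum depending on the starting medoids, which is why repeating the run and comparing results is useful.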
Some methods are also available to control how the run is performed, detailed in the table hereafter.
Method | Function |
---|---|
cmethod | Algorithm used |
find_cluster | Runs the clustering with the requested algorithm and number of iterations and keeps the best one (i.e. the most stable) |
find_cluster_n | Finds the optimal number of clusters that explains the dataset provided |
plotting | Plots the results of the find_cluster_n search |
with_export | Saves the plots and the .csv of scores and occurrences |
save_clustering | Exports the districts with an additional column cluster |
The final call to the function should now be ready.
methods = {"filter_3_std": True, "geothermal": False,
           "add_id_class": False, "normalise_era": True,
           "split_dataset": False,
           "find_cluster": False, "find_n_cluster": True,
           "plotting": True, "save_clustering": True,
           "with_export": True, "cmethod": "kmedoids"}
crit = {'egid': "count", "roof_annual_irr_kWh_y": "sum", "area_roof_solar_m2": "sum"}
cluster_number = 8
iter = 500
cluster_class = ClusterDistrict(methods, crit, iteration=iter, key_clustering_parameter=cluster_number, db='Suisse', db_schema="Processed")
cluster_class.connect_database()
cluster_class.get_database_data(geo_girec=False)
result = cluster_class.run_clustering()
The search for the optimal number of clusters should be conducted with great care, as the quality of the clustering depends heavily on it. The plots are designed to support that decision. The method is detailed in Loustau, Terrier, and Maréchal (2023), along with a case study conducted on Geneva.
2.2.1.2 Postprocessing
This part should contain everything that helps analyse the clustering result.
Currently, the only script gathers the results from REHO and compares the mean obtained with all districts against the result from the representative district. An example of this comparison is given in Figure 2.1.
The results should have been computed previously in REHO. This script is not yet stable, and its paths are yet to be defined properly.
2.2.1.3 Plotting
This folder groups some plotting functions that work with the clustering.