Finding the best source of coverage for state social housing: Comparing Kāinga Ora (Housing New Zealand) and census data

The research paper ‘Finding the best source of coverage for state social housing: Comparing Kāinga Ora (Housing New Zealand) and census data’ explores the potential of administrative (admin) sources to provide census-type information for estimating the dwelling stock of state social housing in New Zealand. We focus on understanding data from the 2013 and 2018 Censuses.

Download the paper below, or read the summary of key points online.

Summary of key points

Potential for using Kāinga Ora data for census

This paper explores how Kāinga Ora data could supplement census data, or contribute to future development as part of an admin-based dwelling census. Kāinga Ora is the largest state social housing provider in New Zealand, with over 60,000 houses, and is therefore a key data source. Other social housing providers include community housing providers (CHPs) and local councils. In census data, social housing is part of the ‘sector of landlord’ variable, where Kāinga Ora is one category. To better understand the usability of this admin data source, we need to compare it with existing census data. This report focuses on coverage and extends previous research, which compared Kāinga Ora and 2013 Census data for tenure of household, sector of landlord, number of bedrooms, and weekly rent paid (Bycroft et al, 2021).

Context for comparing census with admin data

A key assumption in this investigation is that Kāinga Ora has a record of all its dwellings. We therefore treated the Kāinga Ora dataset as the base for the comparisons, with an assumed 100 percent coverage of its stock. The reasoning is that Kāinga Ora, as the agency responsible for managing these properties, holds the most reliable source for identifying them. The comparison with census, however, is complicated by changes over time in both sources. Changes in census methodology, together with the use of admin data alongside statistical imputation, increased the census count of Kāinga Ora dwellings, moving it from a large undercount to a minimal undercount by 2018, as reported in the census quality assessments. In contrast, Kāinga Ora stock moved in the opposite direction over the same period, with fewer houses in 2018 than in 2013. There have been, for example, property transfers where other CHPs have taken over the management of Kāinga Ora properties. In addition, census collects information only on dwellings that were occupied on census night, so if the residents are away, no information is collected. This distinction does not apply to admin data, which exists for each month whether tenants are home or not. This context provided a starting point for our more in-depth investigation of this subset of housing data.

Key results

The main findings from this coverage investigation are summarised below.

  • Based on admin data, the number of ‘available-occupied’ Kāinga Ora houses fell between 2013 and 2018 (from 63,507 to 60,006 dwellings as at March).
  • In contrast, based on census data, the number of Kāinga Ora rentals increased from 2013 (52,500 dwellings) to 2018 (63,105 dwellings). Instead of the minimal undercount reported in DataInfo+ metadata, the 2018 results in this paper indicate estimated overcoverage compared with the admin source.
  • There are also some differences in the number of dwellings at regional level.
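The gap between the two sources can be made concrete with simple arithmetic on the counts quoted above. The following is a minimal sketch, not the paper's method: the function name `coverage_ratio` and the way the gap is labelled are assumptions for illustration. The two sources also measure slightly different populations, so the ratios are indicative only.

```python
# Counts quoted in the key results above (number of dwellings).
ADMIN = {2013: 63_507, 2018: 60_006}    # 'available-occupied' Kāinga Ora houses
CENSUS = {2013: 52_500, 2018: 63_105}   # Kāinga Ora rentals counted in census


def coverage_ratio(census_count: int, admin_count: int) -> float:
    """Census count as a share of the admin count (1.0 means equal totals)."""
    return census_count / admin_count


for year in sorted(ADMIN):
    ratio = coverage_ratio(CENSUS[year], ADMIN[year])
    gap = CENSUS[year] - ADMIN[year]
    # Hypothetical labels: a positive gap means census counted more than admin.
    label = "overcoverage" if gap > 0 else "undercoverage"
    print(f"{year}: census/admin = {ratio:.3f}, difference = {gap:+,} ({label})")
```

On these figures the census count is about 11,000 dwellings below the admin count in 2013 and about 3,100 above it in 2018.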

Overall, the use of admin data in the 2018 Census has improved the coverage of sector of landlord. In addition, the 2018 Census used statistical imputation for a wider range of variables than previous censuses, including sector of landlord. This methodology is applied when responses are missing (Stats NZ, 2019a). Statistical imputation therefore needs to be considered alongside the survey responses and admin data to get the full picture of the total census counts and how they relate to the numbers obtained from the admin data.

Our research suggests that some differences between Kāinga Ora and census data can be explained by missing information, misclassification, or differing concepts. If there is no information on landlord type, dwellings are included in residual categories in census. Misclassification can arise when a census respondent ticks the wrong tenure or landlord type, or does not respond at all. These gaps may lead to dwellings being assigned to a different category than in the admin data. Note that the subject populations differ conceptually between census and admin data. Census collects attribute information (such as landlord type and tenure) for occupied private dwellings only. Dwellings where residents were away, or that were unoccupied on census night, are excluded.

Results when comparing linked data

We investigated possible misclassification in census data by comparing it with admin data, using the linked datasets in the Integrated Data Infrastructure (IDI), which can be compared at address level. This linking enabled comparison of the census classification of tenure of household and sector of landlord with Kāinga Ora data. This analysis showed that:

We were able to link 90 percent of all Kāinga Ora dwellings in the 2018 admin data to dwellings classified as Kāinga Ora in the 2018 Census, and 76 percent of all Kāinga Ora dwellings in 2013.

Some of the remaining 10 percent (in 2018) and 24 percent (in 2013) of Kāinga Ora dwellings were found elsewhere in census, indicating either:

  • evidence of misclassification in census with a mismatch of the tenure of household or sector of landlord information compared with admin data
  • linking issues which can occur when there is a problem with an address ID (such as incorrect information, or when there is more than one dwelling at an address)
  • both issues combined.
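The address-level comparison described above can be sketched as follows. This is an illustration only: the address IDs, category labels, and the `classify` helper are all hypothetical and do not come from the paper or the IDI. It shows how each admin dwelling falls into one of three outcomes: linked and classified the same way, linked but classified differently, or not linked.

```python
from collections import Counter

# Hypothetical toy data: address IDs on the admin list, and the census
# sector of landlord recorded against each address ID.
admin_kainga_ora = {"A1", "A2", "A3", "A4", "A5"}
census_sector = {
    "A1": "Kāinga Ora",
    "A2": "Kāinga Ora",
    "A3": "Private landlord",  # linked, but a different landlord category
    "A4": "Not stated",        # linked, but in a residual category
    # "A5" has no census record under this address ID (a linking failure)
}


def classify(address_id: str) -> str:
    """Label one admin dwelling by how it appears in the census data."""
    if address_id not in census_sector:
        return "not linked"
    if census_sector[address_id] == "Kāinga Ora":
        return "linked, same category"
    return "linked, different category"


outcomes = Counter(classify(a) for a in admin_kainga_ora)
matched_share = outcomes["linked, same category"] / len(admin_kainga_ora)
print(dict(outcomes))
print(f"matched share = {matched_share:.0%}")
```

The matched share in this toy example plays the role of the 90 percent (2018) and 76 percent (2013) figures above; the other two outcomes correspond to possible misclassification and to linking issues.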

Overall, differences in coverage and classification can be due to a combination of reasons, such as insufficient information provided; non-response; response, processing, statistical imputation, and linking errors; or a dwelling being unoccupied.

Kāinga Ora data provides a good source of dwellings for this category of sector of landlord

Our analysis supports the view that Kāinga Ora provides an important source of data for this landlord category in the census context, for future censuses, and for other housing information needs. The importance of using and understanding Kāinga Ora data as a source is highlighted by the comparisons with past censuses, where a key change for this variable was going from too few to too many properties. Based on the findings of this research, we recommend prioritising Kāinga Ora landlord information (from admin data) over people’s responses on the census forms, to improve accuracy and to reduce coverage differences and the risk of misclassification errors for this subset. We also note that statistical imputation needs additional consideration in this context. If Kāinga Ora data is used to determine the sector of landlord (for rental tenure), statistical imputation should be avoided for this category: we would already have a complete set of Kāinga Ora dwellings, so adding imputed dwellings might lead to an overcount. Note that the recommendation to prioritise the admin data for sector of landlord applies to Kāinga Ora dwellings only, as we do not have complete datasets for other landlord categories. It is also important to note that this conclusion applies to the identification of this category of dwelling in census or other statistical collections.

In the context of the 2023 Census, this recommendation differs from the high-level design principle, which is to respect people’s intended response as the first source of information. Our recommended approach is supported by a source which we regard (based on the assumptions of coverage, findings, and investigations supported by metadata) as accurate for this small, yet important, subset of dwelling and household data.
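One way to read the recommendation above is as a simple source-priority rule. The sketch below is an interpretation under stated assumptions, not the method used in any census: the function `sector_of_landlord`, the `UNRESOLVED` placeholder, and the handling of a Kāinga Ora response for a dwelling absent from the admin list are all hypothetical.

```python
from typing import Optional

KAINGA_ORA = "Kāinga Ora"
UNRESOLVED = "unresolved"  # hypothetical placeholder, not a census category


def sector_of_landlord(in_admin_list: bool,
                       response: Optional[str],
                       imputed: Optional[str]) -> str:
    """Choose a sector of landlord, giving the admin list first priority."""
    # 1. The admin list is assumed complete, so it decides membership.
    if in_admin_list:
        return KAINGA_ORA
    # 2. Otherwise use the census response. Assumption: a Kāinga Ora response
    #    for a dwelling missing from the admin list is flagged, not accepted.
    if response is not None:
        return UNRESOLVED if response == KAINGA_ORA else response
    # 3. Fall back to imputation, but never impute into the Kāinga Ora
    #    category, since that could add dwellings beyond the complete list.
    if imputed is not None and imputed != KAINGA_ORA:
        return imputed
    return UNRESOLVED


print(sector_of_landlord(True, "Private landlord", None))
print(sector_of_landlord(False, None, KAINGA_ORA))
```

The first call returns the Kāinga Ora category despite the conflicting response, and the second leaves the dwelling unresolved rather than imputing it into the Kāinga Ora category.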

ISBN 978-1-99-104980-3

Stats NZ Public Release.