On the realistic validation of photometric redshifts

Bibliographic Details
Published in	Monthly Notices of the Royal Astronomical Society, Vol. 468, no. 4, pp. 4323-4339
Main Authors	Beck, R., Lin, C.-A., Ishida, E. E. O., Gieseke, F., de Souza, R. S., Costa-Duarte, M. V., Hattab, M. W., Krone-Martins, A.
Format	Journal Article
Language	English
Published	Oxford University Press 01.07.2017
Subjects	methods: statistical; methods: data analysis; catalogues; techniques: photometric; galaxies: distances and redshifts
Summary:	Two of the main problems encountered in the development and accurate validation of photometric redshift (photo-z) techniques are the lack of spectroscopic coverage in the feature space (e.g. colours and magnitudes) and the mismatch between the photometric error distributions associated with the spectroscopic and photometric samples. Although these issues are well known, there is currently no standard benchmark allowing a quantitative analysis of their impact on the final photo-z estimation. In this work, we present two galaxy catalogues, Teddy and Happy, built to enable a more demanding and realistic test of photo-z methods. Using photometry from the Sloan Digital Sky Survey and spectroscopy from a collection of sources, we constructed data sets that mimic the biases between the underlying probability distribution of the real spectroscopic and photometric sample. We demonstrate the potential of these catalogues by submitting them to the scrutiny of different photo-z methods, including machine learning (ML) and template fitting approaches. Beyond the expected bad results from most ML algorithms for cases with missing coverage in the feature space, we were able to recognize the superiority of global models in the same situation and the general failure across all types of methods when incomplete coverage is convoluted with the presence of photometric errors – a data situation which photo-z methods were not trained to deal with up to now and which must be addressed by future large-scale surveys. Our catalogues represent the first controlled environment allowing a straightforward implementation of such tests. The data are publicly available within the COINtoolbox (https://github.com/COINtoolbox/photoz_catalogues).
ISSN:	0035-8711, 1365-2966
DOI:	10.1093/mnras/stx687