Iris liveness detection methods have been developed to counter the vulnerability of iris biometric systems to spoofing attacks. The literature typically assumes that a known attack modality will be used, and liveness models are then designed using labelled samples from both the real/live and the fake/spoof distributions, the latter derived from the assumed attack modality. This work argues that comprehensive modelling of spoof samples is not possible in a real-world scenario, where the attack modality cannot be known with a high degree of certainty. In fact, making this assumption renders the liveness detection system more vulnerable to attacks that were not included in the original training. To provide a more realistic evaluation, this work proposes: a) testing the binary models with unknown spoof samples that were not present in the training step; b) the use of single-class classification, in which the classifier is designed by modelling only the distribution of live samples. The results obtained support the assertion that many evaluation methods from the literature are misleading and may lead to optimistic estimates of the robustness of liveness detection in practical use cases.
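
As a rough illustration of proposals a) and b), the following minimal Python sketch shows one way such an evaluation could be organised. It is not the implementation used in this work: the names `unknown_attack_splits` and `LiveOnlyModel` are hypothetical, feature vectors are assumed to be plain lists of floats, and the per-feature mean and standard deviation model is a simple stand-in for whatever single-class classifier is actually employed.

```python
from statistics import mean, pstdev


def unknown_attack_splits(spoof_by_attack):
    """Proposal a): hold out one attack type at a time (hypothetical helper).

    spoof_by_attack maps an attack name to its list of feature vectors.
    Yields (held_out_name, train_spoofs, test_spoofs), where the held-out
    attack type never appears among the training spoof samples.
    """
    for held_out in sorted(spoof_by_attack):
        train_spoofs = [
            x
            for name, xs in sorted(spoof_by_attack.items())
            if name != held_out
            for x in xs
        ]
        yield held_out, train_spoofs, list(spoof_by_attack[held_out])


class LiveOnlyModel:
    """Proposal b): toy single-class model fitted on live samples only.

    Assumption: a sum of squared z-scores stands in for a real one-class
    classifier; no spoof sample is used at any point during fitting.
    """

    def fit(self, live, accept_rate=0.95):
        columns = list(zip(*live))
        self.mu = [mean(c) for c in columns]
        # Fall back to 1.0 for constant features to avoid division by zero.
        self.sigma = [pstdev(c) or 1.0 for c in columns]
        scores = sorted(self.score(x) for x in live)
        # Threshold taken from the live training scores alone.
        idx = min(len(scores) - 1, int(accept_rate * len(scores)))
        self.threshold = scores[idx]
        return self

    def score(self, x):
        # Larger score means further from the live distribution.
        return sum(((v - m) / s) ** 2 for v, m, s in zip(x, self.mu, self.sigma))

    def is_live(self, x):
        return self.score(x) <= self.threshold
```

In use, a binary model would be trained on live samples plus `train_spoofs` and evaluated on `test_spoofs` for each split, while `LiveOnlyModel` would be fitted once on live samples and evaluated against every attack type. Comparing the two error rates on the held-out attacks is the kind of comparison the proposed evaluation calls for.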