Emotion-modulated startle is a frequently used method in affective science. Although there is a growing literature on the reliability of this measure, it remains unclear how many startle responses are needed to obtain a reliable signal. The present study therefore evaluated the reliability of startle responding as a function of the number of startle responses (NoS) during a widely used threat-of-shock paradigm, the NPU-threat task, in a clinical (N = 205) and a non-clinical (N = 92) sample. In the clinical sample, internal consistency was also examined separately for healthy controls and for those with panic disorder and/or major depression, and retest reliability was assessed as a function of NoS. Although results varied somewhat by diagnosis and for retest reliability, the overall pattern suggested that six startle responses per condition were necessary to obtain acceptable reliability in both clinical and non-clinical samples during this threat-of-shock paradigm.
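
The idea of tracking internal consistency as a function of NoS can be illustrated with a minimal sketch. The abstract does not state which reliability coefficient was used, so Cronbach's alpha is an assumption here, as are the function names (`cronbach_alpha`, `alpha_by_number_of_responses`, `simulate_scores`) and the simulated data, which are purely hypothetical and are not the study's data or analysis pipeline.

```python
import random
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows = participants, columns = startle responses.

    Assumption: alpha stands in for whatever internal-consistency
    coefficient the study actually reported.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two responses per participant")
    # Sample variance of each response position across participants.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


def alpha_by_number_of_responses(scores):
    """Alpha computed on the first k responses, for k = 2 .. all responses."""
    n_responses = len(scores[0])
    return {
        k: cronbach_alpha([row[:k] for row in scores])
        for k in range(2, n_responses + 1)
    }


def simulate_scores(n_participants=200, n_responses=10, noise_sd=1.5, seed=0):
    """Hypothetical data: a stable per-person level plus trial-level noise."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_participants):
        trait = rng.gauss(0.0, 1.0)
        scores.append(
            [trait + rng.gauss(0.0, noise_sd) for _ in range(n_responses)]
        )
    return scores


if __name__ == "__main__":
    curve = alpha_by_number_of_responses(simulate_scores())
    for k, alpha in curve.items():
        print(f"NoS = {k:2d}  alpha = {alpha:.2f}")
```

In a sketch like this, the smallest NoS at which the coefficient crosses a chosen threshold would be read off the resulting curve. The threshold itself is a separate analytic choice that the abstract does not specify.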