Recent years have witnessed a growing focus on the reliability of cognitive tasks, driven in part by the reliability paradox: cognitive tasks yield robust and consistent experimental effects at the group level, yet they often show low reliability when used to measure individual differences. Here we investigate the reliability of the Self Perceptual Matching Task (SPMT), a widely used tool for investigating the cognitive processes underlying the self-prioritization effect (SPE), the finding that people perform better when stimuli are associated with the self than when they are associated with others. In this preregistered study, we evaluated the reliability of 24 SPE measures from 17 datasets (N = 805), all of which used the SPMT. We calculated Monte Carlo-based split-half reliability (r) and the intraclass correlation coefficient (ICC2) for each SPE measure. Our findings revealed a robust group-level SPE across datasets. However, as measures of individual differences, the SPE measures derived from reaction times (RT) and Efficiency showed higher split-half reliability than the other SPE measures, but it was still unsatisfactory (approximately 0.6). Similarly, for reliability across multiple time points, as assessed by ICC2, RT and Efficiency showed low test-retest reliability (close to 0.5). These results reveal a reliability paradox in SPMT-based SPE assessments: although nearly all SPE measures displayed robust experimental effects, their reliability as measures of individual differences was low. We discuss the implications of these findings for future studies.
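To make the split-half procedure concrete, the following is a minimal sketch of a Monte Carlo split-half estimate for an RT-based SPE, not the analysis code used in the study. It assumes the SPE is the mean RT in the other condition minus the mean RT in the self condition, that trials are split at random within each condition, and that the uncorrected correlations are averaged over iterations. The function names, the data layout, and the default number of iterations are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def random_halves(trials, rng):
    """Shuffle a copy of the trials and cut it into two halves."""
    shuffled = list(trials)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def split_half_reliability(data, n_iter=1000, seed=0):
    """Mean split-half correlation of the RT-based SPE over random splits.

    data: {participant: {"self": [rt, ...], "other": [rt, ...]}}
    Assumption: SPE = mean RT(other) - mean RT(self); no correction
    (e.g. Spearman-Brown) is applied to the correlations.
    """
    rng = random.Random(seed)
    rs = []
    for _ in range(n_iter):
        first, second = [], []
        for cond in data.values():
            # Split each condition separately so both halves contain both conditions.
            self_a, self_b = random_halves(cond["self"], rng)
            other_a, other_b = random_halves(cond["other"], rng)
            first.append(mean(other_a) - mean(self_a))
            second.append(mean(other_b) - mean(self_b))
        # Correlate the two half-sample SPE scores across participants.
        rs.append(pearson(first, second))
    return mean(rs)
```

Averaging over many random splits avoids the dependence on one arbitrary split (such as odd versus even trials) that a single split-half estimate would have.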