DOI: https://doi.org/10.64539/sjer.v1i3.2025.314

Keywords: Positive-Unlabeled Learning, Label Noise Robustness, Bounded Loss Functions, Weak Supervision, Risk Estimation

Abstract
Positive-Unlabeled (PU) learning has become a pivotal tool in scenarios where only positive samples are labeled and negative labels are unavailable. In practice, however, the labeled positive data often contain noise, such as mislabeled or outlier instances, that can severely degrade model performance. This issue is exacerbated by traditional surrogate loss functions, many of which are unbounded and therefore overly sensitive to mislabeled examples. To address this limitation, we propose a robust PU learning framework that integrates bounded loss functions, including the ramp loss and the truncated logistic loss, into the non-negative risk estimation paradigm. Unlike conventional loss formulations, which allow noisy samples to disproportionately influence training, our approach caps each instance's contribution, thereby reducing sensitivity to label noise. We mathematically reformulate the PU risk estimator using bounded surrogates and demonstrate that this formulation maintains risk consistency while offering improved noise tolerance. A detailed framework diagram and algorithmic description are provided, along with a theoretical analysis that bounds the influence of corrupted labels. Extensive experiments are conducted on both synthetic and real-world datasets under varying noise levels. Our method consistently outperforms baseline models such as unbiased PU (uPU) and non-negative PU (nnPU) in classification accuracy, area under the receiver operating characteristic curve (ROC AUC), and area under the precision-recall curve (PR AUC). The ramp loss variant exhibits particularly strong robustness without sacrificing optimization efficiency. These results demonstrate that incorporating bounded losses is a principled and effective strategy for improving the reliability of PU learning in noisy environments.
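For concreteness, the sketch below illustrates how a bounded surrogate such as the ramp loss can be plugged into the non-negative PU risk estimator popularized by Kiryo et al. (nnPU). With class prior $\pi_p$, that estimator reads $\hat{R}(g) = \pi_p \hat{R}_p^+(g) + \max\{0,\ \hat{R}_u^-(g) - \pi_p \hat{R}_p^-(g)\}$, where $\hat{R}_p^+$, $\hat{R}_p^-$, and $\hat{R}_u^-$ are empirical averages of the surrogate loss $\ell(y \cdot g(x))$ over the positive and unlabeled samples. The function names and loss definition below are illustrative assumptions, not the authors' released implementation:

```python
import numpy as np

def ramp_loss(z):
    # Bounded ramp loss: l(z) = 0.5 * max(0, min(2, 1 - z)), capped at 1,
    # so a single mislabeled example cannot dominate the empirical risk.
    return 0.5 * np.clip(1.0 - z, 0.0, 2.0)

def nnpu_risk(g_pos, g_unl, prior, loss=ramp_loss):
    """Non-negative PU risk with a bounded surrogate (illustrative sketch).

    g_pos -- classifier scores g(x) on labeled-positive samples
    g_unl -- classifier scores g(x) on unlabeled samples
    prior -- assumed class prior pi_p = P(y = +1)
    """
    risk_p_pos = loss(g_pos).mean()    # R_p^+(g): positives treated as +1
    risk_p_neg = loss(-g_pos).mean()   # R_p^-(g): positives treated as -1
    risk_u_neg = loss(-g_unl).mean()   # R_u^-(g): unlabeled treated as -1
    # nnPU correction: clamp the estimated negative-class risk at zero.
    neg_risk = max(0.0, risk_u_neg - prior * risk_p_neg)
    return prior * risk_p_pos + neg_risk

# Toy usage with synthetic classifier scores.
rng = np.random.default_rng(0)
scores_pos = rng.normal(1.0, 1.0, size=200)     # scores on positives
scores_unl = rng.normal(-0.2, 1.0, size=1000)   # scores on unlabeled
print(nnpu_risk(scores_pos, scores_unl, prior=0.3))
```

Because the ramp loss is bounded by 1, a single corrupted positive label can shift the empirical estimate by at most on the order of $\pi_p / n_p$, which is the intuition behind the bounded-influence analysis described in the abstract.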
License
Copyright (c) 2025 Lalit Awasthi, Eric Danso

This work is licensed under a Creative Commons Attribution 4.0 International License.

