A Procedure for the Correction of Back-to-Front Degradations in Archival Manuscripts with Preservation of the Original Appearance
Abstract
Virtual restoration of digital copies of humanity's documentary heritage is crucial for facilitating both the traditional work of philologists and paleographers and the automatic analysis of the contents. Here we propose a practical and fast procedure for correcting the typically complex background of recto–verso historical manuscripts. The procedure has two main, distinctive features: it does not need a preliminary registration of the two page sides, and it is non-invasive, as it does not alter the original appearance of the manuscript. This makes it suitable for routine use in archives and permits easier access to the manuscripts without any information being lost. In the first stage, both the primary text and the spurious strokes are detected via soft segmentation, based on the statistical decorrelation of the recto and verso images. In the second stage, the noisy pattern is replaced with pixels that simulate the texture of the clean surrounding background, through an efficient image inpainting technique. As shown in the experimental results, evaluated both qualitatively and quantitatively, the proposed procedure performs a fine and selective removal of the degradation while preserving other informative marks of the manuscript's history.
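To make the first stage more concrete, the following is a minimal sketch of statistical decorrelation applied to a recto image and a horizontally flipped verso image. It uses symmetric whitening (multiplying by the inverse square root of the 2x2 covariance matrix), which is one common way to decorrelate two signals; the abstract does not state which decorrelation the procedure actually uses, so this choice is an assumption. The function names `decorrelate_pair` and `synthetic_pair`, the 0.4 mixing weight, and the synthetic test data are all hypothetical and serve only as illustration.

```python
import numpy as np


def decorrelate_pair(recto, verso_flipped):
    """Decorrelate two same-shape grayscale images by symmetric whitening.

    Assumption: this is an illustrative stand-in for the decorrelation
    step named in the abstract, not the authors' exact method.
    """
    # Treat each pixel as a 2-vector (recto value, verso value): shape 2 x N.
    x = np.stack([recto.ravel(), verso_flipped.ravel()]).astype(np.float64)
    x -= x.mean(axis=1, keepdims=True)

    # 2 x 2 covariance of the two sides.
    cov = x @ x.T / x.shape[1]

    # Inverse matrix square root via eigendecomposition (cov is symmetric).
    evals, evecs = np.linalg.eigh(cov)
    evals = np.maximum(evals, 1e-12)  # guard against division by zero
    w = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Outputs are uncorrelated and have unit variance.
    y = w @ x
    return y[0].reshape(recto.shape), y[1].reshape(recto.shape)


def synthetic_pair(shape=(64, 64), seed=0):
    """Hypothetical test data: two sparse 'ink' maps leaking into each other."""
    rng = np.random.default_rng(seed)
    front = (rng.random(shape) < 0.1).astype(np.float64)
    back = (rng.random(shape) < 0.1).astype(np.float64)
    recto = front + 0.4 * back   # back-side text seeps into the recto
    verso = back + 0.4 * front   # front-side text seeps into the verso
    return recto, verso


if __name__ == "__main__":
    recto, verso = synthetic_pair()
    out_a, out_b = decorrelate_pair(recto, verso)
    print(np.corrcoef(out_a.ravel(), out_b.ravel()))
```

In a sketch like this, the two output maps could then be thresholded softly to flag likely primary text and likely seeped strokes, and the flagged seepage pixels passed to an inpainting step. Both of those follow-on steps are left out here because the abstract gives no further detail on them.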