Analysis And Features Extraction Of Scanned Documents

Dr. Baidaa Abdul khaliq Atya, Dr. Ali Adel Saeid, Dr. Abdul-Wahab Sami Ibrahim, Prof. Dr. Abdul Monem S. Rahma

Abstract

Scanned and printed documents play an essential role in daily life as portable devices become more widespread. Many people photograph important documents with their cameras and store or share the images on the internet. However, most word processing programs treat such a document as a single image and cannot distinguish or process its main contents. The method proposed in this research offers a structured approach to scanned image analysis by dividing the classification process into three steps. In the first step, the document is separated into text and images by converting it into a binary image and using threshold values and the sum of the array to determine the content type. In the second step, the text is separated into typed and handwritten text: a value is extracted from the binary image by dividing the largest column value by the largest row value, and this value is compared with a previously determined threshold. In the final step, the text is separated into English and Arabic by computing the sum of the zeros surrounding each character. The results vary from one method to another and also depend on the documents analyzed: the first method had a success rate of 99%, the second 95%, and the third 85%.

After the proposed methods were applied to a randomly selected group from the data set, the results show an average success rate reaching 95%.
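
The three features named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the binarization threshold of 128, the convention that ink pixels are 1 and background pixels are 0, and the reading of "zeros surrounding each character" as the background pixels inside the character's bounding box are all assumptions made for illustration. The decision thresholds used in the paper are not given in the abstract, so the sketch only computes the features and leaves the comparisons to the caller.

```python
import numpy as np


def binarize(gray, threshold=128):
    """Convert a grayscale array to binary: 1 = ink (dark), 0 = background.

    The threshold of 128 is an assumed default, not a value from the paper.
    """
    return (np.asarray(gray) < threshold).astype(np.uint8)


def ink_sum(binary):
    """Step 1 feature: the sum of the binary array (number of ink pixels)."""
    return int(binary.sum())


def column_row_ratio(binary):
    """Step 2 feature: largest column sum divided by largest row sum."""
    col_max = int(binary.sum(axis=0).max())
    row_max = int(binary.sum(axis=1).max())
    # Guard against an empty (all-background) region.
    return col_max / row_max if row_max else 0.0


def surrounding_zeros(binary):
    """Step 3 feature (assumed interpretation): count of background pixels
    inside the tight bounding box of a single character."""
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    if rows.size == 0:
        return 0
    box = binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    return int(box.size - box.sum())


# Synthetic "L"-shaped glyph on a white 5x6 background (hypothetical input).
gray = np.full((5, 6), 255, dtype=np.uint8)
gray[0:4, 2] = 0   # vertical stroke
gray[3, 2:5] = 0   # horizontal stroke

glyph = binarize(gray)
features = (ink_sum(glyph), column_row_ratio(glyph), surrounding_zeros(glyph))
```

Each feature would then be compared against a threshold chosen from training documents; which side of each threshold corresponds to text, typed text, or English is not stated in the abstract and is left open here.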

Published
2021-11-01
How to Cite
Dr. Baidaa Abdul khaliq Atya, Dr. Ali Adel Saeid, Dr. Abdul-Wahab Sami Ibrahim, Prof. Dr. Abdul Monem S. Rahma. (2021). Analysis And Features Extraction Of Scanned Documents. Design Engineering, 9919-9936. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/6042