Machine learning to extract invoice data from pictures of invoices

  Posted at : 6 months ago
An application using OCR and machine learning needs to be built to capture data from invoices, receipts, and bills in photographed images. Supported input formats include PNG, BMP, JPEG, PDF, and other open image formats.

Sample images shared on drive:

B.1) OCR and Machine Learning Features

1. Extract data from the clicked pictures.
2. The extracted data must include metadata and line-level items.
3. Create a neutral-format, comma-separated master file for every company and for every voucher type mentioned above (invoices, receipts, bills).
4. These neutral files must be synced with the latest data regularly. Only changes and new additions must be synced to these files, and a timestamp must be set for every line item.
5. Metadata extraction must pull out the following information:
   1. Company Name
   2. Company GSTIN no.
   3. Date of Document
   4. SGST Amount
   5. CGST Amount
   6. IGST Amount
   7. Total Amount
   8. HSN Code of the item
   9. Item Name
   10. Invoice No.
   11. Invoice Date
6. Line-level extraction must contain:
   1. Item Name
   2. HSN Code
   3. Line-level SGST Amount
   4. Line-level CGST Amount
   5. Line-level IGST Amount
   6. Line-level Amount
   7. Line-level HSN Code
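
To make points 3 to 6 more concrete, here is a minimal Python sketch of what happens after OCR: finding a GSTIN in the recognised text and syncing line items into a comma-separated master file. It is an illustration only, not the required implementation. The names LineItem, find_gstin, sync_master and the synced_at column are all hypothetical. So is the choice of invoice number + item name + HSN code as the row key. The GSTIN pattern follows the commonly described 15-character layout and does not verify the check character. The OCR step itself is assumed to have already produced plain text.

```python
import csv
import io
import re
from dataclasses import dataclass, asdict, fields
from datetime import datetime, timezone

# Assumed GSTIN layout: 2-digit state code, 10-character PAN,
# entity code, the letter Z, and a check character (not verified here).
GSTIN_RE = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")


def find_gstin(ocr_text):
    """Return the first GSTIN-shaped token in the OCR text, or None."""
    match = GSTIN_RE.search(ocr_text.upper())
    return match.group(0) if match else None


@dataclass
class LineItem:
    # Hypothetical column set based on the line-level fields listed above.
    # Amounts are kept as strings so they compare cleanly with CSV values.
    invoice_no: str
    item_name: str
    hsn_code: str
    sgst_amount: str
    cgst_amount: str
    igst_amount: str
    amount: str
    synced_at: str = ""


COLUMNS = [f.name for f in fields(LineItem)]


def row_key(row):
    # Assumption: these three fields identify a line item uniquely.
    return (row["invoice_no"], row["item_name"], row["hsn_code"])


def sync_master(existing_csv, items, now=None):
    """Return updated master CSV text.

    Only new or changed line items are written with a fresh timestamp;
    unchanged rows keep the timestamp they already had.
    """
    now = now or datetime.now(timezone.utc).isoformat()
    rows = {row_key(r): r for r in csv.DictReader(io.StringIO(existing_csv))}
    for item in items:
        new = asdict(item)
        old = rows.get(row_key(new))
        unchanged = old is not None and all(
            old[c] == new[c] for c in COLUMNS if c != "synced_at"
        )
        if unchanged:
            continue
        new["synced_at"] = now
        rows[row_key(new)] = new
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows.values())
    return out.getvalue()
```

In this sketch one master file would exist per company and voucher type. Calling sync_master again with the same line items leaves the file unchanged, which is one possible reading of "only changes and new additions must be synced".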