SharePoint Hybrid (OCR) Search
The idea is to have SharePoint search run optical character recognition (OCR) on images, including scanned PDF documents, when the SharePoint hybrid crawler crawls them. This would make the text inside images and scanned PDFs searchable, so those documents become easier to find. Today, text inside such images and documents cannot be searched.
Dan Gøran Lunde commented
OCR can be supported via content enrichment. The pre-built CEWS Pipeline Toolkit already includes integrations with OCR vendors.
Lawrence Dwight commented
Given the need to classify and protect data under GDPR and other data privacy laws and regulations, it is becoming increasingly important to enable automatic scanning, including with the Azure Information Protection Scanner.