Auto Classification of Documents During Indexing
We currently have a large amount of file share and SharePoint data that contains Personally Identifiable Information (PII) and other sensitive data. It would be ideal to auto-tag documents at index time so that they can be filtered out of search results. The same tagging would also allow documents to be marked in a positive way, for more refined searching.
Thank you for your feedback! Your idea sounds interesting, but we would like more information to better understand it. What changes do you suggest to improve the admin experience for sensitive documents? To improve the end user experience?
Dan Gøran Lunde commented
This is typically solved via the Content Enrichment Web Service. It is pretty trivial to implement functionality that recognizes PII and then tags the document.
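As a rough illustration of the "recognize PII, then tag" step, here is a minimal Python sketch. It is not the Content Enrichment Web Service API itself; it only shows the kind of logic such a service could run on the text of each crawled document. The pattern set and the output property names `ContainsPII` and `PIITypes` are assumptions made up for this example, and a real deployment would need far more thorough detection rules.

```python
import re

# Hypothetical, deliberately small pattern set; real PII detection
# needs broader and properly validated rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify(text):
    """Return tag values for a document body.

    The keys "ContainsPII" and "PIITypes" are assumed names for
    properties that the search index would expose for filtering.
    """
    found = sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )
    return {"ContainsPII": bool(found), "PIITypes": found}


if __name__ == "__main__":
    print(classify("Contact jane@example.com, SSN 123-45-6789"))
    print(classify("Quarterly roadmap"))
```

With tags like these stored on each indexed item, search results could be filtered on the flag to hide sensitive documents, and the same approach could emit positive tags for refinement.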