Masking sensitive information in a document

Bibliographic Details
Main Authors: Grasselt, Mike W; Maier, Albert; Bremer, Lars; Saillet, Yannick; Baessler, Michael
Format: Patent
Language: English
Published: 10.09.2024
Summary: The exemplary embodiments disclose a method, a computer program product, and a computer system for protecting sensitive information. The exemplary embodiments may include using an inverted text index for evaluating one or more statistical measures of an index token of the inverted text index, using the one or more statistical measures for selecting a set of candidate tokens, extracting metadata from the inverted text index, associating the set of candidate tokens with respective token metadata, tokenizing at least one document resulting in one or more document tokens, comparing the one or more document tokens with the set of candidate tokens, selecting a set of document tokens to be masked, selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata, masking the at least part of the set of document tokens, and providing one or more masked documents.
Bibliography: Application Number: US202017073436
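
The steps listed in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the statistical measure is document frequency (rare tokens become candidates), that token metadata is a simple type label supplied by hand rather than extracted from a real index, and that all function names, the threshold, and the sample documents are hypothetical.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")  # assumption: a token is a run of word characters


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for tok in TOKEN_RE.findall(text.lower()):
            index[tok].add(doc_id)
    return index


def select_candidates(index, max_doc_freq):
    """Pick candidate tokens by document frequency (the assumed statistical measure)."""
    return {tok for tok, postings in index.items() if len(postings) <= max_doc_freq}


def mask_document(text, candidates, token_metadata, sensitive_types, mask="[MASKED]"):
    """Mask document tokens that are candidates and whose metadata marks them sensitive."""
    def replace(match):
        tok = match.group(0)
        key = tok.lower()
        if key in candidates and token_metadata.get(key) in sensitive_types:
            return mask
        return tok
    return TOKEN_RE.sub(replace, text)


# Hypothetical sample corpus and hand-written token metadata.
docs = {
    "d1": "Patient Alice reported a headache",
    "d2": "Patient Bob reported a fever",
    "d3": "Patient Carol reported a cough",
}
token_metadata = {
    "alice": "person_name",
    "bob": "person_name",
    "carol": "person_name",
    "headache": "symptom",
    "fever": "symptom",
    "cough": "symptom",
}

index = build_inverted_index(docs)
candidates = select_candidates(index, max_doc_freq=1)
masked = mask_document(docs["d1"], candidates, token_metadata, {"person_name"})
print(masked)  # Patient [MASKED] reported a headache
```

In this sketch both "alice" and "headache" are rare enough to become candidates, but only "alice" is masked, because the metadata check is what decides which candidates count as sensitive.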