Channel: XML, System.Xml, MSXML and XmlLite forum

Data Scrubbing Utility Performance

I have written a utility that pulls the data from each table in a database, loops through it to cleanse any NPI/PII, and then saves the cleansed data to .sql files as INSERT statements. The utility is written in C# and uses a SqlDataReader to pull the data into the application. My biggest problem right now is that one of the tables has an XML column (the column itself is defined as nvarchar(max), not XML), and the XML packets contain NPI/PII data. To wipe that data, I have a .txt file listing every XPath that can contain it. I read the .txt file into memory, and then for each row I loop through the XPaths and cleanse whatever nodes are found. Needless to say, this takes a lot of time, because the table with the XML is rather large. Is there a better way to do this that would improve the utility's performance?

The code that cleanses the packet is:

        private static XmlNode CleanTxn(XmlDocument node, string[] xPaths)
        {
            XmlNamespaceManager nsmgr = new XmlNamespaceManager(node.NameTable);
            nsmgr.AddNamespace("a", "http://schemas.bankerssystems.com/2004/ExpereTxn");
            XmlElement root = node.DocumentElement;
            if (root == null) return node; // empty document: nothing to cleanse

            foreach (string xPath in xPaths)
            {
                // Qualify every step of the path with the "a" namespace prefix.
                string xp = xPath.Replace("/", "/a:");
                XmlNodeList nodeList = root.SelectNodes(xp, nsmgr);
                if (nodeList == null) continue;

                // Overwrite the text of every matching node with its cleansed value.
                foreach (XmlNode xn in nodeList)
                {
                    xn.InnerText = Utilities.CleanString(xn.InnerText);
                }
            }

            return node;
        }
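
For anyone who wants to reproduce the timing without the database, here is a minimal self-contained sketch of the per-row flow described above: parse the nvarchar text into an XmlDocument, run every XPath against it, and serialize the result back to text. The class name `ScrubSketch`, the method `CleanPacket`, and the masking `CleanString` are hypothetical stand-ins (the real `Utilities.CleanString` is not shown here), so treat this as an approximation of the utility rather than its actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ScrubSketch
{
    private const string Ns = "http://schemas.bankerssystems.com/2004/ExpereTxn";

    // Hypothetical stand-in for Utilities.CleanString: masks every character.
    public static string CleanString(string value) => new string('X', value.Length);

    // Cleanses one packet: parses the column text, masks every node matched by
    // an XPath, and returns the XML text that would go into the INSERT statement.
    public static string CleanPacket(string xml, IEnumerable<string> xPaths)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("a", Ns);

        foreach (string xPath in xPaths)
        {
            // Same prefixing step as CleanTxn: "/Txn/Name" becomes "/a:Txn/a:Name".
            var matches = doc.SelectNodes(xPath.Replace("/", "/a:"), nsmgr);
            if (matches == null) continue;

            foreach (XmlNode match in matches)
            {
                match.InnerText = CleanString(match.InnerText);
            }
        }

        return doc.OuterXml;
    }
}
```

Calling `ScrubSketch.CleanPacket(xml, new[] { "/Txn/Name" })` on a packet whose root is a hypothetical `Txn` element in that namespace masks the `Name` text and leaves the other elements untouched. Note that every row pays for a full document parse plus one XPath evaluation per entry in the list, which is where the time goes on a large table.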
