Detailed advantages of data cleansing
We have collected some more reasons why you should start data cleansing with our solution. Our solution can be…
Real-time validation is a new, preventive method of data cleansing that monitors and corrects master data before it is entered and stored in the database. Quality Monitor is a proprietary tool built on a step-by-step methodology, which gives companies an entry point whatever their current level of data quality. The solution puts an end to the idea that data cleansing has to be repeated again and again from time to time. As a unique tool on the market, it supports the creation of perfect-quality data.
Data cleansing means detecting and correcting (or removing) corrupt or inaccurate records in a record set, table or database. This is done with algorithmic tests (where data is inferred using mathematical algorithms) and by comparing data to a valid reference database and determining the correct value based on the comparison.
Fixing incorrectly entered names and consolidating incorrect formats helps to achieve even higher data quality.
In our solution, full names are divided into elements (prefix, first name, last name, etc.) and then each item is individually corrected according to its format rules.
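The split-then-correct approach described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the prefix list, the function names and the assumption that the given name comes before the family name are all hypothetical, and Hungarian names, which put the family name first, would need a different splitting rule.

```python
import re

# Hypothetical prefix list; a real system would use a curated dictionary.
PREFIXES = {"dr", "prof", "mr", "mrs", "ms"}


def split_name(full_name: str) -> dict:
    """Split a full name into prefix, first name and last name."""
    tokens = re.sub(r"\s+", " ", full_name.strip()).split(" ")
    parts = {"prefix": "", "first_name": "", "last_name": ""}
    if tokens and tokens[0].rstrip(".").lower() in PREFIXES:
        parts["prefix"] = tokens.pop(0)
    if tokens:
        # Assumption: the last remaining token is the family name.
        parts["last_name"] = tokens.pop()
    parts["first_name"] = " ".join(tokens)
    return parts


def fix_prefix(value: str) -> str:
    """Format rule for prefixes: capitalised, with a trailing dot."""
    return value.rstrip(".").capitalize() + "." if value else ""


def fix_name_part(value: str) -> str:
    """Format rule for name elements: capitalise every run of letters."""
    return re.sub(r"[^\W\d_]+", lambda m: m.group(0).capitalize(), value)


def clean_name(full_name: str) -> dict:
    """Split the name, then correct each element with its own rule."""
    parts = split_name(full_name)
    return {
        "prefix": fix_prefix(parts["prefix"]),
        "first_name": fix_name_part(parts["first_name"]),
        "last_name": fix_name_part(parts["last_name"]),
    }
```

For example, `clean_name("  dr   john  SMITH ")` yields the prefix `Dr.`, the first name `John` and the last name `Smith`.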
During address cleaning, addresses are first converted to a uniform format, then various correction processes are run: errors and typos are corrected, missing elements are filled in, abbreviations are expanded and old street names are replaced by the current ones. With this service, the gaps and outdated entries in address lists can be eliminated.
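As a rough illustration of these address steps, the Python sketch below normalises the format, expands abbreviations and replaces a renamed street. The two lookup tables and the function name are made up for the example. A real service would rely on a maintained address reference database and would also handle typos and missing elements, which this sketch does not.

```python
# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {"st": "Street", "ave": "Avenue", "rd": "Road"}
RENAMED_STREETS = {"Old Mill Street": "River Street"}


def clean_address(raw: str) -> str:
    """Normalise an address, expand abbreviations, update old street names."""
    words = []
    # split() with no argument also collapses repeated whitespace.
    for token in raw.split():
        key = token.rstrip(".,").lower()
        if key in ABBREVIATIONS:
            words.append(ABBREVIATIONS[key])
        else:
            words.append(token.capitalize())
    text = " ".join(words)
    # Replace obsolete street names with their current equivalents.
    for old_name, new_name in RENAMED_STREETS.items():
        text = text.replace(old_name, new_name)
    return text
```

With these tables, `clean_address("12  old mill st.")` gives `12 River Street`. Note that a naive abbreviation table is ambiguous in practice ("St" can also mean "Saint"), which is one reason a reference database is needed.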
E-mail addresses stored in the database might be flawed due to recording or typing errors.
To repair e-mail addresses, both algorithmic processing and comparison with a reference database can be performed.
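The combination of the two techniques might look like the following Python sketch. It is an assumption-laden illustration, not the product's logic: the domain list, the similarity cutoff and the syntax pattern are all hypothetical. The algorithmic part removes whitespace, lowercases the address and replaces commas with dots. The reference part snaps the domain to the closest known domain using the standard library's `difflib.get_close_matches`.

```python
import difflib
import re

# Hypothetical reference list of known-good domains.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com"]

# Deliberately simple syntax check, not a full address grammar.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[a-z]{2,}$")


def repair_email(raw: str):
    """Return a repaired e-mail address, or None if it cannot be repaired."""
    # Algorithmic step: drop whitespace, lowercase, fix comma-for-dot typos.
    email = re.sub(r"\s+", "", raw).lower().replace(",", ".")
    if email.count("@") != 1:
        return None
    local, domain = email.split("@")
    # Reference step: use the closest known domain if it is similar enough.
    matches = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=0.8)
    if matches:
        domain = matches[0]
    candidate = f"{local}@{domain}"
    return candidate if EMAIL_PATTERN.match(candidate) else None
```

Here `repair_email(" John.Doe@gmial,com ")` returns `john.doe@gmail.com`. Fuzzy matching against a short list can also "correct" a legitimate but rare domain, so in practice the cutoff and the reference list need careful tuning.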
Currently our solution is optimised for Hungarian identifiers: personal identification documents (identity card number, passport number), other personal identifiers (tax identification number), business identifiers (tax number, company registration number) and bank account numbers. These can be partially verified algorithmically, with the help of a reference dictionary.
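To show what such a partial check can look like in principle, here is a small Python sketch that combines a check-digit test with a dictionary lookup. Everything specific in it is hypothetical: the 8-digit format, the weights, the modulus and the prefix dictionary are assumptions made for illustration and are not taken from this text or from any official specification of a Hungarian identifier.

```python
# Hypothetical weights for the first seven digits of an 8-digit identifier.
WEIGHTS = (9, 7, 3, 1, 9, 7, 3)

# Hypothetical reference dictionary of valid two-digit prefixes.
KNOWN_PREFIXES = {"12", "41"}


def check_digit_ok(identifier: str) -> bool:
    """Algorithmic test: the last digit must match the computed check digit."""
    if len(identifier) != len(WEIGHTS) + 1:
        return False
    if not (identifier.isascii() and identifier.isdigit()):
        return False
    # zip stops after seven digits, so the check digit itself is excluded.
    total = sum(int(digit) * weight for digit, weight in zip(identifier, WEIGHTS))
    expected = (10 - total % 10) % 10
    return expected == int(identifier[-1])


def validate_identifier(identifier: str) -> bool:
    """Combine the check-digit test with the reference dictionary lookup."""
    return check_digit_ok(identifier) and identifier[:2] in KNOWN_PREFIXES
```

Under these made-up rules, `12345676` passes both tests, `12345670` fails the check digit, and `99000006` has a correct check digit but an unknown prefix. The verification is only partial: an identifier that passes both tests is well formed, but it may still not belong to the person or company in the record.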
Using a dedicated algorithm, we find records that describe the same entity (person, product, etc.). We assign a duplicate identifier to each of these duplicate groups. In each group, we select the record to be retained and eliminate the rest. The record retained is the one with the highest overall data quality.
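The grouping and selection steps can be sketched in Python as follows. The dedicated matching algorithm is not described here, so this sketch substitutes a much simpler technique: exact matching on a normalised name and e-mail address. The quality score (the number of filled-in fields) and the field names are likewise assumptions.

```python
from collections import defaultdict


def match_key(record: dict) -> tuple:
    """Hypothetical matching rule: same normalised name and e-mail."""
    return (record["name"].strip().lower(), record["email"].strip().lower())


def quality_score(record: dict) -> int:
    """Hypothetical quality measure: the number of non-empty fields."""
    return sum(1 for value in record.values()
               if value is not None and str(value).strip())


def deduplicate(records: list) -> tuple:
    """Return (kept, removed); every record is tagged with a duplicate_id."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)

    kept, removed = [], []
    for duplicate_id, members in enumerate(groups.values(), start=1):
        # Retain the record with the highest quality score in the group.
        best = max(members, key=quality_score)
        for record in members:
            tagged = {**record, "duplicate_id": duplicate_id}
            (kept if record is best else removed).append(tagged)
    return kept, removed
```

Given two records for the same person where only one has a phone number, the sketch keeps the more complete record and places the other in `removed`, both carrying the same `duplicate_id`. Real duplicate detection usually needs fuzzy matching, because duplicates rarely agree character for character.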
Are you interested in the detailed operation of our data cleansing and data validation solution? If you would like to know more about how we can improve your company’s data quality, request our whitepaper by clicking the “Learn more about our solution” button.
If you are just looking for help with doing data cleansing successfully on your own or in-house, be sure to download our FREE guideline, which contains ideas and best practices on this topic. Get this document by clicking the “Download free best practices” button.
DSS Consulting Ltd. has been providing IT services for large enterprises for 20 years. Our customers include both Hungarian and international partners (German, Swedish, Swiss, American). Our most significant competences include data cleansing, custom software development, data warehouse development and BI, software quality assurance, IT consulting and GDPR solutions. Read more about our services on www.dss.hu
© 2018 All Rights Reserved | Made with ❤ by DSS Consulting Kft | Data protection declaration