Opensource.com - https://opensource.com/article/18/5/command-line-data-auditing
I work part-time as a data auditor. Think of me as a proofreader who works with tables of data rather than pages of prose. The tables are exported from relational databases and are usually fairly modest in size: 100,000 to 1,000,000 records and 50 to 200 fields.
I have never seen an error-free data table. The messiness isn’t limited, as you might expect, to duplicate records, spelling and formatting errors, and data items placed in the wrong field. I also find: