Talk:Data profiling

From Wikipedia, the free encyclopedia

Six Sigma

The introduction states that one of the purposes of data profiling is to be able to apply Six Sigma methodologies to enterprise data. Six Sigma is a commercial product. This is inappropriate in a definition of the term. Countersubject 11:02, 5 September 2006 (UTC)

References to Cite

Here are some good references to pull from that should help legitimize and clean up the content here.

  • Data Quality and Data Profiling by David Loshin, 2008 - [1]
  • The Practitioner's Guide to Data Profiling by David Loshin, SAS/Dataflux - [2]
  • Three-Dimensional Data Analysis by Ed Lindsey, 2008 - [3]
  • Data Analysis with Open Source Tools - [4]

Preceding unsigned comment added by Paulboal (talk · contribs) 13:33, 14 May 2013 (UTC)

Data Profiling is Not Just for Tabular Data

One or more of the authors implied that data profiling is only used on tabular data. However, data profiling also works on graph and document structures. For example, JSON and XML data can be profiled.
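To make the point concrete, here is a rough sketch of what profiling JSON documents could look like. It is only an illustration, not taken from any of the cited sources: the helper names `walk` and `profile`, the dotted-path notation, and the sample records are all assumptions made up for this example. It counts which value types appear at each path across a set of documents, which is roughly the document-structure counterpart of per-column type statistics in a table.

```python
import json
from collections import Counter, defaultdict


def walk(value, path=""):
    """Yield (path, leaf_value) pairs for every scalar in a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        # All elements of an array share one path, marked with "[]".
        for child in value:
            yield from walk(child, path + "[]")
    else:
        yield path, value


def profile(documents):
    """Count, per path, how often each Python type occurs across the documents."""
    stats = defaultdict(Counter)
    for doc in documents:
        for path, leaf in walk(doc):
            stats[path][type(leaf).__name__] += 1
    return {path: dict(counts) for path, counts in stats.items()}


# Hypothetical sample records, invented for this sketch.
raw = [
    '{"id": 1, "name": "Ann", "tags": ["a", "b"], "address": {"zip": "02139"}}',
    '{"id": "2", "name": null, "tags": [], "address": {"zip": 2139}}',
]
report = profile(json.loads(line) for line in raw)
for path in sorted(report):
    print(path, report[path])
```

On these two records the report shows that `id` and `address.zip` each hold a mix of integers and strings, and that `name` is sometimes null, which is the kind of inconsistency profiling is meant to surface. Empty arrays and objects contribute nothing in this sketch; a fuller profiler would probably also track how often each path is missing.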

Not just for Data Warehousing

The references to Kimball are good, but the article should indicate that data profiling has much wider use than just warehouses. It is equally applicable to OLTP systems, Big Data, and machine learning/predictive analytics that are not warehouse driven, among others. 205.214.190.129 (talk) 17:13, 2 August 2016 (UTC)