Data quality is one of the biggest problems with data science projects.<\/p>\n
I’ll be talking about the main aspects of data quality at the\u00a0#AutomationGuild<\/strong>. Here’s a quick list:<\/p>\n – Accuracy. Is the data accurate for the context in which it will be used?<\/p>\n – Validity. Is the data fresh enough to still be valid?<\/p>\n – Consistency. Does data from different sources \/ time frames match?<\/p>\n – Completeness. Are any parts of the data truncated \/ missing?<\/p>\n – Uniqueness. Is there enough data to uniquely identify records?<\/p>\n – Timeliness. Is the data collected at the right time and processed in a timely fashion (efficiently enough)?<\/p>\n More on the conference here:<\/p>\n http:\/\/amp.gs\/Dydu<\/p>\n #QsDaily<\/strong>\u00a0#BigData<\/strong>\u00a0#DataScience<\/strong>\u00a0#Testing<\/strong><\/p>\n<\/div>
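Several of the checks in the list above can be automated. The snippet below is only a rough sketch and is not taken from the talk: the sample records, field names, and helper functions are all hypothetical. It covers completeness, uniqueness, validity (freshness), and consistency. Accuracy and timeliness are left out because they usually need a trusted reference dataset or pipeline timing data.

```python
from datetime import datetime, timedelta

# Hypothetical sample records; the field names are illustrative assumptions.
RECORDS = [
    {"id": 1, "email": "a@example.com", "updated": datetime(2024, 1, 10)},
    {"id": 2, "email": None, "updated": datetime(2024, 1, 9)},
    {"id": 2, "email": "c@example.com", "updated": datetime(2023, 6, 1)},
]


def incomplete(records, required):
    """Completeness: records where a required field is missing or None."""
    return [r for r in records if any(r.get(f) is None for f in required)]


def duplicate_keys(records, key):
    """Uniqueness: key values that appear on more than one record."""
    seen, dupes = set(), set()
    for r in records:
        value = r[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def stale(records, field, now, max_age):
    """Validity: records whose timestamp is older than max_age."""
    return [r for r in records if now - r[field] > max_age]


def mismatched(source_a, source_b, key, field):
    """Consistency: keys whose field value differs between two sources."""
    b_by_key = {r[key]: r[field] for r in source_b}
    return {
        r[key]
        for r in source_a
        if r[key] in b_by_key and b_by_key[r[key]] != r[field]
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 11)  # fixed "current" time so the run is repeatable
    print("incomplete:", incomplete(RECORDS, ["id", "email"]))
    print("duplicate ids:", duplicate_keys(RECORDS, "id"))
    print("stale:", stale(RECORDS, "updated", now, timedelta(days=30)))
```

In a real project these checks would more likely run against a database or data frame, with thresholds such as the 30-day maximum age chosen per dataset.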