GSL Blog - Dirty Data


Dark background, the word 'data' in green Courier font, splattered with what looks like wet mud. Anyone who has ever tried to do analysis on data which was collected without any kind of validation knows the misery of trying to tidy it up so the results can be used without too much pain.

Issues that prevent instant analysis include: ... the list goes on!

To save a lot of time and heartache, it's important to make sure your data collection as is clean as it can be - even if it doesn’t feel necessary at the time.

Perhaps you have a spreadsheet shared amongst a team for tracking something, maybe using Google Forms (other forms automation solutions are available) to get booking information for an event, you might even have custom web forms which send data into a SQL database.

You might want to use that data in a way you don't currently intend one day.
If saving from web pages into a database it is vitally important that you protect against the various text field attack vectors (SQL injection is a notable example).

Our advice when implementing your data collection system (the best system being some sort of form) is to provide a pick list for people to choose instead of free text.

There are several more solutions for helping with data entry which have the added benefit of saving keystrokes including date pickers and mandatory fields.

If you are struggling with cleaning up a dataset, need help adding data validation to your solution or help building a solution from scratch, please let us know - we're looking forward to helping you.

#data #people #automation #sql #database #building #forms #dataquality #cleandata


© Groves Solutions Ltd 2023 - All Rights Reserved