Stop Talking & Start Digging: The Importance of Getting Dirty with Data

Today’s world can be characterized by increasing speed, complexity, disorder, and interconnectedness. For organizations trying to understand their operating environment, develop products, improve services, advance their mission, identify gaps, and support overall decision-making and strategic planning, this presents a wide array of challenges. As a result, organizational processes should be focused on overcoming these challenges and should be driven by the desire for solutions – forward-looking solutions that deepen understanding, improve productivity, increase efficiencies, and maximize the chance for success.

Finding or creating a solution to a complex problem requires careful planning and thought. We must break down the problem into simpler, manageable components, identify and characterize root causes, and involve relevant stakeholders in discussions and feedback sessions. We must look across our sources of data, identify any real limitations and gaps, and plan how to apply analytical methods across the data to extract insights. The problem is, in a world of accelerating information, needs, and problems, it’s just too easy to get caught in the planning and thinking stage. We need to get down and dirty with our data.

In the year 2011, we are surrounded by resources, libraries, catalogs, tools, and software – much of it open source and/or freely available for our own personal (AND collective) use. We must learn to access and leverage these resources efficiently, not only to perform cleansing and synthesis functions, but also to inform our collection and analysis processes to make them better as time goes on. Armed with these resources and tools, we must feel comfortable jumping right into our data with the confidence that insights will be gained that otherwise would have been lost in time.

Slicing is a helpful example of this. When faced with a high-dimension data set, usually with poorly described variables, start by slicing the data into a manageable chunk with high-powered variables – time, location, name, category, score, etc. Use a data visualization program to understand order, geospatial distribution, or categorical breakdowns. Describe the data and ask questions about how collection processes led to any gaps that exist. Simple slicing and dicing separate from the root analysis can often chart a potentially workable path forward.
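
As a rough illustration of that first pass, here is a minimal Python sketch using only the standard library. Everything in it is made up for the example: the sample data, the column names (including cryptic ones like x17 and flag_b), and the helper names slice_rows and describe. It keeps a few high-powered variables, then counts missing and distinct values so the gaps are visible before any deeper analysis begins.

```python
import csv
import io
from collections import Counter

# Hypothetical raw export: several columns, some poorly described,
# some values missing. Invented purely for illustration.
RAW_CSV = """id,date,city,category,score,x17,flag_b,notes
1,2011-01-03,Boston,A,7,0.2,Y,
2,2011-01-03,Denver,B,,0.9,N,late
3,2011-01-04,Boston,A,9,0.1,Y,
4,,Austin,C,4,0.5,N,
5,2011-01-06,Denver,B,6,0.7,,recheck
"""

# The "high-powered" variables we choose to keep (an assumption for this data).
KEY_COLUMNS = ["date", "city", "category", "score"]


def slice_rows(rows, columns):
    """Keep only the chosen columns from each row."""
    return [{col: row[col] for col in columns} for row in rows]


def describe(rows, columns):
    """Per column: count of missing values, count of distinct values, most common value."""
    summary = {}
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v != ""]
        summary[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(1)[0] if present else None,
        }
    return summary


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
sliced = slice_rows(rows, KEY_COLUMNS)
summary = describe(sliced, KEY_COLUMNS)

for col, stats in summary.items():
    print(col, stats)
```

Even on a toy file like this, the summary raises the kind of question worth asking early: why does one record have no date, and why is one score blank? Those are questions about the collection process, and they surface in minutes rather than after weeks of planning.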

The bottom line is that whether it’s dirty data or larger-scale, socially-complex problems, we sometimes need to shorten the discussion of the problem itself and get our hands dirty. Sometimes we need to create a little chaos upfront in order to shake things loose and find our intended order, structure, and path forward. After all, planning your dive is important, but sometimes you need to just dive in and see where it leads you.
