Stop Talking & Start Digging: The Importance of Getting Dirty with Data

Today’s world can be characterized by increasing speed, complexity, disorder, and interconnectedness. For organizations trying to understand their operating environment, develop products, improve services, advance their mission, identify gaps, and support overall decision-making and strategic planning, this presents a wide array of challenges. As a result, organizational processes should be focused on overcoming these challenges and should be driven by the desire for solutions – forward-looking solutions that deepen understanding, improve productivity, increase efficiencies, and maximize the chance for success.

Finding or creating a solution to a complex problem requires careful planning and thought. We must break down the problem into simpler, manageable components, identify and characterize root causes, and involve relevant stakeholders in discussions and feedback sessions. We must look across our sources of data, identify any real limitations and gaps, and plan how to execute analytical methods across the data to extract insights. The problem is, in a world of accelerating information, needs, and problems, it’s just too easy to get caught in the planning and thinking stage. We need to get down and dirty with our data.

In the year 2011, we are surrounded by resources, libraries, catalogs, tools, and software – much of it open source and/or freely available for our own personal (AND collective) use. We must learn to access and leverage these resources efficiently, not only to perform cleansing and synthesis functions, but also to inform our collection and analysis processes to make them better as time goes on. Armed with these resources and tools, we must feel comfortable jumping right into our data with the confidence that insights will be gained that otherwise would have been lost in time.

Slicing is a helpful example of this. When faced with a high-dimension data set, usually with poorly described variables, start by slicing the data into a manageable chunk with high-powered variables – time, location, name, category, score, etc. Use a data visualization program to understand order, geospatial distribution, or categorical breakdowns. Describe the data and ask questions about how collection processes led to any gaps that exist. Simple slicing and dicing separate from the root analysis can often chart a potentially workable path forward.
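
As a rough illustration of that first slice, here is a minimal Python sketch using only the standard library. The records, the field names, and the helper functions (`slice_records`, `count_by`, `missing_counts`) are all made up for this example; the point is only the shape of the workflow: keep a few high-powered variables, get a categorical breakdown, and count the gaps.

```python
from collections import Counter

# Hypothetical records standing in for a messy, high-dimension data set.
# All field names and values here are invented for illustration.
records = [
    {"time": "2011-01-03", "location": "Boston", "category": "A", "score": 7, "notes": "free text"},
    {"time": "2011-01-04", "location": None, "category": "B", "score": 5, "notes": ""},
    {"time": "2011-01-04", "location": "Denver", "category": "A", "notes": "no score recorded"},
    {"time": "2011-01-06", "location": "Boston", "category": "A", "score": 9, "notes": ""},
]

# The handful of high-powered variables we choose to keep.
KEY_FIELDS = ("time", "location", "category", "score")


def slice_records(rows, fields):
    """Keep only the chosen fields; absent fields become None."""
    return [{f: row.get(f) for f in fields} for row in rows]


def count_by(rows, field):
    """Categorical breakdown: number of rows per value of `field`."""
    return Counter(row[field] for row in rows)


def missing_counts(rows, fields):
    """Number of rows where each field is None, to surface collection gaps."""
    return {f: sum(1 for row in rows if row.get(f) is None) for f in fields}


sliced = slice_records(records, KEY_FIELDS)
by_category = count_by(sliced, "category")
gaps = missing_counts(sliced, KEY_FIELDS)

print(by_category)
print(gaps)
```

The gap counts are the interesting part: a missing location or score is a prompt to go back and ask how the collection process produced it.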

The bottom line is that whether it’s dirty data or larger-scale, socially-complex problems, we sometimes need to shorten the discussion of the problem itself and get our hands dirty. Sometimes we need to create a little chaos upfront in order to shake things loose and find our intended order, structure, and path forward. After all, planning your dive is important, but sometimes you need to just dive in and see where it leads you.

Knowns, Unknowns, and Aether Abound

Much of our lives is about problems and solutions. Faced with a barrier, we find a way to knock it down. Presented with a challenge, we work to overcome it. Our collective problems bring us together, and our collective solutions make us safer, stronger, and happier.

These problems come in many shapes and sizes: math problems, career problems, logistical problems, emotional problems, physical problems. Rather than maintain special problem-solving techniques for individual problem types, we can expand our methods into a global group, and learn from one type what may be helpful for another. Sticking with math – a common language and underlying framework of nature and intellect – we can relate our methods for solving math problems to the rest of the world around (and above) us.

There is an innate simplicity to many math problems: there are knowns and there are unknowns. The solutions often reside in the application of methods and operators to the knowns to determine one (or all) of the unknowns. Therefore, the first step is often determining what is known and what is unknown. Although this notion has been most popularly represented by set theory (a foundational system of mathematics that deals with collections of objects), this concept has been the spark for other applied methods and disciplines through which many more complex problems are tackled in today’s society.

In game theory, a player’s strategy can be represented by differentiating the sets of moves that could make positive gains versus those that could make negative gains, given the possible situations at each stage of the game. Closely related is decision theory, where we look for the pros and cons, uncertainties, and rationalities behind potential decisions to determine an optimal course of action. In chaos theory, we define initial conditions and explore how the behaviors of some dynamical systems change as those knowns vary or as unknowns are introduced into the system.

To delve deeper into the questions of known-unknown identification, set theory, and related applied methods, we can think about a problem that began on day one, has no end in sight, yet has made incredible progress over centuries in terms of approaching a solution: what’s above us? What’s with the sky, the planets, the stars, the universe – the aether that surrounds us?

Is the total set of knowns and unknowns about the universe infinite? Does a new known always present us with a new unknown? Is the same true for every problem, or just some? For which types of problems might this be true? Are we better at approaching a solution collectively or as individuals? How can this be determined at the onset of a problem? If the set of unknowns has no limit or boundary, is the solution intelligently impossible? Does a single element of randomness deny a complete solution from ever being possible? Are we better off existing without a solution? Or would we be complete with a world of all knowns?

I fundamentally believe that we find meaning in life through the unknowns, not the knowns. The set of unknowns is infinite, and it is our drive to understand unknowns and, in general, the curiosity into the mysterious world that provides completeness. The knowns give safety, guidance, comfort, and pleasure.

For all problems we face, and as with the aether abound, we can continue to move forward, learn what we know, and question that which we don’t know. We can start with sets – knowns and unknowns – and move from there. Problem solving can be simple, if you start simple. As for the things we don’t know we don’t know – the unknown unknowns – well, we better stay curious with the mysterious, and just be happy for that.

Spectrum Logic

The visual representation of information is critical for both learning and teaching. To put something on paper and organize the information so as to make visual sense – in words, lines, colors, and curves – is to recognize some understanding and to create a basis for new insight and discovery.

Logic is the study of reasoning, the systematic approach to reaching a conclusion, or the examination of competing arguments with regard to a central issue or question. Logic can be broken down into deductive and inductive reasoning, the former drawing conclusions from definitions or axioms and the latter drawing conclusions from specific examples. Logic can also be broken down into analysis and synthesis, the former examining individual component parts and the latter combining component parts into a whole. In any event, logic is a way to get from question to answer, disbelief to belief, and data to insight.

One such type of logic is visual logic, or what I’ll call “spectrum logic”. It’s the combination of the visual representation of information and the many realms of logic. The reason I use the term “spectrum” is two-fold. First of all, it’s by definition the representation of a full range of possible values/conditions for a given topic. And second of all, it suggests continuity along its range and therefore implies a high level of seamlessness and efficiency.

So in the world of analysis and problem solving, how do we apply spectrum logic? Well, just follow every possible visual path from any origin within your visual space and try to optimize your path to the result. Place your problem in the center of a sphere/cube and run the full spectrum of paths to that center point. Left to right and right to left, bottom-up and top-down, outside in and inside out, spiraling inward and spiraling out. Think about the component parts that make up the visual space, and the conditions that fall along each path. Why is your problem so complex? What makes it so complex? Can you qualify your problem in color, words, shape, and text? Can you quantify it and its components? Is it made up of many unknown dimensions or a few known ones? Picture your problem, logically break it apart, and put it back together. Take a diverse set of paths to and from your problem, and find out which one gives you an optimal set of insights in return. Hopefully, if the answers and conclusions are not clear, you’ll at least have learned something in the process.

The Intersection Of Expertise

As I begin my job search (25 applications in 2 days so far!) I keep asking myself how to describe what I’m looking for in a job and in what realm I wish to work. There is no specific job title that describes my experience and education (e.g. “doctor” or “software engineer”) and there is no one department in which I’ve worked or wish to work (e.g. “Operations” or “Logistics”). Yes, I have an academic background in mathematics & statistics, yet it’s difficult to communicate why I have that academic background. I do not necessarily want to become a statistician; rather, I fully understand the quantitative nature of things and the power that numbers, math, and quantitative methods have in all aspects of business, government, and life.

So where does this leave me? Well, unemployed and confused, for one. But that’s okay with me. I’m confident that with my capabilities, no matter how hard they may be to communicate in an application or even to a recruiter, I’ll find the position that leverages my abilities and motivation.

That being said, I think I’m at least getting close to describing where I stand, and in real-world terms. It’s at an intersection of sorts – between quantitative methods, scientific and technological realms, and the human element. It’s interdisciplinary – can fit within any group or team or stand alone as an independent researcher or consultant. It’s also dynamic – parallels the speed with which modern business operates and the flexibility required to optimally support the needs and requirements of many types of personnel.

I’ve used a similar image a few times, in posts on knowledge innovation and math in 2010 and beyond. Here I’ve intersected three main topics while including some of my strengths in the middle. Now if I could only match those to a job title…

At what intersection do you operate?