Some Statistics on Civil Wars

The Economist: “How to Stop the Fighting, Sometimes” (11/09/2013)

[Image: Ongoing military conflicts]

I’m always very impressed with how The Economist weaves statistics (seemingly very credible ones) into its articles. In some cases they are the baseline from which the writer provides opinion, context, and insight. In other cases, they are the context – supporting a strong central theory with some quantitative love.

The recent article on civil wars was both informative on the subject and a prime example of how powerful statistics can provide essential context to a central theory. Some of the most interesting statistics extracted from the article include:

  • Of the 150 large intrastate wars since 1945, fewer than 10 are ongoing.
  • The rate at which civil wars start is the same today as it has been for 60 years; they kick off every year in 1-2% of countries.
  • The average length of civil wars dropped from 4.6 to 3.7 years after 1991.
  • From 1945-1989, civil war afflicted 18% of the world’s nations.
  • Since 1989, victory for one side has only occurred in 13% of cases (compared to 58% before 1989).
  • Since 1989, negotiated endings have occurred in almost 40% of cases (compared to 10% before 1989).
  • Leadership changes are a factor in the termination of 25-40% of civil wars.
  • Since its founding, the UN has completed 53 peacekeeping missions. The 15 ongoing ones employ almost 100,000 in uniform.

It seems a good chunk of data came from the Centre for the Study of Civil War at the Peace Research Institute Oslo (PRIO). Very cool. Image is from Wikipedia and shows ongoing military conflicts around the world (major in red, minor in orange).

A Universal Concept Classification Framework (UCCF)

Background

Whether it’s for building the perfect chapter title, analyzing existing literature, or maybe just a personal etymological adventure, there is usefulness in providing quantitative context to words and concepts. Should such a framework exist, it should be easy to understand and broadly applicable for authors, students, and other individuals alike.

The Universal Concept Classification Framework (UCCF) proposed below involves five categories in which any word/concept can be scored. Each category’s score has range [0,20], spanning the full spectrum of possible values across each category. Where possible, the highest possible score for each category (20) should represent the more complex end of the spectrum (see below). The individual scores can then be summed to give a combined UCCF Score with range [0,100].

The individual category scores as well as the combined UCCF Score provide an easy way for readers and writers to understand and analyze the relative impact of certain words/concepts on readers, among other applications.

Universal Concept Classification Framework (UCCF)

  • Get (Concrete=0, Abstract=20): Low scores represent words/concepts that are concrete, tangible, well-defined, and easy to understand. High scores represent words/concepts that are abstract and open to interpretation.
  • Act (Controllable=0, Uncontrollable=20): Low scores represent words/concepts that are controllable, created, and/or driven by an individual, group, or machine. High scores represent words/concepts that are by nature uncontrollable.
  • Dim (Independent=0, Dependent=20): Low scores represent words/concepts that are independent of other words/concepts and can stand alone for meaning and interpretation. High scores represent words/concepts that are complex, very dependent upon other words/concepts, and are often very interconnected to support interpretation.
  • Set (Known=0, Changing/Unknown=20): Low scores represent words/concepts that are very well known and not subject to change in meaning or interpretation across time, language, and society. High scores represent words/concepts that change rapidly or may be universally undefined across time, language, and society.
  • Rad (Plain=0, Intriguing=20): Low scores represent words/concepts that are plain and without dimension. High scores represent words/concepts that are multidimensional, mysterious, and full of intrigue.
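As a rough sketch of how the scoring described above could be computed, the snippet below checks that each of the five category scores falls within [0,20] and sums them into the combined UCCF Score. The function name `combined_uccf_score` and the example scores for the word "love" are assumptions made purely for illustration; they are not part of the framework itself.

```python
# The five UCCF categories, in the order listed above.
CATEGORIES = ("Get", "Act", "Dim", "Set", "Rad")
MIN_SCORE, MAX_SCORE = 0, 20


def combined_uccf_score(scores):
    """Sum the five category scores after checking each is within [0, 20]."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in CATEGORIES:
        value = scores[category]
        if not MIN_SCORE <= value <= MAX_SCORE:
            raise ValueError(f"{category} score {value} is outside [0, 20]")
    return sum(scores[c] for c in CATEGORIES)


# Hypothetical scores for the word "love"; the numbers are illustrative only.
love = {"Get": 18, "Act": 15, "Dim": 14, "Set": 12, "Rad": 19}
print(combined_uccf_score(love))  # 78
```

Keeping the categories in a plain dictionary makes it easy to compare individual category scores across words as well as the combined total.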

Limitations/Applications

No framework is without fault, and especially in the measurement of unstructured information, the UCCF certainly has limitations. However, it’s a quick and easy way to begin to better understand words/concepts, and I believe this type of methodology has broad applications.

One example is in the building of book titles and chapters, where authors may want to represent a broad spectrum of word types. One type of chapter may want to maximize combined UCCF Scores, others may want to keep combined UCCF Scores to a minimum, and a third type may want to have words that cover the widest range of combined UCCF Scores.
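To make the three title-building strategies a bit more concrete, here is a minimal sketch that picks words by highest combined score, lowest combined score, or an even spread across the ranking. The candidate words, their scores, and the helper names (`ranked`, `top_k`, `bottom_k`, `spread_k`) are all hypothetical, and "evenly spaced across the ranking" is just one possible reading of "widest range".

```python
# Hypothetical combined UCCF Scores (0-100); the words and numbers are illustrative only.
candidate_scores = {
    "rock": 12,
    "contract": 35,
    "weather": 55,
    "justice": 74,
    "fate": 90,
}


def ranked(scores):
    """Return the words ordered from lowest to highest combined score."""
    return sorted(scores, key=scores.get)


def top_k(scores, k):
    """Words that maximize the combined score."""
    return ranked(scores)[::-1][:k]


def bottom_k(scores, k):
    """Words that keep the combined score to a minimum."""
    return ranked(scores)[:k]


def spread_k(scores, k):
    """Words spaced evenly across the ranking, endpoints included."""
    words = ranked(scores)
    if not 2 <= k <= len(words):
        raise ValueError("k must be between 2 and the number of words")
    last = len(words) - 1
    return [words[(i * last) // (k - 1)] for i in range(k)]


print(top_k(candidate_scores, 2))     # ['fate', 'justice']
print(bottom_k(candidate_scores, 2))  # ['rock', 'contract']
print(spread_k(candidate_scores, 3))  # ['rock', 'weather', 'fate']
```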

Another application may be in the analysis of certain authors, languages, or successful books in general. Do authors write about similar concepts according to the UCCF? Is there a correlation between successful books and the UCCF Scores represented by certain titles? These types of questions could be investigated using a new quantitative approach.

In general, applying simple quantitative methods to abstract ideas can provide a new way of thinking about and contextualizing decisions, such as choosing book titles, and of analyzing content and content creators, such as popular authors/bloggers.

The Future of Analytics and Operations Research (WINFORMS Panel Discussion)

Program/Title: The Future of Analytics and Operations Research
Organization: Washington, DC Chapter of the Institute for Operations Research and the Management Sciences (WINFORMS)
Date/Time: Tue February 21, 2012 1800-2030 EST
Description: The exponential explosion in the amount of data available has spawned a new field: “analytics.” This recent arriviste is forcing the operations research (OR) community to reconsider how we work, with both clear benefits and risks – not only in areas like data integrity, but the very foundations of statistical problem-solving. How do we define analytics, and how does analytics relate to OR? What is the future of analytics? We’ll ask these provocative questions and others to three of our best OR intellectuals in the Washington DC area.

General Notes / Topics of Discussion

  • The difference between having an “outcomes focus” versus a “process focus”
  • Scope of similar disciplines – analytics and operations research – are they competing or allied?
  • Communication to decision makers critical – how are these skills being developed in both disciplines?
  • Philosophy of science / having the “soft skills” – is this taught, learned, or experienced?
  • When to shy away from problems – lack of customer support, intended answer, etc.
  • The difference between problems and messes… which is worse?
  • Defining constraints/limitations and discussing assumptions (e.g. acceptable solutions under certain budget constraints)
  • The importance of defining (and redefining) the problem. Critical in today’s business climate.
  • Ideal skills: Hacker skills, subject matter expertise, communication skills, ability to listen, wargaming, organizational psychology, humility, natural curiosity
  • Other related disciplines: Data Science, Statistics, Business Analytics, Big Data, etc. – how do these affect the operations research community?

World Statistics Day and the Importance of Statistics in Government

The most recent issue of Amstat News features a wonderful summary of the first ever World Statistics Day, which just occurred on October 20, 2010. The article features a series of quotes from the chief statisticians at various U.S. government agencies, all of which serve as a great overview of the critical importance, broad applicability, and growing need for statistics and statistics professionals in the U.S. and around the world. Collectively, we must embrace not only the numbers, data, methods, analyses, and reports, but also the conversations and the debate around such components. In a world heavily fueled by data, I’m very glad that statistics is gaining more international awareness and recognition so that all our lives can be bettered by more informed decisions and debates.

“Statistics produced by the federal government inform public and private decisionmakers in shaping policies, managing and monitoring programs, identifying problems and opportunities for improvement, tracking progress, and measuring change. The programs of our statistical system furnish key information to guide decisionmakers as they respond to pressing challenges, including those associated with the economy, agriculture, crime, education, energy, the environment, health, science, and transportation. In a very real sense, these statistics provide data users with a lens to focus the myriad activities of our society into a more coherent picture of the status, progress, and trends in our nation. The ability of governments, businesses, and individuals to make appropriate decisions about budgets, employment, investments, taxes, and a host of other important matters depends critically on the ready availability of relevant, accurate, and timely federal statistics. Our economy’s complexity, growth, and rapid structural changes require that public and private leaders have unbiased, relevant information on which to base their decisions.”
– Katherine Wallman, Chief Statistician, Office of Management and Budget and Past President of the ASA

Focus, Balance, and Strength

One is for focus, two for balance, and three for strength. From the most basic sequence of integers we can understand critical characteristics and qualities that, in a sense, provide a backbone by which we can be happy, learn, and grow.

One is one. There is nothing to surround it, there is nothing to be bent. It’s the focal point of many, and the starting spot for all. Above one comes everything else and into one everything comes.

Our society puts a lot of focus on one. We like to see a single result and hear a single voice. We want to find our soul mate and discover the holy grail. We seek to structure our world by its basic individual units, the atoms and nodes. We break down our problems into individually digestible chunks. One is the basic unit of math, the center of gravity, the perfect result. One is the focus and concentration of everything else.

But one stands alone. Where one is one, one is only one. One would be none if no two came from one.

Two is the balance of ones, the pairs of nature, the couplets of science, the squares of math, the rhythm and meter of poetry. Two is evenness and congruence. Two is good and evil, hot and cold, yes and no, high and low, winners and losers, protons and electrons, male and female, life and death. From two we can find harmony and bliss and make connections not previously seen by focusing on one. Two is love. Love is two. Two is the threading of life and the creator of balance within the cosmos. Two is the secret order within disorder, through connections and relationships that make us more than one.

But two still lacks shape. Where two is two, there is only one view of two. Two would be one if no three came from two.

Three is the unit of strength, the shape of our space. It represents our current (most common) perception of spatial dimensions. Three is triangulation, inflection, exponentiation, and curvature. Three is the operation and its result – a combination of the whole picture. Threes provide motion and non-linearity, a dynamic quality of life. Threes make twos unique and unbounded while making stronger our threads. Three is two and one together, forging balance and focus for strength.

Three is the strongest number. Geometrically, the triangle is the only polygon that cannot be deformed without changing the length of one of its sides. Spatially, three provides dimension and perception. Three is our basic unit of existence and reality, and well, most of our buildings too.

Three also represents complexity in knowledge. If two is the threads, three is the knots. Three is multiple connections – knowledge with shape. Tie two threads together and you’re building new shapes, discovering new binds, making new questions for answers worth seeking.

And triplets are an optimization of our minds. Remember two things and you could have remembered a third. Try to remember four things and you are likely to leave one out. Triplets are an innate unit of the human mind, something by which we are all naturally bound.

Focus, balance, and strength. With three we find strength, and from three we derive balance and focus. Three qualities that make us better individuals, partners, and citizens. Three qualities that, if we learn to utilize and optimize through our life, will surely better our professional, personal, and spiritual lives.

And at the end of the day, numbers are an underlying language of life. We can look to numbers to represent many aspects of life – both physical and philosophical – to help understand how we interact, how we grow, and how to succeed. Looking at a simple sequence of numbers can provide insights that are easier to understand in a world of infinite space and color. Numbers help provide shape to our thoughts and can thread our understanding across cultures and generations. Now did somebody say math is boring? 🙂

Math Tricks, Negative Space, and Simple Beauty

Once again we start with two of my favorite things: soccer and math. I’ve talked about them both at length, for example in my “geometry in soccer” post from March 2009. Both are related by a similar underlying, structured framework. Both have rules, methods, and strategies for finding success, whether that’s solving a problem or winning a game.

What most non-players don’t understand is that despite the rules that govern both math and soccer, there are tricks to the game as well. These are the visions and insights that exist not within the simple rules and methods of an operation or a play, but rather in the negative space – the non-obvious space surrounding the operations and plays. You may find, more often than not, that recognizing these tricks in all aspects of life can provide the competitive advantage necessary for happiness and success.

The soccer tricks will have to wait until after some knee surgery, so for now, I’ll stick with the math. There are thousands of known tricks in math, and probably countless unknown tricks waiting for an epiphany of recognition. Here’s an example:

Squaring Any Number Ending in “5”

Although this works for any number that ends in 5, it’s probably most practical for two digit numbers when no calculator is present. Let’s use 65 as an example, where we try to quickly compute 65 squared, or 65^2.

All you have to do is look at the number to the left of the “5” in the ones place. For our example, we have a “6”. Multiply this number by the number that follows it sequentially, which is “7” for our example. We get 6*7=42. To find our final answer of 65^2, all we have to do is take the result of our multiplication and append a “25” to the end of it, recognizing that the last two digits of the square of any number ending in “5” will always be “25”.

So for our example, we have “42” + “25” which gives us 4,225. The square of 65 is 4,225. Pretty neat, huh? Try it with some others…

25^2:   2*3 = 6,       “6” + “25”    = 625
95^2:   9*10 = 90,     “90” + “25”   = 9,025
475^2:  47*48 = 2,256, “2256” + “25” = 225,625
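For anyone who would rather check the trick by machine, here is a small sketch of it in Python. The function name `square_ending_in_5` is made up for this illustration; the steps simply mirror the description above (take the digits left of the “5”, multiply by the next integer, append “25”).

```python
def square_ending_in_5(n):
    """Square a non-negative integer ending in 5 using the shortcut described above."""
    if n < 0 or n % 10 != 5:
        raise ValueError("n must be a non-negative integer ending in 5")
    x = n // 10                # digits to the left of the final 5
    prefix = x * (x + 1)       # multiply by the next sequential integer
    return int(f"{prefix}25")  # append "25" to the product


for n in (25, 65, 95, 475):
    print(n, square_ending_in_5(n))

# Compare the shortcut with ordinary multiplication for every qualifying n below 10,000.
assert all(square_ending_in_5(n) == n * n for n in range(5, 10_000, 10))
```

The final line is a brute-force check, not a proof; it only confirms the shortcut agrees with ordinary squaring over the tested range.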

For a proof, I’ve looked to Dr. Math at the MathForum.org website. Here goes:

Let’s generalize a two-digit number ending in “5” by the representation X5, where X could be 1, 2, ..., 8, or 9. Essentially, X5 is really a shorthand notation for the integer represented by

10*X + 5

Let’s go ahead and square X5:

(X5)^2 = (10*X + 5)^2 = (10*X + 5)*(10*X + 5) = 100*X^2 + 100*X + 25

Now factor out the 100 and an X from the first two terms:

= 100(X^2 + X) + 25 = 100*X*(X+1) + 25

Looking at this closely, you can see that this is exactly the product of X and the next sequential integer (X+1) with “25” appended to the end. Pretty cool, huh?

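Notice that this trick works for squaring any integer that ends in “5”, not just two-digit numbers. Dr. Math shows us that for larger numbers the proof would have to be modified a bit (since not all integers that end in “5” can be represented by 10*X + 5 with X a single digit). Letting X stand for any positive integer, rather than a single digit, is enough: the algebra above goes through unchanged.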

Seemingly Complex, But Beautifully Simple

Although the rules and structure of math may at times seem complex and chaotic, in the negative space of math we can find a beautiful simplicity through which things can fall in place. The same can be true for soccer, language, love, astronomy, cooking, and all aspects of life. Sometimes we’ve defined a framework (or have had it defined for us) of rules and methods to follow. But if we take a step back, look between the numbers and think outside the box, maybe we’ll find a simpler route to happiness and success.

A List of Some Web Data Sources

Well, I needed to pull together a listing of publicly available data sources for a project, so I figured I’d post them here as well. Some descriptions and tag lines have been taken directly from the website, and some I quickly created on my own. This list is by no means comprehensive (I probably have about 100 links in the “Data” folder of my bookmarks…) but it’s a quick snapshot of some useful data sources on the web. That being said, there are a lot of considerations when targeting a data set, and tomorrow’s need for data will most likely differ from today’s need for data. Build and execute a target data strategy using the vast sets of search engines, libraries, and social networks on the web and you’ll be just fine.

AggData – The advantage of AggData is that the data is collected into one file that is very raw and portable, which makes it easy to integrate into any application or website. You can browse free data sets or purchase any of the many data sets from public and private organizations for a relatively small fee.

The Association of Religion Data Archives – The ARDA Data Archive is a collection of surveys, polls, and other data submitted by researchers and made available online by the ARDA. There are nearly 500 data files included in the ARDA collection. You can browse files by category, alphabetically, view the newest additions, most popular files, or search for a file. Once you select a file you can preview the results, read about how the data were collected, review the survey questions asked, save selected survey questions to your own file, and/or download the data file.

Census.gov American FactFinder – In American FactFinder you can obtain data in the form of maps, tables, and reports from a variety of Census Bureau sources. Click here for a good listing of available data sets, visualizations, and search functionalities.

CIA World Factbook – Contains a lot of country-level metrics/statistics, although they are not very easily exportable and/or available in table format.

City Population – Gazetteer of global geographic data and limited demographic statistics per location.

Data360 – This is essentially a wiki for data. Data360 is an open-source, collaborative and free website.  The site hosts a common and shared database, which any person or organization, committed to neutrality and non-partisanship (meaning “let the data speak”), can use for presentation of reports and visualizations about the data.

Data.gov – The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data. Visit today with us, but come back often. With your help, Data.gov will continue to grow and change in the weeks, months, and years ahead. For more information, view our How to Use Data.gov guide.

Data Marketplace – Buy and/or sell data. You can request data sets for others to build and provide for a small fee.

DBpedia – DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.

EconoMagic – A directory of data sets specific to US states.

Factual – Factual is a platform where anyone can share and mash open data on any subject. Factual was founded to provide open access to better structured data.

FedStats – Provides access to all federal statistical agencies (by geographic scope or listed alphabetically) with a search function to discover available data sets across all US federal statistical agencies.

Gapminder – A non-profit venture that, through an interactive viz tool accompanied by a listing of available data tables, aims to “unveil the beauty of statistics for a fact based world view”.

GeoCommons Finder! – Upload, organize and share your Geographic Data. Then you can use their built in application called Maker! to map/visualize it.

GeoNames – The GeoNames geographical database covers all countries and contains over eight million placenames that are available for download free of charge.

Global Airport Database – Comprehensive set of global airport data (download available for free).

Global Health Facts – Search global data by health topic and/or country. You can also interactively compare data for up to five countries at a time.

Google Public Data – In addition to plainly using the main Google search engine to search for a specific data set, Google has a public data library with some valuable sets available for free.

Guardian.co.uk Data Store – Governments around the globe are opening up their data vaults – allowing you to check out the numbers for yourself. This is the Guardian’s gateway to that information. Search for government data here from the UK (including London), USA, Australia and New Zealand – and look out for new countries and places as we add them. Read more about this on the Datablog. Full list of government data sites here.

Harvard Geographic Information Systems – Contains a highly credible listing of various national and international data providers and data sources, with a strong focus on geographic data.

International Civil Aviation Organization (ICAO) – Global air traffic data available for a fee.

Infochimps – Request data sets, search for existing data sets, or post and sell your own data sets.

International Statistical Agencies
US Census Bureau: http://www.census.gov/aboutus/stat_int.html
US Bureau of Labor Statistics: http://www.bls.gov/bls/other.htm
United Nations: http://unstats.un.org/unsd/methods/inter-natlinks/sd_intstat.htm

MelissaData – Buy comprehensive zip code data for about $150. Tailored for businesses with use in marketing.

NationMaster – NationMaster is a massive central data source and a handy way to graphically compare nations. NationMaster is a vast compilation of data from such sources as the CIA World Factbook, UN, and OECD. Using the form above, you can generate maps and graphs on all kinds of statistics with ease.

National Association of Counties (NACO) – Includes a US county data library.

Numbrary – Numbrary is a free online service dedicated to finding, using and sharing numbers on the web.

OECD Stat Extracts – OECD.Stat includes data and metadata for OECD (Organization for Economic Cooperation and Development) countries and selected non-member economies.

QuickFacts (US Census Bureau Site) – Quick, easy access to facts about people, business, and geography.

StateMaster – StateMaster is a unique statistical database which allows you to research and compare a multitude of different data on US states. We have compiled information from various primary sources such as the US Census Bureau, the FBI, and the National Center for Educational Statistics. More than just a mere collection of various data, StateMaster goes beyond the numbers to provide you with visualization technology like pie charts, maps, graphs and scatterplots. We also have thousands of map and flag images, state profiles, and correlations.

United Nations Development Programme (UNDP) – Includes UN Human Development reports and statistics such as the Human Development Index.

USA Counties (US Census Bureau Site) – A directory of data tables for US states and individual counties. Includes over 6,500 data items.

Weather Underground – Provides free access to historical weather data for cities around the globe.

Wolfram|Alpha – Deemed a “computational knowledge engine”, the W|A search and discovery tool is mathematically-based and tries to turn queries (term-based or data-driven) into actionable knowledge with visualization of in-house data sets and information relevant to your query.

World Gazetteer – The World Gazetteer provides a comprehensive set of population data and related statistics.

World Port Source – Contains extensive data on global sea ports, characterized by size and searchable by shipping liners and other various data fields.

Curiosity, Passion, and Quantifying Human Characteristics

“You can’t light the fire of passion in someone else if it doesn’t burn in you to begin with.” – Thomas Friedman

In his book The World Is Flat, Friedman speaks to the growing need for curiosity and passion in today’s job market. Core intelligence, as historically measured by the Intelligence Quotient (IQ), is and will always be important, but in a flat world it’s the curiosity and passion that will matter most.

Friedman references a Curiosity Quotient (CQ) and a Passion Quotient (PQ) that purportedly parallel the common IQ framework for scoring a person’s intelligence. More specifically, he expresses a comparative relationship between the three variables: CQ + PQ > IQ. But can curiosity and passion be measured like intelligence? More generally, can other individual characteristics be measured?

Traditional measurement is the process of obtaining a magnitude for a quantity. Things are measured by counting, and not by observation or estimation. It’s backed by strong criteria that support the measured value, such as a universal frame or scale of reference. By traditional measurement, we cannot really find CQ, PQ, or even IQ. However, there are other types of measurement…

In representational theory, measurement is defined as a correlation of numbers with entities that are not numbers. In information theory, measurement is actually a component of estimation with the uncertainty reduced infinitesimally to zero. Measurement means estimating through support of any number of measurable or unmeasurable parameters, and reducing uncertainty through various means until reaching a high-confidence end value. By the extended definitions of measurement, we can practically quantify anything!!!

So what do we get by measuring traditionally-unmeasurable human characteristics, emotions, abilities, and qualities? What do we get by identifying any new particular Qualitative Quotient (QQ) such as the CQ or PQ? Well, Friedman is on the right track here. We become smarter by surpassing our current understanding of intelligence. And as our QQs surpass the IQ, so does our ability to flatten the world, innovate, grow and succeed as a civilization and society.

The process of trying to quantify characteristics helps us realize the underlying factors that contribute to a specific quality. What makes someone passionate? How can we tell if someone is curious? Is it genetic, demonstrated by experience, or exhibited subconsciously? Can it be determined through the collective interpretation of dreams? Examining the underpinnings of qualities makes us more intelligent as individuals, organizations, and societies. Once quantified, we can look for patterns and trends in our data across different geographies, demographics, and slices of traditionally-measurable data.

What we’ll learn then, well, I’m curious to find out.

The Power of Anticipation

In today’s society, gaining an inch can be like gaining a mile.

Soccer takes a lot of skill and athleticism. You need to be able to dribble, pass, shoot, tackle, communicate, see, sprint, etc. But as I’ve stated before (“mind bend it like beckham” – 2/11/2009) it’s just as much a mental game as it is a physical one. You need to think like your opponent and play somewhat of a guessing game, connecting dots before there’s any visible relationship between them. You need to forecast outcomes, intellectually seeing into the future guided by the data that’s available.

This sort of anticipation is an imperative ability for success in the future – within any endeavor. In business, anticipation means gaining a leading edge on the competition. For defense, it means preparation and contingency plans for what might be likely to occur. In decision-making, it’s gaining threshold confidence in your decision – using as much relevant information as possible to guide a range of actions, opinions, and ultimately, outcomes. And not to mention, it helps us grab our umbrella when running out the door.

Predictive analytics, although a seemingly new, hot topic today, has been around forever. Prophets, Mayans, Nostradamus, Pythia, lunar calendars, and the Akashwani – in a historical sense the predictions were informed by a variety of sensory stimuli coupled with intuition and a variety of other external factors. Nowadays, it’s really not that different. Today, we have data and semi-sophisticated mathematical processes that parallel conscious perception and intuition. We can quantify much of what could not have been quantified in the past.

“Predictive analytics encompasses a variety of techniques from statistics, data mining and game theory that analyze current and historical facts to make predictions about future events.

In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.” (Wikipedia)

It’s imperative that people embrace predictive analytics to inform decision-making. Math doesn’t have to make the decision – that’s mostly for humans – but the math can give a comprehensive picture that outlines components of the decision and also tells us what the decision may lead to (or may have led to in the past) in terms of primary, secondary, and tertiary outcomes. Bruce Bueno de Mesquita is a great example of this, using computer algorithms to predict world events of the future – war, proliferation, conflict, etc. Decisions are not made by computer models, but humans are briefed on probable scenarios in order to make better-informed decisions.

I’ve said this before – math can be simple when it’s made to be simple. It’s a toolbox of problem-solving techniques and thought processes to help guide real-world decisions and understanding. It’s important to not be afraid of the math – start small and grow your mathematical toolbox over time. Take it head on and don’t be overwhelmed. We all have something to learn and we all have something to gain by embracing prediction and anticipation.

So whether it’s sport, meteorology, national security, or adding garlic to the pan, find a way to anticipate. In doing so, my prediction is that you’ll be better off…
