Some Statistics on Constitutions

Constitution: “A body of fundamental principles or established precedents according to which a state or other organization is acknowledged to be governed.” (New Oxford American Dictionary)

Last week The Economist had an interesting article referencing Constitute, a project (and pretty slick web application) that aims to provide the world’s constitutions for people to read, search, and compare. At the most basic level, the site breaks down 189 national constitutions into common topics, themes, and provisions for easy comparison of the most powerful governing documents across the world. It also ranks the constitutions by overall scope, executive power, legislative power, and judicial independence. Below is a quick graphic comparing the constitutions of 19 countries in Central and South America.

[Graphic: comparison of the constitutions of 19 Central and South American countries]

Some interesting statistics (from a variety of sources referenced at bottom):

  • Every year around 5 new constitutions are written, and between 30 and 40 constitutions are amended or revised. Since 1789, more than 900 constitutions have been written. Only about half of all written constitutions last more than 19 years (a lifespan Thomas Jefferson anticipated when he wrote in 1789 that a constitution “naturally expires at the end of 19 years”).
  • The longest constitution is India’s at over 146,000 words (117,000+ using English-language translation). The shortest is Jordan’s at 2,270 words using the English-language translation. The U.S. Constitution has 4,543 words (original, unamended) and 7,762 (full text).
  • The oldest written set of documents still governing a sovereign nation is San Marino’s “Leges Statutae Republicae Sancti Marini”, written in 1600. The oldest surviving one-document constitutional text governing a sovereign nation is widely considered to be the U.S. Constitution, written in 1787 and in force since 1789.
  • There are 27 amendments to the U.S. Constitution. Since 1789 there have been over 11,500 measures proposed to amend the U.S. Constitution. That’s over 50 measures per year (the rate has actually been closer to 100 measures per year more recently). Over 500 of the 11,500 measures have been proposed to amend the Electoral College. (Senate.gov)

References/Links

A Universal Concept Classification Framework (UCCF)

Background

Whether it’s for building the perfect chapter title, analyzing existing literature, or simply embarking on a personal etymological adventure, there is value in providing quantitative context to words and concepts. Such a framework should be easy to understand and broadly applicable for authors, students, and other individuals alike.

The Universal Concept Classification Framework (UCCF) proposed below involves five categories in which any word/concept can be scored. Each category is scored on a [0,20] scale spanning the full spectrum of possible values for that category. Where possible, the highest score in each category (20) should represent the more complex end of the spectrum (see below). The individual scores can then be summed to give a combined UCCF Score with range [0,100].

The individual category scores as well as the combined UCCF Score provide an easy way for readers and writers to understand and analyze the relative impact of certain words/concepts on readers, among other applications.

Universal Concept Classification Framework (UCCF)

  • Get (Concrete=0, Abstract=20): Low scores represent words/concepts that are concrete, tangible, well-defined, and easy to understand. High scores represent words/concepts that are abstract and open to interpretation.
  • Act (Controllable=0, Uncontrollable=20): Low scores represent words/concepts that are controllable, created, and/or driven by an individual, group, or machine. High scores represent words/concepts that are by nature uncontrollable.
  • Dim (Independent=0, Dependent=20): Low scores represent words/concepts that are independent of other words/concepts and can stand alone for meaning and interpretation. High scores represent words/concepts that are complex, very dependent upon other words/concepts, and are often very interconnected to support interpretation.
  • Set (Known=0, Changing/Unknown=20): Low scores represent words/concepts that are very well known and not subject to change in meaning or interpretation across time, language, and society. High scores represent words/concepts that change rapidly or may be universally undefined across time, language, and society.
  • Rad (Plain=0, Intriguing=20): Low scores represent words/concepts that are plain and without dimension. High scores represent words/concepts that are multidimensional, mysterious, and full of intrigue.
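To make the categories above concrete, here is a minimal sketch of how a single word/concept could be scored under the UCCF; the example word and all of its scores are hypothetical, chosen only to illustrate how the five category scores combine into the [0,100] total.

```python
from dataclasses import dataclass

@dataclass
class UCCFScore:
    """UCCF scores for one word/concept; each category ranges from 0 to 20."""
    get: float   # Concrete (0) to Abstract (20)
    act: float   # Controllable (0) to Uncontrollable (20)
    dim: float   # Independent (0) to Dependent (20)
    set_: float  # Known (0) to Changing/Unknown (20)
    rad: float   # Plain (0) to Intriguing (20)

    def __post_init__(self):
        for name, value in vars(self).items():
            if not 0 <= value <= 20:
                raise ValueError(f"{name} must be in [0, 20], got {value}")

    @property
    def combined(self) -> float:
        """Combined UCCF Score, in [0, 100]."""
        return self.get + self.act + self.dim + self.set_ + self.rad

# Hypothetical scoring of the word "serendipity", for illustration only.
serendipity = UCCFScore(get=16, act=18, dim=10, set_=12, rad=17)
print(serendipity.combined)  # 73
```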

Limitations/Applications

No framework is without fault, and especially in the measurement of unstructured information, the UCCF certainly has limitations. However, it’s a quick and easy way to begin to better understand words/concepts, and I believe this type of methodology has broad applications.

One example is in the building of book titles and chapters, where authors may want to represent a broad spectrum of word types. For one type of chapter an author may want to maximize combined UCCF Scores, for another to keep them to a minimum, and for a third to choose words that cover the widest possible range of combined UCCF Scores.

Another application may be in the analysis of certain authors, languages, or successful books in general. Do authors write about similar concepts according to the UCCF? Is there a correlation between successful books and the UCCF Scores represented by certain titles? These types of questions could be investigated using a new quantitative approach.

In general, applying simple quantitative methods to abstract ideas can provide a new way of thinking about and contextualizing decisions, such as choosing book titles, and of analyzing content and content creators, such as popular authors/bloggers.

A Simple Method for Analyzing Books

A recent Pew Research Center study found the following:

  • Americans 18 and older read on average 17 books each year. 19% say they don’t read any books at all. Only 5% say they read more than 50.
  • Fewer Americans are reading books now than in 1978.
  • 64% of respondents said they find the books they read through recommendations from family members, friends, or co-workers.
  • The average e-book reader read 24 books (the mean) in the past 12 months; the average non-e-book reader read 15.

The first bullet above is pretty remarkable. Using 17 books/year with, let’s say, 40 years of reading (above the age of 18), that’s 680 books read in adulthood. That’s a lot.

This got me thinking about how we decide which books to buy and how our decisions on which books to buy adapt with each book that we read. Are we in tune with our changing desires and interests and is our feedback loop from both positive and negative reading experiences, well, accurate and efficient?

Some time ago, I began collecting data on my book reading experiences to allow me to analyze exactly that. Given the Pew study, I figure I’ll share my methodology in hopes it makes sense to someone else. Star ratings such as those on Amazon are certainly helpful, but my hope is to understand precisely what works for me so as to make my decisions on reading material accurate, efficient, and part of a lifelong journey for knowledge and inspiration.

Known Data Elements (Both Categorical and Quantitative)

  • Author
  • Type (Non-Fiction vs Fiction)
  • Genre (Thrillers/Suspense, Science/Technology, Current Affairs & Politics, etc.)
  • Number of Pages (using hardcover as a standard)
  • Date Published

Personal Data Inputs (upon book completion)

  • Date Completed
  • Tags/Notes
  • Readability, Flow, & Structure (RFS) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on ease-of-read and the overall structure of the book.
  • Thought-Provoking, Engagement, & Educational Value (TEV) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on how mentally stimulating it was in terms of knowledge and thought.
  • Entertainment, Suspense, & Likeability (ESL) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on the entertainment value and overall likeability of the story, characters, and/or information presented.

Those three metrics (RFS, TEV, ESL) allow one to create an overall score for the book. My overall score is a simple sum of the three metrics, divided by the maximum possible score (15.0), and expressed as a percentage (ranging from 0% to 100%). Although I have not yet conducted any correlation studies or categorical analyses using my data (which I have for 42 books starting in Aug 2004), below is a snapshot. As for my next book, it’ll probably be a self-help guide to drop the data obsession. 🙂

Title | Author | Pages | RFS [0,5] | TEV [0,5] | ESL [0,5] | SCORE [0,100%]
A Short History of Nearly Everything | Bill Bryson | 560 | 4.5 | 5.0 | 4.5 | 93%
The Alchemist | Paulo Coelho | 208 | 4.5 | 4.5 | 4.5 | 90%
Life of Pi | Yann Martel | 336 | 4.5 | 4.0 | 4.5 | 87%
Moneyball: The Art of Winning an Unfair Game | Michael Lewis | 288 | 4.0 | 4.5 | 4.0 | 83%
Born to Be Good: The Science of a Meaningful Life | Dacher Keltner | 352 | 4.0 | 4.5 | 3.5 | 80%
The Tipping Point: How Little Things Can Make a Big Difference | Malcolm Gladwell | 288 | 4.0 | 4.0 | 4.0 | 80%
The Next 100 Years: A Forecast for the 21st Century | George Friedman | 272 | 4.0 | 4.5 | 3.5 | 80%
Super Freakonomics: Global Cooling, Patriotic Prostitutes, and Why Suicide Bombers Should Buy Life Insurance | Steven Levitt; Stephen Dubner | 288 | 4.0 | 4.0 | 4.0 | 80%
Super Crunchers: Why Thinking-By-Numbers is the New Way To Be Smart | Ian Ayres | 272 | 4.0 | 4.0 | 4.0 | 80%
The Art of Strategy: A Game Theorist’s Guide to Success in Business & Life | Avinash Dixit; Barry Nalebuff | 512 | 4.0 | 4.5 | 3.5 | 80%
The Long Tail: Why the Future of Business is Selling Less of More | Chris Anderson | 256 | 4.0 | 4.0 | 3.5 | 77%
Outliers: The Story of Success | Malcolm Gladwell | 309 | 4.0 | 4.0 | 3.5 | 77%
Body of Lies | David Ignatius | 352 | 4.5 | 3.0 | 4.0 | 77%
A Walk in the Woods: Rediscovering America on the Appalachian Trail | Bill Bryson | 284 | 3.5 | 4.0 | 3.5 | 73%
Kill Alex Cross | James Patterson | 464 | 4.5 | 2.5 | 4.0 | 73%
The Increment | David Ignatius | 400 | 4.0 | 2.5 | 4.5 | 73%
A Whole New Mind: Why Right-Brainers Will Rule the Future | Daniel Pink | 272 | 4.0 | 4.0 | 3.0 | 73%
Blink: The Power of Thinking Without Thinking | Malcolm Gladwell | 288 | 3.5 | 4.0 | 3.0 | 70%
Physics of the Impossible: A Scientific Exploration into the World of Phasers, Force Fields, Teleportation, and Time Travel | Michio Kaku | 352 | 3.5 | 4.0 | 3.0 | 70%
The Bourne Dominion | Eric van Lustbader | 432 | 3.5 | 2.5 | 4.5 | 70%
Fortune’s Formula: The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street | William Poundstone | 400 | 3.0 | 4.0 | 3.5 | 70%
The Godfather | Mario Puzo | 448 | 3.5 | 2.5 | 4.5 | 70%
The Sicilian | Mario Puzo | 410 | 3.5 | 2.5 | 4.5 | 70%
The Invention of Air: A Story of Science, Faith, Revolution, and the Birth of America | Steven Johnson | 272 | 3.0 | 4.0 | 3.0 | 67%
The Drunkard’s Walk: How Randomness Rules Our Lives | Leonard Mlodinow | 272 | 3.0 | 3.5 | 3.5 | 67%
Cross Fire | James Patterson | 432 | 4.0 | 1.5 | 4.5 | 67%
The Social Animal: The Hidden Sources of Love, Character, and Achievement | David Brooks | 448 | 3.5 | 4.5 | 2.0 | 67%
The Golden Ratio: The Story of PHI, the World’s Most Astonishing Number | Mario Livio | 294 | 3.0 | 4.0 | 2.5 | 63%
Physics for Future Presidents: The Science Behind the Headlines | Richard Muller | 354 | 3.0 | 3.5 | 3.0 | 63%
The Future of Everything: The Science of Prediction | David Orrell | 464 | 3.0 | 3.5 | 3.0 | 63%
The Department of Mad Scientists | Michael Belfiore | 320 | 3.0 | 3.0 | 3.5 | 63%
For the President’s Eyes Only: Secret Intelligence and the American Presidency from Washington to Bush | Christopher Andrew | 672 | 3.0 | 3.5 | 3.0 | 63%
Born Standing Up: A Comic’s Life | Steve Martin | 209 | 4.0 | 2.0 | 3.0 | 60%
Science is Culture: Conversations at the New Intersection of Science + Society | Adam Bly (Seed Magazine) | 368 | 2.5 | 3.5 | 3.0 | 60%
1491: New Revelations of the Americas Before Columbus | Charles Mann | 480 | 2.5 | 3.5 | 2.5 | 57%
The Curious Incident of the Dog in the Night-Time | Mark Haddon | 226 | 3.0 | 3.0 | 2.0 | 53%
Group Theory in the Bedroom, and Other Mathematical Diversions | Brian Hayes | 288 | 2.0 | 3.5 | 2.0 | 50%
Euclid in the Rainforest: Discovering Universal Truth in Logic and Math | Joseph Mazur | 352 | 2.0 | 3.0 | 2.5 | 50%
This is Your Brain on Music: The Science of a Human Obsession | Daniel Levitin | 320 | 2.5 | 3.0 | 1.5 | 47%
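For reference, here is a minimal sketch of how the SCORE column above is computed from the three subjective metrics – the sum of RFS, TEV, and ESL divided by the 15.0 maximum and expressed as a percentage – checked against the first row of the table.

```python
def overall_score(rfs: float, tev: float, esl: float) -> float:
    """Overall book score: sum of the three [0.0, 5.0] metrics over the 15.0 maximum, as a percentage."""
    for name, value in (("RFS", rfs), ("TEV", tev), ("ESL", esl)):
        if not 0.0 <= value <= 5.0:
            raise ValueError(f"{name} must be in [0.0, 5.0], got {value}")
    return 100.0 * (rfs + tev + esl) / 15.0

# First row above: A Short History of Nearly Everything (4.5, 5.0, 4.5)
print(round(overall_score(4.5, 5.0, 4.5)))  # 93
```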

Ten Technological Concepts Compressing the Analytical Timeline

Today’s difficult economic climate continues to cause increased competition for all organizations. Shrinking budgets are placing government departments and agencies under more pressure to increase operating efficiencies and the cost-effectiveness of programs and technologies. Across industry, fragile markets have forced organizations to question the need for every project, person, and printer in order to reduce operating costs. In the non-profit sector, shrinking funding streams have increased the pressure to demonstrate value through concrete, measurable results.

In order to stay competitive within their particular domains, markets, and user communities – and to ultimately achieve growth and sustainability in any economic climate – all organizations must find ways to increase operating efficiencies, eliminate programmatic redundancies, and produce measurable results. Luckily for these organizations, several technological concepts have emerged over the past decade that help support these practices. In that regard, the acknowledgement, understanding, and implementation of these concepts across organizational units, programs, and processes will compress the analytical timeline and allow organizations to learn, control, adapt, and anticipate over time.

Here’s a quick look at some of the technological concepts/trends that are compressing the analytical timeline, allowing organizations to act on insights more quickly, more effectively, and more accurately:

  1. Data Collection Mechanisms – It’s not just about collecting more data, although volume (in many cases) helps. It is about collecting more types of data (image, audio, video, open source media, social media) and collecting more tactical data. The growth of the mobile and tablet markets, the ease-of-use of such devices and their decreasing costs, and the expansion of mobile network infrastructure around the world are helping organizations collect more diverse, tactical, and (ultimately) valuable data.
  2. Data Cleansing/Processing – Rather than ignoring unstructured data, we are beginning to embrace it. Many COTS, GOTS, and even open source technologies exist that cleanse and process unstructured data to ensure it can be used to support relevant use cases. Where unstructured data was formerly omitted from the analytical landscape, these technologies are now bringing new value and context to insights and decisions. To this I would also add the data storage/warehousing and processing capabilities that support big data analytics and data mining, which provide a quicker means of combing vast amounts of data for relevant patterns and insights.
  3. Logical Data Structures – It seems we are finally learning that a little thought and planning up front does wonders for the types of analysis needed to support operations research, performance measurement, marketing, and other organizational practices. By building logical data structures, we can quantify things otherwise unquantifiable and ultimately make timely, informed decisions otherwise made by intuition alone.
  4. Data Standards/Models – In conjunction with building supportive, internal data structures, we are beginning to understand how data models within domains, across communities of interest, and for specific problem sets can do wonders for our analytical practices. By developing and/or adopting a standard, we can bring consistency to these analytical practices over time, even through personnel changes. No more one-off studies/reports, but rather repeatable and communicable analysis.
  5. Data Source Registries/Catalogs – It is slowly being understood that ubiquitous access to raw data sets is far from a reality. However, organizations are beginning to realize that data source catalogs (registries) across organizational units and/or communities of interest are a step that can quickly facilitate more effective data sharing practices. Rather than focus on the exposure of raw data, the data source catalog first involves the exposure of data source metadata – information about the data, but not the data itself (see the sketch after this list). This data sharing approach is more strongly rooted in trust and visibility and, ultimately, can provide a platform by which analysts can gain quicker access to more relevant data.
  6. Social Networks – The social network movement has done many things to compress the analytical timeline, including, but not limited to: driving more collaboration and interaction between data owners, analysts, end users, and ordinary people; driving a new means by which more tactical data can be accessed and collected; and facilitating the development of new platforms, applications, and technologies to glean insights from data.
  7. Identity Management, Access Control, & Cyber Security – Knocking down stovepipes can support better access to data which in turn can support less time collecting data and more time analyzing it. However, stovepipes provide organizations with another layer of security to prevent data breaches. Despite this contradiction, better identity management, access control, and security technologies are being developed to maintain a high level of control while still ensuring users can more easily access data traditionally hidden within stovepipes. In turn, the time spent accessing and integrating data is decreased and individuals can spend more time analyzing disparate data and delivering quality insights.
  8. Cloud Computing – The movement of information systems and applications to the cloud is transforming the analyst from being a thick-client-loving info hog to being a platform-agnostic, collaborative participant. With more data and tools exposed to individuals, no longer constrained by a single hard drive or device, analysts can more effectively and efficiently access, collect, integrate, visualize, analyze, share, and report on data and insights.
  9. Network Infrastructure – The expansion of existing connected and wireless networks as well as the development of new, quicker, more accessible, and more secure networks will continue to compress the time it takes for analysts to provide valuable insights.
  10. Customizable & User-Defined Interactions – Allowing individuals to define how they wish to visualize, analyze, and interact with relevant data provides analysts with the ability to focus on developing solutions rather than setting up problems. The “user-defined” movement provides flexibility and adaptability to the individual and allows a wider set of individuals to become analysts by owning their own workspaces and interactions. It also provides an interactive medium through which results can be presented, making the reporting and dissemination process interactive rather than a drawn out one-way street.
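To illustrate the registry idea in item 5, here is a minimal sketch of what a single data source catalog entry might look like – metadata about a data set (owner, coverage, access procedure) rather than the data itself. All field names and values are hypothetical.

```python
# A hypothetical data source catalog entry: metadata only, no raw records.
catalog_entry = {
    "source_id": "hr-training-2013-001",  # hypothetical identifier
    "title": "Employee Training Completion Records",
    "owner": "Human Resources / Learning & Development",
    "description": "Course completions by employee, 2010 to present.",
    "update_frequency": "weekly",
    "format": "CSV extract from the learning management system",
    "access": "Request via the data steward; approval required for sensitive fields",
    "tags": ["training", "workforce", "performance measurement"],
}

def matches(entry: dict, keyword: str) -> bool:
    """Simple keyword search over the catalog metadata (not the underlying data)."""
    text = " ".join([entry["title"], entry["description"], " ".join(entry["tags"])])
    return keyword.lower() in text.lower()

# Analysts search the catalog to discover relevant sources, then request access.
print(matches(catalog_entry, "training"))  # True
```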

I do want to note that this list is by no means comprehensive. Even more importantly, it only focuses on technological concepts and does not address the numerous cultural and political factors that affect the analytical timeline. Although technology will continue to be a major focus area in supporting quicker and more effective analytical practices, it is the cultural and political aspects that will be more difficult to overcome, and their interdependence with the technological aspects should never be overlooked.

The Origins of Opportunity

I’ve been deep into future studies / futurology over the past few weeks. It’s an intriguing field for me at both a professional/academic level and a personal level. How can we better understand the future? Are there core methodologies that we can employ to optimize our current positioning and decision-making? How can we be better prepared for the future? What’s inevitable and what’s not? Where lies the line between info-driven forecasting and innate intuition?

Although the netweb has helped to grow and organize both the futurological information and the community through which that information is developed and shared, it seems as though the field itself remains cloudy. I must hope that at some level I can build upon the existing thoughts of others and contribute new thoughts of my own so that at least the window to the future becomes more clear.

One cornerstone of future studies is how to evaluate, filter through, and create opportunity from statements about the future. And so I wonder: from where does opportunity arise, and how can this recognition be leveraged to inform (and in some cases, influence) the future? In general, how can we characterize the origins of opportunity?

  1. Is it through early recognition? This is not beating others to the finish line, but rather beating others to the starting line. Can we identify gaps sooner than others?
  2. Is it through resourceful timing? Often opportunity arises in not being first, not being last, but somewhere in between. The earliest adopter may have his/her vision obscured through too many details, the latest adopter may be left with the crumbs. And often we find much opportunity in the failure of others – developing the right trials from the errors of others.
  3. Is it through pure knowledge and intelligence? Can brute force brainpower create the most opportunity? Or is it more dependent on the ability to apply one’s knowledge, no matter how limited it may be? Is it about having the right skill sets and tools, tactics and strategies?
  4. Is it through pure luck? Can being in the right place at the right time govern our ability to find and harness opportunity? Is pure luck within or beyond our control?

It’s simplest to think that opportunity may arise as a result of any combination of these factors. Therefore, to maximize our opportunities, we should focus on being in the right places, having the right tools, being with the right people, understanding timing as an approach, building the right knowledge base, and building an overall recognition for the many faces of opportunity.

If we can learn to recognize opportunity and better understand where it may arise, we can begin to gain a better picture of the future. Then, we can work to inform that picture with data and models to ensure that we take full advantage of those opportunities to better our self, our communities, our world, and that of tomorrow.

Principles of Forecasting

I just finished reading a couple books about future studies and the nature of predictions and forecasts: (1) Future Savvy, by Adam Gordon and (2) The Future of Everything, by David Orrell. From the former of the two, I wanted to pull a good portion of the content from Chapter 11 and structure it here for use in future posts and projects. In Chapter 11 of his book, Gordon outlines the important questions to ask of any forecast. As decision makers and leaders, analysts and synthesizers, and organizations and citizens, it’s critical that we learn to properly evaluate and filter statements about the future so that we can optimize our decisions and, ultimately, our positioning for the future.

With that as a quick intro, here are the questions we should ask of any prediction or forecast. As Gordon states of forecasts: “they are not in themselves valuable, they are only valuable alongside a clear way to separate the wheat from the chaff”.

Purpose

  • What is the purpose of the forecast? Is the forecast upfront about its purpose?
  • Is the forecast future-aligning or future-influencing?
    • Is the forecast widely publicized?
    • Does it specify action to take in the external world?
    • Is it a forecast of extremes?

Specificity

  • Is the forecast mode predictive – spelling out what will happen – or speculative, illuminating possible alternatives?
  • Is there too much certainty?
  • Is there enough certainty? Is the forecast hedging?
  • Is the forecast clear about the pace of change? Does it specify timelines or does it leave the question hazy?

Information Quality

  • How extensive and how good is the base data?
    • Is the data up to date?
    • Does the forecast use secondary data?
    • Is the data real or a projection?

Interpretation and Bias

  • Are the forecast’s biases natural or intentional?
  • What is the reputation of the forecaster and forecast organization? Does the forecaster have anything to lose by being wrong?
  • Are bias-prone contexts at hand?
    • Is the forecast sponsored?
    • Is self-interest prominent?
    • Are ideology and idealism prominent?
    • Does the forecast focus on a “single issue” future?
    • Is editorial oversight bypassed?

Methods and Models

  • Does the forecast specify its methods?
  • Does the forecaster imply the method is too complex, too arcane, or too proprietary to share?
  • Do forecast proponents trumpet their unique or “new and improved” methods?

Quantitative Limits

  • Is the use of quantitative methods appropriate?
  • Is a machine doing the thinking?

Managing Complexity

  • Does the forecast oversimplify the world?
  • Does the forecast acknowledge systemic feedback?
  • Does the forecast anticipate things that could speed up the future, or push it off track? Does it account for triggers and tipping points?
  • Does the forecast expect exponential change?

Assumptions and Paradigm Paralysis

  • Has adequate horizon scanning been done?
  • Are the assumptions stated? Is the forecaster aware of his or her own assumptions? Is the forecaster willing to entertain alternative assumptions?
  • Do the forecaster’s assumptions appear valid and reasonable?

Zeitgeist and Groupthink

  • Is the zeitgeist speaking through the forecaster?
  • Is the forecast jumping on the bandwagon?
  • Does the forecast rely on “experts”?
  • Does the forecast do stretch thinking? Does it allow us to break free from the “official future”?

Drivers and Blockers

  • Are change drivers and enablers identified? Or are trends simply projected?
  • Are blocking forces identified and fully accounted for? Is friction factored in?
    • Have utility questions been asked and adequately answered?
    • Are there proposing or opposing stakeholders, particularly powerful individuals and powerful organizations?
    • Does the forecast challenge social, cultural, or moral norms?
    • Whose side is the law on?
    • Is the forecaster in love with the technology?
    • Does the forecast underestimate the time to product emergence? Does it overestimate the pace at which people’s habits change?
    • Does the forecaster assume change? Does the forecast underestimate the full hump change must overcome? Does the forecaster recognize what doesn’t change?
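One hedged way to put these questions to work is to encode them as a simple checklist and tally how many of them raise concerns for a given forecast. The structure below is my own illustrative sketch, not something prescribed by Gordon, and the example answers are hypothetical.

```python
# Hypothetical review of a single forecast against a sample of the questions above;
# True means that question raised a concern for this forecast.
review = [
    ("Purpose: is the forecast unclear about whether it aligns or influences?", True),
    ("Specificity: does it show too much certainty, with no hedging?", True),
    ("Information quality: is the base data stale or purely secondary?", False),
    ("Bias: is the forecast sponsored or strongly self-interested?", True),
    ("Methods: are the methods hidden as too arcane or proprietary to share?", False),
    ("Complexity: does it ignore systemic feedback, triggers, and tipping points?", True),
    ("Drivers/blockers: are blocking forces left unaccounted for?", False),
]

concerns = [question for question, concern in review if concern]
print(f"{len(concerns)} of {len(review)} checks raised concerns:")
for question in concerns:
    print(" -", question)
```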

Spectrum Logic

The visual representation of information is critical for both learning and teaching. To put something on paper and organize the information as to make visual sense – in words, lines, colors, and curves – is to recognize some understanding and to create a basis for new insight and discovery.

Logic is the study of reasoning, the systematic approach to reaching a conclusion, or the examination of competing arguments with regards to a central issue or question. Logic can be broken down into inductive and deductive reasoning, the former drawing conclusions from specific examples and the latter drawing conclusions from definitions or axioms. Logic can also be broken down into analysis and synthesis, one examining individual component parts and the other combining component parts into a whole. In any event, logic is a way to get from questions to answers, disbelief to belief, and data to insight.

One such type of logic is visual logic, or what I’ll call “spectrum logic”. It’s the combination of the visual representation of information and the many realms of logic. The reason I use the term “spectrum” is two-fold. First of all, it’s by definition the representation of a full range of possible values/conditions for a given topic. And second of all, it suggests continuity along its range and therefore implies a high level of seamlessness and efficiency.

So in the world of analysis and problem solving, how do we apply spectrum logic? Well, just follow every possible visual path from any origin within your visual space and try to optimize your path to the result. Place your problem in the center of a sphere/cube and run the full spectrum of paths to that center point. Left to right and right to left, bottom-up and top-down, outside in and inside out, spiral inward and spiraling out. Think about the component parts that make up the visual space, and the conditions that fall along each path. Why is your problem so complex? What makes it so complex? Can you qualify your problem in color, words, shape, and text? Can you quantify it and its components? Is it made up of many unknown dimensions or a few known ones? Picture your problem, logically break it apart, and put it back together. Take a diverse set of paths to and from your problem, and find out which one gives you an optimal set of insights in return. Hopefully, if the answers and conclusions are not clear, you’ll at least have learned something in the process.

The Power of Anticipation

In today’s society, gaining an inch can be like gaining a mile.

Soccer takes a lot of skill and athleticism. You need to be able to dribble, pass, shoot, tackle, communicate, see, sprint, etc. But as I’ve stated before (“mind bend it like beckham” – 2/11/2009) it’s just as much a mental game as it is a physical one. You need to think like your opponent and play somewhat of a guessing game, connecting dots before there’s any visible relationship between them. You need to forecast outcomes, intellectually seeing into the future guided by the data that’s available.

This sort of anticipation is an imperative ability for success in the future – within any endeavor. In business, anticipation means gaining a leading edge on the competition. For defense, it means preparation and contingency plans for what might occur. In decision-making, it’s gaining threshold confidence in your decision – using as much relevant information as possible to guide a range of actions, opinions, and, ultimately, outcomes. And not to mention, it helps us grab our umbrella when running out the door.

Predictive analytics, although a seemingly new, hot topic today, has been around forever. Prophets, Mayans, Nostradamus, Pythia, lunar calendars, and the Akashwani – in a historical sense the predictions were informed by a variety of sensory stimuli coupled with intuition and a variety of other external factors. Nowadays, it’s really not that different. Today, we have data and semi-sophisticated mathematical processes that parallel conscious perception and intuition. We can quantify much of what could not have been quantified in the past.

“Predictive analytics encompasses a variety of techniques from statistics, data mining and game theory that analyze current and historical facts to make predictions about future events.

In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.” (Wikipedia)

It’s imperative that people embrace predictive analytics to inform decision-making. Math doesn’t have to make the decision – that’s mostly for humans – but the math can give a comprehensive picture that outlines components of the decision and also tells us what the decision may lead to (or may have led to in the past) in terms of primary, secondary, and tertiary outcomes. Bruce Bueno de Mesquita is a great example of this, using computer algorithms to predict world events of the future – war, proliferation, conflict, etc. Decisions are not made by computer models, but humans are briefed on probable scenarios in order to make better-informed decisions.
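As a toy illustration of the “patterns found in historical and transactional data” idea from the quote above – and not a representation of any particular model used by Bueno de Mesquita – here is a minimal sketch that fits a logistic regression to a handful of made-up historical cases and returns a probability rather than a decision. The feature names and data are entirely hypothetical.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical historical cases: [economic stress index, prior incidents], outcome (1 = event occurred)
X = [[0.2, 0], [0.4, 1], [0.7, 2], [0.9, 3], [0.3, 0], [0.8, 2], [0.6, 1], [0.1, 0]]
y = [0, 0, 1, 1, 0, 1, 1, 0]

model = LogisticRegression()
model.fit(X, y)

# The model outputs a probability; a human still makes the decision.
new_case = [[0.75, 2]]
probability = model.predict_proba(new_case)[0][1]
print(f"Estimated probability of the event: {probability:.2f}")
```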

I’ve said this before – math can be simple when it’s made to be simple. It’s a toolbox of problem-solving techniques and thought processes to help guide real-world decisions and understanding. It’s important to not be afraid of the math – start small and grow your mathematical toolbox over time. Take it head on and don’t be overwhelmed. We all have something to learn and we all have something to gain by embracing prediction and anticipation.

So whether it’s sport, meteorology, national security, or adding garlic to the pan, find a way to anticipate. In doing so, my prediction is that you’ll be better off…


Technology And Intelligence In The Next Decade

Below is an essay I wrote for my Technology and Intelligence class in early 2008 (STIA-432 at Georgetown University). It is meant to describe a few of the current problems faced and the nature of those problems, not to offer up solutions. In the past year we have certainly seen the continuation of existing challenges coupled with the emergence of new ones. Today’s scientific and technological paradigm is by no means a simple one. But I do believe that with the collaboration of bright minds and the continued objective to ride and guide the progressive technological waves of the 21st century, substantial risks will be mitigated.

If History Could Tell

Since the establishment of the Office of Strategic Services in 1942 and subsequently the Central Intelligence Agency in 1947 (via the National Security Act), a core mission has been the collection and analysis of strategic, actionable information. This process has always required technology in the form of communications equipment, navigational tools, security systems, listening devices, and many more. Historically, the Intelligence Community as a whole has been way ahead of the technological curve, and in most cases, has established and controlled the curve. With information security and access to federal funds, various agencies have been given the ability to turn novel ideas into useful instruments for collection, analysis, and dissemination. However, history has become the past, and no longer dictates the way in which the world of technological development can move forward. Federal and international regulations, advancement in information theory, collaborative networks, and the global information age via the internet have all contributed to rapid, world-wide technological development that is no longer behind the IC on the tech curve. In the next decade, the Intelligence Community has the potential to fall even with, or behind, the lines of global technological development, and as a result will find new struggles in all sources of intelligence, whether clandestine or not. Some arguments state that the IC, with some elements of special authority granted to preserve national security interests, will flourish as a developing technical lab for operations. However, the best and the brightest technical and analytical minds are not necessarily organized within the IC anymore, but rather are connected without boundary via the internet. Open-source development and the speed at which the commercial world can access capital may eventually move the IC technical approach to the back of the line.

The Whole is Greater Than the Sum of the Parts

Collaborative technologies have particularly flourished in the past five years. Social networking sites such as MySpace, Facebook, and Flickr, knowledge management platforms such as Microsoft SharePoint and TheBrain Technologies, and the entire blogosphere have accelerated communications without any distance barriers to get around. Information is passed, shared, and edited with the click of a button. SourceForge, an online network for open-source software development, has brought a vast array of new technologies to a market that never before existed. This lack of predictability for the technological market puts the IC in “catch-up” mode. Wikipedia, as well as other information warehouses, accelerates knowledge consumption for the individual – not just a business or state entity. With a horizontal, access-free, organizational structure, these applications have few barriers. Although the IC works to chase these technologies with A-Space and Intellipedia, an accompanying hierarchical structure and tiered-access system could truly dampen collaboration on a technological front.

Getting Small Could Lead To…

As the world grows in size and energy, the capability to pack information, data, and logic into smaller and smaller units continues to develop. Through nanotechnology and quantum computing, academic research groups as well as large corporations have minimized size requirements and increased processing speed in the same products. The associated power that now exists in these products outside of the Intelligence Community weakens the IC’s ongoing ability to leverage such products for foreign surveillance tactics with communications, imagery, measurements, and signals collection.

…A Much Bigger Problem

In the next decade, the IC and the United States as a whole will face incredible security and technological challenges. The tension will be increased as national policy will have to deal with finding a balance between civil liberties and national security interests. With recent information warfare events such as hacks into Pentagon computers, developmental advantages can change in an instant. International policies will also affect development within the U.S. government and could unfortunately give an edge to non-governmental organizations that can more easily practice CBRN weapons testing (with high-tech delivery instruments), removed from many international regulations. Unfortunately, if the Intelligence Community is to drift toward a more reactive state, the technological and security risks become increasingly more serious.

A (New) Final Thought

It’s just as important to anticipate the wave as it is to ride and guide the wave. Surfers find waves through reaction AND proaction. The same goes for collection, analysis, and technological development. There is more historical and real-time data than ever before. Deterministic and probabilistic models are more advanced than ever before. We can do something with all this data to find patterns and indications of technological risk. At the same time, we have more intellectual and psychological understanding of cultures around the world, and the associated mechanisms of travel, prayer, consumption, loyalty, and desire than ever before. Pairing one with the other gives us the connect-the-dot power that can truly shape our understanding and awareness of the world and the technological risks that threaten our security and sustainability as people.

Happy Planet Index vs Human Development Index

With my post on “Everything is Connected” I thought I’d investigate a bridge between happiness and the level of development in a country…

The Happy Planet Index (HPI)

“The HPI is an innovative measure that shows the ecological efficiency with which human well-being is delivered around the world. It is the first ever index to combine environmental impact with well-being to measure the environmental efficiency with which country by country, people live long and happy lives.”

The Human Development Index (HDI)

“The first Human Development Report (1990) introduced a new way of measuring development by combining indicators of life expectancy, educational attainment and income into a composite human development index, the HDI. The breakthrough for the HDI was the creation of a single statistic which was to serve as a frame of reference for both social and economic development. The HDI sets a minimum and a maximum for each dimension, called goalposts, and then shows where each country stands in relation to these goalposts, expressed as a value between 0 and 1.”

Thoughts and Hypotheses

There are two relationships we will want to consider:

  • Correlation: Is there any direct relationship (positive or negative) between the values of the HDI and HPI?
  • Clustering: By region (or other characteristic field) can we find any clusters in the data?

Since these are composite indices of several weighted variable inputs, hopefully this top-level approach can identify some possible matches and mismatches between underlying data fields too. Related to the HDI, I bet the UN’s HPI (Human Poverty Index) has a bridge to happiness… or most likely, unhappiness.
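Before digging into the data, here is a minimal sketch of the two checks described above – correlation and a crude region-level clustering – using a handful of made-up (hypothetical) country values purely to show the mechanics; the real analysis would use the published HDI and HPI tables.

```python
import statistics

# Hypothetical (country, region, HDI, HPI) rows, for illustration only.
data = [
    ("Country A", "Europe",    0.91, 43.0),
    ("Country B", "Europe",    0.88, 41.5),
    ("Country C", "Africa",    0.46, 29.0),
    ("Country D", "Africa",    0.52, 38.0),
    ("Country E", "Caribbean", 0.75, 55.0),
    ("Country F", "Caribbean", 0.78, 52.5),
]

hdi = [row[2] for row in data]
hpi = [row[3] for row in data]

# Correlation: is there a direct (positive or negative) relationship?
r = statistics.correlation(hdi, hpi)  # Pearson's r (Python 3.10+)
print(f"Pearson correlation between HDI and HPI: {r:.2f}")

# Clustering, crude version: compare per-region means of each index.
for region in sorted({row[1] for row in data}):
    rows = [row for row in data if row[1] == region]
    mean_hdi = statistics.mean(row[2] for row in rows)
    mean_hpi = statistics.mean(row[3] for row in rows)
    print(f"{region}: mean HDI {mean_hdi:.2f}, mean HPI {mean_hpi:.1f}")
```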

Data/Discussion

  • There seems to be a connection between deviations in the data. When a specific region shows a large deviation in HDI values, it also seems to show a large deviation in HPI values. Notice that Africa, Australasia, and the Middle East all have similar double-digit deviations. What does this tell us about the range of development and happiness within a specific region? Perhaps this could be tested across many country-level metrics to see whether similar deviations occur more frequently.
  • As with the above note, since we have these metrics on the same scale/range, let’s combine them to see who has the highest composite score. In alphabetical order we have: 84, 125, 138, 137, 133, 134, 126, 134, 119, 117, 119. There seem to be three groups here: High (>130), Medium (100-130), Low (<100). Depending on user needs, algorithms can be created to join metrics to provide a big-picture representation of economic, political, sociological, and other metrics, with the flexibility to dig into the weeds of the underlying data. This would be a nice comprehensive framework for understanding how countries (and regions as a whole) change over time.


  • Looking at the scatter plot, it is clear that some clusters may exist, for example with Africa (blue). Caribbean (orange), Europe (green), and Russia and Central Asia (purple) also show some quick visual clustering, while the Middle East (red) shows the opposite. What could this mean? That regional trade, policy, weather, etc are good supplementary foundations for providing happiness and development?
  • We could add trend lines and quickly check for any linear (or logarithmic) relationships. If any relationship does exist as a whole or with a region, it is certainly not a directly proportional or inversely proportional one. This was expected as these metrics are quite different (despite the overlap in life expectancy as an input dimension).

Moving forward, the methodologies and underlying dimensions (with their sources) should be compared. Data is always good, but with good data one still must be careful. That being said, this is a good start for a much larger investigation into the connections between different country-level metrics, especially if they are to be used in international and national policy.