Some Statistics on Constitutions

Constitution: “A body of fundamental principles or established precedents according to which a state or other organization is acknowledged to be governed.” (New Oxford American Dictionary)

Last week The Economist had an interesting article referencing Constitute, a project (and pretty slick web application) that aims to provide the world’s constitutions for people to read, search, and compare. At the most basic level, the site breaks down 189 national constitutions into common topics, themes, and provisions for easy comparison of the most powerful governing documents across the world. It also ranks the constitutions by overall scope, executive power, legislative power, and judicial independence. Below is a quick graphic comparing the constitutions of 19 countries in Central and South America.

Some interesting statistics (from a variety of sources referenced at bottom):

Every year around 5 new constitutions are written and between 30-40 constitutions are amended or revised. Since 1789, more than 900 constitutions have been written. Only about half of all written constitutions last more than 19 years (this was predicted by Thomas Jefferson in 1789 that a constitution “naturally expires at the end of 19 years”).
The longest constitution is India’s at over 146,000 words (117,000+ using English-language translation). The shortest is Jordan’s at 2,270 words using the English-language translation. The U.S. Constitution has 4,543 words (original, unamended) and 7,762 (full text).
The oldest written set of documents still governing a sovereign nation is San Marino’s “Leges Statutae Republicae Sancti Marini”, written in 1600. The oldest surviving one-document constitutional text governing a sovereign nation is widely considered to be the U.S. Constitution, written in 1789.
There are 27 amendments to the U.S. Constitution. Since 1789 there have been over 11,500 measures proposed to amend the U.S. Constitution. That’s over 50 measures per year (the rate has actually been closer to 100 measures per year more recently). Over 500 of the 11,500 measures have been proposed to amend the Electoral College. (Senate.gov)

References/Links

A Universal Concept Classification Framework (UCCF)

Background

Whether it’s for building the perfect chapter title, analyzing existing literature, or maybe just a personal etymological adventure, there is usefulness in providing quantitative context to words and concepts. Should such a framework exist, it should be easy-to-understand and broadly applicable for authors, students, and other individuals alike.

The Universal Concept Classification Framework (UCCF) proposed below involves five categories in which any word/concept can be scored. Each category’s score has range [0,20], spanning the full spectrum of possible values across each category. Where possible, the highest possible score for each category (20) should represent the more complex end of the spectrum (see below). The individual scores can then be summed to give a combined UCCF Score with range [0,100].

The individual category scores as well as the combined UCCF Score provide an easy way for readers and writers to understand and analyze the relative impact of certain words/concepts on readers, among other applications.

Universal Concept Classification Framework (UCCF)

Get (Concrete=0, Abstract=20): Low scores represent words/concepts that are concrete, tangible, well-defined, and easy to understand. High scores represent words/concepts that are abstract and open to interpretation.
Act (Controllable=0, Uncontrollable=20): Low scores represent words/concepts that are controllable, created, and/or driven by an individual, group, or machine. High scores represent words/concepts that are by nature uncontrollable.
Dim (Independent=0, Dependent=20): Low scores represent words/concepts that are independent of other words/concepts and can stand alone for meaning and interpretation. High scores represent words/concepts that are complex, very dependent upon other words/concepts, and are often very interconnected to support interpretation.
Set (Known=0, Changing/Unknown=20): Low scores represent words/concepts that are very well known and not subject to change in meaning or interpretation across time, language, and society. High scores represent words/concepts that change rapidly or may be universally undefined across time, language, and society.
Rad (Plain=0, Intriguing=20): Low scores represent words/concepts that are plain and without dimension. High scores represent words/concepts that are multidimensional, mysterious, and full of intrigue.

Limitations/Applications

No framework is without fault, and especially in the measurement of unstructured information, the UCCF certainly has limitations. However, it’s a quick and easy way to begin to better understand words/concepts, and I believe this type of methodology has broad applications.

One example is in the building of book titles and chapters, where authors may want to represent a broad spectrum of word types. One type of chapter may want to maximize combined UCCF Scores, others may want to keep combined UCCF Scores to a minimum, and a third type may want to have words that cover the widest range of combined UCCF Scores.

Another application may be in the analysis of certain authors, languages, or successful books in general. Do authors write about similar concepts according to the UCCF? Is there a correlation between successful books and the UCCF Scores represented by certain titles? These types of questions could be investigated using a new quantitative approach.

In general, applying simple quantitative methods to abstract ideas can provide a new way for thinking and contextualizing decisions, such as choosing book titles, and analyzing content and content creators, such as popular authors/bloggers.

A Simple Method for Analyzing Books

A recent Pew Research Center study found the following:

Americans 18 and older read on average 17 books each year. 19% say they don’t read any books at all. Only 5% say they read more than 50.
Fewer Americans are reading books now than in 1978.
64% of respondents said they find the books they read from recommendations from family members, friends, or co-workers.
The average reader of e-books read 24 books (the mean number) in the past 12 months; the average non-e-book consumer read an average of 15.

The first bullet above is pretty remarkable. Using 17 books/year with, let’s say, 40 years of reading (above the age of 18), that’s 680 books read in adulthood. That’s a lot.

This got me thinking about how we decide which books to buy and how our decisions on which books to buy adapt with each book that we read. Are we in tune with our changing desires and interests and is our feedback loop from both positive and negative reading experiences, well, accurate and efficient?

Some time ago, I began collecting data on my book reading experiences to allow me to analyze exactly that. Given the Pew study, I figure I’ll share my methodology in hopes it makes sense to someone else. Star ratings such as that on Amazon are certainly helpful, but my hope is to perfectly understand what works for me as to make my decisions on reading material accurate, efficient, and part of a lifelong journey for knowledge and inspiration.

Known Data Elements (Both Categorical and Quantitative)

Author
Type (Non-Fiction vs Fiction)
Genre (Thrillers/Suspense, Science/Technology, Current Affairs & Politics, etc.)
Number of Pages (using hardcover as a standard)
Date Published

Personal Data Inputs (upon book completion)

Date Completed
Tags/Notes
Readability, Flow, & Structure (RFS) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on ease-of-read and the overall structure of the book.
Thought-Provoking, Engagement, & Educational Value (TEV) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on how mentally stimulating it was in terms of knowledge and thought.
Entertainment, Suspense, & Likeability (ESL) – A score ranging from [0.0, 5.0] subjectively assigned to a book based on the entertainment value and overall likeability of the story, characters, and/or information presented.

Those three metrics (RFS, TEV, ESL) allow one to create a overall score for the book. My overall score is a simple sum of the three metrics, divided by the maximum possible score (15.0), and expressed as a percentage (ranging from 0% to 100%). Although I have not yet conducted any correlation studies or categorical analyses using my data (which I have for 42 books starting in Aug 2004), below is a snapshot. As for my next book, it’ll probably be a self-help guide to drop the data obsession. 🙂

Title	Author	Pages	RFS [0,5]	TEV [0,5]	ESL [0,5]	SCORE [0,100%]
A Short History of Nearly Everything	Bill Bryson	560	4.5	5.0	4.5	93%
The Alchemist	Paulo Coelho	208	4.5	4.5	4.5	90%
Life of Pi	Yann Martel	336	4.5	4.0	4.5	87%
Moneyball: The Art of Winning an Unfair Game	Michael Lewis	288	4.0	4.5	4.0	83%
Born to Be Good: The Science of a Meaningful Life	Dacher Keltner	352	4.0	4.5	3.5	80%
The Tipping Point: How Little Things Can Make a Big Difference	Malcolm Gladwell	288	4.0	4.0	4.0	80%
The Next 100 Years: A Forecast for the 21st Century	George Friedman	272	4.0	4.5	3.5	80%
Super Freakonomics: Global Cooling, Patriotic Prostitutes, and Why Suicide Bombers Should Buy Life Insurance	Steven Levitt; Stephen Dubner	288	4.0	4.0	4.0	80%
Super Crunchers: Why Thinking-By-Numbers is the New Way To Be Smart	Ian Ayres	272	4.0	4.0	4.0	80%
The Art of Strategy: A Game Theorist’s Guide to Success in Business & Life	Avinash Dixit; Barry Nalebuff	512	4.0	4.5	3.5	80%
The Long Tail: Why the Future of Business is Selling Less of More	Chris Anderson	256	4.0	4.0	3.5	77%
Outliers: The Story of Success	Malcolm Gladwell	309	4.0	4.0	3.5	77%
Body of Lies	David Ignatius	352	4.5	3.0	4.0	77%
A Walk in the Woods: Rediscovering America on the Appalachian Trail	Bill Bryson	284	3.5	4.0	3.5	73%
Kill Alex Cross	James Patterson	464	4.5	2.5	4.0	73%
The Increment	David Ignatius	400	4.0	2.5	4.5	73%
A Whole New Mind: Why Right-Brainers Will Rule the Future	Daniel Pink	272	4.0	4.0	3.0	73%
Blink: The Power of Thinking Without Thinking	Malcolm Gladwell	288	3.5	4.0	3.0	70%
Physics of the Impossible: A Scientific Exploration into the World of Phasers, Force Fields, Teleportation, and Time Travel	Michio Kaku	352	3.5	4.0	3.0	70%
The Bourne Dominion	Eric van Lustbader	432	3.5	2.5	4.5	70%
Fortune’s Formula: The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street	William Poundstone	400	3.0	4.0	3.5	70%
The Godfather	Mario Puzo	448	3.5	2.5	4.5	70%
The Sicilian	Mario Puzo	410	3.5	2.5	4.5	70%
The Invention of Air: A Story of Science, Faith, Revolution, and the Birth of America	Steven Johnson	272	3.0	4.0	3.0	67%
The Drunkard’s Walk: How Randomness Rules Our Lives	Leonard Mlodinow	272	3.0	3.5	3.5	67%
Cross Fire	James Patterson	432	4.0	1.5	4.5	67%
The Social Animal: The Hidden Sources of Love, Character, and Achievement	David Brooks	448	3.5	4.5	2.0	67%
The Golden Ratio: The Story of PHI, the World’s Most Astonishing Number	Mario Livio	294	3.0	4.0	2.5	63%
Physics for Future Presidents: The Science Behind the Headlines	Richard Muller	354	3.0	3.5	3.0	63%
The Future of Everything: The Science of Prediction	David Orrell	464	3.0	3.5	3.0	63%
The Department of Mad Scientists	Michael Belfiore	320	3.0	3.0	3.5	63%
For the President’s Eyes Only: Secret Intelligence and the American Presidency from Washington to Bush	Christopher Andrew	672	3.0	3.5	3.0	63%
Born Standing Up: A Comic’s Life	Steve Martin	209	4.0	2.0	3.0	60%
Science is Culture: Conversations at the New Intersection of Science + Society	Adam Bly (Seed Magazine)	368	2.5	3.5	3.0	60%
1491: New Revelations of the Americas Before Columbus	Charles Mann	480	2.5	3.5	2.5	57%
The Curious Incident of the Dog in the Night-Time	Mark Haddon	226	3.0	3.0	2.0	53%
Group Theory in the Bedroom, and Other Mathematical Diversions	Brian Hayes	288	2.0	3.5	2.0	50%
Euclid in the Rainforest: Discovering Universal Truth in Logic and Math	Joseph Mazur	352	2.0	3.0	2.5	50%
This is Your Brain on Music: The Science of a Human Obsession	Daniel Levitin	320	2.5	3.0	1.5	47%

Stop Talking & Start Digging: The Importance of Getting Dirty with Data

Today’s world can be characterized by increasing speed, complexity, disorder, and interconnectedness. For organizations trying to understand their operating environment, develop products, improve services, advance their mission, identify gaps, and support overall decision-making and strategic planning, this presents a wide array of challenges. As a result, organizational processes should be focused on overcoming these challenges and should be driven by the desire for solutions – forward-looking solutions that better understanding, improve productivity, increase efficiencies, and maximize the chance for success.

Finding or creating a solution to a complex problem requires careful planning and thought. We must beak down the problem into simpler, manageable components, identify and characterize root causes, and involve relevant stakeholders in discussions and feedback sessions. We must look across our sources of data, identify any real limitations and gaps, and plan how to execute some analytical methods across the data to extract insights. The problem is, in a world of accelerating information, needs, and problems, it’s just too easy to get caught in the planning and thinking stage. We need to get down and dirty with our data.

In the year 2011, we are surrounded by resources, libraries, catalogs, tools, and software – much of it open source and/or freely available for our own personal (AND collective) use. We must learn to access and leverage these resources efficiently, not only to perform cleansing and synthesis functions, but also to inform our collection and analysis processes to make them better as time goes on. Armed with these resources and tools, we must feel comfortable jumping right into our data with the confidence that insights will be gained that otherwise would have been lost in time.

Slicing is a helpful example of this. When faced with a high-dimension data set, usually with poorly described variables, start by slicing the data into a manageable chunk with high-powered variables – time, location, name, category, score, etc. Use a data visualization program to understand order, geospatial distribution, or categorical breakdowns. Describe the data and ask questions about how collection processes led to any gaps that exist. Simple slicing and dicing separate from the root analysis can often chart a potentially workable path forward.

The bottom line is that whether it’s dirty data or larger-scale, socially-complex problems, we sometimes need to shorten the discussion of the problem itself and get our hands dirty. Sometimes we need to create a little chaos upfront in order to shake things loose and find our intended order, structure, and path forward. After all, planning your dive is important, but sometimes you need to just dive in and see where it leads you.

Building Blocks, Foundations, & Enterprise Architectures

Languages (spoken, visual, mathematical, etc.) exist because they are the building blocks for communication, understanding, and ultimately, relationships. Relationships form the foundation for social networks, communities, strategic partnerships, and more complex systems. These systems, and the interaction of within and across such systems, is a basis for life and living.

The problem is, the definition and conceptual understanding of these building blocks, foundations, and higher-level systems often does not exist. As a result, technology development efforts, strategic partnerships, marketing campaigns, and the like suffer from a lack of true coordination and comprehension.

In general, identifying building blocks, establishing foundations, and defining more complex systems and interactions is critical to advancement in this world. In most cases, establishing these foundations is a much needed platform for coordination and comprehension that supports achievement of a higher objective. In other cases, attempting to define abstract concepts and inherently complex systems is a fruitful exercise in itself, driving constructive debate, new questions, and lessons learned for the primary stakeholders involved.

With this in mind, I seek to outline some building blocks and establish a simple foundation for enterprise architectures. My hope is that by initiating this exercise, it may provide some conceptual clarity to non-technical folks and demonstrate a framework through which other systems can be defined and explored.

The Building Blocks of Enterprise Architectures

In general, an enterprise represents people, information, and technology joined by common needs, objectives, and/or behaviors. An enterprise architecture helps define the structure of the enterprise to enable the people, information, and technology to interact in an efficient, effective, relevant, and sustainable manner.

People – Represents individuals or the various organizational constructs that contain individuals, such as a program, agency, domain, or community of interest.
Information – Represents all consumable data, products, and knowledge that is collected or created by other elements of the enterprise.
Technology – Represents the infrastructure components, networks, capabilities, systems, and programs that support other elements of the enterprise.

The Foundation for Enterprise Architectures

Now that the puzzle pieces have been broadly defined and we have a simple lexicon to work with, we seek to: (1) outline how these building blocks might fit together to support various operational needs, analytical use cases, and other tasks/functions; and (2) identify the logical connections, interactions, processes, and/or relationships between and amongst the building blocks.

The diagram below begins to define this foundation, logically placing enterprise elements (people, information, technology) to support coordination and comprehension. This would then support the examination of each possible pair of building blocks (e.g. people and information) to define the enterprise architecture and identify critical interdependencies within the system.

Enterprise Architectures: Technology Focus

To this point, establishing definitions and diagrams provides us with a core foundation for understanding end user requirements, identifying security implications, pinpointing system interdependencies, and supporting system analysis efforts. Focusing in on the technological components of our enterprise architecture, we have categorized them into three logical tiers:

Top Tier (Front-End) – Represents the technologies that support end-user interactions (data access, analysis, visualization, collaboration, input, personalization, etc) with information/data and other stakeholders.
Middle Tier – Represents the utilities, services, and support components that optimize system interactions amongst all people and information.
Bottom Tier (Back-End) – Represents the core information architecture, system security, and access / identify management components to support a secure, efficient, and effective operation.

The bottom line is that defining building blocks and outlining foundations is a critical first step to support coordination and comprehension. Sometimes just putting words and diagrams on paper saves valuable design and development hours or at least drives valuable discussion. Particularly in the world of enterprise architectures, this process is critical to align stakeholders up front and to put development efforts in perspective. Whether it’s boxes, lines, definitiosn, or discussions, sometimes a little language goes a long way.

Ten Technological Concepts Compressing the Analytical Timeline

Today’s difficult economic climate continues to cause increased competition for all organizations. Shrinking budgets are placing government departments and agencies under more pressure to increase operating efficiencies and cost-effectiveness of programs and technologies. Across industry, fragile markets have caused organizations to consider the need for every project, person, and printer to reduce operating costs. In the non-profit sector, slimming funding streams have caused an increased pressure to demonstrate value through concrete, measurable results.

In order to stay competitive within their particular domains, markets, and user communities – and to ultimately achieve growth and sustainability in any economic climate – all organizations must find ways to increase operating efficiencies, eliminate programmatic redundancies, and produce measurable results. Lucky for these organizations, several technological concepts have emerged over the past decade which help support these practices. In that regard, the acknowledgement, understanding, and implementation of these concepts across organizational units, programs, and processes will compress the analytical timeline and allow organizations to learn, control, adapt, and anticipate over time.

Here’s a quick look at some of the technological concepts/trends that are compressing the analytical timeline, allowing organizations to act on insights more quickly, more effectively, and more accurately:

Data Collection Mechanisms – It’s not just about collecting more data, although volume (in many cases) helps. It is about collecting more types of data (image, audio, video, open source media, social media) and collecting more tactical data. The growth of the mobile and tablet markets, the ease-of-use of such devices and their decreasing costs, and the expansion of mobile network infrastructure around the world are helping organizations collect more diverse, tactical, and (ultimately) valuable data.
Data Cleansing/Processing – Rather than ignoring unstructured data, we are beginning to embrace it. Many COTS, GOTS, and even open source technologies exist that cleanse and process unstructured data to ensure it can be used to support relevant use cases. Where unstructured data was formerly omitted from the analytical landscape, these technologies are now bringing new value and context to insights and decisions. Within this I want to also add the data storage/warehousing and processing capabilities that support big data analytics and data mining, which provides a quicker means by which the vast amount of data can be combed for relevant patterns and insights.
Logical Data Structures – It seems we are finally learning that a little thought and planning up front does wonders for the types of analysis needed to support operations research, performance measurement, marketing, and other organizational practices. By building logical data structures, we can quantify things otherwise unquantifiable and ultimately make timely, informed decisions otherwise made by intuition alone.
Data Standards/Models – In conjunction with building supportive, internal data structures, we are beginning to understand how data models within domains, across communities of interest, and for specific problem sets can do wonders for our analytical practices. By developing and/or adopting a standard, we can bring consistency to these analytical practices over time, even through personnel changes. No more one-off studies/reports, but rather repeatable and communicable analysis.
Data Source Registries/Catalogs – It is slowly being understood that ubiquitous access to raw data sets is far from a reality. However, organizations are beginning to realize that data source catalogs (registries) across organizational units and/or communities of interest is a step that can quickly facilitate more effective data sharing practices. Rather than focus on the exposure of raw data, the data source catalog first involves the exposure of data source metadata – information about the data, but not the data itself. This data sharing approach is more strongly rooted in trust and visibility and, ultimately, can provide a platform by which analysts can gain quicker access to more relevant data.
Social Networks – The social network movement has done many things to compress the analytical timeline, to include, but not limited to: driving more collaboration and interaction between data owners, analysts, end users, and ordinary people; driving a new means by which more tactical data can be accessed and collected; and facilitating the development of new platforms, applications, and technologies to glean insights from data.
Identity Management, Access Control, & Cyber Security – Knocking down stovepipes can support better access to data which in turn can support less time collecting data and more time analyzing it. However, stovepipes provide organizations with another layer of security to prevent data breaches. Despite this contradiction, better identity management, access control, and security technologies are being developed to maintain a high level of control while still ensuring users can more easily access data traditionally hidden within stovepipes. In turn, the time spent accessing and integrating data is decreased and individuals can spend more time analyzing disparate data and delivering quality insights.
Cloud Computing – The movement of information systems and applications to the cloud is transforming the analyst from being a thick-client-loving info hog to being a platform-agnostic, collaborative participant. With more data and tools exposed to individuals, no longer constrained by a single hard drive or device, analysts can more effectively and efficiently access, collect, integrate, visualize, analyze, share, and report on data and insights.
Network Infrastructure – The expansion of existing connected and wireless networks as well as the development of new, quicker, more accessible, and more secure networks will continue to compress the time it takes for analysts to provide valuable insights.
Customizable & User-Defined Interactions – Allowing individuals to define how they wish to visualize, analyze, and interact with relevant data provides analysts with the ability to focus on developing solutions rather than setting up problems. The “user-defined” movement provides flexibility and adaptability to the individual and allows a wider set of individuals to become analysts by owning their own workspaces and interactions. It also provides an interactive medium through which results can be presented, making the reporting and dissemination process interactive rather than a drawn out one-way street.

I do want to note that this list is by no means comprehensive. Even more importantly, it only focuses on technological concepts and does not address the numerous cultural and political factors that affect the analytical timeline. Although technology shall continue to be a major focus area in supporting quicker and more effective analytical practices, it is the cultural and political aspects that will be more difficult to overcome and their interdependence on the technological aspects should never be overlooked.

World Statistics Day and the Importance of Statistics in Government

The most recent issue of Amstat News features a wonderful summary of the first ever World Statistics Day, which just occurred on October 20, 2010. The article features a series of quotes from the chief statisticians at various U.S. government agencies, all of which serve as a great overview of the critical importance, broad applicability, and growing need for statistics and statistics professionals in the U.S. and around the world. Collectively, we must embrace not only the numbers, data, methods, analyses, and reports, but also the conversations and the debate around such components. In a world heavily fueled by data, I’m very glad that statistics is gaining more international awareness and recognition so that all our lives can be bettered by more informed decisions and debates.

“Statistics produced by the federal government inform public and private decisionmakers in shaping policies, managing and monitoring programs, identifying problems and opportunities for improvement, tracking progress, and measuring change. The programs of our statistical system furnish key information to guide decisionmakers as they respond to pressing challenges, including those associated with the economy, agriculture, crime, education, energy, the environment, health, science, and transportation. In a very real sense, these statistics provide data users with a lens to focus the myriad activities of our society into a more coherent picture of the status, progress, and trends in our nation. The ability of governments, businesses, and individuals to make appropriate decisions about budgets, employment, investments, taxes, and a host of other important matters depends critically on the ready availability of relevant, accurate, and timely federal statistics. Our economy’s complexity, growth, and rapid structural changes require that public and private leaders have unbiased, relevant information on which to base their decisions.”
– Katherine Wallman, Chief Statistician, Office of Management and Budget and Past President of the ASA

A few more important (and relevant) statistics resources can be found at:

FedStats.gov – provides access to the full range of official statistical information produced by the Federal Government (html)
Overview and Coordination of U.S. Federal Statistics System – 2007 (OMB, 2007, pdf)
Federal Statistics Programs – FY2009 (OMB, 2009, pdf)
“Top Statistician Explains What All Those Numbers Mean” – Interview with OMB’s Katherine Wallman (ScienceNews, Nov 2010, html)

Postulating Possible, Probable, and Preferable Futures

With regards to futurology and future studies, I’ve recently posted on the principles of forecasting as well as the origins of opportunity – two distant yet related topics that exemplify the breadth and depth of the field.

As my own futures research has progressed, I’ve found various sources that have proven to be quite valuable in guiding my curiosity and conjecturing. I think some of these are worth passing along:

Acceleration Studies Foundation (ASF) – ASF is an educational 501(c)(3) nonprofit engaged in outreach, education, research, and selective advocacy with respect to issues of accelerating change.
Futurology (Wikipedia)
Institute for Alternative Futures – The Institute for Alternative Futures (IAF) is a nonprofit research and educational organization founded in 1977. IAF and its for-profit subsidiary, Alternative Futures Associates (AFA), specialize in aiding organizations and individuals to more wisely choose and create their preferred futures. IAF works with clients to create forecasts, scenarios, goals and strategies that are the essential tools for transforming organizations to succeed in times of rapid change.
Institute For The Future (IFTF)
Principles of Forecasting (ForPrin)
Shaping Tomorrow – Online community of futurists and futures research
Shaping Tomorrow (Ning Network)
“The Time Lords” (Financial Times, 1/30/2007)
World Future Society (WFS) – The World Future Society is a nonprofit, nonpartisan scientific and educational association of people interested in how social and technological developments are shaping the future. The Society was founded in 1966 and is chartered as a nonprofit educational and scientific organization in Washington, D.C.
World Futures Studies Federation (WFSF) – The World Futures Studies Federation (WFSF) is a global NGO that was founded in the 1960s to encourage and promote the development of futures studies as a transdisciplinary academic and professional field in all parts of the world. WFSF operates as a global network of practicing futurists – researchers, teachers, scholars, policy analysts, activists and others from approximately 60 countries.

In a nutshell, futures studies is exactly that – studies of the future. It’s about the determining the total set of possibilities for tomorrow, finding the most probable of outcomes for tomorrow, and leveraging advanced knowledge to even shape the future.

Wikipedia, as spread across the spectrum of credibility as it may be, provides a pretty decent explanation of the mindset needed to be truly forward-looking. What qualities are required of a futurist, within any organization or for any requirement or need, to fully anticipate the unknown using the set of resources at his/her disposal (data, intuition, logic, technology, vision, science, etc.)?

Trend Assessment – The competency to understand trend directions, weak signals and wildcards, assess their likely impact and effect on one another and respond in a timely and appropriate manner
Pattern Recognition – The ability to see patterns rather than individual factors
System Perspective: The capability to envision the entire system rather than the isolated components
Anticipation: To anticipate short and long term consequences over time, novel situations and geography
Instinct & Logic: To rely on a combination of instincts and logic rather than purely rational analysis

Yes, there is a plethora of philosophical, political, religious, scientific, and even incomprehensible factors that may give shape to the future (or take that shape away). But that should not deter us from facing it head on. We all need to work smarter not harder, avoid surprises, exploit new opportunities, plug weaknesses, and (where possible) influence the future.

A List of Some Web Data Sources

Well I needed to pull together a listing of publicly available data sources for a project, so I figured I’d post them here as well. Some descriptions and tag lines have been taken directly from the website, and some I quickly created on my own. This list is by no means comprehensive (I probably have about 100 links in the “Data” folder of my bookmarks…) but it’s a quick snapshot at some useful data sources on the web. That being said, there are a lot of considerations when targeting a data set and tomorrow’s need for data will most likely differ from today’s need for data. Build and execute a target data strategy using the vast sets of search engines, libraries, and social networks on the web and you’ll be just fine.

AggData – The advantage of AggData is that the data is collected into one file that is very raw and portable, which makes it easy to integrate into any application or website. You can browse free data sets or purchase any of the many data sets from public and private organizations for a relatively small fee.

The Association of Religion Data Archives – The ARDA Data Archive is a collection of surveys, polls, and other data submitted by researchers and made available online by the ARDA. There are nearly 500 data files included in the ARDA collection. You can browse files by category, alphabetically, view the newest additions, most popular files, or search for a file. Once you select a file you can preview the results, read about how the data were collected, review the survey questions asked, save selected survey questions to your own file, and/or download the data file.

Census.gov American FactFinder – In American FactFinder you can obtain data in the form of maps, tables, and reports from a variety of Census Bureau sources. Click here for a good listing of available data sets, visualizations, and search functionalities.

CIA World Factbook – Contains a lot of country-level metrics/statistics, although they are not very easily exportable and/or available in table format.

City Population – Gazetteer of global geographic data and limited demographic statistics per location.

Data360 – This is essentially a wiki for data. Data360 is an open-source, collaborative and free website. The site hosts a common and shared database, which any person or organization, committed to neutrality and non-partisanship (meaning “let the data speak”), can use for presentation of reports and visualizations about the data.

Data.gov – The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data. Visit today with us, but come back often. With your help, Data.gov will continue to grow and change in the weeks, months, and years ahead. For more information, view our How to Use Data.gov guide.

Data Marketplace – Buy and/or sell data. You can request data sets for others to build and provide for a small fee.

DBpedia – DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.

EconoMagic – A directory of data sets specific to US states.

Factual – Factual is a platform where anyone can share and mash open data on any subject. Factual was founded to provide open access to better structured data.

FedStats – Provides access to all federal statistical agencies (by geographic scope or listed alphabetically) with a search function to discover available data sets across all US federal statistical agencies.

Gapminder – A non-profit venture that, through a interactive viz tool accompanied by a listing of available data tables, aims to “unveal the beauty of statistics for a fact based world view”.

GeoCommons Finder! – Upload, organize and share your Geographic Data. Then you can use their built in application called Maker! to map/visualize it.

GeoNames – The GeoNames geographical database covers all countries and contains over eight million placenames that are available for download free of charge.

Global Airport Database – Comprehensive set of global airport data (download available for free).

Global Health Facts – Search global data by health topic and/or country. You can also interactively compare data for up to five countries at a time.

Google Public Data – In addition to plainly using the main Google search engine to search for a specific data set, Google has a public data library with some valuable sets available for free.

Guardian.co.uk Data Store – Governments around the globe are opening up their data vaults – allowing you to check out the numbers for yourself. This is the Guardian’s gateway to that information. Search for government data here from the UK (including London), USA, Australia and New Zealand – and look out for new countries and places as we add them. Read more about this on the Datablog. Full list of government data sites here.

Harvard Geographic Information Systems – Contains a highly credible listing of various national and international data providers and data sources, with a strong focus on geographic data.

International Civil Aviation Organization (ICAO) – Global air traffic data available for a fee.

Infochimps – Request data sets, search for existing data sets, or post and sell your own data sets.

International Statistical Agencies
US Census Bureau: http://www.census.gov/aboutus/stat_int.html
US Bureau of Labor Statistics: http://www.bls.gov/bls/other.htm
United Nations: http://unstats.un.org/unsd/methods/inter-natlinks/sd_intstat.htm

MelissaData – Buy comprehensive zip code data for about $150. Tailored for businesses with use in marketing.

NationMaster – NationMaster is a massive central data source and a handy way to graphically compare nations. NationMaster is a vast compilation of data from such sources as the CIA World Factbook, UN, and OECD. Using the form above, you can generate maps and graphs on all kinds of statistics with ease.

National Association of Counties (NACO) – Includes a US county data library.

Numbrary – Numbrary is a free online service dedicated to finding, using and sharing numbers on the web.

OECD Stat Extracts – OECD.Stat includes data and metadata for OECD (Organization for Economic Cooperation and Development) countries and selected non-member economies.

QuickFacts (US Census Bureau Site) – Quick, easy access to facts about people, business, and geography.

StateMaster – StateMaster is a unique statistical database which allows you to research and compare a multitude of different data on US states. We have compiled information from various primary sources such as the US Census Bureau, the FBI, and the National Center for Educational Statistics. More than just a mere collection of various data, StateMaster goes beyond the numbers to provide you with visualization technology like pie charts, maps, graphs and scatterplots. We also have thousands of map and flag images, state profiles, and correlations.

United Nations Development Programme (UNDP) – Includes UN Human Development reports and statistics such as the Human Development Index.

USA Counties (US Census Bureau Site) – A directory of data tables for US states and individual counties. Includes over 6,500 data items.

Weather Underground – Provides free access to historical weather data for cities around the globe.

Wolfram|Alpha – Deemed a “computational knowledge engine”, the W|A search and discovery tool is mathematically-based and tries to turn queries (term-based or data-driven) into actionable knowledge with visualization of in-house data sets and information relevant to your query.

World Gazetteer – The World Gazetteer provides a comprehensive set of population data and related statistics.

World Port Source – Contains extensive data on global sea ports, characterized by size and searchable by shipping liners and other various data fields.

Curiosity, Passion, and Quantifying Human Characteristics

“You can’t light the fire of passion in someone else if it doesn’t burn in you to begin with.” – Thomas Friedman

In his The World Is Flat, Friedman speaks to the growing need for curiosity and passion in today’s job market. Core intelligence, as historically measured by the Intelligence Quotient (IQ), is and will always be important, but in a flat world it’s the curiosity and passion that will matter most.

Friedman references a Curiosity Quotient (CQ) and a Passion Quotient (PQ) that purportedly parallel the common IQ framework for scoring a person’s intelligence. More specifically, he expresses a comparative relationship between the three variables: CQ + PQ > IQ. But can curiosity and passion be measured like intelligence? More generally, can other individual characteristics be measured?

Traditional measurement is the process of obtaining a magnitude for a quantity. Things are measured by counting, and not by observation or estimation. It’s supported by strong criteria that support that measured value, such as a universal frame or scale of reference. By traditional measurement, we cannot really find CQ, PQ, or even IQ. However, there are other types of measurement…

In representational theory, measurement is defined as a correlation of numbers with entities that are not numbers. In information theory, measurement is actually a component of estimation with the uncertainty reduced infinitesimally to zero. Measurement means estimating through support of any number of measurable or unmeasurable parameters, and reducing uncertainty through various means until reaching a high-confidence end value. By the extended definitions of measurement, we can practically quantify anything!!!

So what do we get by measuring traditionally-unmeasurable human characteristics, emotions, abilities, and qualities? What do we get by identifying any new particular Qualitative Quotient (QQ) such as the CQ or PQ? Well, Friedman is on the right track here. We become smarter by surpassing our current understanding of intelligence. And as our QQs surpass the IQ, so does our ability to flatten the world, innovate, grow and succeed as a civilization and society.

The process of trying to quantify characteristics helps us realize the underlying factors that contribute to a specific quality. What makes someone passionate? How can we tell if someone is curious? Is it genetic, demontrated by experience, and exhibited sub-consciously? Can it be determined through the collective interpretation of dreams? Examining the underpinnings of qualities makes us more intelligent as individuals, organizations, and societies. Once quantified, we can look for patterns and trends in our data across different geographies, demographics, and slices of traditionally-measurable data.

What we’ll learn then, well, I’m curious to find out.

kevin berardinelli

Tag: data