What is Information?

By Gilbert Carl Herschberger II


Something seems amiss. We have an industry to create it, store it, move it around; but, how many of us know what it really is? Now is the time for our industry to embrace a more precise definition and re-examine the impact of overdependence on a data model in an information age.

Introduction

This is an armchair guide to conceptual space. Conceptual space is where all concepts are. Just as all physical objects are found in physical space, all conceptual objects are found in conceptual space. And just as we should explore physical space, we should also explore conceptual space.

Laws are different between physical and conceptual space. Unlike physical space, permanent landmarks are rare, and points of interest are transient. Concepts tend to move around as we explore, because we explore, as a direct result of our own exploration. We change concepts as concepts change us.

This book, like a treasure map, might help you to discover something of value. This can be a challenge, so prepare yourself well. This subject is both difficult and fundamental. It is difficult because the best answer is not obvious. It is fundamental because the way in which we answer this question influences every other thing we do.

Appropriate caution should be exercised. Because it is both difficult and fundamental, we must employ a deliberate and methodical approach. We do not desire to use an approach that could lead us to a misleading answer, one that might obscure the significant and advocate the mundane. We need a breakthrough.

Approach

Discovering the right question is a critical part of conceptual research. Even while they are necessary, assumptions should eventually be challenged. A lot of questions must be asked. Compelling ones must be captured for further study.

Our approach is based upon a classic physical strategy known as divide and conquer. This physical strategy seems to translate well to conceptual space. After a measure of exploration, we divide this puzzle into a number of smaller problems. A smaller problem is easier for us to conquer. Eventually, we can conquer all problems and solve the puzzle.

Here is how we have chosen to divide this puzzle so far:

  1. Assumption
  2. Model

We intend to answer these questions:

  • Why must intelligence depend on assumption?
  • What are the differences between data and information models?

We will gather evidence from our experience in physical space to build a model of conceptual space. Physical space is an inexhaustible source of concepts. The important question remains: Which of these are valuable?

Question: What is information?

Most, if not all, of the confusion about information comes from failing to take this question seriously. The answer is obvious, isn't it? A mere child knows it. We know it, too. Or rather, we might assume we do.

The role of assumption

While our presentation is linear, conceptual space is not. Every concept is interconnected with every other concept. Every concept touches every other concept. There is no start point, no end point. Conceptual space is a complex tangle of unpredictable connections.

There may be nothing like this in physical space. While a useful model of physical objects can safely include isolation, a model of conceptual objects cannot. Isolation is where one object does not touch another. When grossly oversimplified, a complex tangle of connections might resemble a (spider's) web or a (fishing) net.

Instead of believing that two concepts are unconnected, we should assume that connections exist, just waiting to be discovered.

Because conceptual space is non-linear, it is impossible to perfectly isolate a well-defined concept as a starting point. Pre-existing well-defined concepts are required in order to define our first concept. This is a paradox: we cannot build our first concept from a collection of well-defined concepts because we have not yet built our first concept.

We must start somewhere so we always start with assumption. Assumption is a mechanism we must use to temporarily suspend the paradox. An assumption is a poorly-defined concept. It is a place-holder, a way to defer exploration and definition to a later time.

Using a precise definition, assumption does not imply that a mistake has been made. Assumption exists without regard to intent. An assumption could be intentional, poorly-defined because we have decided to defer its definition. Or, it could be unintentional, poorly-defined because we are as yet unaware that it ought to be defined.

By definition, then, you are making an assumption. Risk comes from ignoring this.

Assumption is a good thing. It is critical. It is vital, healthy. It is a fundamental part of thinking and analysis. Thinking is the use of biological information processing. Analysis is the thorough examination of information, which depends on thinking.

Imagine a world without assumption. Without assumption, information itself might not exist. Thinking would be impossible because we could not bootstrap information processing itself. Analysis would be forever paralyzed. But, because we have to start somewhere, we must start with a conceptual universe filled with assumption.

Assumption is the cornerstone of creating and gathering information. Assumption enables us to examine something important to us by assuming things about something unimportant to us.

What is currently important to us? We use existing information to help us. But when we begin, we do not have enough information to objectively determine what is important to us. We must assume that something is important to us. As much as we desire the contrary, this must always be subjective.

Information is subjective when we are unable to determine the value of the information. The value of information can be determined by examining and evaluating all of the related information.

Conclusion

Within this document, we have introduced a number of poorly-defined concepts. We must ask ourselves: Is further definition of these concepts important? The next section explains why the concept of information should be well-defined.


Why important

For many people, information itself is a poorly-defined concept. Its definition is assumed. Its impact on day-to-day activity is assumed. Its value is assumed.

Most, if not all, modern cultures place a high value on sharing information, analyzing information and gathering information. Information gathering may distinguish humans from animals. Our survival depends on it.

Why bother to ask? Information has already been defined for us in an off-the-shelf dictionary. It has already been defined in many books on data processing, computer programming, management information systems and information technology. It doesn't need to be spelled out or analysed fully, does it?

It is all too easy to find an imprecise definition. An imprecise definition is what one should expect to find in the dictionary. That is what you'll also find in many so-called information technology books. One author after another parrots the imprecise definition with great authority.

Many have mistakenly assumed they know the answer. Many are comfortable and content with its imprecise definition.

But, what is the risk? Most of the time, there is no extraordinary risk. A person can fail to answer the question fully and experience no harm, no risk, no danger. Most of the time, it does not matter at all.

On the other hand, it is very important some of the time. And failing to answer this question can expose a person, a team, a company or an industry to extraordinary risk.

When we don't think deeply enough about it, we mistakenly confuse information with something else. That confusion can cause us to provide a solution to the wrong problem. We can imagine we are helping people to manage their information when, in fact, we are making their situation worse.

Precise Definition

Here is a more precise definition of "information":

Anything that anyone could possibly want to ask, to know or discover, without regard for its merit, accuracy or timeliness.

Information includes both the truth and the lie, both the accurate and inaccurate, both the precise and nebulous. It covers the full and complete spectrum of reason and fallacy, seriousness and humor, opinion and hyperbole.

Information includes both what we know about a measure and a measurement. Our perception is shaped by it and it is shaped by our perception. It exists apart from its value.

Information includes both the lost and the found. We create it, develop it and throw it away. We store it, retrieve it, move it around. We accept it, ignore it and battle against it. It is both part of the work we have done and work we have never got around to doing. It is the essence of every concept, idea, plan and purpose.

You are important to it and it should be important to you.

Imprecise definition

Here is a more common, and therefore more imprecise, definition:

Anything of value that a typical person would want to ask, to know or discover.

This definition puts a spin on the kind of information a typical person would want. A typical person is reasonable, right? It is safe to assume that a typical person wants to gather valuable information, isn't it? The accurate can be more valuable than the inaccurate. In other words, a "truth" may be more valuable than a "lie". Accuracy and precision can easily make the difference.

Many intelligent and reputable people have worked with the imprecise definition when working with information technology. If it weren't for the tragic and far-reaching consequences of this oversimplification, this would not be a problem. We could go on building information models in which everything is defined with great precision and every model works properly every time, without delay. Further, everyone would be fully trained to understand their part in the information ecology.

But reality is more powerful than fantasy. All those working within the bounds of reality desperately need information technology based on reality, not fantasy.

What is data?

Let's try to get this straight. Data is information, but information is not just data. Data is a kind of information. When information is more precise, more objective, more easily understood, it can be reduced to data. More people can agree on data and more readily accept it.

This is an important part of understanding and embracing the nature of information. Most of the information we gather cannot be defined. Most is nebulous, yet to be understood, to be analysed, to be researched and discovered. Therefore, people are more likely to dispute it, challenge it and resist it. Only when information is dissected and digested can the more subjective part be removed safely, leaving the more objective parts.

Comparison

It may be challenging to appreciate the mangled interaction between these precise and imprecise definitions. I say mangled because it is difficult to speak of this with precision when using one word to describe two very different things.

The more imprecise definition of "information" is based on an assumption. It assumes that we want to gather something of value. Therefore, it ignores the reality that we sometimes gather the value-less, the inaccurate, the old and imprecise.

"Information", as defined imprecisely, is a kind of "information", as defined precisely.

In terms of sets, valuable information is a subset of all information. If we start with all of the information that could possibly exist and we start eliminating the valueless and redundant, what remains is a subset of information that we subjectively determine to be something of value.
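The subset relationship can be sketched with Python sets. This is only an illustration: the sample items and the `judged_valuable` selection are hypothetical, and deciding what counts as valuable remains a subjective call.

```python
# Hypothetical items, invented purely for this sketch.
all_information = {
    "verified sales figure",
    "office rumor",
    "outdated price list",
    "customer complaint",
}

# A subjective judgment of value, assumed here only for illustration.
judged_valuable = {"verified sales figure", "customer complaint"}

# Eliminate whatever was not judged valuable.
valuable_information = {item for item in all_information if item in judged_valuable}

# Valuable information is a subset of all information, never the reverse.
is_subset = valuable_information <= all_information

# What was eliminated is still information; it simply was not kept.
discarded = all_information - valuable_information
```

Note that `discarded` is not empty: the eliminated items still exist as information under the precise definition.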

"Information" in the second, imprecise sense is a subset of "information" in the first, precise sense.

Therefore, what we often call "information technology" is more of a data technology. It should be no surprise that long ago the first computer systems provided a platform for information first (a file subsystem) that could be used to build a platform for data. And yet, the file subsystem must depend on and assume certain data structures to store information. The quest for a high-performance free-form database seems to be redundant when we already have it in the form of a time-honored file subsystem.

Many of the people we imagine are working in the "information technology" department are actually doing the work of data processing. We continue to create database designs with little regard for the information that many businesses need to store.

The role of chaos

In the beginning, a conceptual universe is filled with chaos. As more and more of the information within the universe is examined and evaluated, it becomes less chaotic and more organized.

The size of a conceptual universe is unknown and unknowable. For an explorer of conceptual space, this is one of the thrills. We can spend hundreds and thousands of years exploring part of conceptual space without coming to the end of what could possibly be known.

There is always another niche just waiting to be explored.

While it might be reassuring to imagine that conceptual space is mostly organized with a few patches of chaos here and there, this is not the case at all. Rather, conceptual space is mostly disorganized--chaos--with a few patches here and there that have already been well examined. We can make some sense of it.

Picture this

Here is how many people think of the word itself:

Wrong: in - formation

Separating the word this way affects our definition. Some are inspired to think of a band or army corps marching in formation. It inspires us to think of information as mostly organized and fully formed. But it is not.

Here is how the word was created historically:

Right: inform - ation

As you may know, to inform means to bring something, most likely something of value, to another person. When a person has taken this action properly, it can be said that they are well informed. Information is, therefore, the conceptual stuff that one person brings to another.

The role of desire

Our desire often gets the better of us. In our shared fantasy, we would like others to give us exactly what we want, when we want it. We do not want others to give us another copy of what we already have; we do not want them to tell us what we already know.

We desire to receive reliable, trustworthy and highly valuable facts. We want to be entertained. We want information to be presented in a direct way that helps us see most clearly in the shortest amount of time. Our time is valuable; we don't want to waste it on rumor, fallacy and subjective opinion when we're being deadly serious.

This desire is strong. It is so strong, in fact, that some have allowed their desire for valuable, accurate, precise, up-to-date information to color their definition of information itself.

Tragically, this leads to the opposite of what serious-minded people need. What do serious-minded people need? They need to always know where the information came from. To fully determine the value of the information, you must be able to find out where it came from. The source of the information is part of its context. The value of information is determined more by its context than by the information itself.

The role of a message

Information alone cannot be moved from one context to another. It must be encoded by a source context and decoded by a target context. A message is the mechanism for moving information from one context to another.

A message has an important role in the movement of information from place to place. A message is made up of information and it carries information. A message is both information and an information wrapper.

Information technology products are sometimes based on a fantasy where no message is lost and every message is encoded and decoded perfectly. This may not reflect reality.

  • In the real world, messages are sometimes lost. A lost message may still be a message.
  • Sometimes, a message cannot be decoded. An un-decoded message still exists; it waits for decoding.
  • Sometimes, a message is encoded or decoded incorrectly. It still exists. It is still a message.
  • Both the encoded and decoded messages exist at the same time. It may or may not be safe to discard the one in preference for the other.

A message is information, but information may not always be part of a message. When is information a message? It depends on your perspective. Within a collection, information exists without being part of a message. But to transfer information from one collection to another, some kind of message is required. So information may or may not be part of a message depending on the way you define your collection.
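The points above can be sketched in Python. This is a minimal illustration under stated assumptions, not a description of any real messaging product: the `Message` class, the `encode` and `try_decode` helpers, and the choice of text encodings as stand-ins for "contexts" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """A wrapper that carries encoded information between contexts."""
    source: str                    # the context that encoded the payload
    encoded: bytes                 # the payload as it travels
    decoded: Optional[str] = None  # filled in only if decoding succeeds


def encode(information: str, source: str) -> Message:
    # Assumption: the source context encodes text as UTF-8.
    return Message(source=source, encoded=information.encode("utf-8"))


def try_decode(message: Message, target_encoding: str) -> Message:
    # The target context may not share the source context's encoding.
    try:
        message.decoded = message.encoded.decode(target_encoding)
    except UnicodeDecodeError:
        # An un-decoded message still exists; it waits for decoding.
        message.decoded = None
    return message


text = "caf\u00e9 opens at 9"

# A target context that shares the source encoding decodes the message.
ok = try_decode(encode(text, source="front desk"), "utf-8")

# A target context limited to ASCII cannot decode it, yet the message remains.
failed = try_decode(encode(text, source="front desk"), "ascii")
```

After a successful decode, both `ok.encoded` and `ok.decoded` are kept, which mirrors the point that the encoded and decoded forms exist at the same time.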

Formula

Our information theory has provided the following formula. It is a close approximation for the value of information and has many exciting applications.

    i = d × c²

i is a close approximation for the value of the information in this iteration and set.

d is the value of data, a mostly objective measure for information in this set.

c is the value of context, a mostly subjective measure for information in this set. Because value of context is far more significant than the value of data, it must be squared to approximate the value of information.
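A short comparison may make the effect of squaring the context clearer. This is only a sketch: the function name `information_value` and the sample values are assumptions, and both d and c are assumed to be fractions between 0 and 1.

```python
def information_value(d: float, c: float) -> float:
    """Approximate the value of information as d times c squared."""
    return d * c ** 2


# Strong data in a weak context.
strong_data_weak_context = information_value(d=0.9, c=0.5)

# Weak data in a strong context: the same two numbers, swapped.
weak_data_strong_context = information_value(d=0.5, c=0.9)

# Because c is squared, a weak context costs more than weak data.
context_dominates = weak_data_strong_context > strong_data_weak_context
```

With these sample values the first case comes to roughly 0.23 and the second to roughly 0.41, so the same pair of numbers is worth more when the higher one describes the context.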

Go, No Go

The formula can be applied to Go, No Go decisions. It is often difficult to determine if a product should be created, if a project should be completed, if a process should be analysed. The above formula can be applied to such a decision to increase your confidence that your decision is the right one. Here is how.

  1. Separate the objective measures from the subjective measures.

  2. Ask a series of five objective questions to determine a percentage, a probability of success. The five questions should be answered factually by analysing, researching and measuring. If the answer to one question is unknown or immeasurable, the formula resolves to zero.

    d = q1 × q2 × q3 × q4 × q5

  3. Ask a series of five subjective questions to determine a percentage. Separate subjective measures, but do not ignore them. The five questions should be answered honestly. If the answer to one question is unknown or immeasurable, the formula resolves to zero. By answering five subjective questions, you can exchange subjective measures for a more objective one.

    c = q6 × q7 × q8 × q9 × q10

  4. Evaluate the formula. The resulting percentage is also a measure of probability of success. It answers the question, how likely is it we will succeed?
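The four steps above can be sketched in Python. This is a minimal sketch under stated assumptions: each answer is a fraction between 0 and 1, an unknown or immeasurable answer is represented by `None`, and the names `score` and `go_no_go` are hypothetical.

```python
from math import prod
from typing import Optional, Sequence


def score(answers: Sequence[Optional[float]]) -> float:
    """Multiply the answers together; any unknown answer (None) gives zero."""
    if any(answer is None for answer in answers):
        return 0.0
    return prod(answers)


def go_no_go(objective: Sequence[Optional[float]],
             subjective: Sequence[Optional[float]]) -> float:
    """Evaluate the formula: d from objective answers, c from subjective ones."""
    d = score(objective)   # product of the five objective answers
    c = score(subjective)  # product of the five subjective answers
    return d * c ** 2


# Hypothetical answers for one iteration of a project review.
objective_answers = [0.9, 1.0, 0.95, 0.9, 1.0]
subjective_answers = [0.95, 0.9, 1.0, 0.95, 0.9]
likelihood = go_no_go(objective_answers, subjective_answers)

# One unknown answer is enough to resolve the whole formula to zero.
with_unknown = go_no_go([0.9, None, 0.95, 0.9, 1.0], subjective_answers)
```

Because every answer is multiplied in, a single low answer pulls the result down sharply, which is why the process rewards repeating the evaluation as answers improve.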

The most exciting and powerful thing about using this technique comes from your ability to repeat the process as often as you like throughout your endeavor. The formula provides more objective feedback at the highest level.

What should you do when the value of your project drops below 50%? You should work first on the component of the project that has caused the value of your project to drop. It becomes your highest priority. You must resolve that component if you are to save your project.

A low score does not mean that your project is guaranteed to fail. Rather, it means that it will be more of a challenge to succeed. The formula enables you to better understand the challenges you face.
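One way to find the component that caused the value to drop is to compare the answers from two iterations. The sketch below assumes that the largest decrease in a single answer marks that component; the function name `largest_drop` and the question labels are hypothetical, and an unknown answer (`None`) is counted as zero.

```python
from typing import Dict, Optional


def largest_drop(previous: Dict[str, Optional[float]],
                 current: Dict[str, Optional[float]]) -> str:
    """Return the label whose answer fell the most between two iterations."""
    def value(answer: Optional[float]) -> float:
        # An unknown answer is treated as zero, as in the formula.
        return 0.0 if answer is None else answer

    return max(current, key=lambda label: value(previous[label]) - value(current[label]))


# Hypothetical answers from two successive reviews of the same project.
last_review = {"team": 0.9, "platform": 0.9, "market": 0.8}
this_review = {"team": 0.9, "platform": 0.5, "market": 0.75}

highest_priority = largest_drop(last_review, this_review)
```

Here the platform answer fell the most, so under this assumption it would be the first component to work on.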

Objective measures

Success breeds success. When applying this formula to a project, these may be your objective measures:
  1. How much of this team remains the same from its last success?

  2. How much of this team depends on your product to get their work done?

  3. How much of the platform remains the same from its last success? The platform components include the target environment, operating system, programming language(s) and all other tools.

  4. How much of this team's performance has been documented, measured and reviewed?

  5. How much of the target market (your customer) remains the same?

  6. How much of the product's code, data, data structures, processing model and class/object design remains the same? When starting a new product, give a higher score when starting with a similar product and existing libraries of reusable components. Data taken from the real world deserves a higher score than data created for the sake of having data.

Subjective measures

Confidence and enthusiasm breed success. When applying this formula to a project, these may be your subjective measures:
  1. How much of your project has been defined? budgeted? scheduled? approved?

  2. How much of your product is understood by your target market (your customer)?

  3. How confident is your team that you can deliver what is promised within the constraints of budget and timeframe?

  4. How well do you know your target market? company? suppliers?

  5. How well do you and your team know your mission? goal? strategy? tactics?


Inside Gilbert's World, we promote the study of trailing-edge information technology to pave the way to a better future.

Copyright Gilbert Carl Herschberger II