Best of the Interwebs

Many years of doing things on the Internet have taught me that virtually anything (knowledge-wise) is available online. I remember that at the age of 16 I found an old 1930s-era telephone; I found the wiring schematics online, rewired it and made it work. This was back in the mid to late 90s when the commercial Internet was booming, but fast and easy access for all was not necessarily the rule.

What follows is a list of online resources that make available all kinds of information quickly and easily. If you know of any good websites please comment with information such as topic / subject matter and URL. As a final comment I am intentionally leaving out search engines, e.g. Google, because everyone knows about them and they require a certain understanding of the Internet to be really effective.

General Knowledge:

  1. Wikipedia, http://en.wikipedia.org
  2. Wikibooks, http://en.wikibooks.org (There are a number of great resources under the wiki heading; you can find them at the bottom of the Wikipedia main page.)
  3. iTunes U (This is available via the iTunes store and contains thousands of free video and audio recordings from universities around the world.)
  4. Adobe Media Player (You can download this from Adobe. It has great informational videos as well as episodes of Star Trek, 90210, etc.)
  5. Digg, http://digg.com (Harnesses the power of the random surfer to find all kinds of stuff. It’s not the most organized collection of info, but neither is this blog post.)

General Science:

  1. Wolfram Alpha, http://www.wolframalpha.com/index.html

History Source Material:

  1. Internet History Sourcebook Project, http://www.fordham.edu/halsall/ (This is a listing of primary source materials for all epochs of recorded human history. Indispensable to the history buff or history major.)
  2. Project Gutenberg, http://www.gutenberg.org/wiki/Main_Page (This should not necessarily be under the history heading, but this site contains tens of thousands of free e-books [historical because their copyrights have expired].)

Music:

  1. Ricci Adams’ musictheory.net, http://www.musictheory.net/ (Includes various utilities and trainers, all available for download)
  2. Chordie, http://www.chordie.com (Includes tons of guitar chords, tablature, lessons and instruction for free)

Tech Stuff:

  1. Revision3, http://www.revision3.com (A website of free videos about everything cool, technical and some other random stuff.)

Data Management for n00bs 2.0

What do we want from our data?

There are 6 generally accepted qualities that data should have.  Data should be:

  • Accurate:  Reliable and precise, i.e., the degree to which a quantity is correctly expressed
  • Relevant:  Appropriate to decision making requirements
  • Secure:  Cannot be inadvertently or unscrupulously destroyed, accessed or altered
  • Shareable:  Many people can access it
  • Timely:  Most current information
  • Transportable:  Easily ported to the necessary decision makers

Data and information may take many forms, some of which may be difficult or nearly impossible to capture.  Graphs, images, tables, reports, documents (narrative) and sales materials are all types of data and information.  Corporate culture and social networks are also types of data and information that are heavily used, but difficult to capture.  When a new employee arrives, they often tap into this soft data management network that is collectively managed by the employees.  Who hasn’t started a new job and wondered “Who do I talk to about this problem…?”  Usually we go to someone and ask.  That person will then tell us “how things are done here” (corporate culture) or “go talk to so-and-so” (social networking).  These are vital to a company, but they are rarely documented, and even more rarely documented well.

This illustrates the difference between hard and soft data and information.  Something that is hard is not reliant on a flow of energy (computer or human brain), e.g., things found in filing cabinets, hard drives, drawers, warehouses, etc.  Something that is soft is found in the human brain, in RAM, on a computer monitor (and not saved), etc.  Granted, the average human brain will work in some capacity for around 70 years, which is longer than the life of any hard drive up to this point, but what it holds is still considered soft information.  Why?  I don’t know, it just is.  It probably has something to do with our current inability to directly access data stored in the human brain via IT.

If we roughly reverse the good values of data we get the 6 things we want to avoid:

  • Badly Interfaced:  Hard to scrupulously access
  • Delayed:  Long waits for requested data
  • Not Integrated:  Data is scattered all over the dang place
  • Redundant:  Same data stored in different places for no good reason
  • Uncontrolled:  Data gets into the database willy-nilly
  • Unrealistic:  Data doesn’t reflect a real world need

These are bad, don’t do them.

The beauty of the web is the power of interconnectivity: many businesses can now have both internal and external data sources.  Internal sources are those collected by the company.  External sources are databases compiled by some entity outside the company.  The real challenge becomes integrating data from an external source, where the IS department has no control over the 6 positive data qualities.

For all my local cohorts, I hope that this is in some way helpful or thought provoking.  Our clients are big on Web 2.0, even though when pressed for an exact definition they may not be able to give a clear, unified picture of their views and needs.  This is normal, as most clients know what they need, but not exactly how to get it or what form it should take.  In any case, remember what is good and what is bad about data management.  This may help guide certain systems forward.  Or something.  This kind of thing is found in books anyway, so someone thinks it’s important.

Source:  Mostly from Wiley’s Data Management, Databases and Organizations, by Richard T. Watson, 4th ed.

Data Management for n00bs 1.0

Data, information and knowledge are the most important resources to many businesses today.  Though often used interchangeably, they are distinct and separate.  More importantly, the distinctions between these are subjective, based upon business requirements.

Data are raw facts; they are not useful in making decisions.  They are the numbers that make up sales, metrics, etc.  The addition of some meta-data (data about the data) can make them easier to use and comprehend, but for them to be useful they must be converted into information.

Information is collected data, organized and interpreted into something that can aid in the decision making process.  A chart comparing sales data from multiple regions across time could be a type of information used to make a decision.  A chart showing overall unit test coverage per line of code might be another example of information.
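To make the jump from data to information a little more concrete, here is a minimal Python sketch.  The sales records, region names and the summarize_by_region helper are all made up for illustration; the point is only that the raw rows get organized into something a decision maker could compare at a glance.

```python
from collections import defaultdict

# Hypothetical raw data: (region, quarter, units sold). All values are invented.
raw_sales = [
    ("North", "Q1", 120),
    ("North", "Q2", 150),
    ("South", "Q1", 200),
    ("South", "Q2", 170),
]


def summarize_by_region(records):
    """Organize raw records into per-region totals and first-to-last change."""
    by_region = defaultdict(dict)
    for region, quarter, units in records:
        by_region[region][quarter] = units

    summary = {}
    for region, quarters in by_region.items():
        # Sort by quarter label so the values are in time order.
        ordered = [quarters[q] for q in sorted(quarters)]
        summary[region] = {
            "total": sum(ordered),
            "change": ordered[-1] - ordered[0],
        }
    return summary


summary = summarize_by_region(raw_sales)
for region, stats in sorted(summary.items()):
    print(region, stats)
```

The individual rows are data; the per-region totals and trends are closer to information, because they can be read directly against a question like “which region is slipping?”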

Knowledge is the ability to turn data into information and apply said information.  An individual may be presented with data showing the performance metrics for a piece of software.  They cannot transform that data into information for a decision maker if they do not know what GHz represents.  Moreover, information is useless if a decision maker cannot apply it to a real situation.

As previously noted, these terms are subjective.  Data for one person may be perfectly valid information for someone with a higher degree of expertise.  Knowledge may allow two individuals to extrapolate two completely different sets of information for the same data.

Example:

Data = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34…

For some people this may mean nothing at all.  In case you are one of those people, here is some knowledge.  This is the Fibonacci sequence, and each number is made by adding the two numbers before it, i.e., 0 + 1 = 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5, 3 + 5 = 8, etc.  Two different people might extrapolate the following information from this data.

Person 1:  The ratio of each successive pair of numbers in the series quickly converges on 1.61803…, as 5 divided by 3 is 1.666…, and 8 divided by 5 is 1.60.  This value is called phi.

Person 2: This pattern shows the ideal growth of a population in gendered reproduction with one living offspring per gestational cycle.  An example of this is rabbit populations.

The first piece of information might be useful to a mathematician and the second to a farmer, though the average person may still think of the information as data without more knowledge.
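For anyone who wants to check Person 1’s claim, here is a small Python sketch.  The function names (fibonacci, successive_ratios) are just ones I picked for illustration; the code builds the same ten numbers shown above and divides each number by the one before it.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0."""
    seq = []
    a, b = 0, 1
    for _ in range(count):
        seq.append(a)
        a, b = b, a + b
    return seq


def successive_ratios(seq):
    """Divide each number by the one before it, skipping the leading 0."""
    return [b / a for a, b in zip(seq, seq[1:]) if a != 0]


data = fibonacci(10)              # the raw data from the example
ratios = successive_ratios(data)  # information derived from it
phi = (1 + 5 ** 0.5) / 2          # the golden ratio, for comparison

for ratio in ratios:
    print(f"{ratio:.5f}  (off from phi by {abs(ratio - phi):.5f})")
```

The gap between each ratio and phi shrinks as you move down the list, which is the convergence Person 1 noticed.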

What does this ultimately mean?  Using this knowledge about data, information and knowledge, a person can begin analyzing the data being collected by an organization to determine what information is needed, what data supports that information and what knowledge is required to coordinate data and information.  From there IS is born in the realm of data management.