Data Definition

Data is a collection of distinct pieces of information, particularly information that has been formatted (i.e., organized) in some specific way for use in analysis or making decisions. It is the plural of datum, which can be defined as a statement that is accepted at face value (at least temporarily).

Information can be broadly defined as any pattern that can be recognized by some system (e.g., a living organism, an electronic system or a mechanical device) and/or that can influence the formation or transformation of other patterns. The pattern can be in any of a wide variety of forms, for example spoken or printed words, temperatures, visual images, pain, radioactivity, DNA, the structure of a crystal, color, or electron flows. It can range from extremely simple single binary values (e.g., yes or no, or zero or one) to something so complex that only a few human minds can understand it.

Data can exist in a variety of forms, including knowledge stored in a human mind, text written or printed on paper, and patterns of bytes stored in electronic memory chips, on magnetic media (e.g., a hard disk drive or magnetic tape) or on an optical disk (e.g., a CD-ROM or DVD).

Data is one of the two broad categories of computer software. The other is programs, which are sets of instructions for manipulating data.

Data processing is the manipulation of data in order to increase its usefulness. A major part of it can be conversion into a form that is easier for machines and humans to read, store and communicate. It can also include error checking, sorting, merging and analysis (e.g., finding averages and trends and performing comparisons).
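
As a rough illustration of these steps, the following Python sketch performs error checking, merging, sorting and a simple analysis (an average) on two small sets of readings. The readings, the variable names and the clean function are hypothetical and were invented for this example; real data processing would depend on the data and the goal.

```python
from statistics import mean

# Hypothetical raw readings from two sources; the values arrive as text.
source_a = ["21.5", "19.0", "bad", "22.5"]
source_b = ["20.0", "", "18.0"]

def clean(readings):
    """Error checking: keep only the entries that parse as numbers."""
    valid = []
    for item in readings:
        try:
            valid.append(float(item))  # conversion to a machine-friendly form
        except ValueError:
            continue  # discard entries that are not valid numbers
    return valid

# Merging the two cleaned sources, then sorting the result.
merged = sorted(clean(source_a) + clean(source_b))

# Analysis: a simple average of the merged readings.
average = mean(merged)

print(merged)
print(average)
```

The invalid entries ("bad" and the empty string) are dropped during error checking, so only the five numeric readings reach the sorting and averaging steps.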

Raw data is a relative term that refers to data prior to further processing by a human or computer. Thus data that was previously processed for some objective could be considered raw data when used for some other purpose.

A data structure is a way of storing data in a computer so that it can be used efficiently. Efficiency in this context refers to the ability to find and manipulate data quickly and with the minimum consumption of computer and network resources, mainly CPU (central processing unit) time, memory space and bandwidth.
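
The effect of the choice of data structure can be sketched in Python by storing the same records in two ways. The records and function names below are hypothetical. The list must be searched one item at a time, whereas the dictionary (a hash table) can typically locate a record by its key without examining the others.

```python
# Hypothetical records: (user id, name) pairs.
records = [(101, "Ada"), (205, "Linus"), (309, "Grace")]

def find_in_list(records, user_id):
    """Linear search: examine the records one by one until a match is found."""
    for record_id, name in records:
        if record_id == user_id:
            return name
    return None  # no record has this id

# The same data arranged as a dictionary keyed by user id.
index = dict(records)

def find_in_dict(index, user_id):
    """Keyed lookup: go to the entry for this id, or return None."""
    return index.get(user_id)

print(find_in_list(records, 205))
print(find_in_dict(index, 205))
```

Both functions return the same answer. The difference lies in the work required, which grows with the number of records for the list but stays roughly constant on average for the dictionary, at the cost of some additional memory.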

Metadata can be defined as data about a set of data, such as its structure, content, quality, context, origin, ownership and condition. It is used to organize, locate, manipulate and otherwise work with data when it is not necessary or desired to work with the actual data itself. Metadata is usually far smaller and easier to work with than the data that it describes.
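
A minimal sketch in Python of deriving metadata from a small data set follows. The CSV text and the describe function are hypothetical, and the three fields shown (column names, row count and size) are only a few of the many kinds of metadata that could be recorded.

```python
import csv
import io

# Hypothetical data set held as CSV text.
csv_text = "city,temp_c\nOslo,4\nCairo,29\nLima,19\n"

def describe(text):
    """Return a small dictionary of metadata about a CSV data set."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    return {
        "columns": header,                        # structure
        "row_count": len(body),                   # content summary
        "size_bytes": len(text.encode("utf-8")),  # size of the data itself
    }

metadata = describe(csv_text)
print(metadata)
```

A program could then use the resulting dictionary to decide, for example, whether a data set has the expected columns without reading the rows again.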

Created May 12, 2006.
Copyright © 2006 The Linux Information Project. All Rights Reserved.