1. Why Data Types and Formats Matter
Just as movies can be compared by genre, tone, or style, data can be compared and classified by type and format. Understanding these differences helps data analysts choose the right methods for analysis and interpretation.
A spreadsheet containing movie data provides a useful way to see how different kinds of data appear and how they are used.
2. Qualitative vs. Quantitative Data
Qualitative Data
Qualitative data describes qualities or characteristics that cannot be measured numerically.
Key characteristics
- Descriptive, not numeric
- Often names, labels, or categories
- Cannot be measured on a numeric scale, although occurrences of each category can be counted
Examples
- Movie titles
- Actor names
- Genres
Quantitative Data
Quantitative data represents measurable or countable values expressed as numbers.
Key characteristics
- Numeric
- Can be measured or counted
- Represents amounts or quantities
Examples
- Movie budget (in dollars)
- Box office revenue
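As a rough illustration of the distinction above, the sketch below labels each field of a single movie record as qualitative or quantitative based on its Python type. The record, its field names, and the `classify` helper are all hypothetical and exist only for this example.

```python
# Hypothetical movie record; field names and values are illustrative only.
movie = {
    "title": "Example Movie",
    "lead_actor": "Jane Doe",
    "genre": "Comedy",
    "budget_usd": 25_000_000,
    "box_office_usd": 80_500_000.50,
}

def classify(value):
    """Label a value as quantitative (numeric) or qualitative (anything else)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"

kinds = {field: classify(value) for field, value in movie.items()}
for field, kind in kinds.items():
    print(f"{field}: {kind}")
```

A type check like this is only a first pass. Numbers used as labels, such as ID codes, are stored as numbers but still behave as qualitative data.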
3. Discrete vs. Continuous Data (Quantitative Subtypes)
Discrete Data
Discrete data consists of countable values with fixed precision.
Characteristics
- Finite number of possible values within any given range
- Often represented with fixed decimal places
Examples
- Movie budgets
- Box office revenue (dollars and cents)
Because amounts are recorded to the cent, there are no possible values between one cent and the next.
Continuous Data
Continuous data is measured on a scale and can take any value within a range, limited only by the precision of the measurement.
Characteristics
- Infinite possible values within a range
- Measured rather than counted
Example
- Movie runtime expressed as minutes with decimals (e.g., 110.0356 minutes)
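One way to see the difference in practice is to store money as a whole number of cents and runtime as a floating-point number of minutes. This is a minimal sketch. The budget figure and the `format_dollars` helper are assumptions made up for the example.

```python
# Discrete: money counted in whole cents (an integer number of units).
budget_cents = 2_500_000_000           # hypothetical $25,000,000.00 budget
next_budget_cents = budget_cents + 1   # the next possible value; nothing lies in between

# Continuous: runtime measured in minutes; any value in a range is possible.
runtime_a = 110.0356
runtime_b = 110.0357
midpoint = (runtime_a + runtime_b) / 2  # another valid runtime between the two

def format_dollars(cents):
    """Render an integer number of cents as a dollar amount."""
    return f"${cents // 100:,}.{cents % 100:02d}"

print(format_dollars(budget_cents))
print(format_dollars(next_budget_cents))
print(runtime_a < midpoint < runtime_b)
```

For the discrete value, the step from one amount to the next is always exactly one cent. For the continuous value, a new measurement can always fall between two existing ones.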
4. Nominal vs. Ordinal Data (Qualitative Subtypes)
Nominal Data
Nominal data is categorical data with no inherent order.
Characteristics
- Labels only
- No ranking or sequence
Example
- Survey responses: “Yes,” “No,” “Not sure”
Ordinal Data
Ordinal data is categorical data with a defined order or ranking.
Characteristics
- Ordered categories
- Differences between values are not necessarily equal
Examples
- Movie ratings from 1 to 5
- Rankings based on preference or satisfaction
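The nominal/ordinal distinction affects which summaries make sense. The sketch below uses made-up survey responses and made-up 1 to 5 ratings: for nominal data it counts categories and finds the most common one, while for ordinal data it also sorts the values and takes a median. `median_low` is chosen here because it always returns a value that actually appears in the data.

```python
from collections import Counter
from statistics import median_low

# Nominal: labels with no inherent order. Counting is meaningful; sorting is not.
responses = ["Yes", "No", "Not sure", "Yes", "Yes"]  # hypothetical survey answers
counts = Counter(responses)
most_common_label, most_common_count = counts.most_common(1)[0]

# Ordinal: ratings from 1 to 5 have a defined order, so sorting and
# order-based summaries such as the median are meaningful.
ratings = [4, 2, 5, 3, 4]  # hypothetical movie ratings
sorted_ratings = sorted(ratings)
middle_rating = median_low(ratings)

print(most_common_label, most_common_count)
print(sorted_ratings, middle_rating)
```

An average of the ratings is deliberately left out: because the gap between a 1 and a 2 may not equal the gap between a 4 and a 5, a mean can be misleading for ordinal data.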
5. Internal vs. External Data
Internal Data
Internal data is generated and stored within an organization’s own systems.
Characteristics
- Easier to access
- Typically more reliable, since the organization knows how it was collected
- Directly controlled by the organization
Example
- A movie studio’s internal production and sales records
External Data
External data is collected outside the organization.
Characteristics
- Comes from third parties or public sources
- Often necessary for broader analysis
- May require additional validation
Example
- Movie data from other studios or public databases
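The point about validating external data can be sketched as a simple check run on each incoming record before it is used. The record layout, the field names, and the two rules below are assumptions chosen for illustration; real checks depend on the source and on the analysis.

```python
def validate_record(record):
    """Return a list of problems found in one externally sourced movie record."""
    problems = []
    # Rule 1 (assumed): every record needs a non-empty title.
    if not record.get("title"):
        problems.append("missing title")
    # Rule 2 (assumed): the budget must be a non-negative number.
    budget = record.get("budget_usd")
    if not isinstance(budget, (int, float)) or budget < 0:
        problems.append("budget must be a non-negative number")
    return problems

# Hypothetical records as they might arrive from a third-party source.
external_records = [
    {"title": "Example Movie", "budget_usd": 25_000_000},
    {"title": "", "budget_usd": -5},
]

report = [validate_record(record) for record in external_records]
for record, problems in zip(external_records, report):
    print(record, "->", problems or "ok")
```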
6. Structured vs. Unstructured Data
Structured Data
Structured data is organized in a clear format, making it easy to search and analyze.
Characteristics
- Stored in rows and columns
- Fits neatly into tables
Examples
- Spreadsheets
- Relational databases
Structured data supports efficient querying and analysis.
Unstructured Data
Unstructured data does not follow a predefined format.
Characteristics
- No consistent structure like rows and columns
- Harder to search and analyze directly
Examples
- Audio files
- Video files
Unstructured data may contain internal patterns, but they are not immediately accessible in tabular form.
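The contrast can be sketched with a small table and a piece of free text. The table contents and the review sentence are invented, and free text stands in here for audio or video, which are harder to show in a short example.

```python
import csv
import io

# Structured: rows and columns, so each field can be addressed by name.
table_text = """title,genre,runtime_min
Example Movie,Comedy,110
Another Movie,Drama,95
"""
rows = list(csv.DictReader(io.StringIO(table_text)))
comedies = [row["title"] for row in rows if row["genre"] == "Comedy"]

# Unstructured: free text has no columns to query by name.
review = "Saw Example Movie last night. Nearly two hours, but the jokes landed."
mentions_title = "Example Movie" in review  # a crude keyword search, not a query

print(comedies)
print(mentions_title)
```

The structured data answers "which titles are comedies?" with a one-line filter. The review contains hints about genre and runtime too, but extracting them would need extra processing before they could be placed in a table.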
7. Key Takeaways
- Data can be qualitative or quantitative
- Quantitative data can be discrete or continuous
- Qualitative data can be nominal or ordinal
- Data may be internal or external to an organization
- Structured data is organized and analysis-ready
- Unstructured data lacks a clear tabular format
- Understanding data types and formats helps analysts choose appropriate methods
One-sentence summary
Understanding data types and formats allows data analysts to organize, interpret, and analyze information more effectively and accurately.
