1) Why Changing How You View Data Matters
In data analytics, no two projects are exactly the same. Each project requires focusing on data from a different angle. By changing how data is viewed and explored, analysts can identify hidden issues, clean data more efficiently, and improve analysis quality.
Viewing data differently is not just about analysis; it is also a powerful data-cleaning strategy.
2) Sorting and Filtering as Data-Cleaning Tools
Sorting
Sorting arranges data in a meaningful order (alphabetical, numerical, chronological).
How sorting helps with data cleaning:
- Makes incorrect or unusual values easier to spot
- Brings duplicate entries closer together
- Helps identify values that are out of expected ranges
Examples:
- Sorting prices from lowest to highest to spot unusually small or large values
- Sorting names alphabetically to find duplicates or inconsistencies
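The two sorting examples above can be sketched in a few lines of Python. This is only an illustration: the records, names, and prices are made up, and a spreadsheet's built-in sort would do the same job.

```python
# Hypothetical sample data: the names and prices are made up for illustration.
records = [
    {"name": "Widget", "price": 19.99},
    {"name": "gadget", "price": 24.50},
    {"name": "Widget", "price": 19.99},
    {"name": "Gizmo", "price": 0.25},
    {"name": "Gadget", "price": 2450.00},
]

# Sort prices from lowest to highest so extreme values land at the ends.
by_price = sorted(records, key=lambda r: r["price"])
lowest, highest = by_price[0], by_price[-1]

# Sort names case-insensitively so duplicates and case variants become adjacent.
by_name = sorted(records, key=lambda r: r["name"].lower())

# Compare each row with its neighbor to flag adjacent repeats.
adjacent_repeats = [
    (a["name"], b["name"])
    for a, b in zip(by_name, by_name[1:])
    if a["name"].lower() == b["name"].lower()
]

print("Lowest price: ", lowest)
print("Highest price:", highest)
print("Adjacent repeats:", adjacent_repeats)
```

After sorting, the suspicious prices sit at the two ends of the list, and the repeated or inconsistently capitalized names sit next to each other, which is what makes them easy to spot.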
Filtering
Filtering displays only data that meets specific criteria while hiding the rest.
How filtering helps with data cleaning:
- Isolates values that need attention
- Focuses on specific conditions (e.g., values above a threshold)
- Separates relevant data from noise
Examples:
- Showing only values greater than a certain amount
- Filtering to show only even or odd values
- Displaying only records with missing or unusual entries
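The filtering examples above might look like this in Python. The rows, the THRESHOLD value, and the use of None for a missing entry are all assumptions made for the sketch.

```python
# Hypothetical sample data; None stands in for a missing entry.
rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 15.0},
    {"id": 4, "amount": 980.0},
    {"id": 5, "amount": None},
]

THRESHOLD = 100.0  # assumed cutoff, chosen only for this example

# Show only values greater than a certain amount (missing values are skipped).
above_threshold = [
    r for r in rows if r["amount"] is not None and r["amount"] > THRESHOLD
]

# Show only records with missing entries.
missing = [r for r in rows if r["amount"] is None]

# Show only records with even IDs.
even_ids = [r for r in rows if r["id"] % 2 == 0]

print("Above threshold:", above_threshold)
print("Missing amount: ", missing)
print("Even IDs:       ", even_ids)
```

Each filter leaves the original rows untouched and only changes which ones are shown, the same way a spreadsheet filter hides rows without deleting them.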
3) Pivot Tables for Clean, High-Level Views
A pivot table is a data summarization tool that can:
- Group data
- Count records
- Calculate totals or averages
- Reorganize data into a compact, readable format
How pivot tables help with data cleaning:
- Provide a clutter-free summary of large datasets
- Make anomalies or inconsistencies easier to detect
- Allow analysts to focus only on relevant subsets of data
Example use cases:
- Identifying top-performing products
- Checking whether totals or averages seem reasonable
- Quickly excluding irrelevant data from a specific analysis
Pivot tables help analysts spot issues that might be hard to see in raw data.
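A pivot table's grouping, counting, totaling, and averaging can be approximated in plain Python. The sketch below is not how a spreadsheet implements pivot tables; the sales rows and field names are hypothetical, and one region is deliberately misspelled to show how a summary exposes it.

```python
from collections import defaultdict

# Hypothetical sales rows; note the inconsistent spelling "north ".
sales = [
    {"region": "North", "product": "A", "revenue": 100.0},
    {"region": "North", "product": "B", "revenue": 150.0},
    {"region": "South", "product": "A", "revenue": 200.0},
    {"region": "north ", "product": "A", "revenue": 50.0},
    {"region": "South", "product": "B", "revenue": 300.0},
]

# Group by region, then count records and total the revenue per group.
summary = defaultdict(lambda: {"count": 0, "total": 0.0})
for row in sales:
    group = summary[row["region"]]
    group["count"] += 1
    group["total"] += row["revenue"]

# Add an average per group and print a compact view.
for region, stats in sorted(summary.items()):
    stats["average"] = stats["total"] / stats["count"]
    print(
        f"{region!r:10} count={stats['count']} "
        f"total={stats['total']:.2f} avg={stats['average']:.2f}"
    )
```

The summary has three groups instead of the expected two, so the stray 'north ' entry stands out immediately even though it would be easy to miss among the raw rows.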
4) VLOOKUP for Cross-Referencing Data
VLOOKUP (Vertical Lookup) is a function used to search for a value in one location and return related information from another location.
Why VLOOKUP is useful in data cleaning:
- Confirms that codes, IDs, or keys match valid records
- Helps identify missing or mismatched references
- Links related data stored across different sheets or datasets
Key ideas:
- Searches for a value in the leftmost column of a specified range
- Returns a corresponding value from the same row
- Can require an exact match (FALSE as the last argument) to avoid incorrect mappings
VLOOKUP is especially valuable when working with multiple datasets or reference tables.
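The same cross-referencing idea can be sketched outside a spreadsheet with a dictionary lookup. This is an analogy rather than VLOOKUP itself: the product codes, order rows, and the lookup helper are hypothetical, and a failed match returns None here where a spreadsheet would show #N/A.

```python
# Hypothetical reference table: product code -> product name.
products = {"P001": "Keyboard", "P002": "Mouse", "P003": "Monitor"}

# Hypothetical order rows that refer to products by code.
orders = [
    {"order_id": 1, "code": "P001"},
    {"order_id": 2, "code": "P004"},  # no such code in the reference table
    {"order_id": 3, "code": "p002"},  # wrong case, so not an exact match
    {"order_id": 4, "code": "P003"},
]


def lookup(code, table):
    """Exact-match lookup; returns None when the code is not in the table."""
    return table.get(code)


# Attach the looked-up name to each order, then list the ones that failed.
matched = [{**o, "name": lookup(o["code"], products)} for o in orders]
unmatched_ids = [o["order_id"] for o in matched if o["name"] is None]

print("Matched rows:", matched)
print("Orders with no valid product code:", unmatched_ids)
```

The unmatched list is the data-cleaning payoff: it points to the rows whose codes are missing from the reference table or entered inconsistently.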
5) Locking References for Accurate Lookups
When copying formulas like VLOOKUP:
- Cell references can change unintentionally
- Absolute references (using $) prevent this
Benefits:
- Ensures consistent lookups
- Reduces formula errors
- Improves reliability when applying functions across many rows
6) Plotting Data to Detect Outliers
Plotting means visualizing data using charts or graphs.
How plotting helps with data cleaning:
- Reveals outliers quickly
- Highlights skewed or abnormal values
- Makes data-entry errors visually obvious
Example:
- A bar or column chart of prices reveals one value that is much lower than the rest
- Investigation shows a misplaced decimal point
Plotting is often the fastest way to catch errors that are hard to detect numerically.
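To show the idea without a charting tool, the sketch below draws a rough text bar chart. It is a stand-in for a real bar or column chart, and the prices and the bar_chart helper are hypothetical; one price has a deliberately misplaced decimal point.

```python
# Hypothetical prices; the fourth has a misplaced decimal point
# (1.999 where 19.99 was intended).
prices = [19.99, 21.50, 18.75, 1.999, 20.25, 22.00]


def bar_chart(values, width=40):
    """Return one text bar per value, scaled so the largest fills `width`."""
    largest = max(values)
    lines = []
    for i, v in enumerate(values):
        bar = "#" * round(v / largest * width)
        lines.append(f"{i:>2} | {bar} {v}")
    return lines


chart = bar_chart(prices)
print("\n".join(chart))

# The shortest bar points at the row worth investigating.
shortest_index = min(range(len(prices)), key=lambda i: prices[i])
print("Row to investigate:", shortest_index, prices[shortest_index])
```

Five bars are nearly the same length and one is a small fraction of the others, so the bad entry is visible at a glance before any calculation is done.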
7) Common Errors Revealed by Visual Analysis
Visual tools can uncover:
- Decimal placement errors
- Unexpected spikes or drops
- Values far outside the normal range
These errors can significantly impact calculations and conclusions if left uncorrected.
8) How These Tools Work Together
Effective data cleaning often combines multiple approaches:
- Sorting and filtering to isolate issues
- Pivot tables to summarize and validate data
- VLOOKUP to confirm relationships across datasets
- Plotting to visually detect outliers and inconsistencies
Using multiple perspectives increases the chance of catching subtle errors.
9) Key Takeaways
- Viewing data from different angles improves data cleaning.
- Sorting and filtering help isolate problematic records.
- Pivot tables provide clean summaries that reveal inconsistencies.
- VLOOKUP connects related data and validates references.
- Plotting exposes outliers and hidden errors quickly.
- Combining these methods leads to cleaner, more reliable data and better analytical outcomes.
