Pandas DataFrame & Series

1. Pandas Core Classes

Pandas has two main data structures:

1. DataFrame

2D structure (rows + columns)
Like a spreadsheet or SQL table

2. Series

1D structure
Represents a single column or row

Key Point:
DataFrame = table, Series = single column/row

2. What is a DataFrame?

A DataFrame:

Has labeled rows and columns
Can store multiple data types
Used for:
- Data manipulation
- Data analysis

Key Point:
Main structure for working with data

3. Creating a DataFrame

From Dictionary

pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30]
})

Keys → column names
Values → column data

From NumPy Array

pd.DataFrame(array, columns=["A", "B"], index=[0,1])

Key Point:
Flexible creation from different data sources
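
As a minimal sketch of both creation routes above, the snippet below builds one DataFrame from a dictionary and one from a NumPy array. The variable names and the 2x2 sample array are assumptions made for illustration; they are not from the original notes.

```python
import numpy as np
import pandas as pd

# From a dictionary: keys become column names, values become column data.
people = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30],
})

# From a NumPy array: column names and row labels are supplied explicitly.
array = np.array([[1, 2], [3, 4]])  # hypothetical 2x2 sample data
numbers = pd.DataFrame(array, columns=["A", "B"], index=[0, 1])

print(people)
print(numbers)
```

The lengths of `columns` and `index` have to match the array's shape, otherwise pandas raises an error.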

4. Loading Data (CSV)

Use:

pd.read_csv("file.csv")

Reads CSV into DataFrame
Can load from:
- URL
- Local file

Key Point:
Most common way to import data
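
To try `pd.read_csv` without a file on disk, one option is to wrap CSV text in `io.StringIO`. The CSV content below is made up for illustration; with real data you would pass a file path or URL string instead.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "Name,Age\nAlice,25\nBob,30\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df)

# With a real local file or URL, pass the location as a string instead:
# df = pd.read_csv("file.csv")
```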

5. What is a Series?

A Series:

1D labeled array
Represents:
- A column
- A row

Example:

df["Age"] → Series

Key Point:
Series = building block of DataFrame
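
A short sketch of how a Series comes out of a DataFrame, using a small made-up table: selecting one column or one row both return a Series.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

ages = df["Age"]        # one column -> Series, labeled by the row index
first_row = df.iloc[0]  # one row -> Series, labeled by the column names

print(type(ages).__name__)
print(type(first_row).__name__)
print(first_row["Name"])
```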

6. Attributes vs Methods

Attributes (no parentheses)

.columns → column names
.shape → (rows, columns)

Methods (use parentheses)

.info() → dataset info

Key Point:
Attribute = property, Method = action

7. Key DataFrame Attributes

Columns

df.columns

Returns column names

Shape

df.shape

Returns:
- (number of rows, number of columns)

Info (a method, so it needs parentheses)

df.info()

Shows:
- Data types
- Non-null counts per column (which reveal missing values)
- Memory usage

Key Point:
Use these for quick inspection
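
The three inspection tools can be tried together on a small made-up table. Note that `.columns` and `.shape` are attributes (no parentheses), while `.info()` is a method that prints its report.

```python
import pandas as pd

# Hypothetical sample data with one missing Age value.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [25, 30, None],
})

column_names = list(df.columns)  # attribute: no parentheses
n_rows, n_cols = df.shape        # attribute: (rows, columns) tuple

print(column_names)
print(n_rows, n_cols)

df.info()  # method: prints dtypes, non-null counts, and memory usage
```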

8. Null Values (NaN)

Missing data → NaN (Not a Number)

Important for:

Data cleaning
Analysis

Key Point:
NaN represents missing values
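
One common way to find and handle NaN values is with `isna()` and `dropna()`. The sketch below uses made-up data, and dropping rows is only one possible cleaning choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: Bob's age is missing.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [25, np.nan, 41],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Keep only the rows where Age is present.
rows_with_age = df.dropna(subset=["Age"])

print(missing_per_column)
print(rows_with_age)
```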

9. Selecting Columns

Bracket notation (recommended):

df["Age"]

Dot notation:

df.Age

Limitation:

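Dot notation fails if the column name has spaces or matches an existing DataFrame attribute or method (for example "shape"), and it cannot be used to create a new column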

Key Point:
Prefer bracket notation
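
The difference between the two notations shows up as soon as a column name contains a space. The column "Ticket Class" below is a hypothetical example.

```python
import pandas as pd

df = pd.DataFrame({"Age": [25, 30], "Ticket Class": [1, 3]})

ages_bracket = df["Age"]  # bracket notation works for any column name
ages_dot = df.Age         # dot notation works here: "Age" is a valid identifier

# Only bracket notation can reach a name with a space;
# df.Ticket Class is not valid Python syntax.
ticket_class = df["Ticket Class"]

print(ages_bracket.equals(ages_dot))
print(ticket_class)
```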

10. Selecting Multiple Columns

df[["Age", "Fare"]]

Returns:

New DataFrame

Key Point:
Use list inside brackets
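
A quick sketch with made-up data shows that passing a list of names returns a DataFrame, with the columns in the order listed.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [25, 30, 41],
    "Fare": [7.25, 71.28, 8.05],
})

# Outer brackets select; the inner brackets are a Python list of column names.
subset = df[["Age", "Fare"]]

print(type(subset).__name__)
print(subset.shape)
```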

11. Selecting Rows with `iloc`

iloc = integer position-based selection (ignores the index labels)

Single row:

df.iloc[0]

Multiple rows:

df.iloc[0:3]

Key Point:
Uses integer positions
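
To make the "positions, not labels" point concrete, the sketch below gives a made-up table letter labels for its rows; `iloc` still selects by position.

```python
import pandas as pd

# Hypothetical sample data with letter labels as the index.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Cara", "Dan"], "Age": [25, 30, 41, 19]},
    index=["a", "b", "c", "d"],
)

first = df.iloc[0]          # first row, returned as a Series
first_three = df.iloc[0:3]  # positions 0, 1, 2 (the end position is excluded)

print(first)
print(first_three)
```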

12. Selecting Rows & Columns with `iloc`

df.iloc[0:3, 2:4]

Rows 0–2
Columns 2–3

Key Point:
Select subset of data

13. Accessing Single Value

df.iloc[0, 3]

Returns:

Single value

Key Point:
Use two indices
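
The two-index forms of `iloc` from the last two sections can be sketched on a four-column table. The data and column names here are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data with four columns (positions 0 to 3).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Name": ["Alice", "Bob", "Cara", "Dan"],
    "Age": [25, 30, 41, 19],
    "Fare": [7.25, 71.28, 8.05, 53.1],
})

# Rows at positions 0-2 and columns at positions 2-3 (Age and Fare).
block = df.iloc[0:3, 2:4]

# A single cell: row position 0, column position 3 (Fare).
value = df.iloc[0, 3]

print(block)
print(value)
```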

14. Selecting with `loc`

loc = label-based selection

df.loc[1:3, "Name"]

Uses row/column labels
Unlike iloc, a loc slice includes its end label: 1:3 selects labels 1, 2, and 3

Key Point:
Select using labels instead of integer positions
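
Because slicing behaves differently in `loc` and `iloc`, a side-by-side sketch can help. The table below is made up and uses the default integer labels 0 to 4.

```python
import pandas as pd

# Hypothetical sample data with the default integer index (labels 0 to 4).
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara", "Dan", "Eve"],
    "Age": [25, 30, 41, 19, 33],
})

by_label = df.loc[1:3, "Name"]      # labels 1, 2, 3 (end label included)
by_position = df.iloc[1:3]["Name"]  # positions 1, 2 (end position excluded)

print(by_label.tolist())
print(by_position.tolist())
```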

15. Adding a New Column

df["NewColumn"] = values

Adds column to DataFrame

Key Point:
Easy data expansion
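
The `values` on the right-hand side can take several forms. The sketch below shows three common ones on made-up data; the new column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

df["City"] = ["Paris", "Oslo"]      # a list with one value per row
df["AgeInMonths"] = df["Age"] * 12  # computed from an existing column
df["Active"] = True                 # a single value repeated for every row

print(df)
```

A list must have exactly as many items as the DataFrame has rows; a single value is broadcast to all rows.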

16. Data Type Note

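Mixed/string columns → dtype usually shown as "object"

Because:

Pandas is built on NumPy, which has no variable-length string type, so these values are stored as generic Python objects
(Newer pandas versions may report a dedicated string dtype for text-only columns)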

Key Point:
Object = generic data type
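
A small sketch, on made-up data, of how dtypes appear per column. The "Mixed" column combines an integer and a string, so pandas falls back to the generic object dtype.

```python
import pandas as pd

# Hypothetical sample data: two numeric columns and one mixed column.
df = pd.DataFrame({
    "Age": [25, 30],         # integers
    "Fare": [7.25, 71.28],   # floats
    "Mixed": [1, "one"],     # an integer and a string in the same column
})

print(df.dtypes)
```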

17. Practical Workflow

Common steps:

Load data
Inspect structure
Select data
Analyze
Modify

Key Point:
Pandas simplifies full workflow
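
The five steps above can be sketched end to end on a tiny made-up dataset. The CSV content and the "AboveMeanFare" column are assumptions chosen only to illustrate the flow.

```python
import io

import pandas as pd

# 1. Load data (hypothetical CSV text standing in for a real file).
csv_text = "Name,Age,Fare\nAlice,25,7.25\nBob,30,71.28\nCara,41,8.05\n"
df = pd.read_csv(io.StringIO(csv_text))

# 2. Inspect structure.
print(df.shape)
print(list(df.columns))

# 3. Select data.
fares = df["Fare"]

# 4. Analyze.
mean_fare = fares.mean()

# 5. Modify: flag rows whose fare is above the mean.
df["AboveMeanFare"] = df["Fare"] > mean_fare

print(df)
```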

18. Why Pandas is Powerful

Easy to read
Handles large in-memory datasets efficiently
Combines many operations

Key Point:
High productivity tool

19. Debugging Tip

When errors occur, check:

.shape
.columns
.info()

Key Point:
Understand data before fixing errors
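
As one example of this tip, a `KeyError` on a column that "should" exist is often explained by printing `.columns`. The CSV below is a made-up case where the header has a stray space before "Age".

```python
import io

import pandas as pd

# Hypothetical CSV whose header contains a stray space: " Age".
csv_text = "Name, Age\nAlice,25\nBob,30\n"
df = pd.read_csv(io.StringIO(csv_text))

try:
    df["Age"]
except KeyError:
    # Inspecting the column names reveals the real name, including the space.
    print(df.columns.tolist())

# One possible fix: strip whitespace from every column name.
df.columns = df.columns.str.strip()

print(df["Age"].tolist())
```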

20. Learning Strategy

Practice selecting data
Experiment with slicing
Read documentation

Key Point:
Hands-on practice is essential

Final Summary

Pandas provides two core data structures: DataFrame and Series. A DataFrame is a two-dimensional table used for storing and analyzing data, while a Series represents a single column or row. Using the read_csv() function, the .iloc and .loc indexers, and attributes such as .shape and .columns, data professionals can efficiently load, inspect, manipulate, and analyze datasets. Pandas simplifies complex data workflows and is one of the most widely used tools in data science.

Key Takeaways

DataFrame = table, Series = column/row
Use pd.read_csv() to load data
.columns, .shape, .info() for inspection
Use [] for column selection
Use iloc (integer position) and loc (label)
Add columns with assignment
NaN = missing data
Pandas built on NumPy
Core tool for data analysis

Your Gateway to Data Mastery

Learn, explore, and innovate with data science.
