Explore data
Get a quick overview of your dataset to understand its contents and structure.
Example Data
Follow along with the ready-to-use example data. Copy the following data into the information request of the agent you are working in.
Before conducting an analysis, first familiarize yourself with the data: look at parts of it, check the distribution of certain values, view summary statistics, and identify missing values. Stepping back to assess the health of your data gives you a sense of the big picture and lets you make informed choices before the analysis.
Show parts of your data
Excel
In Excel, you would scroll through your worksheet and apply filters to show parts of the data.
t0 Prompt
Show me the first 5 rows
Display last 10 rows
Code
The Python code looks as follows:
Function | Description |
---|---|
head() | Shows first five rows |
head(6) | Shows first six rows |
tail() | Shows last five rows |
tail(10) | Shows last 10 rows |
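As a minimal sketch of the functions in the table above, assuming the data is loaded into a pandas DataFrame: the name `df` and the columns `product` and `units` are hypothetical stand-ins, not part of the example data.

```python
import pandas as pd

# Hypothetical 12-row table; replace with your own data.
df = pd.DataFrame({
    "product": [f"item_{i}" for i in range(1, 13)],
    "units": list(range(10, 130, 10)),
})

first_five = df.head()    # first five rows (the default)
first_six = df.head(6)    # first six rows
last_five = df.tail()     # last five rows (the default)
last_ten = df.tail(10)    # last ten rows

print(first_five)
print(last_ten)
```

Both functions return a new DataFrame, so the result can be stored in a variable or printed directly.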
View column types and missing values
Excel
In Excel, you might click through each column, guess its type, or use filters to find blanks.
t0 Prompt
What columns do I have and what are their formats?
Show me a summary of the table
Are there any missing values?
Code
The Python code looks as follows:
Function | Description |
---|---|
info() | Shows column names, types, non-null count |
dtypes | Shows the data type of each column |
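A minimal sketch of these calls, assuming a pandas DataFrame named `df` with made-up columns (`product`, `price`, `units`). The `isna().sum()` line is not in the table above; it is added here as one common way to answer the missing-values prompt.

```python
import pandas as pd

# Hypothetical table with a few missing values; replace with your own data.
df = pd.DataFrame({
    "product": ["apple", "pear", None, "plum"],
    "price": [1.2, None, 0.8, 2.5],
    "units": [10, 4, 7, 3],
})

df.info()                              # prints column names, types, non-null counts
column_types = df.dtypes               # data type of each column
missing_per_column = df.isna().sum()   # number of missing values per column

print(column_types)
print(missing_per_column)
```

`info()` prints its report instead of returning it, while `dtypes` and `isna().sum()` return objects that can be inspected further.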
Get summary statistics
Excel
In Excel, you use formulas like =AVERAGE() or =MIN() to calculate summary statistics, or use the "Quick Analysis" tool.
t0 Prompt
Summarize the numeric columns
Show me averages and ranges
What are the stats for this table?
Code
The Python code looks as follows:
Function | Description |
---|---|
describe() | Summary stats for all numeric columns |
describe(include="all") | Also shows stats for categorical (non-numeric) columns |
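The two calls in the table can be sketched as follows, assuming a pandas DataFrame named `df`; the columns `region` and `sales` are hypothetical and only serve to show one text column next to one numeric column.

```python
import pandas as pd

# Hypothetical table; replace with your own data.
df = pd.DataFrame({
    "region": ["north", "south", "north", "east"],
    "sales": [100.0, 200.0, 300.0, 400.0],
})

numeric_stats = df.describe()            # count, mean, std, min, quartiles, max
all_stats = df.describe(include="all")   # adds unique, top, freq for text columns

print(numeric_stats)
print(all_stats)
```

With `include="all"`, statistics that do not apply to a column (for example the mean of `region`) appear as missing values in the result.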
View column names
Excel
In Excel, you look at the first row or header to see column names.
t0 Prompt
What are the column names?
List the headers
Code
The Python code looks as follows:
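A minimal sketch, assuming the table is loaded into a pandas DataFrame named `df` (a hypothetical name, with made-up columns): the `columns` attribute holds the header names, and `tolist()` turns them into a plain Python list.

```python
import pandas as pd

# Hypothetical one-row table; replace with your own data.
df = pd.DataFrame({"product": ["apple"], "price": [1.2], "units": [10]})

column_names = df.columns.tolist()   # header names as a list of strings
print(column_names)
```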