Cheatsheet¶
Files
You can download the cheatsheet as a PDF here.
Read and evaluate¶
pd.read_csv()
: Reads a CSV file into a DataFrame.
df.head()
: Returns the first n rows of the DataFrame (5 by default).
df.tail()
: Returns the last n rows of the DataFrame (5 by default).
df.columns
: Returns the column labels of the DataFrame.
df.shape
: Returns a tuple (rows, columns) of the DataFrame.
df.empty
: Indicates whether the DataFrame is empty, that is, whether it has no rows or no columns.
df['title'].unique()
: Returns the unique values of a Series (here the 'title' column).
df['title'].value_counts()
: Returns a Series containing the count of each unique value in a column (here 'title').
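The calls above can be tried together on a small table. This is a minimal sketch: the CSV content and the column names 'title' and 'year' are made up for illustration, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv.
csv_text = """title,year
Dune,1965
Emma,1815
Dune,1965
"""

df = pd.read_csv(io.StringIO(csv_text))

first_two = df.head(2)                     # first 2 rows
last_one = df.tail(1)                      # last row
columns = list(df.columns)                 # column labels as a plain list
shape = df.shape                           # (number of rows, number of columns)
is_empty = df.empty                        # False, because the table has data
unique_titles = df["title"].unique()       # distinct titles, in order of appearance
title_counts = df["title"].value_counts()  # count per title, most frequent first
```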
Access rows, columns, and cells¶
df['title']
or df.title
: Selects a single column by name (here 'title').
df.loc[]
: Access rows & columns by label(s) or a boolean array.
df.iloc[]
: Purely integer-location based indexing for selection by position.
df.iat[1, 2]
: Accesses a single value by integer position (row, column).
df.at[4, 'A']
: Accesses a single value by row and column label.
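The difference between label-based and position-based access is easiest to see side by side. The sketch below assumes a small made-up DataFrame with string index labels "a", "b", "c"; none of these names come from a real dataset.

```python
import pandas as pd

# Hypothetical data with explicit row labels.
df = pd.DataFrame(
    {"title": ["Dune", "Emma", "Ulysses"], "year": [1965, 1815, 1922]},
    index=["a", "b", "c"],
)

titles = df["title"]                         # one column as a Series (same as df.title)
row_b = df.loc["b"]                          # row selected by label
recent = df.loc[df["year"] > 1900, "title"]  # boolean mask for rows, label for column
first_row = df.iloc[0]                       # row selected by position
cell_by_pos = df.iat[1, 1]                   # row position 1, column position 1
cell_by_label = df.at["c", "title"]          # row label "c", column label "title"
```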
Clean up¶
pd.isna()
: Detects missing values.
pd.notna()
: Detects non-missing values.
df.dropna()
: Removes rows (or, with axis=1, columns) that contain missing values.
df.duplicated()
: Returns boolean Series of duplicate rows.
df.drop_duplicates()
: Removes duplicate rows from DataFrame.
Series.apply()
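: Invokes a function on each value of the Series.
Series.str.rstrip()
: Removes trailing whitespace (or the given characters) from each string in the Series.
Series.str.zfill()
: Pads each string in the Series on the left with zeros up to a given width.
Series.str.strip()
: Strips leading and trailing whitespace from each string in the Series.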
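A typical clean-up pass chains several of these calls. The following is a sketch on made-up data; the columns 'title' and 'code' and the derived column 'title_length' are hypothetical.

```python
import pandas as pd

# Hypothetical messy data: stray whitespace, a duplicate row, a missing title.
df = pd.DataFrame(
    {
        "title": ["  Dune ", "Emma", "Emma", None],
        "code": ["7", "42", "42", "5"],
    }
)

missing = pd.isna(df["title"])   # True only where the title is missing
cleaned = df.dropna()            # drop the row with the missing title
dupes = cleaned.duplicated()     # True for the second identical "Emma" row
cleaned = cleaned.drop_duplicates().copy()

cleaned["title"] = cleaned["title"].str.strip()        # remove surrounding whitespace
cleaned["code"] = cleaned["code"].str.zfill(3)         # left-pad with zeros to width 3
cleaned["title_length"] = cleaned["title"].apply(len)  # call len on every title
```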
Loop through rows¶
df.iterrows()
: Loops through DataFrame rows as (index, Series) pairs.
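As a minimal sketch of the loop, assuming a made-up two-row DataFrame, each iteration yields the row's index label and the row itself as a Series:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"title": ["Dune", "Emma"], "year": [1965, 1815]})

labels = []
for index, row in df.iterrows():
    # row is a Series, so values are looked up by column label
    labels.append(f"{index}: {row['title']} ({row['year']})")
```

Row-by-row loops are slow on large DataFrames; where a column-wise operation or Series.apply() can do the job, it is usually the better choice.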
Merge DataFrames¶
pd.merge()
: Merge DataFrame or named Series objects with a database-style join.
how="left"
: Keeps every id from the left DataFrame. Ids found only in the right DataFrame are not included.
how="right"
: Keeps every id from the right DataFrame. Ids found only in the left DataFrame are not included.
how="outer"
: Keeps all ids from both DataFrames.
how="inner"
: Keeps only ids found in both DataFrames. Ids found in only one DataFrame are not included.
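The four join types can be compared on two small tables that share only some ids. This is a sketch with made-up data; the key column name 'id' is an assumption for the example.

```python
import pandas as pd

# Hypothetical tables: ids 2 and 3 appear in both, 1 only on the left, 4 only on the right.
left = pd.DataFrame({"id": [1, 2, 3], "title": ["Dune", "Emma", "Ulysses"]})
right = pd.DataFrame({"id": [2, 3, 4], "year": [1815, 1922, 1605]})

left_join = pd.merge(left, right, on="id", how="left")    # ids 1, 2, 3
right_join = pd.merge(left, right, on="id", how="right")  # ids 2, 3, 4
outer_join = pd.merge(left, right, on="id", how="outer")  # ids 1, 2, 3, 4
inner_join = pd.merge(left, right, on="id", how="inner")  # ids 2, 3
```

Where a row has no match on the other side (for example id 1 in the left join), the columns from the other DataFrame are filled with missing values.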
Reshaping¶
df.explode()
: Transforms each element of a list-like to a row, replicating index values.
df.pivot()
: Reshape data (produce a “pivot” table) based on column values.
df.pivot_table()
: Create a spreadsheet-style pivot table as a DataFrame.
lambda
: An anonymous (unnamed) Python function that takes arguments and returns the value of a single expression. Often passed to apply() or used as an aggregation function.
df.melt()
: Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
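The reshaping calls can be sketched on made-up data as follows. All column names ('title', 'tags', 'year', 'sold') are hypothetical. Note that pivot() requires each index/column pair to be unique, while pivot_table() aggregates duplicates.

```python
import pandas as pd

# explode: one row per list element, index values are repeated
books = pd.DataFrame(
    {"title": ["Dune", "Emma"], "tags": [["scifi", "classic"], ["classic"]]}
)
exploded = books.explode("tags")

# Hypothetical long-format data; ("Emma", 2021) appears twice.
sales = pd.DataFrame(
    {
        "title": ["Dune", "Dune", "Emma", "Emma", "Emma"],
        "year": [2020, 2021, 2020, 2021, 2021],
        "sold": [10, 12, 5, 7, 3],
    }
)

# pivot_table: duplicates are aggregated, here by summing
wide = sales.pivot_table(index="title", columns="year", values="sold", aggfunc="sum")

# pivot: needs unique (title, year) pairs, so keep only the first of each
unique_sales = sales.drop_duplicates(subset=["title", "year"])
pivoted = unique_sales.pivot(index="title", columns="year", values="sold")

# melt: back from wide to long, keeping 'title' as the identifier column
long = wide.reset_index().melt(id_vars="title", var_name="year", value_name="sold")

# lambda: a small unnamed function applied to every value
sales["sold_doubled"] = sales["sold"].apply(lambda n: n * 2)
```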
Sort DataFrame¶
df.sort_values()
: Sort by the values along either axis.
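A short sketch of common sorting patterns, using made-up data. sort_values() returns a new DataFrame and leaves the original unchanged unless inplace=True is passed.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"title": ["Dune", "Emma", "Ulysses"], "year": [1965, 1815, 1922]})

by_year = df.sort_values("year")                        # ascending by default
newest_first = df.sort_values("year", ascending=False)  # descending
# Several keys: title ascending, then year descending within equal titles
by_two = df.sort_values(["title", "year"], ascending=[True, False])
```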
Create new CSV¶
df.to_csv()
: Writes the DataFrame to a CSV file.
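As a minimal sketch, the example below writes to an in-memory buffer so it runs without touching the disk; in practice you would pass a file path (for instance a hypothetical "books.csv") instead of the buffer. Passing index=False leaves the row index out of the output.

```python
import io

import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"title": ["Dune", "Emma"], "year": [1965, 1815]})

buffer = io.StringIO()          # stands in for a file path
df.to_csv(buffer, index=False)  # index=False omits the row index column
csv_text = buffer.getvalue()

# Reading the text back gives a DataFrame with the same columns and values.
restored = pd.read_csv(io.StringIO(csv_text))
```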