Python is good for machine learning and data analysis because of its libraries.
Matplotlib is the most common plotting library, which helps to draw simple diagrams from data.
By using NumPy and Pandas, data can be imported into a Jupyter Notebook running under Anaconda. Unlike building applications or websites, data science work is better done in a notebook, because the plots are presented neatly next to the code.
So, here are the prerequisites:
1. [Import libraries]
First, we have to create a new notebook using Jupyter. Open the Anaconda application and launch Jupyter Notebook.
Then open a new Python 3 notebook in the same folder as the downloaded CSV file.
In the first cell of the notebook, import the libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Press Ctrl+Enter to run the cell (this is how code is run in Jupyter).
A notebook magic command (commonly %matplotlib inline) can be added to this cell to automatically show the plotted diagram without repeatedly typing the show command.
Then we can move on to importing the data.
2. [Import data]
Insert a new cell below and use Pandas to import the data as a data frame.
Copy the snippet below into the cell:
df = pd.read_csv('WorldCups.csv')  # load the CSV into a data frame
df.head()  # display the first 5 rows
We can now see the first 5 rows of data on previous World Cup champions in a table.
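Before plotting, it helps to check which columns the file actually contains. Here is a minimal sketch of the import step wrapped in a small helper; the helper name load_and_preview is made up for illustration and is not part of Pandas.

```python
import pandas as pd


def load_and_preview(path, n=5):
    """Read a CSV file into a data frame and return it with its first n rows."""
    df = pd.read_csv(path)
    return df, df.head(n)


# In the notebook (file name taken from this tutorial):
# df, preview = load_and_preview('WorldCups.csv')
# print(df.columns.tolist())  # check the real column names before plotting
# preview
```

Returning the full data frame together with the preview keeps the later plotting cells working on the same df.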
3. [Plot figure]
After importing the data, we can use it to plot diagrams easily. Let's say we want to see how attendance changes across the years:
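A minimal sketch of such a plot could look like this. The column names 'Year' and 'Attendance' are assumptions about the CSV, so check them against df.columns first; the helper name plot_attendance is also made up for illustration.

```python
import matplotlib.pyplot as plt


def plot_attendance(df, x_col="Year", y_col="Attendance"):
    """Draw a line plot of y_col against x_col and return the Axes."""
    # x_col and y_col default to assumed column names; adjust them to the real ones
    plt.plot(df[x_col], df[y_col])  # first argument: X-axis, second: Y-axis
    plt.xlabel(x_col)
    plt.ylabel(y_col)
    return plt.gca()


# In the notebook, with df already loaded from the CSV:
# plot_attendance(df)
# plt.show()
```

If the attendance column is stored as text rather than numbers, it would need to be converted before plotting.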
Running the cell shows the diagram directly below it.
Setting the figure size before plotting gives the diagram a better aspect ratio.
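As a rough sketch of how the figure size can be set in Matplotlib: plt.figure accepts a figsize argument given as (width, height) in inches. The size (12, 4) and the plotted points below are arbitrary placeholders, not values from the World Cup data.

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; (12, 4) is an arbitrary wide example
fig = plt.figure(figsize=(12, 4))
plt.plot([0, 1, 2], [0, 1, 4])  # placeholder points, not World Cup data

width, height = fig.get_size_inches()
print(width / height)  # aspect ratio of the figure: 3.0
```

A wider figure usually suits data plotted over many years, since the points along the X-axis get more room.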
plt.plot() is the plotting function: its first argument is the X-axis data and the second is the Y-axis data. To plot more complicated diagrams, such as multi-column data or heatmaps, we can use another library named Seaborn, which builds on Matplotlib and offers more plot types and colour palettes.
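To give an idea of what a Seaborn heatmap looks like in code, here is a small sketch that draws the correlations between the numeric columns of a data frame. It assumes Seaborn is installed; the helper name plot_correlation_heatmap, the figure size, and the colour map are illustrative choices only.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_correlation_heatmap(df):
    """Draw a heatmap of correlations between the numeric columns of df."""
    numeric = df.select_dtypes(include="number")  # ignore text columns
    corr = numeric.corr()                         # pairwise correlation table
    plt.figure(figsize=(8, 6))                    # arbitrary example size
    # annot=True writes the correlation value inside each cell
    return sns.heatmap(corr, annot=True, cmap="coolwarm")


# In the notebook, with df already loaded from the CSV:
# plot_correlation_heatmap(df)
# plt.show()
```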