One of the simplest ways to create a DataFrame in pandas is to build it from a dictionary. The example below creates a DataFrame from a dictionary of lists.
import pandas as pd
dict1 = {
    'Ford': [120, 230, 120, 431],
    'Renault': [320, 233, 547, 622],
    'Audi': [230, 123, 457, 232],
    'Toyota': [230, 123, 457, 232],
    'Opel': [230, 123, 457, 232]
}
print(dict1.keys())
print("Ford key is:", end=' ')
print(dict1['Ford'])
sells = pd.DataFrame(dict1) # create sells dataframe from 'dict1' dictionary
print('DataFrame is:')
print(sells)
#output:
dict_keys(['Ford', 'Renault', 'Audi', 'Toyota', 'Opel'])
Ford key is: [120, 230, 120, 431]
DataFrame is:
   Ford  Renault  Audi  Toyota  Opel
0   120      320   230     230   230
1   230      233   123     123   123
2   120      547   457     457   457
3   431      622   232     232   232
Comments on the code above: the dict1 dictionary holds car sales for a car dealer over the first 4 months of the year.
The dictionary keys are 'Ford', 'Renault', 'Audi', 'Toyota' and 'Opel'.
The dictionary values are lists; for example, the value for the 'Ford' key is the list [120, 230, 120, 431].
Recalling how dictionaries work, print(dict1.keys()) shows:
dict_keys(['Ford', 'Renault', 'Audi', 'Toyota', 'Opel'])
and print(dict1['Ford']) shows [120, 230, 120, 431].
Finally, the call that creates the sells DataFrame from the dictionary is sells = pd.DataFrame(dict1). Each key becomes a column, and each position in the lists becomes a row.
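Since the rows stand for months, it can help to label them. The sketch below passes an index argument to pd.DataFrame; the month labels are an assumption added for illustration (the original data carries no labels), and only two of the brands are kept to shorten the example.

```python
import pandas as pd

dict1 = {
    'Ford': [120, 230, 120, 431],
    'Renault': [320, 233, 547, 622],
}

# Hypothetical row labels: the source only says "first 4 months of the year".
months = ['Jan', 'Feb', 'Mar', 'Apr']

# index must have the same length as the lists in the dictionary.
sells = pd.DataFrame(dict1, index=months)
print(sells)

# A single cell can now be looked up by row label and column label.
print(sells.loc['Feb', 'Renault'])
```

With labelled rows, sells.loc['Feb'] returns the whole February row, which is usually easier to read than a positional lookup.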
A DataFrame can also be created from a dictionary with the from_dict class method of DataFrame, which works like the previous example but offers more options.
Example:
import pandas as pd
dict2 = {
    'candy': ['80%', '60%', '45%'],
    'chocolate': ['12%', '24%', '7%'],
    'wafer': ['14%', '18%', '16%']
}
sells2 = pd.DataFrame.from_dict(dict2)
print(sells2)
#output:
  candy chocolate wafer
0   80%       12%   14%
1   60%       24%   18%
2   45%        7%   16%
When DataFrame.from_dict is given only the dictionary, the dictionary keys become the DataFrame columns.
Calling from_dict with the parameter orient='index' instead turns each key into a row label, with the values of its list spread across that row, as in the example below:
import pandas as pd
dict2 = {
    'candy': ['80%', '60%', '45%'],
    'chocolate': ['12%', '24%', '7%'],
    'wafer': ['14%', '18%', '16%']
}
sells2i = pd.DataFrame.from_dict(dict2, orient='index')
print(sells2i)
#output:
             0    1    2
candy      80%  60%  45%
chocolate  12%  24%   7%
wafer      14%  18%  16%
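With orient='index' the columns get the default labels 0, 1, 2. A minimal sketch of naming them through the columns parameter of from_dict follows; the names 'shop_a', 'shop_b' and 'shop_c' are hypothetical, since the original does not say what the three values of each list represent.

```python
import pandas as pd

dict2 = {
    'candy': ['80%', '60%', '45%'],
    'chocolate': ['12%', '24%', '7%'],
    'wafer': ['14%', '18%', '16%'],
}

# Hypothetical column names, one per position in the lists.
sells2n = pd.DataFrame.from_dict(
    dict2,
    orient='index',
    columns=['shop_a', 'shop_b', 'shop_c'],
)
print(sells2n)

# Rows are addressed by dictionary key, columns by the new names.
print(sells2n.loc['chocolate', 'shop_b'])
```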
Creating a DataFrame from a list of tuples.
This can be done with the DataFrame.from_records method. Each tuple becomes one row, and the columns parameter names the columns.
Example:
import pandas as pd
marks = [('Mike', 9), ('Debora', 10), ('Steve', 9), ('Tim', 8)]
marks_df = pd.DataFrame.from_records(marks, columns=['Student', 'Mark'])
print(marks_df)
#Output:
  Student  Mark
0    Mike     9
1  Debora    10
2   Steve     9
3     Tim     8
Another example, a DataFrame from a list of tuples with more elements:
import pandas as pd
marks = [('Mike', 9, 8), ('Debora', 10, 9), ('Steve', 9, 10), ('Tim', 8, 9)]
marks_df = pd.DataFrame.from_records(marks, columns=['Student', 'Mark1','Mark2'])
print(marks_df)
#output:
  Student  Mark1  Mark2
0    Mike      9      8
1  Debora     10      9
2   Steve      9     10
3     Tim      8      9
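from_records also accepts an index parameter naming the field to use as row labels. The sketch below reuses the same list of tuples and, as an illustrative choice not made in the original, uses the 'Student' field as the index so that marks can be looked up by name.

```python
import pandas as pd

marks = [('Mike', 9, 8), ('Debora', 10, 9), ('Steve', 9, 10), ('Tim', 8, 9)]

# 'Student' is taken out of the columns and becomes the row index.
marks_by_name = pd.DataFrame.from_records(
    marks,
    columns=['Student', 'Mark1', 'Mark2'],
    index='Student',
)
print(marks_by_name)

# Look up one student's mark by name instead of by row number.
print(marks_by_name.loc['Debora', 'Mark1'])
```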
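from_records can be used in the same way to create a DataFrame from a list of dictionaries. In this case the dictionary keys supply the column names, so the columns parameter is not needed.
Example: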
import pandas as pd
marks = [
    {'Student': 'Mike', 'Mark': 9},
    {'Student': 'Debora', 'Mark': 10},
    {'Student': 'Steve', 'Mark': 9},
    {'Student': 'Tim', 'Mark': 8}
]
marks_df1 = pd.DataFrame.from_records(marks)
print(marks_df1)
#Output:
  Student  Mark
0    Mike     9
1  Debora    10
2   Steve     9
3     Tim     8
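For a plain list of dictionaries like this one, the ordinary pd.DataFrame constructor should give the same result as from_records. The short check below is a sketch of that comparison using DataFrame.equals; the variable names are chosen only for this illustration.

```python
import pandas as pd

marks = [
    {'Student': 'Mike', 'Mark': 9},
    {'Student': 'Debora', 'Mark': 10},
    {'Student': 'Steve', 'Mark': 9},
    {'Student': 'Tim', 'Mark': 8},
]

from_rec = pd.DataFrame.from_records(marks)  # as in the example above
from_ctor = pd.DataFrame(marks)              # plain constructor

# equals() compares shape, labels and values of the two DataFrames.
same = from_rec.equals(from_ctor)
print(same)
```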
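We can also create a DataFrame from a NumPy array. Note that in this example the header tuple ('Student', 'Mark') is part of the array, so it ends up as row 0 of the data and the columns get the default labels 0 and 1. NumPy also stores every element of such a mixed array as a string.
Example: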
import pandas as pd
import numpy as np
data = np.array([('Student','Mark'),('Mike', 9), ('Debora', '10'), ('Steve', 9), ('Tim', 8)])
marks_df2 = pd.DataFrame.from_records(data)
print(marks_df2)
#output:
         0     1
0  Student  Mark
1     Mike     9
2   Debora    10
3    Steve     9
4      Tim     8
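If the first row of the array is meant to be a header, one possible approach is to split it off and pass it as columns. The sketch below does that with the plain pd.DataFrame constructor (a swapped-in alternative to from_records) and converts the marks back to integers, since NumPy stores all elements of this array as strings. The names header, rows and marks_df3 are introduced only for this illustration.

```python
import numpy as np
import pandas as pd

data = np.array([('Student', 'Mark'), ('Mike', 9), ('Debora', 10), ('Steve', 9), ('Tim', 8)])

# First row holds the column names, the remaining rows hold the data.
header, rows = data[0], data[1:]
marks_df3 = pd.DataFrame(rows, columns=header)

# Every value is a string at this point, so convert the marks to integers.
marks_df3['Mark'] = marks_df3['Mark'].astype(int)

print(marks_df3)
print(marks_df3['Mark'].sum())
```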