Programs to demonstrate Series and DataFrames using pandas.
1. Write a Pandas program to add, subtract, multiply, and divide two Pandas Series.
import pandas as pd
s1 = pd.Series([10, 20, 30, 40, 50])
s2 = pd.Series([5, 10, 15, 20, 25])
# Add s1 and s2
s3 = s1 + s2
print("Addition of two Series:")
print(s3)
# Subtract s2 from s1
s4 = s1 - s2
print("\nSubtraction of two Series:")
print(s4)
# Multiply s1 and s2
s5 = s1 * s2
print("\nMultiplication of two Series:")
print(s5)
# Divide s1 by s2
s6 = s1 / s2
print("\nDivision of two Series:")
print(s6)
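Note: the operators above pair values by index label, not by position. Both Series in the exercise share the default index 0 to 4, so the results line up. The sketch below uses made-up labels (x, y, z, w are assumptions, not part of the exercise) to show what happens when the labels only partly overlap, and how Series.add with fill_value can avoid the resulting NaN values.

```python
import pandas as pd

# Two Series with partially overlapping index labels (hypothetical example data)
a = pd.Series([10, 20, 30], index=["x", "y", "z"])
b = pd.Series([1, 2, 3], index=["y", "z", "w"])

# The + operator aligns on labels; a label present in only one Series gives NaN
aligned_sum = a + b
print(aligned_sum)

# Series.add with fill_value=0 treats a label missing from one Series as 0
filled_sum = a.add(b, fill_value=0)
print(filled_sum)
```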
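2. Write a Pandas program to convert a Series of lists to one Series.
import pandas as pd
# Series whose elements are lists of different lengths
s1 = pd.Series([[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]])
# Flatten: walk each sublist in order and collect its values into one Series
s2 = pd.Series([val for sublist in s1 for val in sublist])
print(s2)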
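An alternative to the list comprehension, shown here only as a sketch, is Series.explode (available in pandas 0.25 and later). It keeps the original index label for every element and returns an object dtype, so this version resets the index and casts to int to match the result above. The variable name flat is an assumption made for illustration.

```python
import pandas as pd

s1 = pd.Series([[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]])

# explode() gives each list element its own row but repeats the original
# index label and leaves the dtype as object, so reset the index and cast
flat = s1.explode().reset_index(drop=True).astype(int)
print(flat)
```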
3. Write a Pandas program to select the rows where the number of attempts in the examination is greater than 2.
Sample data:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
import numpy as np
import pandas as pd
exam_data = {
'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
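# Build the DataFrame with the letters as row labels
df = pd.DataFrame(exam_data, index=labels)
# Boolean mask: keep only the rows where 'attempts' is greater than 2
result = df[df['attempts'] > 2]
print(result)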
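The same selection can be written in a few equivalent ways. The sketch below compares a boolean mask, .loc with a mask, and DataFrame.query on a small made-up frame (the three rows are an assumption chosen to keep the example short, not the full exercise data).

```python
import pandas as pd

# Small hypothetical frame with the same column names as the exercise
df = pd.DataFrame(
    {"name": ["Dima", "Emily", "James"], "attempts": [3, 2, 3]},
    index=["b", "e", "d"],
)

# Boolean mask inside the indexing brackets
by_mask = df[df["attempts"] > 2]

# .loc with the same mask; the second argument can also restrict the columns
by_loc = df.loc[df["attempts"] > 2, ["name", "attempts"]]

# query() takes the condition as a string that refers to column names
by_query = df.query("attempts > 2")

print(by_query)
```

All three return the rows labelled b and d; which form to use is mostly a matter of readability.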
4. Write a Pandas program to append a new row 'k' to the data frame, then sort the data frame first by 'name' in descending order, then by 'score' in ascending order.
Sample data:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Values for each column of the new row: name: 'Suresh', score: 15.5, attempts: 1, qualify: 'yes', label: 'k'
import pandas as pd
import numpy as np
# define the sample exam data
exam_data = {
'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']
}
# create the DataFrame
df = pd.DataFrame(exam_data, index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])
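# add a new row labelled 'k' to the DataFrame
new_row = pd.Series(data={'name': 'Suresh', 'score': 15.5, 'attempts': 1, 'qualify': 'yes'}, name='k')
# DataFrame.append was removed in pandas 2.0; concatenate a one-row DataFrame instead
df = pd.concat([df, pd.DataFrame([new_row])])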
# sort the DataFrame by 'name' in descending order, then by 'score' in ascending order
df = df.sort_values(by=['name', 'score'], ascending=[False, True])
print(df)
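Note: every name in the exercise data is unique, so the second sort key ('score') never gets to break a tie there. The sketch below uses made-up rows with a repeated name (an assumption for illustration only) to show the secondary key at work, and that NaN scores are placed last within a group by default (na_position='last').

```python
import numpy as np
import pandas as pd

# Hypothetical data with a repeated name so that 'score' decides the order
df = pd.DataFrame(
    {
        "name": ["Dima", "Dima", "Emily", "Dima"],
        "score": [9.0, np.nan, 12.5, 7.0],
    }
)

# 'name' descending puts Emily first; the Dima rows are then ordered by
# 'score' ascending, with the NaN score placed last (the default)
sorted_df = df.sort_values(by=["name", "score"], ascending=[False, True])
print(sorted_df)
```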