What is the primary purpose of the “crosstab” function in Pandas?

A. Data cleaning and preprocessing
B. Data visualization
C. Cross-tabulation of categorical data
D. Data transformation

The correct answer is C. Cross-tabulation of categorical data.

A crosstab, also known as a contingency table, is a statistical table that shows how often each combination of values of two or more categorical variables occurs. In Pandas, the pd.crosstab function builds such a table from array-like inputs, such as the columns of a dataframe.
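As a minimal sketch of what such a table looks like, the snippet below cross-tabulates two categorical columns. The survey data and the column names `gender` and `smoker` are invented for illustration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical survey data, invented for illustration only.
survey = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M"],
    "smoker": ["no", "no", "yes", "no", "no", "yes"],
})

# Rows are the distinct values of "gender", columns those of "smoker";
# each cell counts the rows of `survey` with that combination.
table = pd.crosstab(survey["gender"], survey["smoker"])
print(table)
```

Because several rows share the same combination of values, some cells hold counts greater than one, which is the typical case for categorical data.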

Data cleaning and preprocessing are the processes of identifying and correcting errors in data, and transforming data into a format that is suitable for analysis. Data visualization is the process of creating graphical representations of data. Data transformation is the process of changing the form of data, such as by reshaping or pivoting it.

Here is an example of how the crosstab function can be used to create a crosstab of data from a dataframe:

```
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# crosstab is a top-level pandas function, not a DataFrame method
crosstab = pd.crosstab(df['A'], df['B'])

print(crosstab)
```

The output of the above code is a crosstab with three rows and three columns. The rows represent the values of the ‘A’ column, the columns represent the values of the ‘B’ column, and each cell contains the number of times that combination of values occurs in the dataframe. In this small example every pair of values occurs exactly once, so the table has a 1 in each diagonal cell and a 0 everywhere else.
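To make the cell counts more concrete, the following sketch rebuilds the same table by hand with a group-and-count step and compares it with the result of `pd.crosstab`. The variable names `counts` and `by_hand` are chosen here for illustration; the manual version is one possible way to reproduce the counts, not a description of how pandas computes them internally.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# pd.crosstab counts how often each (A, B) pair occurs.
counts = pd.crosstab(df["A"], df["B"])

# The same counts, built by hand: group by both columns, count the rows
# in each group, then move the B level into columns (missing pairs -> 0).
by_hand = df.groupby(["A", "B"]).size().unstack(fill_value=0)

print(counts.shape)
print((counts.values == by_hand.values).all())
```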

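The crosstab function can also cross-tabulate columns taken from different dataframes, because it accepts any pair of array-like inputs. Pandas pairs the values of two Series by index label, so the dataframes should share the same index. For example, the following code creates a crosstab of data from two dataframes, df1 and df2: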

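```
import pandas as pd

# df1 and df2 are assumed to exist, each with an 'A' column and the same index.
# rownames/colnames label the two axes, since both columns are named 'A'.
crosstab = pd.crosstab(df1['A'], df2['A'], rownames=['df1 A'], colnames=['df2 A'])

print(crosstab)
```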

The output of the above code is a crosstab with one row for each distinct value of the ‘A’ column in df1 and one column for each distinct value of the ‘A’ column in df2. Each cell contains the number of index positions at which that pair of values occurs together in the two dataframes. The exact shape therefore depends on the data: the table has three rows and three columns only if each ‘A’ column holds three distinct values.
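Since df1 and df2 are not defined in the text, here is a self-contained sketch with two small made-up dataframes. The values `x`, `y`, `u`, `v` and the axis labels `df1_A` and `df2_A` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical frames that share the default index (0, 1, 2, 3).
df1 = pd.DataFrame({"A": ["x", "x", "y", "y"]})
df2 = pd.DataFrame({"A": ["u", "v", "u", "u"]})

# Values are paired row by row through the shared index.
# rownames/colnames label the axes, since both Series are named "A".
table = pd.crosstab(df1["A"], df2["A"], rownames=["df1_A"], colnames=["df2_A"])
print(table)
```

Here the table has two rows and two columns, and the pair (`y`, `v`) never occurs, so its cell is 0.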