The correct answer is C. Data Encoding.
Data encoding is the process of converting categorical data into a numerical format that a model can consume. The simplest approach, label encoding, assigns a unique integer to each category. For example, if a customer dataset has a column with the categories “Male”, “Female”, and “Other”, you could encode them as follows:
- Male = 0
- Female = 1
- Other = 2
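Once encoded, the data can be passed to machine learning models that require numeric input. Note that integer codes imply an ordering (Other > Female > Male) that does not exist in the data, so for unordered categories one-hot encoding, which gives each category its own 0/1 column, is often preferred.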
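The mapping above can be sketched in plain Python, together with a one-hot variant for comparison. This is a minimal illustration rather than a reference implementation: the names `label_map`, `label_encode`, `one_hot_encode`, and the `sample` values are assumptions made for the example.

```python
# Category values taken from the example above; the integer codes follow list order.
categories = ["Male", "Female", "Other"]

# Label encoding: map each category to a unique integer.
label_map = {category: index for index, category in enumerate(categories)}


def label_encode(values):
    """Replace each category with its integer code."""
    return [label_map[value] for value in values]


def one_hot_encode(values):
    """Replace each category with a 0/1 vector that has a single 1."""
    width = len(categories)
    return [
        [1 if label_map[value] == position else 0 for position in range(width)]
        for value in values
    ]


# Hypothetical column of raw values.
sample = ["Female", "Other", "Male"]
encoded = label_encode(sample)
one_hot = one_hot_encode(sample)
print(encoded)
print(one_hot)
```

In practice a library encoder would usually handle this, including categories that were not seen when the mapping was built, which this sketch does not.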
A. Data Aggregation is the process of combining data from multiple sources into a single dataset. This is often done to create a more complete picture of the data. For example, you might aggregate data from sales, marketing, and customer service to get a better understanding of your overall business performance.
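As a rough illustration of combining sources, the sketch below merges two small record lists into one per-customer summary. The record layout, the field names (`customer_id`, `revenue`, `tickets`), and the values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two separate sources.
sales = [
    {"customer_id": 1, "revenue": 120.0},
    {"customer_id": 2, "revenue": 80.0},
    {"customer_id": 1, "revenue": 30.0},
]
customer_service = [
    {"customer_id": 1, "tickets": 2},
    {"customer_id": 3, "tickets": 4},
]

# One combined row per customer; fields with no source data stay at zero.
totals = defaultdict(lambda: {"revenue": 0.0, "tickets": 0})

for row in sales:
    totals[row["customer_id"]]["revenue"] += row["revenue"]

for row in customer_service:
    totals[row["customer_id"]]["tickets"] += row["tickets"]

combined = dict(totals)
print(combined)
```

Note that nothing here converts categories to numbers, which is why aggregation does not answer the question.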
B. Data Imputation is the process of filling in missing values in a dataset, usually before analysis or modeling. Common methods include mean imputation, median imputation, and multiple imputation. Imputation fills gaps; it does not convert categories to numbers.
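Mean and median imputation can be sketched in a few lines. This is a simplified example that assumes missing values are represented as `None`; the function name `impute` and the `ages` data are hypothetical, and multiple imputation is not shown.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [value for value in values if value is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if value is None else value for value in values]


# Hypothetical column with two missing entries and one outlier.
ages = [20, None, 30, 100, None]
mean_filled = impute(ages, "mean")
median_filled = impute(ages, "median")
print(mean_filled)
print(median_filled)
```

The outlier pulls the mean fill value well above the median one, which is one reason median imputation is often chosen for skewed data.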
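D. Data Normalization is the process of rescaling numerical data to a common scale. The term most often refers to min-max scaling, which maps values to the range [0, 1]; transforming data to have a mean of 0 and a standard deviation of 1 is more precisely called standardization (z-score scaling), although the two terms are frequently used interchangeably. Rescaling is typically done before modeling and tends to help models that are sensitive to feature scale, such as distance-based methods and models trained with gradient descent. It applies to data that is already numeric, so it does not convert categories to numbers.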
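Two common rescaling schemes can be sketched as follows: min-max scaling to the range [0, 1], and z-score scaling to a mean of 0 and a standard deviation of 1. The function names and the `incomes` data are assumptions for illustration, and the z-score version uses the population standard deviation.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(value - center) / spread for value in values]


# Hypothetical numeric column.
incomes = [10.0, 20.0, 30.0, 40.0, 50.0]
scaled = min_max_scale(incomes)
z_scores = standardize(incomes)
print(scaled)
print(z_scores)
```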