________ adopts a dictionary-oriented approach, associating to each category label a progressive integer number.

A. LabelEncoder class
B. LabelBinarizer class
C. DictVectorizer
D. FeatureHasher

The correct answer is: A. LabelEncoder class

The LabelEncoder class adopts a dictionary-oriented approach, associating to each category label a progressive integer number. During fitting it sorts the distinct labels and stores them in its classes_ attribute. Each label is then encoded as its index in that array (0, 1, 2, and so on), and inverse_transform maps the integers back to the original labels. Because the integers imply an ordering that the categories may not have, scikit-learn intends LabelEncoder for target labels rather than input features.

A DictVectorizer does not assign integer IDs to labels. It takes a list of dictionaries that map feature names to values and builds a feature matrix from them: each string-valued feature is one-hot encoded into one column per name=value pair, while numeric values are passed through unchanged.
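The integer assignment performed by LabelEncoder can be illustrated with a minimal sketch. The `labels` list is made-up sample data, not taken from the question, and the snippet assumes scikit-learn is installed:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample labels, chosen only for illustration
labels = ['b', 'a', 'c', 'a']

encoder = LabelEncoder()

# fit_transform learns the sorted set of labels and returns their integer codes
encoded = encoder.fit_transform(labels)

# classes_ holds the sorted labels; the code of a label is its index in classes_
print(encoder.classes_)
print(encoded)

# inverse_transform maps the integer codes back to the original labels
decoded = encoder.inverse_transform(encoded)
print(decoded)
```

Here 'a' should map to 0, 'b' to 1, and 'c' to 2, because the labels are sorted before the integers are assigned.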

The LabelBinarizer class converts categorical labels into binary indicator vectors (a one-vs-all encoding). For each data point it produces a vector with one element per category, set to 1 for the category the data point belongs to and 0 everywhere else.
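A short sketch of this behaviour, again using made-up sample labels and assuming scikit-learn is installed:

```python
from sklearn.preprocessing import LabelBinarizer

# Hypothetical sample labels, chosen only for illustration
labels = ['b', 'a', 'c', 'a']

binarizer = LabelBinarizer()

# Each row is a binary vector with one column per class (columns follow classes_)
binary = binarizer.fit_transform(labels)
print(binarizer.classes_)
print(binary)

# inverse_transform recovers the original label from each binary row
restored = binarizer.inverse_transform(binary)
print(restored)
```

With three classes the result has three columns. Note that when there are only two classes, LabelBinarizer returns a single column instead of two.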

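The FeatureHasher class converts features into a fixed-width feature matrix using the hashing trick. It hashes each feature name (or name=value pair, for string values) and uses the hash value as a column index, so it does not need to store a vocabulary. The trade-off is that distinct features can collide in the same column and the mapping cannot be inverted.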
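The following sketch shows the fixed output width. The `records` data and the choice of `n_features=8` are assumptions made for illustration (the library default is much larger), and the snippet assumes scikit-learn is installed:

```python
from sklearn.feature_extraction import FeatureHasher

# Hypothetical sample records, one dict per data point
records = [{'category': 'a'}, {'category': 'b'}, {'category': 'c'}]

# n_features fixes the number of output columns regardless of the input vocabulary
hasher = FeatureHasher(n_features=8, input_type='dict')

# FeatureHasher is stateless, so transform can be called without fitting
hashed = hasher.transform(records)
dense = hashed.toarray()

print(hashed.shape)
print(dense)
```

Each row contains a single non-zero entry whose column is chosen by the hash of the string 'category=<label>'. The entry may be +1 or -1, because the default `alternate_sign=True` uses the sign of the hash to reduce the effect of collisions.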

Here is an example of how a DictVectorizer can be used:

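```
import pandas as pd
from sklearn.feature_extraction import DictVectorizer

df = pd.DataFrame({'category': ['a', 'b', 'c'], 'value': [1, 2, 3]})

# DictVectorizer expects one dict per sample, not a DataFrame
records = df.to_dict(orient='records')

# sparse=False returns a dense NumPy array instead of a SciPy sparse matrix
vectorizer = DictVectorizer(sparse=False)

X = vectorizer.fit_transform(records)

print(vectorizer.feature_names_)
print(X)

# Output:
# ['category=a', 'category=b', 'category=c', 'value']
# [[1. 0. 0. 1.]
#  [0. 1. 0. 2.]
#  [0. 0. 1. 3.]]
```

As you can see, the DictVectorizer one-hot encodes the string-valued category feature and passes the numeric value feature through unchanged. The first three columns indicate the category labels 'a', 'b', and 'c', and the last column holds the value.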

