{"id":55295,"date":"2024-04-16T00:24:57","date_gmt":"2024-04-16T00:24:57","guid":{"rendered":"https:\/\/exam.pscnotes.com\/mcq\/?p=55295"},"modified":"2024-04-16T00:24:57","modified_gmt":"2024-04-16T00:24:57","slug":"lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego","status":"publish","type":"post","link":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/","title":{"rendered":"Let&#8217;s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges you may face if you have applied OHE on a categorical variable of train dataset?"},"content":{"rendered":"<p>[amp_mcq option1=&#8221;All categories of categorical variable are not present in the test dataset.&#8221; option2=&#8221;Frequency distribution of categories is different in train as compared to the test dataset.&#8221; option3=&#8221;Train and Test always have same distribution.&#8221; option4=&#8221;Both A and B&#8221; correct=&#8221;option4&#8243;]<!--more--><\/p>\n<p>The correct answer is: Both A and B.<\/p>\n<p>One hot encoding (OHE) is a technique used to convert categorical features into numerical features. It does this by creating a new binary feature for each unique category in the original feature. 
For example, if a feature has the categories &#8220;red&#8221;, &#8220;blue&#8221;, and &#8220;green&#8221;, OHE would create three new features, one for each category.<\/p>\n<p>One challenge with OHE is that the frequency distribution of categories in the test data may differ from that in the training data. This can happen if the test data is collected from a different population than the training data. When a category is common in one dataset but rare in the other, the OHE features may be less effective in predicting the target variable in the test data.<\/p>\n<p>Another challenge is that the set of categories may not match between the two datasets. Some training categories may be absent from the test data, and the test data may contain categories that never appeared in training. This can happen if the test data is collected from a different population or a different time period than the training data. If each dataset is encoded separately, the resulting OHE features will not line up, and a model trained on one set of features cannot be applied directly to the other.<\/p>\n<p>To avoid these challenges, it is important to compare the categories in the training and test data before using OHE. If the distribution of categories differs between the two, you may need to adjust the OHE features, for example by grouping rare categories together, or collect more representative training data. 
If the sets of categories differ, derive the encoding from the training data and apply that same encoding to the test data, so that both datasets share the same OHE features; categories that appear only in the test data can be mapped to a dedicated &#8220;other&#8221; feature or encoded as all zeros.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>[amp_mcq option1=&#8221;All categories of categorical variable are not present in the test dataset.&#8221; option2=&#8221;Frequency distribution of categories is different in train as compared to the test dataset.&#8221; option3=&#8221;Train and Test always have same distribution.&#8221; option4=&#8221;Both A and B&#8221; correct=&#8221;option4&#8243;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[729],"tags":[],"class_list":["post-55295","post","type-post","status-publish","format-standard","hentry","category-machine-learning","no-featured-image-padding"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v22.2 (Yoast SEO v23.3) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Let&#039;s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). 
What challenges you may face if you have applied OHE on a categorical variable of train dataset?<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Let&#039;s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges you may face if you have applied OHE on a categorical variable of train dataset?\" \/>\n<meta property=\"og:description\" content=\"[amp_mcq option1=&#8221;All categories of categorical variable are not present in the test dataset.&#8221; option2=&#8221;Frequency distribution of categories is different in train as compared to the test dataset.&#8221; option3=&#8221;Train and Test always have same distribution.&#8221; option4=&#8221;Both A and B&#8221; correct=&#8221;option2&#8243;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/\" \/>\n<meta property=\"og:site_name\" content=\"MCQ and Quiz for Exams\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-16T00:24:57+00:00\" \/>\n<meta name=\"author\" content=\"rawan239\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" 
content=\"rawan239\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Let's say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges you may face if you have applied OHE on a categorical variable of train dataset?","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/","og_locale":"en_US","og_type":"article","og_title":"Let's say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). 
What challenges you may face if you have applied OHE on a categorical variable of train dataset?","og_description":"[amp_mcq option1=&#8221;All categories of categorical variable are not present in the test dataset.&#8221; option2=&#8221;Frequency distribution of categories is different in train as compared to the test dataset.&#8221; option3=&#8221;Train and Test always have same distribution.&#8221; option4=&#8221;Both A and B&#8221; correct=&#8221;option2&#8243;]","og_url":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/","og_site_name":"MCQ and Quiz for Exams","article_published_time":"2024-04-16T00:24:57+00:00","author":"rawan239","twitter_card":"summary_large_image","twitter_misc":{"Written by":"rawan239","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/","url":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/","name":"Let's say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). 
What challenges you may face if you have applied OHE on a categorical variable of train dataset?","isPartOf":{"@id":"https:\/\/exam.pscnotes.com\/mcq\/#website"},"datePublished":"2024-04-16T00:24:57+00:00","dateModified":"2024-04-16T00:24:57+00:00","author":{"@id":"https:\/\/exam.pscnotes.com\/mcq\/#\/schema\/person\/5807dafeb27d2ec82344d6cbd6c3d209"},"breadcrumb":{"@id":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/exam.pscnotes.com\/mcq\/lets-say-you-are-working-with-categorical-features-and-you-have-not-looked-at-the-distribution-of-the-categorical-variable-in-the-test-data-you-want-to-apply-one-hot-encoding-ohe-on-the-catego\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/exam.pscnotes.com\/mcq\/"},{"@type":"ListItem","position":2,"name":"mcq","item":"https:\/\/exam.pscnotes.com\/mcq\/category\/mcq\/"},{"@type":"ListItem","position":3,"name":"Machine learning","item":"https:\/\/exam.pscnotes.com\/mcq\/category\/mcq\/machine-learning\/"},{"@type":"ListItem","position":4,"name":"Let&#8217;s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). 
What challenges you may face if you have applied OHE on a categorical variable of train dataset?"}]},{"@type":"WebSite","@id":"https:\/\/exam.pscnotes.com\/mcq\/#website","url":"https:\/\/exam.pscnotes.com\/mcq\/","name":"MCQ and Quiz for Exams","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/exam.pscnotes.com\/mcq\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/exam.pscnotes.com\/mcq\/#\/schema\/person\/5807dafeb27d2ec82344d6cbd6c3d209","name":"rawan239","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/exam.pscnotes.com\/mcq\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/761a7274f9cce048fa5b921221e7934820d74514df93ef195a9d22af0c1c9001?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/761a7274f9cce048fa5b921221e7934820d74514df93ef195a9d22af0c1c9001?s=96&d=mm&r=g","caption":"rawan239"},"sameAs":["https:\/\/exam.pscnotes.com"],"url":"https:\/\/exam.pscnotes.com\/mcq\/author\/rawan239\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/posts\/55295","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/comments?post=55295"}],"version-history":[{"count":0,"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/posts\/55295\/revisions"}],"wp:attachment":[{"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/media?parent=55295"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/categories?post=55295"},{"t
axonomy":"post_tag","embeddable":true,"href":"https:\/\/exam.pscnotes.com\/mcq\/wp-json\/wp\/v2\/tags?post=55295"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}