Part 2: Text Mining A dataset of Shark Tank episodes is made available

Statistics Oct 27, 2021

Part 2: Text Mining

A dataset of Shark Tank episodes is made available. It contains 495 entrepreneurs making their pitch to the VC sharks.

Use ONLY the “Description” column for the initial text mining exercise.

  1. Pick out the Deal (Dependent Variable) and Description columns into a separate data frame.
  2. Create two corpora, one with those who secured a Deal, the other with those who did not secure a deal.
  3. Do the following for both corpora:
  • Find the number of characters in each corpus.
  • Remove stop words from the corpora. (Words like ‘also’, ‘made’, ‘makes’, ‘like’, ‘this’, ‘even’ and ‘company’ are also to be removed.)
  • What are the top 3 most frequently occurring words in each corpus (after removing stop words)?
  • Plot the word cloud for each corpus.
  4. Refer to both word clouds. What do you infer?
  5. Looking at the word clouds, is it true that entrepreneurs who introduced devices are less likely to secure a deal, based on your analysis?
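
The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not the expected solution: the rows are toy stand-ins for the real dataset, the `deal` and `description` keys are assumed names for the two columns, and the base stop-word list is a small assumed sample to which the words named in the task are added. The helper `share_mentioning` is a hypothetical addition that gives one rough way to check the "devices" question numerically.

```python
import re
from collections import Counter

# Toy rows standing in for the real dataset (hypothetical, for illustration only).
# "deal" and "description" are assumed names for the two columns of interest.
rows = [
    {"deal": True, "description": "This company makes a kids snack bar"},
    {"deal": True, "description": "A healthy snack also made for kids"},
    {"deal": False, "description": "A device like a phone charger"},
    {"deal": False, "description": "This device makes even a charger smart"},
]

# A small assumed base list of stop words, plus the extra words named in the task.
STOP_WORDS = {
    "a", "an", "the", "for", "of", "and", "to", "in", "is",
    "also", "made", "makes", "like", "this", "even", "company",
}


def build_corpus(rows, deal):
    """Collect the descriptions whose Deal flag equals `deal`."""
    return [r["description"] for r in rows if r["deal"] == deal]


def char_count(corpus):
    """Total number of characters across all documents in the corpus."""
    return sum(len(doc) for doc in corpus)


def tokens(corpus):
    """Lowercase, split into words, and drop stop words."""
    words = []
    for doc in corpus:
        words.extend(re.findall(r"[a-z']+", doc.lower()))
    return [w for w in words if w not in STOP_WORDS]


def top_words(corpus, n=3):
    """The n most frequent words after stop-word removal, with counts."""
    return Counter(tokens(corpus)).most_common(n)


def share_mentioning(corpus, word):
    """Fraction of documents in the corpus that contain `word`."""
    if not corpus:
        return 0.0
    hits = sum(1 for doc in corpus if word in tokens([doc]))
    return hits / len(corpus)


deal_corpus = build_corpus(rows, True)
no_deal_corpus = build_corpus(rows, False)

for name, corpus in [("deal", deal_corpus), ("no deal", no_deal_corpus)]:
    print(name, "characters:", char_count(corpus))
    print(name, "top words:", top_words(corpus))
    print(name, "share mentioning 'device':", share_mentioning(corpus, "device"))
```

For the word cloud step, the word counts from `Counter(tokens(corpus))` could be passed to a plotting tool; for example, the third-party `wordcloud` package accepts a frequency mapping through `WordCloud().generate_from_frequencies(...)`. A word cloud only shows how often a word appears in each group, so comparing the share of descriptions that mention "device" in the two corpora is a more direct check than reading the clouds by eye.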
