DASC/CSE 5300
Programming Assignment 1
Description:
Getting started with some data analysis.
First, you need access to Python, either by installing it (free download) or by using it online. Either option (or both) is OK.
https://www.python.org/downloads/ (any version from 3.7 through the newest should be fine)
https://colab.research.google.com/ (free, no-install access with a free Google account)
Then try out a few simple things:
2+2, 2^17 - 1 (written 2**17 - 1 in Python; a Mersenne prime number, see https://en.wikipedia.org/wiki/Mersenne_prime)
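As a first try, the two expressions can be typed into a Python prompt or a Colab cell. This is only a small sketch: the helper `is_prime` is a hypothetical name added here for illustration (simple trial division), not something the assignment requires. Note that Python writes exponentiation as `**`; the `^` operator means bitwise XOR.

```python
# Simple arithmetic at the Python prompt.
print(2 + 2)          # 4

# 2 to the power 17, minus 1. Exponentiation is ** in Python.
mersenne = 2**17 - 1
print(mersenne)       # 131071


def is_prime(n):
    """Check primality by trial division (fine for small numbers)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


print(is_prime(mersenne))  # True
```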
If you need help getting started, there are many Python tutorials online, maybe “too” many. Find one or two that match your level and your programming experience.
Then, we want to know whether we can detect if a given text is written in English, Spanish, or Swedish. How?
Count some letter frequencies of some texts:
Get a few sample texts in each language from https://www.gutenberg.org/ (there are many other sources of text samples).
Count letter frequencies. Some starting points:
https://www.geeksforgeeks.org/python-frequency-of-each-character-in-string/
https://stackoverflow.com/questions/40985203/counting-letter-frequency-in-a-string-python
https://www.quora.com/How-do-you-count-the-frequency-of-a-letter-in-Python
(You are free to use any of this code, these texts, or anything else you find on the Web/Internet, as long as you clearly identify, in your code, where it came from and which parts you used.)
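One possible way to count letter frequencies is sketched below, using `collections.Counter` from the standard library. The function name `letter_frequencies` is an assumption made for this example. It lowercases the text, keeps only characters for which `str.isalpha()` is true (so accented letters such as ñ, å, ä and ö are counted too), and returns relative frequencies rather than raw counts, so that texts of different lengths can be compared.

```python
from collections import Counter


def letter_frequencies(text):
    """Return {letter: relative frequency} for the letters in text."""
    # Lowercase so 'A' and 'a' count as the same letter; skip digits,
    # spaces and punctuation.
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.items()}


freqs = letter_frequencies("Hello, World!")
# Show the letters from most to least frequent.
for letter, share in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(letter, round(share, 3))
```

For a real sample, the text would first be read from a downloaded file, for example with `open(path, encoding="utf-8").read()`, where `path` is whatever file name you saved the text under.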
Now, the interesting part – for a new, previously unseen text, which language is the “closest” match?
How will you do this “automatically” (in your program)?
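One possible approach, offered only as a sketch and not as the expected solution: build a letter-frequency profile for each language from the sample texts, build the same kind of profile for the unseen text, and pick the language whose profile is nearest. The example below uses Euclidean distance as the measure of "closeness"; that choice, the function names, and the three short sample sentences are all assumptions made for illustration. Real profiles should come from much longer texts.

```python
import math
from collections import Counter


def letter_frequencies(text):
    """Return {letter: relative frequency} for the letters in text."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.items()}


def distance(p, q):
    """Euclidean distance between two frequency profiles."""
    letters = set(p) | set(q)  # a letter missing from a profile counts as 0
    return math.sqrt(sum((p.get(c, 0.0) - q.get(c, 0.0)) ** 2 for c in letters))


def closest_language(text, profiles):
    """Return the name of the profile nearest to the text's frequencies."""
    freqs = letter_frequencies(text)
    return min(profiles, key=lambda lang: distance(freqs, profiles[lang]))


# Toy reference texts, made up for illustration only.
samples = {
    "english": "The quick brown fox jumps over the lazy dog and then the dog sleeps in the sun.",
    "spanish": "El rápido zorro marrón salta sobre el perro perezoso y luego el perro duerme al sol.",
    "swedish": "Den snabba bruna räven hoppar över den lata hunden och sedan sover hunden i solen.",
}
profiles = {lang: letter_frequencies(text) for lang, text in samples.items()}

unseen = "Hunden sover i solen och räven hoppar över den."
print(closest_language(unseen, profiles))
```

Other distance measures (for example, the sum of absolute differences, or cosine similarity) would fit the same structure; only the `distance` function would change.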