cm4770, Charlie Multerer

Part 1: Word Trends and N-grams




It was interesting to use this program to explore how word usage has changed over time. I am currently taking a course about international relations, POL 240, and we are covering politics after the events of 9/11. Because of this, I wanted to see how the use of words associated with this event changed after it occurred. The most significant finding from the first graph is that in 2001 the usage of the words "terrorism" and "terrorist" increased drastically. For this graph, I set the smoothing to 0 so that the year-to-year changes in each word's usage throughout the 2000s would be clearly visible.

For the second graph, I thought it would be interesting to see how often the names of technology companies and their products have been used over the past two decades. After iPhones and iPads were first introduced, their names became significantly more common. I also thought it was fascinating to see Google grow as Microsoft shrank. For this graph, I set the smoothing to 0 and made the search case-sensitive.
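To illustrate why a smoothing of 0 makes a sudden jump easier to see, here is a minimal sketch. It assumes the smoothing setting is a centered moving average, where each year is averaged with that many years on either side. The function name, the edge handling, and the sample values are my own assumptions for illustration and are not real N-gram data.

```python
def smooth(values, smoothing):
    """Centered moving average: each point is averaged with up to
    `smoothing` neighbors on either side (fewer at the edges)."""
    result = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)
        hi = min(len(values), i + smoothing + 1)
        window = values[lo:hi]
        result.append(sum(window) / len(window))
    return result

# Hypothetical relative frequencies for one word, 1999-2003 (not real data).
raw = [1.0, 1.0, 4.0, 3.0, 3.0]

print(smooth(raw, 0))  # smoothing 0 leaves the raw values unchanged
print(smooth(raw, 1))  # smoothing 1 spreads the 2001 jump over neighboring years
```

With a smoothing of 1, part of the 2001 jump is already visible in 2000, which would blur exactly the change I wanted to see.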

Part 2: Language Tools


Chosen Book: The Great Gatsby
Word Cloud:

Insightful Display:

Seeing a visualization of how often various names appear in the text, and where they appear, is insightful. This tool could help identify parts of the text by which characters are present.
Interesting Display:

I am not entirely sure what this display is doing, but it shows a map with various locations from the book. I find it extremely interesting that this program is able to locate places mentioned in the text and even connect them.

Part 3: Sentiment Analysis

Find two words in Sentimood's list that could be either positive or negative, depending on context and interpretation.

The word "killed" has a seriously negative connotation based on its literal definition. In another context, for example, "You killed it during the basketball game last night!", the word "killed" would mean that someone did something good.

Alternatively, the word "incredible" typically has a positive connotation. However, its definition is simply "impossible to believe". With this definition, something could be "incredibly frightening", which demonstrates a negative connotation.

Find two words where you think the weighting is seriously wrong.

The word "almost" does not have a positive or negative weighting in Sentimood. However, in my opinion, "almost" is one of the saddest words that exists. For example: "He was almost good enough", "She almost made it", "They almost lived".

Similarly, the word "respect" does not have a connotation in Sentimood. I believe that it should be weighted heavily in a positive manner. Respect pertains to admiration and kindness, both of which have positive connotations.
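The examples above can be sketched in code, assuming Sentimood works roughly like a word-list scorer that adds up a fixed weight for each listed word and ignores context. The weights below are hypothetical values chosen for illustration, not Sentimood's actual list, and the function name is my own.

```python
import re

# Hypothetical weights for illustration only; not Sentimood's actual values.
WEIGHTS = {"killed": -3, "incredible": 2, "good": 2}

def score(text):
    """Sum the weights of listed words; unlisted words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(WEIGHTS.get(word, 0) for word in words)

# Scores negative even though the sentence is praise.
print(score("You killed it during the basketball game last night!"))

# Scores 0 because none of the words are in the list.
print(score("She almost made it"))
```

A scorer like this cannot tell the two meanings of "killed" apart, and any word missing from the list, such as "almost" or "respect", contributes nothing to the total.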

Try some sentences from literature, your own writing, tweets, or whatever, with both Sentimood and the commercial analyzer. Give two examples of sentences where they agree and both appear to be correct.

Taken from a New York Times article, "‘Ghost Guns’: Firearm Kits Bought Online Fuel Epidemic of Violence They are untraceable, assembled from parts and can be ordered by gang members, felons and even children. They are increasingly the lethal weapon of easy access around the U.S., but especially California." Both Sentimood and the commercial analyzer agreed that this text was negative.

Another headline from the NYT, "Climate Promises Made in Glasgow Now Rest With a Handful of Powerful Leaders." Both Sentimood and the commercial analyzer agreed that this text was positive.

Give two examples where they differ markedly in their assessment.

A tweet from Joe Biden, "COVID-19 has disrupted supply chains around the world. Now, even in the midst of a historic economic recovery, Americans are facing prices that are just too high." Sentimood was unable to determine if this statement was positive or negative, whereas the commercial analyzer made the determination that it was very negative.

Another NYT headline, "A clash over culture and politics comes to World, a groundbreaking institution that covers evangelical Christians, our media columnist Ben Smith writes." The commercial analyzer noted that this text was without sentiment, while Sentimood claimed that it was negative.

Give two examples where they agree and both appear to be clearly wrong.

From NYT, "Faith Groups Push to Scrap Mandates in Biden’s Child Care Plan." Both Sentimood and the commercial analyzer agreed that this headline was positive because of the words 'faith' and 'care'. However, this headline is actually negative as it is addressing a dispute that has arisen within American politics.

Another tweet from Joe Biden, "For all of you at home who feel left behind and forgotten in an economy that is changing rapidly, this bill is for you." Both analyzers claimed that this statement was negative. It is actually positive because it is intended to give hope.

Part 4: Machine Translation


Google Translate

Input        | English 1                         | Spanish                              | English 2
Good Input 1 | "Don't judge a book by its cover" | "No juzgar un libro por su cubierta" | "Do not judge a book by its cover"
Good Input 2 | "Better late than never"          | "Mejor tarde que nunca"              | "Better late than never"
Bad Input 1  | "It's raining cats and dogs"      | "Llueve a cántaros"                  | "It's pouring down rain"
Bad Input 2  | "Burn the midnight oil"           | "Quemar las pestañas"                | "Burn eyelashes"

Microsoft Bing Translator

Input        | English 1                   | Spanish                       | English 2
Good Input 1 | "Cry over spilt milk"       | "Llorar por leche derramada"  | "Cry over spilt milk"
Good Input 2 | "The ball is in your court" | "la pelota está en tu cancha" | "The ball is in your court"
Bad Input 1  | "Back to the drawing board" | "Volver a empezar desde cero" | "Start over from scratch"
Bad Input 2  | "Steal someone's thunder"   | "Saludar con sombrero ajeno"  | "Greet with someone else's hat"

For the most part, both translators worked reasonably well. For each translator, there were two inputs for which the round trip did not return the original idiom. In both cases, one of those still got the message across ("It's pouring down rain", "Start over from scratch"), while the other came back as something that made no sense ("Burn eyelashes", "Greet with someone else's hat"). In practice, I would say that these translation tools are effective almost all of the time. As long as a person is not speaking in idioms, people can communicate using them. The main difference I noticed between the services was how user-friendly each one was. As expected, Google Translate had more features, and they worked better than those of Microsoft Bing Translator.
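One rough way to put a number on how well a phrase survived the round trip is to compare English 1 with English 2 character by character. The sketch below uses `difflib.SequenceMatcher` from Python's standard library. The helper names are my own, and this is not how either translation service evaluates its output.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and strip surrounding whitespace and trailing periods."""
    return text.strip().lower().rstrip(".")

def round_trip_similarity(original, back_translated):
    """Return a character-level similarity ratio between 0.0 and 1.0."""
    matcher = SequenceMatcher(None, normalize(original), normalize(back_translated))
    return matcher.ratio()

# (English 1, English 2) pairs copied from the tables above.
pairs = [
    ("Better late than never", "Better late than never"),
    ("Don't judge a book by its cover", "Do not judge a book by its cover"),
    ("Burn the midnight oil", "Burn eyelashes"),
    ("Steal someone's thunder", "Greet with someone else's hat"),
]

for original, back in pairs:
    print(f"{round_trip_similarity(original, back):.2f}  {original!r} -> {back!r}")
```

This only measures how similar the wording is, not the meaning. "It's raining cats and dogs" and "It's pouring down rain" would score low even though the message came through, so a person still has to judge whether a translation is actually good.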

Part 5: Machine Learning


Experiment 1

Experiment 2

For the first experiment, I took 135 samples of myself and 150 samples of my friend from my Spanish class. It worked quite well immediately. I did not need to take any more samples, as it was able to differentiate us right away. For the second experiment, I took 103 samples of myself without my hat on and 92 samples of myself with my hat on and my hood up. Again, these samples were sufficient, and the model worked without my needing to gather any more.

THE END