Deck 12: Natural Language Processing  

Question
Which of the following statements is false?

A) TextBlob is the fundamental class for NLP with the textblob module.
B) The following code creates a TextBlob containing two sentences:
from textblob import TextBlob
text = 'Today is a beautiful day. Tomorrow looks like bad weather.'
blob = TextBlob(text)
C) TextBlobs, Sentences and Words cannot be compared with strings.
D) Sentences, Words and TextBlobs inherit from BaseBlob, so they have many common methods and properties.
Question
Which of the following statements a), b) or c) is false?

A) One of the most common and valuable NLP tasks is sentiment analysis, which determines whether text is positive, neutral or negative.
B) Companies might use sentiment analysis to determine whether people are speaking positively or negatively online about their products.
C) A sentence that contains the word "good" has positive sentiment.
D) All of the above statements are true.
Question
Assuming you have a TextBlob named blob containing 'Today is a beautiful day. Tomorrow looks like bad weather.', what property should replace the ? in the following snippet to get the output shown below?
In [8]: blob.?
Out[8]: WordList(['Today', 'is', 'a', 'beautiful', 'day', 'Tomorrow', 'looks', 'like', 'bad', 'weather'])

A) word
B) wordlist
C) words
D) None of the above
Question
Which of the following statements is false?

A) An n-gram is a sequence of n text items, such as letters in words or words in a sentence. In natural language processing, n-grams can be used to identify letters or words that frequently appear adjacent to one another.
B) For text-based user input, n-grams can help predict the next letter or word a user will type, such as when completing items in IPython with tab-completion or when entering a message to a friend in your favorite smartphone messaging app. For speech-to-text, n-grams might be used to improve the quality of the transcription.
C) You can pass the keyword argument n to TextBlob's ngrams method to produce n-grams of any desired length.
D) The following code uses TextBlob's ngrams method to create all the trigrams from a sentence. The code is actually incorrect; it should have used the keyword argument n=3 to TextBlob's ngrams method:
In [1]: from textblob import TextBlob
In [2]: text = 'Today is a beautiful day. Tomorrow looks like bad weather.'
In [3]: blob = TextBlob(text)
In [4]: blob.ngrams()
Out[4]:
[WordList(['Today', 'is', 'a']),
WordList(['is', 'a', 'beautiful']),
WordList(['a', 'beautiful', 'day']),
WordList(['beautiful', 'day', 'Tomorrow']),
WordList(['day', 'Tomorrow', 'looks']),
WordList(['Tomorrow', 'looks', 'like']),
WordList(['looks', 'like', 'bad']),
WordList(['like', 'bad', 'weather'])]
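Study note: to make the idea of an n-gram concrete, here is a minimal pure-Python sketch that slides a window of n words across a word list. It is not TextBlob's implementation; the helper name word_ngrams and the crude whitespace tokenization are assumptions made for illustration only.

```python
def word_ngrams(words, n):
    """Return every run of n consecutive words as a list of lists."""
    # A window starting at index i covers words[i], ..., words[i + n - 1].
    return [words[i:i + n] for i in range(len(words) - n + 1)]


text = 'Today is a beautiful day. Tomorrow looks like bad weather.'
# Crude tokenization (an assumption): drop periods, split on whitespace.
words = text.replace('.', '').split()

trigrams = word_ngrams(words, 3)
bigrams = word_ngrams(words, 2)

print(trigrams[0])
print(len(trigrams), len(bigrams))
```

A list of w words yields w - n + 1 n-grams, so the ten words above give eight trigrams and nine bigrams.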
Question
Splitting text into meaningful units, such as words and numbers, is called ________.

A) inflectionization
B) tokenization
C) lemmatization
D) parts-of-speech tagging
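Study note: the splitting described in this card can be sketched with a regular expression. This is only an illustration under stated assumptions: the helper name split_into_units and the pattern (ASCII words with an optional apostrophe part, plus integers and decimals) are hypothetical choices, not how any particular NLP library does it.

```python
import re

# Words (optionally with an apostrophe part such as "It's") or numbers
# (integers or decimals). Punctuation and whitespace are discarded.
UNIT_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?")


def split_into_units(text):
    """Return the word and number units found in text, in order."""
    return UNIT_PATTERN.findall(text)


units = split_into_units("Today is a beautiful day. It's 72.5 degrees, up from 68.")
print(units)
```

A sentence-final period after a number is not swallowed, because the decimal part of the pattern requires at least one digit after the dot.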
Question
Which of the following statements is false?

A) The TextBlob library uses Google Translate to detect a text's language and translate TextBlobs, Sentences and Words into other languages.
B) The following code uses TextBlob's translate method to translate a TextBlob's text to Spanish ('es') then detect the language on the result. The to keyword argument specifies the target language.
In [38]: spanish = blob.translate(to='es')
In [39]: spanish
Out[39]: TextBlob("Hoy es un hermoso dia. Mañana parece mal tiempo.")
In [40]: spanish.detect_language()
Out[40]: 'es'
C) Calling TextBlob's translate method without arguments translates from the detected source language to English.
D) All of the above statements are true.
Question
Given the following Word object:
In [1]: from textblob import Word
In [2]: happy = Word('happy')
Which of the following statements a), b) or c) is false?

A) The TextBlob library uses the NLTK library's WordNet interface, enabling you to look up word definitions, and get synonyms and antonyms.
B) The Word class's definitions property returns a list of all the word's definitions in the WordNet database:
In [3]: happy.definitions
Out[3]:
['enjoying or showing or marked by joy or pleasure',
'marked by good fortune',
'eagerly disposed to act or to be of service',
'well expressed and to the point']
C) The Word class's define method enables you to pass a part of speech as an argument so you can get definitions matching only that part of speech.
D) All of the above statements are true.
Question
________ are sets of consecutive words in a corpus for use in identifying words that frequently appear adjacent to one another.

A) blobs
B) n-grams
C) stems
D) inflections
Question
In the field of NLP, a text collection is generally known as a ________.

A) corpus
B) compilation
C) book
D) volume
Question
Consider the following code:
In [18]: blob
Out[18]: TextBlob("Today is a beautiful day. Tomorrow looks like bad weather.")
In [19]: blob.sentiment
Out[19]: Sentiment(polarity=0.07500000000000007, subjectivity=0.8333333333333333)
Which of the following statements is false?

A) A TextBlob's sentiment property returns a Sentiment object indicating whether the text is positive or negative and whether it's objective or subjective.
B) The polarity indicates sentiment with a value from -1.0 (negative) to 1.0 (positive) with 0.0 being neutral.
C) The subjectivity is a value from 0.0 (objective) to 1.0 (subjective).
D) Based on the values for this TextBlob, the overall sentiment is close to neutral, and the text is mostly objective.
Question
Which of the following statements a), b) or c) is false?

A) TextBlobs, Sentences and Words all have a correct method that you can call to correct spelling.
B) Assuming word is a Word object containing 'theyr', calling correct on word returns the correctly spelled word that has the highest confidence value (as returned by spellcheck).
C) Calling correct on a TextBlob or Sentence checks the spelling of each word. For each incorrect word, correct replaces it with the correctly spelled one that has the highest confidence value:
In [6]: from textblob import TextBlob
In [7]: sentence = TextBlob('This sentense has missplled wrds.')
In [8]: sentence.correct()
Out[8]: TextBlob("This sentence has misspelled words.")
D) All of the above statements are true.
Question
Which of the following statements a), b) or c) is false?

A) Parts-of-speech (POS) tagging is the process of evaluating words based on their context to determine each word's part of speech.
B) There are eight primary English parts of speech: nouns, pronouns, verbs, adjectives, adverbs, prepositions, conjunctions and interjections (words that express emotion and that are typically followed by punctuation, like "Yes!" or "Ha!").
C) An important use of POS tagging is determining a word's meaning among its possibly many meanings; this is important for helping computers "understand" natural language.
D) All of the above statements are true.
Question
Which of the following is not a TextBlob capability?

A) Parts-of-speech (POS) tagging.
B) Language detection.
C) Sentiment analysis.
D) Similarity detection.
Question
Which of the following statements a), b) or c) is false?

A) You can get a Word's synsets (that is, its sets of synonyms) via the synsets property. The result of applying this property to a Word is a list of Synset objects:
In [4]: happy.synsets
Out[4]:
[Synset('happy.a.01'),
Synset('felicitous.s.02'),
Synset('glad.s.02'),
Synset('happy.s.04')]
B) Each Synset represents a group of synonyms. In the notation happy.a.01 from the code in part a) above:
• happy is the original Word's lemmatized form (in this case, it's the same).
• a is the part of speech, which can be a for adjective, n for noun, v for verb, r for adverb or s for adjective satellite.
• 01 is a 0-based index number. Many words have multiple meanings, and this is the index number of the corresponding meaning in the WordNet database.
C) There's also a get_synsets method that enables you to pass a part of speech as an argument so you can get Synsets matching only that part of speech.
D) All of the above statements are true.
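Study note: the three-part Synset names shown in this card can be taken apart with plain string handling. The sketch below is an illustration only; parse_synset_name and POS_NAMES are hypothetical names, the part-of-speech letters are the ones listed in the card, and the index is left as a string so that no assumption is made about its numbering.

```python
# Part-of-speech letters as listed in the card.
POS_NAMES = {
    'a': 'adjective',
    'n': 'noun',
    'v': 'verb',
    'r': 'adverb',
    's': 'adjective satellite',
}


def parse_synset_name(name):
    """Split a name like 'happy.a.01' into (lemma, part of speech, index)."""
    # rsplit from the right so a lemma that itself contains a dot stays whole.
    lemma, pos, index = name.rsplit('.', 2)
    return lemma, POS_NAMES[pos], index


for name in ['happy.a.01', 'felicitous.s.02', 'glad.s.02', 'happy.s.04']:
    print(parse_synset_name(name))
```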
Question
Which of the following statements is false?

A) Inflections are different forms of the same words, such as singular and plural (like "person" and "people") and different verb tenses (like "run" and "ran").
B) When you're calculating word frequencies, you might first want to convert all inflected words to the same form for more accurate word frequencies.
C) Words and WordLists each support converting words to their singular or plural forms.
D) The following code pluralizes a bunch of nouns:
In [6]: from textblob import TextBlob
In [7]: animals = TextBlob('dog cat fish bird').words
In [8]: animals.plural()
Out[8]: WordList(['dogs', 'cats', 'fish', 'birds'])
Question
Which of the following statements a), b) or c) is false?

A) You can use the open source wordcloud module's WordCloud class to generate word clouds with just a few lines of code. By default, wordcloud creates rectangular word clouds, but the library can create word clouds with arbitrary shapes.
B) To create a word cloud of a given shape, you can initialize a WordCloud object with an image known as a mask. The WordCloud fills non-white areas of the mask image with text.
C) The following code loads a mask image by using the imread function from the imageio module that comes with Anaconda:
import imageio
mask_image = imageio.imread('mask_heart.png')
This function returns the image as a NumPy array, which is required by WordCloud.
D) All of the above statements are true.
Question
Which of the following statements is false?

A) Stop words are common words like "a," "of," "is," "it," and the like that are often removed from text before analyzing it because they typically do not provide useful information. Before using NLTK's stop-words lists, you must download them, which you do with the nltk module's download function:
In [1]: import nltk
In [2]: nltk.download('stopwords')
B) The following code loads the 'english' stop words list:
In [3]: from nltk.corpus import stopwords
In [4]: stops = stopwords.words('english')
C) The following code creates a TextBlob from which we can remove stop words:
In [5]: from textblob import TextBlob
In [6]: blob = TextBlob('Today is a beautiful day.')
D) To remove the stop words, we use the TextBlob's words function in a list comprehension that adds each word to the resulting list only if the word is not in stops:
In [7]: [word for word in blob.words]
Out[7]: ['Today', 'beautiful', 'day']
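Study note: the filtering that statement D describes in words (keep a word only if it is not in stops) can be sketched without NLTK or TextBlob. Everything here is an assumption for illustration: the tiny stops set is a hypothetical stand-in for NLTK's English list, the word list is typed in by hand, and lowercasing before the membership test is a design choice of this sketch.

```python
# Hypothetical, tiny stand-in for NLTK's English stop-word list.
stops = {'a', 'an', 'the', 'is', 'it', 'of'}

# Hand-typed words for 'Today is a beautiful day.' (no library tokenization).
words = ['Today', 'is', 'a', 'beautiful', 'day']

# Keep each word only if its lowercase form is not a stop word.
kept = [word for word in words if word.lower() not in stops]
print(kept)
```

Using a set for stops keeps each membership test fast, and comparing lowercase forms means a capitalized stop word at the start of a sentence is filtered too.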
Question
Which of the following statements is false?

A) Stemming removes a prefix or suffix from a word leaving only a stem, which may or may not be a real word.
B) Lemmatization is similar to stemming, but factors in the word's part of speech and meaning and results in a real word.
C) Stemming and lemmatization are normalization operations, in which you prepare words for analysis.
D) Words support stemming and lemmatization via the methods stem and lemmatize. The following code correctly stems and lemmatizes a Word:
In [1]: from textblob import Word
In [2]: word = Word('varieties')
In [3]: word.stem()
Out[3]: 'variety'
In [4]: word.lemmatize()
Out[4]: 'varieti'
Question
Which of the following statements is false?

A) For many natural language processing tasks, it's important that the text be free of spelling errors.
B) You can check a Word's spelling with its spellcheck method, which returns a list of tuples containing possible correct spellings and a confidence value for each.
C) Let's assume we meant to type the word "they" but we misspelled it as "theyr." The spell checking results show two possible corrections with the word 'they' having the highest confidence value:
In [1]: from textblob import Word
In [2]: word = Word('theyr')
In [3]: %precision 2
Out[3]: '%.2f'
In [4]: word.spellcheck()
Out[4]: [('they', 0.57), ('their', 0.43)]
D) When using TextBlob's spellcheck method, the word with the highest confidence value will be the correct word for the given context.
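Study note: given a list of (word, confidence) tuples like the one in statement C, picking the candidate with the highest confidence value is a one-line max call. This sketch works on a hand-typed copy of that list and does not call TextBlob; the variable names are assumptions.

```python
# Hand-typed copy of the spellcheck output shown in statement C.
candidates = [('they', 0.57), ('their', 0.43)]

# Select the tuple whose second element (the confidence) is largest.
best_word, best_confidence = max(candidates, key=lambda pair: pair[1])
print(best_word, best_confidence)
```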
Question
Which of the following statements a), b) or c) is false?

A) Natural language lacks mathematical precision.
B) Nuances of meaning make natural language understanding difficult.
C) A text's meaning can be influenced by its context and the reader's "world view."
D) All of the above statements are true.
Question
Which of the following statements about WordCloud is false?

A) WordCloud's generate method receives the text to use in the word cloud as an argument and creates the word cloud, which it returns as a WordCloud object.
B) Before creating a word cloud, you should explicitly remove stop words from the text to get the best word cloud.
C) Method generate calculates the word frequencies for the remaining words after stop words are removed.
D) Method generate uses a maximum of 200 words in the word cloud by default, but you can customize this with the max_words keyword argument.
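Study note: the idea of counting word frequencies after dropping stop words, then keeping only the most frequent words, can be sketched with collections.Counter. This is not the wordcloud library's code; top_word_frequencies, the sample words and the tiny stop-word set are hypothetical, and the max_words default of 200 simply mirrors the figure quoted in statement D.

```python
from collections import Counter


def top_word_frequencies(words, stop_words, max_words=200):
    """Count lowercase words that are not stop words; keep the top max_words."""
    kept = [word.lower() for word in words if word.lower() not in stop_words]
    return Counter(kept).most_common(max_words)


# Hypothetical sample data for illustration.
sample_words = ['Romeo', 'Juliet', 'the', 'Romeo', 'and', 'Juliet', 'Romeo']
sample_stops = {'the', 'and'}

print(top_word_frequencies(sample_words, sample_stops))
print(top_word_frequencies(sample_words, sample_stops, max_words=1))
```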
Question
The following code creates and configures a WordCloud object:
from wordcloud import WordCloud
wordcloud = WordCloud(colormap='prism', mask=mask_image,
    background_color='white')
Which of the following statements is false?

A) The default WordCloud width and height in pixels is 400x200, unless you specify width and height keyword arguments or a mask image. The mask keyword argument specifies the mask image to use.
B) For a mask image, the WordCloud size is the image's size.
C) WordCloud uses Matplotlib under the hood. WordCloud assigns random colors from a color map. You can supply the colormap keyword argument and use one of Matplotlib's named color maps.
D) By default, the word cloud is drawn on a white background.
Answers
Question 1: C
Question 2: C
Question 3: C