Deck 11: Unsupervised Data Mining

Question
Which method uses the farthest distance between a pair of observations that do not belong to the same cluster?

A) The single linkage method
B) The complete linkage method
C) The centroid method
D) The average linkage method
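Note: the linkage options differ only in how they summarize the distances between observations in two different clusters. The sketch below is a minimal illustration, assuming Euclidean distance and two small made-up clusters; the function names and the points are hypothetical and not part of the deck.

```python
from itertools import product
from math import dist


def pairwise(cluster_a, cluster_b):
    """All Euclidean distances between a point in cluster_a and a point in cluster_b."""
    return [dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_linkage(cluster_a, cluster_b):
    # Nearest pair of observations from different clusters.
    return min(pairwise(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    # Farthest pair of observations from different clusters.
    return max(pairwise(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    # Mean distance over all cross-cluster pairs.
    distances = pairwise(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Hypothetical example data.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
```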
Question
In the following AGNES algorithm, what type of linkage distance method is being displayed?

A) The centroid method
B) The single linkage method
C) The complete linkage method
D) No linkage method is shown
Question
When using R for Agglomerative Clustering, the plot function is used to create the dendrogram as well as a banner plot. What function is used to split these results into distinct clusters?

A) aResult
B) data.frame
C) cutree
D) view
Question
The following results are a subset of a study on the demographics of a city population. Participants were asked to report whether they are male (1) or female (0), their current annual salary, and whether they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made?

A) Cluster 3 has the most observations of participants that grew up in the city.
B) Cluster 1 has a 48% male population with 49% being raised in the suburbs.
C) Cluster 4 has the highest average salary, with 67% of participants being female.
D) Cluster 2 has the most participants in the cluster with over half being raised in a city.
Question
The following results are a subset of a study on the demographics of a city population. Participants were asked to report whether they are male (1) or female (0), their current annual salary, and whether they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made?

A) Cluster 3 has the most observations of participants that grew up in the city.
B) Cluster 1 has a 48% male population with 49% being raised in the suburbs.
C) Cluster 4 has the highest average salary, with 53% of participants being female.
D) Cluster 2 has the most participants in the cluster with over half being raised in a city.
Question
The k-means clustering algorithm follows a general process. Which one of the following is not one of the steps in that process?

A) Specify the k value
B) Randomly assign k observations to its nearest cluster center
C) Calculate the cluster centroids
D) Reassign each observation to the nearest observation point
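Note: as a rough illustration of the general k-means process, the sketch below assumes the commonly described steps (choose k, pick k observations at random as initial centers, assign each observation to its nearest center, recompute the centroids, and repeat until assignments stop changing). The function name, the seed, and the data are hypothetical.

```python
import random
from math import dist


def k_means(points, k, seed=1, max_iter=100):
    """Minimal k-means sketch on tuples of numbers; returns (labels, centers)."""
    rng = random.Random(seed)
    # Randomly pick k observations as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each observation to its nearest cluster center.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop
        labels = new_labels
        # Recalculate each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical data: two visibly separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Because the initial centers are random, the cluster numbers (0 or 1) can swap between runs with different seeds even when the grouping is the same.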
Question
Using the following dendrogram created by using the k-means method, identify the value of k that was given to form the clusters present.

A) 5
B) 8
C) 9
D) 10
Question
The marketing department is examining the data pulled from the retail stores over the month of December. In this time period, three items are of interest, Sound Bars, LED under counter lights, and shelving units.

- In researching whether the third item is also purchased when two of the items are purchased, the confidence was calculated at 0.625, with an expected confidence of 0.20. Calculate the lift ratio.

A) 0.1250
B) 3.125
C) 0.3200
D) 0.425
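Note: the lift ratio is the confidence of a rule divided by its expected confidence. A minimal sketch of that arithmetic, using the figures from this card (the function name is only illustrative):

```python
def lift_ratio(confidence, expected_confidence):
    """Lift ratio = confidence / expected confidence."""
    return confidence / expected_confidence


lift = lift_ratio(0.625, 0.20)
print(round(lift, 3))  # 3.125
```

A lift ratio above 1 points to a positive association between antecedent and consequent, and a value below 1 to a negative one.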
Question
The marketing department is examining the data pulled from the retail stores over the month of December. In this time period, three items are of interest, Sound Bars, LED under counter lights, and shelving units.

- In researching whether the third item is also purchased when two of the items are purchased, the confidence was calculated at 0.575, with an expected confidence of 0.10. Calculate the lift ratio.

A) 0.0575
B) 5.75
C) 0.6389
D) 0.475
Question
Of 8,000 grocery store transactions, 795 have been identified as having coffee, ice cream, and chips as part of the same transaction. Calculate the support of the association rule.

A) 10.063
B) 0.0994
C) 7.95
D) 0.795
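Note: the support of a rule is the share of all transactions that contain every item in the rule. A small sketch with this card's counts (the function name is illustrative):

```python
def support(rule_transactions, total_transactions):
    """Support = transactions containing all items in the rule / all transactions."""
    return rule_transactions / total_transactions


s = support(795, 8000)
print(round(s, 4))  # 0.0994
```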
Question
Of 10,000 grocery store transactions, 895 have been identified as having coffee, ice cream, and chips as part of the same transaction. Calculate the support of the association rule.

A) 11.173
B) 0.0895
C) 8.95
D) 0.895
Question
Martin wants to understand the strength of association among toilet paper, milk, and eggs. Of 8,000 transactions, 1,710 include the antecedent, whereas 750 include both the antecedent and the consequent. Calculate the confidence.

A) 0.3075
B) 0.2138
C) 0.4386
D) 0.0938
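Note: confidence is measured against the transactions that contain the antecedent, not against all transactions. A minimal sketch with this card's counts (the function name is illustrative):

```python
def confidence(both_count, antecedent_count):
    """Confidence = transactions with antecedent and consequent / transactions with antecedent."""
    return both_count / antecedent_count


c = confidence(750, 1710)
print(round(c, 4))  # 0.4386
```

Note that the total of 8,000 transactions does not enter this calculation.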
Question
Martin wants to understand the strength of association among toilet paper, milk, and eggs. Of 5,000 transactions, 1,520 include the antecedent, whereas 690 include both the antecedent and the consequent. Calculate the confidence.

A) 0.4420
B) 0.3040
C) 0.4539
D) 0.1380
Question
Of 6,200 total transactions, 1,470 include the consequent, and the confidence equals 0.5276. Calculate the expected confidence.

A) 0.24
B) 2.37
C) 3,271
D) 0.7629
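Note: expected confidence is the share of all transactions that contain the consequent, so the stated confidence is not needed to compute it. A short sketch with this card's counts (the function name is illustrative):

```python
def expected_confidence(consequent_count, total_transactions):
    """Expected confidence = transactions with the consequent / all transactions."""
    return consequent_count / total_transactions


ec = expected_confidence(1470, 6200)
print(round(ec, 2))  # 0.24
```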
Question
Of 5,000 total transactions, 1,400 include the consequent, and the confidence equals 0.5203. Calculate the expected confidence.

A) 0.28
B) 2.80
C) 2,601
D) 0.6919
Question
Using the following transactions, what is the frequency distribution?

A) Latte-3%; Scone-5%; Muffin-7%; Egg-4%; Espresso-2%; Coffee-1%; Fruit Cup-3%; Cookie-2%
B) Latte-1/5; Scone-1/5; Muffin-1/7; Egg-1/4; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/3
C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-21
D) Latte-4; Scone-5; Muffin-7; Egg-4; Espresso-2; Coffee-2; Fruit Cup-3; Cookie-2
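Note: the transaction table for this card is not reproduced here, so the sketch below uses a small made-up set of transactions to show how an item frequency distribution and the support of an item pair can be tallied. The transactions and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical transactions; the card's own table is not available.
transactions = [
    {"Latte", "Scone"},
    {"Latte", "Muffin"},
    {"Espresso", "Muffin", "Cookie"},
    {"Latte", "Scone", "Muffin"},
]

# Frequency distribution: number of transactions that contain each item.
counts = Counter(item for t in transactions for item in t)
print(counts)

# Support of {Latte, Scone}: share of transactions containing both items.
pair = {"Latte", "Scone"}
pair_support = sum(pair <= t for t in transactions) / len(transactions)
print(pair_support)
```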
Question
Using the following transactions, what is the frequency distribution?

A) Latte-4%; Scone-4%; Muffin-7%; Egg-3%; Espresso-2%; Coffee-2%; Fruit Cup-3%; Cookie-2%
B) Latte-1/6; Scone-1/4; Muffin-1/7; Egg-1/3; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/2
C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-8
D) Latte-5; Scone-4; Muffin-7; Egg-3; Espresso-2; Coffee-3; Fruit Cup-3; Cookie-2
Question
Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule.

A) 0.25
B) 0.70
C) 0.30
D) 0.75
Question
Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule.

A) 0.25
B) 0.70
C) 0.30
D) 0.75
Question
Carmen pulled transactions from the month of March to see if there is an association between coffee and breakfast sandwich purchases. After running through confidence and expected confidence calculations, she calculated the lift ratio at 1.52. What association does a calculated lift ratio of 1.52 reflect?

A) A lift ratio of 1.52 is over 1, indicating a weak association between coffee and breakfast sandwiches.
B) 1.52 implies that identifying a person who purchases coffee and a breakfast sandwich is 52% better than guessing that a random person will purchase a breakfast sandwich.
C) 1.52 implies that 52% of consumers will purchase coffee and not a breakfast sandwich.
D) The lift ratio cannot determine the association between purchases.
Question
Using R, which function is used to conduct association rule analysis?

A) itemFrequency
B) apriori
C) inspect
D) read.transaction
Question
Eva is trying to figure out the total number of possible combinations for 130 inventory items. Calculate the number for Eva.

A) 3.80887E + 61 possible combinations
B) 1.126E + 124 possible combinations
C) 1.06112E + 62 possible combinations
D) 1.26780E + 251 possible combinations
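Note: the card does not say how "possible combinations" is counted. One common count of the association rules that can be formed from d items is 3^d - 2^(d+1) + 1, and the sketch below assumes that this is the intended quantity. The function name is illustrative.

```python
def possible_rules(d):
    """Number of association rules over d items, assuming the count 3^d - 2^(d+1) + 1."""
    return 3**d - 2 ** (d + 1) + 1


for d in (2, 3, 50, 130):
    # Scientific notation keeps the very large counts readable.
    print(d, f"{possible_rules(d):.5e}")
```

For two items the formula gives 2, which matches the two rules {a} => {b} and {b} => {a}.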
Question
Eva is trying to figure out the total number of possible combinations for 50 inventory items. Calculate the number for Eva.

A) 2.57689E + 23 possible combinations
B) 5.15378E + 47 possible combinations
C) 7.17898E + 23 possible combinations
D) 2.65614E + 98 possible combinations
Question
Martin wants to use Gower's coefficient to compute the distance for each variable and to convert it into a [0,1] scale. What analysis package will he use to run the analysis?

A) Analytic Solver only
B) Analytic Solver & R
C) R only
D) Excel
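Note: to show what a [0,1]-scaled distance per variable looks like, here is a minimal sketch of a Gower-style distance between two observations. It assumes equal weights, no missing values, range scaling for numerical variables, and simple matching for categorical ones. The function name, the variables, and the ranges are hypothetical.

```python
def gower_distance(x, y, ranges):
    """Average of per-variable distances, each on a [0, 1] scale.

    ranges[i] is the observed range (max - min) of numerical variable i,
    or None when variable i is categorical.
    """
    parts = []
    for xi, yi, r in zip(x, y, ranges):
        if r is None:
            # Categorical: 0 if the values match, 1 otherwise.
            parts.append(0.0 if xi == yi else 1.0)
        else:
            # Numerical: absolute difference scaled by the variable's range.
            parts.append(abs(xi - yi) / r)
    return sum(parts) / len(parts)


# Hypothetical observations: (age, salary, where raised).
a = (30, 50_000, "suburb")
b = (40, 80_000, "city")
ranges = (40, 100_000, None)

d_ab = gower_distance(a, b, ranges)
print(round(d_ab, 4))
```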
Question
Of the following options, which is not accurate for clustering?

A) Euclidean distance or Manhattan distance measures for numerical variables and matching.
B) AGNES takes each observation in the data initially and forms its own cluster.
C) Hierarchical clustering commonly follows agglomerative and divisive clustering.
D) Cluster analysis is where small amounts of data are organized against larger statistical sets.
Question
The use of error sum of squares to measure the loss of information that occurs when observations are clustered describes which method?

A) The centroid method
B) The complete linkage method
C) The average linkage method
D) Ward's method
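Note: the error sum of squares (ESS) of a cluster is the sum of squared distances from each observation to the cluster mean, and the information lost by a merge can be read as the increase in ESS. The sketch below illustrates that idea on made-up clusters; the function names and the data are hypothetical.

```python
def ess(cluster):
    """Error sum of squares: squared distances of observations to the cluster mean."""
    n = len(cluster)
    centroid = [sum(coords) / n for coords in zip(*cluster)]
    return sum(
        sum((value - m) ** 2 for value, m in zip(point, centroid))
        for point in cluster
    )


def merge_cost(cluster_a, cluster_b):
    """Increase in ESS caused by merging two clusters."""
    return ess(cluster_a + cluster_b) - ess(cluster_a) - ess(cluster_b)


# Hypothetical clusters.
a = [(1.0, 1.0), (2.0, 1.0)]
b = [(5.0, 4.0), (6.0, 4.0)]
c = [(3.0, 1.0)]

print(merge_cost(a, c))  # small increase: c sits next to a
print(merge_cost(a, b))  # large increase: b is far from a
```

Under this reading, the pair with the smallest increase in ESS would be merged first.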
Question
The k-means clustering algorithm can be summarized as all of the following except:

A) Can be either numerical or character variables.
B) Specify the k value.
C) Randomly assign k observations as cluster centers.
D) Assign each observation to its nearest cluster center.
Question
Tosh Marketing Group wants to identify customers who are likely to purchase high-end appliances for a new marketing campaign. After collecting data from recent customers, the following plot was created showing age and spend variables with 3 distinct clusters. Which cluster offers the best target with the highest spend for marketing materials?

A) The lower-left cluster reflects the younger age range and should be the target of the marketing.
B) The upper-middle cluster reflects a higher spend for appliances versus the lower-left and far-right clusters.
C) The far-right cluster is in the middle and the highest spend for appliances.
D) Both the upper-middle cluster and the lower-left cluster are the highest spends for appliances.
Question
In reviewing purchases at Costco on a given Saturday, 645 transactions out of 1,250 included toilet paper, detergent, and clothing or {toilet paper, detergent} => {clothing}. Calculate the support of the association rule.

A) 0.516
B) 1.516
C) 1,250
D) 6.45
Question
In reviewing purchases at Costco on a given Saturday, 385 transactions out of 1,000 included toilet paper, detergent, and clothing or {toilet paper, detergent} => {clothing}. Calculate the support of the association rule.

A) 0.385
B) 1.385
C) 1,000
D) 3.85
Question
Toilet paper and detergent are the antecedent, where 950 of the 6,500 transactions include both items. Of the overall 6,500 transactions, 126 include toilet paper, detergent, and the consequent of clothing. Using the association rule, what is the confidence?

A) 0.0194
B) 18.415
C) 0.146
D) 0.1326
Question
Toilet paper and detergent are the antecedent, where 890 of the 10,000 transactions include both items. Of the overall 10,000 transactions, 234 include toilet paper, detergent, and the consequent of clothing. Using the association rule, what is the confidence?

A) 0.0234
B) 20.826
C) 0.089
D) 0.2629
Question
Costco is known for lower prices for bulk items. After calculating the support and confidence values for {toilet paper} => {avocados}, there appears to be a strong association because a large percentage of customers purchase these items. However, rather than assuming the association is strong, what should be done to confirm its strength?

A) Review the transactions for accuracy
B) Rerun the confidence
C) Calculate the lift ratio
D) If the support and confidence are strong then no need to run anything more
Question
Based on the following table, what is the frequency distribution of the most purchased item?

A) bread
B) cheese
C) milk
D) yogurt
Question
Based on the following table, what is the frequency distribution of the most purchased item?

A) bread
B) cheese
C) milk
D) yogurt
Question
Based on the following table, what is the proportion of the transactions that include both milk and bread?

A) 0.80
B) 0.60
C) 0.20
D) 0.40
Question
Based on the following table, what is the proportion of the transactions that include both milk and bread?

A) 0.80
B) 0.60
C) 0.20
D) 0.40
Question
The Corner Market is using 1,750 transactions on item purchases for analysis. Based on initial results, the manager noticed eggs and potato chips were frequently in the same transactions. After calculating the confidence and the expected confidence on the data, 0.44 and 0.61 respectively, they want to run a lift ratio to ensure there is a positive association. Calculate the lift ratio and determine if the association is positive or negative.

A) 0.7213, positive number, positive association
B) 0.7213, under one, negative association
C) 1.386, over one, positive association
D) -1.386, negative number, negative association
Question
The Corner Market is using 2,500 transactions on item purchases for analysis. Based on initial results, the manager noticed eggs and potato chips were frequently in the same transactions. After calculating the confidence and the expected confidence on the data, 0.54 and 0.62 respectively, they want to run a lift ratio to ensure there is a positive association. Calculate the lift ratio and determine if the association is positive or negative.

A) 0.871, positive number, positive association
B) 0.871, under one, negative association
C) 1.148, over one, positive association
D) -1.148, negative number, negative association
Question
Sara is a marketing analysis manager for a top cereal producer. Part of her job is to review product sales to grocery chains and whether the contracted product shelf placement is the optimal location to maximize sales. Because of the large data sets, she is only interested in creating a small number of clusters to view the results. What type of clustering method would work best?

A) hierarchical clustering
B) agglomerative clustering
C) divisive clustering
D) k-means clustering
Question
Amazon uses searches and items purchased to create future product marketing recommendations. Additionally, demographics drive additional potential products to be recommended. To do this, what type of market basket analysis is used?

A) Information Rule
B) Supervised Data Analysis
C) Association Rule
D) k-mean
Question
Using R, what function is used to view the rules by their lift ratios?

A) lookup
B) apriori
C) rules
D) sort
Question
In cluster analysis, measures are used to form clusters. However, when large data sets are imported into R, sometimes the variables do not share the same format. To overcome this, you standardize the data using the _____ function.

A) score
B) scale
C) standard
D) dist
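Note: standardizing puts variables measured in different units on a comparable scale before distances are computed. The sketch below mirrors the usual default of centering on the mean and dividing by the sample standard deviation (n - 1 in the denominator); the function name and the data are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, n - 1 denominator
    return [(v - m) / s for v in values]


# Hypothetical salaries in thousands.
z = standardize([10, 20, 30])
print(z)
```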
Question
Forming internally homogeneous groups, where each group has a unique characteristic that differs from the other groups, is called cluster analysis.
Question
The most commonly used approach for hierarchical clustering is divisive clustering.
Question
When using k-means clustering, the number of clusters is specified at the end of the analysis to remove overlapping clusters.
Question
When evaluating large data sets, it is customary to cluster them using k-means, which reduces the computation of measures during each iteration compared to hierarchical clustering methods.
Question
When using R, after the data is imported, the set.seed function is used to set the random seed and the k function sets the k parameters to preselect the number of clusters.
Question
In understanding the association rules, it is best to think of them as an If-Then statement.
Question
Under the association rule, a lift ratio between 0 and 1 indicates a positive association.
Question
If-Then logical statements are constructed with the If portion being the consequent and the Then being the antecedent.
Question
Ward's method uses a different algorithm to minimize the dissimilarity within clusters by using the error sum of squares.
Question
A dendrogram allows for a visual inspection of the clustering results.
Deck 11: Unsupervised Data Mining
1
Which method uses the farthest distance between a pair of observations that do not belong to the same cluster?

A) The single linkage method
B) The complete linkage method
C) The centroid method
D) The average linkage method
Answer: B) The complete linkage method
2
In the following AGNES algorithm, what type of linkage distance method is being displayed?

A) The centroid method
B) The single linkage method
C) The complete linkage method
D) No linkage method is shown
Answer: B) The single linkage method
3
When using R for Agglomerative Clustering, the plot function is used to create the dendrogram as well as a banner plot. What function is used to split these results into distinct clusters?

A) aResult
B) data.frame
C) cutree
D) view
Answer: C) cutree
4
The following results are a subset of a study on the demographics of a city population. Participants were asked to respond if male (1) or female (0), current annual salary, and if they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made? <strong>The following results are a subset of a study on the demographics of a city population. Participants were asked to respond if male (1) or female (0), current annual salary, and if they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made?  </strong> A) Cluster 3 has the most observations of participants that grew up in the city. B) Cluster 1 has a 48% male population with 49% being raised in the suburbs. C) Cluster 4 has the highest average salary, with 67% of participants being female. D) Cluster 2 has the most participants in the cluster with over half being raised in a city.

A) Cluster 3 has the most observations of participants that grew up in the city.
B) Cluster 1 has a 48% male population with 49% being raised in the suburbs.
C) Cluster 4 has the highest average salary, with 67% of participants being female.
D) Cluster 2 has the most participants in the cluster with over half being raised in a city.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
5
The following results are a subset of a study on the demographics of a city population. Participants were asked to respond if male (1) or female (0), current annual salary, and if they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made? <strong>The following results are a subset of a study on the demographics of a city population. Participants were asked to respond if male (1) or female (0), current annual salary, and if they were raised in a suburb (1) or in a city (0). Based on the hierarchical clustering results, which of the following is not a valid observation that can be made?  </strong> A) Cluster 3 has the most observations of participants that grew up in the city. B) Cluster 1 has a 48% male population with 49% being raised in the suburbs. C) Cluster 4 has the highest average salary, with 53% of participants being female. D) Cluster 2 has the most participants in the cluster with over half being raised in a city.

A) Cluster 3 has the most observations of participants that grew up in the city.
B) Cluster 1 has a 48% male population with 49% being raised in the suburbs.
C) Cluster 4 has the highest average salary, with 53% of participants being female.
D) Cluster 2 has the most participants in the cluster with over half being raised in a city.
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
6
In the k-Means Clustering Method, there is a general process of how k-means clustering algorithm can be classified. Which one of the following is not one of the general processes?

A) Specify the k value
B) Randomly assign k observations to its nearest cluster center
C) Calculate the cluster centroids
D) Reassign each observation to the nearest observation point
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
7
Using the following dendrogram created by using the k-means method, identify the number k was given to form the clusters present. <strong>Using the following dendrogram created by using the k-means method, identify the number k was given to form the clusters present.  </strong> A) 5 B) 8 C) 9 D) 10

A) 5
B) 8
C) 9
D) 10
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
8
The marketing department is examining the data pulled from the retail stores over the month of December. In this time period, three items are of interest, Sound Bars, LED under counter lights, and shelving units.

- In researching if two of the items are purchased, if the third will be also, the following confidence level was calculated at 0.625, with an expected confidence of 0.20. Calculate the lift ratio.

A) 0.1250
B) 3.125
C) 0.3200
D) 0.425
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
9
The marketing department is examining the data pulled from the retail stores over the month of December. In this time period, three items are of interest, Sound Bars, LED under counter lights, and shelving units.

-In researching if two of the items are purchased, if the third will be also, the following confidence level was calculated at 0.575, with an expected confidence of 0.10. Calculate the lift ratio.

A) 0.0575
B) 5.75
C) 0.6389
D) 0.475
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
10
Of 8,000 grocery store transactions, 795 have been identified as having coffee, ice cream, and chips as part of the same transaction. Calculate the support of the association rule.

A) 10.063
B) 0.0994
C) 7.95
D) 0.795
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
11
Of 10,000 grocery store transactions, 895 have been identified as having coffee, ice cream, and chips as part of the same transaction. Calculate the support of the association rule.

A) 11.173
B) 0.0895
C) 8.95
D) 0.895
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
12
Martin wants to understand the strength of association among toilet paper, milk, and eggs. Of 8,000 transactions, the number of transactions including antecedent is 1,710 whereas the number of transactions including both antecedent and consequent transactions is 750. Calculate the confidence.

A) 0.3075
B) 0.2138
C) 0.4386
D) 0.0938
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
13
Martin wants to understand the strength of association among toilet paper, milk, and eggs. Of 5,000 transactions, the number of transactions including antecedent is 1,520 whereas the number of transactions including both antecedent and consequent transactions is 690. Calculate the confidence.

A) 0.4420
B) 0.3040
C) 0.4539
D) 0.1380
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
14
Of 6,200 total transactions, 1,470 transactions are the number of consequents, confidence equals 0.5276. Calculate the expected confidence.

A) 0.24
B) 2.37
C) 3,271
D) 0.7629
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
15
Of 5,000 total transactions, 1,400 transactions are the number of consequents, confidence equals 0.5203. Calculate the expected confidence.

A) 0.28
B) 2.80
C) 2,601
D) 0.6919
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
16
Using the following transactions, what is the frequency distribution? <strong>Using the following transactions, what is the frequency distribution?  </strong> A) Latte-3%; Scone-5%; Muffin-7%; Egg-4%; Espresso-2%; Coffee-1%; Fruit Cup-3%; Cookie-2% B) Latte-1/5; Scone-1/5; Muffin-1/7; Egg-1/4; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/3 C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-21 D) Latte-4; Scone-5; Muffin-7; Egg-4; Espresso-2; Coffee-2; Fruit Cup-3; Cookie-2

A) Latte-3%; Scone-5%; Muffin-7%; Egg-4%; Espresso-2%; Coffee-1%; Fruit Cup-3%; Cookie-2%
B) Latte-1/5; Scone-1/5; Muffin-1/7; Egg-1/4; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/3
C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-21
D) Latte-4; Scone-5; Muffin-7; Egg-4; Espresso-2; Coffee-2; Fruit Cup-3; Cookie-2
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
17
Using the following transactions, what is the frequency distribution? <strong>Using the following transactions, what is the frequency distribution?  </strong> A) Latte-4%; Scone-4%; Muffin-7%; Egg-3%; Espresso-2%; Coffee-2%; Fruit Cup-3%; Cookie-2% B) Latte-1/6; Scone-1/4; Muffin-1/7; Egg-1/3; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/2 C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-8 D) Latte-5; Scone-4; Muffin-7; Egg-3; Espresso-2; Coffee-3; Fruit Cup-3; Cookie-2

A) Latte-4%; Scone-4%; Muffin-7%; Egg-3%; Espresso-2%; Coffee-2%; Fruit Cup-3%; Cookie-2%
B) Latte-1/6; Scone-1/4; Muffin-1/7; Egg-1/3; Espresso-1/2; Coffee-1/2; Fruit Cup-1/3; Cookie-1/2
C) Latte-1; Scone-2; Muffin-4; Egg-3; Espresso-5; Coffee-6; Fruit Cup-7; Cookie-8
D) Latte-5; Scone-4; Muffin-7; Egg-3; Espresso-2; Coffee-3; Fruit Cup-3; Cookie-2
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
18
Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule. <strong>Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule.  </strong> A) 0.25 B) 0.70 C) 0.30 D) 0.75

A) 0.25
B) 0.70
C) 0.30
D) 0.75
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
19
Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule. <strong>Use the proportion of the following transactions that contain both Latte and Scone and calculate the Support of the association rule.  </strong> A) 0.25 B) 0.70 C) 0.30 D) 0.75

A) 0.25
B) 0.70
C) 0.30
D) 0.75
فتح الحزمة
افتح القفل للوصول البطاقات البالغ عددها 53 في هذه المجموعة.
فتح الحزمة
k this deck
20
Carmen pulled transactions from the month of March to see if there is an association between coffee and breakfast sandwich purchases. After running through confidence and expected confidence calculations, she calculated the lift ratio at 1.52. What association does a calculated lift ratio of 1.52 reflect?

A) A lift ratio of 1.52 is over 1, indicating a weak association between coffee and breakfast sandwiches.
B) 1.52 implies that identifying a person who purchases coffee and a breakfast sandwich is 52% better than guessing that a random person will purchase a breakfast sandwich.
C) 1.52 implies that 52% of consumers will purchase coffee and not a breakfast sandwich.
D) The lift ratio cannot determine the association between purchases.
21
Using R, which function is used to conduct association rule analysis?

A) itemFrequency
B) apriori
C) inspect
D) read.transaction
22
Eva is trying to figure out the total number of possible combinations for 130 inventory items. Calculate the number for Eva.

A) 3.80887E + 61 possible combinations
B) 1.126E + 124 possible combinations
C) 1.06112E + 62 possible combinations
D) 1.26780E + 251 possible combinations
23
Eva is trying to figure out the total number of possible combinations for 50 inventory items. Calculate the number for Eva.

A) 2.57689E + 23 possible combinations
B) 5.15378E + 47 possible combinations
C) 7.17898E + 23 possible combinations
D) 2.65614E + 98 possible combinations
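A rough sketch of the arithmetic behind the two item-count questions above. It assumes that "possible combinations" means the number of possible association rules for p items, for which a commonly cited count is 3^p - 2^(p+1) + 1. Both that reading of the question and the function name `possible_rules` are assumptions, not something the deck states.

```python
def possible_rules(p: int) -> int:
    """Count association rules for p items.

    Assumes the formula 3^p - 2^(p+1) + 1; this is an interpretation
    of the question, not a formula given in the deck.
    """
    return 3**p - 2 ** (p + 1) + 1


# Print the counts in scientific notation for the two inventory sizes.
for items in (50, 130):
    print(items, f"{possible_rules(items):.5E}")
```

With only two items the formula gives 2 rules (A => B and B => A), which is a quick sanity check on the assumed count.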
24
Martin wants to use Gower's coefficient to compute the distance for each variable and to convert it into a [0,1] scale. What analysis package will he use to run the analysis?

A) Analytic Solver only
B) Analytic Solver & R
C) R only
D) Excel
25
Of the following options, which is not accurate for clustering?

A) Euclidean distance or Manhattan distance measures for numerical variables and matching.
B) AGNES takes each observation in the data initially and forms its own cluster.
C) Hierarchical clustering commonly follows agglomerative and divisive clustering.
D) Cluster analysis is where small amounts of data are organized against larger statistical sets.
26
The use of error sum of squares to measure the loss of information that occurs when observations are clustered describes which method?

A) The centroid method
B) The complete linkage method
C) The average linkage method
D) Ward's method
27
The k-means clustering algorithm can be summarized by all of the following except:

A) Can be either numerical or character variables.
B) Specify the k value.
C) Randomly assign k observations as cluster centers.
D) Assign each observation to its nearest cluster center.
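The steps named in the options above (specify k, pick k observations at random as initial centers, assign each observation to its nearest center) can be sketched in plain Python. This is a minimal illustration, not the deck's R workflow: the function name `kmeans`, the `seed` and `max_iter` parameters, and the squared Euclidean distance are assumptions, and the center-update and stopping steps are the usual remaining steps of the algorithm rather than anything listed on the card.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster numeric tuples into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # Randomly pick k distinct observations as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each observation to its nearest center
        # (squared Euclidean distance, first center wins ties).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of the observations assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


points = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels)
```

The distance computation is the reason the inputs here are numeric: a character variable has no mean and no Euclidean distance.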
28
Tosh Marketing Group wants to identify customers who are likely to purchase high-end appliances for a new marketing campaign. After collecting data from recent customers, the following plot was created showing age and spend variables with 3 distinct clusters. Which cluster offers the best target with the highest spend for marketing materials?

A) The lower-left cluster reflects the younger age range and should be the target of the marketing.
B) The upper-middle cluster reflects a higher spend for appliances versus the lower-left and far-right clusters.
C) The far-right cluster is in the middle and the highest spend for appliances.
D) Both the upper-middle cluster and the lower-left cluster are the highest spends for appliances.
29
In reviewing purchases at Costco on a given Saturday, 645 transactions out of 1,250 included toilet paper, detergent, and clothing, that is, {toilet paper, detergent} => {clothing}. Calculate the support of the association rule.

A) 0.516
B) 1.516
C) 1,250
D) 6.45
30
In reviewing purchases at Costco on a given Saturday, 385 transactions out of 1,000 included toilet paper, detergent, and clothing, that is, {toilet paper, detergent} => {clothing}. Calculate the support of the association rule.

A) 0.385
B) 1.385
C) 1,000
D) 3.85
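As a small worked sketch of the support calculation in the two Costco questions above: support is the share of all transactions that contain every item in the rule (antecedent and consequent together). The function name `support` is a hypothetical helper used only for illustration.

```python
def support(rule_count: int, total_transactions: int) -> float:
    """Share of all transactions that contain every item in the rule."""
    return rule_count / total_transactions


print(support(645, 1250))  # 645 of 1,250 transactions
print(support(385, 1000))  # 385 of 1,000 transactions
```

Because support is a proportion, any candidate value above 1 can be ruled out without calculation.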
31
Toilet paper and detergent are the antecedent, where 950 of the 6,500 transactions include both items. Of the overall 6,500 transactions, 126 include toilet paper, detergent, and the consequent of clothing. Using the association rule, what is the confidence?

A) 0.0194
B) 18.415
C) 0.146
D) 0.1326
32
Toilet paper and detergent are the antecedent, where 890 of the 10,000 transactions include both items. Of the overall 10,000 transactions, 234 include toilet paper, detergent, and the consequent of clothing. Using the association rule, what is the confidence?

A) 0.0234
B) 20.826
C) 0.089
D) 0.2629
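The confidence of a rule divides the number of transactions containing both the antecedent and the consequent by the number containing the antecedent. A minimal sketch with the counts from the two questions above follows; the function name `confidence` and its parameter names are hypothetical.

```python
def confidence(both_count: int, antecedent_count: int) -> float:
    """Share of antecedent transactions that also contain the consequent."""
    return both_count / antecedent_count


# The total number of transactions cancels out, so only the two counts matter.
print(round(confidence(126, 950), 4))
print(round(confidence(234, 890), 4))
```

Dividing the rule count by the total number of transactions instead would give the support, not the confidence.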
33
Costco is known for lower prices for bulk items. After calculating the support and confidence values for {toilet paper} => {avocados}, there appears to be a strong association because a large percentage of customers purchase these items. However, to avoid assuming the strength of association, what option should be done to confirm the strength of the association?

A) Review the transactions for accuracy
B) Rerun the confidence
C) Calculate the lift ratio
D) If the support and confidence are strong then no need to run anything more
34
Based on the following table, what is the frequency distribution of the most purchased item?

A) bread
B) cheese
C) milk
D) yogurt
35
Based on the following table, what is the frequency distribution of the most purchased item?

A) bread
B) cheese
C) milk
D) yogurt
36
Based on the following table, what is the proportion of the transactions that include both milk and bread?

A) 0.80
B) 0.60
C) 0.20
D) 0.40
37
Based on the following table, what is the proportion of the transactions that include both milk and bread?

A) 0.80
B) 0.60
C) 0.20
D) 0.40
38
The Corner Market is using 1,750 transactions on item purchases for analysis. Based on initial results, the manager noticed eggs and potato chips were frequently in the same transactions. After calculating the confidence and the expected confidence on the data, 0.44 and 0.61 respectively, they want to run a lift ratio to ensure there is a positive association. Calculate the lift ratio and determine if the association is positive or negative.

A) 0.7213, positive number, positive association
B) 0.7213, under one, negative association
C) 1.386, over one, positive association
D) -1.386, negative number, negative association
39
The Corner Market is using 2,500 transactions on item purchases for analysis. Based on initial results, the manager noticed eggs and potato chips were frequently in the same transactions. After calculating the confidence and the expected confidence on the data, 0.54 and 0.62 respectively, they want to run a lift ratio to ensure there is a positive association. Calculate the lift ratio and determine if the association is positive or negative.

A) 0.871, positive number, positive association
B) 0.871, under one, negative association
C) 1.148, over one, positive association
D) -1.148, negative number, negative association
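The lift ratio in the two Corner Market questions above is the confidence divided by the expected confidence; a value above 1 points to a positive association and a value below 1 to a negative one. The sketch below uses the figures from those questions. The helper names `lift_ratio` and `describe` are assumptions made for illustration.

```python
def lift_ratio(confidence: float, expected_confidence: float) -> float:
    """Lift = confidence / expected confidence."""
    return confidence / expected_confidence


def describe(lift: float) -> str:
    """Read the lift relative to 1, the value expected under independence."""
    if lift > 1:
        return "positive association"
    if lift < 1:
        return "negative association"
    return "no association"


for conf, expected in ((0.44, 0.61), (0.54, 0.62)):
    lift = lift_ratio(conf, expected)
    print(round(lift, 4), describe(lift))
```

Since both inputs are proportions, the lift ratio cannot be negative; the sign of the association is read from whether the ratio falls above or below 1.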
40
Sara is a marketing analysis manager for a top cereal producer. Part of her job is to review product sales to grocery chains and if the contracted product shelf placement is the optimal location to maximize sales. Because of the large data sets, she is only interested in creating a small number of clusters to view the results. What type of clustering method would work best?

A) hierarchical clustering
B) agglomerative clustering
C) divisive clustering
D) k-means clustering
41
Amazon uses searches and items purchased to create future product marketing recommendations. Additionally, demographics drive additional potential products to be recommended. To do this, what type of market basket analysis is used?

A) Information Rule
B) Supervised Data Analysis
C) Association Rule
D) k-mean
42
Using R, what function is used to view the rules by their lift ratios?

A) lookup
B) apriori
C) rules
D) sort
43
In cluster analysis, measures are used to form clusters. However, when large data sets are imported into R, sometimes the variables do not share the same format. To overcome this, you standardize the data using the _____ function.

A) score
B) scale
C) standard
D) dist
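To illustrate what standardizing does before distances are computed, here is a minimal Python sketch of z-score standardization: subtract the mean, then divide by the sample standard deviation. The function name `standardize` and the sample values are hypothetical, and this is an analogue written for illustration, not the R function itself.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical values on very different scales.
ages = [25.0, 35.0, 45.0]
salaries = [40000.0, 60000.0, 80000.0]
print(standardize(ages))
print(standardize(salaries))
```

After standardizing, both variables are on a comparable scale, so neither dominates a distance measure simply because of its units.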
44
Forming observations into internally homogeneous groups, where each group has unique characteristics that differ from those of other groups, is called cluster analysis.
45
The most commonly used approach for hierarchical clustering is divisive clustering.
46
When using k-means clustering, the number of clusters is specified at the end of the analysis to remove overlapping clusters.
47
When evaluating large data sets, it is customary to use k-means clustering, which reduces the computation of measures during each iteration compared to hierarchical clustering methods.
48
When using R, after the data is imported, the set.seed function is used to set the random seed and the k function sets the k parameter to preselect the number of clusters.
49
In understanding the association rules, it is best to think of them as an If-Then statement.
50
Under the association rule, a lift ratio between 0 and 1 indicates a positive association.
51
If-Then logical statements are constructed with the If portion being the consequent and the Then being the antecedent.
52
Ward's method uses a different algorithm to minimize the dissimilarity within clusters by using the error sum of squares.
53
A dendrogram allows for a visual inspection of the clustering results.