# NPTEL Data Science for Engineers Assignment 8 Answers 2023

Hello NPTEL learners. In this article you will find the NPTEL Data Science for Engineers Assignment 8 (Week 8) answers for 2023. The answers are provided below as a reference, so don't go straight to the solutions. Try to solve the questions yourself first, and look at the solutions only if you get stuck.

## NPTEL Data Science for Engineers Assignment 8 Answers 2023:

Consider the dataset “USArrests.csv”. Answer questions 1 to 4 based on the information given below. This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percentage of the population living in urban areas.

• Set the column “States” as index of the data frame while reading the data
• Set the random number generator to set.seed(123)
• Normalize the data using scale function and build the K-means algorithm with the given conditions:
  – number of clusters = 4
  – nstart = 20
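As a rough illustration of what this setup does, here is a minimal Python sketch (the assignment itself is solved in R). It standardises each column the way R's scale() does by default, using the sample standard deviation (n - 1), and then runs Lloyd's k-means from several random starts, keeping the run with the lowest total within-cluster sum of squares, which is the idea behind nstart. The function names and the toy data are made up for illustration. R's kmeans uses the Hartigan-Wong algorithm by default and its own random number stream, so this sketch is not expected to reproduce the numbers for USArrests.csv.

```python
import math
import random


def scale_columns(rows):
    """Z-score each column using the sample standard deviation (n - 1)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm started from k randomly chosen points."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster goes empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    withinss = [sum(sq_dist(p, centers[j])
                    for p, lab in zip(points, labels) if lab == j)
                for j in range(k)]
    return labels, centers, withinss


def kmeans(points, k, nstart=20, seed=123):
    """Run lloyd_kmeans nstart times and keep the lowest total withinss."""
    rng = random.Random(seed)
    best = None
    for _ in range(nstart):
        result = lloyd_kmeans(points, k, rng)
        if best is None or sum(result[2]) < sum(best[2]):
            best = result
    return best


# Hypothetical toy data (two columns on very different scales), not USArrests.
toy = [[1.0, 200.0], [1.2, 210.0], [0.8, 190.0], [1.1, 205.0],
       [5.0, 900.0], [5.2, 950.0], [4.8, 880.0], [5.1, 920.0]]

scaled = scale_columns(toy)
labels, centers, withinss = kmeans(scaled, k=2, nstart=20, seed=123)
sizes = [labels.count(j) for j in range(2)]
print("cluster sizes:", sizes)
print("within-cluster sum of squares:", withinss)
```

The per-cluster withinss and sizes computed here correspond to the quantities asked about in Q.1 and Q.2. Scaling first matters because otherwise the column with the largest numeric range would dominate the distances.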

#### Q.1. According to the built model, the within cluster sum of squares for each cluster is __ (the order of values in each option could be different):-

• a. 8.316061 11.952463 16.212213 19.922437
• b. 7.453059 12.158682 13.212213 21.158766
• c. 8.316061 13.952463 15.212213 19.922437
• d. None of the above

#### Q.2. According to the built model, the size of each cluster is _ (the order of values in each option could be different):-

• a. 13 13 7 14
• b. 11 18 25 24
• c. 8 13 16 13
• d. None of the above

#### Q.3. The Between Cluster Sum-of-Squares (BCSS) value of the built K-means model is _ (Choose the appropriate range)

• a. 100 – 200
• b. 200 – 300
• c. 300 – 350
• d. None of the above

#### Q.4. The Total Sum-of-Squares value of the built k-means model is _ (Choose the appropriate range)

• a. 100 – 200
• b. 200 – 300
• c. 300 – 350
• d. None of the above
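
For Q.3 and Q.4 it helps to remember that the total sum of squares splits into the within-cluster part plus the between-cluster part (R's kmeans object reports these as totss, withinss / tot.withinss and betweenss). After standardising with the sample standard deviation, every column has a sum of squares of n - 1, so the total is (n - 1) × p. The Python sketch below checks both facts on made-up numbers with the same shape as the data described above (50 rows, 4 columns). The values and the grouping are hypothetical, and the grouping is an arbitrary partition rather than a k-means fit.

```python
import math
import random


def column_means(rows):
    """Mean of each column."""
    return [sum(c) / len(rows) for c in zip(*rows)]


def sum_sq_about(rows, centre):
    """Sum of squared deviations of all values from the given centre."""
    return sum((v - m) ** 2 for row in rows for v, m in zip(row, centre))


rng = random.Random(0)
n, p = 50, 4  # 50 observations, 4 variables (assumed shape)
raw = [[rng.gauss(10 * (j + 1), j + 1) for j in range(p)] for _ in range(n)]

# Standardise each column with the sample standard deviation (n - 1).
means = column_means(raw)
sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in raw) / (n - 1))
       for j in range(p)]
scaled = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in raw]

grand = column_means(scaled)
total_ss = sum_sq_about(scaled, grand)

# Arbitrary partition into 4 groups (hypothetical labels).
labels = [i % 4 for i in range(n)]
groups = [[row for row, lab in zip(scaled, labels) if lab == g]
          for g in range(4)]
withinss = [sum_sq_about(g, column_means(g)) for g in groups]
between_ss = sum(
    len(g) * sum((c - m) ** 2 for c, m in zip(column_means(g), grand))
    for g in groups
)

print("total SS:", round(total_ss, 6))            # (n - 1) * p = 196
print("within + between:", round(sum(withinss) + between_ss, 6))
```

So, assuming the CSV holds the usual 50 × 4 numeric table and scale() is applied with its defaults, the total sum of squares should come to 49 × 4 = 196, and the between-cluster value is that total minus the sum of the within-cluster values.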

#### Q.5. Which of the following statements is INCORRECT about the KNN algorithm?

• a. KNN works ONLY for binary classification problems
• b. If k=1, then the algorithm is simply called the nearest neighbour algorithm
• c. Number of neighbours (K) will influence classification output
• d. None of the above
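
To make the options above concrete, here is a small Python sketch of a KNN classifier using a majority vote over the k nearest training points. The training points, labels and function name are invented for illustration. It uses three classes, and the same query point gets a different label for k = 1 and k = 3.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set with three classes.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
    ((3.0, 3.0), "B"),  # a lone B point sitting closer to the A group
    ((9.0, 1.0), "C"), ((9.1, 1.2), "C"), ((8.8, 0.9), "C"),
]

print(knn_predict(train, (8.5, 1.5), k=1))  # C
print(knn_predict(train, (2.6, 2.6), k=1))  # B (nearest neighbour only)
print(knn_predict(train, (2.6, 2.6), k=3))  # A (two of three neighbours are A)
```

With k = 1 the prediction is simply the label of the single nearest neighbour, and nothing in the voting step restricts the number of classes to two.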

#### Q.6. K means clustering algorithm clusters the data points based on:-

• a. dependent and independent variables
• b. the eigen values
• c. distance between the points and a cluster centre
• d. None of the above

#### Q.7. The method / metric which is NOT useful to determine the optimal number of clusters in unsupervised clustering algorithms is

• a. Scatter plot
• b. Elbow method
• c. Dendrogram
• d. None of the above

#### Q.8. The unsupervised learning algorithm which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest centroid is

• a. Hierarchical clustering
• b. K-means clustering
• c. KNN
• d. None of the above

Disclaimer: These answers are provided for discussion purposes only, so please don't blame us if any answer turns out to be wrong. If you have any doubts or suggestions regarding any question, kindly comment. The solutions are provided by Chase2learn. This tutorial is for discussion and learning purposes only.

#### About NPTEL Data Science for Engineers Course:

Learning Objectives :

1. Introduce R as a programming language
2. Introduce the mathematical foundations required for data science
3. Introduce the first level data science algorithms
4. Introduce a data analytics problem solving framework
5. Introduce a practical capstone case study

Learning Outcomes:

1. Describe a flow process for data science problems (Remembering)
2. Classify data science problems into standard typology (Comprehension)
3. Develop R codes for data science solutions (Application)
4. Correlate results to the solution approach followed (Analysis)
5. Assess the solution approach (Evaluation)
6. Construct use cases to validate approach and identify modifications required (Creating)
##### Course Outcome:
• Week 1: Course philosophy and introduction to R
• Week 2: Linear algebra for data science
  1. Algebraic view – vectors, matrices, product of matrix & vector, rank, null space, solution of over-determined set of equations and pseudo-inverse
  2. Geometric view – vectors, distance, projections, eigenvalue decomposition
• Week 3: Statistics (descriptive statistics, notion of probability, distributions, mean, variance, covariance, covariance matrix, understanding univariate and multivariate normal distributions, introduction to hypothesis testing, confidence interval for estimates)
• Week 4: Optimization
• Week 5:
  1. Optimization
  2. Typology of data science problems and a solution framework
• Week 6:
  1. Simple linear regression and verifying assumptions used in linear regression
  2. Multivariate linear regression, model assessment, assessing importance of different variables, subset selection
• Week 7: Classification using logistic regression
• Week 8: Classification using kNN and k-means clustering
###### CRITERIA TO GET A CERTIFICATE:

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
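
The scoring rule above can be written out as a short Python sketch. The function name and the sample marks are made up. The inputs are assumed to be the average of the counted assignments and the exam mark, both on a 0 to 100 scale.

```python
def final_score(avg_assignment_pct, exam_pct):
    """Return (final score out of 100, certificate eligibility).

    avg_assignment_pct: average of the counted assignments, 0 to 100.
    exam_pct: proctored exam score, 0 to 100.
    """
    assignment_part = 0.25 * avg_assignment_pct  # out of 25
    exam_part = 0.75 * exam_pct                  # out of 75
    final = assignment_part + exam_part
    # Both thresholds must be met; the final score alone is not enough.
    eligible = assignment_part >= 10 and exam_part >= 30
    return final, eligible


print(final_score(80, 60))  # (65.0, True)
print(final_score(30, 60))  # (52.5, False): assignment part is only 7.5/25
```

The second call shows the case mentioned above: the final score is well over 40, but the assignment component is below 10/25, so no certificate is awarded.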

If you have not registered for the exam, kindly register through https://examform.nptel.ac.in/