Topic Outline
Regression Analysis
Download the code from this link: https://drive.google.com/file/d/1O2HzFSRaEDHokzTdXdUtcVa4G-aH6W8E/view?usp=sharing
A working professional who has just started his career wanted to purchase a used car for commuting between home and office. He believed that the number of kilometers the cars were driven influenced the purchase price of the car. He visited various used car showrooms and collected the data for kilometers driven and the price of the car. The data are as follows:
| Price (in Lakh Rupees) | Kilometer (in Thousand) | Price (in Lakh Rupees) | Kilometer (in Thousand) |
|---|---|---|---|
| 2.59 | 36.6 | 7.6 | 36.4 |
| 5.02 | 20.4 | 8.2 | 28.2 |
| 7.5 | 31.6 | 3.2 | 45.3 |
| 2.92 | 45.9 | 3.5 | 38.3 |
| 8.55 | 10.8 | 5.74 | 35.2 |
a. Calculate the least squares equation that can be used to predict the price of the car.
b. The professional would like to purchase a car that was driven 25,000 km. How much is the predicted price of the car?
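Parts (a) and (b) can be worked through with the closed-form least-squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The sketch below is one way to do this in plain Python. It assumes each price is paired with the kilometer value listed next to it in the data, and the names `data` and `least_squares` are illustrative only.

```python
# Each tuple is (kilometers in thousands, price in lakh rupees).
# Assumption: every price is paired with the kilometer value listed next to it.
data = [
    (36.6, 2.59), (36.4, 7.60), (20.4, 5.02), (28.2, 8.20), (31.6, 7.50),
    (45.3, 3.20), (45.9, 2.92), (38.3, 3.50), (10.8, 8.55), (35.2, 5.74),
]


def least_squares(pairs):
    """Return (intercept, slope) of the simple least-squares line y = a + b*x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sxy and Sxx are the centered cross-product and squared-deviation sums.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares(data)

# Part (b): 25,000 km is 25 in the table's units (thousands of km).
predicted = intercept + slope * 25

print(f"price = {intercept:.4f} + ({slope:.4f}) * km")
print(f"predicted price at 25 thousand km: {predicted:.2f} lakh rupees")
```

With this pairing the fitted line is roughly price = 10.25 - 0.145 * km, which puts the predicted price at 25 thousand km at about 6.62 lakh rupees. The negative slope matches the expectation that cars driven further sell for less.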
Logistic Regression
- PPT - https://www.dropbox.com/scl/fi/qdv1u57n9a4v5nxmi6g3c/Lecture-13-Logistic-regression.pptx?rlkey=s7kjqizt22q1a8acuoxafe2rt&dl=0
- Note - https://www.dropbox.com/scl/fi/u7a4cesvix4x5em61lnfj/Logistics-Regression-notes.pdf?rlkey=frzm0g3fb015mjcjm9loemxjg&dl=0
- Code - https://drive.google.com/file/d/1yGyuKrCFMLuk7qrdWyHI_tJnlFjrYZv9/view?usp=sharing
a) Import the data from the following link: https://stats.idre.ucla.edu/stat/data/binary.csv
b) Present the summary of the data.
c) Now fit a logistic regression to find the relationship between the variables that affect a student's admission to an institute.
d) Based on a student's GRE score, find the GPA obtained and the rank of the student.
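To make the logistic regression step more concrete, the sketch below fits a one-predictor logistic model by batch gradient ascent on the log-likelihood, using only the standard library. The toy values are hypothetical (a standardized score and a 0/1 outcome), not rows from binary.csv, and `fit_logistic` and `predict_proba` are illustrative names rather than any library's API.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Probability of the positive class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights by batch gradient ascent on the average log-likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)  # gradient term for one row
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


# Hypothetical toy data: one standardized score per student, 1 = admitted.
# The classes overlap on purpose so the fitted weights stay finite.
X = [[-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [-0.2], [0.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0]

w = fit_logistic(X, y)
print("intercept and slope (log-odds scale):", w)
print("P(admit | score = 1.5):", predict_proba(w, [1.5]))
```

A positive slope means higher scores raise the log-odds of admission. For the actual exercise, the same idea extends to several predictors (gre, gpa, rank), and in practice a library routine such as those in statsmodels or scikit-learn would typically replace the hand-written loop.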
Unsupervised Learning: Clustering
- What is Clustering and Types of Clustering methods
Collect the mtcars data from kaggle link: https://www.kaggle.com/datasets/ruiromanini/mtcars
Load the mtcars dataset in Python. Determine the appropriate number of clusters (k) for the dataset by implementing the Elbow method. Use the k-means method to cluster the data with the chosen k. Visualize the clusters.
Collect the credit card data from kaggle link: https://www.kaggle.com/datasets/arjunbhasin2013/ccdata
Cluster the customers based on Balance, Purchases, Cash_Advance, Credit_Limit, and Payments. Use the k-means clustering technique and find the optimal number of clusters using the Elbow method.
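Both clustering exercises rely on the Elbow method: run k-means for a range of k, record the within-cluster sum of squares (often called inertia), and look for the k after which the decrease flattens out. The sketch below illustrates the idea with a small hand-written k-means on synthetic 2-D points. The data are made up for illustration and are not mtcars or the credit card data, and `kmeans_inertia` is an illustrative name. With real data it usually helps to standardize the features first, because k-means is distance based.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, iters=100, seed=0):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen data points
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sqdist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sqdist(p, c) for c in centers) for p in points)


# Synthetic stand-in data: three 2-D blobs (NOT one of the Kaggle datasets).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(30)
]

# Elbow method: inertia for k = 1..6; plot these values against k to find the bend.
inertias = [kmeans_inertia(points, k) for k in range(1, 7)]
for k, value in enumerate(inertias, start=1):
    print(f"k={k}: inertia={value:.2f}")
```

For the exercises themselves, scikit-learn's `KMeans` exposes the same quantity as the `inertia_` attribute after fitting, so the loop over k carries over directly.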
Classification
- Download
- PPT - https://www.dropbox.com/scl/fi/22nyvo58ledat3wugzh4f/Lecture-11-Decision-Trees.pptx?rlkey=h1m3ibl6p25b0mcvp1202eybj&dl=0
- Python Code - https://colab.research.google.com/drive/16IQn9Eiajvuv2NqUiRSlsPjFraTzrsrR?usp=drive_link
- Mall Customer data - https://drive.google.com/file/d/1DkEzUF4ksfu-umrLsfvb3vvJxm9OG2ra/view?usp=drive_link
- R Code - https://www.dropbox.com/scl/fi/22nyvo58ledat3wugzh4f/Lecture-11-Decision-Trees.pptx?rlkey=h1m3ibl6p25b0mcvp1202eybj&dl=0
Consider a dataset based on which we will determine whether to play football or not.
a) Build the complete decision tree.
b) Write Python code to build the decision tree and check its accuracy on test data.
Here we discuss probabilistic graphical networks. A naive Bayes network is directed, and a Markov network is undirected.
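For the play-football decision tree exercise, one common way to choose the attribute at each node is information gain, as in an ID3-style tree. The sketch below assumes that criterion, although the lecture may use a different one such as the Gini index. The four rows and the attribute names `outlook` and `windy` are hypothetical, since the dataset itself is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical rows, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

gains = {attr: information_gain(rows, attr, "play") for attr in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains)
print("split first on:", best)
```

The attribute with the largest gain becomes the root, and the same calculation is repeated on each branch until the leaves are pure or no attributes remain.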
Python Code
How to read a boxplot? https://www.listendata.com/2014/08/how-to-read-box-plot.html
