
Cong P. answered 04/26/23
Data Scientist / Tutor / PhD in Computer Science
This is an outline of the steps for this popular ML problem:
1. Import the necessary libraries:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_curve
2. Load the Iris dataset:
data = pd.read_csv('iris.csv')
3. Create the training and testing datasets:
X = data.drop(['species'], axis=1)
y = data['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
4. Create a Logistic Regression model and fit it to the training data:
model = LogisticRegression()
model.fit(X_train, y_train)
5. Make predictions on the testing data:
y_pred = model.predict(X_test)
6. Create a confusion matrix:
cm = confusion_matrix(y_test, y_pred)
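7. Calculate the model accuracy and the per-class sensitivity and specificity (Iris has three classes, so the confusion matrix is 3x3):
acc = np.trace(cm) / cm.sum()  # correct predictions / all predictions
TPR = np.diag(cm) / cm.sum(axis=1)  # sensitivity of each class
FP = cm.sum(axis=0) - np.diag(cm)  # false positives of each class
TNR = 1 - FP / (cm.sum() - cm.sum(axis=1))  # specificity of each class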
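8. Create an ROC curve (roc_curve needs binary labels and continuous scores, so this treats the last class as positive and the other two as negative):
y_score = model.predict_proba(X_test)[:, 2]  # predicted probability of model.classes_[2]
fpr, tpr, thresholds = roc_curve(y_test == model.classes_[2], y_score)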
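The steps above can be checked end to end with a self-contained sketch. It is only one possible way to do it, and it makes a few assumptions that are not in the steps: it uses scikit-learn's bundled `load_iris` data in place of `iris.csv`, it raises `max_iter` to avoid convergence warnings, and `per_class_rates` is a hypothetical helper name. It computes sensitivity and specificity for each of the three classes and a one-vs-rest ROC AUC per class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled Iris data stands in for iris.csv.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: max_iter is raised so the solver converges on unscaled features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, model.predict(X_test), labels=model.classes_)


def per_class_rates(cm):
    """Return (sensitivity, specificity) arrays with one entry per class."""
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


accuracy = np.trace(cm) / cm.sum()
sensitivity, specificity = per_class_rates(cm)

# One-vs-rest ROC: each class in turn is treated as the positive class.
scores = model.predict_proba(X_test)
roc_auc = {}
for k, label in enumerate(model.classes_):
    fpr, tpr, _ = roc_curve(y_test == label, scores[:, k])
    roc_auc[int(label)] = auc(fpr, tpr)

print("accuracy:", accuracy)
print("sensitivity per class:", sensitivity)
print("specificity per class:", specificity)
print("one-vs-rest AUC per class:", roc_auc)
```

The one-vs-rest loop is used because `roc_curve` only accepts a binary target, so a three-class problem yields three curves rather than one.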
Please contact me if you need help understanding topics in data science, machine learning, algorithms, or data structures.