Course Code: GEN101
Course Name: Introductory Artificial Intelligence
Assignment No: 3
Assignment Title: Performance Evaluation
Instructors: Prof. Mohammed Ghazal, Eng. Maha Yaghi, Eng. Malaz Osman, Eng. Marah AlHalabi, Eng. Tasnim Basmaji, Eng. Yasmin Abu-Haeyeh

Name: Rawan Salem Bajjabaa
ID: 1076676
Major: Architecture

Given the following test set of 20 architectural styles (Cape_code (+) and Art_deco (-)), calculate the different performance measures (precision, recall, accuracy, ...). Construct the ROC and then calculate the AUC.

1. Insert your name and student ID under Name and ID.
2. Select your major from the drop-down list. This will generate a unique set of data for you only. For successful completion of the assignment, you need to fill all bordered white cells. Note that any similarity detected will be reported to the Office of Academic Integrity (OAI).
3. Apply a threshold at each unique value of P(+|A) to find the predicted class (Predicted (AI)) for each instance.
   Note: Make sure you do not change the data in the colored cells.
   Hint: You may copy and paste the data (by value) to another sheet, then sort it and begin working on the performance evaluation. Once done, you may fill in the bordered white cells in this sheet with your final answers.
4. Count the number of TP, FP, TN, and FN at each threshold and then calculate the different performance evaluation measures for each instance. The formulas can be found in the performance evaluation slides on Blackboard. (See the sketch after the data table below.)
5. Use the calculated values to plot the Precision-Recall curve and the Receiver Operating Characteristics (ROC) curve.
6. Find the area under the curve (AUC).
7. Once completed, save the file and submit it on Blackboard.

Test set:

Instance  P(+|A)  True Class (Human)
1         0.10    Cape_code
2         0.87    Art_deco
3         0.78    Cape_code
4         0.98    Art_deco
5         0.36    Cape_code
6         0.12    Art_deco
7         0.85    Cape_code
8         0.87    Art_deco
9         0.69    Art_deco
10        0.83    Art_deco
11        0.90    Cape_code
12        0.71    Cape_code
13        0.72    Art_deco
14        0.93    Art_deco
15        0.94    Art_deco
16        0.34    Art_deco
17        0.71    Cape_code
18        0.69    Cape_code
19        0.32    Cape_code
20        0.35    Art_deco
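The workbook itself is filled in Excel, but the computation described in steps 3 and 4 can be sketched in a few lines of Python. The sketch below is an illustration only (not part of the required submission); it uses the test set from the table above and treats Cape_code as the positive class.

```python
# Minimal sketch of assignment steps 3-4: sweep a threshold over each unique
# P(+|A) value, predict Cape_code (+) when P(+|A) >= threshold, count the
# confusion-matrix entries, and compute the requested measures.

data = [  # (P(+|A), true class) from the table above
    (0.10, "Cape_code"), (0.87, "Art_deco"), (0.78, "Cape_code"), (0.98, "Art_deco"),
    (0.36, "Cape_code"), (0.12, "Art_deco"), (0.85, "Cape_code"), (0.87, "Art_deco"),
    (0.69, "Art_deco"), (0.83, "Art_deco"), (0.90, "Cape_code"), (0.71, "Cape_code"),
    (0.72, "Art_deco"), (0.93, "Art_deco"), (0.94, "Art_deco"), (0.34, "Art_deco"),
    (0.71, "Cape_code"), (0.69, "Cape_code"), (0.32, "Cape_code"), (0.35, "Art_deco"),
]
POSITIVE = "Cape_code"

print(f"{'thr':>5} {'TP':>3} {'FP':>3} {'TN':>3} {'FN':>3} "
      f"{'prec':>5} {'rec':>5} {'FPR':>5} {'spec':>5} {'acc':>5} {'F1':>5}")

for thr in sorted({p for p, _ in data}, reverse=True):
    tp = sum(1 for p, c in data if p >= thr and c == POSITIVE)
    fp = sum(1 for p, c in data if p >= thr and c != POSITIVE)
    fn = sum(1 for p, c in data if p < thr and c == POSITIVE)
    tn = sum(1 for p, c in data if p < thr and c != POSITIVE)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn)                      # sensitivity / TPR
    fpr = fp / (fp + tn)
    specificity = tn / (tn + fp)                 # TNR
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    print(f"{thr:5.2f} {tp:3d} {fp:3d} {tn:3d} {fn:3d} "
          f"{precision:5.2f} {recall:5.2f} {fpr:5.2f} {specificity:5.2f} "
          f"{accuracy:5.2f} {f1:5.2f}")
```

Each printed row corresponds to one threshold, i.e., to one point on the Precision-Recall and ROC curves requested in steps 5 and 6.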
Assignment 3 Guidelines

Step 1: Study ALL topics covered so far, in particular Topic 7: Performance Evaluation. Advice: redo all activities done in class and uploaded on Blackboard.

Step 2: On Thursday, May 6 at 12:01 am, download the Excel file GEN101 – Assignment 3.xlsm from the Assignments folder.

Step 3: Download and open the Excel file (filename: GEN101 – Assignment 3.xlsm). Disclaimer: no macros are needed. Use a regular laptop, not a mobile or tablet. Your computer, Mac or PC, must have an Intel chip; Apple M1 chips (2021 MacBook Pros and Airs) do not yet support certain Excel features.

Step 4: Fill in your name and ID and select your major, then observe: the question and data depend on your ID and major. No two students will be getting the same data and question. Any similarity will be reported to the Office of Academic Integrity. Your assignment is unique to you.

Step 5: Solve the problem. Follow the post-it note instructions and fill all the white cells (boxes). If a cell is empty and white, you are expected to fill it. Use Excel to make the calculations. Do not insert extra rows or columns. Save your work!

Step 6: Submit your Excel file, uncompressed, before 11:59 pm on Saturday.

Step 7: Wait for grades.
• Grades are final.
• This assignment is individual. No cooperation is allowed with anyone.
• Late assignments receive a -50% penalty.
• Assignments are not accepted after the last day of classes and will receive NO credit.
• Anti-cheating measures are embedded within the Excel file. Cheating will be reported.
• You can submit multiple attempts.
• Respondus LockDown and Monitor will not be used for this assignment.

Submitted answers. The Predicted (AI) column and the remaining measure columns (Recall/TPR, FPR, Specificity/TNR, ACC, F1) are filled in the workbook but are only partially legible in this export; the legible threshold counts and precision values are:

Instance  P(+|A)  TP  FP  TN  FN  Precision
1         0.10     3   5   3   7  0.38
2         0.87     8   4   2   6  0.67
3         0.78     8   3   3   6  0.73
4         0.98     2   6   5   7  0.25
5         0.36     4   5   8   3  0.44
6         0.12     2   0   9   9  1.00
7         0.85     -   -   -   -  -
8         0.87     7   4   4   5  0.64
9         0.69     3   2   8   7  0.60
10        0.83     6   4   2   3  0.60
11        0.90     7   3   3   7  0.70
12        0.71     6   4   6   4  0.60
13        0.72     6   4   3   7  0.60
14        0.93     9   1   4   6  0.90
15        0.94     8   2   6   4  0.80
16        0.34     2   8   3   6  0.20
17        0.71     6   4   6   4  0.60
18        0.69     4   6   2   8  0.40
19        0.32     0   8   6   6  0.00
20        0.35     5   6   4   5  0.45

Precision-Recall Curve: (chart in the workbook)
Receiver Operating Characteristics (ROC) Curve: (chart in the workbook)
Course Code: GEN101
Course Name: Introductory Artificial Intelligence
Assignment No: 3
Assignment Title: Performance Evaluation
Instructors: Prof. Mohammed Ghazal, Eng. Maha Yaghi, Eng. Malaz Osman, Eng. Marah AlHalabi, Eng. Tasnim Basmaji, Eng. Yasmin Abu-Haeyeh

Name: Amal Almansoori
ID: 1078176
Major: Biomedical Engineering

Given the following test set of 20 cancer grades (Benign (+) and Malignant (-)), calculate the different performance measures (precision, recall, accuracy, ...). Construct the ROC and then calculate the AUC. Instructions 1-7 are the same as in the sheet above.

Test set:

Instance  P(+|A)  True Class (Human)
1         0.10    Benign
2         0.87    Malignant
3         0.78    Benign
4         0.98    Malignant
5         0.36    Benign
6         0.12    Malignant
7         0.85    Benign
8         0.87    Malignant
9         0.69    Malignant
10        0.83    Malignant
11        0.90    Benign
12        0.71    Benign
13        0.72    Malignant
14        0.93    Malignant
15        0.94    Malignant
16        0.34    Malignant
17        0.71    Benign
18        0.69    Benign
19        0.32    Benign
20        0.35    Malignant
Submitted answers. Each row corresponds to one threshold, taken from the instances' P(+|A) values sorted in decreasing order; in the workbook the Predicted (AI) column reads Benign for every row.

Threshold  TP  FP  TN  FN  Precision  Recall/TPR  FPR   Specificity/TNR  ACC   F1
0.98        1   0  11   8  1.00       0.11        0.00  1.00             0.60  0.15
0.94        1   1  10   8  0.50       0.11        0.09  0.91             0.55  0.15
0.93        2   1  10   7  0.67       0.22        0.09  0.91             0.60  0.27
0.90        2   2   9   7  0.50       0.22        0.18  0.82             0.55  0.27
0.87        3   3   8   6  0.50       0.33        0.27  0.73             0.55  0.35
0.87        3   3   8   6  0.50       0.33        0.27  0.73             0.55  0.35
0.85        4   3   8   5  0.57       0.44        0.27  0.73             0.60  0.42
0.83        4   4   7   5  0.50       0.44        0.36  0.64             0.55  0.42
0.78        4   5   6   5  0.44       0.44        0.45  0.55             0.50  0.42
0.72        4   6   5   5  0.40       0.44        0.55  0.45             0.45  0.42
0.71        6   6   5   3  0.50       0.67        0.55  0.45             0.55  0.52
0.71        6   6   5   3  0.50       0.67        0.55  0.45             0.55  0.52
0.69        6   8   3   3  0.43       0.67        0.73  0.27             0.45  0.52
0.69        6   8   3   3  0.43       0.67        0.73  0.27             0.45  0.52
0.36        6   9   2   3  0.40       0.67        0.82  0.18             0.40  0.52
0.35        6  10   1   3  0.38       0.67        0.91  0.09             0.35  0.52
0.34        7  10   1   2  0.41       0.78        0.91  0.09             0.40  0.56
0.32        8  10   1   1  0.44       0.89        0.91  0.09             0.45  0.59
0.12        9  10   1   0  0.47       1.00        0.91  0.09             0.50  0.62
0.10        9  11   0   0  0.45       1.00        1.00  0.00             0.45  0.62

Precision vs Recall Curve: (chart in the workbook)
Receiver Operating Characteristics (ROC) Curve: (chart in the workbook)

AUC, computed in the workbook as the sum of six areas under the ROC curve:

Model 1   A1     A2     A3      A4      A5     A6     AUC
          0.01   0.04   0.025   0.1215  0.234  0.09   0.5205
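The AUC asked for in step 6 can also be approximated from the (FPR, TPR) pairs in the table above with the trapezoidal rule. The sketch below is an illustration only: because the tabulated values are rounded to two decimals and tied thresholds can be handled in slightly different ways, it yields roughly 0.51 rather than exactly the 0.5205 recorded in the workbook.

```python
# Trapezoidal approximation of the area under the ROC curve, using the
# rounded (FPR, TPR) pairs from the solution table above.
points = [
    (0.00, 0.11), (0.09, 0.11), (0.09, 0.22), (0.18, 0.22), (0.27, 0.33),
    (0.27, 0.33), (0.27, 0.44), (0.36, 0.44), (0.45, 0.44), (0.55, 0.44),
    (0.55, 0.67), (0.55, 0.67), (0.73, 0.67), (0.73, 0.67), (0.82, 0.67),
    (0.91, 0.67), (0.91, 0.78), (0.91, 0.89), (0.91, 1.00), (1.00, 1.00),
]
# Anchor the curve at (0, 0) and sort by FPR (then TPR) to obtain the ROC staircase.
points = sorted(set(points + [(0.0, 0.0)]))

auc = 0.0
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2.0   # trapezoid between consecutive points

print(f"AUC (trapezoidal rule) = {auc:.4f}")   # about 0.51 for these rounded points
```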
GEN101 Introductory Artificial Intelligence
College of Engineering
Artificial Intelligence: Performance Evaluation (lecture notes)

Model Evaluation
• Metrics for Performance Evaluation: How to evaluate the performance of a model?
• Methods for Performance Evaluation: How to obtain reliable estimates?
• Methods for Model Comparison: How to compare the relative performance among competing models?

Metrics for Performance Evaluation
We will focus on the predictive capability of a model, rather than how long it takes to classify or build models, scalability, etc. We have already seen the confusion matrix:

                            PREDICTED CLASS (AI)
                            Class=Yes    Class=No
ACTUAL CLASS    Class=Yes   a (TP)       b (FN)
(Human)         Class=No    c (FP)       d (TN)

TP: True Positive, FP: False Positive, TN: True Negative, FN: False Negative

Most widely-used metric:

Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)

Limitation of Accuracy
Consider a 2-class problem:
• Number of Class 0 (e.g., Benign) examples = 9990
• Number of Class 1 (e.g., Malignant) examples = 10
If the model predicts everything to be Class 0, accuracy is 9990/10000 = 99.9%. Accuracy is misleading because the model does not detect any Class 1 example.

Other Measures
Using the same confusion matrix (a = TP, b = FN, c = FP, d = TN):

Precision (p) = a / (a + c)
Recall (r) = a / (a + b)
True positive rate = a / (a + b)
False positive rate = c / (c + d)
F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c)
Weighted accuracy = (w1·a + w4·d) / (w1·a + w2·b + w3·c + w4·d)
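To make the "limitation of accuracy" example concrete, the measures above can be evaluated on the slide's numbers (9,990 Class 0 examples, 10 Class 1 examples, and a model that predicts everything as Class 0), treating Class 1 as the positive class. A minimal sketch:

```python
# Imbalanced example from the slides: 9,990 negatives (Class 0), 10 positives
# (Class 1), and a model that predicts Class 0 for every example.
tp, fn = 0, 10        # all 10 positives are missed
tn, fp = 9990, 0      # all 9,990 negatives are correctly labelled Class 0

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)                                    # sensitivity for the positive class
precision = tp / (tp + fp) if (tp + fp) else float("nan")  # undefined: no positive predictions

print(f"accuracy  = {accuracy:.3f}")    # 0.999 -- looks excellent
print(f"recall    = {recall:.3f}")      # 0.000 -- not a single positive detected
print(f"precision = {precision}")       # nan   -- no positive predictions were made
```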
Precision-Recall
• Two other metrics that are often used to quantify model performance are precision and recall.
• Precision is defined as the number of true positives divided by the total number of positive predictions. It quantifies what percentage of the positive predictions were correct: how correct your model's positive predictions were.
• Recall (sensitivity) is defined as the number of true positives divided by the total number of true positives and false negatives (all actual positives). It quantifies what percentage of the actual positives you were able to identify: how sensitive your model was in identifying positives.

Precision = TP / (TP + FP)
Out of all the examples that were predicted as positive, how many are correct? How precise was the AI?

Recall = TP / (TP + FN)
Out of all the positive examples, how many were caught (recalled)? Did the AI miss any?

Sensitivity = TP / (TP + FN)
Out of all the people that do have the disease, how many got positive test results? Did the AI miss anyone?

Specificity = TN / (TN + FP)
Out of all the people that do NOT have the disease, how many got negative results?

Precision-Recall Example
In this example:
• If the classifier predicts negative, you can trust it: the example is negative. Our AI does not make mistakes on negatives; it is sensitive. However, pay attention: if the example is negative, you cannot be sure it will be predicted as negative (specificity = 78%).
• If the classifier predicts positive, you cannot trust it (precision = 33%).
• However, if the example is positive, you can trust that the classifier will find it (it will not miss it) (recall = 100%).

Precision-Recall Example
In this example, since the population is imbalanced:
• The precision is relatively high.
• The recall is 100% because all the positive examples are predicted as positive.
• The specificity is 0% because no negative example is predicted as negative.

Precision-Recall Example
In this example:
• If the classifier predicts that an example is positive, you can trust it: it is positive.
• However, if it predicts that an example is negative, you cannot trust it; the chances are that it is still positive.
• It can still be a useful classifier.

Precision-Recall Example
In this example:
• The classifier detects all the positive examples as positive.
• It also detects all the negative examples as negative.
• All the measures are at 100%.

Why so many measures? Which is more important? Because it depends:
• If a false positive is expensive (a false alarm is a nightmare), precision is more important than recall.
• If a false negative is expensive (missing one is a nightmare), recall is more important than precision.

Precision-Recall Curves
• The precision-recall curve is used for evaluating the performance of binary classification algorithms.
• It provides a graphical representation of a classifier's performance across many thresholds, rather than a single value.
• It is constructed by calculating and plotting the precision against the recall for a single classifier at a variety of thresholds.
• It helps to visualize how the choice of threshold affects classifier performance and can even help us select the best threshold for a specific problem.

Precision-Recall Curves Interpretation
A model that produces a precision-recall curve that is closer to the top-right corner is better than a model that produces a precision-recall curve that is skewed towards the bottom of the plot.

Precision-Recall Curves Example
Which of the following P-R curves, (a), (b), or (c), represents a perfect classifier?
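The precision-recall pairs behind such a curve can be tabulated by hand, as in the assignment workbook, or computed with a library. A minimal sketch, assuming scikit-learn is available, on a small 10-instance example in which the five positives happen to receive the five highest scores (the perfectly separable case):

```python
# Sketch of building a precision-recall curve across thresholds.
# The labels and scores below are a small illustrative example, not course data.
from sklearn.metrics import precision_recall_curve

y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]            # 1 = positive class
y_scores = [0.95, 0.93, 0.87, 0.85, 0.83, 0.80, 0.76, 0.53, 0.43, 0.25]

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

for p, r in zip(precision, recall):
    print(f"precision = {p:.2f}, recall = {r:.2f}")
# Plotting these pairs (recall on the x-axis, precision on the y-axis) gives the
# precision-recall curve; each point corresponds to one decision threshold.
```

Because the positives are perfectly separated from the negatives here, precision stays at 1.0 for every attainable recall, which is the curve shape the question above is pointing at.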
Model Evaluation
• Metrics for Performance Evaluation: How to evaluate the performance of a model?
• Methods for Performance Evaluation: How to obtain reliable estimates?
• Methods for Model Comparison: How to compare the relative performance among competing models?

Methods for Performance Evaluation
How to obtain a reliable estimate of performance? The performance of a model may depend on other factors besides the learning algorithm:
• Class distribution
• Size of the training and test sets

Methods of estimation:
• Holdout: reserve 2/3 for training and 1/3 for testing
• Random subsampling: repeated holdout
• Cross validation: partition the data into k disjoint subsets; k-fold: train on k-1 partitions, test on the remaining one; leave-one-out: k = n

Methods for Model Comparison

ROC (Receiver Operating Characteristic)
Developed in the 1950s in signal detection theory to analyze noisy signals, the ROC curve characterizes the trade-off between positive hits and false alarms. It plots the TP rate (y-axis) against the FP rate (x-axis):

TP rate (TPR) = TP / (TP + FN)
FP rate (FPR) = FP / (FP + TN)

The performance of each classifier is represented as a point on the ROC curve; changing the threshold of the algorithm, or the sample distribution, changes the location of the point. For example, for a 1-dimensional data set containing 2 classes (positive and negative), any point located at x > t is classified as positive; at threshold t, TPR = 0.5 and FPR = 0.12.

Interpreting (TPR, FPR) points:
• (0, 0): declare everything to be the negative class
• (1, 1): declare everything to be the positive class
• (1, 0): ideal (the top-left corner of the ROC plot)
• Diagonal line: random guessing
• Below the diagonal line: the prediction is the opposite of the true class

Using ROC for Model Comparison
No model consistently outperforms the other:
• M1 is better for small FPR
• M2 is better for large FPR
Area under the ROC curve:
• Ideal: area = 1
• Random guess: area = 0.5

How to construct an ROC curve
1. Use a classifier that produces a probability P(+|A) for each test instance A.
2. Sort the instances according to P(+|A) in decreasing order.
3. Apply a threshold at each unique value of P(+|A) and count the number of TP, FP, TN, FN at each threshold.

Example (before sorting):

Instance  P(+|A)  True Class
1         0.85    +
2         0.53    +
3         0.87    -
4         0.85    -
5         0.85    -
6         0.95    +
7         0.76    -
8         0.93    +
9         0.43    -
10        0.25    +

After sorting by P(+|A) and thresholding at each unique value:

#   Threshold >=  True Class (Human)  AI  TP  FP  TN  FN
1   0.95          +                   +   1   0   5   4
2   0.93          +                   -   2   0   5   3
3   0.87          -                   -   2   1   4   3
4   0.85          -                   -
5   0.85          -                   -
6   0.85          +                   -   3   3   2   2
7   0.76          -                   -   3   4   1   2
8   0.53          +                   -   4   4   1   1
9   0.43          -                   -   4   5   0   1
10  0.25          +                   -   5   5   0   0
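The thresholding in the table above can be scripted directly. A minimal sketch of the same procedure (sort by P(+|A), then count TP, FP, TN, FN at each unique threshold) for the 10 instances of this example:

```python
# ROC-curve construction for the 10-instance example above.
# Each entry is (P(+|A), true class); '+' is the positive class.
instances = [
    (0.85, "+"), (0.53, "+"), (0.87, "-"), (0.85, "-"), (0.85, "-"),
    (0.95, "+"), (0.76, "-"), (0.93, "+"), (0.43, "-"), (0.25, "+"),
]

n_pos = sum(1 for _, c in instances if c == "+")
n_neg = len(instances) - n_pos

# Step 2: sort by P(+|A) in decreasing order; step 3: threshold at each unique value.
roc_points = []
for t in sorted({p for p, _ in instances}, reverse=True):
    tp = sum(1 for p, c in instances if p >= t and c == "+")
    fp = sum(1 for p, c in instances if p >= t and c == "-")
    tpr, fpr = tp / n_pos, fp / n_neg
    roc_points.append((fpr, tpr))
    print(f"threshold >= {t:.2f}: TP={tp} FP={fp} TN={n_neg - fp} FN={n_pos - tp} "
          f"FPR={fpr:.1f} TPR={tpr:.1f}")

# Plotting roc_points (FPR on x, TPR on y), together with (0, 0), gives the ROC curve.
```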
The resulting ROC points for this example:

Instance  P(+|A)  True Class  TP  FP  TN  FN  FPR  TPR
1         0.95    +           1   0   5   4   0    1/5
2         0.93    +           2   0   5   3   0    2/5
3         0.87    -           2   1   4   3   1/5  2/5
4         0.85    -
5         0.85    -
6         0.85    +           3   3   2   2   3/5  3/5
7         0.76    -           3   4   1   2   4/5  3/5
8         0.53    +           4   4   1   1   4/5  4/5
9         0.43    -           4   5   0   1   1    4/5
10        0.25    +           5   5   0   0   1    1

Breakout Session: How to construct an ROC curve

Example:

Instance  P(+|A)  True Class  FPR  TPR
1         0.95    +           0    1/5
2         0.93    +           0    2/5
3         0.87    +           0    3/5
4         0.85    +           0    4/5
5         0.83    +           0    1
6         0.80    -           1/5  1
7         0.76    -           2/5  1
8         0.53    -           3/5  1
9         0.43    -           4/5  1
10        0.25    -           1    1

Example:

Instance  P(+|A)  True Class  FPR  TPR
1         0.95    +           0    1/5
2         0.93    -           1/5  1/5
3         0.87    +           1/5  2/5
4         0.85    -           2/5  2/5
5         0.83    +           2/5  3/5
6         0.80    -           3/5  3/5
7         0.76    +           3/5  4/5
8         0.53    -           4/5  4/5
9         0.43    +           4/5  1
10        0.25    -           1    1

ROC Interpretation

AUC (Area Under the Curve)
• AUC stands for "Area under the ROC Curve."
• It measures the entire two-dimensional area underneath the entire ROC curve from (0,0) to (1,1).
• It provides an aggregate measure of performance across all possible classification thresholds.
• AUC ranges in value from 0 to 1.
• A model whose predictions are 100% wrong has an AUC of 0.0, which means it has the worst measure of separability.
• A model whose predictions are 100% correct has an AUC of 1.0, which means it has a good measure of separability.
• When AUC is 0.5, the model has no class separation capacity.

AUC Interpretation Examples
The red distribution curve is of the positive class (patients with the disease) and the green distribution curve is of the negative class (patients with no disease).
• When the two curves do not overlap at all, the model has an ideal measure of separability: it is perfectly able to distinguish between the positive class and the negative class. This is the ideal situation.
• When AUC is 0.7, it means there is a 70% chance that the model will be able to distinguish between the positive class and the negative class.
• When AUC is approximately 0.5, the model has no discrimination capacity to distinguish between the positive class and the negative class. This is the worst situation.
• When AUC is approximately 0, the model is actually reciprocating the classes: it predicts the negative class as positive and vice versa.

Example: Which of the following ROC curves, (a) through (e), produce AUC values greater than 0.5?
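The "70% chance" reading of AUC above reflects an equivalent definition: AUC is the probability that a randomly chosen positive instance is scored higher than a randomly chosen negative one, with ties counted as one half. A minimal sketch on the first 10-instance example from the ROC construction above; for that data this pairwise count, like the trapezoidal area under its ROC curve, comes out to 0.56.

```python
# AUC as a ranking probability: the fraction of (positive, negative) pairs in
# which the positive instance is scored higher (ties count as 1/2).
instances = [
    (0.85, "+"), (0.53, "+"), (0.87, "-"), (0.85, "-"), (0.85, "-"),
    (0.95, "+"), (0.76, "-"), (0.93, "+"), (0.43, "-"), (0.25, "+"),
]
pos_scores = [p for p, c in instances if c == "+"]
neg_scores = [p for p, c in instances if c == "-"]

wins = 0.0
for p in pos_scores:
    for n in neg_scores:
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5

auc = wins / (len(pos_scores) * len(neg_scores))
print(f"AUC = {auc:.2f}")   # 0.56 for this example
```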
ROC vs. PRC
The main difference between ROC curves and precision-recall curves is that the number of true-negative results is not used for making a PRC.

Curve                                       x-axis                                 y-axis
Precision-recall (PRC)                      Recall = TP / (TP + FN)                Precision = TP / (TP + FP)
Receiver Operating Characteristics (ROC)    False Positive Rate = FP / (FP + TN)   Recall / Sensitivity / True Positive Rate = TP / (TP + FN)

Intersection over Union (IoU) for object detection
Is object detection classification or regression?

What is Intersection over Union?
Definition: Intersection over Union (IoU) is an evaluation metric used to measure the accuracy of an object detector on a particular dataset. It is a number from 0 to 1 that specifies the amount of overlap between the predicted and ground-truth bounding boxes. It is known to be the most popular evaluation metric for tasks such as segmentation, object detection and tracking.

Which one is best?
• An IoU of 0 means that there is no overlap between the boxes.
• An IoU of 1 means that the union of the boxes is the same as their overlap, indicating that they are completely overlapping.
• The lower the IoU, the worse the prediction result.

In order to apply Intersection over Union to evaluate an (arbitrary) object detector we need:
1. The ground-truth bounding boxes (i.e., the hand-labeled bounding boxes that specify where in the image our object is).
2. The predicted bounding boxes from our model.

Your goal is to take the training images and their bounding boxes, construct an object detector, and then evaluate its performance on the testing set. An Intersection over Union score > 0.5 is normally considered a "good" prediction.
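A minimal sketch of the IoU computation for two axis-aligned boxes, each written as (x1, y1, x2, y2); the ground-truth and predicted boxes below are made up for illustration:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # 0 if the boxes do not overlap

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union

# Made-up ground-truth and predicted boxes for illustration only.
ground_truth = (30, 30, 120, 120)
predicted = (50, 40, 140, 130)
score = iou(ground_truth, predicted)
print(f"IoU = {score:.2f}", "(good)" if score > 0.5 else "(poor)")
```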
Confidence Intervals

Confidence Interval for Accuracy
Definition: Confidence, in statistics, is a way to describe probability. A confidence interval is the range of values you expect your estimate to fall between if you redo your test, within a certain level of confidence. For example, if you construct a confidence interval with a 95% confidence level, you are confident that 95 out of 100 times the estimate will fall between the upper and lower values specified by the confidence interval. Confidence level = 1 - α.

Prediction can be regarded as a Bernoulli trial:
• A Bernoulli trial has 2 possible outcomes; the possible outcomes for a prediction are correct or wrong.
• A collection of Bernoulli trials has a Binomial distribution: x ~ Bin(N, p), where x is the number of correct predictions.
• Example: toss a fair coin 50 times; how many heads would turn up? Expected number of heads = N x p = 50 x 0.5 = 25.
• Classification accuracy = correct predictions / total predictions.
• Given x (the number of correct predictions), or equivalently acc = x / N, and N (the number of test instances), can we predict p (the true accuracy of the model)?

For large test sets (N > 30), acc has a normal distribution with mean p and variance p(1 - p) / N. Solving the normal approximation for p gives the confidence interval

p in [ (2·N·acc + Z²α/2 ± Zα/2 · sqrt(Z²α/2 + 4·N·acc - 4·N·acc²)) / (2·(N + Z²α/2)) ]

Example: consider a model that produces an accuracy of 80% when evaluated on 100 test instances:
• N = 100, acc = 0.8
• Let 1 - α = 0.95 (95% confidence)
• From the standard normal probability table, Zα/2 = 1.96

N          50      100     500     1000    5000
p (lower)  0.670   0.711   0.763   0.774   0.789
p (upper)  0.888   0.866   0.833   0.824   0.811

1 - α   0.99   0.98   0.95   0.90
Z       2.58   2.33   1.96   1.65

Performance Metrics for Multiclass AI
Turn the problem into binary classification, e.g., {red, blue, green, yellow} → n = 4.

One vs. All (Rest):
• Binary classification problem 1: red vs. [blue, green, yellow]
• Binary classification problem 2: blue vs. [red, green, yellow]
• Binary classification problem 3: green vs. [red, blue, yellow]
• Binary classification problem 4: yellow vs. [red, blue, green]
n classes → n binary classifiers

One vs. One:
• Binary classification problem 1: red vs. blue
• Binary classification problem 2: red vs. green
• Binary classification problem 3: red vs. yellow
• Binary classification problem 4: blue vs. green
• Binary classification problem 5: blue vs. yellow
• Binary classification problem 6: green vs. yellow
n classes → n(n-1)/2 binary classifiers

Breakout Session: One vs. All (Rest) and One vs. One
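Once a multiclass problem is recast as one-vs-rest, the binary measures from earlier in these notes apply per class. A minimal sketch with made-up true and predicted labels over the four colors used above:

```python
# Per-class precision and recall via one-vs-rest: each class in turn is treated
# as the positive class and all remaining classes as the negative class.
# The true/predicted labels below are made up for illustration only.
y_true = ["red", "blue", "green", "yellow", "red", "blue", "green", "red", "yellow", "blue"]
y_pred = ["red", "blue", "green", "green", "blue", "blue", "green", "red", "yellow", "red"]

for cls in ["red", "blue", "green", "yellow"]:
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    print(f"{cls:>6}: precision = {precision:.2f}, recall = {recall:.2f}")
```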
