Figure 1: Multivariate outlier detection using pairwise scatter diagram

Figure 2: Mahalanobis Distance plot

Figure 3: Leverage values plot

Figure 4: Regression Model Residuals plot

Figure 5: Cook’s Distance plot

Figure 6: Standard diagnostic plots

Figure 7: DFFITS plot

Figure 8: Standardized residuals plot

Figure 9: Studentized residuals plot

Figure 10: Box plot

Figure 11: Multivariate outlier detection using scatter plot

Figure 12: Mahalanobis Distance plot

Figure 13: Leverage values plot

Figure 14: Regression Model Residuals plot

Figure 15: Cook’s Distance plot

Figure 16: Standard diagnostic plots

Figure 17: DFFITS plot

Figure 18: Standardized residuals plot

Figure 19: Studentized residuals plot

Figure 20: Box plot

| Sl. No. | Method | Formula | Cut-off value |
|---|---|---|---|
| 1 | Mahalanobis Distance (Md) | | |
| 2 | Leverage Point ($h_i$) | $H = X(X^{T}X)^{-1}X^{T}$ | $2p/n$, $3p/n$ |
| 3 | DFFITS | | |
| 4 | Standardized Residuals | | If $\lvert t_i \rvert > 3$, the observation is considered an outlier. |
| 5 | Studentized Residuals | | If $\lvert t_i \rvert > 3$, the observation is considered an outlier. |
| 6 | DFBETAS, DFBETA | | |
| 7 | Cook’s Distance ($D_i$) | | $D_i > 1$ |
| 8 | Proposed method (median\|ε\|-MAD\|ε\|) | | |
| 9 | Proposed method | | $t_{n-1}$ (a%) |

Table 1: Outlier Detection Methods
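The leverage and Cook's distance rows of Table 1 can be illustrated with a short sketch. It is not the authors' code: it assumes an ordinary least squares fit, assumes that p counts the intercept, and uses the common textbook forms for the internally studentized (standardized) residual and Cook's distance, since the corresponding formula cells of the table did not survive. The helper names `solve` and `diagnostics` and the toy data are hypothetical.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def diagnostics(X, y):
    """Return leverage, standardized residuals and Cook's distance.

    X is a list of rows that already contains the intercept column, so
    p = len(X[0]) counts the intercept (an assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(p)]
    beta = solve(xtx, xty)  # least squares coefficients
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    # Diagonal of H = X (X^T X)^{-1} X^T: h_i = x_i^T (X^T X)^{-1} x_i
    lev = [sum(v * z for v, z in zip(r, solve(xtx, r))) for r in X]
    std = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(resid, lev)]
    cook = [t * t * h / (p * (1 - h)) for t, h in zip(std, lev)]
    return lev, std, cook

# Toy data (made up for illustration): nine points on y = 2x + 1
# plus one point far out in x with a response well off that line.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
ys = [3, 5, 7, 9, 11, 13, 15, 17, 19, 10]
X = [[1.0, float(x)] for x in xs]
n, p = len(X), len(X[0])

lev, std, cook = diagnostics(X, ys)

# Case numbers are 1-based, as in the tables.
high_leverage = [i + 1 for i, h in enumerate(lev) if h > 2 * p / n]
large_residual = [i + 1 for i, t in enumerate(std) if abs(t) > 3]
influential = [i + 1 for i, d in enumerate(cook) if d > 1]

print("leverage > 2p/n:", high_leverage)
print("|t_i| > 3:", large_residual)
print("Cook's D_i > 1:", influential)
```

With these toy values the far-out point exceeds both leverage cut-offs and the Cook's distance cut-off, yet its standardized residual stays below 3, which is one reason the measures in Table 1 can disagree on the same dataset.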

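| Method | Case number | No. of outliers |
|---|---|---|
| Mahalanobis Distance (MDi) | - | 0 |
| Leverage Point (hii) | - | 0 |
| DFFITS | 7, 11, 12 | 3 |
| Standardized residual | - | 0 |
| Studentized residual | 11 | 1 |
| DFBETAS (Intercept) | 7, 11, 12 | 3 |
| DFBETAS (x1) | 3, 7, 12, 16 | 4 |
| DFBETAS (x2) | 3, 5, 11, 16 | 4 |
| DFBETAS (x3) | 7, 11, 12, 16 | 4 |
| DFBETAS (x4) | 16 | 1 |
| DFBETAS (x5) | 3, 7, 11 | 3 |
| Cook’s distance (CDi) | - | 0 |
| Proposed method (median\|ε\|-MAD\|ε\|) | 11 | 1 |
| Proposed method | - | 0 |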

Table 2: Number of Outliers detected by various measures

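| Method | Case number | No. of outliers |
|---|---|---|
| Mahalanobis Distance (MDi) | 22, 23, 24 | 3 |
| Leverage Point (hii) | 22, 23, 24 | 3 |
| DFFITS | 17, 18, 20, 22, 23, 24 | 6 |
| Standardized residual | 23 | 1 |
| Studentized residual | 23, 24 | 2 |
| DFBETAS (Intercept) | 24 | 1 |
| DFBETAS (x1) | 15, 23, 24 | 3 |
| DFBETAS (x2) | 15, 17, 18, 29, 23, 24 | 6 |
| DFBETAS (x3) | 23, 24 | 2 |
| DFBETAS (x4) | 15, 18, 20, 23, 24 | 5 |
| DFBETAS (x5) | 15, 17, 22, 23, 24 | 5 |
| DFBETAS (x6) | 15, 17, 20, 23, 24 | 5 |
| DFBETAS (x7) | 15, 17, 20, 23, 24 | 5 |
| Cook’s distance (CDi) | 22, 23, 24 | 3 |
| Proposed method (median\|ε\|-MAD\|ε\|) | 15, 20, 21 | 3 |
| Proposed method | 20, 21 | 2 |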

Table 3: Number of Outliers detected by various measures

| Method | n = 100 (%) | n = 500 (%) | n = 1000 (%) |
|---|---|---|---|
| Mahalanobis Distance (MDi) | 67.88 | 72.74 | 83.67 |
| Leverage Point (hii) | 68.07 | 78.57 | 86.06 |
| DFFITS | 71.05 | 88.46 | 92.85 |
| Standardized residual | 73.45 | 84.17 | 89.01 |
| Studentized residual | 65.13 | 73.48 | 84.06 |
| DFBETAS | 82.57 | 89.99 | 96.57 |
| Cook’s distance (CDi) | 68.44 | 75.82 | 89.67 |
| Proposed method (median\|ε\|-MAD\|ε\|) | 4.97 | 7.87 | 10.52 |
| Proposed method | 3.20 | 6.80 | 9.02 |

Table 4: Outlier detection percentage when the dataset is free from outliers

| Method | n = 100 (%) | n = 500 (%) | n = 1000 (%) |
|---|---|---|---|
| Mahalanobis Distance (MDi) | 72.11 | 79.54 | 83.09 |
| Leverage Point (hii) | 73.28 | 78.56 | 81.88 |
| DFFITS | 87.89 | 92.87 | 96.35 |
| Standardized residual | 76.17 | 86.39 | 90.09 |
| Studentized residual | 75.65 | 87.36 | 93.14 |
| DFBETAS | 81.89 | 91.87 | 96.35 |
| Cook’s distance (CDi) | 78.99 | 87.05 | 92.06 |
| Proposed method (median\|ε\|-MAD\|ε\|) | 93.05 | 96.88 | 99.87 |
| Proposed method | 91.01 | 94.78 | 99.07 |

Table 5: Outlier detection percentage when the dataset contains 5% outliers