Statistics — Tests of independence

Tests of independence:

The basic principle is the same as for the ${\chi}^2$ goodness-of-fit test.
* Between categorical variables

${\chi}^2$ tests:

The standard approach is to compute the expected counts under independence, then
sum the squared differences between the observed and expected counts, each
normalized by its expected count, and find the distribution of that sum.
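
Written out, the statistic described above takes the following form. The notation is an assumption introduced here, not taken from the notes: $O_{ij}$ and $E_{ij}$ are the observed and expected counts in row $i$, column $j$ of an $r \times c$ table, $R_i$ and $C_j$ are the row and column totals, and $N$ is the grand total.

$$
{\chi}^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N}
$$
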
* Between Numerical Variables

${\chi}^2$ test:

  • Between a categorical and numerical variable?

Null Hypothesis:

  • The two variables are independent.
  • Always a right-tail test
  • The test statistic has a ${\chi}^2$ distribution if these assumptions are met:
    • The data are obtained from a random sample.
    • The expected frequency of each category is at least 5.

Properties of the test:

  • The data are the observed frequencies.
  • The data are arranged in a contingency table.
  • The degrees of freedom are the product of the degrees of freedom for the row variable and for the column variable: $(r-1)(c-1)$ for a table with $r$ rows and $c$ columns. It is not one less than the sample size.
  • It is always a right tail test.
  • It has a chi-square distribution.
  • The expected value of each cell is computed by multiplying its row total by its column total and dividing by the grand total.
  • The value of the test statistic doesn’t change if the order of the rows or columns is switched.
  • The value of the test statistic doesn’t change if the rows and columns are interchanged (transpose of the matrix).
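
The properties listed above can be checked on a small example. The following is a minimal sketch in plain Python, not a reference implementation: the function name `chi2_independence` and the counts in `observed` are made up for illustration. It computes the expected counts, the test statistic, and the degrees of freedom, then confirms that reordering rows or transposing the table leaves the statistic unchanged.

```python
import math


def chi2_independence(table):
    """Return (statistic, dof, expected) for a contingency table.

    `table` is a list of rows of observed counts (hypothetical helper).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count per cell: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of squared differences, each normalized by its expected count.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (rows - 1) * (columns - 1).
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Made-up observed frequencies: 2 rows, 3 columns.
observed = [[10, 20, 30], [20, 20, 20]]
stat, dof, expected = chi2_independence(observed)

# Assumption check: every expected count should be at least 5.
min_expected = min(min(row) for row in expected)

# Switching the row order, or transposing, should not change the statistic.
swapped = [observed[1], observed[0]]
transposed = [list(col) for col in zip(*observed)]
stat_swapped, dof_swapped, _ = chi2_independence(swapped)
stat_t, dof_t, _ = chi2_independence(transposed)

print(stat, dof, min_expected)
print(math.isclose(stat, stat_swapped), math.isclose(stat, stat_t))
```

Because the test is right-tailed, the statistic would then be compared against the upper tail of a ${\chi}^2$ distribution with `dof` degrees of freedom. That lookup is left out here to keep the sketch free of external libraries.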