Coverage for lind/design/randomization/


"""
Standard checks of randomization. These are mainly used in the unit test suite to sanity check
randomization utilities in this package.
"""

import logging
from typing import Union, List

from numpy import ndarray, median, sqrt

# set logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# define public functions (ignored by jupyter notebooks)
__all__ = [
    "runs_test"
]

####################################################################################################

def runs_test(arr: Union[ndarray, List]) -> float:
    """
    runs_test

    Runs tests are a very simple method of sanity checking a set of random numbers. A run is
    commonly defined as a series of increasing values or a series of decreasing values. The number
    of increasing, or decreasing, values is the length of the run. This implementation instead
    dichotomizes the data about the sample median, so a run here is a series of consecutive values
    that fall on the same side of the median.

    In a random data set, the probability that the (I+1)th value is larger or smaller than the Ith
    value follows a binomial distribution, which forms the basis of the runs test.

    Null Hypothesis: The sequence was produced in a random manner.

    Frequentist test statistics can be viewed as thresholds on signal to noise ratios (see equation
    below). For this test, the signal is the difference between the actual number of runs and the
    expected number of runs given the sample size.

    test statistic = Z = signal / noise = (R - R_bar) / sigma_R

    Parameters
    ----------
    arr: ndarray, list
        A 1d array or list of values to evaluate for "randomness"

    Returns
    -------
    float
        The Z test statistic, (R - R_bar) / sigma_R

    Examples
    --------
    >>> import numpy as np
    >>> random_arr = np.random.normal(0, 10, 1000)
    >>> z_statistic = runs_test(random_arr)

    References
    ----------
    Bradley
        * Distribution-Free Statistical Tests (1968), Chapter 12
    NIST
        * Engineering Statistics Handbook 1.3.5.13

    """

    # start at one run: the first value always opens a run
    runs, n1, n2 = 1, 0, 0
    arr_median = median(arr)

    # Checking for start of new run
    for i in range(len(arr)):
        # no. of runs: a new run starts when consecutive values fall on opposite sides of the
        # median (i > 0 keeps arr[i - 1] from wrapping around to the last element)
        if i > 0 and (
            (arr[i] >= arr_median > arr[i - 1]) or (arr[i] < arr_median <= arr[i - 1])
        ):
            runs += 1
        # no. of values at or above the median
        if arr[i] >= arr_median:
            n1 += 1
        # no. of values below the median
        else:
            n2 += 1

    runs_exp = ((2 * n1 * n2) / (n1 + n2)) + 1
    stan_dev = sqrt((2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (((n1 + n2) ** 2) * (n1 + n2 - 1)))
    return (runs - runs_exp) / stan_dev
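
To make the arithmetic behind the statistic concrete, here is a minimal stand-alone sketch that works through two tiny sequences by hand. It is not part of the package: `runs_z_statistic` is a hypothetical helper written only for illustration, it uses the standard library instead of numpy, and it assumes runs are counted as one plus the number of side changes between neighbouring values. The expected-runs and standard-deviation formulas are the ones used in `runs_test` above.

```python
from math import sqrt
from statistics import median


def runs_z_statistic(values):
    """Hypothetical illustration of a runs test about the median (not the package API)."""
    med = median(values)
    # Dichotomize: True if the value is at or above the median, False otherwise.
    sides = [v >= med for v in values]
    n1 = sum(sides)          # values at or above the median
    n2 = len(sides) - n1     # values below the median
    # Assumption: number of runs = 1 + number of side changes between neighbours.
    runs = 1 + sum(a != b for a, b in zip(sides, sides[1:]))
    runs_exp = (2 * n1 * n2) / (n1 + n2) + 1
    stan_dev = sqrt(
        (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (((n1 + n2) ** 2) * (n1 + n2 - 1))
    )
    return runs, runs_exp, (runs - runs_exp) / stan_dev


# Strictly alternating around the median (5): every value starts a new run.
alternating = [1, 9, 2, 8, 3, 7, 4, 6]
alt_runs, alt_exp, alt_z = runs_z_statistic(alternating)

# Fully sorted: all low values first, then all high values, so only two runs.
ordered = sorted(alternating)
ord_runs, ord_exp, ord_z = runs_z_statistic(ordered)

print(alt_runs, alt_exp, round(alt_z, 4))
print(ord_runs, ord_exp, round(ord_z, 4))
```

Both sequences contain the same values, so the expected number of runs is identical. Only the ordering changes: too much alternation pushes Z positive, while long blocks on one side of the median push it negative.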

Coverage for lind/design/randomization/_checks.py : 100%

18 statements 18 run 0 missing 0 excluded