"""
Standard checks of randomization. These are mainly used in the unit test suite to sanity check
randomization utilities in this package.
"""

import logging
from typing import Union, List

from numpy import ndarray, median, sqrt

# set logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# define public functions (ignored by jupyter notebooks)
__all__ = [
    "runs_test"
]

19 

20#################################################################################################### 

21 

22 

def runs_test(arr: Union[ndarray, List]) -> float:
    """
    runs_test

    Runs tests are a very simple method of sanity checking a set of random numbers. A run is
    defined as a series of increasing values or a series of decreasing values. The number of
    increasing, or decreasing, values is the length of the run. This implementation counts runs
    of consecutive values that sit at or above the median versus below it.

    In a random data set, the probability that the (I+1)th value is larger or smaller than the Ith
    value follows a binomial distribution, which forms the basis of the runs test.

    Null Hypothesis: The sequence was produced in a random manner.

    Frequentist test statistics can be viewed as thresholds on signal to noise ratios (see equation
    below). For this test, the signal is the difference between the actual number of runs and the
    expected number of runs given the sample size.

    test statistic = Z = signal / noise = (R - R_bar) / sigma_R

    where, with n1 values at or above the median and n2 values below it,

    R_bar = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_R = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / ((n1 + n2) ** 2 * (n1 + n2 - 1)))

    Parameters
    ----------
    arr: ndarray, list
        A 1d array or list of values to evaluate for "randomness"

    Returns
    -------
    float
        The Z test statistic. Values far from zero are evidence against the null hypothesis.

    Examples
    --------
    >>> import numpy as np
    >>> random_arr = np.random.normal(0, 10, 1000)
    >>> z_statistic = runs_test(random_arr)

    References
    ----------
    Bradley
        * Distribution-Free Statistical Tests (1968), Chapter 12
    NIST
        * Engineering Statistics Handbook 1.3.5.13

    """

    arr_median = median(arr)
    # the first value always opens a run, so start the count at 1
    runs, n1, n2 = 1, 0, 0

    for i in range(len(arr)):
        # a new run starts when the sequence crosses the median; skip i == 0, where arr[i - 1]
        # would wrap around to the last element
        if i > 0 and (arr[i] >= arr_median) != (arr[i - 1] >= arr_median):
            runs += 1
        # no. of values at or above the median
        if arr[i] >= arr_median:
            n1 += 1
        # no. of values below the median
        else:
            n2 += 1

    # expected number of runs and its standard deviation under the null hypothesis
    runs_exp = ((2 * n1 * n2) / (n1 + n2)) + 1
    stan_dev = sqrt((2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (((n1 + n2) ** 2) * (n1 + n2 - 1)))
    return (runs - runs_exp) / stan_dev
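

# Illustrative sketch only, not part of the module's public API: one way the Z statistic above
# might be turned into a two-sided p-value, assuming Z is approximately standard normal under the
# null hypothesis (reasonable for larger samples). The names `runs_z_statistic` and
# `two_sided_p_value` are hypothetical; the first is a stand-alone, vectorized restatement of the
# median-split run count so the example can run on its own. It does not guard against the
# degenerate case where every value falls on one side of the median.

```python
import math

import numpy as np


def runs_z_statistic(values):
    """Hypothetical stand-alone variant of the median-split runs test."""
    arr = np.asarray(values, dtype=float)
    above = arr >= np.median(arr)      # True where a value is at or above the median
    n1 = int(above.sum())              # no. of values at or above the median
    n2 = int(above.size - n1)          # no. of values below the median
    # The first value opens a run; each change of side opens another.
    runs = 1 + int(np.count_nonzero(above[1:] != above[:-1]))
    n = n1 + n2
    runs_exp = 2 * n1 * n2 / n + 1
    stan_dev = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    return (runs - runs_exp) / stan_dev


def two_sided_p_value(z):
    """Two-sided p-value for a Z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


# A strictly alternating sequence has far more runs than chance would produce,
# so the statistic is large and positive and the p-value is small.
alternating = [1, 2] * 10
z_alt = runs_z_statistic(alternating)
p_alt = two_sided_p_value(z_alt)
```

# A small p-value is evidence against the null hypothesis of randomness; the sign of Z indicates
# whether there are too many runs (positive) or too few (negative).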