Statistical hypotheses. Null and alternative hypotheses

The sample data obtained in research is always limited and largely random. That is why mathematical statistics is used to analyze such data, which makes it possible to generalize the patterns obtained from the sample and extend them to the entire population.

Let us emphasize once again that the data obtained as a result of an experiment on any sample serves as the basis for making judgments about the general population. However, due to random probabilistic reasons, an estimate of the parameters of a population made on the basis of experimental (sample) data will always be accompanied by an error, and therefore such estimates should be considered as conjectural, and not as definitive statements.

As G.V. Sukhodolsky points out: "A statistical hypothesis is usually understood as a formal assumption that the similarity (or difference) of some parametric or functional characteristics is random or, conversely, non-random." Such assumptions about the properties and parameters of the general population, about differences between samples, or about dependencies between characteristics are called statistical hypotheses.

The essence of testing a statistical hypothesis is to establish whether the experimental data are consistent with the hypothesis put forward: is it permissible to attribute the discrepancy between the hypothesis and the result of statistical analysis of the experimental data to random causes? Thus, a statistical hypothesis is a scientific hypothesis that admits statistical testing, and mathematical statistics is a scientific discipline whose task is the scientifically grounded testing of statistical hypotheses.

When testing statistical hypotheses, two concepts are used: the so-called null hypothesis (denoted H0) and the alternative hypothesis (denoted H1).

When comparing distributions, it is generally accepted that the null hypothesis H0 is a hypothesis about similarity, and the alternative H1 is a hypothesis about difference. Thus, accepting the null hypothesis H0 indicates the absence of differences, while accepting the alternative hypothesis H1 indicates the presence of differences.

For example, suppose two samples are drawn from normally distributed populations and we face the task of comparing them. One sample has parameters x̄1 and σ1, the other x̄2 and σ2. The null hypothesis H0 proceeds from the assumption that x̄1 = x̄2 and σ1 = σ2, that is, the difference between the two means, x̄1 − x̄2, equals 0 and the difference between the two standard deviations, σ1 − σ2, equals 0 (hence the name of the hypothesis: null).

Accepting the alternative hypothesis H1 indicates the presence of differences and proceeds from the assumption that x̄1 − x̄2 ≠ 0 and/or σ1 − σ2 ≠ 0.


Very often the alternative hypothesis is called the experimental hypothesis if the study aims to prove the existence of differences between samples. If the researcher wants to prove precisely the absence of differences, then the experimental hypothesis is the null hypothesis.

When comparing samples, alternative statistical hypotheses can be directional or non-directional.

If we notice that in one sample the individual values of the subjects on some attribute are higher, and in the other lower, then to test the differences between the samples we formulate a directional hypothesis. If we want to prove that under some experimental influence more pronounced changes occurred in one group, we must also formulate a directional hypothesis. Formally it is written as H1: x̄1 exceeds x̄2. The null hypothesis looks like this: H0: x̄1 does not exceed x̄2.

If we want to prove that the forms of the distributions differ, then we formulate non-directional hypotheses. Formally they are written as H1: x̄1 differs from x̄2, with the null hypothesis H0: x̄1 does not differ from x̄2.
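In practice the distinction matters because a directional hypothesis is tested with a one-sided test and a non-directional one with a two-sided test. A minimal sketch in Python, assuming invented sample data (the scipy calls themselves are real):

    # Sketch: directional (one-sided) vs non-directional (two-sided) test.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x1 = rng.normal(loc=5.0, scale=1.0, size=30)  # sample 1 (illustrative)
    x2 = rng.normal(loc=4.5, scale=1.0, size=30)  # sample 2 (illustrative)

    # Directional H1: x1 exceeds x2 -> one-sided test
    t_one, p_one = stats.ttest_ind(x1, x2, alternative='greater')

    # Non-directional H1: x1 differs from x2 -> two-sided test
    t_two, p_two = stats.ttest_ind(x1, x2, alternative='two-sided')

    print(f"one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")

For the same data (with the observed difference in the hypothesized direction), the one-sided p-value is half the two-sided one, which is why the directional formulation should be chosen before looking at the data.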

Generally speaking, when accepting or rejecting hypotheses, various options are possible.

When testing a hypothesis, the experimental data may contradict hypothesis H0; then this hypothesis is rejected. Otherwise, i.e., if the experimental data are consistent with hypothesis H0, it is not rejected. In such cases it is often said that hypothesis H0 is accepted (although this formulation is not entirely accurate, it is widespread). This shows that statistical testing of hypotheses on experimental, sample data is inevitably associated with the risk (probability) of making a false decision. Errors of two kinds are possible here. A type I error occurs when a decision is made to reject hypothesis H0 although it is in fact true. A type II error occurs when a decision is made not to reject hypothesis H0 although it is in fact false. Obviously, correct conclusions can also be reached in two cases. The above can be presented in the form of table 25.

Since statistics as a research method deals with data in which the patterns of interest to the researcher are distorted by various random factors, most statistical calculations are accompanied by testing of some assumptions or hypotheses about the source of this data.

A pedagogical hypothesis (a scientific assumption about the advantage of one method or another) is, in the process of statistical analysis, translated into the language of statistical science and reformulated as at least two statistical hypotheses.

There are two types of hypotheses. The first type, descriptive hypotheses, describe causes and possible consequences. The second type, explanatory hypotheses, provide an explanation of possible consequences from certain causes and also characterize the conditions under which these consequences will necessarily follow, i.e., they explain owing to what factors and conditions a given consequence will occur. Descriptive hypotheses have no predictive power, whereas explanatory hypotheses do: they lead researchers to assume the existence of regular connections between phenomena, factors, and conditions.

Hypotheses in educational research may suggest that one of the means (or a group of them) will be more effective than others. Here a hypothetical assumption is made about the comparative effectiveness of the means, methods, and forms of training.

A higher level of hypothetical prediction is reached when the author of the study hypothesizes that a certain system of measures will not only be better than another but that, among a number of possible systems, it is optimal from the point of view of certain criteria. Such a hypothesis requires even more rigorous and therefore more detailed proof.

Kulaichev A.P. Methods and Tools of Data Analysis in the Windows Environment. 3rd ed., revised and expanded. Moscow: InKo, 1999, pp. 129–131.

Psychological and Pedagogical Dictionary for Teachers and Heads of Educational Institutions. Rostov-on-Don: Phoenix, 1998, p. 92.

Based on the data collected in statistical studies, after processing, conclusions are drawn about the phenomena being studied. These conclusions are reached by putting forward and testing statistical hypotheses. A statistical hypothesis is any statement about the type or properties of the distribution of the random variables observed in an experiment. Statistical hypotheses are tested by statistical methods.

The hypothesis being tested is called the main (null) hypothesis and is denoted H0. Alongside the null hypothesis, an alternative (competing) hypothesis H1 is put forward, negating the main one. Thus, as a result of the test, one and only one of the hypotheses will be accepted, and the second will be rejected.

Types of errors. The hypothesis put forward is tested based on a study of a sample obtained from the general population. Due to the randomness of the sample, the test does not always reach the correct conclusion. The following situations may arise:
1. The main hypothesis is correct and it is accepted.
2. The main hypothesis is correct, but it is rejected.
3. The main hypothesis is not correct and it is rejected.
4. The main hypothesis is not true, but it is accepted.
In case 2 we speak of a type I error; in case 4, of a type II error.
Thus, one sample may lead to a correct decision and another to an incorrect one. The decision is made based on the value of some function of the sample, called a statistical characteristic, a statistical criterion, or simply a statistic. The set of values of this statistic can be divided into two disjoint subsets:

  • the subset of statistic values at which H0 is accepted (not rejected), called the region of hypothesis acceptance (admissible region);
  • the subset of statistic values at which the hypothesis H0 is rejected and the hypothesis H1 is accepted, called the critical region.

Conclusions:

  1. A criterion is a random variable K that allows one to accept or reject the null hypothesis H0.
  2. When testing hypotheses, two types of errors can be made.
    A type I error consists in rejecting the hypothesis H0 when it is true (a "false alarm"). The probability of making a type I error is denoted by α and is called the significance level. Most often in practice it is assumed that α = 0.05 or α = 0.01.
    A type II error consists in accepting the hypothesis H0 when it is false (a "missed target"). The probability of an error of this type is denoted by β.
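Both probabilities can be made concrete by simulation. A minimal sketch, assuming a one-sample t-test and invented population parameters: it counts how often H0 is rejected when it is true (an estimate of α) and how often it is retained when it is false (an estimate of β).

    # Sketch: Monte Carlo estimates of alpha and beta.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    alpha, n, trials = 0.05, 25, 10_000

    # H0 true (population mean really is 0): rejections are type I errors.
    type1 = sum(stats.ttest_1samp(rng.normal(0.0, 1.0, n), 0.0).pvalue < alpha
                for _ in range(trials)) / trials

    # H0 false (true mean is 0.5): non-rejections are type II errors.
    type2 = sum(stats.ttest_1samp(rng.normal(0.5, 1.0, n), 0.0).pvalue >= alpha
                for _ in range(trials)) / trials

    print(f"estimated alpha = {type1:.3f} (nominal {alpha})")
    print(f"estimated beta  = {type2:.3f}")

The estimate of α should come out close to the nominal 0.05, while β depends on the true effect size and the sample size.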

Classification of hypotheses

The main hypothesis H0 about the value of an unknown parameter q of a distribution usually looks like this:
H0: q = q0.
The competing hypothesis H1 may have the form:
H1: q < q0, H1: q > q0, or H1: q ≠ q0.
Accordingly, one obtains left-sided, right-sided, or two-sided critical regions. The boundary points of the critical regions (critical points) are determined from the distribution tables of the corresponding statistics.

When testing a hypothesis, it is reasonable to reduce the probability of making incorrect decisions. The admissible probability of a type I error is usually denoted α and is called the significance level. Its value is usually small (0.1, 0.05, 0.01, 0.001, ...). But a decrease in the probability of a type I error leads to an increase in the probability of a type II error (β): the desire to accept only correct hypotheses causes an increase in the number of rejected correct hypotheses. Therefore, the choice of significance level is determined by the importance of the problem posed and the severity of the consequences of an incorrect decision.
Testing a statistical hypothesis consists of the following steps:
1) definition of the hypotheses H0 and H1;
2) choice of the statistic and setting of the significance level;
3) determination of the critical points K_cr and the critical region;
4) calculation of the experimental value of the statistic K_ex from the sample;
5) comparison of the statistic with the critical region (K_cr and K_ex);
6) decision making: if the value of the statistic does not fall into the critical region, the hypothesis H0 is accepted and the hypothesis H1 is rejected; if it falls into the critical region, the hypothesis H0 is rejected and the hypothesis H1 is accepted. The results of testing a statistical hypothesis should be interpreted as follows: if the hypothesis H1 has been accepted, it can be considered proven; if the hypothesis H0 has been accepted, it has only been recognized as not contradicting the observation results. Note, however, that hypotheses other than H0 may also have this property.
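A minimal sketch of these six steps in Python for the simplest case, a one-sample test of H0: x̄ = a against H1: x̄ ≠ a (the data and the value of a are invented for illustration):

    # Sketch: the six steps of testing a statistical hypothesis.
    import numpy as np
    from scipy import stats

    x = np.array([5.1, 4.8, 5.3, 5.0, 4.7, 5.2, 4.9, 5.4])  # sample
    a = 5.0        # step 1: H0: mean = a, H1: mean != a
    alpha = 0.05   # step 2: statistic = Student's t, significance level

    df = len(x) - 1
    t_cr = stats.t.ppf(1 - alpha / 2, df)                    # step 3
    t_ex = (x.mean() - a) * np.sqrt(len(x)) / x.std(ddof=1)  # step 4

    # steps 5-6: compare K_ex with the critical region and decide
    if abs(t_ex) < t_cr:
        print(f"|t_ex| = {abs(t_ex):.3f} < t_cr = {t_cr:.3f}: accept H0")
    else:
        print(f"|t_ex| = {abs(t_ex):.3f} >= t_cr = {t_cr:.3f}: reject H0")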

Classification of hypothesis tests

Let us next consider several different statistical hypotheses and mechanisms for testing them.
I) Hypothesis about the general mean of a normal distribution with unknown variance. We assume that the population has a normal distribution, its mean and variance are unknown, but there is reason to believe that the general mean is equal to a. At the significance level α we need to test the hypothesis H0: x̄ = a. As an alternative, one of the three hypotheses discussed above can be used. In this case the statistic is the random variable

    t = (x̄ − a)·√n / s,

which has a Student distribution with n − 1 degrees of freedom (s is the sample standard deviation). The corresponding experimental (observed) value t_ex is calculated. Under the alternative hypothesis H1: x̄ > a, the critical value t_cr is found according to the significance level α and the number of degrees of freedom n − 1; the null hypothesis is accepted if t_ex < t_cr. Under the alternative hypothesis H1: x̄ ≠ a, the critical value is found at the significance level α/2 and the same number of degrees of freedom; the null hypothesis is accepted if |t_ex| < t_cr.

II) Hypothesis about the equality of two mean values of arbitrarily distributed populations (large independent samples). At the significance level α we need to test the hypothesis H0: x̄ = ȳ. If the volume of both samples is large, we can assume that the sample means have a normal distribution and that their variances are known. In this case the statistic is the random variable

    Z = (x̄ − ȳ) / √(D(X)/n + D(Y)/m),

which has a normal distribution with M(Z) = 0 and D(Z) = 1. The corresponding experimental value z_ex is calculated. Under the alternative hypothesis H1: x̄ > ȳ, the critical value z_cr is found from the Laplace function table from the condition Φ(z_cr) = 0.5 − α; if z_ex < z_cr, the null hypothesis is accepted, otherwise it is rejected. Under the alternative hypothesis H1: x̄ ≠ ȳ, the critical value is found from the condition Φ(z_cr) = 0.5·(1 − α); the null hypothesis is accepted if |z_ex| < z_cr.
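A sketch of hypothesis II under the stated assumptions (large independent samples; the data are invented). Instead of the Laplace function table, the standard normal quantile function is used: the condition Φ(z_cr) = 0.5·(1 − α) for the Laplace function is equivalent to z_cr = norm.ppf(1 − α/2).

    # Sketch: two-sample z-test for large independent samples.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.normal(10.0, 2.0, 200)  # large sample 1 (illustrative)
    y = rng.normal(9.6, 2.5, 250)   # large sample 2 (illustrative)

    z_ex = (x.mean() - y.mean()) / np.sqrt(x.var(ddof=1) / len(x)
                                           + y.var(ddof=1) / len(y))
    alpha = 0.05
    z_cr = stats.norm.ppf(1 - alpha / 2)  # two-sided critical value

    print(f"z_ex = {z_ex:.3f}, z_cr = {z_cr:.3f}")
    print("accept H0" if abs(z_ex) < z_cr else "reject H0")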

III) Hypothesis about the equality of two mean values of normally distributed populations whose variances are unknown but identical (small independent samples). At the significance level α we need to test the main hypothesis H0: x̄ = ȳ. As the statistic we use the random variable

    t = (x̄ − ȳ) / (s_p·√(1/n_x + 1/n_y)),  where s_p² = ((n_x − 1)·s_x² + (n_y − 1)·s_y²) / (n_x + n_y − 2),

which has a Student distribution with (n_x + n_y − 2) degrees of freedom. The corresponding experimental value t_ex is calculated, and the critical value t_cr is found from the table of critical points of the Student distribution.

Everything is solved similarly to hypothesis (I).

IV) Hypothesis about the equality of two variances of normally distributed populations. In this case, at the significance level α, we need to test the hypothesis H0: D(X) = D(Y). The statistic is the random variable

    F = s_b² / s_s²,

which has a Fisher–Snedecor distribution with f1 = n_b − 1 and f2 = n_s − 1 degrees of freedom (s_b² is the larger of the two sample variances and n_b the size of its sample; s_s² is the smaller variance and n_s the size of its sample). The corresponding experimental (observed) value F_ex is calculated. The critical value F_cr under the alternative hypothesis H1: D(X) > D(Y) is found from the table of critical points of the Fisher–Snedecor distribution according to the significance level α and the numbers of degrees of freedom f1 and f2. The null hypothesis is accepted if F_ex < F_cr.
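A sketch of hypothesis IV, assuming invented data; the larger sample variance goes into the numerator, as the text prescribes:

    # Sketch: F-test for the equality of two variances.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(0.0, 1.3, 21)  # illustrative sample 1
    y = rng.normal(0.0, 1.0, 25)  # illustrative sample 2

    pairs = sorted([(x.var(ddof=1), len(x)), (y.var(ddof=1), len(y))],
                   reverse=True)               # larger variance first
    (s2_big, n_big), (s2_small, n_small) = pairs

    f_ex = s2_big / s2_small
    f_cr = stats.f.ppf(1 - 0.05, n_big - 1, n_small - 1)

    print(f"F_ex = {f_ex:.3f}, F_cr = {f_cr:.3f}")
    print("accept H0" if f_ex < f_cr else "reject H0")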

V) Hypothesis about the equality of several variances of normally distributed populations based on samples of the same size. In this case, at the significance level α, we need to test the hypothesis H0: D(X1) = D(X2) = … = D(X_l). The statistic is the random variable

    G = s_max² / (s_1² + s_2² + … + s_l²),

which has a Cochran distribution with degrees of freedom f = n − 1 and l (n is the size of each sample, l is the number of samples). This hypothesis is tested in the same way as the previous one, using the table of critical points of the Cochran distribution.

VI) Hypothesis about the significance of a correlation. In this case, at the significance level α, we need to test the hypothesis H0: r = 0. (If the correlation coefficient is equal to zero, the corresponding quantities are not related to each other.) The statistic in this case is the random variable

    t = r·√(n − 2) / √(1 − r²),

which has a Student distribution with k = n − 2 degrees of freedom. The test of this hypothesis is carried out similarly to the test of hypothesis (I).
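A sketch of hypothesis VI with invented data; the correlation is built in by construction so that the statistic has something to detect:

    # Sketch: testing H0: r = 0 with the Student statistic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    x = rng.normal(size=40)
    y = 0.4 * x + rng.normal(size=40)  # correlated by construction

    r = np.corrcoef(x, y)[0, 1]
    n = len(x)
    t_ex = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    t_cr = stats.t.ppf(1 - 0.05 / 2, n - 2)

    print(f"r = {r:.3f}, t_ex = {t_ex:.3f}, t_cr = {t_cr:.3f}")
    print("significant" if abs(t_ex) > t_cr else "not significant")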


VII) Hypothesis about the value of the probability of occurrence of an event. A sufficiently large number n of independent trials has been carried out, in which the event A occurred m times. There is reason to believe that the probability of this event occurring in a single trial is p0. At the significance level α we need to test the hypothesis that the probability of the event A is equal to the hypothetical probability p0. (Since the probability is estimated by the relative frequency, the hypothesis being tested can be formulated in another way: does the observed relative frequency differ significantly from the hypothetical probability or not?)

The number of trials is quite large, so the relative frequency m/n of the event A is distributed according to the normal law. If the null hypothesis is true, its mathematical expectation is p0 and its variance is p0·(1 − p0)/n. In accordance with this, we choose as the statistic the random variable

    U = (m/n − p0)·√n / √(p0·(1 − p0)),

which is distributed approximately according to the normal law with zero mathematical expectation and unit variance. This hypothesis is tested in exactly the same way as in case (I).
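A sketch of hypothesis VII; the numbers n, m, and p0 are invented for illustration:

    # Sketch: does the relative frequency m/n differ from p0 significantly?
    import numpy as np
    from scipy import stats

    n, m, p0 = 400, 184, 0.5  # trials, occurrences, hypothetical probability
    u_ex = (m / n - p0) * np.sqrt(n) / np.sqrt(p0 * (1 - p0))
    u_cr = stats.norm.ppf(1 - 0.05 / 2)  # two-sided critical value

    print(f"u_ex = {u_ex:.3f}, u_cr = {u_cr:.3f}")
    print("accept H0" if abs(u_ex) < u_cr else "reject H0")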


STATISTICAL HYPOTHESES

The sample data obtained in experiments is always limited and largely random in nature. That is why mathematical statistics is used to analyze such data, which makes it possible to generalize the patterns obtained from the sample and extend them to the entire population.

The data obtained as a result of an experiment on any sample serve as the basis for making judgments about the general population. However, due to random probabilistic reasons, an estimate of the parameters of the population made on the basis of experimental (sample) data will always be accompanied by an error, and therefore such estimates should be considered as conjectural rather than as definitive statements. Such assumptions about the properties and parameters of the population are called statistical hypotheses. As G.V. Sukhodolsky points out: "A statistical hypothesis is usually understood as a formal assumption that the similarity (or difference) of certain parametric or functional characteristics is random or, conversely, non-random."

The essence of testing a statistical hypothesis is to determine whether the experimental data and the proposed hypothesis are consistent, and whether the discrepancy between the hypothesis and the result of statistical analysis of experimental data can be attributed to random causes. Thus, a statistical hypothesis is a scientific hypothesis that can be tested statistically, and mathematical statistics is a scientific discipline whose task is to scientifically test statistical hypotheses.

Statistical hypotheses are divided into null and alternative, directional and non-directional.

The null hypothesis (H0) is the hypothesis of no differences. If we want to prove the significance of differences, the null hypothesis must be refuted; otherwise it must be confirmed.

The alternative hypothesis (H1) is the hypothesis about the significance of differences. This is what we want to prove, which is why it is sometimes called the experimental hypothesis.

There are problems in which we want to prove precisely the insignificance of differences, that is, to confirm the null hypothesis. For example, we may need to make sure that different subjects receive tasks that, although different, are balanced in difficulty, or that the experimental and control samples do not differ from each other in some significant characteristics. However, more often we still need to prove the significance of differences, because they are more informative for us in the search for something new.

The null and alternative hypotheses can be directional or non-directional.

Directional hypotheses – if the characteristic values are assumed to be higher in one group and lower in the other:

H0: X1 does not exceed X2,

H1: X1 exceeds X2.

Non-directional hypotheses – if it is assumed that the forms of distribution of the characteristic in groups differ:

H0: X1 does not differ from X2,

H1: X1 differs from X2.

If we notice that in one of the groups the individual values of the subjects on some characteristic, for example, social activity, are higher, and in the other group they are lower, then to test the significance of these differences we need to formulate directional hypotheses.

If we want to prove that in group A, under the influence of some experimental influences, more pronounced changes occurred than in group B, then we also need to formulate directional hypotheses.

If we want to prove that the forms of distribution of a characteristic in groups A and B differ, then non-directional hypotheses are formulated.

Hypotheses are tested using criteria for statistical assessment of differences.

The resulting conclusion is called a statistical decision. We emphasize that such a decision is always probabilistic. When testing a hypothesis, the experimental data may contradict the hypothesis H0; then this hypothesis is rejected. Otherwise, i.e., if the experimental data are consistent with the hypothesis H0, it is not rejected. In such cases it is often said that the hypothesis H0 is accepted. This shows that statistical testing of hypotheses based on experimental sample data is inevitably associated with the risk (probability) of making a false decision. In this case, errors of two kinds are possible. A type I error will occur when a decision is made to reject the hypothesis H0, although in reality it turns out to be true. A type II error will occur when a decision is made not to reject the hypothesis H0, although in reality it is false. Obviously, correct conclusions can also be reached in two cases. Table 7.1 summarizes the above.

Table 7.1

    Statistical decision      H0 is true          H0 is false
    H0 is rejected            Type I error        Correct decision
    H0 is not rejected        Correct decision    Type II error

It is possible that the psychologist may be mistaken in his statistical decision; as we see from Table 7.1, these errors can be of only two types. Since it is impossible to eliminate errors when accepting statistical hypotheses, it is necessary to minimize the possible consequences of accepting an incorrect statistical hypothesis. In most cases, the only way to reduce both errors at once is to increase the sample size.

STATISTICAL CRITERIA

A statistical test is a decision rule that ensures reliable behavior, that is, accepting a true hypothesis and rejecting a false one with high probability.

The term "statistical criterion" also denotes the method for calculating a certain number, as well as that number itself.

When we say that the reliability of differences was determined by the criterion φ* (the criterion of Fisher's angular transformation), we mean that the φ* method was used to calculate a specific number.

By the ratio of the empirical and critical values ​​of the criterion, we can judge whether the null hypothesis is confirmed or refuted.

In most cases, for us to recognize differences as significant, the empirical value of the criterion must exceed the critical value, although there are criteria (for example, the Mann–Whitney test or the sign test) in which we must adhere to the opposite rule.

In some cases the calculation formula for the criterion includes the number of observations in the sample under study, denoted n. In this case the empirical value of the criterion directly indicates the significance of the differences: using a special table, we determine which level of statistical significance corresponds to a given empirical value. An example of such a criterion is the φ* criterion, calculated on the basis of Fisher's angular transformation.

In most cases, however, the same empirical value of the criterion may or may not be significant depending on the number of observations in the sample under study (n) or on the so-called number of degrees of freedom, denoted ν or df.

The number of degrees of freedom ν is equal to the number of classes of the variation series minus the number of conditions under which it was formed. These conditions include the sample size (n), means, and variances.

Let's say a group of 50 people was divided into three classes according to the principle:

Can work on a computer;

Can only perform certain operations;

Doesn't know how to work on a computer.

The first and second classes each included 20 people, the third 10.

We are limited by only one condition, the sample size. Therefore, even if we have lost the data on how many people cannot work on a computer, we can determine this, knowing that the first and second classes contain 20 subjects each. We are not free to determine the number of subjects in the third category; "freedom" extends only to the first two cells of the classification.
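The arithmetic of this example, as a minimal sketch (the numbers are taken from the example above):

    # Sketch: the forced third cell and the degrees of freedom.
    n_total = 50
    cell_1, cell_2 = 20, 20              # freely known cells
    cell_3 = n_total - cell_1 - cell_2   # forced by the sample size: 10
    df = 3 - 1                           # classes minus one condition (n)
    print(cell_3, df)                    # -> 10 2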

5. The main problems of applied statistics: data description, estimation, and hypothesis testing

Basic concepts used in hypothesis testing

A statistical hypothesis is any assumption concerning the unknown distribution of random variables (elements). Here are the formulations of several statistical hypotheses:

1. The observation results have a normal distribution with zero mathematical expectation.
2. Observation results have a distribution function N(0,1).
3. The observation results have a normal distribution.
4. The results of observations in two independent samples have the same normal distribution.
5. The results of observations in two independent samples have the same distribution.

There are null and alternative hypotheses. The null hypothesis is the hypothesis to be tested. An alternative hypothesis is any admissible hypothesis other than the null one. The null hypothesis is denoted H0, the alternative H1 (from the English word "hypothesis").

The choice of particular null and alternative hypotheses is determined by the applied problems facing the manager, economist, engineer, or researcher.

Let us look at examples.

Example 11. Let the null hypothesis be hypothesis 2 from the above list, and the alternative hypothesis be hypothesis 1. This means that the real situation is described by a probabilistic model according to which the results of observations are considered as realizations of independent identically distributed random variables with distribution function N(0, σ), where the parameter σ is unknown to the statistician. Within this model the null hypothesis is written as follows:

H0: σ = 1,

and the alternative like this:

H1: σ ≠ 1.

Example 12. Let the null hypothesis still be hypothesis 2 from the list above, and the alternative hypothesis be hypothesis 3 from the same list. Then, in the probabilistic model of the managerial, economic, or production situation, it is assumed that the observation results form a sample from a normal distribution N(m, σ) for some values of m and σ. The hypotheses are written like this:

H0: m = 0, σ = 1 (both parameters take fixed values);

H1: m ≠ 0 and/or σ ≠ 1 (i.e., either m ≠ 0, or σ ≠ 1, or both m ≠ 0 and σ ≠ 1).

Example 13. Let H0 be hypothesis 1 from the above list, and H1 hypothesis 3 from the same list. Then the probabilistic model is the same as in example 12, and

H0: m = 0, σ is arbitrary;

H1: m ≠ 0, σ is arbitrary.

Example 14. Let H0 be hypothesis 2 from the above list, and according to H1 the observation results have a distribution function F(x) not coinciding with the standard normal distribution function Φ(x). Then

H0: F(x) = Φ(x) for all x (written as F(x) ≡ Φ(x));

H1: F(x0) ≠ Φ(x0) for some x0 (i.e., it is not true that F(x) ≡ Φ(x)).

Note. Here ≡ is the sign of identical coincidence of functions (i.e., coincidence for all possible values of the argument x).
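For a null hypothesis of this kind (a fully specified distribution function against an arbitrary alternative), goodness-of-fit criteria are appropriate. A minimal sketch with the Kolmogorov–Smirnov test, assuming invented observations:

    # Sketch: testing H0: F(x) = Phi(x) with the Kolmogorov-Smirnov test.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    sample = rng.normal(0.0, 1.0, 100)  # illustrative observations

    stat, p = stats.kstest(sample, 'norm')  # compares with N(0, 1)
    print(f"D = {stat:.3f}, p = {p:.3f}")
    print("accept H0" if p >= 0.05 else "reject H0")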

Example 15. Let H0 be hypothesis 3 from the above list, and according to H1 the observation results have a distribution function F(x) that is not normal. Then

H0: F(x) = Φ((x − m)/σ) for some m, σ;

H1: for any m, σ there is an x0 = x0(m, σ) such that F(x0) ≠ Φ((x0 − m)/σ).

Example 16. Let H0 be hypothesis 4 from the list above: according to the probability model, two samples are drawn from populations with distribution functions F(x) and G(x) that are normal with parameters m1, σ1 and m2, σ2 respectively, and H1 is the negation of H0. Then

H0: m1 = m2, σ1 = σ2, with m1 and σ1 arbitrary;

H1: m1 ≠ m2 and/or σ1 ≠ σ2.

Example 17. Let it additionally be known, under the conditions of example 16, that σ1 = σ2. Then

H0: m1 = m2, σ > 0, with m1 and σ arbitrary;

H1: m1 ≠ m2, σ > 0.

Example 18. Let H0 be hypothesis 5 from the list above: according to the probability model, two samples are drawn from populations with distribution functions F(x) and G(x) respectively, and H1 is the negation of H0. Then

H0: F(x) ≡ G(x), where F(x) is an arbitrary distribution function;

H1: F(x0) ≠ G(x0) for some x0, where F(x) and G(x) are arbitrary distribution functions.

Example 19. Let it additionally be assumed, under the conditions of example 18, that the distribution functions F(x) and G(x) differ only by a shift, i.e., G(x) = F(x − a) for some a. Then

H0: G(x) = F(x) (i.e., a = 0), where F(x) is an arbitrary distribution function;

H1: G(x) = F(x − a), a ≠ 0, where F(x) is an arbitrary distribution function.

Example 20. Let it additionally be known, under the conditions of example 14, that according to the probabilistic model of the situation F(x) is a normal distribution function with unit variance, i.e., it has the form Φ(x − m). Then

H0: m = 0 (i.e., F(x) = Φ(x) for all x, written as F(x) ≡ Φ(x));

H1: m ≠ 0 (i.e., it is not true that F(x) ≡ Φ(x)).

Example 21. In the statistical regulation of technological, economic, managerial, or other processes, one considers a sample drawn from a population with a normal distribution and known variance, and the hypotheses

H0: m = m0,

H1: m = m1,

where the parameter value m = m0 corresponds to the established course of the process, and the transition to m = m1 indicates a disorder.

Example 22. In statistical acceptance control, the number of defective product units in the sample follows a hypergeometric distribution; the unknown parameter is p = D/N, the defect level, where N is the volume of the product batch and D is the total number of defective units in the batch. The control plans used in regulatory, technical, and commercial documentation (standards, supply contracts, etc.) are often aimed at testing the hypothesis

H0: p ≤ AQL

against the alternative hypothesis

H1: p ≥ LQ,

where AQL is the acceptance level of defects and LQ is the rejection level of defects (obviously AQL < LQ).

Example 23. As indicators of the stability of a technological, economic, managerial, or other process, a number of characteristics of the distributions of the controlled indicators are used, in particular the coefficient of variation v = σ/M(X). In this case one needs to test the hypothesis

H0: v ≤ v0

against the alternative hypothesis

H1: v > v0,

where v0 is some predetermined boundary value.

Example 24. Let the probabilistic model of two samples be the same as in example 18, and denote the mathematical expectations of the observation results in the first and second samples by M(X) and M(Y) respectively. In a number of situations the null hypothesis

H0: M(X) = M(Y)

is tested against the alternative hypothesis

H1: M(X) ≠ M(Y).

Example 25. The great importance in mathematical statistics of distribution functions symmetric about 0 was noted above. When checking symmetry,

H0: F(−x) = 1 − F(x) for all x, with F otherwise arbitrary;

H1: F(−x0) ≠ 1 − F(x0) for some x0, with F otherwise arbitrary.

In probabilistic-statistical methods of decision making, many other formulations of problems of testing statistical hypotheses are also used; some of them are discussed below.

A specific problem of testing a statistical hypothesis is fully described if the null and alternative hypotheses are specified. The choice of the method for testing a statistical hypothesis, as well as the properties and characteristics of the methods, are determined by both the null and the alternative hypotheses. To test the same null hypothesis under different alternative hypotheses one should, generally speaking, use different methods. Thus, in examples 14 and 20 the null hypothesis is the same, but the alternatives differ. Therefore, under the conditions of example 14 one should use methods based on criteria of goodness of fit with a parametric family (of the Kolmogorov type or the omega-square type), while under the conditions of example 20 one should use methods based on Student's test or the Cramer–Welch test. If Student's test were used under the conditions of example 14, it would not solve the problems posed. If a Kolmogorov-type goodness-of-fit test were used under the conditions of example 20, it would, on the contrary, solve the problems posed, although perhaps worse than Student's t-test, which is specially adapted to this case.

When processing real data, the right choice of the hypotheses H0 and H1 is of great importance. The assumptions made, for example the normality of the distribution, must be carefully justified, in particular by statistical methods. Note that in the vast majority of specific applied settings the distribution of observation results differs from normal. A situation often arises in which the type of the null hypothesis follows from the formulation of the applied problem, while the type of the alternative hypothesis is unclear. In such cases one should consider the alternative hypothesis of the most general form and use methods that solve the problem for all possible H1. In particular, when testing hypothesis 2 (from the list above) as the null one, one should take H1 from example 14, and not from example 20, unless there is a special justification for the normality of the distribution of observation results under the alternative hypothesis.
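The point about matching the method to the alternative can be sketched directly: the same sample and the same null hypothesis as in examples 14 and 20, but two different tests (the data are invented; the sample is normal with unit variance but shifted):

    # Sketch: one H0, two alternatives -> two different methods.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    sample = rng.normal(0.3, 1.0, 80)  # shifted but still normal

    # Alternative of example 20 (normal, unit variance, mean may differ):
    # Student's one-sample t-test is the adapted method.
    t_res = stats.ttest_1samp(sample, 0.0)

    # Alternative of example 14 (any non-standard-normal distribution):
    # a goodness-of-fit criterion such as Kolmogorov's is appropriate.
    ks_res = stats.kstest(sample, 'norm')

    print(f"t-test:  p = {t_res.pvalue:.4f}")
    print(f"KS test: p = {ks_res.pvalue:.4f}")

Under this shift alternative the t-test will typically give the smaller p-value, illustrating why a test adapted to the alternative is preferable when its assumptions are justified.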
