A survey of clients and consumers is not just a collection of information but a full-fledged study, and the purpose of any research is a scientifically grounded interpretation of the facts studied. The primary material must be processed, that is, ordered and analyzed. Once the respondents have been surveyed, the research data are analyzed. This is a key step: a set of techniques and methods aimed at checking how well the assumptions and hypotheses held up and at answering the questions posed. This stage is perhaps the most demanding in terms of intellectual effort and professional qualifications, but it is what extracts the most useful information from the collected data. Data analysis methods are diverse, and the choice of a specific method depends first of all on the questions we want answered. Two classes of analysis procedures can be distinguished:

  • univariate (descriptive) and
  • multivariate.

The purpose of univariate analysis is to describe one characteristic of the sample at a certain moment in time. Let us consider it in more detail.

Types of Univariate Data Analysis

Quantitative Research

Descriptive analysis

Descriptive statistics are the basic and most common method of data analysis. Imagine that you are conducting a survey in order to draw up a portrait of the product's consumer. Respondents indicate their gender, age, marital and professional status, consumer preferences, and so on, and descriptive statistics provide the information on which the whole portrait is built. In addition to numerical characteristics, a variety of graphs are created to help visualize the survey results. All of this secondary data is covered by the concept of "descriptive analysis". The numerical data obtained during the study are most often presented in final reports as frequency tables, which can show different types of frequencies. Consider an example, the potential demand for a product:

  1. The absolute frequency shows how many times a particular answer occurs in the sample. For example, 23 people would buy the proposed product at 5,000 rubles, 41 people at 4,500 rubles, and 56 people at 4,399 rubles.
  2. The relative frequency shows what proportion of the total sample size (here 120 respondents) this value represents (23 people: 19.2%, 41: 34.2%, 56: 46.7%).
  3. The cumulative frequency indicates the proportion of sample elements that do not exceed a certain value. For example, it shows how the percentage of respondents ready to purchase the product changes as its price falls (19.2% of respondents are ready to buy at 5,000 rubles, 53.3% at prices from 4,500 to 5,000 rubles, and 100% at prices from 4,399 to 5,000 rubles).
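The three kinds of frequency can be computed directly from the counts in the example. The following is a minimal Python sketch; the variable names are illustrative, and the cumulative figure is accumulated from the highest price down, as in the example.

```python
# Counts from the example: price in rubles -> respondents willing to buy.
absolute = {5000: 23, 4500: 41, 4399: 56}

total = sum(absolute.values())  # sample size

# Relative frequency: share of each answer in the whole sample.
relative = {price: count / total for price, count in absolute.items()}

# Cumulative frequency: running share as the price falls
# from the highest value to the lowest.
cumulative = {}
running = 0
for price in sorted(absolute, reverse=True):
    running += absolute[price]
    cumulative[price] = running / total

for price in sorted(absolute, reverse=True):
    print(f"{price} rub.: n={absolute[price]}, "
          f"share={relative[price]:.1%}, cumulative={cumulative[price]:.1%}")
```

Each percentage here is rounded independently from the exact ratio, so a printed value may differ by a tenth of a point from figures that were adjusted to sum to exactly 100%.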

Along with frequencies, descriptive analysis involves calculating various descriptive statistics. True to their name, they provide basic information about the data obtained. Which statistics can be used depends on the scale on which the source information is measured.

The nominal scale is used to record objects that have no ranked order (gender, place of residence, preferred brand, etc.). For data of this kind the only meaningful statistical indicator is the mode, the most frequent value of the variable.

The ordinal scale is somewhat better suited to analysis. Here it becomes possible to calculate, along with the mode, the median, the value that divides the sample into two equal parts. For example, if there are several price intervals for a product (500-700 rubles, 700-900 rubles, 900-1,100 rubles), the median identifies the price level above or below which consumers are willing to purchase or, conversely, refuse to purchase.

Quantitative scales offer the richest set of statistics. They are series of numerical values that are measurable and separated by equal intervals. Examples are income level, age, and time spent shopping. In this case the following measures become available: mean, range, standard deviation, and standard error of the mean.

Of course, the language of numbers is rather dry and hard for many readers to follow. For this reason descriptive analysis is complemented by data visualization: charts and graphs such as histograms, line charts, pie charts, and scatter plots.
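As a rough illustration of which statistics fit which scale, the sketch below uses Python's standard `statistics` module. All survey answers are invented for the example, and the ordinal price bands are assumed to be coded as ranks 1 to 3.

```python
import math
import statistics

# Hypothetical survey answers (not taken from any real study).
brands = ["A", "B", "A", "C", "A", "B"]   # nominal scale
price_bands = [1, 2, 2, 3, 1, 2, 3]       # ordinal scale, coded as ranks
ages = [23, 35, 41, 29, 35, 52, 47]       # quantitative scale

# Nominal data: only the mode is meaningful.
mode_brand = statistics.mode(brands)

# Ordinal data: the median becomes available as well.
median_band = statistics.median(price_bands)

# Quantitative data: mean, range, standard deviation, standard error.
mean_age = statistics.mean(ages)
age_range = max(ages) - min(ages)
sd_age = statistics.stdev(ages)            # sample standard deviation
se_age = sd_age / math.sqrt(len(ages))     # standard error of the mean

print(mode_brand, median_band, round(mean_age, 1), age_range,
      round(sd_age, 2), round(se_age, 2))
```

The standard error is computed here as the sample standard deviation divided by the square root of the sample size, which is the usual definition for a simple random sample.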

Contingency and correlation tables

A contingency table is a means of representing the joint distribution of two variables, designed to explore the relationship between them. Contingency tables (cross tables) can be regarded as a particular kind of descriptive analysis. Here too the information can be presented as absolute and relative frequencies and visualized with histograms or scatter plots. Contingency tables are most effective for determining whether a relationship exists between nominal variables (for example, between gender and whether a product is consumed). In general, a contingency table looks like this.

Relationship between gender and use of insurance services
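A contingency table of this kind can be built by counting pairs of answers. The sketch below is only an illustration: the respondents and their answers are invented, and the category labels are assumptions.

```python
from collections import Counter

# Hypothetical respondents as (gender, uses_insurance) pairs.
answers = [
    ("male", "yes"), ("male", "no"), ("female", "yes"),
    ("female", "yes"), ("male", "no"), ("female", "no"),
    ("male", "yes"), ("female", "yes"),
]

cells = Counter(answers)                      # absolute frequency per cell
rows = sorted({g for g, _ in answers})        # row categories
cols = sorted({u for _, u in answers})        # column categories

# Marginal totals for rows and columns.
row_totals = {g: sum(cells[(g, u)] for u in cols) for g in rows}
col_totals = {u: sum(cells[(g, u)] for g in rows) for u in cols}

# Print the table with totals.
print(f"{'':8}" + "".join(f"{u:>6}" for u in cols) + f"{'total':>7}")
for g in rows:
    line = "".join(f"{cells[(g, u)]:>6}" for u in cols)
    print(f"{g:8}" + line + f"{row_totals[g]:>7}")
print(f"{'total':8}" + "".join(f"{col_totals[u]:>6}" for u in cols)
      + f"{len(answers):>7}")
```

Dividing each cell by its row total, column total, or the grand total gives the corresponding relative frequencies.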

Statistics is a multidisciplinary science because it uses methods and principles borrowed from other disciplines. Knowledge from sociology and economic theory serves as the theoretical basis for the formation of statistical science; within these disciplines the laws of social phenomena are studied. Statistics helps to assess the scale of a phenomenon and to develop a system of methods for its analysis and study. Statistics is also undoubtedly related to mathematics: identifying patterns and evaluating and analyzing the object of study requires a number of mathematical operations, methods, and laws, and the results are systematized in the form of graphs and tables.

Types of statistical research

Observation, as the initial stage of a study, involves collecting the initial data on the issue under study. It is characteristic of many sciences. However, each science has its own specifics, and its observations differ accordingly. Therefore, not every observation is statistical.

Statistical research is a scientifically organized collection, summary and analysis of data (facts) on socio-economic, demographic and other phenomena and processes of public life in the state, with the registration of their most significant features in accounting documentation, scientifically organized according to a single program.

Distinctive features (specifics) of statistical research are: purposefulness, organization, mass character, consistency (complexity), comparability, documentation, controllability, practicality.

In general, a statistical study should:

  • have a socially useful goal and general (state) significance;
  • relate to the subject of statistics in the specific conditions of its place and time;
  • use statistical record-keeping (as opposed to bookkeeping or operational records);
  • be carried out according to a program developed in advance, with scientifically grounded methodological and other support;
  • collect mass data (facts) that reflect the whole set of cause-and-effect and other factors characterizing the phenomenon from many sides;
  • be recorded in accounting documents of an established form;
  • guarantee the absence of observational errors or reduce them to the minimum possible;
  • provide for quality criteria and ways of checking the collected data, ensuring their reliability, completeness, and meaningfulness;
  • focus on cost-effective technology for collecting and processing data;
  • be a reliable information base for all subsequent stages of statistical research and for all users of statistical information.

Studies that do not meet these requirements are not statistical. The following, for example, are not statistical studies: a mother watching her child at play (a personal matter); spectators watching a theatrical production (there is no accounting documentation for the spectacle); a researcher carrying out physical and chemical experiments, with their measurements, calculations, and documentary records (not mass public data); a doctor observing patients and keeping their medical cards (operational records); an accountant tracking the movement of funds in an enterprise's bank account (bookkeeping); journalists following the public and private lives of government officials or other celebrities (not the subject of statistics).

A statistical population is a set of units characterized by mass character, typicality, qualitative homogeneity, and the presence of variation.

The statistical population consists of materially existing objects (employees, enterprises, countries, regions) and is the object of statistical research.

Statistical observation is the first stage of statistical research, which is a scientifically organized collection of data on the studied phenomena and processes of social life.

A statistical table is usually defined as a form of compact, visual presentation of statistical data.

The analysis of tables makes it possible to solve many problems in the study of changes in phenomena over time, the structure of phenomena and their interrelations. Thus, statistical tables play the role of a universal means of rational representation, generalization and analysis of statistical information.

Outwardly, a statistical table is a system of specially constructed horizontal rows and vertical columns with a common title and with row and column headings; the statistical data are recorded at the intersections of rows and columns.

Each figure in the statistical tables is a specific indicator that characterizes the size or levels, dynamics, structure or relationships of phenomena in specific conditions of place and time, that is, a certain quantitative and qualitative characteristic of the phenomenon under study.

If the table is not filled with numbers, that is, it has only a general heading, column and row headings, then we have a layout of a statistical table. It is with its development that the process of compiling statistical tables begins.

The main elements of a statistical table are its subject and its predicate.

The subject of the table is the object of statistical study, that is, individual units of the population, their groups, or the population as a whole.

The predicate of the table consists of the statistical indicators that characterize the object under study.

The subject and the indicators of the predicate must be defined very precisely. As a rule, the subject is placed on the left side of the table and forms the content of the rows, while the predicate is placed on the right side and forms the content of the columns.

Usually, when arranging the indicators of the predicate in the table, the following rule is followed: first, absolute indicators are given that characterize the volume of the population under study, then - calculated relative indicators that reflect the structure, dynamics and relationships between the indicators.

Building analytical tables

Analytical tables are constructed as follows. Any table consists of a subject and a predicate. The subject names the economic phenomenon the table deals with and contains the set of indicators that reflect it. The predicate explains by which features the subject is characterized.

Some tables reflect changes in the structure of an economic phenomenon. Such tables contain information on the composition of the analyzed phenomenon in both the base period and the reporting period. From these data the share of each part in the total is determined, and the deviation of each part's share from its base-period share is calculated.
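The calculation behind such a structure table can be sketched in a few lines of Python. The item names and amounts below are invented purely for illustration.

```python
# Hypothetical composition of a total (for example, costs by item)
# in the base period and in the reporting period.
base = {"materials": 600, "wages": 300, "other": 100}
reporting = {"materials": 660, "wages": 385, "other": 55}


def shares(parts):
    """Return each part's share of the total, in percent."""
    total = sum(parts.values())
    return {name: 100 * value / total for name, value in parts.items()}


base_share = shares(base)
reporting_share = shares(reporting)

# Deviation of each share from its base-period value, in percentage points.
deviation = {name: reporting_share[name] - base_share[name] for name in base}

for name in base:
    print(f"{name:10} {base_share[name]:6.1f} "
          f"{reporting_share[name]:6.1f} {deviation[name]:+6.1f}")
```

Because both sets of shares sum to 100%, the deviations always sum to zero, which is a convenient check on the table.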

Other tables reflect the relationship between economic indicators according to some attribute. In such tables the information on a given economic indicator is arranged in ascending or descending order of its numerical values.

In the economic analysis, tables are also compiled that reflect the results of determining the influence of individual factors on the value of the analyzed generalizing (effective) indicator. When designing such tables, first they place information about the factors affecting the generalizing indicator, then information about the generalizing indicator itself, and finally about the change in this indicator in the aggregate, as well as due to the impact of each analyzed factor. Separate analytical tables reflect the results of calculating the reserves for improving economic indicators, identified as a result of the analysis. Such tables show both the actual and theoretically possible size of the influence of individual factors, as well as the possible value of the reserve for the growth of the general indicator due to the influence of each individual factor.

Finally, in the analysis of economic activity, tables are also compiled that are designed to summarize the results of the analysis.

The practice of statistics has developed the following rules for compiling tables:
  • The table should be expressive and compact. Instead of one cumbersome table covering many features, it is better to make several small, clear tables that serve the purpose of the study.
  • The title of the table and the headings of its columns and rows should be formulated accurately and concisely.
  • The table must indicate the object under study, the territory and the time to which the data refer, and the units of measurement.
  • If some data are missing, either put an ellipsis in the table or write "no information"; if a phenomenon did not occur, put a dash.
  • The values of the same indicator are given in the table with the same degree of accuracy.
  • The table should have totals for groups, for subgroups, and overall. If the data cannot be summed, the multiplication sign "*" is put in that column.
  • In large tables, a gap is left after every five rows to make the table easier to read and analyze.

Types of statistical tables

Among the methods of analysis, the tabular method of displaying the numerical data under study is the most common. The initial data for the analysis, the various calculations, and the results of the study are all drawn up in the form of analytical tables. Tables are a very useful and clear way of displaying the numerical information used in analysis. In analytical tables, numerical information about the economic phenomena under study is arranged in a definite order. Tabular material is much more informative and clear than a textual presentation of the same material, and tables make it possible to present analytical material as a single integrated system.

The type of a statistical table is determined by how its subject is developed.

There are three types of statistical tables:
  • simple
  • group
  • combination

Simple tables contain a list of the individual units that make up the economic phenomenon being analyzed. In group tables, the numerical information on the constituent parts of the data set is combined into groups according to some attribute. Combination tables contain groups and subgroups into which the indicators characterizing the economic phenomenon are divided, and the division is made on several attributes rather than one. In other words, group tables use a simple grouping of indicators and combination tables a combined grouping, while simple tables contain no grouping at all, only an ungrouped set of information about the phenomenon.

Simple tables

In a simple table the subject is a list of population units, time periods, or territories.

Group tables

In a group table the subject groups the population units according to one attribute.

Combination tables

In a combination table the subject groups the population units according to two or more attributes.
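The difference between the three kinds of subject can be sketched in Python. The student records below are invented for the example, and the attribute names (branch, gender) are assumptions chosen to match the kind of data discussed in this section.

```python
from collections import Counter

# Hypothetical population units: (name, branch, gender).
students = [
    ("Ann", "day", "female"), ("Boris", "day", "male"),
    ("Clara", "evening", "female"), ("Dmitri", "evening", "male"),
    ("Elena", "day", "female"),
]

# Simple table: the subject is just the list of units, with no grouping.
simple = [name for name, _, _ in students]

# Group table: units grouped by one attribute (branch).
group = Counter(branch for _, branch, _ in students)

# Combination table: units grouped by two attributes at once.
combination = Counter((branch, gender) for _, branch, gender in students)

print(simple)
print(dict(group))
print(dict(combination))
```

Whatever the grouping, the group counts must add up to the number of units in the simple list.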

By the way the predicate indicators are developed, tables are divided into:

  • tables with a simple development of the predicate indicators, in which the indicators are arranged in parallel;
  • tables with a complex development of the predicate indicators, in which the indicators are combined: within groups formed by one attribute, subgroups are distinguished by another attribute.

Table with a simple development of indicators of the predicate

The predicate of this table contains data first on the distribution of students by gender and then by age, i.e. the two attributes are characterized separately.

Table with complex development of predicate indicators

[Table not preserved. Recoverable headings: Branches; Number of students, persons; Including; of them aged, years (last age group "23 and over"). Recoverable row label: Evening.]

The predicate of this table not only characterizes the distribution of students by each of the two attributes, but also makes it possible to study the composition of each group formed by one attribute (gender) in terms of the other (age); that is, the two attributes are combined.

Tables with a complex development of predicate indicators therefore provide more opportunities for analyzing the indicators and the relationships between them. A table of any kind (simple, group, or combination) can have either a simple or a complex development of its predicate indicators.

Depending on the stage of the statistical study, the tables are divided into:
  • working (auxiliary) tables, whose purpose is to summarize information on individual units of the population in order to obtain totals;
  • summary tables, whose task is to show the results for groups and for the population as a whole;
  • analytical tables, whose task is to calculate generalizing characteristics and prepare the information base for analyzing structure and structural shifts, the dynamics of the phenomena under study, and the relationships between indicators.

So, we have considered the tabular method of displaying numerical data, which is widely used in analyzing economic phenomena, statistical data, and the economic activities of organizations.

Statistical data must be adequate, firstly, to the object of study, and secondly, to the time in which they are collected and used.

This chapter describes the sources of statistical data, their types and methods of obtaining, as well as methods for describing and presenting numerical and non-numerical data.

After studying this chapter, you should be able to:

  • build a program of statistical research;
  • identify sources of statistical information;
  • summarize and group statistical data and compile statistical tables;
  • represent the results of grouping in the form of diagrams;
  • evaluate the main characteristics: relative value, mean, variance, standard deviation, median, mode, range.

Getting initial data

Obtaining information about the object of study is one of the main tasks of statistical research.

Statistical research should be guided by its objectives and by the requirements for its results. These determine the methods of statistical analysis, and the collection of initial data is organized on that basis. In the course of statistical research, one should guard against the following errors: goals that are not clearly formulated and observation methods that are applied incorrectly.

Obtaining initial data for a statistical study can be done in two ways:

  • an active experiment, specially organized to determine statistical dependencies;
  • statistical observation.

An active experiment is used in feasibility studies, when, for example, the task is to optimize the modes of technological processes according to economic criteria.

In a statistical study of socio-economic processes, observation is usually the only option. The basis of this method of obtaining information is the observation program, which consists of three main steps:

  • definition of the object of study;
  • selection of the population unit;
  • determination of the system of indicators to be recorded.

The object of observation is the set of units of the phenomenon under study about which statistical information can be collected. To define the object of observation clearly, the following questions should be answered:

  • what? (which elements will be studied);
  • where? (where the observation will be conducted);
  • when? (for what period).

From the point of view of the organization of statistical observation, two main forms are distinguished: reporting and specially organized statistical observation.

Reporting as a form of observation is characterized by the fact that the statistical authorities systematically receive from enterprises, institutions and organizations in a timely manner information on the conditions and results of work for the past period, the volume and content of which are determined by the approved reporting forms.

Specially organized statistical observation is the collection of information through censuses, one-time counts, and surveys. It is organized to study phenomena that cannot be covered by mandatory reporting.

Types of statistical observation are distinguished by the time at which data are recorded and by the degree to which the units of the population are covered. By the timing of data recording, observation can be:

  • continuous (for example, accounting for manufactured products);
  • periodic (for example, accounting statements);
  • one-time, carried out when information is needed (for example, a population census).

By the degree of coverage of the units of the population, observation can be:

  • non-continuous (sample), when only some part of the population is examined rather than all of it;
  • complete, i.e. covering all units of the population;
  • monographic, when typical objects are described in detail.
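The difference between covering every unit and examining only a part can be sketched as follows. This assumes a hypothetical population of 100 numbered units and simple random selection, which is only one of several possible ways of choosing which units to examine.

```python
import random

# Hypothetical population: 100 units identified by number.
population = list(range(1, 101))

# Complete observation: every unit is examined.
complete = list(population)

# Sample observation: only part of the population is examined.
# A fixed seed is used so the selection can be reproduced.
rng = random.Random(42)
sample = rng.sample(population, k=10)

coverage = len(sample) / len(population)
print(f"complete: {len(complete)} units, sample: {len(sample)} units "
      f"({coverage:.0%} coverage)")
```

`random.Random.sample` selects without replacement, so no unit appears in the sample twice.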

The main methods of obtaining statistical information are direct observation, the documentary method, and the survey.

In direct observation, representatives of state statistics bodies or other organizations record data in statistical documents after personally inspecting, counting, measuring, or weighing the units of observation.

In the documentary method of observation, various documents serve as the source. This method is used when enterprises and institutions prepare statistical reports from primary accounting documents.

In a survey, the source of information is the answers of the respondents. A survey can be organized in different ways: the expeditionary method, self-registration, the correspondent method, and the questionnaire method.

In the expeditionary method, representatives of statistical bodies question the person being surveyed and record the information in the observation forms from his or her words.

In the self-registration method, the surveyed units (enterprises or citizens) are given a survey form and instructions on how to fill it out. The completed forms are returned by mail within the specified time.

In the correspondent method, information is reported to statistical bodies by voluntary correspondents.

The questionnaire method of data collection is based on the principle of voluntary completion of questionnaires by addressees.

The term "statistics" comes from the Latin word "status", which means position, state, or order of phenomena.

The development of political arithmetic (England) and state studies (Germany) led to the emergence of the science of statistics.

The term "statistics" was introduced into scientific use in the 18th century at the University of Göttingen by Gottfried Achenwall (1719-1772).

Currently, there are about 150 definitions of statistics as a scientific discipline. One of the best definitions of statistics was given by the Austrian mathematician Abraham Wald: "Statistics is a set of methods that enable us to make optimal decisions under uncertainty."

Of the various definitions of statistics, the one most applicable to practical medicine is:

"Statistics is the science of collecting, classifying, and quantifying data in order to make valid inferences, predictions, and decisions."

Statistics studies random mass phenomena. Mass phenomena are phenomena that occur in large numbers but differ from one another in the magnitude of some feature. The larger the number of objects taken for study, the more reliable the statistical conclusions.

Statistics consists of theoretical (general) statistics and applied (economic, social, branch) statistics.

Branch statistics include meteorological (weather forecasting), transport, economic, biological, and medical statistics.

Theoretical statistics is divided into descriptive and analytical (inductive) statistics.

Descriptive statistics is a set of methods for collecting, grouping, and classifying source data and presenting them in a form convenient for further processing (tables, graphs).

Analytical statistics is the statistics of inferences and predictions based on mathematical processing of the results provided by descriptive statistics. It includes methods for drawing various statistical conclusions with a view to their practical application.

Medical statistics is a branch statistics: a set of applied statistical methods used in scientific and practical medicine and in health care.

The main tasks of medical statistics:

  • birth and death statistics;
  • incidence statistics;
  • statistics on the activities of health care institutions.

Together, descriptive and analytical statistics solve the following tasks:

  • collecting data and describing them in a form convenient for statistical processing;
  • processing the results by the methods of theoretical (general) statistics;
  • analyzing the results obtained, forecasting, and developing optimal solutions.

2. BASIC CONCEPTS OF DESCRIPTIVE STATISTICS AND THEIR CHARACTERISTICS.

The main concepts of descriptive statistics include:

  • statistical population (general and sample);
  • the size of the population;
  • statistical variant;
  • statistical feature;
  • absolute frequency;
  • relative frequency.

A statistical population is a set of objects united by some feature for statistical study.

Types of population:

  1. General population (finite or infinite).
  2. Sample population (sample).

The general population is the set of all objects selected for the study.

A finite population is a statistical set in which the number of studied objects with a given feature is limited.

Example: the number of students in the academy, residents in the city, the number of measurements in experiments.

An infinite population is a statistical set in which the number of objects is infinite. It is used in theoretical calculations as a mathematical abstraction.

A sample population (sample) is the part of the general population taken for statistical study.

The population size is the number of objects in the population.

The size of the general population is denoted by the symbol $N$, and the size of the sample by $n$.

A statistical variant is an object of the population, a single observation or measurement.

Variants are denoted by the Latin letters $x$, $y$, $z$ with subscripts indicating the number of the variant.

Example: $x_1$ is object or measurement number one,

$x_2$ is object or measurement number two, etc.

A variant without a specific number is called a generalized variant and is denoted by a Latin letter with a letter subscript, for example, $x_i$.

Variants (objects) of the statistical population are characterized by various features, including those on the basis of which they are combined into a population.

A feature whose value changes from one object to another is called a variable feature, and the phenomenon itself is called variation.

Qualitative features are features that have no quantitative expression; they cannot be measured.

Example: color, taste, smell.

Quantitative features are measurable features expressed by a number.

Example: weight, length, density, temperature.

Discrete quantitative features are quantitative features expressed as whole numbers.

Example: number of students in a group, passengers on a bus, petals on a flower.

Continuous quantitative features are quantitative features that can take both whole and fractional values.

Example: the weight of a watermelon is 7 kg, the weight of a melon is 1.7 kg.

An interval feature is a quantitative feature whose numerical value lies within certain boundaries, called intervals.

Example: when measuring the height of students, the interval groups 160-169 cm, 170-179 cm, and 180-190 cm can be distinguished.
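Assigning measurements to such interval groups can be sketched in Python as follows. The interval bounds come from the example, while the list of heights is invented for illustration and the bounds are assumed to be inclusive.

```python
# Interval groups from the example, as inclusive (low, high) bounds in cm.
intervals = [(160, 169), (170, 179), (180, 190)]

# Hypothetical measured heights in cm.
heights = [175, 170, 168, 173, 179, 182, 165]

# Count how many measurements fall into each interval.
counts = {bounds: 0 for bounds in intervals}
for h in heights:
    for low, high in intervals:
        if low <= h <= high:
            counts[(low, high)] += 1
            break

for (low, high), n in counts.items():
    print(f"{low}-{high} cm: {n}")
```

With whole-centimeter measurements these intervals do not overlap; for fractional measurements the boundaries would need to be defined so that a value such as 169.5 cm belongs to exactly one group.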

The frequency of occurrence (absolute frequency) is a number showing how many times an object with a given numerical value of the feature occurs in the population or in an interval of it.

The absolute frequency is denoted by the symbol $n_i$ (or $\mu_i$).

The sum of all absolute frequencies is equal to the size $N$ of the population for which the frequencies are calculated: $\sum n_i = N$

Example: the numbers of male and female students in a group must add up to the total number of students in that group.

The relative frequency is a number equal to the ratio of the absolute frequency to the size of the population.

The relative frequency is denoted by the symbol $f_i$ and calculated by the formula:

as a fraction of one: $f_i = \dfrac{n_i}{N}$,

in percent: $f_i = \dfrac{n_i}{N} \cdot 100\%$

Here $n_i$ is the absolute frequency and $N$ is the size of the population, equal to the sum of all absolute frequencies.

The sum of all relative frequencies is equal to 1: $\sum f_i = 1$

Example: in a student group of fifteen people (population size $N = 15$) there are 12 female students (absolute frequency $n_1 = 12$) and 3 male students (absolute frequency $n_2 = 3$). The relative frequency $f_1$ is 12/15 and $f_2$ is 3/15. The relative frequencies sum to one.
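The arithmetic of this example can be checked with exact fractions. The sketch below uses Python's standard `fractions` module; the dictionary keys are illustrative labels for the two groups in the example.

```python
from fractions import Fraction

# Absolute frequencies from the example: 12 and 3 students.
counts = {"female": 12, "male": 3}

# Population size is the sum of all absolute frequencies.
N = sum(counts.values())

# Relative frequency of each group as an exact fraction of N.
freq = {group: Fraction(n, N) for group, n in counts.items()}

for group, f in freq.items():
    print(f"{group}: {f} = {float(f):.0%}")

print("sum of relative frequencies:", sum(freq.values()))
```

`Fraction` reduces 12/15 and 3/15 to lowest terms automatically, and using exact fractions avoids any rounding when checking that the relative frequencies sum to one.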

In statistics, relative frequencies are also called weights.

3. SERIES OF DISTRIBUTION, THEIR TYPES AND METHODS OF REPRESENTATION.

A distribution series is a sequence of numbers indicating the qualitative or quantitative values of a feature and the frequency of their occurrence.

Distribution series are classified according to different principles.

By degree of ordering, series are divided into:

  • unordered
  • ordered

An unordered series is a series in which the values of the feature are recorded in the order in which the variants were obtained during the study.

Example: in a study of the height of a group of students, the values were recorded in cm (175, 170, 168, 173, 179).

An ordered series is a series obtained from an unordered one by rewriting the feature values in ascending or descending order. An ordered series is also called a ranked series, and the ordering procedure is called ranking (sorting).

Example: (height 168, 170, 173, 175, 179)
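Turning an unordered series into a ranked one is a plain sorting operation. A minimal Python sketch using the heights from the example:

```python
# Unordered series: heights in cm in the order they were recorded.
unordered = [175, 170, 168, 173, 179]

# Ranked series in ascending and in descending order.
ranked_ascending = sorted(unordered)
ranked_descending = sorted(unordered, reverse=True)

print(ranked_ascending)
print(ranked_descending)
```

`sorted` returns a new list and leaves the original series unchanged, so the order of recording is not lost.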

By type of feature, distribution series are divided into:

  • attribute
  • variation

An attribute series is a series compiled on the basis of a qualitative feature.

A variation series is a series compiled on the basis of a quantitative feature.

Variation series are divided into discrete, continuous, and interval series.

Discrete, continuous, and interval variation series are named after the corresponding type of feature on which the series is based. For example, a series by shoe size is discrete, and a series by body weight is continuous.

Methods for representing series in practical and scientific medicine are divided into three groups:

  1. Tabular representation;
  2. Analytical representation (in the form of a formula);
  3. Graphical representation.