"Big Data"- a topic that is actively discussed by technology companies. Some of them have become disillusioned with big data, while others, on the contrary, use it to the maximum for business... . We hope the information will be interesting and useful.

WHAT IS BIG DATA?

Key Features
Big Data is currently one of the key drivers of information technology development. This field, relatively new for Russian business, is already widespread in Western countries. In the information technology era, and especially after the boom of social networks, a significant amount of information began to accumulate for each Internet user, which ultimately gave rise to the field of Big Data.

The term "Big Data" causes a lot of controversy. Many believe that it refers only to the amount of accumulated information, but the technical side should not be forgotten: this field also includes storage technologies, computing, and services.

It should be noted that this area includes the processing of a large amount of information, which is difficult to process using traditional methods*.

Below is a table comparing a traditional database with a Big Data database.

The sphere of Big Data is characterized by the following features:
Volume - the accumulated database is so large that it is laborious to process and store in traditional ways; it requires a new approach and advanced tools.
Velocity - speed. This feature refers both to the increasing speed of data accumulation (90% of information was collected over the past 2 years) and to the speed of data processing; recently, real-time data processing technologies have become more in demand.
Variety - the ability to process structured and unstructured information of different formats simultaneously. The distinguishing feature of structured information is that it can be classified. An example of such information is information about client transactions.
Unstructured information includes video, audio files, free text, and information coming from social networks. To date, 80% of information is unstructured. This information needs complex analysis to make it useful for further processing.
Veracity - the reliability of data. Users have begun to attach importance to the reliability of the available data. For example, Internet companies have difficulty separating actions performed on the company's website by robots from those performed by people, which ultimately complicates data analysis.
Value - the value of the accumulated information. Big Data should be useful to the company and bring some value to it, for example by helping to improve business processes, reporting or cost optimization.

If the above 5 conditions are met, the accumulated volumes of data can be classified as large.

Applications of Big Data

The scope of Big Data technologies is extensive. With the help of Big Data, you can learn about customer preferences, measure the effectiveness of marketing campaigns, or conduct risk analysis. Below are the results of an IBM Institute survey on the areas in which companies use Big Data.

As can be seen from the diagram, most companies use Big Data in customer service. The second most popular area is operational efficiency, while in risk management Big Data is less common at the moment.

It should also be noted that Big Data is one of the fastest growing areas of information technology: according to statistics, the total amount of received and stored data doubles every 1.2 years.
Between 2012 and 2014, the amount of data transmitted monthly over mobile networks increased by 81%. Cisco estimates that in 2014 the volume of mobile traffic amounted to 2.5 exabytes (a unit of information equal to 10^18 bytes) per month, and that in 2019 it will reach 24.3 exabytes.
Thus, despite its relatively young age, Big Data is already an established area of technology that has become widespread in many areas of business and plays an important role in the development of companies.
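
As a rough check of the growth figures quoted above, the sketch below converts a doubling time into an annual growth rate and computes the compound annual growth rate implied by the Cisco traffic estimates. The function names are illustrative, and steady compound growth is an assumption of the sketch, not something the cited sources state.

```python
def annual_growth_from_doubling(doubling_years: float) -> float:
    """Annual growth rate implied by a fixed doubling time (assumes steady compound growth)."""
    return 2 ** (1 / doubling_years) - 1


def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between a start value and an end value."""
    return (end / start) ** (1 / years) - 1


# Data volume doubling every 1.2 years (figure quoted in the text)
data_growth = annual_growth_from_doubling(1.2)

# Mobile traffic: 2.5 EB/month in 2014, 24.3 EB/month forecast for 2019 (Cisco, as quoted)
traffic_growth = cagr(2.5, 24.3, 2019 - 2014)

print(f"Implied annual data growth: {data_growth:.0%}")
print(f"Implied annual mobile traffic growth: {traffic_growth:.0%}")
```

Under these assumptions both figures describe growth well above 50% per year, which is consistent with the description of Big Data as a fast-growing area.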

Big Data Technologies
Technologies used to collect and process Big Data can be divided into 3 groups:
  • Software;
  • Equipment;
  • Service.

The most common data processing approaches include:
SQL - a structured query language for working with databases. Using SQL, you can create and modify data, while the data itself is managed by the corresponding database management system.
NoSQL - the term stands for Not Only SQL. It covers a number of approaches to implementing databases that differ from the models used in traditional relational DBMSs. They are convenient when the data structure changes constantly, for example, for collecting and storing information in social networks.
MapReduce - a distributed computing model used for parallel computing over very large data sets (petabytes* or more). In this programming model, the data is not transferred to the program for processing; instead, the program is transferred to the data, so each query is a separate program. Data is processed sequentially by two methods, Map and Reduce: Map performs the preliminary selection of data, and Reduce aggregates it.
Hadoop - used to implement search and contextual mechanisms for high-load sites such as Facebook, eBay and Amazon. Its distinctive feature is that the system is protected against the failure of any node in the cluster, since each block of data has at least one copy on another node.
SAP HANA - a high-performance NewSQL platform for data storage and processing. It provides high-speed query processing. Another differentiator is that SAP HANA simplifies the system landscape, reducing the cost of supporting analytical systems.
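
The Map and Reduce steps described above can be illustrated with a minimal single-process sketch that counts words. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample input are assumptions made for illustration, and a real cluster would run the Map step on the nodes that store each split of the data.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: aggregate all values emitted for one key."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster each split would be mapped on the node that stores it;
    # here the splits are processed one after another in a single process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


splits = ["big data big value", "data velocity"]  # hypothetical input splits
counts = word_count(splits)
print(counts)
```

Because each call to `map_phase` depends only on its own split, the Map step can be run in parallel on many nodes, which is what makes the model suitable for very large data sets.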

Technological equipment includes:

  • servers;
  • infrastructure equipment.
Servers include data storage systems.
Infrastructure equipment includes platform acceleration tools, uninterruptible power supplies, server console sets, etc.

Service.
Services include database system architecture, infrastructure development and optimization, and data storage security.

Software, hardware, and services combine to form end-to-end platforms for data storage and analysis. Companies such as Microsoft, HP, EMC offer services for the development, deployment and management of Big Data solutions.

Application in industries
Big Data has become widespread in many business sectors. It is used in healthcare, telecommunications, trade, logistics and financial companies, as well as in public administration.
Below are examples of Big Data applications in some of these industries.

Retail
The databases of retail stores can accumulate a lot of information about customers, inventory management, and the supply of goods. This information can be useful in all areas of a store's activity.

With the help of the accumulated information, a retailer can manage the supply, storage and sale of goods, and can predict demand and supply. A data processing and analysis system can also solve other problems of the retailer, for example, optimizing costs or preparing reports.

Financial services
Big Data makes it possible to analyze a borrower's creditworthiness and is also useful for credit scoring* and underwriting**. The introduction of Big Data technologies will reduce the time needed to consider loan applications. With the help of Big Data, a bank can analyze the operations of a particular client and offer banking services that suit them.

Telecom
In the telecommunications industry, Big Data is widely used by mobile operators.
Mobile operators, along with financial institutions, have some of the largest databases, which allows them to analyze the accumulated information in great depth.
The main goal of data analysis is to retain existing customers and attract new ones. To do this, companies segment customers, analyze their traffic, and determine the social affiliation of each subscriber.

In addition to marketing purposes, the technology is used to prevent fraudulent financial transactions.

Mining and oil industry
Big Data is used both in the extraction of minerals, and in their processing and marketing. Based on the information received, enterprises can draw conclusions about the efficiency of field development, track the overhaul schedule and equipment condition, and forecast demand for products and prices.

According to a Tech Pro Research survey, Big Data is most widespread in the telecommunications industry, as well as in engineering, IT, financial and government enterprises. According to the results of this survey, Big Data is less popular in education and healthcare. The survey results are presented below:

Examples of using Big Data in companies
Today, Big Data is being actively implemented in foreign companies. Companies such as Nasdaq, Facebook, Google, IBM, VISA, MasterCard, Bank of America, HSBC, AT&T, Coca-Cola, Starbucks and Netflix are already using Big Data resources.

The areas of application of the processed information are diverse and vary depending on the industry and the tasks to be performed.
Examples of the practical application of Big Data technologies are presented below.

HSBC uses Big Data technologies to counter fraudulent transactions with plastic cards. With the help of Big Data, the company increased the efficiency of the security service by 3 times, and the recognition of fraudulent incidents by 10 times. The economic effect from the introduction of these technologies exceeded 10 million US dollars.

VISA's antifraud* system automatically identifies fraudulent transactions; at present it helps prevent $2 billion worth of fraudulent payments annually.

IBM's Watson supercomputer analyzes the flow of data on monetary transactions in real time. According to IBM, Watson increased the number of identified fraudulent transactions by 15%, reduced false positives by 50% and increased the amount of funds protected from transactions of this nature by 60%.

Procter & Gamble uses Big Data to design new products and create global marketing campaigns. P&G has created dedicated Business Spheres offices where information can be viewed in real time.
Thus, the company's management has the opportunity to instantly test hypotheses and conduct experiments. P&G believes that Big Data helps in predicting the company's performance.

The office supplies retailer OfficeMax uses Big Data technologies to analyze customer behavior. Big Data analysis made it possible to increase B2B revenue by 13% and reduce costs by $400,000 per year.

According to Caterpillar, its distributors miss out on $9 billion to $18 billion in revenue annually simply because they do not implement Big Data technology. Big Data would allow customers to manage their fleets more efficiently by analyzing information from sensors installed on machines.

Today it is already possible to analyze the condition of key components and their degree of wear, and to manage fuel and maintenance costs.

Luxottica Group is a manufacturer of sports eyewear with brands such as Ray-Ban, Persol and Oakley. The company uses Big Data technologies to analyze the behavior of potential customers and for "smart" SMS marketing. As a result, Luxottica Group identified more than 100 million of its most valuable customers and increased the effectiveness of its marketing campaigns by 10%.

With the help of Yandex Data Factory, the developers of the game World of Tanks analyze the behavior of players. Big Data technologies made it possible to analyze the behavior of 100 thousand World of Tanks players using more than 100 parameters (information about purchases, games, experience, etc.). The analysis produced a forecast of user churn. This information makes it possible to reduce churn and to work with players in a targeted manner. The developed model turned out to be 20-30% more efficient than standard gaming industry analysis tools.

The German Ministry of Labor uses Big Data to analyze incoming unemployment claims. After analyzing the information, it became clear that 20% of benefits were paid undeservedly. With the help of Big Data, the Ministry of Labor has reduced costs by 10 billion euros.

Toronto Children's Hospital implemented Project Artemis, an information system that collects and analyzes data on babies in real time. The system monitors 1,260 indicators of each child's condition every second. Project Artemis makes it possible to predict an unstable condition in a child and to begin preventive treatment early.

OVERVIEW OF THE GLOBAL BIG DATA MARKET

The current state of the global market
In 2014, according to Data Collective, Big Data became one of the priority areas for venture investment. According to the information portal Computerra, this is because developments in this area have begun to bring significant results for their users. Over the past year, the number of companies with implemented big data management projects increased by 125%, and the market volume grew by 45% compared to 2013.

According to Wikibon, services made up the largest part of Big Data market revenue in 2014, with a share of 40% of total revenue (see the diagram below):

Broken down by subtype, the Big Data market in 2014 looked like this:

According to Wikibon, apps and analytics accounted for 36% of Big Data revenue in 2014, computing hardware for 17%, and storage technology for 15%. The least revenue was generated by NoSQL technologies, infrastructure equipment and corporate networking.

The most popular Big Data technologies are in-memory platforms such as SAP HANA, Oracle, etc. The results of the T-Systems survey showed that they were chosen by 30% of the surveyed companies. NoSQL platforms were the second most popular (18% of users), and analytical platforms from Splunk and Dell were chosen by 15% of companies. According to the survey, Hadoop/MapReduce products were the least popular for solving Big Data problems.

According to an Accenture survey, in more than 50% of companies using Big Data technologies, Big Data costs range from 21% to 30%.
According to a further Accenture analysis, 76% of companies believe that these costs will increase in 2015, and 24% of companies will not change their budget for Big Data technologies. This suggests that in these companies Big Data has already become an established area of IT and an integral part of the company's development.

The results of the Economist Intelligence Unit survey confirm the positive impact of Big Data implementation. 46% of companies claim that they have improved customer service by more than 10% using Big Data technologies, 33% of companies have optimized inventory and improved the productivity of key assets, 32% of companies have improved planning processes.

Big Data in different countries of the world
To date, Big Data technologies are most often implemented in US companies, but other countries have now begun to show interest as well. In 2014, according to IDC, the countries of Europe, the Middle East, Asia (excluding Japan) and Africa accounted for 45% of the Big Data software, services and equipment market.

Also, according to the CIO survey, companies from the Asia-Pacific region are rapidly adopting new solutions in Big Data analysis, secure storage and cloud technologies. Latin America is in second place in terms of investment in the development of Big Data technologies, ahead of Europe and the USA.
The following sections describe the Big Data market in several countries and give forecasts of its development.

China
The amount of information in China is 909 exabytes, which is 10% of the total amount of information in the world. By 2020 the amount of information will reach 8060 exabytes, and China's share of the global total will also increase, reaching 18% in 5 years. China's Big Data volume is among the fastest growing.

Brazil
By the end of 2014, Brazil had accumulated 212 exabytes of information, which is 3% of the global volume. By 2020, the volume of information will grow to 1600 exabytes, which will be 4% of the world's information.

India
According to EMC, the amount of accumulated data in India in 2014 was 326 exabytes, which is 5% of the total amount of information. By 2020, the volume of information will grow to 2800 exabytes, which will be 6% of the world's information.

Japan
The amount of accumulated data in Japan at the end of 2014 was 495 exabytes, which is 8% of the total amount of information. By 2020, the volume of information will grow to 2200 exabytes, but Japan's share will decrease to 5% of the total amount of information in the world.
Thus, although the absolute volume grows, Japan's share of the world's information will decrease by more than 30% (from 8% to 5%).

Germany
According to EMC, the amount of accumulated data in Germany in 2014 was 230 exabytes, which is 4% of the total amount of information in the world. By 2020, the volume of information will grow to 1100 exabytes, which will be 2% of the world's information.
In the German market, according to Experton Group forecasts, a large share of revenue will be generated by the services segment, whose share will be 54% in 2015 and will increase to 59% in 2019; the share of software and equipment, on the contrary, will decrease.

Overall, the market size will grow from 1.345 billion euros in 2015 to 3.198 billion euros in 2019, with an average growth rate of 24% per year.
Thus, based on the CIO and EMC analytics, we can conclude that the developing countries of the world will become the markets where Big Data technologies develop most actively in the coming years.
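
The country figures quoted above can be compared on a common basis by computing the growth multiple and the implied average annual growth rate for each country. The sketch below does this with the 2014 and 2020 volumes as given in the text; the dictionary layout and the `cagr` helper are illustrative, and steady compound growth between the two dates is an assumption.

```python
# (2014 volume in exabytes, 2020 forecast in exabytes), as quoted in the text
volumes = {
    "China": (909, 8060),
    "Brazil": (212, 1600),
    "India": (326, 2800),
    "Japan": (495, 2200),
    "Germany": (230, 1100),
}


def cagr(start, end, years):
    """Compound annual growth rate between a start value and an end value."""
    return (end / start) ** (1 / years) - 1


YEARS = 2020 - 2014
growth = {country: cagr(start, end, YEARS) for country, (start, end) in volumes.items()}

# Print countries from fastest to slowest implied growth
for country, rate in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    start, end = volumes[country]
    print(f"{country:8s} x{end / start:4.1f} overall, {rate:.0%} per year")

# German market in euros: 1.345 billion (2015) to 3.198 billion (2019)
german_market_growth = cagr(1.345, 3.198, 2019 - 2015)
print(f"German market: {german_market_growth:.0%} per year")
```

With these inputs China, India and Brazil show larger growth multiples than Germany and Japan, which is the pattern behind the conclusion about developing countries, and the German market figure comes out close to the 24% quoted in the text.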

Main Market Trends
According to IDG Enterprise, in 2015 companies will spend an average of $7.4 million each on Big Data; large companies intend to spend approximately $13.8 million, and small and medium-sized companies $1.6 million.
Most of the investment will go to areas such as data analysis, visualization and data collection.
In line with current trends and market demand, investments in 2015 will be used to improve data quality, improve planning and forecasting, and increase data processing speed.
According to Bain & Company's Insights Analysis, companies in the financial sector will make significant investments: in 2015 they plan to spend 6.4 billion US dollars on Big Data technologies, with an average investment growth rate of 22% until 2020. Internet companies plan to spend $2.8 billion, with Big Data spending growing at an average rate of 26%.
The Economist Intelligence Unit survey identified the priority areas for the development of Big Data in 2014 and over the next 3 years; the distribution of answers is as follows:

According to IDC forecasts, market trends are as follows:

  • Over the next 5 years, the cost of cloud-based Big Data solutions will grow 3 times faster than the cost of on-premises solutions. Hybrid storage platforms will become popular.
  • Growth of applications using sophisticated and predictive analytics, including machine learning, will accelerate in 2015; the market for such applications will grow 65% faster than the market for applications that do not use predictive analytics.
  • Media analytics will triple in 2015 and become a key growth driver for the Big Data technology market.
  • The trend to implement solutions for analyzing the constant flow of information that is applicable to the Internet of things will accelerate.
  • By 2018, 50% of users will interact with services based on cognitive computing.
Market Drivers and Limiters
IDC experts identified 3 drivers of the Big Data market in 2015:

According to the Accenture survey, data security issues are now the main barrier to the adoption of Big Data technologies: more than 51% of respondents confirmed that they are concerned about data protection and privacy. 47% of companies reported that they could not implement Big Data because of a limited budget, and 41% of companies named a lack of qualified personnel as a problem.

Wikibon predicts that the Big Data market will grow to $38.4 billion in 2015, up 36% year-on-year. In the coming years, there will be a decline in growth rates to 10% in 2017. Taking into account these forecasts, the market size in 2020 will be equal to 68.7 billion US dollars.

The distribution of the global Big Data market by business category will look like this:

As the diagram shows, most of the market will be occupied by technologies for improving customer service. Targeted marketing will be the second priority for companies until 2019; in 2020, according to Heavy Reading's forecast, it will give way to solutions for improving operational efficiency.
The “improving customer service” segment will also have the highest growth rate, increasing by 49% annually.
The market forecast by Big Data subtype looks like this:

As can be seen from the chart, professional services occupy the predominant market share. Applications with analytics will have the highest growth rate: their share will grow from the current 12% to 18% in 2020, and the volume of this segment will reach 12.3 billion US dollars. The share of computing equipment, on the contrary, will fall from 20% to 14% and will amount to about 9.3 billion US dollars in 2020. The cloud technology market will gradually grow and reach 6.3 billion US dollars in 2020. The share of data storage solutions will decrease from 15% in 2014 to 13% in 2020, which in monetary terms equals 8.9 billion US dollars.
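
A quick consistency check of these segment figures is to divide each segment's 2020 volume by its 2020 share and compare the implied total with the 68.7 billion US dollars forecast quoted earlier. The sketch below assumes that the segment forecast and the total forecast refer to the same market definition, which the text does not state explicitly; the variable names are illustrative.

```python
TOTAL_2020 = 68.7  # total market forecast for 2020 quoted in the text, billion USD

# segment: (share of the market in 2020, volume in billion USD), as quoted above
segments = {
    "applications with analytics": (0.18, 12.3),
    "computing equipment": (0.14, 9.3),
    "data storage": (0.13, 8.9),
}

# Total market size implied by each (share, volume) pair
implied_totals = {name: volume / share for name, (share, volume) in segments.items()}

for name, total in implied_totals.items():
    deviation = (total - TOTAL_2020) / TOTAL_2020
    print(f"{name}: implied total {total:.1f} billion USD ({deviation:+.1%} vs forecast)")
```

Under this assumption all three implied totals fall within a few percent of the quoted forecast, so the shares and dollar volumes are mutually consistent after rounding.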
According to Bain & Company’s Insights Analysis forecast, the distribution of the Big Data market by industry in 2020 will look like this:

  • The financial industry will spend $6.4 billion on Big Data, with an average growth rate of 22% per year;
  • Internet companies will spend $2.8 billion, with an average growth rate of 26% over the next 5 years;
  • Public sector spending will be comparable to that of Internet companies, but the growth rate will be lower - 22%;
  • The telecommunications sector will grow at an average rate of 40% and reach $1.2 billion in 2020;
  • Energy companies will invest a relatively small amount in these technologies, 800 million US dollars, but their growth rate will be one of the highest: 54% annually.
Thus, companies in the financial industry will take a large share of the Big Data market in 2020, and energy will be the fastest growing sector.
Following analysts' forecasts, the total market volume will increase in the coming years. The growth of the market will be ensured by the introduction of Big Data technologies in the developing countries of the world, as can be seen from the graph below.

The predicted market size will depend on how developing countries perceive Big Data technologies, whether they will be as popular as in developed countries. In 2014, the developing countries of the world accounted for 40% of the accumulated information. According to EMC's forecast, the current market structure, dominated by developed countries, will change as early as 2017. According to EMC analytics, in 2020 the share of developing countries will be more than 60%.
According to Cisco and EMC, the developing countries of the world will actively work with Big Data, largely because the technologies are becoming available and these countries are accumulating enough information to reach the level of Big Data. The world map on the next page will show the growth forecast and growth rate of Big Data by region.

ANALYSIS OF THE RUSSIAN MARKET

Current state of the Russian market

According to a study by CNews Analytics and Oracle, the maturity of the Russian Big Data market rose over the past year. Respondents representing 108 large enterprises from different industries showed a higher degree of awareness of these technologies, as well as an understanding of the potential of such solutions for their business.
As of 2014, according to IDC, Russia had accumulated 155 exabytes of information, which is only 1.8% of the world's data. By 2020 the volume of information will reach 980 exabytes, or 2.2% of the world's data. Thus, the average growth rate of the volume of information will be 36% per year.
IDC estimates the Russian market at $340 million, of which $100 million is SAP solutions and approximately $240 million is similar solutions from Oracle, IBM, SAS, Microsoft, etc.
The growth rate of the Russian Big Data market is at least 50% per year.
It is predicted that the positive dynamics in this sector of the Russian IT market will continue even amid general economic stagnation. This is because businesses continue to demand solutions that can improve work efficiency, optimize costs, improve forecasting accuracy and minimize possible company risks.
The main providers of services in the field of Big Data in the Russian market are:
  • Oracle
  • Microsoft
  • Cloudera
  • Hortonworks
  • Teradata.
Overview of the market by industry and the experience of using Big Data in companies
According to CNews, only 10% of companies in Russia have started using Big Data technologies, while the share of such companies in the world is about 30%. Readiness for Big Data projects is growing in many sectors of the Russian economy, according to a report from CNews Analytics and Oracle. More than a third of the surveyed companies (37%) have started working with Big Data technologies: 20% are already using such solutions, and 17% are starting to experiment with them. Another third of respondents are currently considering this possibility.

In Russia, Big Data technologies are most popular in the banking sector and telecom, but they are also in demand in the mining industry, energy, retail, logistics companies and the public sector.
Examples of the use of Big Data in Russian conditions are considered below.

Telecom
Telecom operators have some of the largest databases, which allows them to analyze the accumulated information in great depth.
One of the areas of application of Big Data technology is subscriber loyalty management.
The main goal of data analysis is to retain existing customers and attract new ones. To do this, companies segment customers, analyze their traffic, and determine the social affiliation of each subscriber. In addition to using information for marketing purposes, telecom companies use the technology to prevent fraudulent financial transactions.
Vimpelcom is one of the most striking examples in this industry. The company uses Big Data to improve the quality of service for each subscriber, for reporting, for data analysis in network development, for combating spam and for personalizing services.

Banks
Specialists from the financial industry make up a significant proportion of Big Data users. One successful experiment was carried out at the Ural Bank for Reconstruction and Development, where the information base began to be used to analyze customers, and the bank began to offer specialized loan offers, deposits and other services. During the year of using these technologies, the bank's retail loan portfolio grew by 55%.
Alfa-Bank analyzes information from social networks, processes loan applications, and analyzes the behavior of users of the company's website.
Sberbank has also begun processing its data to segment customers, prevent fraud, cross-sell, and manage risk. In the future, it plans to improve service and analyze customer actions in real time.
The All-Russian Regional Development Bank analyzes the behavior of plastic card holders. This makes it possible to identify transactions that are atypical for a particular client, thereby increasing the likelihood of detecting theft of funds from plastic cards.
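
The idea of flagging transactions that are atypical for a particular client can be illustrated with a simple z-score rule: an amount is flagged when it lies too many standard deviations away from the client's own payment history. This is only a minimal sketch of one common technique; the text does not say which method the bank uses, and the function name, threshold and sample amounts are all hypothetical.

```python
from statistics import mean, stdev


def atypical_transactions(history, new_amounts, threshold=3.0):
    """Return the amounts whose z-score against the client's history exceeds the threshold."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two data points
    if sigma == 0:
        # All past payments are identical, so any different amount is atypical
        return [amount for amount in new_amounts if amount != mu]
    return [amount for amount in new_amounts if abs(amount - mu) / sigma > threshold]


# Hypothetical past card payments of one client and two new transactions
history = [1200, 950, 1100, 1300, 1050, 990, 1150]
flagged = atypical_transactions(history, [1080, 25000])
print(flagged)
```

A production system would use many more features than the amount (merchant, location, time of day) and a model trained on known fraud cases, but the per-client baseline shown here is the core of "atypical for a particular client".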

Retail
In Russia, Big Data technologies have been implemented by both online and offline retailers. Today, according to CNews Analytics, Big Data is used by 20% of retailers. 75% of retail professionals consider Big Data necessary for developing a competitive strategy for promoting a company. According to Hadoop statistics, after the introduction of Big Data technology, profit in trade organizations grows by 7-10%.
M.Video specialists report improved logistics planning after the implementation of SAP HANA; in addition, the preparation of annual reports was reduced from 10 days to 3, and the daily data loading time from 3 hours to 30 minutes.
Wikimart uses these technologies to generate recommendations for site visitors.
One of the first offline stores to introduce Big Data analysis in Russia was Lenta. With the help of Big Data, the retailer began to study information about customers from cash receipts. It collects this information to build behavioral models that enable more informed decision making at the operational and business level.

Oil and gas industry
In this industry, the scope of Big Data is quite wide. Big Data technologies can be applied in the extraction of minerals from the subsoil. With their help, it is possible to analyze the extraction process itself and the most effective extraction methods, track the drilling process, analyze the quality of raw materials, and support the processing and marketing of the final product. In Russia, these technologies are already being used by Transneft and Rosneft.

State bodies
In countries such as Germany, Australia, Spain, Japan, Brazil and Pakistan, Big Data technologies are used to solve national problems. These technologies help public authorities provide services to the population more effectively and deliver targeted social support.
In Russia, these technologies are beginning to be adopted by government agencies such as the Pension Fund, the Federal Tax Service and the Compulsory Medical Insurance Fund. The potential for implementing projects using Big Data is large; these technologies could help improve the quality of services and, as a result, the standard of living of the population.

Logistics and transport
Big Data can also be used by transport companies. With the help of Big Data technologies, it is possible to track a vehicle fleet, account for fuel costs, and monitor customer requests.
Russian Railways implemented Big Data technologies together with SAP. These technologies helped to reduce the reporting time by 43.5 times (from 14.5 hours to 20 minutes) and improve the accuracy of cost allocation by 40 times. Big Data was also introduced into the processes of planning and tariff regulation. In total, the company uses more than 300 systems based on SAP solutions, with 4 data centers involved and 220,000 users.

Main market drivers and constraints
Drivers for the development of Big Data technologies in the Russian market are:
  • Increased user interest in the possibilities of Big Data as a way to increase a company's competitiveness;
  • Development of methods for processing media files at the global level;
  • Transfer of servers processing personal information to the territory of Russia, in accordance with the adopted law on the storage and processing of personal data;
  • Implementation of the industry plan for software import substitution. This plan includes government support for domestic software manufacturers, as well as preferences for domestic IT products in publicly funded procurement;
  • In the new economic situation, in which the dollar exchange rate has almost doubled, there will be a trend towards greater use of Russian cloud service providers rather than foreign ones;
  • Creation of technology parks that contribute to the development of the information technology market, including the Big Data market;
  • State program for the introduction of grid systems, which are based on Big Data technologies.

The main barriers to the development of Big Data in the Russian market are:

  • Ensuring the security and confidentiality of data;
  • Lack of qualified personnel;
  • Insufficient accumulated information resources in most Russian companies, which have not yet reached the level of Big Data;
  • Difficulties in integrating new technologies into companies' established information systems;
  • The high cost of Big Data technologies, which limits the number of enterprises able to implement them;
  • Political and economic uncertainty, leading to capital flight and a freeze on investment projects on Russian territory;
  • Rising prices for imported products and a surge in inflation, which, according to IDC, hinder the development of the entire IT market.

Russian market forecast
As of today, Big Data is not as widely adopted in Russia as in developed countries. Most Russian companies show interest in it but hesitate to take advantage of its opportunities.
Examples of large companies that have already benefited from Big Data technologies are raising awareness of what these technologies can do.
Analysts also have quite optimistic forecasts for the Russian market. IDC believes that the share of the Russian market will increase over the next 5 years, in contrast to the markets in Germany and Japan.
By 2020, the volume of data in Russia will grow from the current 1.8% to 2.2% of the global data volume. According to EMC, the amount of information will grow from the current 155 exabytes to 980 exabytes in 2020.
For now, Russia is still accumulating information toward the scale of Big Data.
According to a CNews Analytics survey, 44% of surveyed companies work with data no larger than 100 terabytes*, and only 13% work with volumes above 500 terabytes.

Nevertheless, the Russian market will grow, following global trends. As of 2014, IDC estimates the market size at $340 million.
The market grew by 50% per year in previous years; if that rate holds, the market volume will reach 1.7 billion US dollars in 2018. The share of the Russian market in the world market will be about 3%, up from the current 1.2%.
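
The 2018 figure follows from simple compound growth. The sketch below reproduces the arithmetic using only the two numbers quoted in the text (the $340 million IDC estimate for 2014 and 50% annual growth). The function and variable names are illustrative, and the constant growth rate is the text's own assumption, not a forecast.

```python
def project_market(base_musd: float, annual_growth: float, years: int) -> float:
    """Project a market size forward, assuming a constant annual growth rate."""
    return base_musd * (1.0 + annual_growth) ** years


BASE_2014_MUSD = 340.0  # IDC estimate quoted in the text, million USD
ANNUAL_GROWTH = 0.50    # 50% per year, as quoted in the text

if __name__ == "__main__":
    for year in range(2015, 2019):
        size = project_market(BASE_2014_MUSD, ANNUAL_GROWTH, year - 2014)
        print(f"{year}: about {size:,.0f} million USD")
```

Four years of 50% growth multiply the base by 1.5^4 = 5.0625, which gives roughly 1.72 billion US dollars for 2018, in line with the 1.7 billion quoted above.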

The most receptive industries to the use of Big Data in Russia include:

  • Retail and banks - above all, analysis of the customer base and evaluation of the effect of marketing campaigns;
  • Telecom - customer base segmentation and traffic monetization;
  • Public sector - reporting, analysis of applications from the public, etc.;
  • Oil companies - monitoring of operations and planning of production and sales;
  • Energy companies - creation of intelligent electric power systems, operational monitoring and forecasting.
In developed countries, Big Data has become widespread in healthcare, insurance, metallurgy, Internet companies and manufacturing. Most likely, Russian companies in these areas will soon appreciate the effect of Big Data as well and adopt these technologies in their industries.
In Russia, as in the rest of the world, the near future will bring a trend towards data visualization, analysis of media files and the development of the Internet of Things.
Despite the general stagnation of the economy, analysts predict further growth in the Big Data market in the coming years. The main reason is that Big Data technologies give their users a competitive advantage: higher operational efficiency, an additional flow of customers, lower risks and the ability to apply forecasting technologies.
Thus, we can conclude that the Big Data segment in Russia is at the formation stage, but the demand for these technologies is increasing every year.

Main results of the market analysis

World market
At the end of 2014, the Big Data market was characterized by the following parameters:
  • the market volume amounted to 28.5 billion US dollars, an increase of 45% compared to the previous year;
  • services accounted for the largest share of Big Data market revenue, 40% of the total;
  • 36% of revenue came from Big Data applications and analytics, 17% from computing hardware and 15% from storage technologies;
  • in-memory platforms such as SAP HANA and those from Oracle were the most popular for solving Big Data problems;
  • the number of companies with implemented projects in the field of Big Data management increased by 125%.
The market forecast for the next years is as follows:
  • in 2015 the market volume will reach 38.4 billion US dollars, in 2020 - 68.7 billion US dollars;
  • the average growth rate will be 16% annually;
  • average company spending on Big Data technologies will be $13.8 million for large companies and $1.6 million for small and medium-sized businesses;
  • technologies will have the greatest prevalence in the areas of customer service and targeted marketing;
  • in 2017, the global market structure will change towards the predominance of user companies from developing countries.
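
The average growth rate in the forecast can be cross-checked against the market volumes quoted above. The sketch below computes a compound annual growth rate (CAGR) from those figures. The function name is illustrative, and reading "average growth rate" as CAGR is an assumption, since the text does not say how the average was calculated.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values that are `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the text, in billion USD.
VOLUME_2014 = 28.5
VOLUME_2015 = 38.4
VOLUME_2020 = 68.7

if __name__ == "__main__":
    print(f"2014-2020: {cagr(VOLUME_2014, VOLUME_2020, 6):.1%} per year")
    print(f"2015-2020: {cagr(VOLUME_2015, VOLUME_2020, 5):.1%} per year")
```

Under this reading, the quoted 16% matches growth measured from the 2014 base of 28.5 billion. Measured from the 2015 forecast of 38.4 billion, the implied rate is closer to 12% per year.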
Russian market
The Russian Big Data market is at the formation stage. The results for 2014 are as follows:
  • the market volume reached 340 million US dollars;
  • the average market growth rate in previous years was 50% annually;
  • the total amount of accumulated information was 155 exabytes;
  • 10% of Russian companies have started using Big Data technologies;
  • Big Data technologies were more popular in the banking sector, telecom, Internet companies and retail.
The forecast for the Russian market for the coming years is as follows:
  • the volume of the Russian market in 2015 will reach 500 million US dollars, and in 2018 - 1.7 billion US dollars;
  • the share of the Russian market in the world market will be about 3% in 2018;
  • the amount of accumulated data in 2020 will be 980 exabytes;
  • data will grow to 2.2% of global data in 2020;
  • technologies of data visualization, analysis of media files and the Internet of things will gain the greatest popularity.
Based on the results of the analysis, we can conclude that the Big Data market is still in its early stages of development, and in the near future we will observe its growth and the expansion of the capabilities of these technologies.

Speaker: Philip Katz


Interviewer: Alexey Karlinsky

Many times we believed the promises of science fiction about an incredible future, and each time our hopes were shattered by a dull present. We still live on earth and our cars don't fly through the air. “We were deceived again!” - we think, and behind all these fantasies we once again miss the moment when the future really comes.

This time it happened with the advent of Big Data. We can ignore it, but we can no longer deny its impact on our lives. Philip Katz, an architect and Big Data specialist, explains how Big Data has quietly changed our cities and the way we live in them.

Philip is a multidisciplinary specialist: an architect by education and a Big Data specialist. He is a graduate of the Kazan University of Architecture and of the Strelka Institute of Media, Architecture and Design, and one of the founders of the Branch Point project. He teaches at the St. Petersburg National Research University of Information Technologies, Mechanics and Optics and does data analysis for Rambler&Co.

Philip, please tell us how Big Data technologies are used in architectural design and urban planning today?

Let's start with the fact that four years ago, when I studied at Strelka, no one knew about Big Data, at least in Russia. The world was only beginning to talk about it. A year later, everyone in Russia knew about it and had already gotten over the fever. It seems to me that this is largely the traditional dynamic: a new technology is raised onto a pedestal and praised, and then skepticism toward it appears quite quickly. The technology is knocked off its pedestal, and after that it integrates into society in a calmer mode.

If we talk about architectural or urban planning analytics, it seems to me that today it is a kind of compromise between modern technologies and traditional analysis. For example, a year ago I helped a friend of mine enter an architecture competition for students in the United States. The city manager provided the entrants with GIS files containing quite good descriptive data: transport routes, the traffic volume on those routes, where puddles appear every year, where it floods every five years, which blocks have high taxes, which blocks have a high percentage of Black residents. In the United States, statistics are highly detailed and the data is aggregated quite well, so even at the level of a competition project we could get some things ready-made. They did not have to be collected or analyzed.

The most useful analytics, in my opinion, boils down to taking some data as fact and designing on that basis. And although the data may be the same for everyone, it is still read and understood in completely different ways.

Google claims that their self-driving cars can reduce the number of car accidents and help to use fuel and space on the roads more efficiently / photo: Google.com

How have you used Big Data technologies in your practice?

For a long time my colleagues Edik Khaiman and Sasha Boldyreva and I ran the "Branch Point" project. We tried to discuss and develop digital design, and of course the dream and ultimate goal we all professed was design based on parameters. More precisely, our ultimate dream was to find new formal solutions with the help of some clever code that would meet our requirements, yet produce a form that was not the one we had put in but something unexpected and beautiful.

As the project matured, we all came to understand that this dream was not only unattainable: the very idea that a building should be designed entirely from data was questionable. It is something to strive for, while understanding that you will never get there.

Here an important dialectical point arises for me. Suppose we are building an algorithm and understand that, because of how genetic algorithms work, it needs fairly simple but still formal parameters. In a complex system, and a building or a district is a complex system, many such parameters appear at once and have to be reduced to a common denominator. You always need a primary formal gesture, some form: a cylinder, a parallelepiped, a pyramid and so on.

If we look at the work of Zaha Hadid, then there is always some elegant formal gesture at the heart of the project. It can then be modified digitally, but always remains at the heart of everything and belongs to the pen of the author. The genetic algorithm can then choose the best of the resulting options, but it will never be able to invent them.

That is, at the heart of the design will always be the human will. How, in this case, will the degree of human involvement in design change with the development of Big Data?

In the future, I see some kind of analytical engine - a large and complex quantum computer, for example, or telepaths and parapsychologists immersed in deprivation chambers, who predict something or suggest something worth paying attention to.

I think people will never be squeezed out of the process. All these things (Big Data analysis methods) are called decision support algorithms, and their essence boils down to pulling out anomalies in the dynamics of processes as efficiently as possible and minimizing the share of routine technical work a person has to do. An analyst must be an expert in working with them, and the algorithms can bring him everything on a silver platter except the decision itself. Of course, there is a technical threshold for entering this discipline, but analytics itself is an art form, where the algorithm for working with data is a picture. A masterpiece.

Drones equipped with a camera can independently patrol a given area and transfer images to the information center in real time / photo: Kevin Baird / Flickr.com

Big Data cannot cover all the information. How to deal with what is not taken into account when analyzing Big Data?

Indeed, analysts are often criticized for describing only those who are connected to the Internet, while those who are not drop out of the analysis. This is absolutely true, but there is a logic in its defense. Speaking cynically, if we do not know the problems of a grandmother who is embarrassed to write on the Internet because she is not used to it, then we can ignore them, simply because once this approach is in use, either the grandmother herself or her grandson, on her behalf, will eventually write.

Another problem is that any technology for collecting or storing data is itself always the first source of error. At the same time, it is impossible in principle to trace all the factors behind why people acted one way and not another. Big Data, first and foremost, does not provide an answer. It allows you to ask serious questions.

How does the opportunity to ask questions in new ways change our perception of the city?

Eduard Khaiman once coined the term "plagopolis". The idea is that the modern city is becoming more and more proactive and dynamic. Today it is a kind of environment with its own flows and movements, where the liquid flowing between the vessels regulates itself all the time. You can grab a point and fix it only very conditionally: it will instantly change and change the other points around it. For me, this idea is a pretty practical thing to work with. It is now clear that we can no longer perceive the city as something mechanical.

Is this idea accepted in Russian urban planning?

At the level of urban planning in this Russian sense, this is not obvious. One way or another, we start by drawing paths, streets, and we believe that this will be the case in the end. At best, we begin to think that we should check how to do it correctly, and then it will either be the way we draw, or people themselves will redo everything later.

In general, assertions based on stereotypes and abstract ideas are very annoying today. Architects and urban planners, above all, drive me crazy. They simply say that "pedestrians are better than motorists" or that "creative business will turn an industrial park into heaven on earth." I would like to see a basic calculation behind any of these claims, because it may be so, but it may not be, and in most cases it is wrong in some way.

How then can Big Data help us better understand the city?

The city is always the elephant from the parable of the blind men who try to describe it by touch. We always work the same way: someone grabs it by the rear, someone by the ear, someone by the trunk. And each of them says that he sees an elephant. In our case, we all likewise believe that we are sighted and know what a city is.

Big Data keeps us from touching just one spot: it lets us roughly imagine the overall shape of the elephant and understand that we are touching roughly this spot, but that there are others. I get huge reports on the city, and I can always dig into some specific ten lines of data, look and ask: why is that? Usually this becomes the start of some kind of investigation, research, story.

GIS data combined with spatial modeling algorithms help predict the level of isolation in a selected area / photo: Trevor Patt / Flickr.com

Are these reflections inspired by Big Data somehow expressed in real projects later on?

There is a so-called "urban acupuncture" method. Its essence is that you look for the city's pain nodes, so to speak, and in these small nodes - spaces of one block at most, and preferably a single building or even a small area between buildings - you make some kind of change. In budget terms it is completely microscopic, yet the changes for the city as a whole, if these nodes are calculated correctly, are huge.

Although "urban acupuncture" today is rather a speculative project, smart spatial solutions already exist, with traffic lights in a single system, for example. Coupled with smart roads, they allow you to change the space, and this can bring unexpected payoffs. The robotization of industries is already under way, and this also adds value. If drones now begin to transport goods, urban logistics will "smerdzhitsya" with all of this (from the English "to merge". A.K.): there are numbers there, and there are numbers here. It will definitely be much easier to work with this than with human truckers.

The technology I'm currently inspired by, and I hope something architectural will come out of it, is Amazon's new project, in which a smart speaker stands in the center of the house, listens to all your questions and answers them. Kind of like Siri, only in the house. This technology is likely to change the sense of space in the city more than any algorithm.

So the city will rely more and more on software?

Exactly. Input/output and the various interfaces through which a person obtains information are now changing a great deal institutionally. From my point of view, a service for calling a cheap taxi changes my life much more than 90 percent of urban planning decisions. Taxis change a lot in my perception of the city. Contrary to all previous experience, with the arrival of Yandex.Taxi and competition among taxi services it turned out that our taxi drivers are polite, the fare is known in advance, and they respond quickly - not at all like in, say, New York.

I think the most important service that could make huge profits from uberification is prostitution. The hypothetical user is shy, and maybe that's why many people do not use the services of prostitutes - it seems to them something dangerous, scary and incomprehensible. Sitting on their phone - it would certainly be much easier for them. Of course, this would immediately take bread from the pimps and completely change the business. Just colossal! I think this will happen in some liberal country soon.

Do you think people will be able to work with Big Data technologies personally in the future?

I think it's all leading up to this. Technological complexity will increase, and this is understandable, but in practice we will learn how to package it properly. Sleek interfaces (from the English "sleek": thin, graceful. A.K.) today, to some extent, simplify our perception of how everything happens. Here's a button, here's a knob, and that's it. Today, the more you can hide from the average person without losing function, the better, because people are a little intimidated by all this complexity. Although the famous technology from Minority Report never appeared, the film captures the feel of what is happening now very accurately.

What will it be? What do you think big data will face in the near future?

They appeared as a kind of fashionable topic and are now slowly fading away, because the most obvious things have already been done. Further, it will be necessary to work out the technical mechanisms in the methodology - not in a romantic, but in a utilitarian form. In five years, I'm sure there will be a fairly well-paid and, perhaps, rather boring position of some kind of digital analyst in the mayor's office, at ministries and businesses.

At the same time, Big Data has a certain disease. There are people who understand what they are doing, and there are people who feed on it without really understanding how Big Data works. A gap between professional technologists and people who understand what all this can be used for exists in any business and in any science, and this, of course, is a problem. People who know the technology side and experiment with new solutions rarely do really useful things, and people who know how to apply these developments cannot create a quality product alone either. Therefore, the only way to develop when working with Big Data is to find new ways for specialists to interact.

MegaFon has developed a test version of a service for analyzing passenger traffic based on big data and made it available to Russian Railways subsidiaries, RBC reports, citing Maxim Motin, a representative of the operator. The tool helps determine the size and detailed characteristics of the transportation market, as well as a transport company's share of it, in close to real time.

Now preparatory work is underway to implement a system for analyzing Big Data, Oleg Yemchenko, head of the ERP systems department (system for enterprise resource planning) of the information technology department of the FPC RZD, confirmed. “This can only be realized in a specific project in 2016,” Yemchenko said.

The MegaFon geoanalytics service was launched back in 2013; its initial goal was to predict network loads. With its help, you can estimate the volume of passenger traffic accurately and get information about routes (who travels, when, from where and to where) and a breakdown by type of transport. The service also evaluates the purchasing power of passengers and the nature of their travel (business trips, tourism, personal needs). All data is anonymised.

It is possible to analyze more than 10,000 events per second using more than a thousand parameters, said Roman Postnikov, director of segment marketing and customer analytics at MegaFon. Over the past three years, more than 5 petabytes of information has been accumulated - a volume comparable to more than 30 billion photos on Facebook. Postnikov assures that each client has its own list of parameters for analysis, that is, in fact, we are talking about a universal cloud solution, which can be used by completely different types of customers who need to analyze large amounts of data.

MegaFon calculated that transport companies in Russia spend more than 1.2 billion rubles annually on passenger traffic research. “At the same time, the companies themselves can collect only a part of the data available to them, and our service makes it possible to see the full picture of the market,” says Postnikov. Even if the service lets a carrier increase its share of the overall passenger transportation market by just 1.5-2%, that amounts to billions of rubles, he says.

Big Data solutions can also be used to manage urban infrastructure. The Expert Center of the Electronic State, the Government of Moscow is going to conclude a contract under which the city will receive aggregated depersonalized geospatial data of users of local telecom operators in 11 different sections within two years. The consumers of this information will be the State Unitary Enterprise "NI and PI of the General Plan of Moscow", the Department of Transport and Road Infrastructure Development, the Department of Culture and other metropolitan departments.

In Bashkiria, “big data” was used for the first time to analyze the tourist flow. The State Committee for Tourism of the Republic of Bashkortostan commissioned a study from the Ural Center for Monitoring and Analytics, which was based on the movement patterns of mobile phone subscribers.

According to studies, from January to November 2018, the republic was visited by 1.656 million tourists, 60% of whom are men aged 30 to 45, as a rule, employees of commercial organizations with higher education, with an income of 40 thousand rubles a month. The average length of stay is 3.8 days.

The peak of the tourist flow falls on the summer. In June 2018, the number of people entering was 179 thousand people, in July - 215 thousand people. The minimum figure was observed in February - 118 thousand people.

Guests came from various regions of Russia. The largest shares of visitors came from Moscow, the Moscow region and Tatarstan - 11% each. Residents of the Orenburg, Chelyabinsk and Samara regions accounted for 9%, 7% and 6% of the tourist flow, respectively. They were followed by the Sverdlovsk region and KhMAO with 3.8% each, the Tyumen region with 3%, and the Perm Territory and Udmurtia with slightly more than 2% each.

Foreign tourists came from neighboring countries, as well as India, Spain, Italy, Yemen, Germany, Turkey, Egypt, Nigeria, Israel, USA, Czech Republic, Saudi Arabia, Bulgaria, Iran, China and Finland.

A sociological study was also conducted in the form of surveys of tourists. 37% of respondents chose to stay in a hotel, 17% stayed with friends or relatives, and 11% preferred hostels. By purpose of travel, the tourist flow was distributed as follows: trips to relatives (30%), business tourism (28%), health tourism (18%), sightseeing (12%), active tourism (8%), pilgrimage tourism (0.2%).

40% of tourists came to Bashkiria not for the first time. 20% came on the recommendation of friends, colleagues or relatives. 24% arrived on a business trip. The least used sources of information when choosing a destination were Internet portals (3.4%), social networks (1.2%) and advertising in the media (0.5%).

In 2019, the tourist attractiveness of certain areas of the republic will also be analyzed, the state committee said.

"Geoanalytics using the capabilities of mobile operators is an advanced method of counting the tourist flow. Currently, only Moscow and Nizhny Novgorod have such experience, and let me remind you that the latter holds first place in the national tourist rating in the Volga Federal District, with Bashkortostan second," said Azamat Galin, deputy head of the State Committee for Tourism and Entrepreneurship of the Republic of Bashkortostan.

According to the Turstat portal, at the end of 2018, Bashkiria entered the Top 15 in the rating of domestic and inbound tourism, taking 13th place with the number of tourists over 2.5 million people (+13% compared to 2017).

These initiatives of the Government of Bashkiria are very interesting and useful for studying the tourist flow and planning their activities in order to promote the region's tourism products through the comprehensive provision of services to tourists, including using IT technologies.

By the way, the news mentions Nizhny Novgorod. We previously reported that this city has implemented the "Guest Card" project, which makes it possible to track the movements and interests of tourists visiting the city's sights, while tourists receive various discounts and can use public transport free of charge.

All these initiatives are being implemented in the regions separately and in isolation, without federal participation.

WHAT IS THIS ABOUT?

The point is that the question of introducing electronic visas for foreign citizens arriving in the Russian Federation is currently being decided. According to the Association "Safety of Tourism", introducing such visas with special digital technologies makes no sense unless they are integrated with the system of migration registration, the registration of tourists in hotels, and the "guest card" services mentioned above. Anything less is not a state-level approach.

In our opinion, a systematic, state approach should include taking into account all these elements. A tourist must register at the border once, having received an electronic tag, and then move around the country, register in hotels (already without migration registration), visit museums without problems, receive various discounts, use public transport for free or with discounts. And at the same time, this approach will allow both to ensure national security by recording the movements of foreigners, and to free hoteliers from the headache of registration and migration accounting, and tourism management bodies in the constituent entities of the Russian Federation to receive information about the most popular objects of the region (city) and, on its basis, form tourist offers, thereby getting the maximum benefit.

AND EVERYTHING NEEDED FOR THIS IS IN PLACE!

Namely, Decree of the Government of the Russian Federation No. 813 of August 6, 2015, which approved the Regulations on the state system of migration and registration records, and whose implementation can significantly affect the hospitality industry and increase the inbound tourist flow in general. This is exactly what Sergei Gruzd, Chairman of the Board of the Association "Safety of Tourism", told participants of the round table "Actual issues of using electronic visas for foreign citizens arriving in the Russian Federation and improving the legislation of the Russian Federation in this area" at the Federation Council on December 6, 2018.

Recall that the issues of improving migration and registration records, simplifying the visa regime, and developing and implementing a single biometric identifier for travel will be discussed at the International Forum "Tourism Safety" - TSIF - 2019. This Forum is a key professional event where representatives of authorities, the professional community and business discuss topical issues of ensuring the safety of tourism on one platform. The format of the Forum provides for 4 breakout sessions.

2.5 billion gigabytes of data. Analyst companies predict that the amount of data generated annually will reach 43 trillion gigabytes by 2020. Among all this information (tweets, reposts and videos) there is some that many companies use to develop services. Big data is already used in marketing to gauge customers' wishes, in medicine to improve diagnostics, and in banking to create personalized offers. It is also used in the automotive sector, helping drivers reach their destination faster. How? This is what we will talk about today.

Help to avoid traffic jams

Data helps drivers get to their destination in the truest sense of the word. We are talking about navigators - they build the shortest route without traffic jams and roadworks.

Navigators send their coordinates to the app provider's system every few seconds. Based on the received data, the algorithm builds a track, that is, a route with information about the speed of movement. Based on the sum of tracks received from many drivers, traffic jams are detected.
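
The track-building step can be illustrated with a minimal sketch. It assumes each navigator report is a timestamp plus latitude and longitude, and it derives the speed of each leg from the great-circle (haversine) distance between consecutive reports. The `Ping` type, the function names and the choice of the haversine formula are illustrative assumptions. The text does not describe how any particular provider computes its tracks.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class Ping:
    """One coordinate report from a navigator (hypothetical structure)."""
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_m(a: Ping, b: Ping) -> float:
    """Great-circle distance between two pings, in metres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def build_track(pings: list[Ping]) -> list[tuple[Ping, Ping, float]]:
    """Return (start, end, speed_kmh) for each pair of consecutive pings."""
    ordered = sorted(pings, key=lambda p: p.t)
    track = []
    for a, b in zip(ordered, ordered[1:]):
        dt = b.t - a.t
        if dt <= 0:
            continue  # ignore duplicate timestamps
        track.append((a, b, haversine_m(a, b) / dt * 3.6))  # m/s to km/h
    return track
```

Summing such tracks from many drivers over the same stretch of road is what lets a service estimate the current speed on that stretch.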

Cars are part of the network and form a stable flow of information. At the same time, they can exchange data with the surrounding infrastructure. Surveillance cameras installed at intersections in the city can also be used to detect traffic congestion. Researchers are working on various options for implementing such solutions.

For example, to create car-to-car and car-to-infrastructure communications, scientists propose using OBU (On-Board Units) modules, which determine the car's position and speed in limited time intervals. This information will go to the RSU (Roadside Unit) and then to the clusters responsible for data aggregation and processing.

Clusters receive the data via an API and interpret it. For example, if several users of the navigator application are moving at low speed in one area, the system infers that traffic there is congested. You can read more about one of the proposed algorithms.
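
The rule "several users moving slowly in one area" can be sketched as follows. The sketch assumes speed samples arrive as (user, latitude, longitude, speed) tuples and groups them into square grid cells. The cell size, the speed threshold and the minimum number of users are all illustrative assumptions, not values from the text or from any real service.

```python
from collections import defaultdict
from statistics import median


def grid_cell(lat: float, lon: float, cell_deg: float = 0.005) -> tuple[int, int]:
    """Map a coordinate to a square grid cell (hypothetical cell size)."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def congested_cells(samples, speed_limit_kmh=15.0, min_users=3, cell_deg=0.005):
    """Flag cells where at least `min_users` distinct users move slowly.

    `samples` is an iterable of (user_id, lat, lon, speed_kmh) tuples.
    A user counts as slow in a cell when the median of that user's
    speeds in the cell is below `speed_limit_kmh`.
    """
    per_cell = defaultdict(lambda: defaultdict(list))
    for user, lat, lon, speed in samples:
        per_cell[grid_cell(lat, lon, cell_deg)][user].append(speed)

    flagged = set()
    for cell, users in per_cell.items():
        slow_users = [u for u, speeds in users.items() if median(speeds) < speed_limit_kmh]
        if len(slow_users) >= min_users:
            flagged.add(cell)
    return flagged
```

Requiring several distinct users, rather than several samples, keeps one parked car from being read as a traffic jam.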

Users can also send data to the service on their own: information about accidents, repairs, potholes, etc. The aggregator assembles the information it receives, piece by piece, into a single picture and, matching the data against GPS coordinates, marks points of road congestion. Navigation routes are built from this data.

Once a route is built, the app monitors it to stay up to date with the situation along the way. The algorithm is responsible for building a route that is free of traffic jams. If there is a traffic jam on the route, the algorithm looks for another way. If no alternative (even one with traffic jams, but faster) is found, the route remains the same. A simplified form of this algorithm is shown below:

Block diagram of a variant of the algorithm for route monitoring
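
The monitoring logic described in the text can also be written as a short sketch: keep the current route unless it has a traffic jam and a faster alternative exists, even if that alternative has jams of its own. The `Route` type, its fields and the idea that travel-time estimates are already available are illustrative assumptions. A real navigator would obtain them from its traffic service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate route (hypothetical structure)."""
    name: str
    eta_min: float  # estimated travel time in minutes, including current delays
    has_jam: bool   # whether a traffic jam is currently reported on the route


def monitor_step(current: Route, alternatives: list[Route]) -> Route:
    """One monitoring pass over the current route.

    Keep the current route if it is free of jams. Otherwise switch to the
    fastest alternative, but only if it beats the current travel time.
    """
    if not current.has_jam:
        return current
    faster = [r for r in alternatives if r.eta_min < current.eta_min]
    if not faster:
        return current  # nothing faster was found, so the route stays the same
    return min(faster, key=lambda r: r.eta_min)
```

Calling `monitor_step` periodically with fresh estimates gives the loop the text describes. Comparing travel times rather than simply avoiding jams is what allows a jammed but faster alternative to win.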

Scientists are confident that the accuracy of such systems will increase significantly when all or almost all cars begin to communicate with each other and exchange data. In the future, they will change the rules of behavior on the road. This opinion is shared by Tim Lomax, an analyst at the Texas A&M Transportation Institute.

“If cars start talking to each other, we won't need traffic lights,” says Tim. “The car, approaching the intersection, will report its intention to cross it, and the surrounding vehicles will know how to avoid a collision.” Lomax says that this will be a step towards the widespread use of self-driving cars.

They will take you to the place

Self-driving cars are another area where big data could have a significant impact. Self-driving cars are part of the Internet of Things and are leading to an increase in the amount of data generated. In order to build a route, the autopilot must understand what roads it will have to travel on and what it will meet on the way. To do this, cars, in addition to their own sensors, draw information from the so-called environment maps. In the future, this list will be replenished with other traffic participants and infrastructure elements: traffic lights, buildings, even trees.