In 2016, hopes for a genuine generational change in GPUs finally came true. That change had long been held back by the lack of manufacturing capacity for chips with significantly higher transistor density and clock speeds than the proven 28 nm process allowed. The 20 nm technology we had hoped for two years earlier turned out to be commercially unviable for chips as large as discrete GPUs. Because TSMC and Samsung, the likely contractors for AMD and NVIDIA, did not use FinFET at 20 nm, the potential gain in performance per watt over 28 nm was so modest that both companies preferred to wait for the mass adoption of the 14/16 nm nodes, which already use FinFET.

However, the years of tedious waiting are over, and now we can evaluate how GPU makers have used the capabilities of the updated process technology. As practice has shown once again, "nanometers" by themselves do not guarantee high energy efficiency, so the new NVIDIA and AMD architectures turned out to be very different in this respect. Additional intrigue came from the fact that the companies no longer use the services of a single foundry (TSMC), as they did in previous years. AMD chose GlobalFoundries to manufacture Polaris GPUs on its 14 nm FinFET technology. NVIDIA, for its part, still works with TSMC, which has a 16 nm FinFET process, for all Pascal chips except the low-end GP107, which is made by Samsung. It was Samsung's 14 nm FinFET line that GlobalFoundries licensed, so the GP107 and its rival Polaris 11 give us a convenient opportunity to compare the engineering achievements of AMD and NVIDIA on a similar production base.

However, we will not dive into technical details prematurely. In general, the offerings of both companies based on next-generation GPUs look as follows. NVIDIA has created a complete line of Pascal-architecture accelerators based on three consumer-grade GPUs: GP107, GP106 and GP104. However, the flagship slot, which will surely be filled by a card named GeForce GTX 1080 Ti, is currently vacant. A candidate for this position is a card with the GP102 processor, which is currently used only in NVIDIA's prosumer TITAN X accelerator and in Tesla compute accelerators.

AMD's progress so far is more modest. It has released two Polaris-family processors, which power products in the lower and middle categories of gaming graphics cards. The upper echelons will be occupied by the upcoming Vega family of GPUs, which is expected to feature a comprehensively upgraded GCN architecture (Polaris, from that point of view, is not that different from the 28 nm Fiji and Tonga chips).

NVIDIA Tesla P100 and new TITAN X

Through the efforts of Jensen Huang, NVIDIA's permanent head, the company already positions itself as a maker of general-purpose computing processors no less than as a maker of gaming GPUs. The signal that NVIDIA is taking the supercomputing business more seriously than ever was the division of the Pascal GPU line into gaming models on the one hand and compute models on the other.

Once the 16nm FinFET process went live at TSMC, NVIDIA put its first effort into the GP100 supercomputer chip, which debuted before Pascal's consumer product line.

The GP100 features an unprecedented number of transistors (15.3 billion) and shader ALUs (3840 CUDA cores). It is also the first accelerator equipped with HBM2 memory (16 GB), placed on the same silicon interposer as the GPU. The GP100 is used in the Tesla P100 accelerators, which were initially limited to supercomputers because of a special form factor with the NVLINK bus, but NVIDIA later released the Tesla P100 in the standard PCI Express expansion card format.

Initially, observers assumed that the GP100 could appear in gaming graphics cards. NVIDIA apparently did not rule out this possibility, since the chip has a full-fledged 3D rendering pipeline. But it is now clear that it is unlikely to ever go beyond the compute niche. For graphics, NVIDIA has a sister product, the GP102, which has the same set of shader ALUs, texture mapping units and ROPs as the GP100, but lacks the ballast of a large number of 64-bit CUDA cores, not to mention other architectural changes (fewer schedulers, a smaller L2 cache and so on). The result is a more compact core (12 billion transistors), which, together with the move from HBM2 memory to GDDR5X, allowed NVIDIA to bring the GP102 to a wider market.

For now, the GP102 is reserved for the TITAN X prosumer accelerator (not to be confused with the GeForce GTX TITAN X based on the Maxwell-architecture GM200 chip), which is positioned as a board for reduced-precision calculations (in the range from 8 to 32 bits, of which the 8- and 16-bit formats are NVIDIA's favorites for deep learning) even more than for games, although wealthy gamers can buy the card for $1,200. Indeed, in our gaming tests the TITAN X does not justify its cost with its 15-20 percent advantage over the GeForce GTX 1080, but overclocking comes to the rescue: if we compare an overclocked GTX 1080 and TITAN X, the latter is already 34% faster. Still, the future gaming flagship based on the GP102 will most likely have fewer active compute units or lose support for some compute functions (or both).

All in all, releasing massive GPUs like the GP100 and GP102 early in the 16nm FinFET process is a big achievement for NVIDIA, especially considering the challenges the company faced during the 40nm and 28nm period.

NVIDIA GeForce GTX 1070 and 1080

NVIDIA launched its line of GeForce 10-series gaming accelerators in its usual sequence - from the most powerful models to more budgetary ones. The GeForce GTX 1080 and other Pascal architecture gaming cards that have since been released most clearly show that NVIDIA has taken full advantage of the 14/16nm FinFET process to make chips more dense and energy efficient.

In addition, by creating Pascal, NVIDIA not only increased performance in various computational tasks (as the example of GP100 and GP102 showed), but also supplemented the Maxwell chip architecture with functions that optimize graphics rendering.

Briefly note the main innovations:

  • improved color compression with ratios up to 8:1;
  • the PolyMorph Engine's Simultaneous Multi-Projection function, which allows you to create up to 16 projections of the scene geometry in one pass (for VR and systems with multiple displays in the NVIDIA Surround configuration);
  • the ability to interrupt (preempt) a draw call during rendering and a command stream during compute, which, together with dynamic allocation of GPU compute resources, provides full support for asynchronous compute (Async Compute) - an additional source of performance in games under the DirectX 12 API and of reduced latency in VR.

The last point is especially interesting, since Maxwell chips were technically compatible with asynchronous computing (simultaneous work with the computational and graphical command queues), but the performance in this mode left much to be desired. Asynchronous computations in Pascal work as expected, allowing more efficient loading of the GPU in games with a separate thread for physics calculations (although, admittedly, for NVIDIA chips the problem of fully loading shader ALUs is not as acute as for AMD GPUs).

The GP104 processor used in the GTX 1070 and GTX 1080 is the successor to the GM204 (the second-tier chip of the Maxwell family), but NVIDIA has achieved such high clock speeds that the GTX 1080 outperforms the GTX TITAN X (based on a larger GPU) by 29% on average, and all within a more conservative thermal envelope (180 vs. 250 W). Even the GTX 1070, which is cut down more heavily than the GTX 970 was relative to the GTX 980 (and the GTX 1070 uses GDDR5 instead of the GTX 1080's GDDR5X), is still 5% faster than the GTX TITAN X.

NVIDIA has updated the display controller in Pascal, which is now compatible with the DisplayPort 1.3/1.4 and HDMI 2.0b interfaces and can therefore output a picture at a higher resolution or refresh rate over a single cable - up to 5K at 60 Hz or 4K at 120 Hz. 10/12-bit color representation provides support for high dynamic range (HDR) on the few screens that have this capability so far. The dedicated Pascal hardware block can encode and decode HEVC (H.265) video at up to 4K resolution, 10-bit color (12-bit for decoding) and 60 Hz.

Finally, Pascal removed the limitations of the previous version of the SLI bus. The developers raised the frequency of the interface and released a new, two-channel bridge.

You can read more about these features of the Pascal architecture in our GeForce GTX 1080 review. However, before moving on to the other new products of the past year, it is worth mentioning that with the GeForce 10 line NVIDIA will, for the first time, sell reference-design cards throughout the life of the respective models. They are now called Founders Edition and are sold above the recommended retail price for partner cards. For example, the GTX 1070 and GTX 1080 have recommended prices of $379 and $599 (already higher than what the GTX 970 and GTX 980 launched at), while the Founders Edition versions are priced at $449 and $699.

GeForce GTX 1050 and 1060

The GP106 chip brought the Pascal architecture to the mainstream gaming accelerator segment. Functionally it is no different from the older models, and in terms of the number of compute units it is half of a GP104. Unlike the GM206 (which was half of a GM204), however, the GP106 uses a 192-bit memory bus. In addition, NVIDIA removed the SLI connectors from the GTX 1060 board, upsetting fans of gradual video subsystem upgrades: when this accelerator exhausts its capabilities, you can no longer add a second card to it (except in those DirectX 12 games that can distribute the load between GPUs, bypassing the driver).

The GTX 1060 was originally equipped with 6 GB of GDDR5 and a fully enabled GP106 chip and went on sale for $249/299 (partner cards and Founders Edition, respectively). NVIDIA then released a version with 3 GB of memory and a suggested price of $199, which also has a reduced number of compute units. Both graphics cards have an attractive 120 W TDP, and in speed they are the counterparts of the GeForce GTX 970 and GTX 980.

The GeForce GTX 1050 and GTX 1050 Ti belong to the lowest category the Pascal architecture has reached. But however modest they may look next to their older siblings, it is in the budget niche that NVIDIA has made the biggest step forward. The GTX 750/750 Ti that previously occupied it belong to the first iteration of the Maxwell architecture, so the GTX 1050/1050 Ti, unlike the other accelerators of the Pascal family, have advanced not one but one and a half generations. With a significantly larger GPU and memory running at higher frequencies, the GTX 1050/1050 Ti improved on their predecessors more than any other model in the Pascal series (a 90% difference between the GTX 750 Ti and GTX 1050 Ti).

Although the GTX 1050/1050 Ti consume slightly more power (75 W vs. 60 W), they still fit within the limit of a PCI Express slot without an auxiliary power connector. NVIDIA did not release the junior accelerators in a Founders Edition format, and the recommended retail prices are $109 and $139.

AMD Polaris: Radeon RX 460/470/480

AMD's response to Pascal was the Polaris family of chips. The Polaris line currently includes only two chips, from which AMD builds three video cards (the Radeon RX 460, RX 470 and RX 480), which additionally vary in the amount of onboard memory. As you can easily see even from the model numbers, the upper echelon of performance remains unoccupied in the Radeon 400 series. AMD will have to fill it with products based on Vega silicon. Back in the 28 nm era, AMD acquired the habit of testing innovations on relatively small chips and only then implementing them in flagship GPUs.

It should be noted right away that in AMD's case a new family of graphics processors is not identical to a new version of the underlying GCN (Graphics Core Next) architecture, but rather reflects a combination of the architecture and other product features. For GPUs built on the new process technology, AMD has abandoned the various "islands" in code names (Northern Islands, Southern Islands and so on) and designates them with the names of stars.

Nevertheless, the GCN architecture in Polaris received another, third update in a row, due to which (along with the transition to the 14nm FinFET process) AMD significantly increased performance per watt.

  • The Compute Unit, the elementary form of organizing shader ALUs in GCN, has undergone a number of changes related to prefetching and caching of instructions, L2 cache accesses, which together increased the specific performance of the CU by 15%.
  • There is support for half-precision (FP16) calculations, which are used in computer vision and machine learning programs.
  • GCN 1.3 provides direct access to the internal instruction set (ISA) of stream processors, due to which developers can write the most "low-level" and fast code - as opposed to DirectX and OpenGL shader languages, abstracted from hardware.
  • Geometry processors are now able to exclude polygons of zero size or polygons that have no projection pixels early in the pipeline, and have an index cache that reduces resource consumption when rendering small duplicate geometry.
  • The L2 cache has been doubled.

In addition, AMD engineers have put a lot of effort into making Polaris run at the highest frequency possible. The GPU frequency is now controlled with minimal latency (latency is less than 1 ns), and the card adjusts the voltage curve at each PC boot to take into account the variation in parameters between individual chips and silicon aging during operation.

However, the move to 14 nm FinFET has not been smooth sailing for AMD. Indeed, the company was able to increase performance per watt by 62% (judging by the results of the Radeon RX 480 and Radeon R9 380X in gaming tests and the cards' nameplate TDPs). However, Polaris clock speeds do not exceed 1266 MHz, and only a few of the manufacturing partners have achieved more with additional work on cooling and power delivery. On the other hand, GeForce video cards still hold the lead in the speed-to-power ratio that NVIDIA achieved back in the Maxwell generation. It seems that at this first stage AMD was unable to reveal all the capabilities of the new-generation process technology, or the GCN architecture itself already requires deep modernization - the latter task has been left to the Vega chips.

Polaris-based accelerators occupy a price range from $109 to $239 (see table), although in response to the GeForce GTX 1050/1050 Ti, AMD reduced the prices of the two lower cards to $100 and $170, respectively. At the moment there is a similar balance of power between competing products in each price/performance category: the GeForce GTX 1050 Ti is faster than the Radeon RX 460 with 4 GB of RAM, the GTX 1060 with 3 GB of memory is faster than the RX 470, and the full-fledged GTX 1060 is ahead of the RX 480. At the same time, the AMD cards are cheaper, which keeps them popular.

AMD Radeon Pro Duo

A report on the past year in discrete GPUs would not be complete if we ignored one more of the "red" graphics cards. While AMD has yet to release a flagship single-GPU replacement for the Radeon R9 Fury X, the company had one proven move left for conquering new frontiers - installing two Fiji chips on a single board. This card, whose release AMD repeatedly postponed, nevertheless went on sale shortly before the GeForce GTX 1080, but it fell into the Radeon Pro category of professional accelerators and was positioned as a platform for creating VR games.

At $1,499 (more expensive than a pair of Radeon R9 Fury X cards at launch), the Radeon Pro Duo is not an option for gamers, and we have not even had a chance to test it. That is a pity, because from a technical point of view the Radeon Pro Duo looks intriguing. The card's nameplate TDP is only 27% higher than that of the Fury X, even though the processors' peak frequencies were reduced by just 50 MHz. AMD has already managed to release a successful dual-processor video card - the Radeon R9 295X2 - so the specifications announced by the manufacturer do not cause much skepticism.

What to expect in 2017

The main expectations for the coming year are related to AMD. NVIDIA will likely limit itself to releasing a flagship GP102-based gaming card called the GeForce GTX 1080 Ti, and perhaps fill another vacancy in the GeForce 10 series with the GTX 1060 Ti. Otherwise, the line of Pascal accelerators has already been formed, and the debut of the next architecture, Volta, is scheduled only for 2018.

As in the CPU realm, AMD has focused its efforts on developing a truly breakthrough GPU microarchitecture, while Polaris has become just a staging post on the way to it. Presumably, as early as the first quarter of 2017 the company will for the first time release its best silicon, Vega 10, to the mass market (and with it, or later, one or more junior chips in the line). The most reliable evidence of its capabilities was the announcement of the MI25 compute card in the Radeon Instinct line, positioned as an accelerator for deep learning tasks. Judging by the specifications, it is based on none other than Vega 10. The card delivers 12.5 TFLOPS of single-precision (FP32) processing power - more than the TITAN X on the GP102 - and is equipped with 16 GB of HBM2 memory. The TDP of the card is within 300 W. The actual speed of the processor remains anyone's guess, but it is known that Vega will bring the most sweeping update to the GPU microarchitecture since the release of the first GCN-based chips five years ago. The latter should significantly improve performance per watt and allow the computing power of the shader ALUs (which AMD chips have traditionally not lacked) to be used more efficiently in gaming applications.
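For reference, peak FP32 figures like these (including those in the appendix table) follow from simple arithmetic. A minimal sketch, assuming the usual convention that each shader ALU performs one fused multiply-add (two floating-point operations) per clock; the numbers below are the TITAN X entries from the appendix table:

```python
# Rough sketch: peak FP32 throughput as it is usually quoted in spec sheets.
# Assumes each shader ALU performs one fused multiply-add (2 FLOPs) per clock;
# the figures below are the TITAN X numbers from the appendix table.
def peak_fp32_gflops(shader_alus: int, boost_clock_mhz: float) -> float:
    flops_per_alu_per_clock = 2  # one FMA counts as two floating-point operations
    return shader_alus * flops_per_alu_per_clock * boost_clock_mhz / 1000.0

print(peak_fp32_gflops(3584, 1531))  # ~10974 GFLOPS, i.e. about 11 TFLOPS
```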

There are also rumors that AMD engineers have now mastered the 14nm FinFET process to perfection and the company is ready to release a second version of Polaris graphics cards with a significantly lower TDP. It seems to us that if this is true, then the updated chips will rather go to the Radeon RX 500 line than receive increased indices in the existing 400 series.

Appendix. Current lineups of AMD and NVIDIA discrete video adapters

Manufacturer AMD
Model Radeon RX 460 Radeon RX 470 Radeon RX 480 Radeon R9 Nano Radeon R9 Fury Radeon R9 Fury X
GPU
Name Polaris 11 Polaris 10 Polaris 10 Fiji XT Fiji PRO Fiji XT
Microarchitecture GCN 1.3 GCN 1.3 GCN 1.3 GCN 1.2 GCN 1.2 GCN 1.2
Process technology, nm 14nm FinFET 14nm FinFET 14nm FinFET 28 28 28
Number of transistors, million 3 000 5 700 5 700 8900 8900 8900
Clock frequency, MHz: Base Clock / Boost Clock 1 090 / 1 200 926 / 1 206 1 120 / 1 266 — / 1 000 — / 1 000 — / 1 050
Number of shader ALUs 896 2 048 2 304 4096 3584 4096
Number of texture mapping units 56 128 144 256 224 256
Number of ROPs 16 32 32 64 64 64
RAM
Bus width, bit 128 256 256 4096 4096 4096
Chip type GDDR5 SDRAM GDDR5 SDRAM GDDR5 SDRAM HBM HBM HBM
Clock frequency, MHz (bandwidth per pin, Mbps) 1 750 (7 000) 1 650 (6 600) 1 750 (7 000) / 2 000 (8 000) 500 (1 000) 500 (1 000) 500 (1 000)
Volume, MB 2 048 / 4 096 4 096 4 096 / 8 192 4096 4096 4096
I/O bus PCI Express 3.0 x8 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16
Performance
Peak performance FP32, GFLOPS (based on maximum specified frequency) 2 150 4 940 5 834 8 192 7 168 8 602
FP64/FP32 performance ratio 1/16 1/16 1/16 1/16 1/16 1/16
RAM bandwidth, GB/s 112 211 196/224 512 512 512
Image Output
Image output interfaces DL DVI-D, HDMI 2.0b, DisplayPort 1.3/1.4 DL DVI-D, HDMI 2.0b, DisplayPort 1.3/1.4 HDMI 1.4a, DisplayPort 1.2 HDMI 1.4a, DisplayPort 1.2 HDMI 1.4a, DisplayPort 1.2
TDP, W <75 120 150 175 275 275
Suggested retail price at the time of release (US, without tax), $ 109/139 179 199/229 649 549 649
Recommended retail price at the time of release (Russia), rub. 8 299 / 10 299 15 999 16 310 / 18 970 ND ND ND
Manufacturer NVIDIA
Model GeForce GTX 1050 GeForce GTX 1050 Ti GeForce GTX 1060 3 GB GeForce GTX 1060 GeForce GTX 1070 GeForce GTX 1080 TITAN X
GPU
Name GP107 GP107 GP106 GP106 GP104 GP104 GP102
Microarchitecture Pascal Pascal Pascal Pascal Pascal Pascal Pascal
Process technology, nm 14nm FinFET 14nm FinFET 16nm FinFET 16nm FinFET 16nm FinFET 16nm FinFET 16nm FinFET
Number of transistors, million 3 300 3 300 4 400 4 400 7 200 7 200 12 000
Clock frequency, MHz: Base Clock / Boost Clock 1 354 / 1 455 1 290 / 1 392 1 506 / 1 708 1 506 / 1 708 1 506 / 1 683 1 607 / 1 733 1 417 / 1 531
Number of shader ALUs 640 768 1 152 1 280 1 920 2 560 3 584
Number of texture mapping units 40 48 72 80 120 160 224
Number of ROPs 32 32 48 48 64 64 96
RAM
Bus width, bit 128 128 192 192 256 256 384
Chip type GDDR5 SDRAM GDDR5 SDRAM GDDR5 SDRAM GDDR5 SDRAM GDDR5 SDRAM GDDR5X SDRAM GDDR5X SDRAM
Clock frequency, MHz (bandwidth per pin, Mbps) 1 750 (7 000) 1 750 (7 000) 2 000 (8 000) 2 000 (8 000) 2 000 (8 000) 1 250 (10 000) 1 250 (10 000)
Volume, MB 2 048 4 096 6 144 6 144 8 192 8 192 12 288
I/O bus PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16 PCI Express 3.0 x16
Performance
Peak performance FP32, GFLOPS (based on maximum specified frequency) 1 862 2 138 3 935 4 373 6 463 8 873 10 974
FP64/FP32 performance ratio 1/32 1/32 1/32 1/32 1/32 1/32 1/32
RAM bandwidth, GB/s 112 112 192 192 256 320 480
Image Output
Image output interfaces DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b DL DVI-D, DisplayPort 1.3/1.4, HDMI 2.0b
TDP, W 75 75 120 120 150 180 250
Suggested retail price at the time of release (US, without tax), $ 109 139 199 249/299 (partner cards / Founders Edition) 379/449 (partner cards / Founders Edition) 599/699 (partner cards / Founders Edition) 1 200
Recommended retail price at the time of release (Russia), rub. 8 490 10 490 ND 18,999 / — (Founders Edition / partner cards) ND / 34,990 (Founders Edition / partner cards) ND / 54,990 (Founders Edition / partner cards)

The integrated graphics processor plays an important role for both gamers and undemanding users.

The quality of games, films, online video and images depends on it.

Principle of operation

The graphics processor is integrated into the computer's motherboard - this is how built-in graphics is implemented.

As a rule, it is used to remove the need to install a discrete graphics adapter.

This technology helps to reduce the cost of the finished product. In addition, due to the compactness and low power consumption of such processors, they are often installed in laptops and low-power desktop computers.

Thus, integrated graphics processors have filled this niche so much that 90% of laptops on US store shelves have just such a processor.

Instead of the dedicated memory of a conventional video card, integrated graphics usually uses the computer's own RAM.

True, this solution somewhat limits the performance of the device, since the processor and the GPU share the same memory bus.

So such a “neighborhood” affects the performance of tasks, especially when working with complex graphics and during gameplay.

Kinds

Integrated graphics comes in three groups:

  1. Shared-memory graphics - a solution that shares memory with the main processor. This greatly reduces cost and improves energy efficiency, but degrades performance. Accordingly, for those who work with complex programs, integrated GPUs of this kind are usually a poor fit.
  2. Discrete on-board graphics - a video chip and one or two video memory modules soldered onto the motherboard. This noticeably improves image quality and makes it possible to work with three-dimensional graphics with much better results. True, you will have to pay for it, and if you want a processor that is high-performing in every respect, the cost can be very high. In addition, the electricity bill will rise slightly - the power consumption of discrete GPUs is higher than usual.
  3. Hybrid discrete graphics - a combination of the two previous types, made possible by the PCI Express bus. Memory is accessed both through the soldered video memory and through system RAM. With this solution the manufacturers aimed for a compromise, but it does not eliminate all the shortcomings.

Manufacturers

As a rule, the development and manufacture of integrated graphics processors is handled by large companies - Intel, AMD and NVIDIA - but many smaller enterprises are also involved in this area.

Enable

Enabling the integrated GPU in the BIOS is easy. Look for Primary Display or Init Display First. If you do not see anything like that, look for Onboard, PCI, AGP or PCI-E (it all depends on the buses present on the motherboard).

By selecting PCI-E, for example, you enable the PCI-Express video card, and disable the built-in integrated one.

Thus, to enable the integrated video card, you need to find the appropriate parameters in the BIOS. Often the activation process is automatic.

Disable

Disabling is best done in BIOS. This is the simplest and most unpretentious option, suitable for almost all PCs. The only exceptions are some laptops.

Again, find Peripherals or Integrated Peripherals in BIOS if you are working on a desktop.

For laptops, the name of the function is different, and not the same everywhere. So just look for something related to graphics. For example, the desired options can be placed in the Advanced and Config sections.

Disabling is also carried out in different ways. Sometimes it is enough simply to select "Disabled" and set the PCI-E video card first in the list.

If you are a laptop user, do not be alarmed if you cannot find a suitable option - you may simply not have such a function at all. For all other devices the rules are simple: however the BIOS itself may look, the underlying settings are the same.

If you have two video cards and they are both shown in the device manager, then the matter is quite simple: right-click on one of them and select “disable”. However, keep in mind that the display may go out. And, most likely, it will.

However, this too is a solvable problem: it is enough to restart the computer or connect the monitor to the other video card.

Perform all subsequent settings on it. If this method does not work, roll back your actions using safe mode. You can also resort to the previous method - through the BIOS.

Two programs - NVIDIA Control Panel and Catalyst Control Center - let you configure the use of a specific video adapter.

They are the most trouble-free compared with the other two methods: the screen is unlikely to turn off, and you will not accidentally mess up the BIOS settings either.

For NVIDIA, all settings are in the 3D section.

You can choose your preferred video adapter for the entire operating system, and for certain programs and games.

In the Catalyst software, an identical function is located in the "Power" option under the "Switchable Graphics" sub-item.

Thus, switching between GPUs is not difficult.

There are different methods, in particular, both through programs and through BIOS. Turning on or off one or another integrated graphics may be accompanied by some failures, mainly related to the image.

It may go out or just appear distorted. Nothing should affect the files themselves in the computer, unless you clicked something in the BIOS.

Conclusion

As a result, integrated graphics processors are in demand due to their cheapness and compactness.

The price for this is the performance level of the computer itself.

In some cases a discrete graphics processor is simply necessary - discrete processors are ideal for working with three-dimensional graphics.

In addition, the industry leaders are Intel, AMD and Nvidia. Each of them offers its own graphics accelerators, processors and other components.

The latest popular models are Intel HD Graphics 530 and AMD A10-7850K. They are quite functional, but have some flaws. In particular, this applies to the power, performance and cost of the finished product.

You can enable or disable a processor's integrated graphics core yourself through the BIOS, utilities and various programs, or the computer can do it for you - it all depends on which video card the monitor is connected to.

Basic components of a video card:

  • outputs;
  • interfaces;
  • cooling system;
  • graphics processor;
  • video memory.

Graphics technologies:

  • glossary;
  • GPU architecture: features
    vertex/pixel units, shaders, fill rate, texture/raster units, pipelines;
  • GPU architecture: technology
    manufacturing process, GPU frequency, local video memory (size, bus, type, frequency), multi-GPU solutions;
  • visual features
    DirectX, high dynamic range (HDR), FSAA, texture filtering, high-resolution textures.

Glossary of basic graphic terms

Refresh Rate

Like in a movie theater or on a TV, your computer simulates movement on a monitor by displaying a sequence of frames. The refresh rate of the monitor indicates how many times per second the picture will be updated on the screen. For example, 75 Hz corresponds to 75 updates per second.

If the computer renders frames faster than the monitor can display them, problems can appear in games. For example, if the computer renders 100 frames per second while the monitor's refresh rate is 75 Hz, then because frames overlap, the monitor can only display part of each picture during its refresh period. As a result, visual artifacts appear.

As a solution, you can enable V-Sync (vertical sync). It limits the number of frames a computer can produce to the monitor's refresh rate, preventing artifacts. If you enable V-Sync, the number of frames rendered in the game will never exceed the refresh rate. That is, at 75 Hz, the computer will output no more than 75 frames per second.
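A minimal sketch of the cap described above (illustrative only; real V-Sync implementations also interact with buffering):

```python
# Minimal sketch of the V-Sync cap described above: with V-Sync on, the game
# never delivers more frames per second than the monitor can display.
def displayed_fps(render_fps: float, refresh_hz: float, vsync: bool) -> float:
    return min(render_fps, refresh_hz) if vsync else render_fps

print(displayed_fps(100, 75, vsync=True))   # 75 - capped at the refresh rate
print(displayed_fps(100, 75, vsync=False))  # 100 - excess frames cause tearing
```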

Pixel

The word "Pixel" stands for " pic ture el ement" is an image element. It is a tiny dot on the display that can glow in a certain color (in most cases, the hue is displayed by a combination of three basic colors: red, green and blue). If the screen resolution is 1024×768, then you can see a matrix of 1024 pixels in width and 768 pixels in height. Together, pixels make up an image. The picture on the screen is updated from 60 to 120 times per second, depending on the type of display and the data provided by the output of the video card. CRT monitors update the display line by line, while LCD flat panel monitors can update each pixel individually.

Vertex

All objects in a 3D scene are made up of vertices. A vertex is a point in 3D space with x, y and z coordinates. Several vertices can be grouped into a polygon: most often a triangle, but more complex shapes are possible. A texture is then applied to the polygon to make the object look realistic. A simple 3D cube has eight vertices. More complex objects have curved surfaces that actually consist of a very large number of vertices.

Texture

A texture is simply a 2D image of arbitrary size that is overlaid on a 3D object to simulate its surface. For example, our 3D cube has eight vertices. Before texture mapping, it looks like a simple box. But when we apply the texture, the box becomes colored.

Shader

Pixel shaders allow the graphics card to produce impressive effects, such as the water in The Elder Scrolls IV: Oblivion.

Today there are two types of shaders: vertex and pixel. Vertex shaders can modify or transform 3D objects. Pixel shader programs allow you to change the colors of pixels based on some data. Imagine a light source in a 3D scene that makes the illuminated objects glow brighter, and at the same time, casts shadows on other objects. All this is implemented by changing the color information of the pixels.

Pixel shaders are used to create complex effects in your favorite games. For example, shader code can make the pixels surrounding a 3D sword glow brighter. Another shader can process all the vertices of a complex 3D object and simulate an explosion. Game developers are increasingly turning to complex shader programs to create realistic graphics. Almost every modern graphic-rich game uses shaders.

With the release of the next application programming interface (API, Application Programming Interface) Microsoft DirectX 10, a third type of shader called geometry shaders will be released. With their help, it will be possible to break objects, modify and even destroy them, depending on the desired result. The third type of shaders can be programmed in exactly the same way as the first two, but its role will be different.

Fill Rate

Very often you can find the fill-rate value on a video card's box. Basically, fill rate indicates how fast the GPU can output pixels. Older video cards quoted a triangle fill rate; today there are two kinds: pixel fill rate and texture fill rate. As already mentioned, the pixel fill rate corresponds to the pixel output rate and is calculated as the number of raster operation units (ROPs) multiplied by the clock frequency.

ATi and nVidia calculate texture fill rate differently: nVidia obtains it by multiplying the number of pixel pipelines by the clock speed, while ATi multiplies the number of texture units by the clock speed. In principle, both methods are correct, since nVidia uses one texture unit per pixel shader unit (that is, one per pixel pipeline).
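A small sketch of both formulas, with purely illustrative specs rather than any particular real card:

```python
# Sketch of the two fill-rate formulas given above, using hypothetical
# (illustrative) card specs rather than any particular real GPU.
def pixel_fillrate_mpix(rops: int, core_clock_mhz: float) -> float:
    return rops * core_clock_mhz            # megapixels per second

def texture_fillrate_mtex(tmus: int, core_clock_mhz: float) -> float:
    return tmus * core_clock_mhz            # megatexels per second

print(pixel_fillrate_mpix(16, 500))   # 8000 Mpix/s for 16 ROPs at 500 MHz
print(texture_fillrate_mtex(16, 500)) # 8000 Mtex/s for 16 TMUs at 500 MHz
```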

With these definitions in mind, let's move on and discuss the most important GPU features, what they do, and why they're so important.

GPU architecture: features

The realism of 3D graphics is very dependent on the performance of the graphics card. The more pixel shader blocks the processor contains and the higher the frequency, the more effects can be applied to the 3D scene to improve its visual perception.

The GPU contains many different functional blocks. By the number of some components, you can estimate how powerful the GPU is. Before moving on, let's look at the most important functional blocks.

Vertex Processors (Vertex Shader Units)

Like pixel shaders, vertex processors execute shader code that touches vertices. Since a larger vertex budget allows you to create more complex 3D objects, the performance of vertex processors is very important in 3D scenes with complex or large numbers of objects. However, vertex shader units still do not have such an obvious impact on performance as pixel processors.

Pixel processors (pixel shaders)

A pixel processor is a component of the graphics chip dedicated to processing pixel shader programs. These processors perform calculations relating to pixels only. Since pixels contain color information, pixel shaders can achieve impressive graphical effects. For example, most of the water effects you see in games are created using pixel shaders. Typically, the number of pixel processors is used to compare the pixel performance of video cards. If one card is equipped with eight pixel shader units and the other with 16 units, then it is quite logical to assume that a video card with 16 units will process complex pixel programs faster. Clock speed should also be considered, but today doubling the number of pixel processors is more efficient in terms of power consumption than doubling the frequency of a graphics chip.

Unified shaders

Unified shaders have not yet arrived in the PC world, but the upcoming DirectX 10 standard relies on just such an architecture. That is, the code structure of vertex, geometry and pixel programs will be the same, even though the shaders perform different work. The new specification can already be seen in the Xbox 360, where the GPU was custom-designed by ATi for Microsoft. It will be very interesting to see what potential the new DirectX 10 brings.

Texture Mapping Units (TMUs)

Textures should be selected and filtered. This work is done by the texture mapping units, which work in conjunction with the pixel and vertex shader units. The job of the TMU is to apply texture operations to the pixels. The number of texture units in a GPU is often used to compare the texture performance of graphics cards. It's quite reasonable to assume that a video card with more TMUs will give better texture performance.

Raster Operation Units (ROPs)

ROPs are responsible for writing pixel data to memory. The rate at which this is done is the fill rate. In the early days of 3D accelerators, ROPs and fill rate were very important characteristics of a video card. Today the work of the ROPs is still important, but video card performance is no longer limited by these blocks as it once was. Therefore, the performance (and number) of ROPs is rarely used to judge the speed of a video card.

Pipelines

Pipelines are used to describe the architecture of video cards and give a very visual representation of the performance of a GPU.

A pipeline is not a strict technical term. The GPU uses different pipelines that perform different functions. Historically, a pipeline meant a pixel processor connected to its own texture mapping unit (TMU). For example, the Radeon 9700 uses eight pixel processors, each connected to its own TMU, so the card is considered to have eight pipelines.

But it is very difficult to describe modern processors by the number of pipelines. Compared to previous designs, the new processors use a modular, fragmented structure. ATi can be considered the innovator here: with the X1000 line of video cards it switched to a modular structure, which made it possible to gain performance through internal optimization. Some GPU blocks are used more than others, and to improve GPU performance ATi tried to find a compromise between the number of blocks needed and the die area (which cannot be increased very much). In this architecture the term "pixel pipeline" has already lost its meaning, since the pixel processors are no longer connected to their own TMUs. For example, the ATi Radeon X1600 GPU has 12 pixel shader units and only four TMUs. Therefore, one cannot say that this processor's architecture has 12 pixel pipelines, just as one cannot say that it has only four. However, by tradition, pixel pipelines are still mentioned.

With these assumptions in mind, the number of pixel pipelines in a GPU is often used to compare video cards (with the exception of the ATi X1x00 line). For example, if we take video cards with 24 and 16 pipelines, then it is quite reasonable to assume that a card with 24 pipelines will be faster.

GPU Architecture: Technology

Process technology

This term refers to the size of one element (transistor) of the chip and the precision of the manufacturing process. Improvements in process technology allow smaller elements to be obtained. For example, the 0.18 µm process produces larger transistors than the 0.13 µm process, so it is not as efficient. Smaller transistors operate at lower voltage, and lower voltage in turn means less heat is generated. Improving the process technology also reduces the distance between the functional blocks of the chip, so it takes less time to transfer data. Shorter distances, lower voltages and other improvements allow higher clock speeds to be achieved.

Somewhat complicates the understanding that both micrometers (µm) and nanometers (nm) are used today to designate the process technology. In fact, everything is very simple: 1 nanometer is equal to 0.001 micrometer, so 0.09-micron and 90-nm manufacturing processes are the same thing. As noted above, a smaller process technology allows you to get higher clock speeds. For example, if we compare video cards with 0.18 micron and 0.09 micron (90 nm) chips, then it is quite reasonable to expect a higher frequency from a 90 nm card.

GPU clock speed

GPU clock speed is measured in megahertz (MHz), which is millions of cycles per second.

The clock speed directly affects GPU performance. The higher it is, the more work can be done per second. As a first example, take the nVidia GeForce 6600 and 6600 GT video cards: the 6600 GT's graphics processor runs at 500 MHz, while the regular 6600 runs at 400 MHz. Because the processors are technically identical, the 25% increase in clock speed gives the 6600 GT better performance.

But clock speed is not everything. Keep in mind that performance is greatly affected by the architecture. For the second example, let's take GeForce 6600 GT and GeForce 6800 GT video cards. The GPU frequency of the 6600 GT is 500 MHz, but the 6800 GT only runs at 350 MHz. Now let's take into account that the 6800 GT uses 16 pixel pipelines, while the 6600 GT has only eight. Therefore, a 6800 GT with 16 pipelines at 350 MHz will give about the same performance as a processor with eight pipelines and twice the clock speed (700 MHz). With that said, clock speed can be used to compare performance.
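A rough sketch of that comparison, using the clock speeds and pipeline counts quoted above; pipelines times clock is only a crude proxy that holds for cards of a similar architecture:

```python
# Rough sketch of the comparison above: pipelines x clock as a crude proxy
# for pixel throughput (valid only for cards of a similar architecture).
def relative_throughput(pipelines: int, clock_mhz: float) -> float:
    return pipelines * clock_mhz

print(relative_throughput(8, 500))   # GeForce 6600 GT: 4000
print(relative_throughput(16, 350))  # GeForce 6800 GT: 5600 - noticeably faster
print(relative_throughput(8, 700))   # 5600 - the "8 pipelines at 700 MHz" equivalent
```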

Local video memory

Graphics card memory has a huge impact on performance, but different memory parameters affect it in different ways.

Video memory size

The amount of video memory is probably the most overrated parameter of a video card. Inexperienced consumers often use the amount of video memory to compare cards, but in reality the amount has little effect on performance compared to parameters such as the memory bus frequency and the interface (bus width).

In most cases, a card with 128 MB of video memory will perform almost the same as a card with 256 MB. Of course, there are situations where more memory leads to better performance, but remember that more memory will not automatically increase the speed in games.

Where volume is useful is in games with high resolution textures. Game developers include several sets of textures with the game. And the more memory there is on the video card, the higher resolution the loaded textures can have. High-resolution textures give higher definition and detail in the game. Therefore, it is quite reasonable to take a card with a large amount of memory, if all other criteria are the same. Recall once again that the width of the memory bus and its frequency have a much stronger effect on performance than the amount of physical memory on the card.

Memory bus width

Memory bus width is one of the most important aspects of memory performance. Modern buses range from 64 to 256 bits wide, and in some cases even 512 bits. The wider the memory bus, the more information it can transfer per clock, and this directly affects performance. For example, if we take two buses at equal frequencies, then in theory a 128-bit bus will transfer twice as much data per clock as a 64-bit one, and a 256-bit bus twice as much again.

Higher bus bandwidth (expressed in bits or bytes per second, 1 byte = 8 bits) gives better memory performance. That is why the memory bus is much more important than its size. At equal frequencies, a 64-bit memory bus operates at only 25% of a 256-bit one!

Let's take the following example. A video card with 128 MB of video memory but with a 256-bit bus gives much better memory performance than a 512 MB model with a 64-bit bus. It's important to note that for some cards from the ATi X1x00 series, manufacturers specify the specifications of the internal memory bus, but we are interested in the parameters of the external bus. For example, the X1600's internal ring bus is 256 bits wide, but the external one is only 128 bits wide. And in reality, the memory bus works with 128-bit performance.
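A sketch of the underlying arithmetic: bandwidth is the bus width in bytes times the effective data rate. The frequencies below are illustrative, not tied to a specific card:

```python
# Sketch of the bandwidth arithmetic above: bytes per second = (bus width in
# bits / 8) x effective data rate. Frequencies are illustrative.
def memory_bandwidth_gbs(bus_width_bits: int, effective_mhz: float) -> float:
    return bus_width_bits / 8 * effective_mhz * 1e6 / 1e9

print(memory_bandwidth_gbs(256, 1000))  # 32.0 GB/s
print(memory_bandwidth_gbs(64, 1000))   # 8.0 GB/s - a quarter, as stated above
```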

Memory types

Memory can be divided into two main categories: SDR (single data rate) and DDR (double data rate), in which data is transferred twice per clock. Today the single-transfer SDR technology is obsolete. Since DDR memory transfers data twice as fast as SDR, it is important to remember that video cards with DDR memory usually quote the doubled, effective frequency rather than the physical one. For example, if DDR memory is specified at 1000 MHz, that is the effective frequency at which regular SDR memory would have to run to give the same bandwidth, while the physical frequency is actually 500 MHz.

For this reason, many people are surprised when the memory of their video card is specified at 1200 MHz DDR while utilities report 600 MHz. You will have to get used to it. DDR2 and GDDR3/GDDR4 memory work on the same principle, i.e. with double data transfer. The difference between DDR, DDR2, GDDR3 and GDDR4 lies in the manufacturing technology and some details. DDR2 can operate at higher frequencies than DDR, and GDDR3/GDDR4 at even higher frequencies than DDR2.
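A minimal sketch of the naming convention described above; the figures are illustrative:

```python
# Minimal sketch of the DDR naming convention described above: the "effective"
# figure on the box is twice the physical clock that monitoring tools report.
def effective_ddr_mhz(physical_mhz: float) -> float:
    return physical_mhz * 2   # two transfers per clock

print(effective_ddr_mhz(500))  # 1000 MHz effective, 500 MHz physical
print(effective_ddr_mhz(600))  # 1200 MHz effective - what utilities show as 600
```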

Memory bus frequency

Like a processor, memory (or, more accurately, the memory bus) runs at certain clock speeds, measured in megahertz. Here, increasing clock speeds directly affects memory performance. And the frequency of the memory bus is one of the parameters that are used to compare the performance of video cards. For example, if all other characteristics (memory bus width, etc.) are the same, then it is quite logical to say that a video card with 700 MHz memory is faster than a 500 MHz one.

Again, clock speed isn't everything. 700 MHz memory with a 64-bit bus will be slower than 400 MHz memory with a 128-bit bus. The performance of 400 MHz memory on a 128-bit bus corresponds approximately to 800 MHz memory on a 64-bit bus. You should also remember that GPU and memory frequencies are completely different parameters, and usually they are different.

Video card interface

All data transferred between the video card and the processor passes through the video card interface. Today, three types of interfaces are used for video cards: PCI, AGP and PCI Express. They differ in bandwidth and other characteristics. It is clear that the higher the bandwidth, the higher the exchange rate. However, only the most modern cards can use high bandwidth, and even then only partially. At some point, the speed of the interface ceased to be a "bottleneck", it is simply enough today.

The slowest bus for which video cards have been produced (leaving earlier history aside) is PCI (Peripheral Component Interconnect). PCI really did hold back video card performance, so cards moved to the AGP (Accelerated Graphics Port) interface. But even the AGP 1.0 and 2x specifications limited performance. When the standard was sped up to AGP 4x, we began to approach the practical limit of the bandwidth video cards can use. The AGP 8x specification once again doubled the bandwidth compared to AGP 4x, to about 2.1 GB/s, but this no longer produced a noticeable gain in graphics performance.

The newest and fastest bus is PCI Express. Newer graphics cards typically use the PCI Express x16 interface, which combines 16 PCI Express lanes for a total bandwidth of 4 GB/s (in one direction). This is twice the throughput of AGP 8x. The PCI Express bus gives the mentioned bandwidth for both directions (data transfer to and from the video card). But the speed of the AGP 8x standard was already enough, so we haven't seen a situation where switching to PCI Express gave a performance boost compared to AGP 8x (if other hardware parameters are the same). For example, the AGP version of the GeForce 6800 Ultra will work identically to the 6800 Ultra for PCI Express.
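A sketch of where the 4 GB/s figure comes from, assuming the roughly 250 MB/s per lane per direction of first-generation PCI Express:

```python
# Sketch of the PCI Express x16 figure quoted above: 16 lanes at roughly
# 250 MB/s per lane per direction (first-generation PCIe) give about 4 GB/s
# in each direction.
lanes = 16
mb_per_lane_per_direction = 250   # PCIe 1.x, per direction (assumption stated above)
print(lanes * mb_per_lane_per_direction / 1000)  # ~4 GB/s in one direction
```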

Today it is best to buy a card with a PCI Express interface, it will last on the market for several more years. The most productive cards are no longer produced with the AGP 8x interface, and PCI Express solutions, as a rule, are already easier to find than AGP analogs, and they are cheaper.

Multi-GPU Solutions

Using multiple graphics cards to increase graphics performance is not a new idea. In the early days of 3D graphics, 3dfx entered the market with two graphics cards running in parallel. But with the disappearance of 3dfx, the technology for working together several consumer video cards was forgotten, although ATi has been producing similar systems for professional simulators since the release of the Radeon 9700. A couple of years ago, the technology returned to the market with the advent of nVidia SLI solutions and, a little later, ATi Crossfire.

Running several graphics cards together provides enough performance to run a game with high quality settings at high resolution. But deciding on such a solution is not easy.

Let's start with the fact that solutions based on multiple video cards require a lot of energy, so the power supply must be powerful enough. All this heat will have to be removed from the video card, so you need to pay attention to the PC case and cooling so that the system does not overheat.

Also, remember that SLI/CrossFire requires an appropriate motherboard (for one technology or the other), which usually costs more than standard models. An nVidia SLI configuration will only work on certain nForce4 boards, while ATi CrossFire cards will only work on motherboards with a CrossFire chipset or certain Intel models. To make matters worse, some CrossFire configurations require one of the cards to be a special one: a CrossFire Edition card. After CrossFire launched, ATi allowed some video card models to cooperate over the PCI Express bus, and with new driver releases the number of possible combinations keeps growing. Still, hardware CrossFire with the appropriate CrossFire Edition card gives better performance, and CrossFire Edition cards cost more than regular models. Currently, the software CrossFire mode (without a CrossFire Edition card) can be enabled on the Radeon X1300, X1600 and X1800 GTO graphics cards.

Other factors should also be taken into account. Although two graphics cards working together give a performance boost, it is far from double. But you will pay twice as much money. Most often, the increase in productivity is 20-60%. And in some cases, due to additional computational costs for matching, there is no increase at all. For this reason, multi-card configurations are unlikely to pay off with cheap models, since a more expensive video card will usually always outperform a pair of cheap cards. In general, for most consumers, taking an SLI / CrossFire solution does not make sense. But if you want to enable all the quality enhancement options or play at extreme resolutions, for example, 2560x1600, when you need to calculate more than 4 million pixels per frame, then two or four paired video cards are indispensable.

Visual Features

In addition to purely hardware specifications, different generations and models of GPUs can differ in feature sets. For example, it is often said that ATi Radeon X800 XT generation cards are compatible with Shader Model 2.0b (SM), while the nVidia GeForce 6800 Ultra is compatible with SM 3.0, although their hardware specifications are close to each other (16 pipelines). Therefore, many consumers make a choice in favor of one solution or another, without even knowing what this difference means.

Microsoft DirectX and Shader Model versions

These names are most often used in disputes, but few people know what they really mean. To understand, let's start with the history of graphics APIs. DirectX and OpenGL are graphics APIs, that is, Application Programming Interfaces - open code standards available to everyone.

Before the advent of graphics APIs, each GPU manufacturer had its own mechanism for communicating with games. Developers had to write separate code for each GPU they wanted to support. A very expensive and inefficient approach. To solve this problem, APIs for 3D graphics were developed so that developers would write code for a specific API, and not for this or that video card. After that, compatibility problems fell on the shoulders of video card manufacturers, who had to ensure that the drivers were compatible with the API.

The only complication remains that today two different APIs are used, namely Microsoft DirectX and OpenGL, where GL stands for Graphics Library (graphics library). Since the DirectX API is more popular in games today, we will focus on it. And this standard influenced the development of games more strongly.

DirectX is a creation of Microsoft. In fact, DirectX includes several APIs, only one of which is used for 3D graphics. DirectX includes APIs for sound, music, input devices, and more. The Direct3D API is responsible for 3D graphics in DirectX. When they talk about video cards, they mean exactly it, therefore, in this respect, the concepts of DirectX and Direct3D are interchangeable.

DirectX is updated periodically as graphics technology advances and game developers introduce new game programming techniques. As the popularity of DirectX has grown rapidly, GPU manufacturers have begun to tailor new product releases to fit the capabilities of DirectX. For this reason, video cards are often tied to the hardware support of one or another generation of DirectX (DirectX 8, 9.0 or 9.0c).

To complicate matters further, parts of the Direct3D API can change over time without changing generations of DirectX. For example, the DirectX 9.0 specification specifies support for Pixel Shader 2.0. But the DirectX 9.0c update includes Pixel Shader 3.0. So while the cards are in the DirectX 9 class, they may support different sets of features. For example, the Radeon 9700 supports Shader Model 2.0 and the Radeon X1800 supports Shader Model 3.0, although both cards can be classified as DirectX 9 generation.

Remember that when creating new games, developers take into account the owners of older machines and video cards: if you ignore this segment of users, sales will be lower. For this reason, multiple code paths are built into games. A DirectX 9 class game will most likely have a DirectX 8 path and even a DirectX 7 path for compatibility. Usually, if the old path is chosen, some of the visual effects available on newer video cards disappear from the game. But at least you can play even on old hardware.

Many new games require the latest version of DirectX to be installed, even if the graphics card is from a previous generation. That is, a new game that will use the DirectX 8 path still requires the latest version of DirectX 9 to be installed on a DirectX 8 class graphics card.

What are the differences between the different versions of the Direct3D API in DirectX? The early versions of DirectX—3, 5, 6, and 7—were relatively simple in terms of the Direct3D API. Developers could select visual effects from a list, and then check their work in the game. The next major step in graphics programming was DirectX 8. It introduced the ability to program the graphics card using shaders, so for the first time developers had the freedom to program effects the way they wanted. DirectX 8 supported Pixel Shader versions 1.0 to 1.3 and Vertex Shader 1.0. DirectX 8.1, an updated version of DirectX 8, received Pixel Shader 1.4 and Vertex Shader 1.1.

In DirectX 9 you can create even more complex shader programs. DirectX 9 supports Pixel Shader 2.0 and Vertex Shader 2.0. DirectX 9.0c, an updated version of DirectX 9, added the Pixel Shader 3.0 specification.

DirectX 10, an upcoming version of the API, will accompany the new version of Windows Vista. DirectX 10 cannot be installed on Windows XP.

HDR lighting and OpenEXR HDR

HDR stands for "High Dynamic Range", high dynamic range. A game with HDR lighting can give a much more realistic picture than a game without it, and not all graphics cards support HDR lighting.

Before the advent of DirectX 9-class graphics cards, GPUs were severely limited in the precision of their lighting calculations. Until then, lighting could only be computed with 256 internal levels (8 bits).

When DirectX 9-class graphics cards came out, they were able to produce lighting with high fidelity - full 24 bits or 16.7 million levels.
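A small sketch of where these level counts come from:

```python
# Sketch of the precision figures above: the number of distinct levels grows
# as 2 to the number of bits.
def levels(bits: int) -> int:
    return 2 ** bits

print(levels(8))    # 256 levels - pre-DirectX 9 lighting precision
print(levels(24))   # 16_777_216 levels, the "16.7 million" quoted above
```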

With 16.7 million levels, and with the performance step brought by DirectX 9/Shader Model 2.0-class graphics cards, HDR lighting became possible on computers as well. It is a rather complex technology and is best appreciated in motion. In simple terms, HDR lighting increases contrast (dark shades appear darker, light shades lighter) while increasing the amount of detail in dark and light areas. A game with HDR lighting feels more alive and realistic than one without it.

GPUs that comply with the latest Pixel Shader 3.0 specification allow for higher 32-bit precision lighting calculations as well as floating point blending. Thus, SM 3.0-class graphics cards can support OpenEXR's special HDR lighting method, specifically designed for the film industry.

Some games support HDR lighting only via the OpenEXR method and will not run with HDR lighting on Shader Model 2.0 graphics cards, while games that do not rely on OpenEXR will work on any DirectX 9 graphics card. Oblivion, for example, uses the OpenEXR HDR method and only allows HDR lighting on recent graphics cards that support the Shader Model 3.0 specification, such as the NVIDIA GeForce 6800 or the ATI Radeon X1800. Games built on the Half-Life 2 engine, such as Counter-Strike: Source and the upcoming Half-Life 2: Aftermath, allow HDR rendering on older DirectX 9 graphics cards that only support Pixel Shader 2.0, such as the GeForce FX line or the ATI Radeon 9500.

Finally, keep in mind that all forms of HDR rendering demand serious processing power and can bring even the most powerful GPUs to their knees. If you want to play the latest games with HDR lighting, high-performance graphics hardware is a must.

Full screen anti-aliasing

Full-screen anti-aliasing (FSAA, often just AA) eliminates the characteristic "staircase" artifacts at polygon edges. But keep in mind that full-screen anti-aliasing consumes a lot of computing resources, which leads to a drop in frame rate.

Anti-aliasing depends heavily on video memory performance, so a fast video card with fast memory can perform full-screen anti-aliasing with a smaller performance hit than an inexpensive card. Anti-aliasing can be enabled at various levels: 4x anti-aliasing, for example, gives a better picture than 2x, but costs correspondingly more performance. Where 2x anti-aliasing doubles the number of samples calculated per pixel, 4x mode quadruples it.
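A rough way to see why anti-aliasing is so demanding is to count the samples that have to be stored and processed. The sketch below uses a deliberately simplified model (colour plus Z-buffer only, no compression, supersampling-style AA), so the numbers are illustrative rather than exact:

    #include <cstdio>

    // Approximate framebuffer size: every pixel stores 'samples' colour values
    // (4 bytes each) and 'samples' depth values (4 bytes each).
    double framebufferMB(int width, int height, int samples) {
        double bytes = static_cast<double>(width) * height * samples * (4.0 + 4.0);
        return bytes / (1024.0 * 1024.0);
    }

    int main() {
        const int levels[] = { 1, 2, 4 };
        for (int aa : levels)
            std::printf("1600x1200 at %dx AA: ~%.0f MB of framebuffer\n",
                        aa, framebufferMB(1600, 1200, aa));
    }

At 1600x1200 the framebuffer grows from roughly 15 MB with no AA to roughly 59 MB at 4x, and every one of those extra samples also has to be read and written, which is why memory bandwidth matters so much here.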

Texture filtering

All 3D objects in a game are textured, and the more oblique the angle at which a surface is displayed, the more distorted its texture looks. To eliminate this effect, GPUs use texture filtering.

The first filtering method was called bilinear, and it produced characteristic banding that was not very pleasing to the eye. The situation improved with the introduction of trilinear filtering. Both options run on modern video cards with virtually no performance penalty.
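For the curious, the sketch below shows what "bilinear" actually means: a sample that falls between texels is computed as a weighted average of the four nearest texels. This is a simplified single-channel version written in plain C++; real GPUs do the same thing in hardware for every colour channel:

    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    // Read one texel from a single-channel texture stored row by row,
    // clamping coordinates to the texture edges.
    float texel(const float* tex, int w, int h, int x, int y) {
        x = std::max(0, std::min(w - 1, x));
        y = std::max(0, std::min(h - 1, y));
        return tex[y * w + x];
    }

    // Bilinear sample at floating-point texel coordinates (u, v).
    float sampleBilinear(const float* tex, int w, int h, float u, float v) {
        int x0 = static_cast<int>(std::floor(u));
        int y0 = static_cast<int>(std::floor(v));
        float fx = u - x0, fy = v - y0;        // fractional position between texels
        float top    = texel(tex, w, h, x0, y0)     * (1 - fx) + texel(tex, w, h, x0 + 1, y0)     * fx;
        float bottom = texel(tex, w, h, x0, y0 + 1) * (1 - fx) + texel(tex, w, h, x0 + 1, y0 + 1) * fx;
        return top * (1 - fy) + bottom * fy;   // blend the two rows
    }

    int main() {
        const float tex[4] = { 0.0f, 1.0f,     // a 2x2 checkerboard of dark and light texels
                               1.0f, 0.0f };
        std::printf("sample between all four texels = %.2f\n",
                    sampleBilinear(tex, 2, 2, 0.5f, 0.5f));   // prints 0.50
    }

Trilinear filtering additionally blends, in exactly the same way, between two mipmap levels, which removes the visible band where one mip level switches to the next.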

Anisotropic filtering (AF) is by far the best way to filter textures. Like FSAA, anisotropic filtering can be enabled at different levels: 8x AF, for example, gives better filtering quality than 4x AF. And like FSAA, anisotropic filtering requires a certain amount of processing power, which grows as the AF level increases.

High resolution textures

Every 3D game is built to specific requirements, and one of them is the amount of texture memory the game will need. All the textures in use must fit into video memory during play, otherwise performance drops dramatically, since fetching a texture from system RAM introduces considerable delay, to say nothing of the paging file on the hard disk. So if a game developer counts on 128 MB of VRAM as the minimum requirement, the set of active textures must not exceed 128 MB at any time.
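To get a feel for how quickly textures eat into such a budget, here is a back-of-the-envelope estimate for uncompressed RGBA textures with a full mipmap chain (texture compression, which real games use heavily, reduces these figures several times over):

    #include <cstdio>

    // Approximate size of one uncompressed 32-bit RGBA texture with mipmaps.
    // A full mip chain adds roughly one third on top of the base level.
    double textureMB(int width, int height) {
        double base = static_cast<double>(width) * height * 4.0;   // 4 bytes per texel
        return base * (4.0 / 3.0) / (1024.0 * 1024.0);
    }

    int main() {
        std::printf("one 1024x1024 texture: ~%.1f MB\n", textureMB(1024, 1024));
        std::printf("one 2048x2048 texture: ~%.1f MB\n", textureMB(2048, 2048));
        std::printf("24 textures at 1024x1024: ~%.0f MB\n", 24 * textureMB(1024, 1024));
    }

Roughly two dozen uncompressed 1024x1024 textures are already enough to fill a 128 MB budget, which is exactly why developers ship several texture sets, as described below.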

Modern games ship with multiple texture sets, so the game runs smoothly on older graphics cards with less VRAM as well as on newer cards with more. For example, a game may contain three texture sets: for 128 MB, 256 MB, and 512 MB. Very few games support 512 MB of video memory today, but such games are still the most objective reason to buy a card with that amount of memory. Although the extra memory has little or no effect on performance, you do get an improvement in visual quality if the game supports the corresponding texture set.

What do you need to know about video cards?

The Windows 10 Task Manager includes detailed GPU monitoring tools. You can view GPU usage per application and system-wide, and Microsoft promises that the Task Manager's figures will be more accurate than those of third-party utilities.

How it works

These GPU features were added in the Fall Creators Update for Windows 10, also known as Windows 10 version 1709. If you are using Windows 7, 8, or an older build of Windows 10, you will not see these tools in your Task Manager.

Windows uses newer features of the Windows Display Driver Model to pull information directly from the GPU scheduler (VidSCH) and the video memory manager (VidMm) in the WDDM graphics kernel, which are responsible for actually allocating resources. The data is accurate no matter which API an application uses to access the GPU: Microsoft DirectX, OpenGL, Vulkan, OpenCL, NVIDIA CUDA, AMD Mantle, or anything else.

That is why Task Manager only shows GPUs on systems with WDDM 2.0 compliant drivers. If you don't see them, your system's GPU is probably using an older driver type.

You can check which WDDM version your GPU driver uses by pressing Windows + R, typing "dxdiag", and pressing Enter to open the DirectX Diagnostic Tool. Go to the Display tab and look at the Driver Model entry on the right, under Drivers. If you see a WDDM 2.x driver there, your system is compatible; if you see a WDDM 1.x driver, your GPU is not.

How to View GPU Performance

This information is available in Task Manager, although it is hidden by default. To see it, open Task Manager by right-clicking any empty space on the taskbar and selecting "Task Manager", or by pressing Ctrl+Shift+Esc on the keyboard.

Click the "More details" button at the bottom of the Task Manager window if you see the standard simple view.

If the GPU column is not showing in Task Manager, right-click any column heading on the Processes tab and enable the "GPU" option. This adds a GPU column that shows the percentage of GPU resources each application is using.

You can also enable the "GPU Engine" column to see which GPU and which engine an application is using.

The total GPU usage of all applications on your system is displayed at the top of the GPU column. Click the GPU column to sort the list and see which applications are using your GPU the most at the moment.

The number in the GPU column is the application's highest usage across all engines. So, for example, if an application uses 50% of the GPU's 3D engine and 2% of its video decode engine, you will simply see 50% in the GPU column.
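In other words, the figure is a maximum across engines, not a sum. A trivial sketch of that rule, with made-up engine names and percentages:

    #include <algorithm>
    #include <cstdio>

    int main() {
        // Hypothetical per-engine utilisation of one application, in percent.
        struct EngineUsage { const char* engine; double percent; };
        const EngineUsage usage[] = {
            { "3D",           50.0 },
            { "Video Decode",  2.0 },
            { "Copy",          1.0 },
        };

        double shown = 0.0;
        for (const EngineUsage& u : usage)
            shown = std::max(shown, u.percent);   // the GPU column reports the busiest engine

        std::printf("GPU column shows: %.0f%%\n", shown);   // prints 50%
    }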

The "GPU Engine" column shows, for each application, which physical GPU and which engine it is using, for example whether it is running on the 3D engine or the video decode engine. You can determine which GPU corresponds to a particular entry by checking the Performance tab, which is discussed in the next section.

How to view an application's video memory usage

If you are wondering how much video memory an application is using, go to the Details tab in Task Manager. On the Details tab, right-click any column heading and select "Select columns". Scroll down and enable the "GPU", "GPU Engine", "Dedicated GPU memory", and "Shared GPU memory" columns. The first two are also available on the Processes tab, but the last two memory options are only available on the Details tab.

The "Dedicated GPU memory" column shows how much memory an application is using on your GPU. If your PC has a discrete NVIDIA or AMD graphics card, this is part of its VRAM, that is, how much physical memory on the graphics card the application is using. If you have an integrated graphics processor, part of your regular system memory is reserved exclusively for the graphics hardware, and this column shows how much of that reserved memory the application is using.

Windows also allows applications to keep some data in regular system DRAM. The "Shared GPU memory" column shows how much of the computer's normal system RAM the application is currently using for graphics purposes.

You can click any of these columns to sort by them and see which application is using the most resources. For example, to see the applications using the most video memory on your GPU, click the "Dedicated GPU memory" column.

How to Track Overall GPU Resource Usage

To track overall GPU resource usage statistics, go to the Performance tab and look for the "GPU" entry at the bottom of the sidebar. If your computer has multiple GPUs, you will see several GPU entries here.

If you have multiple linked GPUs, using a feature such as NVIDIA SLI or AMD CrossFire, you will see them identified by a "#" in their names.

Windows displays GPU usage in real time. By default, Task Manager tries to show the four most interesting engines based on what is happening on your system; for example, you will see different graphs depending on whether you are playing a 3D game or encoding video. However, you can click any of the names above the graphs and select any other available engine.

The name of your GPU also appears in the sidebar and at the top of this window, making it easy to check which graphics hardware is installed in your PC.

You will also see graphs of dedicated and shared GPU memory usage. Shared GPU memory usage refers to how much of the system's total memory is being used for GPU tasks; this memory can be used by both normal system tasks and graphics tasks.

At the bottom of the window you will see information such as the installed video driver version, the driver date, and the physical location of the GPU in your system.

If you want to view this information in a smaller window that is easier to keep on screen, double-click anywhere inside the GPU pane, or right-click inside it and select the "Graph summary view" option. You can restore the full window by double-clicking the pane again, or by right-clicking it and unchecking "Graph summary view".

You can also right-click the graph and select "Change graph to" > "Single engine" to view just one GPU engine graph.

To keep this window permanently visible on your screen, click "Options" > "Always on top".

Double-click inside the GPU pane once more and you get a minimal window that you can position anywhere on the screen.

Good day to all, my dear friends and guests of my blog. Today I would like to talk a little about the hardware of our computers. Tell me, have you heard of such a thing as a GPU? It turns out that many people are hearing this abbreviation for the first time.

Trite as it may sound, we live today in an era of computer technology, and it is sometimes hard to find a person who has no idea how a computer works. For many, it is enough to know that the computer works thanks to the central processing unit (CPU).

Some will go further and discover that there is also a certain GPU: another intricate abbreviation, similar to the previous one. So let's figure out what a GPU in a computer is, what kinds there are, and how it differs from a CPU.

Not a big difference

In simple words, a GPU is a graphics processing unit. It is sometimes referred to as a video card, which is partly a mistake: a video card is a complete component device that includes the processor we are describing. The GPU processes the commands that form 3D graphics, and it is the key element here: the speed and capabilities of the video system as a whole depend on its power.

The GPU has its own distinctive features compared to its CPU counterpart. The main difference lies in the architecture it is built on: the GPU's architecture is designed to process large amounts of data in parallel, and therefore more efficiently for graphics work, while the CPU processes data and tasks largely sequentially. Naturally, this should not be taken as a drawback.
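A simple way to picture the difference: graphics work consists of millions of small, mutually independent operations (one per pixel or per vertex), so it can be spread across thousands of GPU execution units at once, while a CPU with a handful of cores works through the same loop largely one step at a time. The sketch below only shows the shape of such a workload on the CPU side; it is an illustration, not GPU code:

    #include <cstdio>
    #include <vector>

    int main() {
        const int width = 1920, height = 1080;
        std::vector<float> brightness(static_cast<size_t>(width) * height);

        // Every iteration is independent of every other one: ideal for a GPU,
        // which can process thousands of these "pixels" simultaneously.
        // A CPU executes them a few at a time, core by core.
        for (int i = 0; i < width * height; ++i)
            brightness[i] = (i % width) / static_cast<float>(width);

        std::printf("processed %zu pixels\n", brightness.size());
    }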

Types of GPUs

There are not many types of graphics processors. One of them is called discrete and is used as a separate module. Such a chip is quite powerful, so it requires a cooling system of heatsinks and fans, and liquid cooling can be used in particularly heavily loaded systems.

Today we can see a significant step in the development of graphics components thanks to the emergence of a large number of GPU types. Whereas previously any computer had to be equipped with discrete graphics to run games or other graphics applications, now such tasks can be handled by an IGP, an integrated graphics processor.

Integrated graphics now ship with almost every computer (with the exception of servers), whether a laptop or a desktop. The video processor itself is built into the CPU, which significantly reduces power consumption and the price of the device. In addition, such graphics come in other subtypes, for example discrete or hybrid-discrete.

The first option is the most expensive one: the chip is soldered onto the motherboard or supplied as a separate mobile module. The second is called hybrid for a reason: it uses a small amount of video memory soldered on the board, but can expand it at the expense of system RAM.

Naturally, such graphics solutions cannot match full-fledged discrete video cards, but they already show quite good performance. In any case, the developers have something to strive for; perhaps the future lies with this kind of solution.

Well, that's about all I have. Hope you enjoyed the article! Looking forward to seeing you again on my blog. Good luck to you. Bye Bye!