Commit graph

3517 commits

Author SHA1 Message Date
ReinUsesLisp
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp
ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
Fernando Sahmkow
e52c895559 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
bunnei
52f54c728d
Merge pull request #2592 from FernandoS27/sync1
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
ReinUsesLisp
77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
Fernando Sahmkow
a452ff983d MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei
b0ff3179ef
Merge pull request #2739 from lioncash/cflow
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei
4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei
9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
ReinUsesLisp
104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei
27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
7a35178ee2 Maxwell3D: Reorganize and address feedback 2019-07-20 10:18:35 -04:00
Fernando Sahmkow
1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp
45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Lioncash
c1c89411da video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash
1780e0e3d0 video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash
a162a844d2 video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash
56bc11d952 video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash
e7b39f47f8 video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash
6885e7e7ec video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash
45fa12a05c video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash
47df844338 video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash
3df9558593 video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash
1109db86b7 video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei
63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow
5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow
43f57d668c GPU: Add missing puller methods.
This adds some missing puller methods. We don't assert them as these are 
nop operations for us.
2019-07-18 08:54:42 -04:00
Fernando Sahmkow
3a3fee5abf MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
This log was just to know which games used DMA. It's no longer 
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow
d3b71ff80d Gl_Texture_Cache: Remove assert on component type in GetFormatTuple
Textures can have different components types in different orders. This 
assert was completely inprecise and the effectiveness of such is better 
handled by case and within the texture cache.
2019-07-18 08:20:31 -04:00
Fernando Sahmkow
0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the sevirity of asserts for FP precision and 
rounding as this are well known and have little to no consequences in 
gpu's accuracy.
2019-07-18 08:17:19 -04:00
ReinUsesLisp
74632c76ce gl_shader_decompiler: Rename bufferImage to imageBuffer
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp
87909d327f gl_shader_cache: Fix newline on buffer preprocessor definitions 2019-07-18 01:16:15 -03:00
ReinUsesLisp
e7bdf8b22a textures: Fix texture buffer size calculation 2019-07-18 01:07:08 -03:00
ReinUsesLisp
84027f4808 gl_texture_cache: Do not set texture parameters to buffers 2019-07-18 01:06:26 -03:00
ReinUsesLisp
73b2dc6d4f gl_texture_cache: Add missing break in CreateTexture 2019-07-18 01:04:18 -03:00
Fernando Sahmkow
4be61013a1 GL_State: Feedback and fixes 2019-07-17 17:29:56 -04:00
Fernando Sahmkow
5ad889f6fd Maxwell3D: Address Feedback 2019-07-17 17:29:55 -04:00
Fernando Sahmkow
7826f0afd9 Texture_Cache: Rebase Fixes 2019-07-17 17:29:54 -04:00
Fernando Sahmkow
8cdbfe69b1 GL_Rasterizer: Corrections to Clearing. 2019-07-17 17:29:54 -04:00
Fernando Sahmkow
0ff4a5fa39 Maxwell3D: Correct marking dirtiness on CB upload 2019-07-17 17:29:53 -04:00
Fernando Sahmkow
fec32fed18 GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing 2019-07-17 17:29:52 -04:00
Fernando Sahmkow
a081dea8ab Maxwell3D: Implement State Dirty Flags. 2019-07-17 17:29:51 -04:00
Fernando Sahmkow
0d3db58657 Maxwell3D: Rework CBData Upload 2019-07-17 17:29:50 -04:00
Fernando Sahmkow
f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow
e42bcf2314 maxwell3d: Implement Conditional Rendering
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
Fernando Sahmkow
223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash
bebbdc2067 shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.

This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash
60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash
44d87ff641 shader_ir: Remove unused includes
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow
d614193e49 Shader_Ir: Correct tracking to track from right to left 2019-07-16 15:06:59 -04:00
Fernando Sahmkow
b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash
e2d7dda166 shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
ReinUsesLisp
2a4044a858 gl_shader_cache: Fix clang-format issues 2019-07-15 20:33:51 -03:00
ReinUsesLisp
6b0d017675 gl_shader_decompiler: Stub local memory size 2019-07-15 17:38:25 -03:00
ReinUsesLisp
56bca83bde gl_shader_cache: Address review commentaries 2019-07-15 17:38:25 -03:00
ReinUsesLisp
bbecd13697 gl_shader_cache: Address CI issues 2019-07-15 17:38:25 -03:00
ReinUsesLisp
725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei
b77a1ed67a
Merge pull request #2705 from FernandoS27/tex-cache-fixes
GPU: Fixes to Texture Cache and Include Microprofiles for GL State/BufferCopy/Macro Interpreter
2019-07-14 22:44:36 -04:00
ReinUsesLisp
afa8096df5 shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
bunnei
3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow
2ac7472d3f Texture_Cache: Address Feedback 2019-07-14 17:42:39 -04:00
Fernando Sahmkow
0f54b541f4 Texture_Cache: Remove some unprecise fallback case and clang format 2019-07-14 12:00:32 -04:00
Fernando Sahmkow
5818959e54 Texture_Cache: Force Framebuffer reset if an active render target is unregistered. 2019-07-14 12:00:31 -04:00
Fernando Sahmkow
913b7a6872 GPU: Add a microprofile for macro interpreter 2019-07-14 12:00:30 -04:00
Fernando Sahmkow
a9943222f2 GL_State: Add a microprofile timer to OpenGL state. 2019-07-14 12:00:30 -04:00
Fernando Sahmkow
5c1e1a148e Gl_Texture_Cache: Measure Buffer Copy Times 2019-07-14 12:00:29 -04:00
Fernando Sahmkow
5d31bab69a Texture_Cache: Correct Linear Structural Match. 2019-07-14 12:00:28 -04:00
Fernando Sahmkow
4882c058fd
Merge pull request #2690 from SciresM/physmem_fixes
Implement MapPhysicalMemory/UnmapPhysicalMemory
2019-07-14 09:16:46 -04:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
bunnei
bb67091c77
Merge pull request #2609 from FernandoS27/new-scan
Implement a New Shader Scanner, Decompile Flow Stack and implement BRX BRA.CC
2019-07-11 17:36:23 -04:00
ReinUsesLisp
0eb0c24269 gl_shader_decompiler: Fix gl_PointSize redeclaration 2019-07-11 16:10:59 -03:00
ReinUsesLisp
aca40de224 gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array 2019-07-11 04:27:00 -03:00
bunnei
fd066ffbce
Merge pull request #2697 from lioncash/doc
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
2019-07-10 16:38:09 -04:00
bunnei
7fb7054bc8
Merge pull request #2686 from ReinUsesLisp/vk-scheduler
vk_scheduler: Drop execution context in favor of views
2019-07-10 16:35:48 -04:00
bunnei
206ec29f17
Merge pull request #2691 from lioncash/override
video_core: Add missing override specifiers
2019-07-10 16:25:43 -04:00
Fernando Sahmkow
f2549739d1 shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Michael Scire
a1845d1dd3 prefer system reference over global accessor 2019-07-09 08:11:35 -07:00
Fernando Sahmkow
2de7649311 shader_ir: limit explorastion to best known program size. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
e7c6045a03 control_flow: Correct block breaking algorithm. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
dc4a93594c control_flow: Assert shaders bigger than limit. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
e7a88f0ab3 control_flow: Address feedback. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
34357b110c shader_ir: Correct parsing of scheduling instructions and correct sizing 2019-07-09 08:14:41 -04:00
Fernando Sahmkow
cfb3db1a32 shader_ir: Correct max sizing 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
d45fed3030 shader_ir: Remove unnecessary constructors and use optional for ScanFlow result 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
01b21ee1e8 shader_ir: Corrections, documenting and asserting control_flow 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
926b80102f shader_ir: Decompile Flow Stack 2019-07-09 08:14:38 -04:00
Fernando Sahmkow
459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
c218ae4b02 shader_ir: Remove the old scanner. 2019-07-09 08:14:36 -04:00
Fernando Sahmkow
8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
Lioncash
c04785c928 gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
must_reconfigure isn't a parameter for this function any more, so it can
be replaced with current_state.

While we're at it, we can make the parameters of the declaration match
the same name as the ones in the definition.
2019-07-09 02:08:15 -04:00
Michael Scire
697206092e Prevent merging of device mapped memory blocks.
This sets the DeviceMapped attribute for GPU-mapped memory blocks,
and prevents merging device mapped blocks. This prevents memory
mapped from the gpu from having its backing address changed by
block coalesce.
2019-07-08 22:52:05 -07:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias
be020f7621
Delete decode_integer_set.cpp 2019-07-07 21:40:33 +02:00
ReinUsesLisp
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
Lioncash
cbdd6cd1c0 vk_sampler_cache: Remove unused includes
These are no longer used within this header, so they can be removed.
2019-07-07 13:40:36 -04:00
Lioncash
4b27680639 video_core: Add missing override specifiers 2019-07-07 13:38:39 -04:00
ReinUsesLisp
86a874a2fc vk_scheduler: Drop execution context in favor of views
Instead of passing by copy an execution context through out the whole
Vulkan call hierarchy, use a command buffer view and fence view
approach.

This internally dereferences the command buffer or fence forcing the
user to be unable to use an outdated version of it on normal usage.
It is still possible to keep store an outdated if it is casted to
VKFence& or vk::CommandBuffer.

While changing this file, add an extra parameter for Flush and Finish to
allow releasing the fence from this calls.
2019-07-07 03:30:22 -03:00
ReinUsesLisp
79a23ca5f0 buffer_cache: Avoid [[nodiscard]] to make clang-format happy 2019-07-06 01:17:05 -03:00
ReinUsesLisp
83050c9495 buffer_cache: Try to fix MinGW build 2019-07-06 01:14:05 -03:00
ReinUsesLisp
f7691ebe57 gl_rasterizer: Fix nullptr dereference on disabled buffers 2019-07-06 00:37:56 -03:00
ReinUsesLisp
7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
ReinUsesLisp
9cdc576f60 gl_rasterizer: Fix vertex and index data invalidations 2019-07-06 00:37:55 -03:00
ReinUsesLisp
1fa21fa192 gl_buffer_cache: Implement with generic buffer cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp
32c0212b24 buffer_cache: Implement a generic buffer cache
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp
2bcae41a73 gl_buffer_cache: Remove global system getters 2019-07-06 00:37:55 -03:00
ReinUsesLisp
02ab844934 gl_device: Query SSBO alignment 2019-07-06 00:37:55 -03:00
ReinUsesLisp
d14fbfb9b5 gl_buffer_cache: Implement flushing 2019-07-06 00:37:55 -03:00
ReinUsesLisp
345f852bdb gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp
8155b12d3d gl_buffer_cache: Rework to support internalized buffers 2019-07-06 00:37:55 -03:00
ReinUsesLisp
f8ba72d491 gl_buffer_cache: Store in CachedBufferEntry the used buffer handle 2019-07-06 00:37:55 -03:00
ReinUsesLisp
b54fb8fc4c gl_buffer_cache: Return used buffer from Upload function 2019-07-06 00:37:55 -03:00
ReinUsesLisp
a6d2f52fc3 gl_rasterizer: Add some commentaries 2019-07-06 00:37:55 -03:00
ReinUsesLisp
2b9d4088ec gl_rasterizer: Make DrawParameters rasterizer instance const 2019-07-06 00:37:55 -03:00
ReinUsesLisp
2e39c20da5 gl_rasterizer: Move index buffer uploading to its own method 2019-07-06 00:37:55 -03:00
Fernando Sahmkow
d20ede40b1 NVServices: Styling, define constructors as explicit and corrections 2019-07-05 15:49:32 -04:00
Fernando Sahmkow
b391e5f638 NVFlinger: Correct GCC compile error 2019-07-05 15:49:31 -04:00
Fernando Sahmkow
0335a25d1f NVServices: Make NVEvents Automatic according to documentation. 2019-07-05 15:49:29 -04:00
Fernando Sahmkow
7d1b974bca GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardware 2019-07-05 15:49:26 -04:00
Fernando Sahmkow
f2e026a1d8 gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow
0706d633bf nv_host_ctrl: Make Sync GPU variant always return synced result. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow
600dddf88d Async GPU: do invalidate as synced operation
Async GPU: Always invalidate synced.
2019-07-05 15:49:19 -04:00
Fernando Sahmkow
c13433aee4 Gpu: use an std mutex instead of a spin_lock to guard syncpoints 2019-07-05 15:49:18 -04:00
Fernando Sahmkow
eef55f493b Gpu: Mark areas as protected. 2019-07-05 15:49:16 -04:00
Fernando Sahmkow
a45643cb3b nv_services: Stub CtrlEventSignal 2019-07-05 15:49:15 -04:00
Fernando Sahmkow
8942047d41 Gpu: Implement Hardware Interrupt Manager and manage GPU interrupts 2019-07-05 15:49:14 -04:00
Fernando Sahmkow
82b829625b video_core: Implement GPU side Syncpoints 2019-07-05 15:49:11 -04:00
Zach Hilman
772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow
3b9d89839d texture_cache: Address Feedback 2019-07-05 09:46:53 -04:00
Fernando Sahmkow
30b176f92b texture_cache: Correct Texture Buffer Uploading 2019-07-04 19:38:19 -04:00
Zach Hilman
ad50cd7df9 gl_shader_cache: Make CachedShader constructor private
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Zach Hilman
da5a537029
Merge pull request #2563 from ReinUsesLisp/shader-initializers
gl_shader_cache: Use static constructors for CachedShader initialization
2019-07-03 20:20:05 -04:00
Fernando Sahmkow
4705d1b523 rasterizer_cache: Protect inherited caches from submission level 2019-07-01 04:32:01 -04:00
ReinUsesLisp
6e1db6b703 texture_cache: Pack sibling queries inside a method 2019-06-29 20:47:46 -03:00
ReinUsesLisp
8eae66907e texture_cache: Use std::vector reservation for sampled_textures 2019-06-29 20:10:31 -03:00
ReinUsesLisp
f6f1a8f26a texture_cache: Style changes 2019-06-29 19:52:37 -03:00
ReinUsesLisp
dd9ace502b texture_cache: Use std::array for siblings_table 2019-06-29 18:54:13 -03:00
ReinUsesLisp
3f3c3ca5f9 texture_cache: Address feedback 2019-06-29 17:29:39 -03:00
Fernando Sahmkow
223ca80753 texture_cache: Correct variable naming. 2019-06-25 19:35:08 -04:00
Fernando Sahmkow
5aeabd9a17 gl_texture_cache: Correct asserts 2019-06-25 19:26:59 -04:00
Fernando Sahmkow
88bc39374f texture_cache: Corrections, documentation and asserts 2019-06-25 18:36:19 -04:00
Fernando Sahmkow
c0abc7124d surface_params: Corrections, asserts and documentation. 2019-06-25 18:03:25 -04:00
Fernando Sahmkow
fb234560b0 copy_params: use constexpr for constructor 2019-06-25 17:42:50 -04:00
Fernando Sahmkow
18d24fbdd0 gl_texture_cache: Corrections and fixes 2019-06-25 17:40:08 -04:00
Fernando Sahmkow
36665ce0b2 gl_resource_manager: Correct MakeStreamCopy 2019-06-25 17:32:04 -04:00
Fernando Sahmkow
58c8a44e7a texture_cache: Query MemoryManager from the system 2019-06-25 17:26:00 -04:00
ReinUsesLisp
7565389700 texture_cache: Include "core/core.h" 2019-06-24 02:15:57 -03:00
ReinUsesLisp
e723441e37 gl_texture_cache: Explicitly add indirect include 2019-06-24 02:13:55 -03:00
ReinUsesLisp
34841a41c3 texture_cache/surface_view: Address feedback 2019-06-24 02:09:56 -03:00
ReinUsesLisp
0837290992 texture_cache/surface_base: Address feedback 2019-06-24 02:08:52 -03:00
ReinUsesLisp
75de730e28 video_core/surface: Address feedback 2019-06-24 02:07:11 -03:00
ReinUsesLisp
10a83653ee decode/texture: Address feedback 2019-06-24 02:05:05 -03:00
ReinUsesLisp
4504302abc renderer_opengl/utils: Remove unused includes and unused forward declaration 2019-06-24 02:03:37 -03:00
ReinUsesLisp
4b2ff1e00e gl_texture_cache: Address some feedback 2019-06-24 02:01:44 -03:00
ReinUsesLisp
0b6df52109 gl_shader_disk_cache: Address feedback 2019-06-24 01:59:32 -03:00
ReinUsesLisp
b8b05a484a gl_shader_decompiler: Address feedback 2019-06-24 01:56:38 -03:00
ReinUsesLisp
4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
bunnei
a9f3c54871
Merge pull request #2579 from ReinUsesLisp/fix-aoffi-test
gl_device: Fix TestVariableAoffi test
2019-06-21 15:28:55 -04:00
Fernando Sahmkow
d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow
51ba60b27e shader_cache: Correct versioning and size calculation. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
97c8c9f49a texture_cache: Eliminate linear textures fallthrough 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6acdae0e4c texture_cache: Correct format R16U as sibling 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
d7587842eb texture_cache: Implement texception detection and texture barriers. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
198a0395bb texture_cache: Corrections to buffers and shadow formats use. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fed773a86c texture_cache: Implement Irregular Views in surfaces 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
082740d34d surface: Correct format S8Z24 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
03d489dcf5 texture_cache: Initialize all siblings to invalid pixel format. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9422cf7c10 gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fac3706253 gl_texture_cache: Correct Image Blit 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
7232a1ed16 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3dd7643214 texture_cache: Use siblings textures on Rebuild and fix possible error on blitting 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
4db28f72f6 texture_cache: Remove old rasterizer cache 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
2d83553ea7 texture_cache: Implement siblings texture formats. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
cb728797b0 fermi2d: Correct Origin Mode 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
a56f687793 texture_cache: correct texture buffer on surface params 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b01f9c8a70 texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
561ce29c98 texture_cache: correct mutex locks 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b7de31ac97 shader_ir: Fix image copy rebase issues 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6f69f06873 texture_cache: Don't Image Copy if component types differ 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9f755218a1 texture_cache: move some large methods to cpp files 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3809041c24 texture_cache: Optimize GetSurface and use references on functions that don't change a surface. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
60bf761afb texture_cache: Implement Buffer Copy and detect Turing GPUs Image Copies 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
228f516bb4 texture_cache uncompress-compress is untopological.
This makes conflicts between non compress and compress textures to be 
auto recycled. It also limits the amount of mipmaps a texture can have 
if it goes above it's limit.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow
9251354152 texture_cache: Correct copying between compressed and uncompressed formats 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
0966665fc2 texture_cache: Only load on recycle with accurate GPU.
Testing so far has proven this to be quite safe as texture memory read 
added a 2-5ms load to the current cache.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow
ea1525dab1 Fix rebase errors 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
bdf9faab33 texture_cache: Handle uncontinuous surfaces. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
e60ed2bb3e texture_cache: return null surface on invalid address 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
fcac55d5bf texture_cache: Add checks for texture buffers. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow
175aa343ff texture_cache: Fermi2D reform and implement View Mirage
This also does some fixes on compressed textures reinterpret and on the
Fermi2D engine in general.
2019-06-20 21:38:33 -03:00
ReinUsesLisp
1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp
9097301d92 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
007ffbef1c gl_rasterizer: Track texture buffer usage 2019-06-20 21:38:33 -03:00
ReinUsesLisp
58c0d37422 video_core: Make ARB_buffer_storage a required extension 2019-06-20 21:36:12 -03:00
ReinUsesLisp
07f7ce1da2 gl_rasterizer_cache: Use texture buffers to emulate texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp
b8c75a845b maxwell_3d: Partially implement texture buffers as 1D textures 2019-06-20 21:36:12 -03:00
ReinUsesLisp
6c81c8f5b7 gl_shader_decompiler: Allow 1D textures to be texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d267948a73 texture_cache: loose TryReconstructSurface when accurate GPU is not on.
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6162cb922e texture_cache: Document the most important methods. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
4530511ee4 texture_cache: Try to Reconstruct Surface on bigger than overlap.
This fixes clouds in SMO Cap Kingdom and lens on Cloud Kingdom.
Also moved accurate_gpu setting check to Pick Strategy
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a79831d9d0 texture_cache: Implement Guard mechanism 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
7731a0e2d1 texture_cache: General Fixes
Fixed ASTC mipmaps loading
Fixed alignment on openGL upload/download
Fixed Block Height Calculation
Removed unalign_height
2019-06-20 21:36:12 -03:00
ReinUsesLisp
c2ed348bdd surface_params: Ensure pitch is always written to avoid surface leaks 2019-06-20 21:36:12 -03:00
ReinUsesLisp
9098905dd1 gl_framebuffer_cache: Use a hashed struct to cache framebuffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d65a4af895 texture_cache return invalid buffer on deactivated color_mask 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6bd034eae9 engine_upload: Addapt to new Texture Cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp
2131f71573 surface_params: Optimize CreateForTexture
Instead of using Common::AlignUp, use Common::AlignBits to align the
texture compression factor.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
41b4674458 gl_texture_cache: Make main views be proxy textures instead of a full view. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
07cc7e0c12 texture_cache: Add ASync Protections 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
1bbc9debfb Remove Framebuffer reconfiguration and restrict rendertarget protection 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
5192521dc3 texture_cache: Implement GPU Dirty Flags 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
94f2be5473 texture_cache: Optimize GetMipBlockHeight and GetMipBlockDepth 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a4a58be2d4 texture_cache: Implement L1_Inner_cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp
345e73f2fe video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
ReinUsesLisp
28d7c2f5a5 texture_cache: Change internal cache from lists to vectors 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
b347543e83 Reduce amount of size calculations. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
4e2071b6d9 texture_cache: Correct premature texceptions
Due to our current infrastructure, it is possible for a mipmap to be set 
on as a render target before a texception of that mipmap's superset be 
set afterwards. This is problematic as we rely on texture views to set 
up texceptions and protecting render targets targets for 3D texture 
rendering.

One simple solution is to configure framebuffers after texture setup but 
this brings other problems. This solution, forces a reconfiguration of 
the framebuffers after such event happens.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
ba677ccb5a texture_cache: Implement guest flushing 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
de0b1cb2b2 Fixes to mipmap's process and reconstruct process 2019-06-20 21:36:12 -03:00
ReinUsesLisp
e0002599ac surface_base: Add parenthesis to EmplaceOverview's predicate 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
324e470879 Texture Cache: Implement Blitting and Fermi Copies 2019-06-20 21:36:12 -03:00
ReinUsesLisp
549fd18ac4 surface_view: Add constructor for ViewParams 2019-06-20 21:36:12 -03:00
ReinUsesLisp
16e8625a30 surface_base: Split BreakDown into layered and non-layered variants 2019-06-20 21:36:12 -03:00
ReinUsesLisp
2b30000a1e surface_base: Silence truncation warnings and minor renames and reordering 2019-06-20 21:36:12 -03:00
ReinUsesLisp
03d10ea3b4 copy_params: Use constructor instead of C-like initialization 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
1af4414861 Correct Mipmaps View method in Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d86f9cd709 Change texture_cache chaching from GPUAddr to CacheAddr
This also reverses the changes to make invalidation and flushing through
the GPU address.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
b711cdce78 Corrections to Structural Matching
The texture will now be reconstructed if the width only matches on GoB 
alignment.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
bc930754cc Implement Texture Cache V2 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
3d471e732d Correct Surface Base and Views for new Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
3b26206dbd Add OGLTextureView 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6b0695b3cd Deglobalize Memory Manager on texture cahe and Implement Invalidation and Flushing using GPUVAddr 2019-06-20 21:36:11 -03:00
ReinUsesLisp
6c410104f4 texture_cache: Remove execution context copies from the texture cache
This is done to simplify the OpenGL implementation, it is needed for
Vulkan.
2019-06-20 21:36:11 -03:00
ReinUsesLisp
fa59a7b4d8 gl_texture_cache: Implement fermi copies 2019-06-20 21:36:11 -03:00
ReinUsesLisp
1b4503c571 texture_cache: Split texture cache into different files 2019-06-20 21:36:11 -03:00
ReinUsesLisp
5f3aacdc37 texture_cache: Move staging buffer into a generic implementation 2019-06-20 21:36:11 -03:00
ReinUsesLisp
2787a0c287 texture_cache: Flush 3D textures in the order they are drawn 2019-06-20 21:36:11 -03:00
ReinUsesLisp
4b396f375c gl_texture_cache: Minor changes 2019-06-20 21:36:11 -03:00
ReinUsesLisp
0cefb7bcb4 gl_texture_cache: Add copy from multiple overlaps into a single surface 2019-06-20 21:36:11 -03:00
ReinUsesLisp
84139586c9 gl_texture_cache: Attach surface textures instead of views 2019-06-20 21:36:11 -03:00
ReinUsesLisp
fb94871791 gl_texture_cache: Add fast copy path 2019-06-20 21:36:11 -03:00
ReinUsesLisp
bab21e8cb3 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
bunnei
c28694d907
Merge pull request #2591 from lioncash/record
core: Remove unused CiTrace source files
2019-06-19 22:28:26 -04:00
Lioncash
61d2498f00 core: Remove unused CiTrace source files
These source files have been unused for the entire lifecycle of the
project. They're a hold-over from Citra and only add to the build time
of the project, so they can be removed.

There's also likely no way this would ever work in yuzu in its current
form without revamping quite a bit of it, given how different the GPU on
the Switch is compared to the 3DS.
2019-06-18 16:57:59 -04:00
bunnei
c7b5c245e1
Merge pull request #2562 from ReinUsesLisp/split-cbuf-upload
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-17 22:35:04 -04:00
Zach Hilman
c0e7b91145
Merge pull request #2538 from ReinUsesLisp/ssy-pbk
shader: Split SSY and PBK stack
2019-06-15 20:30:13 -04:00
ReinUsesLisp
ee81fb94cd gl_device: Fix TestVariableAoffi test
This test is intended to be invalid GLSL, but it was being invalid in
two points instead of one. The intention is to use a non-immediate
parameter in a textureOffset like function.

The problem is that this shader was being compiled as a separable
shader object and the text was writting to gl_Position without a
redeclaration, being invalid GLSL.

Address that issue by using a user-defined output attribute.
2019-06-11 23:02:50 -03:00
bunnei
f981efdf8d
Merge pull request #2572 from FernandoS27/gpu-mem
GPUVM: Correct GPU VM virtual address space
2019-06-11 21:09:57 -04:00
Fernando Sahmkow
f79823fda7 GPUVM: Correct GPU VM virtual address space 2019-06-09 17:47:15 -04:00
ReinUsesLisp
528c15051c kepler_compute: Use std::array for cbuf info 2019-06-07 20:36:22 -03:00
ReinUsesLisp
17d5fb6d06 kepler_compute: Fix block_dim_x encoding 2019-06-07 20:35:46 -03:00
ReinUsesLisp
4ec8a3df08 gl_shader_cache: Use static constructors for CachedShader initialization 2019-06-07 20:20:22 -03:00
ReinUsesLisp
5669ff3cbd gl_rasterizer: Remove unused parameters in descriptor uploads 2019-06-07 19:52:16 -03:00
ReinUsesLisp
2f2a61887a video_core/engines: Move ConstBufferInfo out of Maxwell3D 2019-06-07 19:47:15 -03:00
Zach Hilman
de33ad25f5
Merge pull request #2514 from ReinUsesLisp/opengl-compat
video_core: Drop OpenGL core in favor of OpenGL compatibility
2019-06-07 17:23:25 -04:00
ReinUsesLisp
fe8e6618f2 shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp
769a50661a shader/node: Minor changes
Reflect std::shared_ptr nature of Node on initializers and remove
constant members in nodes.

Add some commentaries.
2019-06-06 20:03:33 -03:00
ReinUsesLisp
e1b3be7ced shader: Move Node declarations out of the shader IR header
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp
bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei
a20ba09bfd
Merge pull request #2520 from ReinUsesLisp/vulkan-refresh
vk_device,vk_shader_decompiler: Miscellaneous changes
2019-06-05 18:10:00 -04:00
bunnei
55c5029171
Merge pull request #2540 from ReinUsesLisp/remove-guest-position
gl_shader_decompiler: Remove guest "position" varying
2019-06-05 18:07:23 -04:00
bunnei
0bcc305797
Merge pull request #2512 from ReinUsesLisp/comp-indexing
gl_shader_decompiler: Pessimize uniform buffer access on AMD's prorpietary driver
2019-06-05 18:02:30 -04:00
Zach Hilman
81e09bb121
Merge pull request #2545 from lioncash/timing
core/core_timing_util: Use std::chrono types for specifying time units
2019-06-05 15:52:37 -04:00
Zach Hilman
dd4fe0dab1
Merge pull request #2534 from ReinUsesLisp/shader-cleanup
gl_shader_cache: Minor style changes
2019-06-05 15:28:34 -04:00
Lioncash
42f5fd0ab3 core/core_timing_util: Use std::chrono types for specifying time units
Makes the interface more type-safe and consistent in terms of return
values.
2019-06-04 20:31:24 -04:00
Fernando Sahmkow
a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp
0935c2d97b gl_shader_decompiler: Remove guest "position" varying
"position" was being written but not read anywhere besides geometry
shaders, where it had the same value as gl_Position.

This commit replaces "position" with gl_Position, reducing the
complexity of our code and the emitted GLSL code.
2019-06-03 01:01:34 -03:00
ReinUsesLisp
e72b9044a0 gl_shader_cache: Store a system class and drop global accessors 2019-05-30 14:01:40 -03:00
ReinUsesLisp
ad321564ed gl_shader_cache: Add commentaries explaining the intention in shaders creation 2019-05-30 13:58:38 -03:00
ReinUsesLisp
838b6d2ff8 gl_shader_cache: Flip if condition in GetStageProgram to reduce indentation 2019-05-30 13:56:03 -03:00
ReinUsesLisp
6ac4490751 gl_buffer_cache: Remove unused ReserveMemory method 2019-05-30 13:21:01 -03:00
ReinUsesLisp
a89cc0bafc maxwell_to_gl: Use GL_CLAMP to emulate Clamp wrap mode 2019-05-30 13:21:01 -03:00
ReinUsesLisp
b76df62c00 gl_rasterizer: Move alpha testing to the OpenGL pipeline
Removes the alpha testing code from each fragment shader invocation.
2019-05-30 13:21:01 -03:00
ReinUsesLisp
df509486c4 gl_rasterizer: Use GL_QUADS to emulate quads rendering 2019-05-30 13:21:01 -03:00
bunnei
e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
ReinUsesLisp
21c0b4dec8 gl_device: Add commentary to AOFFI unit test source code
The intention behind this commit is to hint someone inspecting an
apitrace dump to ignore this ill-formed GLSL code.
2019-05-27 00:55:57 -03:00
ReinUsesLisp
84928e6d67 gl_shader_gen: Always declare extensions after the version declaration
This addresses a bug on geometry shaders where code was being written
before all #extension declarations were done. Ref to #2523
2019-05-27 00:51:35 -03:00
ReinUsesLisp
f424b46036 vk_device: Let formats array type be deduced 2019-05-26 03:09:06 -03:00
ReinUsesLisp
a4c5e3e339 vk_shader_decompiler: Misc fixes
Fix missing OpSelectionMerge instruction. This caused devices loses on
most hardware, Intel didn't care.

Fix [-1;1] -> [0;1] depth conversions.

Conditionally use VK_EXT_scalar_block_layout. This allows us to use
non-std140 layouts on UBOs.

Update external Vulkan headers.
2019-05-26 01:48:04 -03:00
ReinUsesLisp
dec3c981d0 vk_device: Enable features when available and misc changes
Keeps track of native ASTC support, VK_EXT_scalar_block_layout
availability and SSBO range.

Check for independentBlend and vertexPipelineStorageAndAtomics as a
required feature. Always enable it.

Use vk::to_string format to log Vulkan enums.

Style changes.
2019-05-26 01:41:34 -03:00
Lioncash
5a4564bd8e renderer_opengl/utils: Use a std::string_view with LabelGLObject()
Uses a std::string_view instead of a std::string, given the pointed to
string isn't modified and is only used in a formatting operation.

This is nice because a few usages directly supply a string literal to
the function, allowing these usages to otherwise not heap allocate,
unlike the std::string overloads.

While we're at it, we can combine the address formatting into a single
formatting call.
2019-05-24 23:50:10 -04:00
bunnei
68c9c9222d
Merge pull request #2358 from ReinUsesLisp/parallel-shader
gl_shader_cache: Use shared contexts to build shaders in parallel at boot
2019-05-24 22:42:08 -04:00
bunnei
1a2d90ab09
Merge pull request #2485 from ReinUsesLisp/generic-memory
shader/memory: Implement generic memory stores and loads (ST and LD)
2019-05-24 18:24:26 -04:00
ReinUsesLisp
d8827b07b5 gl_shader_decompiler: Use an if based cbuf indexing for broken drivers
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```

It index the wrong components, to fix this the following pessimized code
is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
2019-05-24 02:47:56 -03:00
ReinUsesLisp
46177901b8 gl_device: Add test to detect broken component indexing
Component indexing on AMD's proprietary driver is broken. This commit adds
a test to detect when we are on a driver that can't successfully manage
component indexing.

It dispatches a dummy draw with just one vertex shader that writes to an
indexed SSBO from the GPU with data sent through uniforms, it then reads
that data from the CPU and compares the expected output.
2019-05-24 02:47:56 -03:00
Lioncash
b6dcb1ae4d shader/shader_ir: Make Comment() take a std::string by value
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.

e.g. previously:

Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
        cbuf->GetIndex(), cbuf_offset));

Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).

Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
2019-05-23 03:01:55 -03:00
Lioncash
228e58d0a5 shader/decode/*: Add missing newline to files lacking them
Keeps the shader code file endings consistent.
2019-05-23 02:55:52 -03:00
Lioncash
87b4c1ac5e shader/decode/*: Eliminate indirect inclusions
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
2019-05-23 02:55:52 -03:00
Lioncash
195b54602f shader/decode/memory: Remove left in debug pragma 2019-05-22 17:08:50 -04:00
Lioncash
de23847184 renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format string
This accidentally slipped through a rebase.
2019-05-21 09:47:21 -04:00
ReinUsesLisp
69215b5a55 gl_shader_cache: Fix clang strict standard build issues 2019-05-20 22:46:05 -03:00
ReinUsesLisp
c03b8c4c19 gl_shader_cache: Use shared contexts to build shaders in parallel 2019-05-20 22:45:55 -03:00
ReinUsesLisp
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
bunnei
9a17b20896
Merge pull request #2494 from lioncash/shader-text
gl_shader_decompiler: Add AddLine() overloads with single function that forwards to libfmt
2019-05-20 20:42:40 -04:00
ReinUsesLisp
9c3461604c shader: Implement S2R Tid{XYZ} and CtaId{XYZ} 2019-05-20 16:36:49 -03:00
ReinUsesLisp
ada79fa8ad gl_shader_decompiler: Make GetSwizzle constexpr 2019-05-20 16:36:48 -03:00
Lioncash
58a0c13e34 gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenation 2019-05-20 14:14:48 -04:00
Lioncash
6fb29764d6 gl_shader_decompiler: Replace individual overloads with the fmt-based one
Gets rid of the need to special-case brace handling depending on the
overload used, and makes it consistent across the board with how fmt
handles them.

Strings with compile-time deducible strings are directly forwarded to
std::string's constructor, so we don't need to worry about the
performance difference here, as it'll be identical.
2019-05-20 14:14:48 -04:00
Lioncash
784d2b6c3d gl_shader_decompiler: Utilize fmt overload of AddLine() where applicable 2019-05-20 14:14:44 -04:00
Fernando Sahmkow
911fafb967 Revert #2466
This reverts a tested behavior on delay slots not exiting if the exit 
flag is set. Currently new tests are required in order to ensure this 
behavior.
2019-05-19 16:04:44 -04:00
Lioncash
91ec251c4a gl_shader_decompiler: Add AddLine() overload that forwards to fmt
In a lot of places throughout the decompiler, string concatenation via
operator+ is used quite heavily. This is usually fine, when not heavily
used, but when used extensively, can be a problem. operator+ creates an
entirely new heap allocated temporary string and given we perform
expressions like:

std::string thing = a + b + c + d;

this ends up with a lot of unnecessary temporary strings being created
and discarded, which kind of thrashes the heap more than we need to.
Given we utilize fmt in some AddLine calls, we can make this a part of
the ShaderWriter's API. We can make an overload that simply acts as a
passthrough to fmt.

This way, whenever things need to be appended to a string, the operation
can be done via a single string formatting operation instead of
discarding numerous temporary strings. This also has the benefit of
making the strings themselves look nicer and makes it easier to spot
errors in them.
2019-05-19 14:12:20 -04:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Hexagon12
b94b08fa6f
Merge pull request #2491 from FernandoS27/dma-fix
Dma_pusher: ASSERT on empty command_list
2019-05-19 16:27:15 +01:00
Hexagon12
f8b1e53369
Merge pull request #2452 from FernandoS27/raster-cache-fix
Correct possible error on Rasterizer Caches
2019-05-19 16:00:44 +01:00
Hexagon12
2aebbe9bf9
Merge pull request #2497 from lioncash/shader-ir
shader/shader_ir: Minor changes
2019-05-19 15:51:06 +01:00
Hexagon12
fadf66993c
Merge pull request #2495 from lioncash/cache
gl_shader_disk_cache: Minor cleanup
2019-05-19 15:50:23 +01:00
Fernando Sahmkow
9e98100c94 Dma_pusher: ASSERT on empty command_list
This is a measure to avoid crashes on command list reading as an empty
command_list is considered a NOP.
2019-05-19 10:48:31 -04:00
Hexagon12
18cdbdafa2
Merge pull request #2467 from lioncash/move
video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainer
2019-05-19 15:20:37 +01:00
Hexagon12
9175bffbdb
Merge pull request #2466 from yuzu-emu/mme-exit-delay-slot
GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.
2019-05-19 15:14:41 +01:00
Hexagon12
ac3775e6ae
Merge pull request #2468 from lioncash/deduction
yuzu: Remove explicit types from locks where applicable
2019-05-19 15:05:56 +01:00
Hexagon12
b54bd3f018
Merge pull request #2472 from FernandoS27/tic
maxwell_3d: reduce severity of different component formats assert.
2019-05-19 15:04:47 +01:00
Hexagon12
3bd5f01240
Merge pull request #2469 from lioncash/copyable
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle
a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Hexagon12
4452195d41
Merge pull request #2480 from ReinUsesLisp/fix-quads
gl_rasterizer: Pass the right number of array quad vertices count
2019-05-19 14:58:49 +01:00
Hexagon12
8e9a1e4249
Merge pull request #2483 from ReinUsesLisp/fix-point-size
gl_rasterizer: Limit OpenGL point size to a minimum of 1
2019-05-19 14:57:05 +01:00
Sebastian Valle
dfddb12255
Merge pull request #2471 from lioncash/engine-upload
video_core/engines/engine_upload: Minor tidying
2019-05-19 08:54:42 -05:00
Sebastian Valle
f9ad88f9d7
Merge pull request #2484 from ReinUsesLisp/triangle-fan
maxwell_to_gl: Add TriangleFan primitive topology
2019-05-19 08:53:29 -05:00
Lioncash
e310d943b8 shader/shader_ir: Remove unnecessary inline specifiers
constexpr internally links by default, so the inline specifier is
unnecessary.
2019-05-19 08:23:15 -04:00
Lioncash
212b148923 shader/shader_ir: Simplify constructors for OperationNode
Many of these constructors don't even need to be templated. The only
ones that need to be templated are the ones that actually make use of
the parameter pack.

Even then, since std::vector accepts an initializer list, we can supply
the parameter pack directly to it instead of creating our own copy of
the list, then copying it again into the std::vector.
2019-05-19 08:23:14 -04:00
Lioncash
81e7e63080 shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable
These overloads don't actually make use of the parameter pack, so they
can be turned into regular non-template function overloads.
2019-05-19 08:23:14 -04:00
Lioncash
e09ee0ff23 shader/shader_ir: Mark tracking functions as const member functions
These don't actually modify instance state, so they can be marked as
const member functions
2019-05-19 08:23:09 -04:00
Lioncash
ce04ab38bb shader/shader_ir: Place implementations of constructor and destructor in cpp file
Given the class contains quite a lot of non-trivial types, place the
constructor and destructor within the cpp file to avoid inlining
construction and destruction code everywhere the class is used.
2019-05-19 04:02:02 -04:00
Lioncash
3356ea5bc2 gl_shader_gen: std::move objects where applicable
Avoids performing copies into the pair being returned. Instead, we can
just move the resources into the pair, avoiding the need to make copies
of both the std::string and ShaderEntries struct.
2019-05-19 03:46:54 -04:00
Lioncash
0a7f09a99b gl_shader_disk_cache: in-class initialize virtual file offset of ShaderDiskCacheOpenGL
Given the offset is assigned a fixed value in the constructor, we can
just assign it directly and get rid of the need to write the name of the
variable again in the constructor initializer list.
2019-05-19 02:55:18 -04:00
Lioncash
634b78a4c6 gl_shader_disk_cache: Default ShaderDiskCacheOpenGL's destructor in the cpp file
Given the disk shader cache contains non-trivial types, we should
default it in the cpp file in order to prevent inlining of the
complex destruction logic.
2019-05-19 02:50:50 -04:00
Lioncash
7fdc644c44 gl_shader_disk_cache: Make hash specializations noexcept
The standard library expects hash specializations that don't throw
exceptions. Make this explicit in the type to allow selection of better
code paths if possible in implementations.
2019-05-19 02:46:45 -04:00
Lioncash
683c4e523f gl_shader_disk_cache: Remove redundant code string construction in LoadDecompiledEntry()
We don't need to load the code into a vector and then construct a string
over the data. We can just create a string with the necessary size ahead
of time, and read the data directly into it, getting rid of an
unnecessary heap allocation.
2019-05-19 02:46:44 -04:00
Lioncash
5e4c227608 gl_shader_disk_cache: Make variable non-const in decompiled entry case
std::move does nothing when applied to a const variable. Resources can't
be moved if the object is immutable. With this change, we don't end up
making several unnecessary heap allocations and copies.
2019-05-19 02:46:44 -04:00
Lioncash
f417be9d3b gl_shader_disk_cache: Special-case boolean handling
Booleans don't have a guaranteed size, but we still want to have them
integrate into the disk cache system without needing to actually use a
different type. We can do this by supplying non-template overloads for
the bool type.

Non-template overloads always have precedence during function
resolution, so this is safe to provide.

This gets rid of the need to smatter ternary conditionals, as well as
the need to use u8 types to store the value in.
2019-05-19 02:46:38 -04:00
ReinUsesLisp
21ea8b2fcb gl_rasterizer: Limit OpenGL point size to a minimum of 1 2019-05-18 03:07:29 -03:00
ReinUsesLisp
52340c3294 maxwell_to_gl: Add TriangleFan primitive topology 2019-05-17 19:58:02 -03:00
ReinUsesLisp
a652e58c54 gl_rasterizer: Pass the right number of array quad vertices count 2019-05-17 17:08:34 -03:00
Fernando Sahmkow
fc975e9021 maxwell_3d: reduce sevirity of different component formats assert.
This was reduced due to happening on most games and at such constant
rate that it affected performance heavily for the end user. In general,
we are well aware of the assert and an implementation is already
planned.
2019-05-14 17:12:54 -04:00
Lioncash
b01cce716e video_core/engines/engine_upload: Amend constructor initializer list order
Silences a -Wreorder warning.
2019-05-14 13:43:28 -04:00
Lioncash
9b6d993e52 video_core/engines/engine_upload: Default destructor in the cpp file
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
2019-05-14 13:41:41 -04:00
Lioncash
ec1c69258a video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarations
These only apply in the definition of the function. They can be omitted
from the declaration.
2019-05-14 13:40:09 -04:00
Lioncash
0f83c8dffa video_core/engines/engine_upload: Remove unnecessary includes 2019-05-14 13:39:04 -04:00
Lioncash
5db1b54b58 video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()
We can use the named constant instead of using 32 directly.
2019-05-14 09:02:47 -04:00
Lioncash
48ce5880a0 video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
2019-05-14 08:53:19 -04:00
Lioncash
c212fc9b2c video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
2019-05-14 08:47:56 -04:00
Lioncash
d6d809db87 yuzu: Remove explicit types from locks where applicable
With C++17's deduction guides, the type doesn't need to be explicitly
specified within locking primitives anymore.
2019-05-14 08:18:48 -04:00
Lioncash
c5129a3a58 video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainer
std::move within a copy constructor (on a data member that isn't
mutable) will always result in a copy. Because of that, the behavior of
this copy constructor is identical to the one that would be generated
automatically by the compiler, so we can remove it.
2019-05-14 08:09:17 -04:00
Mat M
c4d549919f
Merge pull request #2462 from lioncash/video-mm
video_core/memory_manager: Minor tidying
2019-05-14 06:40:33 -04:00
Mat M
dadcf317dc
Merge pull request #2461 from lioncash/unused-var
video_core: Remove a few unused variables and functions
2019-05-14 06:36:26 -04:00
Rodrigo Locatti
940a71089d
Merge pull request #2413 from FernandoS27/opt-gpu
Rasterizer Cache: refactor flushing & optimize memory usage of surfaces
2019-05-13 23:01:59 -03:00
Sebastian Valle
9ef45f00bf
GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.
It seems instructions marked with the 'exit' flag will not cause an exit when executed within a delay slot.

This was hwtested by fincs.
2019-05-12 16:38:51 -05:00
Lioncash
716fbaef74 video_core/memory_manager: Mark IsBlockContinuous() as a const member function
Corrects the typo in its name and marks the function as a const member
function, given it doesn't actually modify memory manager state.
2019-05-09 19:14:36 -04:00
Lioncash
d4bcd006b2 video_core/memory_manager: Mark the constructor as explicit
Prevents implicit converting constructions of the memory manager.
2019-05-09 19:10:26 -04:00
Lioncash
fd12788967 video_core/memory_manager: Default the destructor within the cpp file
Makes the class less surprising when it comes to forward declaring the
type, and also prevents inlining the destruction code of the class,
given it contains non-trivial types.
2019-05-09 19:10:13 -04:00
Lioncash
53afe47cec video_core/memory_manager: Amend doxygen comments
Corrects references to non-existent parameters and corrects typos.
2019-05-09 19:09:19 -04:00
Lioncash
5235b053b4 video_core/memory_manager: Remove superfluous const from function declarations
These are able to be omitted from the declaration of functions, since
they don't do anything at the type system level. The definitions of the
functions can retain the use of const though, since they make the
variables immutable in the implementation of the function where they're
used.
2019-05-09 18:59:49 -04:00
Lioncash
b6408e9671 video_core/renderer_opengl/gl_shader_cache: Correct member initialization order
Silences a -Wreorder warning.
2019-05-09 18:55:47 -04:00
Lioncash
e43ba3acd4 video_core/shader/decode/texture: Remove unused variable from GetTld4Code() 2019-05-09 18:49:56 -04:00
Lioncash
e3c45b4338 renderer_vulkan/vk_shader_decompiler: Remove unused variable from DeclareInternalFlags() 2019-05-09 18:47:48 -04:00
Lioncash
175fe8aaeb video_core/renderer_opengl/gl_shader_decompiler: Remove unused Composite() function
This isn't used at all, so it can be removed.
2019-05-09 18:45:26 -04:00
Lioncash
6d28d288a3 video_core/renderer_opengl/gl_rasterizer_cache: Remove unused variable in UploadGLMipmapTexture()
This variable is unused entirely, so it can be removed.
2019-05-09 18:42:48 -04:00
Lioncash
ba165b1092 video_core/gpu_thread: Remove unused local variable
Instead of retrieving the data from the std::variant instance, we can
just check if the variant contains that type of data.

This is essentially the same behavior, only it returns a bool indicating
whether or not the type in the variant is currently active, instead of
actually retrieving the data.
2019-05-09 18:39:21 -04:00
Lioncash
c56d893e77 video_core/textures/astc: Remove unused variables
Silences a few compilation warnings.
2019-05-09 18:33:36 -04:00
bunnei
daca045fcd
Merge pull request #2442 from FernandoS27/astc-fix
Fix Layered ASTC Textures
2019-05-09 13:23:14 -04:00
bunnei
f69d3a6351
Merge pull request #2443 from ReinUsesLisp/skip-repeated-variants
gl_shader_disk_cache: Skip stored shader variants instead of asserting
2019-05-09 13:22:42 -04:00
bunnei
c27b81cb85
Merge pull request #2429 from FernandoS27/compute
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
Fernando Sahmkow
3a08c3207b Correct possible error on Rasterizer Caches
There was a weird bug that could happen if the object died directly and
the cache address wasn't stored.
2019-05-07 12:33:10 -04:00
Lioncash
9e15193ef8
shader/decode/texture: Remove unused variable
This isn't used anywhere, so we can get rid of it.
2019-05-04 02:10:38 -04:00
Lioncash
08b270676b
gl_rasterizer: Silence unused variable warning
Makes use of src, so it's not considered unused.
2019-05-04 02:00:17 -04:00
ReinUsesLisp
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
5321cdd276 gl_shader_decompiler: Skip physical unused attributes 2019-05-02 21:46:37 -03:00
ReinUsesLisp
28bffb1ffa shader_ir/memory: Assert on non-32 bits ALD.PHYS 2019-05-02 21:46:25 -03:00
ReinUsesLisp
fe700e1856 shader: Add physical attributes commentaries 2019-05-02 21:46:25 -03:00
ReinUsesLisp
c6f9e651b2 gl_shader_decompiler: Implement GLSL physical attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
b7d412c99b gl_shader_decompiler: Abstract generic attribute operations 2019-05-02 21:46:25 -03:00
ReinUsesLisp
bd81a03d9d gl_shader_decompiler: Declare all possible varyings on physical attribute usage 2019-05-02 21:46:25 -03:00
ReinUsesLisp
06b363c9b5 shader: Remove unused AbufNode Ipa mode 2019-05-02 21:46:25 -03:00
ReinUsesLisp
002ecbea19 shader_ir/memory: Emit AL2P IR 2019-05-02 21:46:25 -03:00
ReinUsesLisp
7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
Fernando Sahmkow
e64c41efe8 Refactors and name corrections. 2019-05-01 15:31:39 -04:00
ReinUsesLisp
4aa081b4e7 gl_shader_disk_cache: Skip stored shader variants instead of asserting
Instead of asserting on already stored shader variants, silently skip them.
This shouldn't be happening but when a shader is invalidated and it is
not stored in the shader cache, this assert would hit and save that
shader anyways when the asserts are disabled.
2019-05-01 00:36:11 -03:00
Fernando Sahmkow
95261639fb Fix Layered ASTC Textures
By adding the missing layer offset in ASTC compression.
2019-04-30 23:02:31 -04:00
bunnei
79e54abe19
Merge pull request #2100 from FreddyFunk/disk-cache-precompiled-file
gl_shader_disk_cache: Improve precompiled shader cache generation speed and size
2019-04-30 19:24:01 -04:00
bunnei
91e239d66f
Merge pull request #2435 from ReinUsesLisp/misc-vc
shader_ir: Miscellaneous fixes
2019-04-28 22:29:43 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
bunnei
9a3737120d
Merge pull request #2423 from FernandoS27/half-correct
Corrections on Half Float operations: HADD2 HMUL2 and HFMA2
2019-04-28 22:24:22 -04:00
ReinUsesLisp
2156e52014 shader_ir: Move Sampler index entry in operand< to sort declarations 2019-04-26 01:13:05 -03:00
ReinUsesLisp
b77b4b76bb shader_ir: Add missing entry to Sampler operand< comparison 2019-04-26 01:11:24 -03:00
ReinUsesLisp
0b91087a1e shader_ir/texture: Fix sampler const buffer key shift 2019-04-26 01:09:29 -03:00
FreddyFunk
1a3ff252a4 Re added new lines at the end of files 2019-04-23 23:19:28 +02:00
unknown
3091b40691 gl_shader_disk_cache: Compress precompiled shader cache file with Zstandard 2019-04-23 22:24:31 +02:00
unknown
9db2c734c9 gl_shader_disk_cache: Use VectorVfsFile for the virtual precompiled shader cache file 2019-04-23 22:24:23 +02:00
unknown
3fe542cf60 gl_shader_disk_cache: Remove per shader compression 2019-04-23 21:40:01 +02:00
Fernando Sahmkow
b3118ee316 Fixes and Corrections to DMA Engine 2019-04-23 15:28:18 -04:00
Hexagon12
8df9449bb8
Merge pull request #2422 from ReinUsesLisp/fixup-samplers
gl_state: Fix samplers memory corruption
2019-04-23 18:30:35 +03:00