Commit graph

2835 commits

Author SHA1 Message Date
Fernando Sahmkow
da440da9f5 Memory Tracking: Optimize tracking to only use atomic writes when contested with the host GPU 2023-06-28 21:32:45 +02:00
ameerj
4f160633d3 OpenGL: Limit lmem warmup to NVIDIA
🐸
2023-06-25 19:06:51 -04:00
ameerj
405eae3734 shaders: Track local memory usage 2023-06-25 18:59:33 -04:00
ameerj
82107b33a2 OpenGL: Add Local Memory warmup shader 2023-06-25 18:43:23 -04:00
Kelebek1
5da70f7197 Remove memory allocations in some hot paths 2023-06-22 08:05:10 +01:00
Fernando S
27a36cd51b
Merge pull request #10744 from Wollnashorn/af-for-all
video_core: Improved anisotropic filtering heuristics
2023-06-18 00:02:05 +02:00
Wollnashorn
2dc0ff79ec video_core: Use sampler IDs instead pointers in the pipeline config
The previous approach of storing pointers returned by `GetGraphicsSampler`/`GetComputeSampler` caused UB, as these functions can cause reallocation of the sampler slot vector and therefore invalidate the pointers
2023-06-16 13:45:14 +02:00
Wollnashorn
a3b7b5b22a video_core: Fallback to default anisotropy instead to 1x anisotropy 2023-06-15 23:16:26 +02:00
Wollnashorn
3e8cd91d54 video_core: Fixed compilation errors because of name shadowing 2023-06-15 18:46:40 +02:00
Wollnashorn
42c944b250 video_core: Add per-image anisotropy heuristics (format & mip count) 2023-06-15 18:19:32 +02:00
Liam
2c01669046 video_core: preallocate fewer IR blocks 2023-06-14 21:37:57 -04:00
Morph
925586f97b buffer_cache_base: Specify buffer type in HostBindings
Avoid reinterpret-casting from void pointer since the type is already known at compile time.
2023-06-13 00:59:42 -04:00
Matías Locatti
42b2bc204f
Merge pull request #10699 from liamwhite/conditional-barrier
shader_recompiler: remove barriers in conditional control flow when device lacks support
2023-06-12 16:50:59 -03:00
bunnei
ad8f122ab1
Merge pull request #10693 from liamwhite/f64-to-f32
shader_recompiler: translate f64 to f32 when unsupported on host
2023-06-12 12:46:54 -07:00
Liam
2f1e87dd83 shader_recompiler: translate f64 to f32 when unsupported on host 2023-06-10 12:38:49 -04:00
Liam
2bb7ea436d shader_recompiler: remove barriers in conditional control flow when device lacks support 2023-06-10 12:30:39 -04:00
Kelebek1
ace6c2318b Combine vertex/transform feedback buffer binding into a single call 2023-06-08 12:13:27 +01:00
liamwhite
cfb76d8f3e
Merge pull request #10476 from ameerj/gl-memory-maps
OpenGL: Make use of persistent buffer maps in buffer cache
2023-06-07 14:03:57 -04:00
bunnei
ae099d583c core: frontend: Refactor GraphicsContext to its own module. 2023-06-03 00:05:31 -07:00
liamwhite
381caf4c00
Merge pull request #10483 from ameerj/gl-cpu-astc
gl_texture_cache: Fix ASTC CPU decoding with compression disabled
2023-05-28 13:18:31 -04:00
ameerj
ea2e155b0b gl_texture_cache: Fix ASTC CPU decoding with compression disabled
gl_format was incorrectly being overwritten when compression was disabled
2023-05-28 13:14:51 -04:00
ameerj
cb0a410907 gl_staging_buffers: Optimization to reduce fence waiting 2023-05-28 00:38:47 -04:00
ameerj
642c14f0c7 OpenGL: Make use of persistent buffer maps in buffer cache downloads
Persistent buffer maps were already used by the texture cache, this extends their usage for the buffer cache.

In my testing, using the memory maps for uploads was slower than the existing "ImmediateUpload" path, so the memory map usage is limited to downloads for the time being.
2023-05-28 00:38:46 -04:00
Kelebek1
b0bea13ed8 Move buffer bindings to per-channel state 2023-05-27 17:04:18 +01:00
Fernando S
76f6388969
Merge pull request #10398 from liamwhite/bcn
video_core: add ASTC recompression
2023-05-24 03:55:45 +02:00
Liam
415c78b87c textures: add BC1 and BC3 compressors and recompression setting 2023-05-23 12:54:40 -04:00
Liam
8758932031 renderer_vulkan: barrier attachment feedback loops 2023-05-22 18:10:16 -04:00
Fernando Sahmkow
8a214e5530 Texture Cache: Fix ASTC textures 2023-05-09 02:42:10 +02:00
Fernando Sahmkow
8014dd8259 Texture cache: Only force flush the dma downloads 2023-05-07 23:46:12 +02:00
Fernando Sahmkow
c6cac2ffaa GPU: Add Reactive flushing 2023-05-07 23:46:12 +02:00
Kelebek1
ca6bf06ef7 Log object names with debug renderer, add a GPU address to ImageViews 2023-05-06 04:48:32 +01:00
bunnei
055ee84024
Merge pull request #10142 from FernandoS27/missing-astc
GPU: implement missing ASTC
2023-05-03 16:49:27 -07:00
bunnei
a661c547d8
Merge pull request #10088 from FernandoS27/100-gelato-flavor-test-builds-later
Y.F.C Implement Asynchronous Fence manager and Rework Query async downloads
2023-05-03 15:10:22 -07:00
Fernando Sahmkow
87a9be8dec GPU: implement missing ASTC 2023-05-03 11:33:28 -04:00
Morph
47938541c2
Merge pull request #10084 from FernandoS27/yuzu-goes-broom-broom
Y.F.C Buffer Cache Revamp
2023-05-01 11:08:02 -04:00
Fernando Sahmkow
4bc5469f52 Texture Cache: Release stagging buffers on tick frame 2023-04-29 15:31:38 +02:00
Fernando Sahmkow
a16c261131 Buffer Cache: Fully rework the buffer cache. 2023-04-29 00:46:31 +02:00
Fernando Sahmkow
e3a2ca96bd Accelerate DMA: Use texture cache async downloads to perform the copies
to host.

WIP
2023-04-29 00:18:21 +02:00
Fernando Sahmkow
3fbee093b2 TextureCache: refactor DMA downloads to allow multiple buffers. 2023-04-29 00:18:21 +02:00
Fernando Sahmkow
e29ced29fa QueryCache: rework async downloads. 2023-04-23 22:04:14 +02:00
Fernando Sahmkow
fca72beb2d Fence Manager: implement async fence management in a sepparate thread. 2023-04-23 04:48:50 +02:00
Wollnashorn
fe91066f46 video_core: Enable ImageGather with subpixel offset on Intel 2023-04-08 16:12:44 +02:00
Wollnashorn
780240e697 shader_recompiler: Add subpixel offset for correct rounding at ImageGather
On AMD a subpixel offset of 1/512 of the texel size is applied to the texture coordinates at a ImageGather call to ensure the rounding at the texel centers is done the same way as in Maxwell or other Nvidia architectures.
See https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/ for more details why this might be necessary.

This should fix shadow artifacts at object edges in Zelda: Breath of the Wild (#9957, #6956).
2023-04-08 16:12:30 +02:00
liamwhite
638044820d
Merge pull request #9943 from vonchenplus/gentleman
video_core: Fix inline_index and draw_texture error
2023-03-13 13:45:17 -04:00
Liam
600f325d87 general: fix spelling mistakes 2023-03-12 11:33:01 -04:00
FengChen
44f10c8dee video_core: Fix ogl status error when draw_texture 2023-03-12 13:33:31 +08:00
Fernando S
49643d8134
Merge pull request #9913 from ameerj/acc-dma-refactor
AccelerateDMA: Refactor Buffer/Image copy code and implement for OGL
2023-03-11 20:04:19 +01:00
liamwhite
103380134f
Merge pull request #9925 from ameerj/gl-sync-signal
OpenGL: Prefer glClientWaitSync for OGLSync objects
2023-03-10 13:55:22 -05:00
ameerj
03137086db OpenGL: Prefer glClientWaitSync for OGLSync objects
At least on Nvidia, glClientWaitSync with a timeout of 0 (non-blocking) is faster than glGetSynciv of GL_SYNC_STATUS.
2023-03-08 20:29:25 -05:00
liamwhite
3cf88a4d6c
Merge pull request #9896 from Kelebek1/d24s8
Check all swizzle components for red, not just [0]
2023-03-08 09:16:06 -05:00