- Previous optimized impl. resulted in an integer overflow, so revert.
- This is our slow/fallback path that should never be really be used, so the optimization in unimportant.
Allow sharing return types with the rest of the code base. For example,
we use 'u128 = std::array<u64, 2>', meanwhile Google's code uses
'uint128 = std::pair<u64, u64>'.
While we are at it, use size_t instead of std::size_t.
We can use the standardized CLZ facilities to perform this. This also
allows us to make utilizing functions constexpr and eliminate the
inclusion of an intrinsics header.
Setting __GL_SHADER_DISK_CACHE_PATH we can force the cache directory to
be in yuzu's user directory to stop commonly distributed malware from
deleting our driver shader cache. And by setting
__GL_SHADER_DISK_CACHE_SKIP_CLEANUP we can have an unbounded shader
cache size.
This has only been implemented on Windows, mostly because previous tests
didn't seem to work on Linux.
Disable the precompiled cache on Nvidia's driver. There's no need to
hide information the driver already has in its own cache.
Reworks the tree header to operate off of templates as opposed to a
series of defines.
This allows all tree facilities to obey namespacing rules, and also
allows this code to be used within modules once compiler support is in
place.
This also gets rid to use a macro to define functions and structs for
necessary data types. With templates, these will be generated when
they're actually used, eliminating the need for the separate
declaration.
Squash attributes into the pointer's integer, making them an uintptr_t
pair containing 2 bits at the bottom and then the pointer. These bits
are currently unused thanks to alignment requirements.
Configure Dynarmic to mask out these bits on pointer reads.
While we are at it, remove some unused attributes carried over from
Citra.
Read/Write and other hot functions use a two step unpacking process that
is less readable to stop MSVC from emitting an extra AND instruction in
the hot path:
mov rdi,rcx
shr rdx,0Ch
mov r8,qword ptr [rax+8]
mov rax,qword ptr [r8+rdx*8]
mov rdx,rax
-and al,3
and rdx,0FFFFFFFFFFFFFFFCh
je Core::Memory::Memory::Impl::Read<unsigned char>
mov rax,qword ptr [vaddr]
movzx eax,byte ptr [rdx+rax]
- For `std::same_as`, add missing include of `<concepts>`.
- For `std::convertible_to`, create a replacement in `common/concepts.h`
and use that instead.
This would also be found in `<concepts>`, but unlike `std::same_as`,
`std::convertible_to` is not yet implemented in libc++, LLVM's STL
implementation - not even in master. (In fact, `std::same_as` is the
*only* concept currently implemented. For some reason.)
Fixes regression by 761206cf81, causing
yuzu to not build on Linux with any version of Boost except a cached
1.73 Conan version from before about a day ago.
Moves the Boost requirement out of the `REQUIRED_LIBS` psuedo-2D-array
for Conan to instead be manually configured, using Conan as a fallback
solution if the system does not meet our requirements.
Requires any update from the linux-fresh container in order to build.
**DO NOT MERGE** until someone with the MSVC toolchain can verify this
works there, too.
Fix CreateFullPath to have its intended previous behavior (whatever
that was), and deprecate it in favor of the new CreateDirs function.
Unlike CreateDir, CreateDirs is marked as [[nodiscard]] to avoid new
code ignoring its result value.
Converts creation and deletion functions over to std::filesystem,
simplifying our file-handling code.
Notably with this, CopyDir will now function on Windows.
Add a std::bit_cast-like function archiving the same runtime results as
the standard function, without compile time support.
This allows us to use bit_cast while we wait for compiler support, it
can be trivially replaced in the future.
VirtualBuffer makes use of VirtualAlloc (on Windows) and mmap() (on
other platforms). Neither of these ensure that non-trivial objects are
properly constructed in the allocated memory.
To prevent potential undefined behavior occurring due to that, we can
add a static assert to loudly complain about cases where that is done.
Makes page tables and virtual buffers able to be moved, but not copied,
making the interface more flexible.
Previously, with the destructor specified, but no move assignment or
constructor specified, they wouldn't be implicitly generated.
Hides all of the implementation details for users of the class. This has
the benefit of reducing includes and also making the fiber classes
movable again.
Allows our CI to catch more potential bugs. This also removes the
[[nodiscard]] attribute of IOFile's Open member function. There are
cases where a file may want to be opened, but have the status of it
checked at a later time.
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>