Report on pipeline and SPIR-V persistent cache implementation #6268
CLV-Iclucia
started this conversation in
Show and tell
I implemented persistent save/load APIs for Vulkan VkPipelineCache to enable disk serialization and reuse across application runs.
In addition, I designed and integrated a separate on-disk cache for compiled SPIR-V binaries, ensuring that redundant shader compilations are avoided and pipeline build efficiency is significantly improved across runs.
This feature is still far from complete, and many problems remain to be solved.
Technical Learning
I read the API documentation and guides for `VkPipelineCache`, and I read this blog post to learn the best practices for storing, loading and validating a pipeline cache. I also studied the NCNN source code to understand the whole process from compiling shaders to building the final compute pipeline, especially the current cache mechanism that computes a key for each pipeline to avoid repeated creation during a single run.
Changes Introduced
- `int PipelineCache::load_pipeline_cache(const char* path)`: instructs the `PipelineCache` object to load a pipeline cache file from `path`. Returns `0` if loading succeeds and a nonzero value otherwise. If loading fails, `PipelineCache` will try to use an empty `VkPipelineCache` object to create pipelines.
- `int PipelineCache::save_pipeline_cache(const char* path)`: instructs the `PipelineCache` object to save its `VkPipelineCache` object as a file at `path`. Returns `0` if the file is saved successfully and a nonzero value otherwise.
- `void PipelineCache::set_shader_cache_dir(const char* dir)`: sets the SPIR-V code cache directory used by the `PipelineCache` object to `dir`. All the SPIR-V code produced during pipeline creation is saved under `dir`. When compiling shaders, `PipelineCache` first looks for a cached file in this directory so that compilation can be skipped. If not specified, the default cache directory is `$LOCALAPPDATA/ncnn/shadercache` on Windows and `$HOME/.ncnn/shadercache` on other platforms. Returns a nonzero value on failure.
- `int PipelineCache::clear_shader_cache() const`: clears the current SPIR-V code cache directory. Returns a nonzero value on failure.
- `VulkanDevice::create_pipeline`: adds an argument of type `VkPipelineCache*` so that a `VkPipeline` can be created using a `VkPipelineCache`.
- `int VulkanDevice::create_empty_pipeline_cache(VkPipelineCache* vk_pipeline_cache)`: creates a `VkPipelineCache` object with empty data. Returns a nonzero value on failure.
- `int VulkanDevice::create_pipeline_cache_with_data(const void* initial_data, size_t data_size, VkPipelineCache* vk_pipeline_cache)`: creates a `VkPipelineCache` object with initial data starting at `initial_data` and spanning `data_size` bytes. Returns a nonzero value on failure.
- `test_pipeline_cache`: a simple test of the pipeline cache functionality.
Implementation details:
I use `vkGetPipelineCacheData` to get the pipeline cache data binary and combine it with a file header for validation. The header design basically follows the practice in this blog post but adds `version` and `reserved` fields for possible future compatibility.
The SPIR-V cache file is designed in the same way, with its own header. The design of that header is explained later.
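A minimal sketch of what such a validation header could look like. Apart from `version` and `reserved`, which are mentioned above, every field name, the magic value, the `DeviceIdentity` helper and the choice of FNV-1a as the hash are assumptions made for illustration, not the layout used in the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical file header stored in front of the vkGetPipelineCacheData blob.
struct PipelineCacheFileHeader
{
    uint32_t magic;          // fixed tag identifying the file type (assumed)
    uint32_t version;        // file format version, for future compatibility
    uint32_t data_size;      // size of the cache blob in bytes
    uint32_t data_hash;      // hash of the cache blob
    uint32_t vendor_id;      // VkPhysicalDeviceProperties::vendorID
    uint32_t device_id;      // VkPhysicalDeviceProperties::deviceID
    uint32_t driver_version; // VkPhysicalDeviceProperties::driverVersion
    uint8_t cache_uuid[16];  // VkPhysicalDeviceProperties::pipelineCacheUUID
    uint32_t reserved;       // kept zero for now
};

// Hypothetical bundle of the device properties that the header records.
struct DeviceIdentity
{
    uint32_t vendor_id;
    uint32_t device_id;
    uint32_t driver_version;
    uint8_t cache_uuid[16];
};

static const uint32_t kMagic = 0x43504E4Eu; // arbitrary value for this sketch
static const uint32_t kFormatVersion = 1;

// 32-bit FNV-1a, used here only as a stand-in content hash.
static uint32_t fnv1a32(const void* data, size_t size)
{
    const uint8_t* p = static_cast<const uint8_t*>(data);
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < size; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Build the header that would be written in front of the blob.
static PipelineCacheFileHeader make_header(const DeviceIdentity& dev, const void* data, size_t size)
{
    assert(size <= 0xFFFFFFFFu);
    PipelineCacheFileHeader h;
    std::memset(&h, 0, sizeof(h));
    h.magic = kMagic;
    h.version = kFormatVersion;
    h.data_size = static_cast<uint32_t>(size);
    h.data_hash = fnv1a32(data, size);
    h.vendor_id = dev.vendor_id;
    h.device_id = dev.device_id;
    h.driver_version = dev.driver_version;
    std::memcpy(h.cache_uuid, dev.cache_uuid, sizeof(h.cache_uuid));
    return h;
}

// Accept the blob only if it was written by this format, for this device and
// driver, and its size and hash still match.
static bool header_matches(const PipelineCacheFileHeader& h, const DeviceIdentity& dev, const void* data, size_t size)
{
    if (h.magic != kMagic || h.version != kFormatVersion) return false;
    if (h.data_size != size) return false;
    if (h.vendor_id != dev.vendor_id || h.device_id != dev.device_id) return false;
    if (h.driver_version != dev.driver_version) return false;
    if (std::memcmp(h.cache_uuid, dev.cache_uuid, sizeof(h.cache_uuid)) != 0) return false;
    return h.data_hash == fnv1a32(data, size);
}
```

When `header_matches` fails, the natural fallback is the one described for `load_pipeline_cache`: start from an empty `VkPipelineCache` instead of passing the stale blob to the driver.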
When compiling shader code, `PipelineCache` first computes a key from multiple options. It uses the key to search the internal in-memory cache (a `std::map`). If that fails, it uses the decimal string of the key as a file name and looks for a cache file in the cache directory. If the file is found, it loads the code and stores it in the internal cache.
Usage examples
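The lookup order described above (in-memory map, then a file named after the decimal key, then compilation) can be sketched with a small stand-in class. `SpirvCacheSketch`, its method names and the `compile` callback are hypothetical and are not the NCNN API. The sketch also assumes the cache directory already exists, and it skips the header validation a real cache file would need.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <functional>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the two-level lookup; not NCNN's PipelineCache.
class SpirvCacheSketch
{
public:
    typedef std::vector<uint32_t> Spirv;
    typedef std::function<Spirv()> CompileFn;

    explicit SpirvCacheSketch(const std::string& dir) : dir_(dir) {}

    const Spirv& get(uint64_t key, const CompileFn& compile)
    {
        assert(compile != nullptr);

        // 1. in-memory cache
        std::map<uint64_t, Spirv>::iterator it = memory_.find(key);
        if (it != memory_.end()) return it->second;

        // 2. on-disk cache, file name is the decimal string of the key
        Spirv code;
        if (!read_file(path_for(key), code))
        {
            // 3. compile, then store the result for later runs
            code = compile();
            write_file(path_for(key), code);
        }

        Spirv& slot = memory_[key];
        slot.swap(code);
        return slot;
    }

    // Drop one entry from both the in-memory map and the cache directory.
    void erase(uint64_t key)
    {
        memory_.erase(key);
        std::remove(path_for(key).c_str());
    }

private:
    std::string path_for(uint64_t key) const
    {
        return dir_ + "/" + std::to_string(static_cast<unsigned long long>(key));
    }

    static bool read_file(const std::string& path, Spirv& out)
    {
        std::ifstream in(path.c_str(), std::ios::binary);
        if (!in) return false;
        std::vector<char> bytes((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
        // SPIR-V is a stream of 32-bit words
        if (bytes.empty() || bytes.size() % sizeof(uint32_t) != 0) return false;
        out.resize(bytes.size() / sizeof(uint32_t));
        std::memcpy(out.data(), bytes.data(), bytes.size());
        return true;
    }

    static void write_file(const std::string& path, const Spirv& code)
    {
        std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
        out.write(reinterpret_cast<const char*>(code.data()),
                  static_cast<std::streamsize>(code.size() * sizeof(uint32_t)));
    }

    std::string dir_;
    std::map<uint64_t, Spirv> memory_;
};
```

A caller would construct one object per cache directory and call `get(key, compile)` for each shader. The callback only runs when both cache levels miss, which is what lets a second application run skip compilation entirely.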
Problems and solutions
1. Bottleneck of pipeline creation
I found that simply using `VkPipelineCache` can accelerate `vkCreateComputePipelines` greatly, but the cost of this step seems insignificant compared with shader compilation. So I also had to implement a SPIR-V shader cache to accelerate the compilation step.
2. Cross-platform file operations
The project currently uses C++11, which does not provide a unified API for filesystem operations such as renaming or removing files. As a result, platform-specific implementations were required inside `pipelinecache.cpp`.
For now, I implemented platform-specific handling directly in `pipelinecache.cpp` for minimal changes. A possible future improvement would be to abstract these into a dedicated cross-platform file utility module (similar to how some projects adopt a `filesystem.h` wrapper).
3. SPIR-V cache invalidation strategy
The content of compiled SPIR-V binaries can change due to multiple factors:
If these are not accounted for, stale SPIR-V caches may cause incorrect or incompatible pipelines.
I introduced an `ncnn_version` field in the SPIR-V cache header. This field should be updated whenever relevant changes are introduced (e.g., new GLSLang versions or internal NCNN changes affecting shader compilation). On load, the version is validated, and outdated caches are discarded. However, I do not believe this is the best practice. Updating this field automatically in the build system would probably be better.
4. Testing and API exposure
There is a tradeoff between providing flexible testing APIs for SPIR-V cache and keeping the public API surface minimal. Exposing too many low-level cache file operations complicates the API, while hiding them makes unit testing difficult.
The only thing I can do is use two hashes but that is far from enough.
5. Security of SPIR-V cache files
If a SPIR-V cache file is maliciously altered, and both the content and the hash are manipulated, the cache may load compromised shaders and use them to create false pipelines.
The only thing I can do is use multiple hash codes but that is far from enough.
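The load-time checks from sections 3 and 5 (version validation and content hashes) could be combined roughly as follows. Only `ncnn_version` is taken from the text above. The other field names, the magic value, and the use of 32-bit and 64-bit FNV-1a as the two hashes are assumptions for this sketch. As noted above, such hashes only catch corruption and stale files. They do not stop an attacker who rewrites the hashes along with the content.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical SPIR-V cache file header; only ncnn_version is named in the text.
struct SpirvCacheHeader
{
    uint32_t magic;
    uint32_t ncnn_version; // bumped whenever shader compilation output may change
    uint32_t code_size;    // payload size in bytes
    uint32_t hash32;       // first content hash
    uint64_t hash64;       // second, independent content hash
};
static_assert(sizeof(SpirvCacheHeader) == 24, "header is expected to have no padding");

static const uint32_t kSpirvCacheMagic = 0x56525053u; // arbitrary value for this sketch

static uint32_t fnv1a32(const uint8_t* p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

static uint64_t fnv1a64(const uint8_t* p, size_t n)
{
    uint64_t h = 14695981039346656037ull;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ull; }
    return h;
}

// Serialize header + SPIR-V words into one buffer ready to be written to disk.
static std::vector<uint8_t> pack_spirv_cache(const std::vector<uint32_t>& code, uint32_t ncnn_version)
{
    const uint8_t* bytes = reinterpret_cast<const uint8_t*>(code.data());
    const size_t size = code.size() * sizeof(uint32_t);
    assert(size <= 0xFFFFFFFFu);

    SpirvCacheHeader h;
    h.magic = kSpirvCacheMagic;
    h.ncnn_version = ncnn_version;
    h.code_size = static_cast<uint32_t>(size);
    h.hash32 = fnv1a32(bytes, size);
    h.hash64 = fnv1a64(bytes, size);

    std::vector<uint8_t> file(sizeof(h) + size);
    std::memcpy(file.data(), &h, sizeof(h));
    if (size) std::memcpy(file.data() + sizeof(h), bytes, size);
    return file;
}

// Returns false (cache entry should be discarded) on any mismatch.
static bool unpack_spirv_cache(const std::vector<uint8_t>& file, uint32_t current_version, std::vector<uint32_t>& code)
{
    SpirvCacheHeader h;
    if (file.size() < sizeof(h)) return false;
    std::memcpy(&h, file.data(), sizeof(h));

    if (h.magic != kSpirvCacheMagic) return false;
    if (h.ncnn_version != current_version) return false; // stale cache
    if (h.code_size != file.size() - sizeof(h)) return false;
    if (h.code_size % sizeof(uint32_t) != 0) return false;

    const uint8_t* payload = file.data() + sizeof(h);
    if (fnv1a32(payload, h.code_size) != h.hash32) return false;
    if (fnv1a64(payload, h.code_size) != h.hash64) return false;

    code.resize(h.code_size / sizeof(uint32_t));
    if (h.code_size) std::memcpy(code.data(), payload, h.code_size);
    return true;
}
```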
Performance
In a single pipeline creation test on my PC, using the two caches across runs (simulated by repeatedly creating and destroying the GPU) reduced the time taken to create a pipeline from 90 ms to 0.4 ms, roughly a 225x speedup. Most of this gain comes from the SPIR-V code cache.
The CPU is an AMD Ryzen 7 5800H and the GPU is an Nvidia RTX 3060.