4D Noise v1.2
13.1, 12.2, 12.1, 12.0, 11.3, 11.2, 11.1, 11.0, 10 or later
Linux, Mac, Windows
This is a port of the 4D simplex noise found at https://github.com/Draradech/csworldgen/blob/master/simplexnoise.cpp
It uses the image values of the input image to generate the noise. It is not fast, but it does the job quite well.
[Version 1.1] - Fixed a compile error on Linux and Mac
[Version 1.2] - Optimized for certain Nvidia GPUs (thanks to Nikos Koutsikos, Lead Software Engineer at Foundry)
Nikos Koutsikos made some significant improvements to this noise node that, on some GPUs, make it an order of magnitude faster than version 1.1. To summarize the improvements, I'll let him explain:
This kernel has regressed significantly in performance on NVIDIA GPUs since Nuke 12.1. I have investigated this, and it turns out that our switch of BlinkScript from OpenCL to CUDA has actually triggered much worse performance on this particular kernel.
After looking into the problem, I managed to narrow it down to the three arrays that are defined in the raw_noise_4d() function: simplex, perm and grad4. Since they are defined and initialised within the function, the generated code actually initialises those arrays for every pixel. And while most compilers would be able to optimise that, the CUDA compiler cannot, making the kernel much slower than it needs to be (and actually slower than the CPU).
I have written an optimised version of this kernel, which stores these arrays in locals and initialises them in init(), which makes the kernel much faster on NVIDIA cards. For example, on my Ampere card the original kernel at 4K was running at 2 FPS, and the optimised version runs at 15.5 FPS. Profiling the kernel shows a massive difference: initially the kernel execution was taking 580ms, and the optimised version takes only 4ms!
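The change Nikos describes can be sketched roughly in plain C++. This is only an illustration of the pattern, not the actual BlinkScript kernel: the 8-entry table, its values, and the names lookup_per_call, NoiseKernelSketch and lookup are all made up for the example, and the real simplex, perm and grad4 arrays are much larger.

```cpp
#include <array>

// Hypothetical 8-entry table standing in for simplex / perm / grad4.
// Slow pattern: the array is defined and initialised inside the function,
// so a compiler that cannot hoist it will rebuild it on every call,
// which in a kernel means once per pixel.
inline int lookup_per_call(int i) {
    const int table[8] = {3, 1, 4, 1, 5, 9, 2, 6};
    return table[i & 7];
}

// Faster pattern: the table lives in kernel-local storage and is filled
// once in init(), so the per-pixel work is only the lookup itself.
struct NoiseKernelSketch {
    std::array<int, 8> table{};  // plays the role of a "local" array

    // Called once before any pixels are processed.
    void init() {
        const int values[8] = {3, 1, 4, 1, 5, 9, 2, 6};
        for (int i = 0; i < 8; ++i) {
            table[i] = values[i];
        }
    }

    // Called per pixel; reads the prebuilt table.
    int lookup(int i) const {
        return table[i & 7];
    }
};
```

Both versions return the same values. The only difference is where the table is built, which is the part the quoted note identifies as the bottleneck under CUDA.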