@Weasel To solve the lag issue, as explained before, the last parameter (hashesNumber) is in charge of cutting the most intensive job to pieces.
If your graphic card is tied to your display (if it's the one your system use to render your desktop and active windows), a value of 4096 won't let free time to your OS to send desktop refresh rendering. So the system will simply kill your process (there is a watchdog for that on every modern OSes).
So, in brief, you don't have to lower the other parameters, just the last one, to a very low value (like 4 for example). There will only be more orders sent from the plotter to the GPU, but it shouldn't affect the whole througput.
The best thing to do, of course, is to dedicate the graphic card to the plotter. Use you CPU graphic card (Intel graphics most likely) to render your desktop.
Once you don't experience lags anymore, you can begin to touch to the other parameters to enhance the overall nonces/min. That part is entirely GPU dependant. From my tests, the globalWorkSize should be near the maxAllocationSize. For the localWorkSize, just try powers of 2 until you find the best value.