why not just install one of the aforementioned profiling apps for either ati or nv and grab a frame, any frame? it should be a consistent post pass ffs, and I bet it actually sticks out like a sore thumb!!!
as kb said: 16 fucking ms! ;) i told you all, but you flame instead of trying! it's your own fault!
btw, you will thank peeps for mentioning this, you will see the gpu frame in all its glory! It is even useful!
you know how much 16ms is?
Just to conclude - the thread is over.
this thread is about optimizing, that's why i answered mu6k ;)
16 ms is 16/1000 of a second! that makes it depend on the GPU any way you look at it!
las: you FORGOT that smash had no source at that time and just made up some numbers for "xy" :p ...that's why i posted it (not only!) and so his measurements are for your anus!
1000/60 equals 16.6 repeating (ms) per frame....so i am losing one frame each HyonoGlow! :p that's ALMOST nothing! if the effect runs at 200+ frames on my ATI HD5870!
hardy: no no, you can't be running at 200+ fps when one part of the frame alone takes 16.6ms ..
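To put numbers on that point, here is a minimal sketch of the frame-time arithmetic. The helper names `frame_ms` and `max_fps` are made up for illustration; the 16 ms and 4.5 ms figures are the ones quoted in this thread.

```python
def frame_ms(fps):
    """Time available for one whole frame, in milliseconds."""
    return 1000.0 / fps

def max_fps(pass_ms):
    """Upper bound on frame rate if a single pass alone costs pass_ms."""
    return 1000.0 / pass_ms

print(round(frame_ms(60), 2))   # budget for a whole frame at 60 fps
print(round(frame_ms(200), 2))  # budget for a whole frame at 200 fps
print(round(max_fps(16.0), 1))  # frame rate ceiling with a 16 ms post pass
print(round(max_fps(4.5), 1))   # frame rate ceiling with a 4.5 ms post pass
```

At 200 fps the entire frame has to fit in 5 ms, so a pass that takes 16 ms on the same machine cannot be part of it.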
ok, got it finally! i am too foughten to think normal right now! ;) if 16ms is one frame, it takes one frame to do the postFX alone...i see! but it doesn't apply here somehow! it's more like .016ms!
i guess kb's measurements apply to everything in that scene. that would make sense at least!
Yeah, Smash's 4.5ms on a state of the art GPU and my 16 on a midrange one, both in line with what one would expect from a shader that needs to do 256x1920x1080 lookups per frame and both measured by people who know their shit are certainly wrong. It's you and your magical doesn't-decrease-the-framerate-at-all technique that's right. Speaking of right: riiiiiiiight.
But at least everyone, remember what trap I now need to climb out of: "Don't argue with an idiot. He'll drag you down to his level and then beat you with experience".
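For scale, the 256x1920x1080 figure above can be turned into a rough lookup rate. This is only back-of-the-envelope arithmetic under the assumption that the whole measured time goes into texture lookups, which a real GPU profile may not confirm; `lookups_per_second` is a hypothetical helper name.

```python
WIDTH, HEIGHT = 1920, 1080
LOOKUPS_PER_PIXEL = 256  # figure quoted in the thread

lookups_per_frame = LOOKUPS_PER_PIXEL * WIDTH * HEIGHT

def lookups_per_second(pass_ms):
    """Lookup rate implied by finishing all lookups in pass_ms milliseconds."""
    return lookups_per_frame / (pass_ms / 1000.0)

print(lookups_per_frame)
print(lookups_per_second(4.5) / 1e9)   # billions of lookups per second at 4.5 ms
print(lookups_per_second(16.0) / 1e9)  # billions of lookups per second at 16 ms
```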
I use the following code to gloom an HDR frame:
I start with the original HDR frame (float, float, float), let's say 1920x1080
I render to a 960x1080 target using xPixelOffset=1/960 and yPixelOffset=0
I render to a 960x540 target using xPixelOffset=0 and yPixelOffset=1/540
I render to a 480x540 target using xPixelOffset=1/480 and yPixelOffset=0
and so on until I get to 480x270 or 320x180, then I combine all even steps (1920x1080, 960x540 ...)
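The pass sequence above can be sketched as a small host-side helper. This is a rough illustration, not the poster's actual setup code: `blur_chain` and `min_width` are hypothetical names, and it assumes every horizontal pass halves the width and every vertical pass halves the height.

```python
def blur_chain(width, height, min_width):
    """Return (target_w, target_h, xPixelOffset, yPixelOffset) for each pass."""
    passes = []
    w, h = width, height
    while w > min_width:
        w //= 2
        passes.append((w, h, 1.0 / w, 0.0))  # horizontal pass, width halved
        h //= 2
        passes.append((w, h, 0.0, 1.0 / h))  # vertical pass, height halved
    return passes

chain = blur_chain(1920, 1080, 240)
for p in chain:
    print(p)

# The "even steps" are the targets that have been blurred in both directions.
print([(w, h) for (w, h, _, _) in chain[1::2]])
```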
I think I once computed something like 16 texture lookups per target pixel. I know that there are faster ways (I have seen someone using the bilinear filtering of the texture to sample and average 4 pixels at once). However, I like this code and its output... (it gives a nice big glow to bright parts)
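The bilinear trick mentioned above can be checked on the CPU. The sketch below is a plain software model of linear texture filtering with clamp-to-edge addressing, written only for illustration (the `bilinear` helper is not from any library): sampling exactly at the corner shared by four texels returns their average, so one lookup does the work of four.

```python
import math

def bilinear(img, x, y):
    """Bilinear sample of img (list of rows); texel centers sit at +0.5."""
    h, w = len(img), len(img[0])
    fx, fy = x - 0.5, y - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0

    def texel(ix, iy):
        # Clamp-to-edge addressing
        return img[min(max(iy, 0), h - 1)][min(max(ix, 0), w - 1)]

    top = texel(x0, y0) * (1 - tx) + texel(x0 + 1, y0) * tx
    bottom = texel(x0, y0 + 1) * (1 - tx) + texel(x0 + 1, y0 + 1) * tx
    return top * (1 - ty) + bottom * ty

img = [[1.0, 3.0],
       [5.0, 7.0]]

print(bilinear(img, 1.0, 1.0))  # corner shared by all four texels
print(bilinear(img, 0.5, 0.5))  # center of the top-left texel
```

In a shader this would correspond to sampling a linearly filtered texture halfway between texel centers, assuming the hardware filter behaves like this model.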
uniform sampler2D InputMap;
varying vec2 uv;
uniform float xPixelOffset;
uniform float yPixelOffset;
uniform float Threshold;
uniform float Gain;
uniform float Saturation;

void main(void)
{
    float w;
    float w_total = 0.0;      // accumulators must start at zero
    vec4 color = vec4(0.0);

    for (float i = -3.; i < 4.; i += 0.5)
    {
        // Take sample and threshold it
        // (named s because "sample" is a keyword in GLSL 4.00 and later)
        vec4 s = texture2D(InputMap, uv + vec2(i * xPixelOffset * 4., i * yPixelOffset * 4.));
        s = clamp((s - Threshold) * Gain, 0., Saturation);

        // Compute weight sum
        w = exp(-i * i * 0.4);
        w_total += w;
        color += max(s, 0.) * w;
    }

    // Normalize by the accumulated weight
    gl_FragColor = color / w_total;
}
Correction: I actually go down to 240x135
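For reference, the weights produced by the shader's loop can be reproduced on the CPU. This is a small sketch that only mirrors the loop bounds and the `exp(-i*i * 0.4)` weight from the code above; the `kernel` function name is made up.

```python
import math

def kernel():
    """Offsets and normalized weights for `for (i = -3; i < 4; i += 0.5)`."""
    offsets = [-3.0 + 0.5 * k for k in range(14)]  # -3.0, -2.5, ..., 3.5
    raw = [math.exp(-i * i * 0.4) for i in offsets]
    total = sum(raw)
    return offsets, [r / total for r in raw]

offsets, weights = kernel()
print(len(offsets))  # number of texture lookups per target pixel
for i, w in zip(offsets, weights):
    print(i, round(w, 4))
```

As written, the loop takes 14 lookups per target pixel per pass rather than 16, and since it runs from -3 to 3.5 the kernel is very slightly lopsided. Each step of `i` moves the lookup by two pixel offsets because of the `*4.` factor.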
