Slick Forums

Discuss the Slick 2D Library

All times are UTC




Posted: Thu Mar 08, 2012 12:59 pm

Joined: Mon Feb 13, 2012 6:31 pm
Posts: 40
I've read some articles on the forum saying that while Slick is a great library, it isn't optimized for displaying many small images. I didn't really care about FPS until it started dropping when I put lots of images in my game. Now I want to fix what is wrong before the source code grows too big.

I've heard someone on the forum was using VBOs and improved game speed by a factor of 10, and I was wondering: will implementing VBOs, VAs, or another rendering method greatly improve a tile-based game's FPS? My game's screen is drawn with about 20x16 tiles of 64x64 pixels, and there will be a total of about 500-1000 images to show in a frame (excluding tiles). And how hard is it to change the rendering method to VBOs? I have almost no knowledge of OpenGL programming.


 
Posted: Thu Mar 08, 2012 1:47 pm
Game Developer

Joined: Thu Mar 03, 2011 6:22 pm
Posts: 534
Whoever said that Slick wasn't optimized for it obviously thought Slick is a magic unicorn able to work out how to sort the rendering the developer codes xD

Anyway, 2 easy rules for optimizing (not only in Slick. Slick already has startUse() and endUse(); it's up to the developer to use them wisely):
  • Sort your rendering. Switching textures on every draw (probably because you load a bunch of small images rather than using bigger ones with sub-images) is what costs you, so draw everything that shares a texture together.
  • As mentioned in the first rule, combine your textures into bigger images that hold graphics which belong together, like player animations or misc.
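
A rough sketch of the first rule in plain Java (no Slick calls): sort the pending draws by the texture they use and count how many texture switches are left. `DrawCmd`, `textureId`, `countBinds` and `sortedByTexture` are made-up names for illustration, not Slick API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one queued sprite draw (not a Slick class).
final class DrawCmd {
    final int textureId; // which texture/sprite sheet the sprite lives on
    final float x, y;

    DrawCmd(int textureId, float x, float y) {
        this.textureId = textureId;
        this.x = x;
        this.y = y;
    }
}

final class TextureSortDemo {
    // Counts how often the bound texture would change if drawn in list order.
    static int countBinds(List<DrawCmd> cmds) {
        int binds = 0;
        int bound = -1; // assumes real texture ids are never negative
        for (DrawCmd c : cmds) {
            if (c.textureId != bound) {
                bound = c.textureId;
                binds++;
            }
        }
        return binds;
    }

    // Returns a copy ordered by texture so draws sharing a texture are adjacent.
    static List<DrawCmd> sortedByTexture(List<DrawCmd> cmds) {
        List<DrawCmd> copy = new ArrayList<>(cmds);
        copy.sort(Comparator.comparingInt(c -> c.textureId));
        return copy;
    }

    public static void main(String[] args) {
        List<DrawCmd> cmds = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            cmds.add(new DrawCmd(i % 2, i * 10f, 0f)); // textures alternate 0,1,0,1,...
        }
        System.out.println("binds unsorted: " + countBinds(cmds));
        System.out.println("binds sorted:   " + countBinds(sortedByTexture(cmds)));
    }
}
```

Reordering like this is only safe when the sprites don't overlap or the draw order doesn't matter; otherwise it changes what ends up on top.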

_________________
Current Projects:
Mr. Hat I
Vegan Vs. Zombies
Projects:
RadicalFish Engine - Build on top of Slick2D, Ideas, Bugs? Open an Issue ticket!


 
Posted: Thu Mar 08, 2012 2:20 pm

Joined: Mon Feb 13, 2012 6:31 pm
Posts: 40
Thanks for the advice :D Yes, I'm thinking of making several sprite sheets and sorting all images into them as you suggested. But since that will already be some work and I'd have to rewrite a lot of code, I was wondering if it's worth using VBOs or VAs instead (which I've heard are very fast). Or is it just a myth or an exaggeration that they are 10x faster than Slick's rendering method?


 
Posted: Thu Mar 08, 2012 3:10 pm
Regular

Joined: Thu Mar 12, 2009 9:52 am
Posts: 142
Location: Portugal, Lisbon
Also make heavy use of ImageBuffer to condense a set of overlaid images into one.

... 20*16 tiles of 64x64 pixels, that's 1280x1024 pixels... If it's static, don't even use tiles; make it one single image. If it's built from several small images, use the ImageBuffer to build one big image.

If something is outside the screen, don't even issue the render call; there's no need to spend time on stuff that isn't going to be shown.
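
A minimal sketch of that off-screen check for a tile map, in plain Java. The camera and map parameters are assumptions for illustration and `TileCulling` is a hypothetical helper, not part of Slick: it works out which tile columns and rows intersect the view so the render loop only touches those.

```java
// Hypothetical helper (not Slick API): which tiles intersect the camera view?
final class TileCulling {
    /** Returns {firstCol, firstRow, lastCol, lastRow}, inclusive, clamped to the map. */
    static int[] visibleRange(float camX, float camY, int viewW, int viewH,
                              int tileSize, int mapCols, int mapRows) {
        int firstCol = Math.max(0, (int) Math.floor(camX / tileSize));
        int firstRow = Math.max(0, (int) Math.floor(camY / tileSize));
        // ceil(...) - 1 is the last tile that still has a pixel inside the view
        int lastCol = Math.min(mapCols - 1, (int) Math.ceil((camX + viewW) / tileSize) - 1);
        int lastRow = Math.min(mapRows - 1, (int) Math.ceil((camY + viewH) / tileSize) - 1);
        return new int[] { firstCol, firstRow, lastCol, lastRow };
    }

    public static void main(String[] args) {
        // 1280x1024 view, 64px tiles, 100x100 tile map, camera at (96, 0)
        int[] r = visibleRange(96f, 0f, 1280, 1024, 64, 100, 100);
        int drawn = 0;
        for (int row = r[1]; row <= r[3]; row++) {
            for (int col = r[0]; col <= r[2]; col++) {
                drawn++; // a real game would draw tile (col, row) here
            }
        }
        System.out.println("tiles drawn: " + drawn + " of " + (100 * 100));
    }
}
```

The same idea applies to free-moving sprites: compare each sprite's bounding box with the view rectangle before drawing it.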

Also use a ResourceManager to load stuff into memory BEFORE you enter the state (have a loading state); never do it in render or update. It consumes a bit more memory, but my game loads everything on start and has a memory footprint of 120 megabytes. I don't care; you have systems with 4 GB+, so I really don't care about loading lots of images.

1000 images... I've used around 300 small images at the same time as a stress test for my game and the frame rate dropped around 15%, give or take.

I'll do a small stress test when I get home so I can see the effect of 1000 small images being rendered. I'll get back to you then with more advice if I find any.


 
Posted: Thu Mar 08, 2012 3:51 pm
Slick Zombie

Joined: Sat Jan 27, 2007 7:10 pm
Posts: 1482
VBOs will likely require a lot more code rewriting than sprite sheets.

VBOs are mainly useful if your bottleneck lies in vertex count. Since we are just rendering quads for sprites, the difference probably won't be huge. You can try it, but it would be better to optimize your game in other ways.

  • Batch, batch, batch. If you can fit all of your textures, fonts, UIs, etc. into the same sprite sheet, do it. You can go up to 2048x2048 for modern cards. It will decrease texture binds and improve rendering tenfold.
  • Use power-of-two images whenever possible. Pull the latest changes from the development branch and make use of InternalTextureLoader's new getTextureCount method to see how many textures you have going. (Every game should start with 1 because of the default font.)
  • Try setting the renderer to VERTEX_ARRAY_RENDERER before you create your game container.
  • Image.getGraphics will create a new OpenGL texture, so try not to call it within render(). Instead, re-use one or two offscreen images by calling g.clear() before rendering to them.
  • Avoid Image.draw altogether. It does the following: translate the context, rotate the context, bind the texture, glBegin, draw a quad, glEnd, unrotate the context, untranslate the context. It's slow. Instead, use startUse/drawEmbedded/endUse as much as possible even if you don't plan on using sprite sheets.
  • Minimize transforms, especially push/pop transforms.
  • Every time Slick changes the graphics context, it pushes all attributes (background color, world clip, line width, etc). This happens when you render to FBO, call SlickCallable, etc. Ideally, you should reduce these too, as they are somewhat expensive -- if you really need to, write your own FBO class that doesn't push any attributes.
  • If your game's map/background/etc covers the entire screen, then you don't need to clear each frame. Change GameContainer.setClearEachFrame to false. Even if you need to clear the frame, you should do it manually with Graphics.clear() as the GameContainer clear is very slightly slower (game container clears the color AND depth buffer, Graphics just clears the color buffer).
  • Most games will be fill-rate limited. Reduce the number of pixels you are actually drawing, disable blending (GL11.glDisable(GL11.GL_BLEND)) whenever it's not needed, etc. If you have an 800x600 background image that you draw in five blended passes, that will significantly affect fill rate, and maybe you should reconsider. ;)
  • For maximum efficiency, write your own sprite rendering class so you know exactly what's going on under the hood. With mine I can render several thousand sprites in a single draw call, rather than several thousand calls to glBegin/glVertex/glEnd. It's still very much a work in progress, and I hope to add shader/VBO support at some point, but for now it may help you:
    SpriteBatch
    SpriteFont

Enjoy. If you are smart about your rendering, you shouldn't need shaders/VBOs/etc. at all.
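
To illustrate why one big sprite sheet works: each sprite is just a rectangle of texture coordinates inside the sheet, so many sprites can be drawn without rebinding. This is only a plain-Java sketch under the assumption of a grid of equally sized cells; `AtlasRegion` and `ofCell` are hypothetical names, not Slick classes.

```java
// Hypothetical helper (not Slick API): texture coordinates of one cell in a grid sprite sheet.
final class AtlasRegion {
    final float u0, v0, u1, v1; // normalized corners in the range 0..1

    AtlasRegion(float u0, float v0, float u1, float v1) {
        this.u0 = u0;
        this.v0 = v0;
        this.u1 = u1;
        this.v1 = v1;
    }

    /** Cell (col, row) of a sheet made of equally sized cells. */
    static AtlasRegion ofCell(int sheetW, int sheetH, int cellW, int cellH, int col, int row) {
        float u0 = (col * cellW) / (float) sheetW;
        float v0 = (row * cellH) / (float) sheetH;
        float u1 = ((col + 1) * cellW) / (float) sheetW;
        float v1 = ((row + 1) * cellH) / (float) sheetH;
        return new AtlasRegion(u0, v0, u1, v1);
    }

    public static void main(String[] args) {
        // A 2048x2048 sheet of 64x64 cells has room for 32 * 32 = 1024 sprites.
        int cells = (2048 / 64) * (2048 / 64);
        AtlasRegion r = ofCell(2048, 2048, 64, 64, 1, 2);
        System.out.println(cells + " cells; cell (1,2) spans u " + r.u0 + ".." + r.u1
                + ", v " + r.v0 + ".." + r.v1);
    }
}
```

In Slick itself, SpriteSheet and Image.getSubImage handle this bookkeeping for you; the sketch only shows the arithmetic involved.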


To Spiegel: Condensing loaded images into a single sprite sheet using ImageBuffer is really inefficient. Here's what would happen under the hood:
  • Decode each image file into a byte buffer. If the image isn't power of two, pad the byte buffer with enough transparent pixels to make it POT (e.g. a 260x260 image becomes a 512x512 OpenGL texture).
  • Create a new OpenGL texture for each decoded image, using POT size.
  • Read the pixels of each image individually (very, very slow) and place them into the ImageBuffer array
  • Copy the whole ImageBuffer array into a ByteBuffer
  • Create a new OpenGL texture, again ensuring that it is POT

And if you don't release() the old textures, you're wasting memory.
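
The padding cost in the first step is easy to estimate. A small plain-Java sketch (the names `PotSize`, `nextPowerOfTwo` and `wastedFraction` are made up here, and the loop assumes sizes between 1 and 2^30):

```java
// Hypothetical helper: how much of a padded power-of-two texture is wasted?
final class PotSize {
    /** Smallest power of two that is >= n, for 1 <= n <= 2^30. */
    static int nextPowerOfTwo(int n) {
        int pot = 1;
        while (pot < n) {
            pot <<= 1;
        }
        return pot;
    }

    /** Fraction of the padded texture that holds only padding. */
    static double wastedFraction(int width, int height) {
        long used = (long) width * height;
        long padded = (long) nextPowerOfTwo(width) * nextPowerOfTwo(height);
        return 1.0 - (double) used / padded;
    }

    public static void main(String[] args) {
        System.out.println("260 pads to " + nextPowerOfTwo(260));
        System.out.println("wasted fraction for 260x260: " + wastedFraction(260, 260));
    }
}
```

For the 260x260 example, roughly three quarters of the 512x512 texture is padding, which is why sizing images to powers of two (or packing them into one sheet) pays off.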

The best solution is to place your images in the same texture using Photoshop. If you really need to do it on the fly, I'd suggest you dig a little into OpenGL and Slick's internals:
  • Use Slick's ImageDataFactory to decode the images into byte buffers. Better yet, use PNGDecoder to avoid the byte buffer padding that will occur.
  • Create a single empty OpenGL texture that is power of two by using code from InternalTextureLoader. The glTexImage2D call would look like this:
    Code:
    // LWJGL argument order ends with: border, format, type, pixels
    GL11.glTexImage2D(GL11.GL_TEXTURE_2D, 0, GL11.GL_RGBA,
            texWidth, texHeight, 0, GL11.GL_RGB,
            GL11.GL_UNSIGNED_BYTE, BufferUtils.createByteBuffer(texWidth * texHeight * 3));
  • For each loaded image, upload the pixels to your texture using glTexSubImage2D.

Hopefully future changes to InternalTextureLoader and the image decoding classes will make this more exposed, so that it's easier to handle. :)
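
For anyone unsure what "upload the pixels at an offset" means: glTexSubImage2D copies a small rectangle into a region of a bigger texture. The sketch below does the same copy on a plain byte array instead of a GL texture, purely to show the indexing. It assumes tightly packed RGBA data (4 bytes per pixel), and `AtlasBlit` is a hypothetical name.

```java
// CPU-side illustration only; the real upload would go through glTexSubImage2D.
final class AtlasBlit {
    static final int BPP = 4; // bytes per pixel, assuming RGBA

    /** Copies a srcW x srcH image into the atlas with its top-left corner at (dstX, dstY). */
    static void blit(byte[] atlas, int atlasW, int atlasH,
                     byte[] src, int srcW, int srcH, int dstX, int dstY) {
        if (dstX < 0 || dstY < 0 || dstX + srcW > atlasW || dstY + srcH > atlasH) {
            throw new IllegalArgumentException("image does not fit at " + dstX + "," + dstY);
        }
        for (int row = 0; row < srcH; row++) {
            int srcOff = row * srcW * BPP;                     // start of this source row
            int dstOff = ((dstY + row) * atlasW + dstX) * BPP; // same row inside the atlas
            System.arraycopy(src, srcOff, atlas, dstOff, srcW * BPP);
        }
    }

    public static void main(String[] args) {
        byte[] atlas = new byte[4 * 4 * BPP]; // 4x4 atlas, all zero
        byte[] sprite = new byte[2 * 2 * BPP];
        java.util.Arrays.fill(sprite, (byte) 1);
        blit(atlas, 4, 4, sprite, 2, 2, 1, 1);
        System.out.println("first byte of atlas pixel (1,1): " + atlas[(1 * 4 + 1) * BPP]);
    }
}
```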


Last edited by davedes on Sun Mar 11, 2012 4:14 pm, edited 1 time in total.

 
Posted: Thu Mar 08, 2012 4:01 pm
Regular

Joined: Thu Mar 12, 2009 9:52 am
Posts: 142
Location: Portugal, Lisbon
davedes wrote:
To Spiegel: Condensing loaded images into a single sprite sheet using ImageBuffer is really inefficient. Here's what would happen under the hood:
  • Decode each image file into a byte buffer. If the image isn't power of two, pad the byte buffer with enough transparent pixels to make it POT (e.g. a 260x260 image becomes a 512x512 OpenGL texture).
  • Create a new OpenGL texture for each decoded image, using POT size.
  • Read the pixels of each image individually (very, very slow) and place them into the ImageBuffer array
  • Copy the whole ImageBuffer array into a ByteBuffer
  • Create a new OpenGL texture, again ensuring that it is POT

And if you don't release() the old textures, you're wasting memory.


Well, I have to disagree. I've done it to build a small mini-map that relates to the game's map, and it improved my rendering by more than 15%.
Of course I did it only once and stored the result in an image. I don't know if you were talking about doing that on each render/update cycle, but I sure wasn't. I tend to have all needed resources for a specific level already loaded in memory when I enter the game state, and I release them when I exit that game state.
Wasted memory is a problem, but that's why I use a resource manager, so that it does all this magic for me.

EDIT: Not disagreeing with you on it being inefficient to do it on the fly; if that's what he's asking, then you're right.
I also agree that going under the hood into OpenGL before looking at other solutions may be over-engineering things.


 
Posted: Thu Mar 08, 2012 4:24 pm
Slick Zombie

Joined: Sat Jan 27, 2007 7:10 pm
Posts: 1482
Spiegel wrote:
EDIT: Not disagreeing with you on it being inefficient to do it on the fly; if that's what he's asking, then you're right.
I also agree that going under the hood into OpenGL before looking at other solutions may be over-engineering things.

Going "under the hood" is much easier than you think, plus it's more efficient (i.e. your game will load faster), better practice, and useful knowledge if you ever plan to pursue game development.

Like I said, hopefully we can make this easier on the end user, so that instead of hacking around with the source code for InternalTextureLoader it would look like this:
Code:
//pseudo code
ImageData img1 = ImageDecoder.getDecoder().decode("myfile.png");
ImageData img2 = ImageDecoder.getDecoder().decode("myfile2.png");
Texture tex = InternalTextureLoader.get().createTexture(1024, 1024);
//pack the first texture at top left
tex.upload(0, 0, img1);
//pack the second texture to the right of the first, with a 2 pixel gap
tex.upload(img1.getWidth() + 2, 0, img2);
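
The placement in the pseudo code above (second image to the right of the first, with a 2 pixel gap) generalizes to simple row packing. Here is a hedged plain-Java sketch of that idea; `ShelfPacker` and `place` are hypothetical names, and it only computes positions without uploading anything.

```java
// Hypothetical row ("shelf") packer: fills a row left to right, then starts a new row.
final class ShelfPacker {
    private final int atlasW, atlasH, gap;
    private int cursorX = 0; // x of the next free slot in the current row
    private int cursorY = 0; // y of the current row
    private int rowH = 0;    // tallest image placed in the current row

    ShelfPacker(int atlasW, int atlasH, int gap) {
        this.atlasW = atlasW;
        this.atlasH = atlasH;
        this.gap = gap;
    }

    /** Returns {x, y} for the next image, or null if it does not fit. */
    int[] place(int w, int h) {
        if (w > atlasW || h > atlasH) {
            return null; // can never fit
        }
        if (cursorX + w > atlasW) { // row is full: start a new one below it
            cursorX = 0;
            cursorY += rowH + gap;
            rowH = 0;
        }
        if (cursorY + h > atlasH) {
            return null; // atlas is full
        }
        int[] pos = { cursorX, cursorY };
        cursorX += w + gap;
        rowH = Math.max(rowH, h);
        return pos;
    }

    public static void main(String[] args) {
        ShelfPacker packer = new ShelfPacker(1024, 1024, 2);
        int[][] sizes = { { 500, 100 }, { 500, 50 }, { 100, 100 } };
        for (int[] s : sizes) {
            int[] p = packer.place(s[0], s[1]);
            System.out.println(s[0] + "x" + s[1] + " -> " + p[0] + "," + p[1]);
        }
    }
}
```

Sorting images by height before packing usually wastes less space with this scheme; dedicated packing tools do a better job if load-time packing isn't required.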


 
Posted: Thu Mar 08, 2012 4:45 pm
Regular

Joined: Thu Mar 12, 2009 9:52 am
Posts: 142
Location: Portugal, Lisbon
davedes wrote:
Going "under the hood" is much easier than you think, plus it's more efficient (i.e. your game will load faster), better practice, and useful knowledge if you ever plan to pursue game development.


Not refuting that. I use it myself whenever I can get away with it; actually, that was the way I learned OpenGL (again), and I even created a small 2D engine just to learn how to do it. It's fun if you want to do it, not so much if you HAVE to do it :) .

I also never said that going under the hood was hard in any way whatsoever (Slick enables this very well). I only said that before he travels that road, he should at least consider using the tools he already has. Seems like a sensible step, I would say. 8)


 
Posted: Thu Mar 08, 2012 5:38 pm

Joined: Mon Feb 13, 2012 6:31 pm
Posts: 40
Thanks guys :D 1000 images in a frame was a bit of an exaggeration, but the game does have lots of non-static images. I tried to put them in sprite sheets but found it isn't efficient, as there will be 1k+ different items/tiles/monsters scattered around the map, each with a different image, and some of the images are big too. So even if I made lots of sprite sheets, there's a big chance the next image will come from a different sprite sheet. I'm still looking for a solution.

@davedes
Thanks again. I had actually googled and read that Slick could use VERTEX_ARRAY_RENDERER before you replied, and it was your other post in this forum xD I had a good result with the new renderer; I think it improved my FPS by 5-15%. Also, using your SpriteBatch class, I did some tests showing the same small image (32x32) 2000 times in a frame. I got 800 FPS using Slick's Image.draw(), and 2400 FPS o.o using SpriteBatch. But when I tested again by showing 2000 different images (ranging from 32x32 to 512x512), I got 50 FPS with Image.draw() and 65 FPS with SpriteBatch. Is that expected?


 
Posted: Thu Mar 08, 2012 7:53 pm
Slick Zombie

Joined: Sat Jan 27, 2007 7:10 pm
Posts: 1482
elonoa wrote:
Thanks guys :D 1000 images in a frame was a bit of an exaggeration, but the game does have lots of non-static images. I tried to put them in sprite sheets but found it isn't efficient, as there will be 1k+ different items/tiles/monsters scattered around the map, each with a different image, and some of the images are big too. So even if I made lots of sprite sheets, there's a big chance the next image will come from a different sprite sheet. I'm still looking for a solution.

You should definitely be using texture atlases as much as you can. Three texture atlases are going to load and perform a lot faster than 300 separate images.

If you absolutely can't organize your images in batches, then some lower-level stuff like shaders and multi-texturing may help here. This is not within the scope of Slick (yet), so you'll have to do some digging yourself.

Quote:
Also, using your SpriteBatch class, I did some tests showing the same small image (32x32) 2000 times in a frame. I got 800 FPS using Slick's Image.draw(), and 2400 FPS o.o using SpriteBatch.

Nice! Here are my results rendering 17,000 32x32 sprites to the screen.

SpriteBatch.drawImage: 60 FPS (7 render calls, SpriteBatch size of 18,000)
Image.drawEmbedded: 50 FPS (using startUse/endUse)
Image.draw: 39 FPS

Test case:
http://pastebin.com/s1YN20pk

ball image:
http://i41.tinypic.com/k4upzp.png

Controls: 1-9 change ball count, 0 subtract 100 balls, + add 100 balls, SPACE toggle SpriteBatch, R toggle renderInUse (if not using spriteBatch), V toggle vsync, W/Q change batch size, B apply new batch size.

Quote:
But when I tested again by showing 2000 different images (ranging from 32x32 to 512x512), I got 50 FPS with Image.draw() and 65 FPS with SpriteBatch. Is that expected?

Yes. SpriteBatch is best suited for rendering a series of images that point to the same texture. You won't see very much of a benefit from rendering many different textures. At that point it's basically doing what Image.draw is doing: bind new texture, send quad to GPU, repeat.

The idea of sprite batching is as follows, for anyone wondering:
Whenever a user tries to draw an image, store the vertex data (position, texcoords, colors) in a single large array. We continue adding to the single array until:
  • The user calls flush(), forcing us to render all the vertex data we have so far accumulated
  • We have reached the end of our array
  • The user is rendering a different texture than the last one, meaning we need to flush() and bind the new texture before continuing

Flushing the batch sends the vertex data to the GPU. A larger SpriteBatch can hold more vertex data before flushing, which reduces the number of render calls needed to draw all objects in the batch. If we are constantly switching textures, or if the batch size is very small, there will be a lot more flushing. You can find out how many flushes are being used per frame with the public renderCalls variable (set it to zero at the start of the frame, and check its value after you call the final flush).
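
The three flush conditions above can be sketched in a few lines of plain Java. This is not the actual SpriteBatch source, just a stripped-down model that counts flushes instead of talking to OpenGL; `BatchSketch` is a made-up name and its capacity is measured in sprites rather than vertices.

```java
// Simplified model of sprite batching: queues sprites and counts the flushes.
final class BatchSketch {
    private final int capacity;    // max sprites held before a forced flush
    private int count = 0;         // sprites currently queued
    private int boundTexture = -1; // assumes real texture ids are non-negative
    int renderCalls = 0;           // flushes that actually sent data

    BatchSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    void draw(int textureId) {
        if (textureId != boundTexture) {
            flush();               // different texture: send what we have first
            boundTexture = textureId;
        } else if (count == capacity) {
            flush();               // the array is full
        }
        count++;                   // a real batch would append position/texcoords/colors here
    }

    void flush() {
        if (count == 0) {
            return;                // nothing queued, so no render call
        }
        renderCalls++;             // a real batch would submit the vertex array here
        count = 0;
    }

    public static void main(String[] args) {
        BatchSketch same = new BatchSketch(500);
        for (int i = 0; i < 2000; i++) same.draw(7);     // one shared texture
        same.flush();
        BatchSketch mixed = new BatchSketch(500);
        for (int i = 0; i < 2000; i++) mixed.draw(i % 2); // texture changes every sprite
        mixed.flush();
        System.out.println("same texture: " + same.renderCalls + " render calls");
        System.out.println("alternating:  " + mixed.renderCalls + " render calls");
    }
}
```

With a texture change on every sprite the batch degenerates to one render call per sprite, which matches the small gain reported for the 2000-different-images test.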

Cheers.


 
Posted: Fri Mar 09, 2012 8:43 am
Game Developer

Joined: Thu Mar 03, 2011 6:22 pm
Posts: 534
Is it normal that your test gives me more FPS with drawEmbedded? o.o Because it does :D

Your classes are missing some references and your SpriteBatch is missing a "flush" method ^^ I guess it's the render method. And oh, what does StyledText do?

Anyway, really cool stuff. For sprites I use a texture atlas, btw. I don't load any glyphs; I use the bytes of the text, check them against ASCII, and then render them. The problem... my font renderer must be able to manipulate the vertex data between bind and unbind :( But with davedes' ImageTransform this should work better now (though it already works fast as hell).



 
Posted: Fri Mar 09, 2012 8:46 am

Joined: Mon Feb 13, 2012 6:31 pm
Posts: 40
Quote:
You should definitely be using texture atlases as much as you can. Three texture atlases are going to load and perform a lot faster than 300 separate images.


I tried doing so but failed again. A 512x512 texture size limit is too small for my game, and I need lots of switching between sprite sheets. If it were a simple 2D game, I could group sprites that use the same sprite sheet and draw each group at once, then move on to the next group with a different sprite sheet, etc. But in my 2.5D game, draw order depends heavily on each sprite's z axis, so I need to switch between the groups a lot to show the game map properly.

In my case, where it's hard to make batches of the same images and I have to switch texture bindings a lot, will VBOs help improve performance?

I did learn that batching and using a texture atlas is the way to go for fast rendering, and I rewrote my GUI. It just doesn't seem to work for rendering my map :( It would be easier if I could make a large texture atlas, but one of my testers' PCs is old and will crash without mercy if a texture is over 512x512.


 
Posted: Fri Mar 09, 2012 4:59 pm
Slick Zombie

Joined: Sat Jan 27, 2007 7:10 pm
Posts: 1482
R.D. wrote:
Is it normal that your test gives me more FPS with drawEmbedded? o.o Because it does :D

Your classes are missing some references and your SpriteBatch is missing a "flush" method ^^ I guess it's the render method. And oh, what does StyledText do?

StyledText is an 'attribute string', e.g. for efficiently rendering text of varying colors/fonts along the same baseline. You can comment that method out. What do you mean about the flush method?

Don't know why it's rendering slower on your machine. How many sprites? Render calls? What's the batch size (in parentheses)?

Quote:
Anyway, really cool stuff. For sprites I use a texture atlas, btw. I don't load any glyphs; I use the bytes of the text, check them against ASCII, and then render them. The problem... my font renderer must be able to manipulate the vertex data between bind and unbind :( But with davedes' ImageTransform this should work better now (though it already works fast as hell).

If you used SpriteFont, you could render the individual glyphs (with drawEmbedded if you don't want to use SpriteBatch) and still use Hiero font generation + proper kerning + etc. :) I'll try to add similar functionality to AngelCodeFont.


 
Posted: Fri Mar 09, 2012 7:59 pm
Slick Zombie

Joined: Sat Jan 27, 2007 7:10 pm
Posts: 1482
elonoa wrote:
It would be easier if I could make a large texture atlas, but one of my testers' PCs is old and will crash without mercy if a texture is over 512x512.

Yikes. Check to see if your tester can even support shaders and VBOs before you start thinking about implementing them!

Quote:
In my case, where it's hard to make batches of the same images and I have to switch texture bindings a lot, will VBOs help improve performance?

You can try it, but like I said, most of the bottleneck here comes from two things: constantly switching textures and fill rate limitations. The gain from VBOs won't be very substantial compared to batching/shaders.

If your tiled map doesn't change much, you can render it once into an offscreen image and then render that offscreen image to the screen, i.e. instead of rendering 2000 quads per frame, you are rendering 4 (four 512x512 images stitched together to fill the screen).
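
How many cached chunks you end up drawing depends on the screen size and the largest texture the hardware can handle. A tiny plain-Java sketch of the arithmetic, with `MapCache` and `chunksNeeded` as made-up names and the 512 pixel limit taken from the tester's machine mentioned in this thread:

```java
// Hypothetical helper: how many square chunks are needed to cover the view?
final class MapCache {
    /** Number of chunkSize x chunkSize quads needed to cover a viewW x viewH area. */
    static int chunksNeeded(int viewW, int viewH, int chunkSize) {
        int cols = (viewW + chunkSize - 1) / chunkSize; // integer ceiling
        int rows = (viewH + chunkSize - 1) / chunkSize;
        return cols * rows;
    }

    public static void main(String[] args) {
        System.out.println("800x600:   " + chunksNeeded(800, 600, 512) + " chunks");
        System.out.println("1280x1024: " + chunksNeeded(1280, 1024, 512) + " chunks");
    }
}
```

Either way it is a handful of quads per frame instead of thousands, as long as the cached map only needs re-rendering when it actually changes.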

But shaders are probably the best solution.


EDIT:
R.D. wrote:
I don't load any glyphs; I use the bytes of the text, check them against ASCII, and then render them. The problem... my font renderer must be able to manipulate the vertex data between bind and unbind :( But with davedes' ImageTransform this should work better now (though it already works fast as hell).

I pushed some features to AngelCodeFont and FontTest that now allow for rendering individual glyphs (with proper kerning and all that). I also overloaded Image.drawEmbedded with a rotation parameter, so you shouldn't need to use ImageTransform anymore. :)


 
Posted: Fri Mar 09, 2012 11:56 pm

Joined: Mon Feb 13, 2012 6:31 pm
Posts: 40
Thanks again for the advice. I'm going to try again to see if I can batch more images while learning about shaders. And good point about my tester; I'd hate to see it not work after making VBOs render my game xD


 