Monday, November 30, 2009

Hand Posture and Gesture Recognition

Human-computer interaction is currently an active field in computer science, with Microsoft working on Project Natal and universities developing virtual reality systems. One of the most basic problems in HCI is hand posture and hand gesture recognition. While these two terms are often used interchangeably, they are fundamentally different: a hand posture is a static model of the hand, while a gesture is dynamic and involves a change in pose. We use both of these naturally in a normal conversation to convey ideas and emotions.

There are two ways to implement such a system: data-glove based and vision based. A data-glove based system gives the most accurate tracking results, but requires the most hardware to implement. In addition, most devices are clunky and may limit how natural such a system feels. The second method, the vision-based approach, uses a simple webcam to capture an image of the hand and extract pose data from it. This approach is the cheapest and most lightweight, but the results can be unreliable. Part of the problem is that most vision-based systems work in two dimensions. Classifying gestures from 2D data is much less accurate than using 3D coordinates, but then the problem becomes how to determine depth from a single image. Multiple cameras can be used, but current stereo matching algorithms are not efficient enough to run in real time. The solution is to take advantage of a variety of monocular depth cues, such as shadows, occlusion, and scale change, to estimate the most probable depth.

Another problem with determining hand poses is getting the system to function in real time. A non-realtime application destroys the interactivity and complicates the gestures. This problem can be addressed with multi-threading. Intel's newest processors, including the Core 2 Duo and the Core i7, can perform parallel processing (executing multiple instructions at the same time). In fact, the Core i7 can run eight hardware threads simultaneously. Since this is a relatively new technology, most programs don't take advantage of it. Utilizing it can speed a program up by as much as a factor of eight.

When detecting hand poses, images must be passed through different filters, some of which are independent of each other. By running a different filter on each thread, you can reduce detection times: instead of waiting for 16 image operations to complete sequentially, you only have to wait roughly the time of 2.
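As a sketch of this structure, here is how independent filters could be dispatched to a thread pool. The four filter functions are hypothetical stand-ins (real filters would be image-processing routines, typically implemented in C or a library that releases Python's GIL so the threads truly run in parallel):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent image filters (e.g. blur,
# edge detection, skin-color segmentation, background subtraction),
# operating on a flat list of pixel intensities for illustration.
def blur(image):        return [px // 2 for px in image]
def edges(image):       return [abs(a - b) for a, b in zip(image, image[1:] + [0])]
def skin_mask(image):   return [1 if px > 128 else 0 for px in image]
def background(image):  return [0 if px < 32 else px for px in image]

FILTERS = [blur, edges, skin_mask, background]

def run_filters(image):
    # Each filter is independent, so all of them can run on separate
    # threads; with eight hardware threads, sixteen such operations
    # finish in roughly the time of two sequential ones.
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(f, image) for f in FILTERS]
        return [fut.result() for fut in futures]

results = run_filters([10, 200, 40, 90])
```

The results come back in the same order the filters were submitted, so later pipeline stages can consume them deterministically.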

Friday, November 27, 2009

Bloom

Image-based lighting is the process of taking an image and lightening or darkening areas of it based on its content. One widely used effect derived from this is the Bloom effect, which is used to create a dreamy, hazy atmosphere. It is mostly utilized in outdoor levels.

The first step is to take an image:



Then use a color ramp to convert it to a luminosity map of the image:



The step after that is to blur the luminosity map:



Then, this image should be overlaid on the original image using a mix method (not add or subtract). Here is the final result:



This technique can be easily implemented in shaders and other postprocess methods. Another common technique is HDRR (high dynamic range rendering), which highlights specular areas and grays out dull colors. It is similar to Bloom, except the luminosity map is not blurred.
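The steps above can be sketched on the CPU as follows, using a grayscale image stored as a 2D list of 0-255 values. Real implementations do this in a pixel shader; the ramp threshold and mix factor here are my own illustrative choices:

```python
def luminosity_map(img, threshold=180):
    # Color ramp: keep only the bright pixels, zero out the rest.
    return [[px if px >= threshold else 0 for px in row] for row in img]

def box_blur(img):
    # 3x3 box blur with edge clamping.
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total // count
    return out

def bloom(img, mix=0.5):
    # Mix (linearly interpolate) the blurred luminosity map over the
    # original instead of adding it, as described above.
    glow = box_blur(luminosity_map(img))
    return [[int(px * (1 - mix) + g * mix) for px, g in zip(row, grow)]
            for row, grow in zip(img, glow)]
```

Note how a single bright pixel bleeds into its darker neighbors after the blur and mix, which is exactly the halo that gives bloom its look.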

Game Engine Architecture

There are many different game engine architectures used throughout the gaming industry. I will now attempt to assign names to them: code interface based, config based, and scripting based. These three are the main types of architectures used.

Code interface based engines still require you to use code to access engine functions. These engines basically take everything you need to make a game, put it in one place, and wrap it so that you can easily use it. This type of architecture is more commonly found in open source engines, with prominent examples being Irrlicht and OGRE (although the latter is technically not a complete game engine). The advantage of this type is that you have full and complete control, without any limitations. A disadvantage, however, is that it still takes a large amount of time to make a game.

The next type is config based, which doesn't require you to do any programming at all. This is more often found in the commercial world, with engines such as Torque; an open source example would be Reality Factory. These engines are generally genre specific, because the lack of code limits flexibility. Many of them, however, have a small element of scripting for in-game events and sequences. The advantage of this architecture is the ease with which you can create a game. The glaring disadvantage is the lack of flexibility and customizability.

The third and final type is the scripting based engine, of which there are actually two kinds. One is very similar to the code interface based type, except that while the engine is written in code, the game itself is run through scripts. Examples of this are Love 2D and the Wrecked games engine. The other kind, which is probably the best type, consists of many engine components, with the wrapping game engine written in scripts. This is the most easily customizable and usable engine architecture. A professional example is the Unreal Engine, large parts of which are written in UnrealScript. This means that large portions of the engine, as well as gameplay, can be customized without the need to touch a single line of native code. This allows for the most flexibility and ease of use. The only disadvantage is that creating such an engine takes a lot of planning, as well as a lot of time. This architecture is still, by far, the best possible one to utilize when making a game engine.

Thursday, November 26, 2009

Open Source Rendering Engine

Many people often confuse the term "rendering engine" with "game engine", while in reality they are very different. A game engine normally consists of multiple components, since a game consists of many different elements: audio, graphics, scripting, physics, and input. A rendering engine focuses on the graphics component. Taken together with audio engines and physics engines, rendering engines make up a complete game engine.

In the category of rendering engines, there are many open source options. The most powerful and prominent is the OGRE engine. This behemoth is, by far, the largest, most mature, and most extensible open source rendering engine. When OGRE was less mature, a competitor by the name of Crystal Space was a viable alternative; that project, however, died out and has not been updated in a while. Recently, a new project by the name of Horde3D was started, with a strong focus on simplicity without sacrificing capability. While it currently seems like a possible contender, it is simply not mature and developed enough to be a permanent solution. It's worth checking out, however.

The reason that OGRE has become so popular is its amazing organization. The entire engine is object-oriented and built around a highly customizable plugin architecture. This flexibility makes it the engine of choice for even many commercial game engines, for example the NeoAxis engine. By far, OGRE is the most flexible, powerful, and customizable rendering engine.

Wednesday, November 25, 2009

Binary Space Partitioning

BSP (Binary Space Partitioning) is a method used to divide complex polygons into simple ones in order to increase rendering efficiency. It can even take a volume and subdivide it into planes. The first famous use of BSP was in the Doom engine by id Software. Quake later extended BSP with lightmapping (baking lighting values into a texture). BSP is still used today and is the crux of many game engines, including id Tech 5 and the Unreal Engine.

Crucial to the rendering of BSP data is the creation of a BSP tree, which iterates through all nodes and sorts them for rendering. Because of this, a z-buffer is not required: nodes are already sorted in depth order. Normally, however, a z-buffer is still rendered and utilized for the seamless integration of models rendered with traditional methods. Normal models are still used, because BSP is only efficient for angular, straight-edged shapes; monsters, vehicles, and powerups are normally rendered traditionally. Effects such as particles are also rendered separately and composited in through use of the z-buffer.
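To illustrate why the tree traversal makes a z-buffer unnecessary, here is a toy 1D sketch: each node splits space at a plane (reduced here to a single x coordinate), and rendering always recurses into the far side first, so polygons come out back to front no matter where the camera is. The node layout is purely illustrative:

```python
class BSPNode:
    def __init__(self, split, polygon, front=None, back=None):
        self.split = split        # splitting plane position (an x coordinate)
        self.polygon = polygon    # geometry lying on the plane
        self.front = front        # subtree on the +x side
        self.back = back          # subtree on the -x side

def back_to_front(node, camera_x, out):
    if node is None:
        return out
    if camera_x >= node.split:
        # Camera is on the front side: draw the back subtree first,
        # then this node's geometry, then the near subtree.
        back_to_front(node.back, camera_x, out)
        out.append(node.polygon)
        back_to_front(node.front, camera_x, out)
    else:
        back_to_front(node.front, camera_x, out)
        out.append(node.polygon)
        back_to_front(node.back, camera_x, out)
    return out

tree = BSPNode(0, "wall0",
               front=BSPNode(5, "wall5"),
               back=BSPNode(-5, "wall-5"))
order = back_to_front(tree, camera_x=10, out=[])
```

With the camera at x = 10, the walls come out as wall-5, wall0, wall5 (farthest first); move the camera to the other side and the order flips, with no per-pixel depth test needed.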

When using BSP to subdivide volumes, there are two possible methods. One is additive geometry, best thought of as starting with an empty space and adding volumes of different shapes and sizes. In contrast, there is the subtractive type, best thought of as starting with a huge solid block and carving it into a level by subtracting geometry from it.

Personally, I do not normally use BSP, because I prefer to have full control over a mesh rather than work with block-like shapes. However, BSP is perfect for indoor levels or maps with a considerable vertical extent.

Server Side Scripting

Most of you have probably heard of JavaScript or HTML. Both of these are simply text data that is rendered into dynamic and static web pages. These scripts and markups are actually executed on the client that receives the web page, not the server that hosts them. This system works pretty well until you attempt to access high-security files on the server, in which case the scripts must actually be run on the server, not the client. On the server, scripts can also gain the ability to write to databases and create files.

Thus came the birth of server-side scripting, in which scripts are actually run on the server and gain full privileges. Currently, there are many different language options, including ASP, ASP.NET, JSP, PHP, and more. Most of these scripted pages actually have code embedded in the source HTML page, which has been renamed to .php, .asp, or similar. The server, when called on, asks a script interpreter to parse through the page and execute all of the scripts, producing a normal static HTML file. With the proper settings, it is impossible for the client to view the original script source, whereas in JavaScript, code is easily visible.

The advantages of server side scripting are:
- Hide code from clients
- Gain file and database permissions
- Allow code to run even if the client's browser doesn't support scripts
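The interpretation step described above can be sketched with a toy template expander: the server evaluates embedded fragments before the page is sent, so the client only ever sees the resulting static HTML. The `<?= ... ?>` delimiters are just an illustrative convention here, not any particular real template language:

```python
import re

def render_page(source, context):
    # Evaluate each embedded <?= expression ?> on the server and
    # splice the result into the page; the client never sees the
    # expression itself, only its output.
    def expand(match):
        expression = match.group(1).strip()
        return str(eval(expression, {}, context))  # server-side only
    return re.sub(r"<\?=\s*(.*?)\s*\?>", expand, source)

page = "<p>Hello, <?= user ?>! You have <?= count * 2 ?> points.</p>"
html = render_page(page, {"user": "Alice", "count": 21})
```

What ships to the browser is plain HTML with the values filled in, which is exactly why the original script source stays hidden from clients.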

Wednesday, November 18, 2009

Detail Shaders

When attempting to add detail to a 3D model, one option is to simply add more polygons and shape them, but this can drastically lower the framerate of your application. If your application already takes advantage of per-pixel lighting, then you can utilize shaders to give the appearance of detail without actually changing the surface of the mesh. If your application does not use pixel shaders, then you should add support for them.

Take a simple sphere, for example. Through the pixel shader, a texture can easily be applied to the sphere, and both diffuse and specular lighting can also be implemented.

The simplest of the detail shaders is normal mapping. A key element in this is a normal map, which is an image that stores normal data.

The normal mapping effect is done simply by mapping the r value to the x component of the normal, g to y, and b to z. This perturbs the orientation of each pixel's normal, resulting in shading changes that give the appearance of small crevices and bumps.
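The decode can be sketched as follows: each channel is remapped from [0, 255] to [-1, 1], and the resulting vector replaces the surface normal in a standard Lambert diffuse term (the shading model assumed here for illustration):

```python
def decode_normal(r, g, b):
    # Remap each 0-255 channel to the -1..1 range: r -> x, g -> y, b -> z.
    return tuple(c / 255.0 * 2.0 - 1.0 for c in (r, g, b))

def diffuse(normal, light_dir):
    # Lambert term: max(0, N . L), with both vectors assumed unit length.
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)

# The typical "flat" normal-map color (128, 128, 255) decodes to a
# normal pointing almost straight out of the surface, so a light
# shining straight down on it gives full intensity.
n = decode_normal(128, 128, 255)
intensity = diffuse(n, (0.0, 0.0, 1.0))
```

Any pixel whose color deviates from that flat blue tilts the normal, which is what produces the shading variation across an otherwise flat surface.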

This effect, while decent, still does not give a full 3D effect. The next step up is parallax occlusion mapping.

The stones in the above image are actually on a flat plane that appears bumpy through a clever use of the parallax effect, which involves offsetting the texture based on the viewing angle, thus distorting the image. Occlusion testing is done by casting a ray into the pseudo-volume.
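The basic offset behind the effect can be sketched like this: the texture coordinate is shifted along the view direction in proportion to the height sampled at that point. Full parallax occlusion mapping ray-marches the height field; this is the simple single-step version, with an illustrative scale factor:

```python
def parallax_offset(u, v, height, view, scale=0.05):
    # view is the view direction in tangent space, with vz > 0;
    # taller points shift further when seen at a grazing angle.
    vx, vy, vz = view
    return (u + height * scale * vx / vz,
            v + height * scale * vy / vz)

# Looking straight down (view = (0, 0, 1)), no offset occurs.
uv_straight = parallax_offset(0.5, 0.5, height=1.0, view=(0.0, 0.0, 1.0))
# A grazing view shifts the sample sideways, creating the depth illusion.
uv_grazing = parallax_offset(0.5, 0.5, height=1.0, view=(0.5, 0.0, 0.5))
```

Because the shift depends on the view direction, the texture appears to slide over itself as the camera moves, which is exactly the parallax cue our eyes read as depth.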

Using these pixel shader techniques, detail can be added to a model without changing its geometry and without significantly affecting the frame rate. These techniques, especially normal mapping, are widely used in modern games, and parallax mapping is now starting to be seen. They can optimize your game as well as make it look better.

Sunday, November 15, 2009

Cube

This week, I was going through some old folders on my computer when I stumbled across Cube. Cube was a 3D, Quake-style, open source game that was really just a massive tech demo for an equally unique and impressive game engine. Cube had a bunch of unique features that make it still a blast to play. First of all was the in-game editing, which allowed maps to be tweaked during the game. This was a blast to use, and it only got better when you started cooperative multiplayer editing with friends. Then, it was just awesome!





The real beauty of Cube, however, was the multiplayer. Cube allowed impressive, fun, and fast-paced fragging in CTF, deathmatch, and other game modes. Playing with friends was a blast. The only bad thing about Cube was the single player: the single player modes lacked certain features, such as friendly AI, scripted events, and a story. That aside, however, Cube is still amazing, and currently the Cube 2: Sauerbraten project is attempting to surpass it.







Cube 2 has improved on the graphics, gameplay, and networking of the original Cube, and is effectively the "next-gen" version. The new Cube has higher resolution textures, water effects, normal mapping and other lighting effects, larger maps, and more detailed models.

The great thing about the Cube and Cube 2 engines is that they are open source, just like the games themselves. This means that anyone has the ability to make fun and good-looking games. Hopefully the developers will continue to add to their engine and improve their game beyond what it is today.

Scripting Integration

What is the difference between scripting and programming? That is simpler than it sounds. A script is a file containing instructions that are executed by another program. There are many types of scripting languages, from Python to Windows shell scripts, and one of the more prevalent uses of them is to be embedded within another program. This is actually the main goal of many scripting languages, including Lua, which was designed specifically for embedding.

I have started to embed Lua in a game engine that I am developing. I plan on using Lua scripts for a GUI, AI, dynamic maps, and some in-game scripted events. The advantage of using a scripting language instead of hard coding is that you can rapidly prototype your application without a recompile. The hardest part of integrating scripting is developing an API: the scripting API is the interface between the scripts and the objects in the executable. For large APIs, there is a program called tolua++, which can take classes and methods and generate an interface through which Lua scripts can access them. This can greatly reduce the amount of time required to integrate a language. Similar programs exist for other scripting languages as well.

So, now that I have extensively ranted on about scripting integration, it's time for you to go see for yourself if this could help simplify your development process.

In conclusion, scripts:
- Allow rapid prototyping
- Do not require a recompile
- Can have interfaces generated by programs such as tolua++
- Allow modular code
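Since Lua itself needs a C/C++ host to demonstrate, here is the same embedding pattern sketched with Python's exec() standing in for the Lua interpreter: the engine exposes a small API table, and the game script calls it without ever touching engine internals. The engine functions below are hypothetical stand-ins:

```python
class Engine:
    def __init__(self):
        self.log = []

    # Hypothetical engine functions exposed to scripts.
    def spawn(self, name):
        self.log.append(f"spawn:{name}")

    def show_message(self, text):
        self.log.append(f"msg:{text}")

def run_script(engine, source):
    # Only the names in this dict are visible to the script -- this
    # is the "scripting API", analogous to what tolua++ generates
    # for Lua from C++ classes and methods.
    api = {"spawn": engine.spawn, "show_message": engine.show_message}
    exec(source, {"__builtins__": {}}, api)

engine = Engine()
run_script(engine, 'spawn("zombie")\nshow_message("A zombie appears!")')
```

Changing the script changes the game's behavior with no recompile of the host program, which is the whole point of embedding a scripting language.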