Motivation in games

| No Comments
Lately I've been thinking about games, and what is it that makes them work. Most likely, everything I'll write here has already been said by somebody else, but I find that putting things down in paper (or blog) is useful to get my thoughts clear.

So, my last realization is that most games revolve around the concept of control or domination. The purpose of a game is for the player to try to control the game world itself, to change it at will. For example, in an FPS, you try to control the game, following its rules, to defeat enemies and meet objectives. In Tetris, you try to control the set of pieces you're dealt.

Of course, the beauty (and fun) of it comes not in the control itself, but in the struggle to get it. A game where you have absolute control over everything wouldn't be much of a game, really. Conversely, a game where you cannot control anything would scarcely be a traditional game, but something more similar to a film or an "interactive experience".

And this is something that got said about the Amnesia game, that it wasn't very game-like. It really doesn't allow you to control much, apart from the direction to run away in terror, yet it was very well received by game critics and sold quite a few units. How is that possible? I believe it has to do with the human desires it tries to instill in the player to fuel his/her motivation to play. Let me explain.

There's a psychological theory that says there are 16 different desires that motivate people into action. They are (copied verbatim from the wikipedia page):

    • Acceptance, the need for approval
    • Curiosity, the need to learn
    • Eating, the need for food
    • Family, the need to raise children
    • Honor, the need to be loyal to the traditional values of one's clan/ethnic group
    • Idealism, the need for social justice
    • Independence, the need for individuality
    • Order, the need for organized, stable, predictable environments
    • Physical activity, the need for exercise
    • Power, the need for influence of will
    • Romance, the need for sex
    • Saving, the need to collect
    • Social contact, the need for friends (peer relationships)
    • Social status, the need for social standing/importance
    • Tranquility, the need to be safe
    • Vengeance, the need to strike back/to win

If we look at the list, the element for what I've been calling "control" would be "power". Also, games generally use some other element from the list, like curiosity (adventure games), order (puzzle games), physical activity (dexterity games) or tranquility (plain survival would fit in here). The main motivator however, from Sim City to Bejeweled, is almost always to exert control in the game world to meet certain objectives.

But not in Amnesia! That game substitutes the control motivation with a terribly strong survival (tranquility) motivation. The game makes it very clear from the start that you're powerless to control almost anything in the world. You can explore, and if you find something nasty, the only thing you can do is run or hide. You're not trying to "beat" the game and come out triumphant feeling you achieved to "change something". You're just trying not to die the next time you open a door!

And now the question is: could this same thing be done with a different main motivator? Could we create a game where the main motivation is independence? Or idealism? Or even vengeance? It most likely wouldn't fit the traditional definition for what a game is, but it would surely be interesting! 

If somebody gives a shot at it, I promise you'll get my money!



Oh, and if somebody hasn't seen Amnesia in action, check out this video. Turn down the volume, by the way, the player gets quite scared. Also, there's quite a lot of f-word in the video...

Skip to 2:00 if you want to get to the really intense (and fun) stuff ;)
 


New year, new me

| 1 Comment
Today's officially my last day working for RandomControl.

In these last two years I've learned a lot. Chema Guerra is the most proficient C++ programmer I've ever seen, with his almost obsessive dedication to clean code and design, and he's certainly passed lots of that to me. When I began here, I thought of myself as a pretty good C++ programmer. Now I'm a lot better and I know I still have a long way to go before I really get to be "pretty good". Thanks for the humility lesson, Chema xD

Also, in this job I've had to work in C++, C, Objective C, python, ruby, javascript, lua and a couple of small scripting languages, filling my need to learn new things all the time. I really need to shake my brain a bit every few days to keep me interested, and that helped a lot.

I also have to thank Chema for giving me so much freedom with working schedule, holidays and work objectives. Another great thing is that, as I had to move from one side of the country to the other, the fact that we've been working from home was a real godsend for me.

To the RC testing team, I have to say you are one of the best CG artists I've ever seen, and your renders are so great it's usually impossible to tell they're CGI. Keep up that amazing work! And most importantly, you're great to work with, always giving great feedback and making our work feel valued. I'll miss working with you, guys!

But it's now time for me to move to new things, so best of luck to the RandomControl team, the most hard-working guys I've seen in my life. I really wish your amazing work really pays off, you sell like pancakes and become trillionaires or something. As for me, it appears I'm going back to my old hobby: games development!

Stay tuned ! ;)

P.S: With a bit of luck, I'll be able to update this blog a bit more often. Most likely I'll just post updates in twitter, but I'll try to post something here too. Btw, my twitter is @jon_valdes :)

Trip to Kenya

| No Comments
I didn't tell most people, but on April 1st I went on a 10 day trip to Kenya. I'm planning to upload here a good portion of the nearly a thousand photos we made during the trip, but until then, here's a good one to show that IT-related services is never the best paid career. Not even down there ;-)

IMGA0061_2.JPG
And here's an image of an absolutely beautiful place on the delta of the Ramisi river, south of Mombasa. The guy in the image is me, running in that crystal clear, mildly-warm water. That beach is an island that disappears when the tide comes up. All white sand and no-one else but ourselves in the island. I just loved that place.

IMGP1330.JPG
When I have some more time, I'll create a gallery with loads and loads of photographs of that extremely beautiful country.

GPU Raymarching with Distance Fields

| No Comments
These last two weeks I've been pretty distracted with personal issues, and to get back on track it always helps me to start a small personal project in which I experiment something I've never done before. So here we go: GPU Raymarching using Distance Fields.

This project is completely based on the work of Iñigo Quilez, the great IQ from rgba. He put some slides in his website explaining how to do this kind of rendering, and after reading them I just had to try it for myself.

If you want to read more, just go here: 

And now, time for some videos:



 

I already have another build with more shapes and weird things, but it'll have to wait until I get distracted again ;-)

On C++ performance: The Evil Mr. Branch

| No Comments
Here's a simple problem . For some reason, you've got to select between two types or values: smaller (or equal) than 50, and bigger than 50. And you need to do that fast using one core (no multithreading). ¿How do you do it?

For our example, we'll have an array called data where all values are stored, and an array called results, where an integer value is stored, indicating if the input was bigger or smaller than 50.

So, this is the straighfoward way to do it.
for(int i=0;i<NUM_DATA;++i)
{
    if(data[i] <= 50)
        results[i] = SMALLERTHAN_50;
    else
        results[i] = BIGGERTHAN_50;
}
And most of us (and myself 2 days ago) would say: "you won't get much faster than that!". Well, in fact you can. How?

By not branching!

In school they show you that branching can make the processor flush the data it has in the pipeline and start over. They also tell you that today's general-purpose processors do not generally take a big hit when branching, but that hit still exists (I'm talking x86 PCs, here. Consoles DO have serious problems with branching).

So, how do you do this without branching? With a small trick ;-)

result = x + (y-x) & condition

If condition has all bits to 0, the second half of the computation will be 0, so result ends up being x. On the other side, if condition has all bits to 1, the x cancels out, and we get result=y.

So we get a function like this (code shamelessly taken from here):
int isel( int a, int x, int y )
{
    int mask = a >> 31; // arithmetic shift right, splat out the sign bit
    return x + ((y - x) & mask); // mask is 0xFFFFFFFF if (a < 0) and 0x00 otherwise.
}
If a is positive, it returns x, if negative, returns y.

So using this function, we can get our array sorted out just like this:
for(int i=0;i<NUM_DATA;++i)
    results[i] = isel(50 - data[i], SMALLERTHAN_50, BIGGERTHAN_50); 
It works just as well... the only question is performance. What will run faster?

Well, in my machine (a Core 2 Duo at 2.4GHz), for NUM_DATA=10,000, and 200,000 iterations of those loops, I get:

- Using GCC with no optimizations:
Elapsed time BRANCHING:       19.156000000  sec
Elapsed time NOT BRANCHING:   17.566000000  sec
- Using GCC with -O3:
Elapsed time BRANCHING:       2.605000000  sec
Elapsed time NOT BRANCHING:   1.981000000  sec
The difference is not that big without optimizations, but with them, it's clear as day: not branching gets you further. More than a 20%, in this case!!

And this goes to show you what some people might already have told you: compilers inline the hell out of your code if set to high optimization values, even if you don't explicitly ask them to. This code is faster using a function call inside the main loop than without it!

By the way, GCC is NOT generating MMX, SSE, or any instructions like that. This is the disassembly for the main loop in the branchless version (the text on the right is my interpretation for the asm):

.text:00401170 loc_401170:
.text:00401170             mov     eax, ecx                  ## eax=50
.text:00401172             sub     eax, [ebx+edx*4]          ## eax-=data[i]
.text:00401175             shr     eax, 1Fh                  ## eax>>=31
.text:00401178             mov     [edi+edx*4], eax          ## result[i] = eax
.text:0040117B             inc     edx                       ## ++i
.text:0040117C             cmp     edx, 270Fh                ## if(i<NUM_DATA)
.text:00401182             jle     short loc_401170          ##    repeat loop

The process is heavily optimized by GCC, but it's all there in normal, honest-to-fsm, everyday-life instructions.

I do still have some doubts about this, however, as the GCC optimization for the branching loop doesn't use a j(n)le or jz instruction, but setnle. And I can't find out if this instruction does in fact flush the pipeline or not. Anyway, it still results in a faster processing when using the isel version.

For comparison, here's the optimized branching loop:
.text:004010F0 loc_4010F0:
.text:004010F0             xor     eax, eax
.text:004010F2             cmp     dword ptr [ebx+edx*4], 32h
.text:004010F6             setnle  al
.text:004010F9             mov     [esi+edx*4], eax
.text:004010FC             inc     edx
.text:004010FD             cmp     edx, 270Fh
.text:00401103             jle     short loc_4010F0
And, well, here's the whole testing code if anyone is curious about it:
#include <cstdlib>
#include <cstdio>
#include <ctime>

enum{ SMALLERTHAN_50, BIGGERTHAN_50};

int isel( int a, int x, int y )
{
    int mask = a >> 31; // arithmetic shift right, splat out the sign bit
    return x + ((y - x) & mask); // mask is 0xFFFFFFFF if (a < 0) and 0x00 otherwise.
}

int main ()
{
    const int NUM_DATA = 10000;
    const size_t NUM_ITERS = 200000;

    clock_t startTime, endTime;

    int * data     = new int[NUM_DATA];
    int * results  = new int[NUM_DATA];
    int * results2 = new int[NUM_DATA];
    
    // Initialize test data
    for(int i=0;i<NUM_DATA;++i)
        data[i]= rand()%100; 

    startTime = clock();
    // ------------------------------ 
    // Branch test
    for (int j=0;j<NUM_ITERS;++j)
        for(int i=0;i<NUM_DATA;++i)
        {
            if(data[i] <= 50)
                results[i] = SMALLERTHAN_50;
            else
                results[i] = BIGGERTHAN_50;
        }
    // ------------------------------ 
    endTime = clock();

    printf ("\tElapsed time BRANCHING:       %.9f  sec\n", 
        (float)(endTime-startTime) / CLOCKS_PER_SEC);
    
    startTime = clock();
    // ------------------------------ 
    // Branchless test
    for (int j=0;j<NUM_ITERS;++j)
        for(int i=0;i<NUM_DATA;++i)
            results2[i] = isel(
                50 - data[i], 
                SMALLERTHAN_50, 
                BIGGERTHAN_50);                                                                                   
    // ------------------------------ 
    endTime = clock();
 
    printf ("\tElapsed time NOT BRANCHING:   %.9f  sec\n",
        (float)(endTime-startTime) / CLOCKS_PER_SEC);

    // Check we didn't mess things up 
    for(int i=0;i<NUM_DATA;++i)
        if( results[i] != results2[i])
            printf ("ERROR in elem %i=%i, first value: %i, second value: %i\n",
                i, data[i], results[i], results2[i]);
}                                          

C++: Could you please code cleanly? Pretty please?

| 8 Comments
One of my little things when coding is that I always feel ashamed when my code is not clean. This means I try very hard to make my code simple and easy to understand. I don't always succeed (not even close), but I try.

I wasn't always like that, mind you. As a teenager, my code was just a horrible mess of spaghetti code. But Pablo Orduña showed me long ago the value of easily understandable code, and since then I always try my best to code in a clear and non-hackish way.

Anyway, these days I've found myself working with code from other people. And, at that, people that didn't try to create readable code. I'm sure they did know how to do it. They just weren't in the mood. You can just read it in the code: "if it works, ship it!"

As a small example (there are much much worse things than this), here's a slightly modified version of a function I found the other day:
bool RetrieveProperty(object_t * object, string propName, VARIANT* pPropertyValue)
{
	PropertyReader * pPropertyReader = NULL;
	object->FromInterface(IID_IPropertyReader, (void**)&pPropertyReader);
	if (pPropertyReader)
	{
		Properties * pProperties = NULL;
		pPropertyReader->get_Properties(&pProperties);
		if (pProperties)
		{
			long propCount = 0;
			if (pProperties->get_Count(&propCount) == OK)
			{
				for (long i = 0; i < propCount; i++)
				{
					Property * pProperty = NULL;
					pProperties->get_Item(i, &pProperty);
					if (pProperty)
					{
						PropertyDictionary * pPropertyDictionary = NULL;
						pProperty->FromInterface(IID_PropertyDictionary, (void**)&pPropertyDictionary);
						if (pPropertyDictionary)
						{
							string bDictionaryName;
							pPropertyDictionary->get_Name(&bDictionaryName);
							
							if (!strcmp((char*)bDictionaryName.c_str(), "MY_DICTIONARY"))
							{
								HRESULT hr;
								if ((hr = pPropertyDictionary->get_Item(PropertyName, pPropertyValue)) == OK)
									return true;					
							}
						}
					}
				}
			}
		}
	}

	return false;
}
Now, did anyone notice something odd? The nested ifs and loops? They make it unnecessarily hard to follow the program flow and they don't buy us anything in return!

In fact,the whole structure of the function allows us to remove most of the 9 (yes, 9) indentation levels in that code. We just have to reverse the ifs and return early. Like this:
bool RetrieveProperty(object_t * object, string propName, VARIANT* pPropertyValue)
{
	PropertyReader * pPropertyReader = NULL;
	object->FromInterface( IID_IPropertyReader, (void**)&pPropertyReader);
	if (!pPropertyReader)
		return false;

	Properties * pProperties = NULL;
	pPropertyReader->get_Properties(&pProperties);
	if (!pProperties)
		return false;

	long propCount = 0;
	if (pProperties->get_Count(&propCount) != OK)
		return false;
		
	for (long i = 0; i < propCount; i++)
	{
		Property * pProperty = NULL;
		pProperties->get_Item(i, &pProperty);
		if (!pProperty)
			continue;
		
		PropertyDictionary * pPropertyDictionary = NULL;
		pProperty->FromInterface(IID_PropertyDictionary, (void**)&pPropertyDictionary);
		if (!pPropertyDictionary)
			continue;
	
		string bDictionaryName;
		pPropertyDictionary->get_Name(&bDictionaryName);
	
		if (strcmp((char*)bDictionaryName.c_str(), "MY_DICTIONARY"))
			continue;

 		HRESULT hr; // why do we need this, I say?
		if (hr = pPropertyDictionary->get_Item(PropertyName, pPropertyValue) == OK)
			return true;					
	}

	return false;
}
Max indentation: 3 levels.

It takes less than a minute, it's safe to do, the meaning of the program is exactly the same... and it's much cleaner and easy to understand! And as a bonus, it isn't even longer than the previous version!!

There are even clever people that have said this before me several times!

So please, do try. I know it seems easier and faster not to, but it's only in the short run. Some day (when you have to go back to try to understand your own code) you'll regret it.

Still unconvinced? Now try imagining this with 700 loc functions (and yes, I have a few like that one here...)

New job!

| No Comments
Tomorrow monday is officially my first day working for RandomControl as the fryrender plugins subsystem developer and maintainer.

By the way, I'll be telecommuting from home, so I'll avoid cubicle-work once again (yay!)

And, well, I hope this job leaves me some time (and it doesn't eliminate my coding-mood) so I can keep on posting some weird stuff here!

And now, I'm going to bed. Gotta get up early to go to the office tomorrow in the morning :-P

Voronoi video-filling

| No Comments
After showing the last test to a colleague, he guessed that this method would look really bad if seen in motion, using a random sampling method for the image. He thought that a constant sampling pattern would be needed... and so I had to test it out.

This is the result, using a video sample from a Fringe episode. I think it's not half-bad when you get past 1000-2000 samples in the image. And it's completely real-time! :-)

Voronoi image-filling

| No Comments
Little test to see what can you do to approximate a big image using only a few samples, and using Voronoi diagrams for that. It's not amazingly cool or anything, but it runs in real-time :)

C++ template instantiation problem (and some solutions)

| No Comments

Disclaimer: Really long post. If you're not interested in C++ templates (or in correcting me when I write about them), don't read this! Also, if you know more about C++ than I do, please respond with a better solution!

Now, these days I've been reading one of Herb Sutter's wonderful books (Exceptional C++ Style), and one advice he gives is "Where possible, prefer writing functions as nonmembers nonfriends". His arguments seemed pretty solid, so I decided to give it a shot. However, not a day after reading that, I've already found something that keeps me from doing it in certain types of templated code.

For the small engine I've been writing, I've coded a templated Vec2<T> class. And writing some code to test it, g++ gave this wonderful error:

error: no match for 'operator*' in 'geom::Vec2<float>(((const float&)((const float*)(&-1.0e+1f))), ((const float&)((const float*)(&0.0f)))) * ((#'float_expr' not supported by dump_expr#<expression error> * 5.00000000000000010408340855860842566471546888351e-3) + 1.0e+0)'

The line that gave that error is this one:

geom::Vec2<float> v = geom::Vec2<float>(-10,0) * (1+rand()%10*0.005);

The problem (leaving aside some really weird things about float_expr and dump_expr) seems to be that the compiler can't find operator*, even though I did write one. So, what's happening?

Let's see. operator* is defined as a nonmember method, like this:

template <typename T> const Vec2<T> operator*(const Vec2<T> &lhs, T rhs) 
    return Vec2<T>(lhs.x*rhs,lhs.y*rhs); 
}

And if we look close at the line where it fails, we see it's a multiplication just like    

Vec2<float>(10,0) * 0.005

See the problem now?

The compiler, at the moment of the operator* template instantiation, has to choose the type of the template to instantiate, but finds none that fits exactly. It finds that it should call operator*(Vec2<float>, double), but there's only operator*(Vec2<T>,T) defined, so it just sighs and proclaims "What the hell do you want me to do with this?".

In fact, what we probably want it to do is to convert that double to a float, and then choose the float version of operator*. However, the compiler is not smart enough to do that. As it happens, templated functions parameter type selection and automatic type conversion don't usually mix very well (I read something about it from one of Sutter's books, but I don't remember the exact details). So what can we do? 


Option 1 (bad)

One option would be to give the compiler a little nudge (well, not so little really) so it chooses the correct template instantiation. For example, this would compile cleanly:

geom::Vec2<float> v = geom::Vec2<float>(-10,0) * (float)(1+rand()%10*0.005);

However, it's not very polite of us, as utility class programmers, to negate the class user the option to multiply a float vector by a double scalar. So what other option is there? 


Option 2 (better?)

Another option would be to make the operator* method to be a template with 2 typenames, one for each side of the operation. Like this:

template <typename T,typename Y> 
const Vec2<T> operator*(const Vec2<T> &lhs, Y rhs)
{
    return Vec2<T>(lhs.x*rhs,lhs.y*rhs);
}

This would work pretty well. 

As a side note, we would have problems if we tried to do multiplications with the scalar in the left side, like: 

Vec2<float> v = 2.0 * Vec2<float>(10,0);

So we would have to define the swapped version too:

template <typename T,typename Y> 
const Vec2<T> operator*(Y lhs, const Vec2<T> &rhs)
{
    return Vec2<T>(rhs.x*lhs,rhs.y*lhs);
}

The problem with this approach is that every different type we use will instantiate another version of the code. For this particular method it won't really matter because the compiler will probably inline it anyway. But for more complicated methods it will just instantiate another full version of the method, creating a serious case of template bloat. For example (assuming they are not inlined), each of these expressions would instantiate another version of operator*:

Vec2<float> v1 = Vec2<float>(10,0) * 65; 
Vec2<float> v2 = Vec2<float>(10,0) * 65.0; 
Vec2<float> v3 = Vec2<float>(10,0) * 65.0f; 
Vec2<float> v4 = Vec2<float>(10,0) * 'a';


In fact, if we compile those lines with the -ggdb flag, and then inspect them with gdb, we'll see all the instantiations appear:

(gdb) info function geom::operator*

All functions matching regular expression "geom::operator":

File test.cpp:

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, char>(geom::Vec2<float> const&, char);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, double>(geom::Vec2<float> const&, double);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, float>(geom::Vec2<float> const&, float);

const geom::Vec2<float> geom::Vec2<float> const geom::operator*<float, int>(geom::Vec2<float> const&, int);


I know this is an exaggerated example, but template bloat (excessive template instantiation) can be a real problem. So, in the end, I'm not sure this option is a very good idea.


Option 3 (almost good?)

Another option (even though we wanted to avoid it from the start to follow the advice of Mr Sutter) is a member method. As simple as this:

template <typename T> struct Vec2
{
    ...
    Vec2<T> operator*(T scalar) const
    {
        return Vec2<T>(x*scalar,y*scalar);
    }
};

This method would have no problem instantiating, as there's only one possible T it can accept, and it's clear that we want to convert the type of scalar to that type T. And for the same reason, it would only instantiate once (again, assuming the compiler won't inline it because of size or something).

But we would still have the problem with the swapped version. If we try to do a multiplication like 2.0*Vec2<float>(10,0), we'll get a nice compiler error. What can we do about that?

The best way I've found is to do this:

template <typename T, typename Y> 
const Vec2<T> operator*(Y lhs, const Vec2<T> &rhs)
{
    return rhs*lhs;
}

That is, create a nonmember double-typenamed swapped version, and make it just call the member straight one. Hopefully the compiler will inline it (it's a small method after all) and we'll avoid the associated template bloat I talked about previously.

So, in the straight version we've got no template bloat, and in the swapped one we will most likely avoid it too, thanks to the compiler inlining capabilities. We would, however, be forced to use member methods and ignore Sutter's advice this time.


And I've already run out of ideas to get this to work properly. If anyone knows of a more elegant way to do it, or has found an error in code, logic, technique, grammar or whatever, please leave a comment :-)


Recent Comments

  • stenyak.com: Good luck, and happy new hacking 2012! read more
  • Jon Valdés Furriel: Yes, using the != operator makes it quite cleaner and read more
  • bob: if (strcmp((char*)bDictionaryName.c_str(), "MY_DICTIONARY")) should be written as if (bDictionaryName != read more
  • Jon Valdés Furriel: Oh, and about the "Need to do something after the read more
  • Jon Valdés Furriel: The thing about returns is that, technically, yes, they are read more
  • stenyak: Personally, I tend to go with the first version (nested read more
  • deigote: Hehehe I agree, is totally unreadable (admit it, you use read more
  • Jon Valdés Furriel: Hi Diego! Of course the best way to do it read more
  • deigote: I agree with you about the not-hackish-way is the correct read more
  • Jon Valdés Furriel: Hi fmonkey! Nice to have you around ;) Old posts read more