[ODE] Ode And Threads

Patrick Enoch Hendrix_ at gmx.net
Mon Jul 9 15:26:09 MST 2007


Hi,

I cannot give away my sources, but these snippets should help you. I
assume you are using Windows.

Create lots of threads (as many threads as there are processors
works best; otherwise Windows will 'choke'):
	dat->taskid = CreateThread( NULL, THREAD_STACKSIZE, _threadstarter, (void*)dat, 0, &dat->threadID );
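
Something like this works for finding the processor count and spawning one
worker per processor; the threaddata array and its fields are just an
illustration, not my real code:

	#include <windows.h>

	#define THREAD_STACKSIZE 0		/* 0 = default stack size for CreateThread */

	SYSTEM_INFO si;
	GetSystemInfo( &si );			// ask Windows how many processors there are
	int numthreads = (int)si.dwNumberOfProcessors;

	for ( int i = 0; i < numthreads; i++ )
	{
		_threaddata *dat = &threaddata[i];	// illustrative: one context block per worker
		dat->taskid = CreateThread( NULL, THREAD_STACKSIZE, _threadstarter,
		                            (void*)dat, 0, &dat->threadID );
	}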

Every thread starts in the function below. Make sure to pass the world and
everything else that is important in the dat (a sketch of such a struct
follows the function):
static DWORD WINAPI _threadstarter( LPVOID param )
{
	_threaddata *dat = (_threaddata*)param;
	while( notdead )
	{
		get_message(&data,&o1,&o2);
		if ( collide_message )
		{
			collide();
			lock_world();
			add_contacts();
			unlock_world();
		}
		if ( barrier_message )
		{
			wait_on_collision_done_barrier();
		}
	}
	return 0;	// the thread function must return a DWORD; returning will kill this thread
}
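
For reference, a rough sketch of what such a _threaddata could carry; the
exact fields are an assumption on my part, and msgq_t stands for the message
queue sketched further below:

	#include <windows.h>
	#include <ode/ode.h>

	typedef struct _threaddata
	{
		HANDLE         taskid;        // thread handle returned by CreateThread
		DWORD          threadID;      // thread id filled in by CreateThread
		dWorldID       world;         // the ODE world the contacts go into
		dJointGroupID  contactgroup;  // contact joints get added to this group
		struct msgq_t *queue;         // shared message queue (collide / barrier messages)
		HANDLE         worldlock;     // semaphore-based lock protecting the world
	} _threaddata;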

A lock is just a semaphore with inicount = 1. Child processes must (be able
to) inherit the handle.
Create a lock for the world, so that ONLY ONE thread can add contacts
at a time!
	// child processes inherit the handle
	SECURITY_DESCRIPTOR secdesc;
	InitializeSecurityDescriptor( &secdesc, SECURITY_DESCRIPTOR_REVISION );
	SECURITY_ATTRIBUTES sec;
	sec.nLength = sizeof(SECURITY_ATTRIBUTES);
	sec.lpSecurityDescriptor = &secdesc;
	sec.bInheritHandle = TRUE;
	HANDLE lock = CreateSemaphore( &sec, inicount, 0x7FFFFFFF, NULL );	// inicount = 1 for a lock
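
lock_wait/lock_release are then just thin wrappers around the Win32 calls,
something like:

	void lock_wait( HANDLE lock )
	{
		// blocks until the semaphore count is > 0, then decrements it
		WaitForSingleObject( lock, INFINITE );
	}

	void lock_release( HANDLE lock )
	{
		// increments the count again so the next waiter can enter
		ReleaseSemaphore( lock, 1, NULL );
	}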

The near callback is just a stub that sends messages to the threads:
static void nifty_ode_nearCallback (void *data, dGeomID o1, dGeomID o2)
{
	send_collide_message(data,o1,o2);
}

... and dSpaceCollide calls it during the main collision pass:
dSpaceCollide( space, data, nifty_ode_nearCallback );
send_all_threads_a_barrier_message();
wait_on_collision_done_barrier();	// wait for all threads to empty the messageq and finish colliding BEFORE YOU CONTINUE

Check the internet for sources on "barriers". A barrier makes every thread
halt until all the others have arrived at it. I suggest you use the pthread
library for Windows (pthreads-win32); then you can reuse all the resources
you can find for "pthread" and "barrier" on the web.
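
A minimal sketch of such a barrier with pthreads-win32; the participant count
of numthreads + 1 (all workers plus the main thread) is an assumption that
matches the flow above:

	#include <pthread.h>

	static pthread_barrier_t collision_done_barrier;

	void init_collision_done_barrier( int numthreads )
	{
		// call once at startup: numthreads workers + 1 main thread take part
		pthread_barrier_init( &collision_done_barrier, NULL, numthreads + 1 );
	}

	// each worker calls this when it receives the barrier message, and the
	// main thread calls it right after dSpaceCollide(); nobody returns
	// until all numthreads + 1 participants have arrived
	void wait_on_collision_done_barrier( void )
	{
		pthread_barrier_wait( &collision_done_barrier );
	}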


The msgq is just an array where I add/remove the last entry (it works like a stack):

sendmessage:
	lock_wait( msgq->lck );
	msgq->msgs[msgq->countmsg].m1 = msg;
	msgq->msgs[msgq->countmsg].m2 = msg2;
	msgq->msgs[msgq->countmsg].m3 = msg3;
	msgq->countmsg++;
	lock_release( msgq->lck );
	semaphore_post( msgq->sem_msgavail );

getmsg:
	if (!semaphore_wait( msgq->sem_msgavail, timeout ))
		return false;	// no message avail
	lock_wait( msgq->lck );
	msgq->countmsg--;
	*msg = msgq->msgs[msgq->countmsg].m1;
	*msg2 = msgq->msgs[msgq->countmsg].m2;
	*msg3 = msgq->msgs[msgq->countmsg].m3;
	lock_release( msgq->lck );

Make the array very large, or use the C++ std::vector<> class.
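
Put together, a minimal C++ sketch of such a queue on top of the Win32
primitives; MAXMSG, the msg_t/msgq_t names and the void* payload slots
(data, o1, o2 in the near-callback case) are illustrative assumptions:

	#include <windows.h>

	#define MAXMSG 4096	// make this generously large

	struct msg_t  { void *m1, *m2, *m3; };
	struct msgq_t
	{
		msg_t  msgs[MAXMSG];
		int    countmsg;
		HANDLE lck;           // semaphore with initial count 1, used as a lock
		HANDLE sem_msgavail;  // counts the messages currently in the queue
	};

	void msgq_init( msgq_t *q )
	{
		q->countmsg     = 0;
		q->lck          = CreateSemaphore( NULL, 1, 1, NULL );
		q->sem_msgavail = CreateSemaphore( NULL, 0, MAXMSG, NULL );
	}

	void msgq_send( msgq_t *q, void *m1, void *m2, void *m3 )
	{
		WaitForSingleObject( q->lck, INFINITE );        // lock_wait
		q->msgs[q->countmsg].m1 = m1;
		q->msgs[q->countmsg].m2 = m2;
		q->msgs[q->countmsg].m3 = m3;
		q->countmsg++;
		ReleaseSemaphore( q->lck, 1, NULL );            // lock_release
		ReleaseSemaphore( q->sem_msgavail, 1, NULL );   // semaphore_post
	}

	bool msgq_get( msgq_t *q, void **m1, void **m2, void **m3, DWORD timeout )
	{
		if ( WaitForSingleObject( q->sem_msgavail, timeout ) != WAIT_OBJECT_0 )
			return false;                               // no message available
		WaitForSingleObject( q->lck, INFINITE );
		q->countmsg--;
		*m1 = q->msgs[q->countmsg].m1;
		*m2 = q->msgs[q->countmsg].m2;
		*m3 = q->msgs[q->countmsg].m3;
		ReleaseSemaphore( q->lck, 1, NULL );
		return true;
	}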


Watch out for which objects are colliding: some colliders use static
variables, so in that case you need to patch the sources.


Memory allocation is apparently a delicate topic when one thread
frees data that another thread has allocated. You might have to
redirect "new" and "delete" to plain malloc/free, like this
#if MEM_REPLACE_NEW
#include <cstddef>	// std::size_t
#include <cstdlib>	// malloc, free
#include <new>		// std::bad_alloc

void *operator new(std::size_t size) throw(std::bad_alloc)
{
	return malloc( size );	// note: a NULL return is not converted into bad_alloc here
}

void *operator new[](std::size_t size) throw(std::bad_alloc)
{
	return operator new( size );
}

void operator delete(void *z) throw()
{
	free( z );
}

void operator delete[](void *z) throw()
{
	operator delete( z );
}
#endif // MEM_REPLACE_NEW
which will give you a speed penalty.


On 09.07.2007, at 22:07, Mikhail Zoline wrote:

> Dear Patrick,
>
> First of all I'd like to thank you for the information you have
> provided in this reply.
>
> Actually I'm working on a bulldozer simulation.
> There I have a pile of many ODE objects that the bulldozer
> can push around and displace. A multi-threading implementation
> could significantly improve the performance of ODE's collision
> handling, but I'm running out of time.
> It would be very kind of you to send the source code of
> your multi-threading implementation. The implementation of classes
> and functions such as msgq, global_lock_acquire( o1, o2 ), etc.
> would help me very much.
>
> Hoping to hear from you soon.
>
> Very truly yours
> MZ
>
> Patrick Enoch wrote:
>> This might be a good place to share my experience with multi-
>> threading ODE:
>> Right now the collisions are running in parallel in ODE without
>> any changes in ODE itself.
>> I am using the near-callback. Here is how:
>> - create "worker threads" (I have about 20). Basically, they are
>> the nearcallback function waiting on a shared, thread-safe
>> message queue for the two object IDs
>> - a near-callback stub, called from dSpaceCollide, that
>> sends the object IDs to the queue
>> - after dSpaceCollide we need to wait until the message queue is empty
>> It looks like this:
>> ODE_do_collide
>> {
>> 	// create  worker threads
>> 	for (i=0;i<20;i++) create_thread( ODE_nearcallback );
>> 	// create msgq
>> 	create msgq;
>> 	// call AABB collider
>> 	dSpaceCollide(world, ODE_nearcallback_stub);
>> 	// wait for empty q
>> 	while (!msgq.isempty()) {};
>> 	// send "kill" message
>> 	for (i=0;i<20;i++) msgq.sendmsg( 0,0 );
>> }
>> ODE_nearcallback_stub( o1, o2 )
>> {
>> 	// can collide at all??
>> 	if ( can_collide(o1,o2) )
>> 	{
>> 		// send "collide" message
>> 		msgq.sendmsg( o1, o2 );
>> 	}
>> }
>> ODE_nearcallback()
>> {
>> 	while (1)
>> 	{
>> 		objectID o1,o2;
>> 		msgq.getmessage( o1,o2 );
>> 		if (o1==0 && o2==0)
>> 		{
>> 			suicide;
>> 		}
>> 		global_lock_acquire( o1, o2 );
>> 		dCollide(o1,o2);
>> 		global_lock_release( o1, o2 );
>> 		global_lock_acquire( world);
>> 		// create contacts
>> 		global_lock_release(world);
>> 	}
>> }
>> You see, you need a lot of workers, because the messages sent to
>> the queue will have lots of objects in common, e.g.
>> - collide A,B (received by thread 1)
>> - collide A,C (received by thread 2, has to wait until A,B is done)
>> - ...
>> - collide D,E (received by thread N, can proceed right away)
>> You need to lock the world, because only one thread can add contacts
>> to the world at a time.
>> If you want to use trimeshes, there is extra work, because the
>> colliders use static data (for speed-up purposes). So you can only
>> collide one trimesh at a time, never in parallel, in the current
>> version of ODE.
>> The collision time is about halved on my dual-core. I guess the
>> indentation got messed up when I sent the above code to the list.
>> ---
>> My next plan is to parallelize the handling of the islands. The
>> data structures do not overlap, which is great. However, this
>> cannot be done without changing ODE (something like a
>> process_islands_callback).
>> best,
>> Patrick


