Thank you so much for your help =) yet I still have matters to discuss.<div><br></div><div><br></div><div><div class="gmail_quote">On Wed, Nov 30, 2011 at 4:22 PM, Gustavo Correa <span dir="ltr"><<a href="mailto:gus@ldeo.columbia.edu">gus@ldeo.columbia.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">You don't have 8 CPUs of type 'uno'.<br>
This seems to conflict with your mpirun command with -np=8.<br>
You need to match the number of processors you request from Torque and<br>
the number of processes you launch with mpirun.<br>
<br></blockquote><div><br></div><div><br></div><div>1. Why does there have to be a match between processors and processes? I could run 1024 processes on 1 processor (without Torque). Requesting 2 nodes, I could spawn 10000 processes...</div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Also, you wrote:<br>
<br>
#PPS -q uno<br>
<br>
Is this a typo in your email or in your Torque submission script?<br>
It should be:<br>
<br>
#PBS -q uno<br>
<br>
In addition, your PBS script doesn't request nodes, something like<br>
#PBS -l nodes=1:ppn=2<br>
I suppose it will use the default for the queue uno.<br>
However, your qmgr configuration doesn't set a default number of nodes to use,<br>
either for the queues or for the server itself.<br>
<br>
You could do:<br>
qmgr -c 'set queue uno resources_default.nodes = 1'<br>
and likewise for queue dos.<br>
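<br>
Putting the points above together, a minimal sketch of a submission script could look like the one below.<br>
It is wrapped in a hypothetical helper, write_job_script, only so that it can be generated and inspected; the queue name and node request come from this thread, ./a.out is a placeholder program, and $PBS_O_WORKDIR is the directory from which the job was submitted.<br>
The -hostfile flag is an assumption and may differ for your MPI.<br>

```shell
#!/bin/sh
# Write a minimal Torque submission script to the path given as $1.
# Sketch only: adjust the queue, node request and program to your site.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -q uno
#PBS -l nodes=1:ppn=2
# Torque starts the job in $HOME; return to the submission directory.
cd "$PBS_O_WORKDIR"
# 1 node x 2 processors per node = 2 slots, hence -np 2.
mpirun -hostfile "$PBS_NODEFILE" -np 2 ./a.out
EOF
}
```

Calling write_job_script job.pbs and then qsub job.pbs would submit it; job.pbs is just an example file name.<br>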
<br></blockquote><div><br></div><div><br></div><div>2. That's in fact a typo. In the script it says #PBS.</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
More important, is your mpi [and mpiexec] built with Torque support?<br>
For instance, OpenMPI can be built with Torque support, so that it<br>
will use the nodes provided by Torque to run the job.<br>
However, stock packaged MPIs from yum or apt-get are probably not<br>
integrated with Torque.<br>
You would need to build it from source, which is not really hard.<br>
<br>
If you use an mpi that is not integrated with Torque, you need to pass to mpirun/mpiexec<br>
the file created by Torque with the node list.<br>
The file name is held by the environment variable $PBS_NODEFILE.<br>
The syntax varies depending on which MPI you are using; check your mpirun man page,<br>
but it should be something like:<br>
<br>
mpirun -hostfile $PBS_NODEFILE -np 2 ./a.out<br>
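<br>
One way to keep -np in step with what Torque granted is to count the entries in the node file instead of hard-coding the number.<br>
The sketch below assumes the usual Torque layout of one hostname per allocated processor; count_slots is a hypothetical helper name, and the -hostfile flag may differ for your MPI.<br>

```shell
#!/bin/sh
# Count the processor slots listed in a Torque node file.
# $1: path to the node file (normally "$PBS_NODEFILE").
# Assumes one hostname per allocated processor, one per line.
count_slots() {
    # grep -c . counts the non-empty lines
    grep -c . "$1"
}

# Inside a job script this could be used roughly as:
#   NP=$(count_slots "$PBS_NODEFILE")
#   mpirun -hostfile "$PBS_NODEFILE" -np "$NP" ./a.out
```

With nodes=1:ppn=2 the node file would hold two lines, so the derived value would match the -np 2 shown above.<br>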
<br></blockquote><div><br></div><div><div>3. My MPICH2 is version 1.2.1p1. I don't recall if I compiled it with Torque support. Even so, I don't have a variable $PBS_NODEFILE (running "echo $PBS_NODEFILE" returns an empty line).</div>
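<div>A possible explanation for the empty line: Torque sets $PBS_NODEFILE only in the environment of a running job, so an ordinary login shell will not have it. The sketch below, meant to be placed inside the submission script, checks for the variable before it is used; in_torque_job is a hypothetical helper name.</div>

```shell
#!/bin/sh
# Succeeds only when PBS_NODEFILE is set and points at a readable file,
# which is expected to be the case inside a running Torque job.
in_torque_job() {
    [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]
}

if in_torque_job; then
    echo "Inside a Torque job; node file: $PBS_NODEFILE"
else
    echo "PBS_NODEFILE is not set: not running inside a Torque job"
fi
```

<div>Run from an interactive shell, this takes the second branch, which matches the empty output of the echo test.</div>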
</div><div><br></div><div><br></div><div><div>4. I don't know if this is my problem or not, but you talk about mpirun and mpiexec as if they were the same, yet I have used mpiexec most of the time and I'm not sure about the similarities (or differences). You asked if my MPIEXEC is built with Torque support, but a few lines below you mention MPIRUN.</div>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
[ The flag may be -machinefile instead of -hostfile, or something else, depending on your MPI.]<br>
<div class="im"><br>
<br>
On Nov 30, 2011, at 4:11 PM, Ricardo Román Brenes wrote:<br>
<br>
> I'll post some more info since I'm pretty desperate right now :P<br>
><br>
<br>
</div>Oh, yes.<br>
You should always do this, if you want help from the list.<br>
Do you see how much more help you get when you give all the information? :)<br>
<div class="im HOEnZb"><br>
<br>
I hope this helps,<br>
Gus Correa<br><br></div></blockquote></div></div>