You may be right. I'm no programmer ;) <br>However, I have one more question for you. I have 11 working nodes, each with<br>2 processors (actually 2 logical cores on a P4 with HT). So I have 22 processors,<br>and Torque recognizes all of them as well. <br>
When I want to submit a job to more than 11 nodes, it won't allow me to do so. <br>I can't tell you the exact message as I don't have access to my cluster (not<br>even remotely) at the moment. <br>Is there a way to set this up? I'm sorry I can't give you any further details now.<br>
<br>Anyway, thank you very much for that code. It works 100% now.<br><br><div class="gmail_quote">On Thu, Feb 21, 2008 at 7:55 PM, Craig West <<a href="mailto:cwest@astro.umass.edu" target="_blank">cwest@astro.umass.edu</a>> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Jozef,<br>
<br>
There isn't actually a processor lost. I just guessed at how the code<br>
worked before I had seen the code itself. After looking at the code you<br>
can see that the first processor sends messages to, and receives messages from, all the<br>
other processors. It doesn't send one to itself.<br>
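<br>
To illustrate the pattern described above, here is a minimal sketch in plain Python. It is only a model, not the program discussed in this thread, and it uses no MPI library. The helper names `exchange_partners` and `simulate` are hypothetical, and it assumes the "first processor" is rank 0.<br>

```python
# Plain-Python model of the communication pattern (no MPI library involved).
# `exchange_partners` and `simulate` are hypothetical names for illustration.

def exchange_partners(size, root=0):
    """Return the ranks that `root` exchanges messages with."""
    # The root talks to every rank except itself.
    return [rank for rank in range(size) if rank != root]


def simulate(size, root=0):
    """Log the exchanges as (kind, source, destination) tuples."""
    log = []
    for rank in exchange_partners(size, root):
        log.append(("send", root, rank))  # root sends a message to rank
        log.append(("recv", rank, root))  # root receives a message from rank
    return log


if __name__ == "__main__":
    # 7 ranks, matching the 7 slots in the exec_host list in this thread.
    for event in simulate(7):
        print(event)
```

With 7 ranks, rank 0 has 6 partners, so every processor takes part and none is lost. Rank 0 simply never appears as its own partner.<br>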
<div><br>
><br>
> It seems to me that one processor is still lost, but I have no bug<br>
> info with this.<br>
> However, when I run it using torque, the job seems to be hung. 'showq'<br>
> shows that the job is running but never finishes.<br>
><br>
</div><div><div></div><div>> All my nodes are running now. qstat -f tells me that the job was<br>
> assigned to these hosts:<br>
><br>
> exec_host =<br>
> f135-15/1+f135-15/0+f135-14/1+f135-14/0+f135-13/1+f135-13/0+f135-12/0<br>
><br>
> I'm thankful for your time and effort.<br>
<br>
<br>
</div></div></blockquote></div><br>