[Moabusers] Re: Moab 4.2.2 on a heterogeneous cluster

Wightman wightman at clusterresources.com
Mon Aug 8 17:39:13 MDT 2005


If the test you performed earlier (with MOABTEST) returned the same key
on each cluster then the endian issue has been solved.
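
As a generic illustration of why a byte-order bug shows up as a
signature mismatch (this is not Moab's code; the key, the value and the
use of HMAC-SHA1 are all assumptions made for the sketch), the same
32-bit integer serialized on a little-endian and a big-endian host
yields different bytes, and so different digests:

```python
import hashlib
import hmac
import struct

KEY = b"dog"        # arbitrary key, for illustration only
VALUE = 0x0BABABA   # arbitrary 32-bit value, for illustration only


def digest(payload: bytes) -> str:
    """HMAC-SHA1 of payload (SHA-1 is a stand-in, not Moab's algorithm)."""
    return hmac.new(KEY, payload, hashlib.sha1).hexdigest()


little = struct.pack("<I", VALUE)  # byte layout on a little-endian host (x86)
big = struct.pack(">I", VALUE)     # byte layout on a big-endian host (ppc)

# Same integer, different byte layout, so the two digests disagree.
mismatch = digest(little) != digest(big)

# Serializing in one agreed order (network order, "!") on every host
# produces identical bytes, and therefore identical digests.
agreed = digest(struct.pack("!I", VALUE)) == digest(big)

print(mismatch, agreed)
```

A matching result on both architectures, as in the MOABTEST check,
indicates that both sides are feeding identical bytes into the digest.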

If you are compiling on the intel and the ppc, and the moab directory is
not NFS mounted then you will need to make sure that your secret keys
are the same.  In the "include" directory there is a file named:

moab-local.h

Which contains a #define:

MBUILD_SKEY

The value of MBUILD_SKEY must be the same for both versions if the
clients are to talk to each other.
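
A minimal sketch of checking this, assuming the key appears on a single
"#define MBUILD_SKEY <value>" line (the exact layout of moab-local.h is
not shown here, and the helper names below are hypothetical):

```python
import re
from typing import Optional

# Assumed form: "#define MBUILD_SKEY <value>" on one line. Everything
# after the macro name is captured, including any trailing comment.
_SKEY_RE = re.compile(r"^\s*#\s*define\s+MBUILD_SKEY\s+(.+?)\s*$", re.MULTILINE)


def extract_skey(header_text: str) -> Optional[str]:
    """Return the raw MBUILD_SKEY value from header text, or None if absent."""
    match = _SKEY_RE.search(header_text)
    return match.group(1) if match else None


def skeys_match(header_a: str, header_b: str) -> bool:
    """True only if both headers define MBUILD_SKEY with the same value."""
    key_a = extract_skey(header_a)
    key_b = extract_skey(header_b)
    return key_a is not None and key_a == key_b


# Made-up header contents standing in for the intel and ppc build trees.
intel_header = '#define MBUILD_SKEY "example-key"\n'
ppc_header = '#define MBUILD_SKEY "other-key"\n'
print(skeys_match(intel_header, ppc_header))
```

In practice the two strings would be read from include/moab-local.h in
each build tree; a False result means the two builds carry different
keys.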

- Douglas

On Mon, 2005-08-08 at 18:26 -0500, Laurence P Dawson wrote:
> It doesn't look like that is the problem - I compile different versions 
> into different target directories, so they shouldn't interfere with one 
> another.
> The date on the intel box for the moab binary and the client binaries is
> Aug  8 13:54, and on the ppc it is 2005-08-08 14:17
> 
> showq --version shows moab client version 4.2.2p4 on both powerpc and 
> intel.
> 
> Does snapshot 1123281169 definitely have the fix? - it was the latest 
> one posted on your source site as of this morning, and it was posted on 
> Friday, as I expected.
> 
> 
> 
> Wightman wrote:
> 
> >Is it possible that the binaries are now out of sync?  Moab uses the
> >command ln -f to create hard-links for the individual client commands.
> >What are the timestamps of the client binaries vs. the timestamp of the
> >moab binary (when spread across different machines, make sure the
> >timestamps are reasonably close)?  They should be exactly the same when
> >they are on the same machine.
> >
> >Are the timestamps the same?
> >
> >The "ln -f" is a throwback to the days of limited disk space.  As
> >that is hardly the case anymore we will convert it over to a "cp" and
> >avoid this problem in the future.
> >
> >Thanks,
> >
> >
> >- Douglas
> >
> >On Mon, 2005-08-08 at 17:28 -0500, Laurence P Dawson wrote:
> >  
> >
> >>We still have an issue.
> >>
> >>Both moab-4.2.2p3 and the p4 snapshot give identical checksum results on 
> >>both architectures - shown below.
> >>
> >>CS: 'iEQIQX1nhHaXN8gyn5/TYCbJstk='/'5JUSUk9HtBONhQydnYWXKScoHaA='
> >>
> >>I had to modify the configuration on a ppc machine to try your test 
> >>since moab is not running on ppc - we are running a single scheduler 
> >>instance on an intel box for both architectures. The command we are 
> >>having a problem with is showq. Setting up with moab-4.2.0p3 works 
> >>correctly. The moab.log file  for the 4.2.2p3 and p4-snapshot shows this:
> >>
> >>On Fri, 2005-08-05 at 17:04 -0500, Laurence P Dawson wrote:
> >>
> >>    
> >>
> >>>>08/01 09:51:55 INFO:     connect request from 10.0.51.1
> >>>>08/01 09:51:55 INFO:     received service request from host 'b1n1.vampire'
> >>>>08/01 09:51:55 INFO:     client socket from 'b1n1.vampire' accepted
> >>>>08/01 09:51:55 MSysProcessRequest(S,FALSE)
> >>>>08/01 09:51:55 MSURecvData(S,5000000,TRUE,SC,EMsg)
> >>>>08/01 09:51:55 MSURecvPacket(9,BufP,9,NULL,5000000)
> >>>>08/01 09:51:55 MSURecvPacket(9,BufP,359,NULL,5000000)
> >>>>08/01 09:51:55 ALERT:    signatures do not match
> >>>>08/01 09:51:55 ALERT:    cannot read client packet
> >>>>08/01 09:51:55 MSUSendData(S,5000000,TRUE,TRUE)
> >>>>08/01 09:51:55 INFO:     packet sent (386 bytes of 386)
> >>>>08/01 09:51:55 MSUDisconnect(S)
> >>>>
> >>>>Could this be an endian issue that was introduced in 4.2.2? Any other ideas?
> >>>>        
> >>>>
> >>> 
> >>>
> >>>      
> >>>
> >>Wightman wrote:
> >>
> >>    
> >>
> >>>Laurence,
> >>>
> >>>I would like to verify that we have moved on from the endian issue.  The
> >>>best way to do this is to export the environment variable:
> >>>
> >>>MOABTEST=checksum:bababa,HMAC64,dog
> >>>
> >>>Then, simply run "moab" and check the result.  On each cluster the value
> >>>returned from this test should be exactly the same as all the other
> >>>clusters.
> >>>
> >>>Also, your earlier posts were being held for approval.  I have removed
> >>>your moderation bit so they should go through quickly from now on.
> >>>
> >>>Thanks,
> >>>
> >>>- Douglas
> >>>Cluster Resources, INC.
> >>>
> >>>On Mon, 2005-08-08 at 14:39 -0600, Dave Jackson wrote:
> >>> 
> >>>
> >>>      
> >>>
> >>>>Laurence,
> >>>>
> >>>> Doug Wightman will be able to assist you both with your MoabUsers
> >>>>posting issue and with the original client secret key issue.  Moab has
> >>>>internal diagnostic facilities which should be able to verify correct
> >>>>operation.
> >>>>
> >>>>Dave
> >>>>
> >>>>
> >>>>On Mon, 2005-08-08 at 15:25 -0500, Laurence P Dawson wrote:
> >>>>   
> >>>>
> >>>>        
> >>>>
> >>>>>Dave,
> >>>>>I'm seeing the same problem with moab-4.2.2p4-snap.1123281169
> >>>>>
> >>>>>Also, my original emails do not seem to be making it through to the 
> >>>>>moabusers list...who should I talk to?
> >>>>>
> >>>>>Dave Jackson wrote:
> >>>>>
> >>>>>     
> >>>>>
> >>>>>          
> >>>>>
> >>>>>>Laurence,
> >>>>>>
> >>>>>>A new source based Moab 4.2.2 snapshot distribution has been created
> >>>>>>containing the changes.  Please let us know if this resolves your
> >>>>>>problems.  If you need a binary release, please let us know.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>       
> >>>>>>
> >>>>>>            
> >>>>>>
> >>> 
> >>>
> >>>      
> >>>
> >
> >  
> >
> 



