[Moabusers] Running a moab simulation
Corby
corbyz at gmail.com
Sat Oct 14 15:50:00 MDT 2006
I've been messing around with this for a while now, using the example
workload entry that you gave, but Moab is still doing what it has always done,
i.e. exiting right away. The logs and stats don't show that it did
anything (e.g. "10/14 12:24:51 INFO: scheduling complete (queues are empty) on
iteration 0"). This is the same whether I set the simulation to run now or
set the time to be in the past (I've been using around 12:30 AM Jan 1, 2000)
and update the times in the workload trace accordingly.
I feel like I've tried almost every variation on this setup, and the result
is always the same. Why would Moab just keep exiting right away when doing a
simulation?
-Corby Ziesman
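
For reference when shifting trace times by hand, here is a minimal Python sketch that converts between epoch seconds and time strings written as HH:MM:SS_MM/DD/YY, the form used in the config and trace quoted in this thread. The fixed UTC-7 offset is an assumption inferred from the quoted trace, which pairs 946710020 with 00:00:20_01/01/00. The helper names are hypothetical and are not part of Moab.

```python
from datetime import datetime, timedelta, timezone

# Assumption: trace times are in a fixed UTC-7 zone, inferred from the
# quoted trace pairing 946710020 with 00:00:20_01/01/00.
TRACE_TZ = timezone(timedelta(hours=-7))
TIME_FORMAT = "%H:%M:%S_%m/%d/%y"


def to_epoch(stamp: str) -> int:
    """Convert an 'HH:MM:SS_MM/DD/YY' string to epoch seconds."""
    local = datetime.strptime(stamp, TIME_FORMAT).replace(tzinfo=TRACE_TZ)
    return int(local.timestamp())


def to_stamp(epoch: int) -> str:
    """Convert epoch seconds back to an 'HH:MM:SS_MM/DD/YY' string."""
    return datetime.fromtimestamp(epoch, tz=TRACE_TZ).strftime(TIME_FORMAT)


if __name__ == "__main__":
    print(to_epoch("00:00:20_01/01/00"))
    print(to_stamp(946710030))
```

If the machine running the simulation uses a different zone, the offset would need to change to match.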
On 10/9/06, wightman <wightman at clusterresources.com> wrote:
>
> In simulation Moab only looks for JOBEND events from the workload trace
> file. You should populate the workload trace with JOBEND events only.
>
> - Douglas
>
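
A rough Python sketch of the advice above, keeping only JOBEND records from a list of trace lines. It assumes the event type is the fifth whitespace-separated field and the object type is the third, as in the sample entry quoted further down this thread ("10:33:39 1160411619 job 754 JOBEND ..."). The function name is hypothetical, the match on "JOBEND" is deliberately exact, and the sample records in the demo are truncated for illustration.

```python
def keep_jobend(lines):
    """Return only the job records whose event field is exactly JOBEND."""
    kept = []
    for line in lines:
        fields = line.split()
        # Assumption: fields[2] is the object type and fields[4] the event
        # type, based on the sample entry quoted in this thread.
        if len(fields) >= 5 and fields[2] == "job" and fields[4] == "JOBEND":
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    # Truncated records, for illustration only.
    sample = [
        "10:33:39 1160411619 job 754 JOBEND 1 1 wightman wightman 600",
        "00:00:20_01/01/00 946710020 job 1234 jobcreate 1 1 root root 86400",
    ]
    for record in keep_jobend(sample):
        print(record)
```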
> On Mon, 2006-10-09 at 12:40 -0700, Corby wrote:
> > So when I want to run a simulation, do I just need a JOBCREATE and
> > then Moab will do a JOBSTART and JOBEND as it would if it were
> > actually running?
> >
> >
> >
> >
> > On 10/9/06, wightman <wightman at clusterresources.com> wrote:
> > Workload traces are generated automatically by Moab during normal
> > execution. If you take a look in the "stats" directory you will see
> > "event" files. Here is what a valid entry looks like:
> >
> > 10:33:39 1160411619 job 754 JOBEND 1 1 wightman wightman 600 Completed [batch:1] 1160411014 1160411014 1160411014 1160411619 - - - >= 0M >= 0M - 1160411014 1 0 -:- [RESTARTABLE] - - - 0 163.56 pbs 1 0M 0M 0M 0 2140000000 maka pbs - - [DEFAULT] - - 0.00 - - - - -
> >
> > (all on one line)
> >
> > - Douglas
> >
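
Since a valid entry is described above as sitting on a single line, here is a small Python sketch that merges backslash-continued lines into one-line records. It assumes the trailing backslashes seen in the hand-written trace later in this thread are meant as line continuations. That is a guess about the file, not documented Moab behavior, and the function name is hypothetical.

```python
def join_continuations(lines):
    """Merge lines ending in a backslash into single-line records."""
    records, parts = [], []

    def flush():
        record = " ".join(p for p in parts if p)
        if record:
            records.append(record)
        parts.clear()

    for raw in lines:
        line = raw.strip()
        if line.endswith("\\"):
            # Drop the backslash and keep collecting this record.
            parts.append(line[:-1].strip())
            continue
        parts.append(line)
        flush()
    flush()  # handles a trailing continuation with no final line
    return records


if __name__ == "__main__":
    wrapped = ["a b \\", "c d \\", "e", "f g"]
    for record in join_continuations(wrapped):
        print(record)
```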
> > On Mon, 2006-10-09 at 12:35 -0700, Corby wrote:
> > > I was trying to follow the format from
> > > http://www.clusterresources.com/products/mwm/docs/16.4.0simulations.shtml
> > > but realized quickly something was wrong. The problem is that I couldn't
> > > figure out what exactly. There are so many fields, so there are a lot of
> > > variables when trying to figure out what part of it was wrong.
> > >
> > > I wasn't sure how the workload trace should work. In particular I
> > > wasn't sure if I needed to do jobcreate and then jobstart in order to
> > > make moab actually simulate the execution of a program.
> > >
> > > Because of the setup right now that I'm working with, I wasn't able to
> > > simply have Moab output some accounting records.
> > >
> > > -Corby Ziesman
> > >
> > >
> > >
> > > On 10/9/06, wightman < wightman at clusterresources.com> wrote:
> > > > The workload trace entries are incorrect. Where did you obtain these
> > > > records? Did Moab output them?
> > >
> > > - Douglas
> > >
> > > On Wed, 2006-10-04 at 14:52 -0700, Corby wrote:
> > > > > I'm trying to run a simulation on moab, but every time I run moab it
> > > > > just exits right away and the job I scheduled in my workload trace
> > > > > doesn't seem to have run.
> > > > >
> > > > > I'm not quite sure how to set up the resource and workload traces, and
> > > > > what fields are essential to getting a simulation to actually run at
> > > > > all (I can then worry about making it reflect the hardware and
> > > > > everything I want to simulate later, but first I just need it to
> > > > > work).
> > > >
> > > > > My workload trace just has these two entries:
> > > > >
> > > > > 00:00:20_01/01/00 946710020 job 1234 jobcreate 1 1 root root 86400 \
> > > > > Completed [batch:1] 946710010 0 0 ethernet x86_64 Linux >= 0 >= 0 \
> > > > > - 0 1 -1 - - - /opt/moab/tracefiles/testprogram - -1 0 [DEFAULT] \
> > > > > 1 0 0 0 0 0 - - - - - - - 0.0 - - - - - -
> > > > > 00:00:30_01/01/00 946710030 job 1234 jobstart 1 1 root root 86400 \
> > > > > Completed [batch:1] 946710010 0 0 ethernet x86_64 Linux >= 0 >= 0 \
> > > > > - 0 1 -1 - - - /opt/moab/tracefiles/testprogram - -1 0 [DEFAULT] \
> > > > > 1 0 0 0 0 0 - - - - - - - 0.0 - - - - - -
> > > >
> > > > > and my resource trace just has these:
> > > > >
> > > > > COMPUTENODE AVAILABLE 1 Node001 PBS 27580 6442 100000 2 -1 -1 -1 Linux \
> > > > > x86_64 [NONE] [batch:1] [ethernet] 1.0 [NONE] [NONE] [NONE]
> > > > > COMPUTENODE AVAILABLE 1 Node002 PBS 27580 6442 100000 2 -1 -1 -1 Linux \
> > > > > x86_64 [NONE] [batch:1] [ethernet] 1.0 [NONE] [NONE] [NONE]
> > > >
> > > > > and my moab.cfg:
> > > > >
> > > > > SCHEDCFG[Moab] SERVER=xxxxxxxxxxxxxxx:xx
> > > > > MODE=SIMULATION
> > > > > ADMINCFG[1] USERS=root,root
> > > > >
> > > > > SIMRESOURCETRACEFILE tracefiles/resource.trace
> > > > > SIMWORKLOADTRACEFILE tracefiles/workload.trace
> > > > > SIMSTOPITERATION 0
> > > > > SIMSTARTTIME 00:00:00_01/01/00
> > > > > SIMSTOPTIME 12:00:00_01/01/00
> > > > >
> > > > > RMCFG[base] TYPE=PBS
> > > >
> > > >
> > > > > Are there any obvious mistakes I'm making?
> > > > >
> > > > > Thank you for any time you can spare to help someone new to Moab,
> > > > > -Corby Ziesman
> > > > > _______________________________________________
> > > > > moabusers mailing list
> > > > > moabusers at supercluster.org
> > > > > http://www.supercluster.org/mailman/listinfo/moabusers
> > >
> > >
> >
> >
>
>