<HTML dir=ltr><HEAD><TITLE>Re: [Mauiusers] [torqueusers] Jobs going into incorrect queue</TITLE>
<META http-equiv=Content-Type content="text/html; charset=unicode">
<META content="MSHTML 6.00.2900.3527" name=GENERATOR></HEAD>
<BODY>
<DIV id=idOWAReplyText2659 dir=ltr>
<DIV dir=ltr><FONT face=Arial color=#000000 size=2>Does it fail if you submit a 300-hour job directly to the short queue?</FONT></DIV>
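<DIV dir=ltr><FONT face=Arial size=2>For example, something along these lines would show it. This is only a sketch: short_2h is the queue name from your print server output, and the "sleep 60" payload is just a placeholder I made up.</FONT></DIV>
<PRE>
# Submit a trivial job straight to short_2h with an explicit 300-hour walltime.
# "sleep 60" is only a placeholder payload; qsub reads the script from stdin.
JOBID=$(echo "sleep 60" | qsub -q short_2h -l walltime=300:00:00)

# If qsub accepted the job, show which queue it sits in and the walltime it carries.
if [ -n "$JOBID" ]; then
    qstat -f "$JOBID" | grep -E 'queue =|Resource_List.walltime'
else
    echo "qsub rejected the job"
fi
</PRE>
<DIV dir=ltr><FONT face=Arial size=2>If qsub refuses the job, the queue limit is being enforced on direct submission and the problem is more likely in the routing step. If the job is accepted, the resources_max.walltime check itself may not be applied the way you expect.</FONT></DIV>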
<DIV dir=ltr><FONT face=Arial size=2></FONT> </DIV>
<DIV dir=ltr><FONT face=Arial size=2>--Joe</FONT></DIV></DIV>
<DIV dir=ltr><BR>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> mauiusers-bounces@supercluster.org on behalf of Philip Peartree<BR><B>Sent:</B> Wed 4/22/2009 1:19 PM<BR><B>To:</B> Steve Young<BR><B>Cc:</B> torqueusers@supercluster.org; mauiusers@supercluster.org<BR><B>Subject:</B> Re: [Mauiusers] [torqueusers] Jobs going into incorrect queue<BR></FONT><BR></DIV>
<DIV>
<P><FONT size=2>The reasoning behind the long time limit is that some software we use <BR>is notoriously unpredictable, so it's best to give a <BR>longish time, knowing that most jobs will complete quickly but some can <BR>last nearly the full 2 weeks.<BR><BR><BR>Quoting Steve Young &lt;chemadm@hamilton.edu&gt;:<BR><BR>> Hi Philip,<BR>> Ah I see... yeah, at first glance it looks like it *should* work =). I'm <BR>> using routing queues but they aren't based on walltime, so I'm not sure <BR>> if I have any good suggestions. The routing queues I have set up <BR>> work as expected. What happens when you try submitting a job to each <BR>> of the execution queues? I'd think you should get rejected on the <BR>> short_2h?<BR>><BR>> My point before was to understand why you'd want to let them default <BR>> to a large amount of time instead of making it smaller so it <BR>> finishes quickly and they figure out they need to put in a proper <BR>> walltime. If I queue up something that takes a month to run but <BR>> forget to put in walltime I wouldn't know for two weeks. Then when <BR>> it was killed off by the system I'd have to start again with the <BR>> proper walltime, thus taking a month to get back to where I was when <BR>> it ended prematurely. 
Anyhow, hope this helps.<BR>><BR>> -Steve<BR>><BR>><BR>> On Apr 22, 2009, at 9:16 AM, Philip Peartree wrote:<BR>><BR>>> Steve, you seem to have misunderstood. I have a default walltime<BR>>> set, at 2 weeks (336 hours), and therefore the job should go into the<BR>>> unspec queue, but instead it is going to the short_2h queue, where it<BR>>> shouldn't be able to run (since the max queue walltime is 2h)<BR>>><BR>>> I have included the full output of print server:<BR>>><BR>>> #<BR>>> # Create queues and set their attributes.<BR>>> #<BR>>> #<BR>>> # Create and define queue short_2h<BR>>> #<BR>>> create queue short_2h<BR>>> set queue short_2h queue_type = Execution<BR>>> set queue short_2h Priority = 50<BR>>> set queue short_2h resources_max.walltime = 02:00:00<BR>>> set queue short_2h acl_group_enable = True<BR>>> set queue short_2h acl_groups = nmrc<BR>>> set queue short_2h enabled = True<BR>>> set queue short_2h started = True<BR>>> #<BR>>> # Create and define queue guest<BR>>> #<BR>>> create queue guest<BR>>> set queue guest queue_type = Execution<BR>>> set queue guest Priority = 10<BR>>> set queue guest enabled = True<BR>>> set queue guest started = True<BR>>> #<BR>>> # Create and define queue long_1w<BR>>> #<BR>>> create queue long_1w<BR>>> set queue long_1w queue_type = Execution<BR>>> set queue long_1w Priority = 30<BR>>> set queue long_1w resources_max.walltime = 168:00:00<BR>>> set queue long_1w acl_group_enable = True<BR>>> set queue long_1w acl_groups = nmrc<BR>>> set queue long_1w enabled = True<BR>>> set queue long_1w started = True<BR>>> #<BR>>> # Create and define queue med_12h<BR>>> #<BR>>> create queue med_12h<BR>>> set queue med_12h queue_type = Execution<BR>>> set queue med_12h Priority = 40<BR>>> set queue med_12h resources_max.walltime = 12:00:00<BR>>> set queue med_12h acl_group_enable = True<BR>>> set queue med_12h acl_groups = nmrc<BR>>> set queue med_12h enabled = True<BR>>> set queue med_12h started = True<BR>>> #<BR>>> # Create and define 
queue route<BR>>> #<BR>>> create queue route<BR>>> set queue route queue_type = Route<BR>>> set queue route route_destinations = short_2h<BR>>> set queue route route_destinations += med_12h<BR>>> set queue route route_destinations += long_1w<BR>>> set queue route route_destinations += unspec<BR>>> set queue route route_destinations += guest<BR>>> set queue route enabled = True<BR>>> set queue route started = True<BR>>> #<BR>>> # Create and define queue unspec<BR>>> #<BR>>> create queue unspec<BR>>> set queue unspec queue_type = Execution<BR>>> set queue unspec Priority = 20<BR>>> set queue unspec acl_group_enable = True<BR>>> set queue unspec acl_groups = nmrc<BR>>> set queue unspec enabled = True<BR>>> set queue unspec started = True<BR>>> #<BR>>> # Set server attributes.<BR>>> #<BR>>> set server scheduling = True<BR>>> set server acl_hosts = steel<BR>>> set server managers = root@steel.mib.man.ac.uk<BR>>> set server operators = root@steel.mib.man.ac.uk<BR>>> set server default_queue = route<BR>>> set server log_events = 511<BR>>> set server mail_from = adm<BR>>> set server query_other_jobs = True<BR>>> set server resources_default.walltime = 336:00:00<BR>>> set server scheduler_iteration = 600<BR>>> set server node_check_rate = 150<BR>>> set server tcp_timeout = 6<BR>>> set server queue_centric_limits = True<BR>>> set server mom_job_sync = True<BR>>> set server keep_completed = 300<BR>>> set server next_job_number = 9066<BR>>><BR>>><BR>>> Thanks<BR>>><BR>>> Phil<BR>>><BR>>><BR>>> Quoting Steve Young &lt;chemadm@hamilton.edu&gt;:<BR>>><BR>>>> Hi,<BR>>>> I use a server default for torque.....<BR>>>><BR>>>> set server resources_default.walltime = 24:00:00<BR>>>><BR>>>> This way if they don't specify anything they will default to 24<BR>>>> hours. I took the approach that if the user doesn't specify anything<BR>>>> that they should get a minimal amount of queue time. With this I don't<BR>>>> have to have a queue to handle unspecified. 
I'd rather have their job<BR>>>> finish fairly quickly and realize they didn't specify a time than to<BR>>>> have them go for days/weeks before they realized they didn't specify<BR>>>> it. I'd hate to have a job run for two weeks and then end up getting<BR>>>> killed off because I didn't specify my time. Especially for a job that<BR>>>> can't pick up where it left off and has to start from the beginning<BR>>>> again. Seems like a waste of resources to me. Not sure if this helps<BR>>>> you any. Could you send the rest of the qmgr output?<BR>>>> It's hard to tell why it's getting to the unspec queue if we can't see<BR>>>> the config for it.<BR>>>><BR>>>> -Steve<BR>>>><BR>>>><BR>>>><BR>>>> On Apr 21, 2009, at 1:06 PM, Philip Peartree wrote:<BR>>>><BR>>>>> The default queue is the routing queue, which should place the job<BR>>>>> based on allowed time. That is why it's so puzzling that the jobs end<BR>>>>> up in the short_2h queue, as they should be rejected by that queue and<BR>>>>> the others until they reach the unspec queue.<BR>>>>><BR>>>>><BR>>>>> Quoting "Greenseid, Joseph M (IS)" &lt;Joseph.Greenseid@ngc.com&gt;:<BR>>>>><BR>>>>>> have you tried to set the default queue (set server default_queue =<BR>>>>>> unspec) in qmgr? this is how i route jobs that don't specify<BR>>>>>> resources to a default location...<BR>>>>>><BR>>>>>> --Joe<BR>>>>>><BR>>>>>> ________________________________<BR>>>>>><BR>>>>>> From: mauiusers-bounces@supercluster.org on behalf of Philip Peartree<BR>>>>>> Sent: Tue 4/21/2009 12:32 PM<BR>>>>>> To: torqueusers@supercluster.org; mauiusers@supercluster.org<BR>>>>>> Subject: [Mauiusers] Jobs going into incorrect queue<BR>>>>>><BR>>>>>><BR>>>>>><BR>>>>>> Hi Guys<BR>>>>>><BR>>>>>> I have a problem that jobs appear to be not routing to the correct<BR>>>>>> queue. 
My setup is as follows:<BR>>>>>>><BR>>>>>>> routing queue<BR>>>>>>> 2h queue<BR>>>>>>> 12h queue<BR>>>>>>> 1w queue<BR>>>>>>> unspecified time queue (max time 2w)<BR>>>>>>> guest queue (low priority)<BR>>>>>>><BR>>>>>>> If a time is unspecified at job submission, a default time of 2w<BR>>>>>>> (336h) is set<BR>>>>>>><BR>>>>>>> The routing queue is set up as follows (as taken from qmgr -c 'print<BR>>>>>>> server')<BR>>>>>>><BR>>>>>>> create queue route<BR>>>>>>> set queue route queue_type = Route<BR>>>>>>> set queue route route_destinations = short_2h<BR>>>>>>> set queue route route_destinations += med_12h<BR>>>>>>> set queue route route_destinations += long_1w<BR>>>>>>> set queue route route_destinations += unspec<BR>>>>>>> set queue route route_destinations += guest<BR>>>>>>> set queue route enabled = True<BR>>>>>>> set queue route started = True<BR>>>>>>><BR>>>>>>> My problem is that some jobs with unspecified time (which have<BR>>>>>>> correctly been given a time of 336h) are ending up in the short_2h<BR>>>>>>> queue, which has a higher priority than other queues. 
Does anyone<BR>>>>>>> know<BR>>>>>>> of any possible explanation for this?<BR>>>>>>><BR>>>>>>> Phil Peartree<BR>>>>>>> University of Manchester<BR>>>>>>><BR>>>>>>> _______________________________________________<BR>>>>>>> mauiusers mailing list<BR>>>>>>> mauiusers@supercluster.org<BR>>>>>>> <A href="http://www.supercluster.org/mailman/listinfo/mauiusers">http://www.supercluster.org/mailman/listinfo/mauiusers</A><BR>>>>>>><BR>>>>>><BR>>>>><BR>>>><BR>>> _______________________________________________<BR>>> torqueusers mailing list<BR>>> torqueusers@supercluster.org<BR>>> <A href="http://www.supercluster.org/mailman/listinfo/torqueusers">http://www.supercluster.org/mailman/listinfo/torqueusers</A><BR>><BR></FONT></P></DIV></BODY></HTML>