[torquedev] Torque 2.3.0 + GSSAPI problem

Sergio Gelato Sergio.Gelato at astro.su.se
Tue Mar 4 03:21:24 MST 2008


* Enrico Morelli [2008-02-28 09:25:41 +0100]:
> To use OpenAFS and Kerberos I'm trying to use the svn version of torque with
> gssapi support. The compilation ends fine.

I'm still running an older version of that, and haven't had time to
work on it lately. I hope there haven't been any regressions.

> Feb 27 12:33:25 v6-enmr PBS_Server: Connection refused (111) in
> contact_sched, Could not contact Scheduler - port 15004 cannot bind to port
> 1023 in client_to_svr - connection refused

> There is a "Connection refused" that I don't understand.

I think it happens when the server is started before the scheduler (as
in your case).

> But when I try to submit a job (qsub pbsrun -q batch) I receive:
> qsub: Unknown queue MSG=cannot save creds

Never mind the "Unknown queue"; "cannot save creds" can only result from
a non-zero return code by pbsgss_save_creds() in the server. For
troubleshooting, one may want to log the actual error code; but first
check that your TGT is forwardable.
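
A minimal sketch of what logging the actual error code could look like.
The helper name format_cred_error() is hypothetical, not a TORQUE
function, and the only assumption about pbsgss_save_creds() is that it
returns an integer, zero on success:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of TORQUE): build a rejection message
 * that carries the numeric return code instead of a fixed string.
 * Returns the value of snprintf(), i.e. the length the full message needs.
 */
int format_cred_error(char *buf, size_t len, int rc)
{
    return snprintf(buf, len,
                    "cannot save creds (pbsgss_save_creds returned %d)", rc);
}
```

The resulting string could then be passed on in place of the fixed
"cannot save creds" text, so that the code shows up on the client side
as well as in the server log.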

There is a design issue here: how should a GSSAPI-enabled TORQUE behave
when there are no forwarded credentials? This could be either because a
generic binary has been deployed at a site that does not wish to use
GSSAPI, or because a user happens to lack valid credentials. In the
latter case, some sites may still want to let the job proceed (e.g.,
if the credentials are only needed for authenticating the qsub request
and not afterwards). I'd be inclined to make this a run-time
configuration option.
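
To make the idea of a run-time option concrete, here is a rough sketch
of the decision it would control. All names (cred_policy,
should_reject_job) are made up for illustration; nothing like this
exists in the code as far as I know, and a real option would probably
live among the server attributes:

```c
#include <stdbool.h>

/* Hypothetical site policy, set at run time rather than compile time. */
enum cred_policy {
    CRED_OPTIONAL,  /* let the job proceed even without usable credentials */
    CRED_REQUIRED   /* reject the job unless credentials were saved */
};

/*
 * Decide whether a queue-job request should be rejected.
 *   have_creds: the client forwarded credentials with the request
 *   save_rc:    result of trying to save them (0 = success);
 *               ignored when have_creds is false
 */
bool should_reject_job(enum cred_policy policy, bool have_creds, int save_rc)
{
    bool creds_usable = have_creds && save_rc == 0;

    /* Only a site that insists on credentials rejects for lack of them. */
    return policy == CRED_REQUIRED && !creds_usable;
}
```

With CRED_OPTIONAL the same binary serves a site that does not use
GSSAPI at all as well as one that only needs credentials to
authenticate the qsub request.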

> and pbs_server died without messages.

That's definitely a bug. I see that in the relevant piece of code,
the call to req_reject() is not followed by a return. Try adding one:
	req_reject(rc,0,preq,NULL,"cannot save creds");
+	free(ccname);
+	return;
    }
(This is in src/server/req_quejob.c.)
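
For illustration, here is a self-contained toy version of the pattern.
It is not the TORQUE code: struct request, reject(), ack() and
save_creds_stub() are stand-ins I made up, and the stubs only count
replies. It shows why falling through after a rejection is a problem:
the handler goes on to treat the request as if it had succeeded.

```c
#include <stdlib.h>

/* Toy stand-in for a batch request; it only counts replies sent. */
struct request { int replies_sent; };

static void reject(struct request *r) { r->replies_sent++; }
static void ack(struct request *r)    { r->replies_sent++; }

/* Stub for the credential-saving step: non-zero means failure. */
static int save_creds_stub(int fail) { return fail ? -1 : 0; }

/* Buggy shape: after reject() control falls through to the success path. */
void handle_buggy(struct request *r, int fail)
{
    char *ccname = malloc(32);
    if (ccname == NULL) { reject(r); return; }

    if (save_creds_stub(fail) != 0) {
        reject(r);
        /* no free(), no return: execution continues below */
    }
    free(ccname);
    ack(r);                 /* second reply to an already rejected request */
}

/* Fixed shape: release the buffer and leave right after rejecting. */
void handle_fixed(struct request *r, int fail)
{
    char *ccname = malloc(32);
    if (ccname == NULL) { reject(r); return; }

    if (save_creds_stub(fail) != 0) {
        reject(r);
        free(ccname);
        return;
    }
    free(ccname);
    ack(r);
}
```

In the toy version the damage is just a second reply. In a real server,
where rejecting a request may also release it, continuing to use the
request afterwards could plausibly crash the daemon, which would be
consistent with what you saw.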

