8.11 Stable Release Series 6.0
6.0 is the first version of Condor with ClassAds.
It contains many other fundamental enhancements over version 5.
It is also the first official stable release series, with a
development series (6.1) simultaneously available.
Version 6.0.3
- Fixed a bug that caused the condor_startd to report the wrong
hostname for the submit machine that had claimed a given execute
machine at sites using NIS.
- Fixed a bug in the condor_startd's benchmarking code that
could cause a floating point exception (SIGFPE, signal 8) on very
fast machines, such as newer Alphas.
- Fixed an obscure bug in condor_submit that could occur when
you set a requirements expression that references the "Memory"
attribute.
The bug only showed up with certain forms of the requirements
expression.
Version 6.0.2
- Fixed a bug in the fcntl() call for Solaris 2.6 that was
causing problems with file I/O inside Fortran jobs.
- Fixed a bug in the way the DEFAULT_DOMAIN_NAME
parameter was handled so that this feature now works properly.
- Fixed a bug in how the SOFT_UID_DOMAIN config file
parameter was used in the condor_starter.
This feature is now also documented in the manual (see
section 3.3.7).
- You can now set the RunBenchmarks expression to "False" and
the condor_startd will never run benchmarks, not even at startup
time.
- Fixed a bug in getwd() and getcwd() for sites
that use the NFS automounter.
This bug was only present if user programs tried to call
chdir() themselves.
Such calls are now supported.
- Fixed a bug in the way we were computing the available virtual
memory on HPUX 10.20 machines.
- Fixed a bug in condor_q -analyze so it will correctly identify
more situations where a job won't run.
- Fixed a bug in condor_status -format so that if the requested
attribute isn't available for a given machine, the format string
(including spaces, tabs, newlines, etc.) is still printed, with an
empty string in place of the value for the requested attribute.
- Fixed a bug in the condor_schedd that was causing
condor_history to not print out the first ClassAd attribute of all
jobs that have completed.
- Fixed a bug in condor_q that would cause a segmentation fault
if the argument list was too long.
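
The condor_status -format behavior described above can be illustrated
with a minimal Python sketch. This is not Condor's code: the function
name render_format and the sample machine ads are hypothetical, and the
sketch assumes a printf-style format containing exactly one %s
conversion.

```python
def render_format(fmt, attr, ad):
    """Apply a printf-style format to one attribute of a ClassAd-like dict.

    Hypothetical helper: if the attribute is missing, the format string is
    still printed, with an empty string substituted for the value.
    Assumes fmt contains exactly one %s conversion.
    """
    value = ad.get(attr)
    if value is None:
        value = ""  # missing attribute: keep the surrounding text, drop the value
    return fmt % (value,)


# Made-up machine ads; the second one has no Arch attribute.
ads = [
    {"Name": "a.example.edu", "Arch": "INTEL"},
    {"Name": "b.example.edu"},
]

lines = [render_format("%s\t", "Arch", ad) for ad in ads]
```

In this sketch the second machine still contributes its tab character,
so columns built from several format strings stay aligned.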
Version 6.0.1
Version 6.0 pl4
NOTE: Back in the bad old days, we used this evil "patch level"
version number scheme, with versions like "6.0pl4".
This has all gone away in the current versions of Condor.
- Fixed a bug that could cause a segmentation violation in the
condor_schedd under rare conditions when a condor_shadow exited.
- Fixed a bug that prevented core files created by user jobs from
being transferred back to the submit machine for inspection by the
user who submitted them.
- Fixed a bug that would cause some Condor daemons to go into an
infinite loop if the "ps" command output duplicate entries.
This only happens on certain platforms, and even then, only under rare
conditions.
Condor now handles this case properly.
- Fixed a bug in the condor_shadow that would cause a
segmentation violation if there was a problem writing to the user log
file specified by "log = filename" in the submit file used with
condor_submit.
- Added new command line arguments for the Condor daemons to support
saving the PID (process id) of the given daemon to a file, sending a
signal to the PID specified in a given file, and overriding what
directory is used for logging for a given daemon.
These are primarily for use with the condor_kbdd when it needs to be
started by XDM for the user logged onto the console, instead of
running as root.
See section 3.13.4, "Installing the condor_kbdd", for details.
- Added support for the CREATE_CORE_FILES config file
parameter.
If this setting is defined, Condor will override whatever limits you
have set and in the case of a fatal error, will either create core
files or not depending on the value you specify ("true" or "false").
- Most Condor tools (condor_on, condor_off,
condor_master_off, condor_restart, condor_vacate,
condor_checkpoint, condor_reconfig, condor_reconfig_schedd,
condor_reschedule) can now take the IP address and port you want to
send the command to directly on the command line, instead of only
accepting hostnames.
This IP/port must be passed in a special format used in Condor (which
you will see in the daemon's log files, etc).
It is of the form: <ip.address:port>.
For example: <123.456.789.123:4567>.
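
As an illustration of the <ip.address:port> form, here is a minimal
Python sketch that splits such a string into its address and port.
The function name parse_address is hypothetical and this is not
Condor's own parser. It checks only the overall shape (four dotted
numeric fields, a colon, a numeric port, all inside angle brackets) and
does not validate octet ranges, so the placeholder address above is
accepted.

```python
import re

# Shape of the address: <d.d.d.d:port>, with no range checking on the fields.
_ADDRESS = re.compile(r"^<(\d{1,3}(?:\.\d{1,3}){3}):(\d+)>$")


def parse_address(text):
    """Return (ip, port) for a string of the form <ip.address:port>.

    Hypothetical helper for illustration; raises ValueError for anything
    that does not match the bracketed ip:port shape (e.g. a bare hostname).
    """
    match = _ADDRESS.match(text.strip())
    if match is None:
        raise ValueError("not of the form <ip.address:port>: %r" % text)
    return match.group(1), int(match.group(2))


ip, port = parse_address("<123.456.789.123:4567>")
```

A tool accepting both forms could try this parse first and fall back to
treating the argument as a hostname when ValueError is raised.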
Version 6.0 pl3
- Fixed a bug that would cause a segmentation violation if a
machine was not configured with a full hostname as either the official
hostname or as any of the hostname aliases.
- If your host information does not include a fully qualified
hostname anywhere, you can specify a domain in the
DEFAULT_DOMAIN_NAME parameter in your global config file
which will be appended to your hostname whenever Condor needs to use a
fully qualified name.
- All Condor daemons and most tools now support a "-version"
option that displays the version information and exits.
- The condor_install script now prompts for a short description
of your pool, which it stores in your central manager's local config
file as COLLECTOR_NAME.
This description is used to display the name of your pool when sending
information to the Condor developers.
- When the condor_shadow process starts up, if it is configured
to use a checkpoint server and it cannot connect to the server, the
shadow will check the MAX_DISCARDED_RUN_TIME parameter.
If the job in question has accumulated more CPU minutes than this
parameter, the condor_shadow will keep trying to connect to the
checkpoint server until it is successful.
Otherwise, the condor_shadow will just start the job over from
scratch immediately.
- If Condor is configured to use a checkpoint server, it will only
use the checkpoint server.
Previously, if there was a problem connecting to the checkpoint
server, Condor would fall back to using the submit machine to store
checkpoints.
However, this caused problems with local disks filling up on machines
without much disk space.
- Fixed a rare race condition that could cause a segmentation
violation if a Condor daemon or tool opened a socket to a daemon and
then closed it right away.
- All TCP sockets in Condor now have the "keep alive" socket option
enabled.
This allows Condor daemons to notice if their peer goes away in a hard
crash.
- Fixed a bug that could cause the condor_schedd to kill jobs
without a checkpoint during its graceful shutdown method under certain
conditions.
- The condor_schedd now supports the
MAX_SHADOW_EXCEPTIONS parameter.
If the condor_shadow processes for a given match die due to a fatal
error (an exception) more than this number of times, the
condor_schedd will now relinquish that match and stop trying to
spawn condor_shadow processes for it.
- The "-master" option to condor_status now displays the Name
attribute of all condor_master daemons in your pool, as opposed
to the Machine attribute.
This helps for pools that have submit-only machines joining them, for
example.
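
The MAX_DISCARDED_RUN_TIME rule described in this release can be
sketched as a small decision function. This is an illustration in
Python, not the condor_shadow's code: the function name, argument names
and returned labels are all hypothetical, and the sketch assumes the
accumulated CPU time and the parameter are expressed in the same unit.

```python
def shadow_startup_action(server_reachable, accumulated_cpu, max_discarded_run_time):
    """Decide what a shadow does at startup when a checkpoint server is configured.

    Hypothetical sketch of the rule in the text:
      - server reachable: carry on as usual
      - unreachable and the job has more CPU time than the limit:
        keep retrying, since the accumulated work is worth waiting for
      - unreachable otherwise: discard the little work done and start over
    """
    if server_reachable:
        return "proceed"
    if accumulated_cpu > max_discarded_run_time:
        return "retry_checkpoint_server"
    return "restart_from_scratch"


# A job with 90 units of CPU time and a limit of 60 waits for the server.
action = shadow_startup_action(False, 90, 60)
```

The comparison is strict, matching "more CPU minutes than this
parameter" in the text: a job exactly at the limit is restarted.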
Version 6.0 pl2
- In patch level 1, code was added to more accurately find the
full hostname of the local machine.
Part of this code relied on the resolver, which on many platforms is a
dynamic library.
On Solaris, this library has needed many security patches and the
installation of Solaris on our development machines produced binaries
that are incompatible with sites that haven't applied all the security
patches.
So, the code in Condor that relies on this library was simply removed
for Solaris.
- Version information is now built into Condor.
You can see the CondorVersion attribute in every daemon's
ClassAd.
You can also run the UNIX command "ident" on any Condor binary to see
the version.
- Fixed a bug in the "remote submit" mode of condor_submit.
The remote submit wasn't connecting to the specified schedd, but was
instead trying to connect to the local schedd.
- Fixed a bug in the condor_schedd that could cause it to exit
with an error due to its log file being locked improperly under
certain rare circumstances.
Version 6.0 pl1
- condor_kbdd bug patched: On Silicon Graphics and DEC Alpha
ports, if your X11 server is using Xauthority user authentication, and
the condor_kbdd was unable to read the user's .Xauthority
file for some reason, the condor_kbdd would fall into an infinite
loop.
- When using a Condor Checkpoint Server, the protocol between the
Checkpoint Server and the condor_schedd has been made more robust
for a faulty network connection. Specifically, this improves
reliability when submitting jobs across the Internet and using a
remote Checkpoint Server.
- Fixed a bug concerning MAX_JOBS_RUNNING: The parameter
MAX_JOBS_RUNNING in the config file controls the maximum
number of simultaneous condor_shadow processes allowed on your
submission machine.
The bug was that the number of shadow processes could, under certain
conditions, exceed the number specified by
MAX_JOBS_RUNNING.
- Added new parameter JOB_RENICE_INCREMENT that can be
specified in the config file.
This parameter specifies the UNIX nice level at which the
condor_starter will start the user job.
It works just like the renice(1) command in UNIX.
The value can be any integer between 1 and 19; a value of 19 is the
lowest possible priority.
- Improved response time for condor_userprio.
- Fixed a bug that caused periodic checkpoints to happen more
often than specified.
- Fixed some bugs in the installation procedure for certain
environments that weren't handled properly, and made the documentation
for the installation procedure clearer.
- Fixed a bug on IRIX that could allow vanilla jobs to be started
as root under certain conditions.
This was caused by the non-standard uid that user "nobody" has on
IRIX.
Thanks to Chris Lindsey at NCSA for help discovering this bug.
- On machines where the /etc/hosts file is misconfigured to
list just the hostname first, then the full hostname as an alias,
Condor now correctly finds the full hostname anyway.
- The local config file and local root config file are now located
only through the LOCAL_CONFIG_FILE and
LOCAL_ROOT_CONFIG_FILE parameters in the global config
files.
Previously, /etc/condor and user condor's home directory
(~condor) were searched as well.
This could cause problems with submit-only installations of Condor at
a site that already had Condor installed.
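
The hostname handling described in this series (searching the aliases
for a full hostname, with DEFAULT_DOMAIN_NAME as a fallback since patch
level 3) can be sketched in Python as follows. This is an illustration
only, not Condor's implementation: the function name full_hostname is
hypothetical, the example names are made up, and "full" is assumed here
to mean simply "contains a dot".

```python
def full_hostname(official, aliases, default_domain=None):
    """Pick a fully qualified name from a host entry.

    Hypothetical sketch: the official name is tried first, then each alias
    in order, so a hosts entry that lists the short name first and the
    full name as an alias still yields the full name. If nothing
    qualifies, default_domain (standing in for DEFAULT_DOMAIN_NAME) is
    appended to the official name; without it, None is returned.
    """
    for name in [official] + list(aliases):
        if "." in name:
            return name
    if default_domain:
        return official + "." + default_domain
    return None


# Short name listed first, full name only as an alias.
name = full_hostname("foo", ["foo.cs.example.edu"])
```

With no qualifying alias, full_hostname("foo", [], "cs.example.edu")
yields the same name by appending the configured domain.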
Version 6.0 pl0
- Initial Version 6.0 release.
condor-admin@cs.wisc.edu