You can have a Condor pool that consists of both Unix and Windows machines.
Your central manager can be either Windows or Unix. For example, even if you had a pool consisting strictly of Unix machines, you could use a Windows box for your central manager, and vice versa.
Submitted jobs can originate from either a Windows or a Unix machine, and be destined to run on either a Windows or a Unix machine. Note that there are still restrictions on the supported universes for jobs executed on Windows machines.
So, in summary:
See Section 1.5.
First, make sure that the program really does work outside of Condor under Windows, that the disk is not full, and that the system is not out of user resources.
As the next consideration, know that some Windows programs do not run properly because they are dynamically linked, and they cannot find the .dll files that they depend on. Version 6.4.x of Condor sets the PATH to be empty when running a job. To avoid these difficulties, do one of the following:
Place the line
getenv = true
in the submit description file. This will copy your environment into the job's environment.
Use the command
net start condor
or start the Condor service from the Service Control Manager located in the Windows Control Panel.
Jobs submitted from a Windows machine require a stashed password in order for Condor to perform certain operations on the user's behalf. Refer to section 6.2.3 for information about password storage on Windows. The command that stashes a password for a user is condor_store_cred. See the condor_store_cred manual page for usage details.
The error message that Condor gives if a user has not stashed a password is of the form:
ERROR: No credential stored for username@machinename
Correct this by running:
condor_store_cred add
A difficulty with defaults causes jobs submitted from Unix for execution on a Windows platform to remain in the queue, but make no progress. For jobs with this problem, log files will contain error messages pointing to shadow exceptions.
This difficulty stems from the defaults for whether file transfer takes place. The workaround for this problem is to place the line
TRANSFER_FILES = ALWAYS
into the submit description file for jobs submitted from a Unix machine for execution on a Windows machine.
Condor uses the first network interface it sees on your machine. This problem usually means you have an extra, inactive network interface (such as a RAS dial-up interface) defined before your regular network interface.
To solve this problem, either change the order of your network interfaces in the Control Panel, or explicitly set which network interface Condor should use by adding the following parameter to your Condor configuration file:
NETWORK_INTERFACE = ip-address
where ip-address is the IP address of the interface you wish Condor to use.
This can occur when the machine your job is running on is missing a DLL (Dynamic Link Library) required by your program. The solution is to find the DLL file the program needs and put it in the TRANSFER_INPUT_FILES list in the job's submit description file.
To find out what DLLs your program depends on, right-click the program in Explorer, choose Quickview, and look under "Import List".
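Once the DLL names are known, the submit description file line can be assembled mechanically. The following is a minimal sketch in Python, not part of Condor; the function name and the DLL names in the usage line are hypothetical and only illustrate the shape of the resulting line.

```python
def transfer_input_files_line(dll_names):
    """Build a TRANSFER_INPUT_FILES line for a submit description file."""
    cleaned = []
    for name in dll_names:
        name = name.strip()
        # Skip blanks and duplicates, keeping the original order.
        if name and name not in cleaned:
            cleaned.append(name)
    return "TRANSFER_INPUT_FILES = " + ", ".join(cleaned)


# Hypothetical DLL names, for illustration only.
print(transfer_input_files_line(["mylib.dll", "helper.dll"]))
```

The printed line can be pasted into the submit description file of the job that needs the DLLs.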
Five methods for making network file access work with Condor are given in section 6.2.7.
Given the command
condor_off hostname2
an error message of the form
Can't find address for master hostname2.somewhere.edu
appears. Yet, when looking at the host names with
condor_status -master
the output is of the form
hostname1.somewhere.edu
hostname2
hostname3.somewhere.edu
To correct this incomplete host name, add an entry to the configuration file for DEFAULT_DOMAIN_NAME that specifies the domain name to be used. For the example given, the configuration entry will be
DEFAULT_DOMAIN_NAME = somewhere.edu
After adding this configuration file entry, use condor_restart to restart the Condor daemons and effect the change.
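The intended effect of DEFAULT_DOMAIN_NAME on the host names in the example can be pictured with a short Python sketch. This is only an illustration of the idea of completing an unqualified name; it is not how Condor implements the setting, and the function name is an assumption.

```python
def qualify(hostname, default_domain):
    """Append default_domain to hostname if it has no domain part."""
    if "." in hostname:
        return hostname  # already fully qualified, leave as is
    return hostname + "." + default_domain


hosts = ["hostname1.somewhere.edu", "hostname2", "hostname3.somewhere.edu"]
for host in hosts:
    print(qualify(host, "somewhere.edu"))
```

With the domain from the example, the incomplete name hostname2 becomes hostname2.somewhere.edu, which matches the name that condor_off was looking for.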
An example of a batch script that sets environment variables:
REM set some environment variables
set LICENSE_SERVER=192.168.1.202:5012
set MY_PARAMS=2
REM Run the actual job now
%*
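The same wrapper idea can be sketched in Python, assuming a Python interpreter is available on the execute machine (the batch script has no such requirement). The function names here are hypothetical; the variable values are the ones from the batch script.

```python
import os
import subprocess
import sys


def job_environment(base_env):
    """Return a copy of base_env with the job's variables added."""
    env = dict(base_env)
    env["LICENSE_SERVER"] = "192.168.1.202:5012"
    env["MY_PARAMS"] = "2"
    return env


def run_job(command):
    """Run command (a list of program and arguments) with the job environment."""
    return subprocess.call(command, env=job_environment(os.environ))


# Act as a wrapper only when a command was passed on the command line,
# mirroring the %* at the end of the batch script.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_job(sys.argv[1:]))
```

As with the batch script, the real program and its arguments are given as arguments to the wrapper, and the wrapper's exit status is that of the program.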
First, make sure the condor_schedd is running.
Next, check the SchedLog. It will contain more detailed information about the failure. Frequently, the error is a result of PERMISSION DENIED errors. See the manual's discussion of security settings for details on configuring them properly.
Windows is likely to be running out of desktop heap. Confirm this to be the case by looking in the log for the condor_schedd daemon to see if condor_shadow daemons are immediately exiting with status 128. If this is the case, increase the desktop heap size. Open the registry key:
The SharedSection value can have three values separated by commas. The third value controls the desktop heap size for non-interactive desktops, which the Condor service uses. The default is 512 (Kbytes). 60 condor_shadow daemons consume about 256 Kbytes, hence 120 shadows can run with the default value. To be able to run a maximum of 300 condor_shadow daemons, set this value to 1280.
Reboot the system for the changes to take effect. For more information, see Microsoft Article Q184802.
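The sizing arithmetic for the third SharedSection value can be written out as a small Python sketch. It relies only on the figure quoted above (about 256 Kbytes per 60 condor_shadow daemons); rounding up to a whole multiple of 256 Kbytes is an assumption made here for simplicity, and the function name is hypothetical.

```python
KB_PER_60_SHADOWS = 256  # from the text: 60 shadows use about 256 Kbytes


def desktop_heap_kb(max_shadows):
    """Heap size in Kbytes, rounded up to a multiple of 256, for max_shadows."""
    blocks = (max_shadows + 59) // 60  # number of 60-shadow blocks, rounded up
    return blocks * KB_PER_60_SHADOWS


print(desktop_heap_kb(120))  # 512, the default value
print(desktop_heap_kb(300))  # 1280, the value suggested above
```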