When setting up the cluster object, the log file (e.g., fireSense_SpreadFit_2021-04-12_124702_pid27400.log) is specified explicitly, which requires that its full path exist on the remote cluster nodes.
This location should perhaps be in /tmp on the remote machine(s), because there is no guarantee that the project output path exists there. If that path does not exist, fitting fails with the following error:
Error in file(outfile, open="a") : cannot open the connection
Calls: workRSOCK -> sinkWorkerOutput -> file
In addition: Warning message:
In file(outfile, open="a") :
  cannot open file '/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log': No such file or directory
Execution halted
Error in file(outfile, open="a") : cannot open the connection
Calls: workRSOCK -> sinkWorkerOutput -> file
In addition: Warning message:
In file(outfile, open="a") :
  cannot open file '/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log': No such file or directory
Execution halted
Error in file(outfile, open="a") : cannot open the connection
Calls: workRSOCK -> sinkWorkerOutput -> file
In addition: Warning message:
In file(outfile, open="a") :
  cannot open file '/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log': No such file or directory
Execution halted
2021-04-12 10:47:45 ERROR::e: Failed to launch and connect to R worker on remote machine ‘forcast01.local’ from local machine ‘forcast02’.
 * The error produced by socketConnection() was: ‘reached elapsed time limit’ (which suggests that the connection timeout of 120 seconds (argument 'connectTimeout') kicked in)
 * The localhost socket connection that failed to connect to the R worker used port 11604 using a communication timeout of 2592000 seconds and a connection timeout of 120 seconds.
 * Worker launch call: '/usr/bin/ssh' -R 11672:localhost:11604 forcast01.local "'Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e) parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11672 OUT=/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log TIMEOUT=2592000 XDR=FALSE".
 * Troubleshooting suggestions:
   - Suggestion #1: Set 'verbose=TRUE' to see more details.
   - Suggestion #2: Set 'outfile=NULL' to see output from worker.
   - Suggestion #3: Set 'rshlogfile=TRUE' to enable logging for ‘/usr/bin/ssh’.
 * Number of attempts: 3 (15s delay)
2021-04-12 10:47:45 ERROR::e: socketConnection("localhost", port=port, server=TRUE, blocking=TRUE, open="a+b", timeout=timeout)
Error in socketConnection("localhost", port=port, server=TRUE, blocking=TRUE, : Failed to launch and connect to R worker on remote machine ‘forcast01.local’ from local machine ‘forcast02’.
 * The error produced by socketConnection() was: ‘reached elapsed time limit’ (which suggests that the connection timeout of 120 seconds (argument 'connectTimeout') kicked in)
 * The localhost socket connection that failed to connect to the R worker used port 11604 using a communication timeout of 2592000 seconds and a connection timeout of 120 seconds.
 * Worker launch call: '/usr/bin/ssh' -R 11672:localhost:11604 forcast01.local "'Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e) parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11672 OUT=/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log TIMEOUT=2592000 XDR=FALSE".
The error about timeout is a red herring, as it's the warning about the path not existing that is the real culprit.
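One possible workaround, sketched under assumptions rather than taken from the module's code: keep the log file name but point it at a directory that is assumed to exist on every node, such as /tmp. The helper `remote_safe_logfile()` is hypothetical; `outfile` is the `makeClusterPSOCK()` argument named in the troubleshooting suggestions above, and the cluster call itself is guarded because it needs SSH access to the worker.

```r
# Hypothetical helper (not part of fireSense_SpreadFit): keep the log file
# name, but place it in a directory assumed to exist on every node.
remote_safe_logfile <- function(log_path, dir = "/tmp") {
  file.path(dir, basename(log_path))
}

# Log path as it appears in the failing worker launch call.
log_path <- "/mnt/projects2/WBI_SBW/outputs/AB/fireSense_SpreadFit_2021-04-12_104043_pid27400.log"
outfile <- remote_safe_logfile(log_path)

# Not run here: requires SSH access to the remote worker.
if (FALSE) {
  cl <- parallelly::makeClusterPSOCK("forcast01.local", outfile = outfile)
  parallel::stopCluster(cl)
}
```

With this approach each worker writes its log locally under /tmp, so the logs would have to be collected from the nodes afterwards if they are needed alongside the project outputs.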