tl;dr
It would be great if you could choose to end a job early in such a way that the data is still collected and reported.
Currently with COSBench, after a job has completed, the results from the drivers are collected and reported, such as Bandwidth, Response-Time, Success-Ratio, etc. If a job is terminated during the "main" phase, the status shows as terminated and no data is collected or reported.
Hypothetical Scenario: You have launched a long-running 20-hr job in COSBench to prove a new Swift cluster, for example. What if you are at the 18th hour and the network has to be restarted to install critical patches, say for the Heartbleed OpenSSL bug, and the people doing the patching will not wait 2 more hours for the job to complete? From the GUI, I select the Job and click a button to prematurely complete it. This gracefully stops the work on the Drivers and gathers the data on the Controller for viewing. I can then see the BW, RespTime, Succ-Rat, etc. for the 18 of the 20 hours. The status does not need to say "Success"; it could instead say "Conditional-Success", "Provisional-Success", or "Terminated-Success" to reflect that the job as submitted did not complete, and that it was stopped at the direction of the operator rather than by an error.
I don't know whether it is feasible to communicate with the Drivers in this fashion. Mainly, I would like the ability to still view the statistical data for jobs that do not complete.
It's certainly feasible. In fact, the controller already has the data points collected before the failure time; we currently just discard them. We will consider supporting this. By the way, v0.4.0 beta2 will be uploaded this week and includes a fix to avoid termination during long runs; you may want to try it before this issue is resolved.
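To make the idea concrete, here is a rough sketch of how data points that were already gathered before an early stop could be rolled up into a report instead of being discarded. This is not COSBench code: the `Snapshot` record, its fields, the `summarize` function, and the status strings are all hypothetical names made up for illustration, and the real controller may structure its samples quite differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Snapshot:
    """One periodic sample from a driver (hypothetical shape, not COSBench's)."""
    ops: int              # operations attempted in the interval
    ok_ops: int           # operations that succeeded
    bytes_moved: int      # bytes transferred in the interval
    total_resp_ms: float  # summed response time of the successful operations
    interval_s: float     # length of the interval in seconds


def summarize(snapshots: List[Snapshot], requested_s: float,
              stopped_by_operator: bool) -> Dict[str, Union[str, float]]:
    """Aggregate whatever samples exist, even if the job ended early."""
    elapsed = sum(s.interval_s for s in snapshots)
    ops = sum(s.ops for s in snapshots)
    ok_ops = sum(s.ok_ops for s in snapshots)
    moved = sum(s.bytes_moved for s in snapshots)
    resp = sum(s.total_resp_ms for s in snapshots)

    # Distinguish an operator-directed stop from an error-driven termination.
    if elapsed >= requested_s:
        status = "completed"
    elif stopped_by_operator:
        status = "terminated-by-operator"
    else:
        status = "terminated"

    return {
        "status": status,
        "elapsed_s": elapsed,
        "bandwidth_bytes_per_s": moved / elapsed if elapsed else 0.0,
        "avg_resp_ms": resp / ok_ops if ok_ops else 0.0,
        "success_ratio": ok_ops / ops if ops else 0.0,
    }
```

For example, calling `summarize(samples, requested_s=20 * 3600, stopped_by_operator=True)` on the samples collected during the first 18 hours would give the bandwidth, average response time, and success ratio for those 18 hours, with a status that shows the job was stopped on purpose.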