Cosbench Terminates the job after completing all stages. #394

Open
ashoksangee opened this issue Nov 23, 2019 · 1 comment

Comments

@ashoksangee

Hi,

I am facing a strange issue and am not sure what the cause is.

COSBench 0.4.2.c4
Ceph Nautilus accessing via S3

All stages in my test complete successfully; however, the workload status is reported as terminated. I am not sure what is causing this.

[image]

Below is my workload file.

<!-- ************************* Create Containers ********************************* -->

 <workstage name="init">

   <work type="init" workers="8" config="cprefix=4rgw4ktest;containers=r(1,40)" id="op1"/>

     <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.107:8080"/>


 </workstage>
<workstage name="prepare-objects">


  <work name="prepare-4K-objects" workers="7" interval="5" type="prepare" ratio="100" division="object" config="cprefix=4rgw4ktest;containers=r(1,40);objects=r(1,10000);sizes=c(4)KB" id="op2">

    <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.107:8080" />

  </work>

 </workstage>


<!--************************* 4K ********************************* -->
 <work name="4K Write" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.107:8080"/>

      <operation type="write" ratio="100" config="cprefix=4rgw4ktest;containers=s(1,10);objects=s(1,10000);sizes=c(4)KB" id="op3"/>

 </work>

 <work name="4K Write" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.66:8080"/>

      <operation type="write" ratio="100" config="cprefix=4rgw4ktest;containers=s(11,20);objects=s(1,10000);sizes=c(4)KB" id="op3"/>

  </work>

  <work name="4K Write" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.69:8080"/>

      <operation type="write" ratio="100" config="cprefix=4rgw4ktest;containers=s(21,30);objects=s(1,10000);sizes=c(4)KB" id="op3"/>

  </work>

  <work name="4K Write" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.116:8080"/>

      <operation type="write" ratio="100" config="cprefix=4rgw4ktest;containers=s(31,40);objects=s(1,10000);sizes=c(4)KB" id="op3"/>

  </work>
  <work type="delay" workers="128"/>
 <work name="4K Read" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.107:8080"/>

       <operation type="read" ratio="100" config="cprefix=4rgw4ktest;containers=u(1,10);objects=u(1,10000)" id="op4"/>
 <work name="4K Read" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.66:8080"/>

       <operation type="read" ratio="100" config="cprefix=4rgw4ktest;containers=u(11,20);objects=u(1,10000)" id="op4"/>

 </work>

 <work name="4K Read" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.69:8080"/>

       <operation type="read" ratio="100" config="cprefix=4rgw4ktest;containers=u(21,30);objects=u(1,10000)" id="op4"/>

</work>

<work name="4K Read" workers="8" runtime="300" division="object">

   <storage type="s3" config="accesskey=KPUOZSG7YD2BF2CH36I6;secretkey=Y9DsdoKFwutqXcEGyCffBMLZY7jeIdScfHaQN365;timeout=999999;endpoint=http://10.208.140.116:8080"/>

       <operation type="read" ratio="100" config="cprefix=4rgw4ktest;containers=u(31,40);objects=u(1,10000)" id="op4"/>

</work>
<workstage name="delay" closuredelay="100">

  <work type="delay" workers="128" />

</workstage>
<workstage name="dispose">
  <work type="dispose" workers="1" config="cprefix=4rgw4ktest;containers=r(1,10000)" />
</workstage>
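
A quick way to rule out a malformed workload file is to check that the pasted XML is well-formed before submitting it. The sketch below is only an illustration: `check_fragment` and the two sample fragments are hypothetical names and data, and the check covers XML well-formedness only, not whether COSBench accepts the workload.

```python
import xml.etree.ElementTree as ET


def check_fragment(fragment):
    """Return None if the fragment is well-formed XML, else the parser's error message."""
    # Wrap the fragment in a throwaway root so that several sibling
    # elements (e.g. multiple <workstage> blocks) parse as one document.
    wrapped = "<root>" + fragment + "</root>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical fragments: one balanced, one with an unclosed <work> element.
good = '<workstage name="init"><work type="init" workers="8"/></workstage>'
bad = '<work name="4K Read"><operation type="read"/><work name="4K Read"></work>'

print(check_fragment(good))  # no error reported
print(check_fragment(bad))   # parser error message with line and column
```

The wrapper root is there because a pasted fragment usually lacks the enclosing workload element; a complete file with its own XML declaration should be parsed directly instead.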
@ashoksangee
Author

I forgot to add the system.log output.

Here is the log. A Java exception (a NullPointerException in WorkloadContext.mergeErrorStatistics) is thrown right after the final stage completes.

2019-11-23 09:00:13,913 [INFO] [WorkloadProcessor] - begin to run stage s8-dispose
2019-11-23 09:00:13,913 [INFO] [WorkloadProcessor] - ============================================
2019-11-23 09:00:13,913 [INFO] [WorkloadProcessor] - START WORK: dispose
2019-11-23 09:00:13,917 [INFO] [AbstractCommandTasklet] - time drift between controller and driver-driver1 is 1 mSec
2019-11-23 09:00:13,000 [INFO] [StageRunner] - successfully booted all tasks in stage s8-dispose
2019-11-23 09:00:13,007 [INFO] [StageRunner] - successfully submitted all tasks in stage s8-dispose
2019-11-23 09:00:13,009 [INFO] [COSBDriverService] - handler=MB989020A50
2019-11-23 09:00:13,010 [INFO] [MissionHandler] - mission MB989020A50 has been authed successfully
2019-11-23 09:00:13,210 [INFO] [StageRunner] - successfully authenticated all tasks in stage s8-dispose
2019-11-23 09:00:13,413 [INFO] [StageRunner] - successfully launched all tasks in stage s8-dispose
2019-11-23 09:00:13,969 [INFO] [MissionHandler] - mission MB989020A50 has been executed successfully
2019-11-23 09:00:18,416 [INFO] [StageRunner] - successfully queried all tasks in stage s8-dispose
2019-11-23 09:00:18,419 [INFO] [MissionHandler] - mission MB989020A50 has been closed successfully
2019-11-23 09:00:18,441 [INFO] [StageRunner] - successfully closed all tasks in stage s8-dispose
2019-11-23 09:00:18,442 [INFO] [StageRunner] - acceptable failure ratio of work s8-dispose-dispose = 0.0
2019-11-23 09:00:18,442 [INFO] [StageRunner] - real failure ratio of work s8-dispose-dispose = N/A
2019-11-23 09:00:18,442 [INFO] [StageRunner] - successfully reach the goal of acceptable failure ratio in stage s8-dispose - work dispose
2019-11-23 09:00:20,914 [INFO] [WorkloadProcessor] - END WORK: dispose, Time elapsed: 0:0::7
2019-11-23 09:00:20,914 [INFO] [WorkloadProcessor] - ============================================
2019-11-23 09:00:20,914 [INFO] [WorkloadProcessor] - successfully ran stage s8-dispose
2019-11-23 09:00:20,914 [ERROR] [WorkloadProcessor] - unexpected exception
java.lang.NullPointerException
at com.intel.cosbench.controller.model.WorkloadContext.mergeErrorStatistics(WorkloadContext.java:286)
at com.intel.cosbench.controller.service.WorkloadProcessor.processWorkload(WorkloadProcessor.java:168)
at com.intel.cosbench.controller.service.WorkloadProcessor.process(WorkloadProcessor.java:130)
at com.intel.cosbench.controller.service.ControllerThread.run(ControllerThread.java:14)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-11-23 09:00:20,915 [INFO] [WorkloadProcessor] - begin to terminate workload w107
2019-11-23 09:00:21,109 [INFO] [SimpleWorkloadArchiver] - workload w107 has been successfully archived
2019-11-23 09:00:21,110 [INFO] [WorkloadProcessor] - successfully terminated workload w107
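
For longer logs, the `[ERROR]` entries and their stack traces can be pulled out with a small script. This is a rough sketch that assumes the timestamp and level layout seen in the excerpt above; `error_entries` and `sample` are hypothetical names, and the sample is a shortened copy of the lines in this log.

```python
import re

# Matches the start of a log entry, e.g. "2019-11-23 09:00:20,914 [ERROR] ..."
ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[(\w+)\]")


def error_entries(log_text):
    """Group each [ERROR] line with the untimestamped lines that follow it."""
    entries = []
    current = None
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            # A new timestamped entry ends any stack trace being collected.
            current = [line] if match.group(1) == "ERROR" else None
            if current is not None:
                entries.append(current)
        elif current is not None and line.strip():
            # Untimestamped line after an ERROR entry: part of its stack trace.
            current.append(line)
    return entries


sample = """\
2019-11-23 09:00:20,914 [INFO] [WorkloadProcessor] - successfully ran stage s8-dispose
2019-11-23 09:00:20,914 [ERROR] [WorkloadProcessor] - unexpected exception
java.lang.NullPointerException
at com.intel.cosbench.controller.model.WorkloadContext.mergeErrorStatistics(WorkloadContext.java:286)
2019-11-23 09:00:20,915 [INFO] [WorkloadProcessor] - begin to terminate workload w107
"""

for entry in error_entries(sample):
    print("\n".join(entry))
```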
