
Help with pipeline command #23

Closed
Firedrops opened this issue Feb 4, 2019 · 58 comments


@Firedrops
Contributor

Firedrops commented Feb 4, 2019

I've been trying to adapt various components to the new project_id and buckets. There are some parts of the pipeline command that I either need help with, am unsure where the files are, or don't know what the paths should be.

image

  1. Never mind - solved; ignore the rectangle around com.theappsolutions...
  2. The 2nd red rectangle: servicesUrl currently points to a generic Apache installation success page. Are there further setup steps required? Also, should this IP address be changed, or remain exactly as is?
  3. Where are the bwa and kalign endpoints and the DB.fasta files? I could not find them in the buckets or in the allenday/bwa-http-docker Docker image, and they are probably not bucket paths, since there's no gs:// prefix. Are the .cgi files generated by bwa-mem on install?
  4. The resistantGenes stuff in buckets - could you please share it, or say where I could find it?
@obsh
Collaborator

obsh commented Feb 4, 2019

Regarding 2 and 3:
The following options - "servicesUrl", "bwaEndpoint", "bwaDatabase", "kAlignEndpoint" - configure how HTTP requests are made from the data pipeline to the cluster of alignment machines.

so for:
--servicesUrl=http://130.211.33.64
--bwaEndpoint=/cgi-bin/bwa.cgi
--bwaDatabase=DB.fasta
The system will POST a request with the fastq data to http://130.211.33.64/cgi-bin/bwa.cgi, with the parameter database=DB.fasta

130.211.33.64 is the public IP address of the existing alignment cluster.
http://130.211.33.64/ returns the default Apache page, but there are working endpoints:
http://130.211.33.64/cgi-bin/bwa.cgi
and http://130.211.33.64/cgi-bin/kalign.cgi
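Put together, the request can be sketched like this (a shell sketch of how the three options combine; the variable names are illustrative, not actual pipeline code):

```shell
# Illustrative only: how servicesUrl, bwaEndpoint and bwaDatabase combine
# into the single HTTP request that the pipeline makes.
SERVICES_URL="http://130.211.33.64"
BWA_ENDPOINT="/cgi-bin/bwa.cgi"
BWA_DATABASE="DB.fasta"

# The pipeline POSTs the fastq data to servicesUrl + bwaEndpoint,
# passing bwaDatabase as the "database" form parameter.
echo "POST ${SERVICES_URL}${BWA_ENDPOINT} database=${BWA_DATABASE}"
# → POST http://130.211.33.64/cgi-bin/bwa.cgi database=DB.fasta
```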

A cluster with the aligner can be provisioned using the .sh files from the "aligner" directory.

Regarding the files bwa.cgi and kalign.cgi - they are baked into the Docker image during the build stage; see the Dockerfile:
https://github.com/allenday/bwa-http-docker/blob/master/http.Dockerfile

@Firedrops
Contributor Author

Regarding files bwa.cgi and kalign.cgi - they are baked into Docker image during build stage, see Dockerfile:
https://github.com/allenday/bwa-http-docker/blob/master/http.Dockerfile

I'm a little confused about this as well: do you know what is being copied at the COPY lines? Is it FROM google/cloud-sdk? What exactly is google/cloud-sdk, and can I look at what is there?

@obsh
Collaborator

obsh commented Feb 4, 2019

"google/cloud-sdk" is a docker image owned by Google with gcloud built inside.
See on dockerhub:
https://hub.docker.com/r/google/cloud-sdk/

@obsh
Collaborator

obsh commented Feb 4, 2019

Regarding 4.
For now I've uploaded the files to the "nano-stream" bucket; see:
https://console.cloud.google.com/storage/browser/nano-stream/ResistanceGenes/?project=nano-stream

@Firedrops
Contributor Author

Thank you for sharing those resistant files!

"google/cloud-sdk" is a docker image owned by Google with gcloud built inside.
See on dockerhub:
https://hub.docker.com/r/google/cloud-sdk/

So does COPY copy files like bwa.cgi from google/cloud-sdk into the Docker image being generated? That does not really make sense to me, because why would Google have those files in their image?

@obsh
Collaborator

obsh commented Feb 4, 2019

so does COPY copy the files like bwa.cgi from google/cloud-sdk to the docker image being generated? That does not really make sense to me, because why would Google have those files in their docker?

It's just the specifics of the Dockerfile syntax. It works in the following way: the "FROM" instruction defines the base image from which the new image is built.
The "COPY" instruction copies files from the local folder into the image. In our case these files are:
https://github.com/allenday/bwa-http-docker/blob/master/bwa.cgi
and
https://github.com/allenday/bwa-http-docker/blob/master/kalign.cgi
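In outline, the relevant part of that Dockerfile looks like this (a paraphrased sketch, not an exact copy of http.Dockerfile; the destination paths are illustrative):

```dockerfile
# Base image: Google's image with gcloud preinstalled (the FROM layer).
FROM google/cloud-sdk

# COPY takes files from the build context (the bwa-http-docker checkout),
# not from the base image, and places them into the new image.
COPY bwa.cgi /usr/lib/cgi-bin/
COPY kalign.cgi /usr/lib/cgi-bin/
```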

@Firedrops
Contributor Author

"COPY" instruction - copies files from the local folder into image. In our case this files are:

Ah, I see now... I did not realise I should look at that repo. This is very helpful. Thank you!

@Firedrops
Contributor Author

Firedrops commented Feb 5, 2019

Cluster with aligner could be provisioned using sh files from "aligner" directory.

Should these scripts be loaded onto cloud shell and executed from there? Or are they deployed in another way?

Update: Running them from cloud shell did not appear to do anything.

@obsh
Collaborator

obsh commented Feb 5, 2019

Could you specify which script you ran and what the output was, please?
These scripts can be run in any environment where you have gcloud installed and initialized.

@Firedrops
Contributor Author

Apologies, I realised I had to run either provision_species.sh or provision_resistance_genes.sh. I have tried provision_species.sh and it works. I think this issue can be closed now. Thanks!

@Firedrops
Contributor Author

Firedrops commented Feb 6, 2019

Sorry, I just ran them, and realised that the generated instance does not allow HTTP traffic.
I have checked the VPC rules and the rule exists

image

but the instance has HTTP unchecked, and it has 2 other network tags, http and http-server1, neither of which has VPC rules.

image

Is this a typo - should provision.sh have allow-http added to line 11, --tags http-server,http \?

Or am I looking at the wrong IP address to put into (2)?

UPDATE: realised the target tag of the allow-http rule is http-server, so it should be working. Any other ideas why I'm unable to access the VM's IP?

UPDATE 2: I have SSH-ed in and looked at the Docker container with docker ps; the container's status is always Restarting. I don't think this is expected behavior, but I'm not quite sure what the fix is yet. Might have something to do with #26

@obsh
Collaborator

obsh commented Feb 6, 2019

Could you check the logs of the Docker container with the docker logs command?
Meanwhile I'll run provision_resistance_genes.sh in a new GCP project to reproduce the issue.

@Firedrops
Contributor Author

Firedrops commented Feb 7, 2019

The output from docker logs mainly reveals repeats of these 2 chunks:

..............+++++
...............................................................+++++
e is 65537 (0x010001)
140680312165760:error:28069065:UI routines:UI_set_result:result too small:../crypto/ui/ui_lib.c:765:You must type in 4 to 1023 characters
140680312165760:error:28069065:UI routines:UI_set_result:result too small:../crypto/ui/ui_lib.c:765:You must type in 4 to 1023 characters
140680312165760:error:0906906F:PEM routines:PEM_ASN1_write_bio:read key:../crypto/pem/pem_lib.c:330:
Generating RSA private key, 2048 bit long modulus
........................................................................................................................+++++
................................................+++++
e is 65537 (0x010001)
unable to load Private Key
139751600240000:error:28069065:UI routines:UI_set_result:result too small:../crypto/ui/ui_lib.c:765:You must type in 4 to 1023 characters
139751600240000:error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt:../crypto/evp/evp_enc.c:536:
139751600240000:error:0906A065:PEM routines:PEM_do_header:bad decrypt:../crypto/pem/pem_lib.c:439:
Generating RSA private key, 2048 bit long modulus
.+++++
..................+++++

I figured the problem might be with the openssl code in entrypoint.sh, which is currently

openssl genrsa -des3 -passout pass:x -out /etc/apache2/ssl/pass.key 2048
openssl rsa -passin pass:x -in /etc/apache2/ssl/pass.key -out /etc/apache2/ssl/server.key
cat /tmp/ssl-info.txt | openssl req -new -key /etc/apache2/ssl/server.key -out /etc/apache2/ssl/server.csr
openssl x509 -req -days 365 -in /etc/apache2/ssl/server.csr -signkey /etc/apache2/ssl/server.key -out /etc/apache2/ssl/server.crt

I have found these 2 threads which might be related, but their code is in a slightly different format; I'm not sure if it's used in the same context:
Fedora "Issue"
Serverfault thread

@Firedrops
Contributor Author

Firedrops commented Feb 7, 2019

Realised I had mixed up the 2 entrypoint.sh files, and was accidentally running the one in the root folder instead of the /http folder. They have been fixed now, but provision_species.sh still does not produce an accessible external IP.

@Firedrops
Contributor Author

Turns out these problems were due to my adapter docker container not being set up properly. I have tried with allenday's original bwa, and it works. However, the external IP 34.85.27.91 is still not accessible.

Is this the right IP to substitute into (2) --servicesUrl=http://130.211.33.64?

@obsh
Collaborator

obsh commented Feb 7, 2019

As I can see, external IP 34.85.27.91 is accessible now, isn't it?
Right, this should be used as --servicesUrl=http://34.85.27.91

@Firedrops
Contributor Author

Firedrops commented Feb 7, 2019

Ah it is! I guess the start-up time was longer than I expected. Thanks!

Running the main java -cp (path to jar)... command on the Cloud Shell returns this:
-bash: --bwaEndpoint=/cgi-bin/bwa.cgi: No such file or directory
It seems like those are local paths, so this command should be run from inside the VM via SSH, is that correct?

@obsh
Collaborator

obsh commented Feb 7, 2019

It looks like some newlines after parameters in the multiline command are not escaped;
please check that each newline after a parameter is escaped like this:
--servicesUrl=http://34.85.27.91 \
--bwaEndpoint=/cgi-bin/bwa.cgi \
...
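The failure mode is easy to reproduce locally: a backslash only continues the line when the newline follows it immediately, so a stray space after the backslash (or a missing backslash) makes the shell treat the next --flag line as a separate command, producing exactly the "No such file or directory" error above. A minimal demonstration:

```shell
# Backslash-newline joins the three lines into a single printf command.
# If a space sneaks in after a backslash, the continuation breaks and the
# next line would be executed as its own command.
printf '%s %s\n' \
  first \
  second
# → first second
```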

@Firedrops
Contributor Author

Firedrops commented Feb 7, 2019

Yep, that was the problem: one of my lines had an extra space after the \. Good thing we have your experience!

For reference, this is the full code snippet I'm trying to run now:

java -cp /home/coingroupimb/git_larry_2019-02-06/NanostreamDataflowMain/build/NanostreamDataflowMain.jar \
  com.theappsolutions.nanostream.NanostreamApp \
  --runner=org.apache.beam.runners.dataflow.DataflowRunner \
  --project=nano-stream1 \
  --streaming=true \
  --processingMode=species \
  --inputDataSubscription=projects/nano-stream1/topics/file_upload \
  --alignmentWindow=20 \
  --statisticUpdatingDelay=30 \
  --servicesUrl=http://34.85.27.91 \
  --bwaEndpoint=/cgi-bin/bwa.cgi \
  --bwaDatabase=DB.fasta \
  --kAlignEndpoint=/cgi-bin/kalign.cgi \
  --outputFirestoreDbUrl=https://nano-stream1.firebaseio.com \
  --outputFirestoreSequencesStatisticCollection=resistant_sequences_statistic \
  --outputFirestoreSequencesBodiesCollection=resistant_sequences_bodies \
  --outputFirestoreGeneCacheCollection=resistant_gene_cache

now I'm getting this error:

Error occurred during initialization of VM
java.lang.Error: Properties init: Could not determine current working directory.
        at java.lang.System.initProperties(Native Method)
        at java.lang.System.initializeSystemClass(System.java:1166)

It appears Java wasn't installed by default on the cloud VM; will installing it be a fix? I had the idea that Dataflow wasn't running directly on the cloud VM instance...

Update: Installed Java 11 and ran it again, got this scary message:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/home/coingroupimb/git_larry_2019-02-06/NanostreamDataflowMain/build/NanostreamDataflowMain.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[main] INFO org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory - No tempLocation specified, attempting to use default bucket: dataflow-staging-us-central1-465460488211
[main] INFO org.apache.beam.runners.dataflow.options.DataflowPipelineOptions$StagingLocationFactory - No stagingLocation provided, falling back to gcpTempLocation
Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
        at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:224)
        at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:155)
        at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
        at org.apache.beam.sdk.Pipeline.create(Pipeline.java:145)
        at com.theappsolutions.nanostream.NanostreamApp.main(NanostreamApp.java:80)
Caused by: java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:214)
        ... 4 more
Caused by: java.lang.IllegalArgumentException: Unable to use ClassLoader to detect classpath elements. Current ClassLoader is jdk.internal.loader.ClassLoaders$AppClassLoader@4b85612c, only URLClassLoaders are supported.
        at org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage(PipelineResources.java:57)
        at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:270)
        ... 9 more

@pseveryn
Contributor

pseveryn commented Feb 7, 2019

According to the official Apache Beam documentation, the current SDK doesn't support Java 11 (https://beam.apache.org/roadmap/java-sdk/). The official docs currently recommend using Java 8 (https://beam.apache.org/get-started/quickstart-java/#set-up-your-development-environment)

@Firedrops
Contributor Author

Regarding the PubSub subscription, does it matter if it's a PUSH or PULL type?

@Firedrops
Contributor Author

Firedrops commented Feb 8, 2019

Attempted running with the new jar (pushed yesterday); it came up with these errors that were not present before.
This line appeared around 10 times:

[main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:15.620Z: Unable to bring up enough workers.  Will retry in 5 seconds.

then terminated with this chunk

[main] ERROR org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:37.738Z: Workflow failed. Causes: Unable to bring up enough workers: minimum 1, actual 0. Please check your quota and retry later, or please try in a different zone/region.
[main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:37.929Z: Cleaning up.
[main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:37.958Z: Worker pool stopped.
[main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:37.970Z: Stopping worker pool...
[main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2019-02-08T03:29:37.979Z: Worker pool stopped.
[main] INFO org.apache.beam.runners.dataflow.DataflowPipelineJob - Job 2019-02-07_19_26_52-9841017688330516146 failed with status FAILED.

I didn't run into this with the previous NanostreamDataflowMain.jar from a few days ago.

Update 1:
I have tried specifying --region=asia-northeast1 to match the region of our provisioning cluster, and ran into a quota issue where all 8 of our vCPU cores are being used by the provisioning instance.
I have submitted a request to increase the quota to 16 vCPUs; Google's e-mail says it will take about 2 business days to process.

In the meantime, it'd be good to hear everyone's input on whether I'm on the right troubleshooting track. I'm having doubts, since earlier I could run the old jar in the default region, us-central.

Update 2:
Rolled back to yesterday's version, before changes were made to the .jar; same issues as above regarding worker nodes and quotas. Anyone know what might be going on? Why did it deploy successfully ~3 hours ago, but never again?

image

Update 3:
Requesting further quota increases failed because Lachlan's GCP account was on the "free trial". I "activated" it (still free until we run out of the USD $300 credit) and it automatically increased the existing quota to 24 vCPUs, which is enough. The updated pipeline deployed successfully.

image

But nothing seems to be happening. PubSub notifications are being sent properly.

Firebase collections remain empty.

Currently I have it set up to use the same PUSH subscription as the monitoring website. Does this pipeline require PULL? I did not notice PULL functions in the java source, so I assumed it's PUSH.

--alignmentWindow=20 does this line mean that alignment only gets sent when there are 20 fastq files? Or every 20 seconds?

@pseveryn
Contributor

pseveryn commented Feb 8, 2019

Regarding the PubSub subscription, does it matter if it's a PUSH or PULL type?

You should use a PULL type. Here is an example of some of our subscriptions:
image

@pseveryn
Contributor

pseveryn commented Feb 8, 2019

--alignmentWindow=20 does this line mean that alignment only gets sent when there are 20 fastq files? Or every 20 seconds?

It means that we use 20-second windows to gather the fastq data into a batch for alignment in the next step
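So each incoming fastq record is bucketed by time into fixed 20-second windows: a record arriving at second T lands in the window starting at T - (T mod 20). A quick shell sketch of that arithmetic (illustrative only, not pipeline code):

```shell
ALIGNMENT_WINDOW=20   # seconds, from --alignmentWindow=20
T=47                  # example arrival time in seconds

# Fixed windowing: the window start is T rounded down to a multiple of 20.
START=$((T - T % ALIGNMENT_WINDOW))
echo "record at t=${T}s falls in window [${START}s, $((START + ALIGNMENT_WINDOW))s)"
# → record at t=47s falls in window [40s, 60s)
```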

@pseveryn
Contributor

pseveryn commented Feb 8, 2019

Currently I have it set up to use the same PUSH subscription as the monitoring website. Does this pipeline require PULL? I did not notice PULL functions in the java source, so I assumed it's PUSH.

Please try with the PULL type

@Firedrops
Contributor Author

I have switched to PULL, and after fixing some random problems with PubSub today, there is finally activity in Dataflow:

image

image

However, the Firestore remains empty except for the pre-generated file. It currently looks like this:

image

Are there further steps to set up the Firestore that we missed?

@obsh
Collaborator

obsh commented Feb 9, 2019

It’s hard to say from the pipeline pictures what exactly goes wrong.
The Alignment step looks a bit suspicious, as the wall time there is “14 min 51 sec” while for the next step it is “0 sec”.

Could you check the following things in the pipeline, please:

  1. general pipeline errors:

pipeline_errors

  2. statistics for the alignment step; click on the step to reveal statistics in the right panel:

step_statistics

If the above two don’t show the problem, the statistics for the other steps should be checked.

@Firedrops
Contributor Author

Firedrops commented Feb 10, 2019

From the aligner "14m" step:

image

So I went back to check the compute instance and VPC rules; everything looks normal. This summary also shows proper tagging of rules and the correct external IP.

image

What else could be wrong? Could using the custom tag http-server instead of default-allow-http be causing confusion?

image

Or could it be the Dataflow pipeline being unable to connect to external IPs?

@obsh
Collaborator

obsh commented Feb 10, 2019

Or could it be the Dataflow pipeline being unable to connect to external IPs?

I don't think there is a problem with the firewall rules, as we checked previously that the Apache page is available from the public internet.

Maybe the load balancer's default timeout of 30 seconds is too small, and that's why the connections timed out.

image

Try increasing the timeout with gcloud:

gcloud compute backend-services update bwa-resistance-genes-backend-service --timeout=600 --global

@Firedrops
Contributor Author

Firedrops commented Feb 11, 2019

Thanks! That got it through alignment, and it works

image

as does next step, Extract Sequences,

image

but Group by SAM reference doesn't seem to be receiving anything. The log remains empty even at Any log level.

image

Do you know what might be going on here? I assume the VM with the Docker containers has already finished its job by this stage?

UPDATE: Re-deployed the updated pipeline and the japsa committed ~13 hours ago by Allen; still running into the same issue at Group by SAM reference

@obsh
Collaborator

obsh commented Feb 11, 2019

It looks like there were no matched sequences after the alignment stage.
The “Extract Sequences” step has a condition:
https://github.com/allenday/nanostream-dataflow/blob/master/NanostreamDataflowMain/src/main/java/com/theappsolutions/nanostream/aligner/GetSequencesFromSamDataFn.java#L31
image

that skips all records where the reference name equals “*”.

I think this was the case, because there is a log record about the number of SAMRecord items in this step, but there is no output or errors logged.

I've noticed that you were using the IP address of the bwa-resistance-genes VM while running pipeline species--2019-02-11t13-46-23ddut in species mode. Could that be the cause - that there were no matches to the reference sequences?

Also, regarding the aligner - you should use the Load Balancer's IP address:
image
otherwise auto-scaling won't work, as all requests will hit a single server.
I'll update the documentation regarding the Load Balancer's IP.

@Firedrops
Contributor Author

Firedrops commented Feb 12, 2019

Also regarding aligner - you should use Load Balancer's IP address:
other way auto-scaling won't work, as all requests will hit single server.
I'll update documentation regarding Load Balancer's IP.

I've updated the IP to 35.201.96.177 and re-deployed the pipeline. I ran the timeout=600 command again as well, just in case. It now gets stuck at Alignment with
Status: 200, response length: 0
image

UPDATE 1: Since I didn't have this error before, I tried deploying with the old IP address again (of just the single VM instance). It also returned the same error.

UPDATE 2: Sending in a single big fastq file (~160 MB) stops the pipeline much earlier; even the Alignment step shows no logs at all instead of Status: 200. Is this an expected limitation?

UPDATE 3: I see a worker VM instance, species--2019-02-12t13-29-02111930-jvqa-harness-wdnc, running about 4 Docker containers with only 15 GB of memory. Is this causing a bug, as the database files total ~26 GB? 15 GB would have been enough for the previous database (~12 GB), but not for the new ones we're testing with. I've been looking for where this 15 GB was specified, but could not find it to change it.

UPDATE 4: I have cleared the previous provisioning and Dataflow setups and re-deployed with the original CombinedDatabases, which are only ~12 GB. Unfortunately, I'm seeing a new set of errors at the Alignment step:

image

UPDATE 5: I cleared and re-deployed provisioning and Dataflow AGAIN; I'm back to having only Status 200 errors, as before. It still doesn't fully work. Checking the provisioning VM instance's subpages, http://35.243.64.91/cgi-bin/bwa.cgi returns content, but http://35.243.64.91/cgi-bin/kalign.cgi returns HTTP ERROR 400. Could there be a problem there?

@obsh
Collaborator

obsh commented Feb 12, 2019

I've checked the VM instances; there is a deployed file genomeDB.fasta.
Therefore the pipeline should be run with --bwaDatabase=genomeDB.fasta; it's currently running with --bwaDatabase=DB.fasta, which results in an empty response.

@Firedrops
Contributor Author

OK, so we will do a new build of japsa and the uber-jar.
Should anything be done about http://35.243.64.91/cgi-bin/kalign.cgi returning HTTP ERROR 400, and/or the NCBI querying?

@obsh
Collaborator

obsh commented Feb 12, 2019

Should anything be done about http://35.243.64.91/cgi-bin/kalign.cgi returning HTTP ERROR 400

No, a 400 for GET requests is expected; you can check that it works by submitting a POST request with sample data:

curl -v -F fasta=@NanostreamDataflowMain/src/test/resources/kAlignResult.txt http://35.243.64.91/cgi-bin/kalign.cgi

Update:
By the way, you can check alignment with:

curl -v -F database='genomeDB.fasta' -F fastq=@NanostreamDataflowMain/src/test/resources/fasqQOutputData.txt http://35.243.64.91/cgi-bin/bwa.cgi

@obsh
Collaborator

obsh commented Feb 12, 2019

and/or the NCBI querying?

It's not clear at the moment what the issue is there; I'll try to debug it.

@Firedrops
Contributor Author

Firedrops commented Feb 12, 2019

I have attempted to build the uber-jar from the Cloud Shell after installing Maven.

At the last step, mvn clean package, after a few minutes of output, it ends with this:
image

Looking through the surefire-reports, the error is found in com.theappsolutions.nanostream.EndToEndPipelineTest.txt

-------------------------------------------------------------------------------
Test set: com.theappsolutions.nanostream.EndToEndPipelineTest
-------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 55.248 s <<< FAILURE! - in com.theappsolutions.nanostream.EndToEndPipelineTest
testEndToEndPipelineSpeciesMode(com.theappsolutions.nanostream.EndToEndPipelineTest)  Time elapsed: 55.246 s  <<< ERROR!
org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.RuntimeException: org.apache.beam.sdk.coders.CoderException: cannot encode a null List
	at com.theappsolutions.nanostream.EndToEndPipelineTest.testEndToEndPipelineSpeciesMode(EndToEndPipelineTest.java:114)
Caused by: java.lang.RuntimeException: org.apache.beam.sdk.coders.CoderException: cannot encode a null List
Caused by: org.apache.beam.sdk.coders.CoderException: cannot encode a null List

Is this something worth troubleshooting, or is it easier for you to do another build and commit?

@obsh
Collaborator

obsh commented Feb 12, 2019

I've committed a new build and created a separate issue, #54, to deal with the failing test.

@Firedrops
Contributor Author

Does the same test fail for you when you build it too?

@obsh
Copy link
Collaborator

obsh commented Feb 12, 2019

Does the same test fail for you when you build it too?

No, for me this test executes successfully. I'm not sure right now why it's failing in your case.
Maybe we'll exclude it from the default test suite, as this test makes real HTTP calls to the alignment endpoints.

@Firedrops
Contributor Author

Ok, I have ignored that and re-cloned the updated repo.

I am now running into this problem

image

The Alignment step is running much faster than real time (i.e. ~5 minutes have passed IRL, but it's showing 21 minutes).
The logs are split into 2 distinct sections:
the top section doesn't specify what error is occurring, but it's at MakeAlignmentViaHttpFn;
the bottom section is Status: 200, response length: 0, like yesterday, but I've made sure that --bwaDatabase=genomeDB.fasta.

@obsh
Collaborator

obsh commented Feb 13, 2019

As I can see, there is an element that passed the Alignment step, and finally there are results in Firestore.

@Firedrops
Contributor Author

Firedrops commented Feb 13, 2019

Yes, there are. So are these errors safe to ignore? Or are we getting incomplete results?

@obsh
Collaborator

obsh commented Feb 13, 2019

I think we have a logger misconfiguration, because these logs should actually be at "INFO" level.
image
And in the message text it says "INFO" for both types of message:
image

@Firedrops
Contributor Author

Firedrops commented Feb 13, 2019

I see, safe for us to ignore then.

I have been unable to get the [visualization app](https://nano-stream1.appspot.com/) to show data from Firestore.

From Firebase, I get this code

<script src="https://www.gstatic.com/firebasejs/5.8.2/firebase.js"></script>
<script>
  // Initialize Firebase
  var config = {
    apiKey: "AIzaSyDLtxwk4r3ahh-R7aTGIXlMvgrBi5pc_P0",
    authDomain: "nano-stream1.firebaseapp.com",
    databaseURL: "https://nano-stream1.firebaseio.com",
    projectId: "nano-stream1",
    storageBucket: "nano-stream1.appspot.com",
    messagingSenderId: "465460488211"
  };
  firebase.initializeApp(config);
</script>

And I have updated those var config values from the previous upwork-nano-stream everywhere I could find them. I have also tried pasting that entire code into sunburst.html's <head> section. Both cases only show No Data Available

@obsh
Collaborator

obsh commented Feb 13, 2019

I've updated Firestore security configuration to allow public read access:
image

And added collection and document names to URL:
https://nano-stream1.appspot.com/?c=resistant_sequences_statistic&d=resultDocument--2019-02-13T00-28-52UTC

@Firedrops
Contributor Author

Firedrops commented Feb 13, 2019

I see, thank you! The public read access part should probably be added to the readme.

Regarding the URL, is there any way to make it more user-friendly? For example, add code that scans resistant_sequences_statistic for entries and auto-generates hotlinks on nano-stream1.appspot.com, which would become just a 'hub'?

@Firedrops
Contributor Author

I think this issue can be closed, and we can create other issues for individual feature requests and optimizations?

@obsh
Collaborator

obsh commented Feb 13, 2019

I think this issue can be closed, and create other issues for individual feature requests and optimization?

Yes, I'll close it now; it's becoming a bit hard to navigate this issue's page)

The public read access part should probably be added to the readme.

Regarding the URL, is there any way to make it more user-friendly? For example, add code that scans resistant_sequences_statistic for entries and auto-generate hotlinks on nano-stream1.appspot.com, which will become just a 'hub'?

Totally agree, we'll work on it. #56 #57

@Firedrops
Contributor Author

Sorry, we noticed that the results were odd: only staph strains were detected, and in exactly equal proportions. Other bacteria, like Helicobacter, were absent.

So we ran the same .fastq file again; this time we got an error:

image

And many minutes later, no further entries were added to resistant_sequences_statistic

@Firedrops
Contributor Author

@obsh sorry, just tagging you for notifications; I'm not sure if closed topics still send them automatically.

We ran the pipeline a few more times; it is quite inconsistent. Sometimes it gives the above errors, sometimes it doesn't, even though I use the exact same deployment command each time.

@obsh
Collaborator

obsh commented Feb 14, 2019

@obsh sorry just tagging you for notifications, not sure if closed topics still send them automatically.

No problem at all.

We ran the pipeline a few more times, it is quite inconsistent. Sometimes it gives the above errors, sometimes they don't, even though I use the exact same deployment command each time.

I've made adjustments to the alignment step in #65; it should work more stably now.
The build file is updated, and I've also marked the integration test as ignored for now, so you can try to build the jar using mvn clean install.

@Firedrops
Contributor Author

Firedrops commented Feb 14, 2019

My first attempt today, after a clean clone and mvn clean install with no errors, gave the above error again:

 2019-02-14 (12:03:33) Processing stuck in step Alignment for at least 05m00s without outputting or completing in state pro...
Processing stuck in step Alignment for at least 05m00s without outputting or completing in state process
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
  at java.net.SocketInputStream.read(SocketInputStream.java:170)
  at java.net.SocketInputStream.read(SocketInputStream.java:141)
  at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
  at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
  at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
  at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
  at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
  at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
  at com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
  at com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn$DoFnInvoker.invokeProcessElement(Unknown Source)

and 5 minutes later:

2019-02-14 (12:08:33) Processing stuck in step Alignment for at least 10m00s without outputting or completing in state pro...
Processing stuck in step Alignment for at least 10m00s without outputting or completing in state process
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
  at java.net.SocketInputStream.read(SocketInputStream.java:170)
  at java.net.SocketInputStream.read(SocketInputStream.java:141)
  at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
  at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
  at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
  at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
  at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
  at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
  at com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
  at com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn$DoFnInvoker.invokeProcessElement(Unknown Source)

org.apache.http.client.ClientProtocolException: Unexpected response status: 502
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:39)
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:17)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:223)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
        com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
        com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
        com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)

Hang on, Lachlan just told me this was a problem in japsa that was just fixed, let me try with the new release.
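One thing worth noting about the traces above: both hangs sit inside SocketInputStream.socketRead0, which is what an HTTP read with no timeout looks like. A defensive client sets explicit connect/read timeouts so a dead alignment worker fails fast instead of stalling the Dataflow step for 5-10 minutes. The pipeline itself uses Apache HttpClient, so this JDK-only sketch is illustrative only; the endpoint URL is the one from this thread, and openConnection() does not touch the network, so the sketch runs offline.

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative only: shows how explicit timeouts bound a hung HTTP
// call. openConnection() does not connect yet, so this runs offline;
// with these settings a real connect() would fail after 10 s and a
// silent read after 2 min, instead of blocking in socketRead0 forever.
public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://130.211.33.64/cgi-bin/bwa.cgi");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(10_000);   // give up on connect after 10 s
        conn.setReadTimeout(120_000);     // give up on a silent read after 2 min
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```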

@Firedrops
Contributor Author

Firedrops commented Feb 14, 2019

OK, it's running, but still inconsistently. Sometimes it works, sometimes it gives that 5-minute error. Also, I noticed that on a run that worked, 2 collections were generated, ~35 seconds apart. Is the sessioning being overzealous?

https://nano-stream1.appspot.com/?c=new_scanning_species_sequences_statistic&d=resultDocument--2019-02-14T03-31-42UTC
and
https://nano-stream1.appspot.com/?c=new_scanning_species_sequences_statistic&d=resultDocument--2019-02-14T03-31-07UTC

both generated from Erwinia_amylovora.fastq

UPDATE 1: It might be related to the input .fastq size. It now always fails on 20170731_GP01_MNP_nohuman, which is 866 KB. If I feed it another .fastq, regardless of size, after getting the 5-minute errors, I get these errors instead:

 2019-02-14 (14:15:05) java.net.SocketException: Broken pipe
java.net.SocketException: Broken pipe
        java.net.SocketOutputStream.socketWrite0(Native Method)
        java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        org.apache.http.impl.io.SessionOutputBufferImpl.streamWrite(SessionOutputBufferImpl.java:124)
        org.apache.http.impl.io.SessionOutputBufferImpl.flushBuffer(SessionOutputBufferImpl.java:136)
        org.apache.http.impl.io.SessionOutputBufferImpl.write(SessionOutputBufferImpl.java:167)
        org.apache.http.impl.io.ContentLengthOutputStream.write(ContentLengthOutputStream.java:113)
        org.apache.http.entity.mime.content.StringBody.writeTo(StringBody.java:174)
        org.apache.http.entity.mime.AbstractMultipartForm.doWriteTo(AbstractMultipartForm.java:134)
        org.apache.http.entity.mime.AbstractMultipartForm.writeTo(AbstractMultipartForm.java:157)
        org.apache.http.entity.mime.MultipartFormEntity.writeTo(MultipartFormEntity.java:113)
        org.apache.http.impl.DefaultBHttpClientConnection.sendRequestEntity(DefaultBHttpClientConnection.java:156)
        org.apache.http.impl.conn.CPoolProxy.sendRequestEntity(CPoolProxy.java:160)
        org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:238)
        org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
        org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
        org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
        org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
        com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
        com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
        com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49) 

[screenshot of the error]

UPDATE 2: I stopped and re-deployed to try another file (189 KB), and got this error instead:

 2019-02-14 (14:22:06) org.apache.http.client.ClientProtocolException: Unexpected response status: 502
org.apache.http.client.ClientProtocolException: Unexpected response status: 502
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:39)
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:17)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:223)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
        com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
        com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
        com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)

[screenshot of the error]

If I add the small Erwinia file after this, I get the stuck-in-alignment 5-minute error again.
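A pattern that could smooth out the intermittent 502s above is retrying the alignment POST a few times with backoff before failing the element. This is a hypothetical sketch, not the pipeline's actual code: the `Request` interface and `sendOnce()` stand in for the real HTTP call, and the main method simulates a cluster that recovers on the third attempt.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: retry a request that answers HTTP 502 (as in
// the logs above), with linear backoff, giving up after maxAttempts.
public class RetrySketch {

    interface Request { int sendOnce(); } // returns an HTTP status code

    static int sendWithRetry(Request req, int maxAttempts, long backoffMillis)
            throws InterruptedException {
        int status = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = req.sendOnce();
            if (status != 502) return status;           // success or non-retryable
            if (attempt < maxAttempts) Thread.sleep(backoffMillis * attempt);
        }
        return status; // still 502 after all attempts
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate an alignment cluster that returns 502 twice, then recovers.
        Iterator<Integer> responses = List.of(502, 502, 200).iterator();
        int status = sendWithRetry(responses::next, 5, 1L);
        System.out.println(status);
    }
}
```

Apache HttpClient (used by the pipeline) already has a ServiceUnavailableRetryExec in its execution chain, visible in the stack traces, so the real fix may be a matter of configuring that retry strategy rather than hand-rolling a loop.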
