Server-side virtual data processing #527

Open
marceloandrioni opened this issue Aug 16, 2024 · 13 comments

@marceloandrioni

marceloandrioni commented Aug 16, 2024

Hello. I was checking Unidata's news and stumbled upon this article.
It says that @matakleo implemented server-side virtual data processing in TDS as part of his summer intern project. Any idea when this feature will be available in a main release?
I watched his project presentation and he demonstrated the EnhancementProvider using a classifier.
Do you think it will also be possible to create a provider capable of getting two variables (u and v) and returning two more (magnitude and direction)?

I know uv to magdir is a simple calculation, but I have lost count of how many times we had some kind of problem where non-metocean people (e.g. naval architects, subsea engineers) got u/v data directly from our in-house TDS server and applied an incorrect transformation when converting to magdir. Thus, I believe a resource capable of offering server-side calculations directly through the APIs (opendap, ncss, wcs) would be very helpful.
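
For reference, a minimal sketch of the u/v to magnitude/direction conversion that tends to go wrong. It assumes u is the eastward and v the northward component, and that direction is wanted in degrees clockwise from north, pointing where the flow is going to. The function name `uv_to_magdir` is only illustrative; it is not part of TDS or of any plugin.

```python
import math


def uv_to_magdir(u, v):
    """Convert eastward (u) and northward (v) components to (magnitude, direction).

    Direction is in degrees clockwise from north and points where the flow
    is going to (the usual convention for ocean currents).
    """
    magnitude = math.hypot(u, v)
    # atan2(u, v), not atan2(v, u): the angle is measured from north, clockwise.
    direction = math.degrees(math.atan2(u, v)) % 360.0
    return magnitude, direction


if __name__ == "__main__":
    # A purely eastward flow of 1 m/s.
    print(uv_to_magdir(1.0, 0.0))
```

Swapping the arguments of `atan2` gives the mathematical angle (counterclockwise from east) instead of a compass bearing, which is one of the typical mistakes.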

Thank you and congratulations for the great work.

@tdrwenski
Contributor

tdrwenski commented Aug 16, 2024

Hi @marceloandrioni, very cool that you are interested in using this! The way it currently works, it can apply a transformation to a single variable. So some extra work may be necessary before you could transform two variables into two others.

It is available in the current 5.6-SNAPSHOT, which requires JDK 17 and some extra JVM args (see CHRONICLE_CACHE here). We are in the process of some security updates, after which we plan to make another release, and that would also contain this feature. It would be great if you could start testing with the 5.6-SNAPSHOT, because then we can make adjustments to the EnhancementProvider if you run into any issues.

@marceloandrioni
Author

Hi @tdrwenski, sorry for the late reply. I am glad to know this option is already available in the snapshot. I will try to set up the 5.6-SNAPSHOT with JDK 17 on my side to run some tests and get back to you.
Thank you.

@haileyajohnson

Hi @marceloandrioni - I have this implemented now here: https://github.com/haileyajohnson/vectorize-thredds-plugin
I'm not sure what the performance is like because I've only tested it on test data, but it's a start at least!

@haileyajohnson

also pinging @matakleo - if you wanna see your project in use :)

@marceloandrioni
Author

Thanks for this @haileyajohnson. I am out of the office at the moment, but I will try this as soon as possible, probably with some ERA5 wind data. I imagine an extra argument will be needed to indicate whether the vector direction calculated from U/V should be "reversed" to indicate "coming from", as in the wind and wave conventions.
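
To make the "reversed" direction concrete, here is a small sketch of what such an argument could do. It assumes u is eastward, v is northward, and directions are in degrees clockwise from north. The function name `direction_deg` and its `coming_from` flag are hypothetical, not an existing option of the plugin.

```python
import math


def direction_deg(u, v, coming_from=False):
    """Direction in degrees clockwise from north.

    coming_from=False: where the vector points to (ocean current convention).
    coming_from=True:  where the flow comes from (wind and wave convention).
    """
    going_to = math.degrees(math.atan2(u, v)) % 360.0
    if coming_from:
        # The opposite bearing is 180 degrees away, wrapped back into [0, 360).
        return (going_to + 180.0) % 360.0
    return going_to


if __name__ == "__main__":
    # A 10 m/s wind blowing toward the east is a westerly wind.
    print(direction_deg(10.0, 0.0), direction_deg(10.0, 0.0, coming_from=True))
```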
Thank you!

@matakleo

Hi everyone! I've been following this, and I think it's amazing to see it already in use. It's great to know that my contribution can benefit others. Long live TDS and netCDF!

@marceloandrioni
Author

Hello @haileyajohnson, I managed to get a TDS running with the following versions:

Linux 5.4.0-155-generic
OpenJDK17U-jdk_x64_linux_hotspot_17.0.13_11
apache-tomcat-10.1.31
THREDDS Data Server 5.6 2024-10-16 (beta)

I ran "mvn package" for the vectorize plugin and moved the resulting vectorize-tds-plugin-1.0-SNAPSHOT.jar file to /usr/local/tds/tomcat/webapps/thredds##5.6/WEB-INF/lib.

Then I ran some tests using this netCDF file, which has the following dimensions:

  • time = 3 ;
  • depth = 3 ;
  • latitude = 121 ;
  • longitude = 169 ;

In the thredds catalog.xml I added the following definitions:

    <dataset name="cmems_uv_only"
             ID="cmems_uv_only"
             urlPath="datasets/cmems/cmems_uv_only"
             dataType="Grid">
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
                location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc"/>
    </dataset>

    <dataset name="cmems_uv_and_magdir"
             ID="cmems_uv_and_magdir"
             urlPath="datasets/cmems/cmems_uv_and_magdir"
             dataType="Grid">

        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
                location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">

            <variable name="cspd" shape="time depth latitude longitude" type="float">
                <attribute name="vectorize_mag" value="uo/vo" />
                <attribute name="long_name" value="current speed" />
                <attribute name="units" value="m/s" />
            </variable>

            <variable name="cdir" shape="time depth latitude longitude" type="float">
                <attribute name="vectorize_dir" value="uo/vo" />
                <attribute name="long_name" value="current direction" />
                <attribute name="units" value="degrees" />
            </variable>

        </netcdf>

    </dataset>

When I tried to access cmems_uv_and_magdir the first time I got some errors:

Throwable exception handled : jakarta.servlet.ServletException: Handler dispatch failed: java.lang.UnsupportedClassVersionError: org/example/VectorMagnitude$Provider has been compiled by a more recent version of the Java Runtime (class file version 63.0), this version of the Java Runtime only recognizes class file versions up to 61.0 (unable to load class [org.example.VectorMagnitude$Provider])
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1104)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:903)

I then went back to the vectorize plugin and replaced version 19 with 17 in the source and target in the pom.xml:

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

and again moved the resulting vectorize-tds-plugin-1.0-SNAPSHOT.jar file to /usr/local/tds/tomcat/webapps/thredds##5.6/WEB-INF/lib.
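
As a side note on the error above: class file version 63 corresponds to Java 19 and 61 to Java 17, which is why lowering source/target to 17 fixed it. A rough sketch for checking which Java release the classes in a jar were compiled for, before deploying it, could look like this. The helper names are illustrative only, and the `major - 44` rule is assumed to apply to Java 5 and later.

```python
import struct
import zipfile


def class_file_java_release(header: bytes) -> int:
    """Return the Java release targeted by a .class file, given its first 8 bytes."""
    # Class file layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # For Java 5 and later, release = major version - 44 (e.g. 61 -> 17, 63 -> 19).
    return major - 44


def jar_java_releases(jar_path):
    """Map each .class entry in a jar to the Java release it targets."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name: class_file_java_release(jar.read(name))
            for name in jar.namelist()
            if name.endswith(".class")
        }


if __name__ == "__main__":
    # Header of a class compiled for Java 19 (major version 63).
    print(class_file_java_release(struct.pack(">IHH", 0xCAFEBABE, 0, 63)))
```

Calling `jar_java_releases` on the plugin jar should report 17 (or lower) for every class if the jar is meant to run on JDK 17.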

With this new file the magnitude and direction appeared in the TDS interface:

Image

But when I tried to get the magnitude and direction values everything showed as zero, despite valid values of u/v.

Image

I am not sure if I am missing some steps. Is it enough to just put the plugin jar file in the WEB-INF/lib folder, or do I also need to declare it as an ioServiceProvider in the threddsConfig.xml config file?

    <!--
    Configuring the CDM (netcdf-java library)
    see https://www.unidata.ucar.edu/software/netcdf-java/reference/RuntimeLoading.html

    <nj22Config>
      <ioServiceProvider class="edu.univ.ny.stuff.FooFiles"/>
      <coordSysBuilder convention="foo" class="test.Foo"/>
      <coordTransBuilder name="atmos_ln_sigma_coordinates" type="vertical" class="my.stuff.atmosSigmaLog"/>
      <typedDatasetFactory datatype="Point" class="gov.noaa.obscure.file.Flabulate"/>
    </nj22Config>
    -->

Thank you!

@haileyajohnson

Cool! Thanks for trying it out! I think you need to put values in your magnitude and direction variables: they need to contain just the index of the corresponding u and v element (so just 0 to u/v.length).

@marceloandrioni
Author

Hi @haileyajohnson. I am not sure I got this right. I included the values definition in the variables:

    <dataset name="cmems_uv_and_magdir"
             ID="cmems_uv_and_magdir"
             urlPath="datasets/cmems/cmems_uv_and_magdir"
             dataType="Grid">

        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
                location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">

            <variable name="cspd" shape="time depth latitude longitude" type="float">
                <attribute name="vectorize_mag" value="uo/vo" />
                <values start="0" increment="1" />
                <attribute name="long_name" value="current speed" />
                <attribute name="units" value="m/s" />
            </variable>

            <variable name="cdir" shape="time depth latitude longitude" type="float">
                <attribute name="vectorize_dir" value="uo/vo" />
                <values start="0" increment="1" />
                <attribute name="long_name" value="current direction" />
                <attribute name="units" value="degrees" />
            </variable>

        </netcdf>

    </dataset>

But now, after downloading the file using NCSS, the largest value for magnitude and direction shows as 184040, that is, the last index of my dataset (3 x 3 x 121 x 169 = 184041 elements, counting from 0).
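
A quick check of that number, assuming the values element with start="0" and increment="1" simply fills the variable with 0, 1, 2, ... in order:

```python
import math

# Dimension sizes of the test file, as listed earlier in this thread.
shape = {"time": 3, "depth": 3, "latitude": 121, "longitude": 169}

n_elements = math.prod(shape.values())  # total number of grid points
max_index = n_elements - 1              # last value written by start=0, increment=1

print(n_elements, max_index)  # 184041 184040
```

So a maximum of 184040 is what the raw index values would give, rather than a computed magnitude or direction.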

Image

@haileyajohnson

hmm that looks like it's getting the min/max values from the un-converted values, which would be a bug....
do the values themselves look right?

@marceloandrioni
Author

Running ncdump -v cspd cmems_uv_and_magdir.nc shows the index values instead of the magnitude.

Image

The funny thing is that I also tried a direct access using xarray and then the values were all zero.

Image

@haileyajohnson

Doesn't look like it's working then haha. I'll take a look at it this afternoon, but we should maybe move this discussion to an issue on my repo and let unidata close this one.

@marceloandrioni
Author

No problem. Should I close this and open a new one on vectorize?


4 participants