
Merge pull request #39 from Abhishek-1Bhatt/deeponet
Deeponet
foldfelis authored Feb 27, 2022
2 parents f47637f + d3e58f1 commit 82b0465
Showing 12 changed files with 334 additions and 2 deletions.
35 changes: 35 additions & 0 deletions README.md
@@ -36,11 +36,15 @@ It performs Fourier transformation across infinite-dimensional function spaces a
With only one time step of information for learning, it can predict the following few steps with low loss
by linking the operators into a Markov chain.

**DeepONet operator** (Deep Operator Network) learns a neural operator with the help of two sub-networks, referred to as the branch and the trunk network. The branch network is fed the initial-condition data, whereas the trunk is fed the locations where the target (output) is evaluated from the corresponding initial conditions. It is important that the output sizes of the branch and trunk subnets are the same so that a dot product can be performed between them.
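
A minimal sketch of this contraction with plain arrays (the sizes are purely illustrative):

```julia
# Sketch only: both subnets end in p features, so their outputs contract over that dimension.
p = 72
branch_out = rand(Float32, p, 100)  # p features for each of 100 input-function samples
trunk_out  = rand(Float32, p, 50)   # p features for each of 50 query locations
u = branch_out' * trunk_out         # 100 × 50 predictions: uⱼₖ = Σᵢ bᵢⱼ tᵢₖ
```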

Currently, the `FourierOperator` layer is provided in this work.
As for models, `FourierNeuralOperator` and `MarkovNeuralOperator` are provided. Please take a glance at them [here](src/model.jl).

## Usage

### Fourier Neural Operator

```julia
model = Chain(
# lift (d + 1)-dimensional vector field to n-dimensional vector field
@@ -76,6 +80,32 @@ opt = Flux.Optimiser(WeightDecay(1f-4), Flux.ADAM(1f-3))
Flux.@epochs 50 Flux.train!(loss, params(model), data, opt)
```

### DeepONet

```julia
# Tuple of Ints for the branch-net architecture and then for the trunk net, followed by the activations for branch and trunk respectively
model = DeepONet((32,64,72), (24,64,72), σ, tanh)
```
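
A quick forward-pass sketch for this constructor (the sizes below are made up for illustration): the branch takes 32 sensor values per sample, the trunk takes 24 coordinates per query location, and the result is one prediction per (sample, location) pair.

```julia
xtrain = rand(Float32, 32, 16)  # branch input: [sensor values × batch of input functions]
grid   = rand(Float32, 24, 10)  # trunk input:  [coordinates × query locations]
model(xtrain, grid)             # 16 × 10 array: one prediction per (sample, location)
```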
Or specify the branch and trunk as separate `Chain`s from Flux and pass them to `DeepONet`:

```julia
branch = Chain(Dense(32,64,σ), Dense(64,72,σ))
trunk = Chain(Dense(24,64,tanh), Dense(64,72,tanh))
model = DeepONet(branch,trunk)
```
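
Because the subnets are ordinary Flux models, they are not limited to `Dense` layers; for example, a convolutional branch works as long as both nets end in the same output width. A rough sketch (layer sizes are illustrative, not part of the package):

```julia
branch = Chain(
    x -> reshape(x, 32, 1, :),     # (sensors, channels, batch) layout for Conv
    Conv((3,), 1 => 4, σ; pad=1),  # 32 × 4 × batch
    Flux.flatten,                  # 128 × batch
    Dense(128, 72),                # end in the shared output width
)
trunk = Chain(Dense(24, 64, tanh), Dense(64, 72, tanh))
model = DeepONet(branch, trunk)
```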

You can again specify loss, optimization and training parameters just as you would for a simple neural network with Flux.

```julia
loss(xtrain,ytrain,sensor) = Flux.Losses.mse(model(xtrain,sensor),ytrain)
evalcb() = @show(loss(xval,yval,grid))

learning_rate = 0.001
opt = ADAM(learning_rate)
parameters = params(model)
Flux.@epochs 400 Flux.train!(loss, parameters, [(xtrain,ytrain,grid)], opt, cb = evalcb)
```
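
After training, the model can be queried on held-out data at the same sensor grid. A minimal evaluation sketch, assuming `xval`, `yval`, and `grid` as used in the snippet above:

```julia
ỹ = model(xval, grid)                        # predictions for the validation inputs
mean_abs_error = sum(abs.(yval .- ỹ)) / length(ỹ)
```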

## Examples

PDE training examples are provided in the `example` folder.
@@ -84,6 +114,10 @@ PDE training examples are provided in `example` folder.

[Burgers' equation](example/Burgers)

### DeepONet implementation for solving Burgers' equation

[Burgers' equation](example/Burgers/src/Burgers_deeponet.jl)

### Two-dimensional Fourier Neural Operator

[Double Pendulum](example/DoublePendulum)
@@ -113,3 +147,4 @@ PDE training examples are provided in `example` folder.
- [Neural Operator: Graph Kernel Network for Partial Differential Equations](https://arxiv.org/abs/2003.03485)
- [zongyi-li/graph-pde](https://github.com/zongyi-li/graph-pde)
- [Markov Neural Operators for Learning Chaotic Systems](https://arxiv.org/abs/2106.06898)
- [DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators](https://arxiv.org/abs/1910.03193)
34 changes: 33 additions & 1 deletion docs/src/index.md
@@ -10,7 +10,7 @@ Documentation for [NeuralOperators](https://github.com/foldfelis/NeuralOperators.jl).
|:----------------:|:--------------:|
| ![](https://github.com/foldfelis/NeuralOperators.jl/blob/master/example/FlowOverCircle/gallery/ans.gif?raw=true) | ![](https://github.com/foldfelis/NeuralOperators.jl/blob/master/example/FlowOverCircle/gallery/inferenced.gif?raw=true) |

The demonstration showing above is Navier-Stokes equation learned by the `MarkovNeuralOperator` with only one time step information.
The demonstration shown above is Navier-Stokes equation learned by the `MarkovNeuralOperator` with only one time step information.
Example can be found in [`example/FlowOverCircle`](https://github.com/foldfelis/NeuralOperators.jl/tree/master/example/FlowOverCircle).

## Abstract
@@ -30,6 +30,8 @@ It performs Fourier transformation across infinite-dimensional function spaces a
With only one time step of information for learning, it can predict the following few steps with low loss
by linking the operators into a Markov chain.

**DeepONet operator** (Deep Operator Network) learns a neural operator with the help of two sub-networks, referred to as the branch and the trunk network. The branch network is fed the initial-condition data, whereas the trunk is fed the locations where the target (output) is evaluated from the corresponding initial conditions. It is important that the output sizes of the branch and trunk subnets are the same so that a dot product can be performed between them.

Currently, the `FourierOperator` layer is provided in this work.
As for models, `FourierNeuralOperator` and `MarkovNeuralOperator` are provided.
Please take a glance at them [here](apis.html#Models).
@@ -44,6 +46,8 @@ pkg> add NeuralOperators

## Usage

### Fourier Neural Operator

```julia
model = Chain(
# lift (d + 1)-dimensional vector field to n-dimensional vector field
@@ -78,3 +82,31 @@ loss(𝐱, 𝐲) = sum(abs2, 𝐲 .- model(𝐱)) / size(𝐱)[end]
opt = Flux.Optimiser(WeightDecay(1f-4), Flux.ADAM(1f-3))
Flux.@epochs 50 Flux.train!(loss, params(model), data, opt)
```

### DeepONet

```julia
# Tuple of Ints for the branch-net architecture and then for the trunk net, followed by the activations for branch and trunk respectively
model = DeepONet((32,64,72), (24,64,72), σ, tanh)
```
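
The keyword arguments of this constructor control how the generated `Dense` layers are initialised and whether they carry biases (see the `DeepONet` docstring), e.g.:

```julia
model = DeepONet((32, 64, 72), (24, 64, 72), σ, tanh;
                 init_branch=Flux.glorot_normal, bias_trunk=false)
```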

Or specify the branch and trunk as separate `Chain`s from Flux and pass them to `DeepONet`:

```julia
branch = Chain(Dense(32,64,σ), Dense(64,72,σ))
trunk = Chain(Dense(24,64,tanh), Dense(64,72,tanh))
model = DeepONet(branch,trunk)
```

You can again specify loss, optimization and training parameters just as you would for a simple neural network with Flux.

```julia
loss(xtrain,ytrain,sensor) = Flux.Losses.mse(model(xtrain,sensor),ytrain)
evalcb() = @show(loss(xval,yval,grid))

learning_rate = 0.001
opt = ADAM(learning_rate)
parameters = params(model)
Flux.@epochs 400 Flux.train!(loss, parameters, [(xtrain,ytrain,grid)], opt, cb = evalcb)
```
A more complete example using the DeepONet architecture to solve Burgers' equation can be found in the [examples](../../example/Burgers/src/Burgers_deeponet.jl).
3 changes: 2 additions & 1 deletion example/Burgers/src/Burgers.jl
@@ -5,6 +5,7 @@ using Flux
using CUDA

include("data.jl")
include("Burgers_deeponet.jl")

__init__() = register_burgers()

@@ -30,7 +31,7 @@ function train()
Dense(128, 1),
flatten
) |> device

loss(𝐱, 𝐲) = sum(abs2, 𝐲 .- m(𝐱)) / size(𝐱)[end]

loader_train, loader_test = get_dataloader()
32 changes: 32 additions & 0 deletions example/Burgers/src/Burgers_deeponet.jl
@@ -0,0 +1,32 @@
function train_don()
if has_cuda()
@info "CUDA is on"
device = gpu
CUDA.allowscalar(false)
else
device = cpu
end

x, y = get_data_don(n=300)
xtrain = x[1:280, :]' |> device
xval = x[end-19:end, :]' |> device

ytrain = y[1:280, :] |> device
yval = y[end-19:end, :] |> device

grid = collect(range(0, 1, length=1024))' |> device

learning_rate = 0.001
opt = ADAM(learning_rate)

m = DeepONet((1024,1024,1024),(1,1024,1024),gelu,gelu)
loss(xtrain,ytrain,sensor) = Flux.Losses.mse(m(xtrain,sensor),ytrain)
evalcb() = @show(loss(xval,yval,grid))

Flux.@epochs 400 Flux.train!(loss, params(m), [(xtrain,ytrain,grid)], opt, cb = evalcb)
ỹ = m(xval, grid)

diffvec = vec(abs.((yval .- ỹ)))
mean_diff = sum(diffvec)/length(diffvec)
return mean_diff
end
9 changes: 9 additions & 0 deletions example/Burgers/src/data.jl
@@ -27,6 +27,15 @@ function get_data(; n=2048, Δsamples=2^3, grid_size=div(2^13, Δsamples), T=Flo
return x_loc_data, y_data
end

function get_data_don(; n=2048, Δsamples=2^3, grid_size=div(2^13, Δsamples))
file = matopen(joinpath(datadep"Burgers", "burgers_data_R10.mat"))
x_data = collect(read(file, "a")[1:n, 1:Δsamples:end])
y_data = collect(read(file, "u")[1:n, 1:Δsamples:end])
close(file)

return x_data, y_data
end

function get_dataloader(; n_train=1800, n_test=200, batchsize=100)
𝐱, 𝐲 = get_data(n=2048)

5 changes: 5 additions & 0 deletions example/Burgers/test/deeponet.jl
@@ -0,0 +1,5 @@
@testset "DeepONet Training Accuracy" begin
ϵ = Burgers.train_don()

@test ϵ < 0.4
end
1 change: 1 addition & 0 deletions example/Burgers/test/runtests.jl
@@ -3,4 +3,5 @@ using Test

@testset "Burgers" begin
include("data.jl")
include("deeponet.jl")
end
132 changes: 132 additions & 0 deletions src/DeepONet.jl
@@ -0,0 +1,132 @@
"""
`DeepONet(architecture_branch::Tuple, architecture_trunk::Tuple,
act_branch = identity, act_trunk = identity;
init_branch = Flux.glorot_uniform,
init_trunk = Flux.glorot_uniform,
bias_branch=true, bias_trunk=true)`
`DeepONet(branch_net::Flux.Chain, trunk_net::Flux.Chain)`
Create an (unstacked) DeepONet architecture as proposed by Lu et al.
arXiv:1910.03193
The model works as follows:
x --- branch --
               |
                -⊠--u-
               |
y --- trunk ---
Where `x` represents the input function, discretely evaluated at its respective sensors.
So the input is of shape [m] for one instance or [m x b] for a training set.
`y` are the probing locations for the operator to be trained. It has shape [N x n] for
N different variables in the PDE (i.e. spatial and temporal coordinates) with n distinct evaluation points each.
`u` is the solution of the queried instance of the PDE, given by the specific choice of parameters.
Both inputs `x` and `y` are multiplied together via dot product Σᵢ bᵢⱼ tᵢₖ.
You can set up this architecture in two ways:
1. By specifying the architecture and all its parameters as given above. This always creates
`Dense` layers for the branch and trunk net and corresponds to the DeepONet proposed by Lu et al.
2. By passing two architectures in the form of two Chain structs directly. Do this if you want more
flexibility and e.g. use an RNN or CNN instead of simple `Dense` layers.
Strictly speaking, DeepONet does not require either the branch or the trunk net to be a simple
DNN. Usually, though, this is the case, which is why it is treated as the default here.
# Example
Consider a transient 1D advection problem ∂ₜu + u ⋅ ∇u = 0, with an IC u(x,0) = g(x).
We are given several (b = 200) instances of the IC, discretized at 50 points each and want
to query the solution for 100 different locations and times in [0;1].
That makes the branch input of shape [50 x 200] and the trunk input of shape [2 x 100]. So the
input size for the branch net is 50 and 2 for the trunk net.
# Usage
```julia
julia> model = DeepONet((32,64,72), (24,64,72))
DeepONet with
branch net: (Chain(Dense(32, 64), Dense(64, 72)))
Trunk net: (Chain(Dense(24, 64), Dense(64, 72)))
julia> model = DeepONet((32,64,72), (24,64,72), σ, tanh; init_branch=Flux.glorot_normal, bias_trunk=false)
DeepONet with
branch net: (Chain(Dense(32, 64, σ), Dense(64, 72, σ)))
Trunk net: (Chain(Dense(24, 64, tanh; bias=false), Dense(64, 72, tanh; bias=false)))
julia> branch = Chain(Dense(2,128),Dense(128,64),Dense(64,72))
Chain(
Dense(2, 128), # 384 parameters
Dense(128, 64), # 8_256 parameters
Dense(64, 72), # 4_680 parameters
) # Total: 6 arrays, 13_320 parameters, 52.406 KiB.
julia> trunk = Chain(Dense(1,24),Dense(24,72))
Chain(
Dense(1, 24), # 48 parameters
Dense(24, 72), # 1_800 parameters
) # Total: 4 arrays, 1_848 parameters, 7.469 KiB.
julia> model = DeepONet(branch,trunk)
DeepONet with
branch net: (Chain(Dense(2, 128), Dense(128, 64), Dense(64, 72)))
Trunk net: (Chain(Dense(1, 24), Dense(24, 72)))
```
"""
struct DeepONet
branch_net::Flux.Chain
trunk_net::Flux.Chain
end

# Declare the function that assigns Weights and biases to the layer
function DeepONet(architecture_branch::Tuple, architecture_trunk::Tuple,
act_branch = identity, act_trunk = identity;
init_branch = Flux.glorot_uniform,
init_trunk = Flux.glorot_uniform,
bias_branch=true, bias_trunk=true)

@assert architecture_branch[end] == architecture_trunk[end] "Branch and Trunk net must share the same amount of nodes in the last layer. Otherwise Σᵢ bᵢⱼ tᵢₖ won't work."

# To construct the subnets we use the helper function in subnets.jl
# Initialize the branch net
branch_net = construct_subnet(architecture_branch, act_branch;
init=init_branch, bias=bias_branch)
# Initialize the trunk net
trunk_net = construct_subnet(architecture_trunk, act_trunk;
init=init_trunk, bias=bias_trunk)

return DeepONet(branch_net, trunk_net)
end

Flux.@functor DeepONet

#= The actual layer that does stuff
x is the input function, evaluated at m locations (or m x b in case of batches)
y is the array of sensors, i.e. the variables of the output function
with shape (N x n) - N different variables with each n evaluation points =#
function (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat)
# Assign the parameters
branch, trunk = a.branch_net, a.trunk_net

#= Dot product needs a dim to contract
However, we perform the transformations by the NNs always in the first dim
so we need to adjust (i.e. transpose) one of the inputs,
which we do on the branch input here =#
return branch(x)' * trunk(y)
end

# Sensors stay the same and shouldn't be batched
(a::DeepONet)(x::AbstractArray, y::AbstractArray) =
throw(ArgumentError("Sensor locations fed to trunk net can't be batched."))

# Print nicely
function Base.show(io::IO, l::DeepONet)
print(io, "DeepONet with\nbranch net: (",l.branch_net)
print(io, ")\n")
print(io, "Trunk net: (", l.trunk_net)
print(io, ")\n")
end
4 changes: 4 additions & 0 deletions src/NeuralOperators.jl
@@ -8,6 +8,10 @@ module NeuralOperators
using Zygote
using ChainRulesCore

export DeepONet

include("fourier.jl")
include("model.jl")
include("DeepONet.jl")
include("subnets.jl")
end
39 changes: 39 additions & 0 deletions src/subnets.jl
@@ -0,0 +1,39 @@
"""
Construct a Chain of `Dense` layers from a given tuple of integers.
Input:
A tuple (m, n, o, p) of integers that describe the widths of the successive `Dense` layers to construct
Output:
A Flux `Chain` of `Dense` layers, one for each consecutive pair of tuple elements, with the widths given by those elements
# Example
```julia
julia> model = NeuralOperators.construct_subnet((2,128,64,32,1))
Chain(
Dense(2, 128), # 384 parameters
Dense(128, 64), # 8_256 parameters
Dense(64, 32), # 2_080 parameters
Dense(32, 1), # 33 parameters
) # Total: 8 arrays, 10_753 parameters, 42.504 KiB.
julia> model([2,1])
1-element Vector{Float32}:
-0.7630446
```
"""
function construct_subnet(architecture::Tuple, σ = identity;
init=Flux.glorot_uniform, bias=true)
# First, create an array that contains all Dense layers independently
# Given n-element architecture constructs n-1 layers
layers = Array{Flux.Dense}(undef, length(architecture)-1)
@inbounds for i ∈ 2:length(architecture)
layers[i-1] = Flux.Dense(architecture[i-1], architecture[i], σ;
init=init, bias=bias)
end

# Concatenate the layers into a string, parse it as Flux `Chain` constructor syntax,
# and evaluate it to build the chained model
return Meta.parse("Chain("*join(layers,",")*")") |> eval
end