Hello, I'm creating something like visual programming for procedural modeling with the help of the library. I've implemented several functions and ran into the following problem: no matter how much data comes into a function or how much data it produces, the execution time is nearly the same. In the video, the performance of the Subdivide Polyline node is always about 7 milliseconds.

AwkwardTest.mp4

At first I thought that this was a bug, but after reading some answers here I came to the conclusion that this is the so-called initial cost. So the question is: is there a way to reduce it? In my case it is quite significant, because users can create trees with thousands of nodes, and if each of them carries this initial cost, the tree will soon lose its ability to recalculate in real time. Here is the function, in case I'm doing something wrong.

import numpy as np
import awkward as ak
def subdivide_polyline(verts, cuts):
    """Only works with a list of polylines."""
    lines_num = len(verts)
    segment_num = ak.num(verts, axis=-2) - 1
    segment_shape = ak.num(verts, axis=-1)[..., :-1]
    _, count_per_seg = ak.broadcast_arrays(segment_shape, cuts)
    new_verts_num = segment_num * cuts + ak.num(verts)
    total_num = ak.sum(new_verts_num)
    # (cuts + 1) interpolation weights per segment:
    # 0, 1/(cuts+1), ..., cuts/(cuts+1)
    seg_weight0 = np.zeros(total_num - lines_num)
    seg_weight1 = ak.unflatten(seg_weight0, ak.flatten(count_per_seg) + 1)
    seg_weight2 = ak.unflatten(seg_weight1, segment_num)
    seg_weight3 = ak.local_index(seg_weight2)
    seg_weight = seg_weight3 / (count_per_seg + 1)
    vert1 = verts[..., :-1, :]
    vert1 = vert1[..., np.newaxis, :]
    vert2 = verts[..., 1:, :]
    vert2 = vert2[..., np.newaxis, :]
    new_verts = linear_interpolation(vert1, vert2, seg_weight)
    new_verts = ak.flatten(new_verts, axis=-2)
    return new_verts

def linear_interpolation(v1, v2, factor):
    return v1 * (1 - factor) + v2 * factor
if __name__ == '__main__':
    p_line = ak.Array([[[0, 0, 0], [1.5, 0, 0], [2, 0.5, 0]],
                       [[0, 0, 0], [0, 1.5, 0], [0, 2, 0.5], [0, 3, 0]]])

    from timeit import timeit
    t = timeit('subdivide_polyline(p_line, ak.Array([1]))',
               'from __main__ import ak, subdivide_polyline, p_line', number=100)
    print(f"{t*10:.1f} ms")  # total seconds for 100 runs -> milliseconds per run
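A quick way to see that this is per-call overhead rather than data-dependent work is to time the same function on inputs of very different sizes. A minimal sketch, assuming the subdivide_polyline defined above is in scope:

import awkward as ak
from timeit import timeit

small = ak.Array([[[0, 0, 0], [1.5, 0, 0], [2, 0.5, 0]]])
big = ak.concatenate([small] * 1000)  # the same polyline repeated 1000x

for name, data in [("small", small), ("big", big)]:
    t = timeit(lambda: subdivide_polyline(data, ak.Array([1])), number=100)
    print(f"{name}: {t * 10:.1f} ms per call")

If the per-call times come out nearly identical, the cost is dominated by the fixed overhead of each operation, not by the amount of data flowing through.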
As you observe, we anticipate a certain "constant overhead" that is independent of the size of the data being operated upon (and scales linearly with the number of steps that you take). Whilst there is room for some performance optimisation here, real-time performance needs (i.e. <16 ms) are somewhat out of scope. You might find for your application that writing a Numba routine to perform this logic (without Awkward) performs better; Numba-jitted functions don't have a significant per-call overhead, IIRC. I will update this conversation with some pointers.
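To illustrate the point, here's a minimal sketch of my own (not a rigorous benchmark, and count_rows is just a hypothetical trivial helper): even a small Awkward operation pays the same dispatch cost on tiny data, whereas a warmed-up @nb.njit function has very little per-call cost.

import awkward as ak
import numba as nb
from timeit import timeit

@nb.njit
def count_rows(arr):
    # hypothetical helper: trivially iterate the array and count rows
    n = 0
    for _ in arr:
        n += 1
    return n

data = ak.Array([[1.0, 2.0], [3.0]])
count_rows(data)  # warm up the JIT once, outside the timed region

t_ak = timeit(lambda: ak.sum(data), number=1000)
t_nb = timeit(lambda: count_rows(data), number=1000)
print(f"awkward: {t_ak:.4f} s, numba: {t_nb:.4f} s (per 1000 calls on tiny data)")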
We don't yet have a section on this in our docs, but I'll get round to that at some point! The best solution for your needs (low overhead) is probably Numba. The trade-off of using Numba is primarily expressiveness; some things are more cumbersome or sometimes impossible to write. However, in this case, it should be fairly straightforward. Here's a function doing what I think you're hoping to achieve: subdividing a set of polylines N times. This function evaluates much, much faster than the pure-Awkward variant (because the data are so small), which is beneficial for your use case. I get 0.02 seconds per thousand iterations, which compares with 8 seconds per thousand for the pure-Awkward case. I'm assuming that you want to take a ragged array of polylines, i.e. variable-length lists of vertices:

import numpy as np
import awkward as ak
import numba as nb
from timeit import timeit
@nb.njit
def subdivide_polylines(poly_lines, n_line_subdivisions):
    # Pre-pass over the data to determine how large our arrays need to be
    final_vertex_count = 0
    for line in poly_lines:
        n_segments = len(line) - 1
        final_vertex_count += len(line) + n_line_subdivisions * n_segments
    flat_result = np.empty((final_vertex_count, 3), dtype=np.float64)
    flat_count = np.zeros(len(poly_lines), dtype=np.int64)
    # New number of vertices per segment
    n_final_segment_vertices = 2 + n_line_subdivisions
    # Keep track of the vertex index in the flat result
    l_vertex_index = 0
    # For each polyline
    for i_line, line in enumerate(poly_lines):
        assert len(line) >= 2
        # Given that stop(segment j) == start(segment j+1), we skip the first
        # vertex of each segment to avoid double counting.
        # Hence, let's write the line's very first vertex here
        start = np.asarray(line[0])
        flat_result[l_vertex_index] = start
        flat_count[i_line] += 1
        l_vertex_index += 1
        # For each segment (three vertices = two segments)
        n_segments = len(line) - 1
        for j_segment in range(n_segments):
            # Get these as NumPy arrays
            start = np.asarray(line[j_segment])
            stop = np.asarray(line[j_segment + 1])
            # For each vertex in the result, skipping the first (as explained above)
            for k_segment_vertex in range(1, n_final_segment_vertices):
                # Interpolate between start and stop
                t = k_segment_vertex / (n_final_segment_vertices - 1)
                flat_result[l_vertex_index] = start * (1 - t) + stop * t
                l_vertex_index += 1
        flat_count[i_line] = len(line) + n_line_subdivisions * n_segments
    return flat_result, flat_count
if __name__ == "__main__":
p_line = ak.Array(
[
[[0, 0, 0], [1.5, 0, 0], [2, 0.5, 0]],
[[0, 0, 0], [0, 1.5, 0], [0, 2, 0.5], [0, 3, 0]],
]
)
# Trigger the jit to compile (ensure this function is only defined once, or you'll pay the JIT cost each time)
subdivide_polylines(p_line, 1)
t = timeit("subdivide_polylines(p_line, 1)", number=1000, globals=globals())
print(f"{t:.3f} s per 1000")
flat_result, flat_count = subdivide_polylines(p_line, 1)
result = ak.unflatten(flat_result, flat_count, axis=0)
print(result.tolist()) |
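If you want to reproduce the comparison quoted above, the pure-Awkward version from the question can be timed the same way. A sketch, assuming the subdivide_polyline from the question is also in scope:

t_ak = timeit("subdivide_polyline(p_line, ak.Array([1]))", number=1000, globals=globals())
print(f"{t_ak:.3f} s per 1000 (pure Awkward)")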