Trimming zeroes in BigInteger #60563
-
#35565 is finally closed and we can move forward to the next series of optimizations. This time I would like to propose the optimization for trimming operation. When arithmetic operation occurred, every time we need to trim leading elements of type
#41495 is another proposal but I would not consider it as a part of this change. I did some research and here is the results: Benchmark[SimpleJob(runStrategy: RunStrategy.Throughput, launchCount: 1)]
[Orderer(SummaryOrderPolicy.Declared)]
public class TrimLeadingZeroesBenchmark
{
public readonly struct ArrayParam
{
private readonly uint[] _array;
private readonly string _text;
internal ArrayParam(string text, uint[] array)
{
_array = array;
_text = text;
}
public Span<uint> Value => _array.AsSpan();
public override string ToString() => _text;
}
[ParamsSource(nameof(Arrays))]
public ArrayParam array;
public static IEnumerable<ArrayParam> Arrays
{
get
{
yield return new("No leading zeroes", new uint[] { 1U, 2U, 3U, 4U, 5U, 6U, 7U, 8U, 9U, 10U, 11U, 12U, 13U, 14U, 15U, 16U, 17U, 18U });
yield return new("One leading zero", new uint[] { 1U, 2U, 3U, 4U, 5U, 6U, 7U, 8U, 9U, 10U, 11U, 12U, 13U, 14U, 15U, 16U, 17U, 0U });
yield return new("Four leading zeroes", new uint[] { 1U, 2U, 3U, 4U, 5U, 6U, 7U, 8U, 9U, 10U, 11U, 12U, 13U, 14U, 0U, 0U, 0U, 0U });
yield return new("Eight leading zeroes", new uint[] { 1U, 2U, 3U, 4U, 5U, 6U, 7U, 8U, 9U, 10U, 0U, 0U, 0U, 0U, 0U, 0U, 0U, 0U });
yield return new("Nine leading zeroes", new uint[] { 1U, 2U, 3U, 4U, 5U, 6U, 7U, 8U, 9U, 0U, 0U, 0U, 0U, 0U, 0U, 0U, 0U, 0U });
}
}
[Benchmark(Baseline = true, Description = "Loop")]
public Span<uint> TrimUsingLoop()
{
Span<uint> span = array.Value;
int length = span.Length;
while (length > 0 && span[length - 1] == 0)
--length;
return span.Slice(0, length);
}
[Benchmark(Description = "MemoryExtensions.TrimEnd")]
public Span<uint> TrimUsingMethod()
=> array.Value.TrimEnd(0U);
[Benchmark(Description = "Vector + unsafe code")]
public Span<uint> TrimUsingVectorUnsafe()
{
Span<uint> span = array.Value;
int length = span.Length;
while (length > Vector<uint>.Count)
{
int offset = length - Vector<uint>.Count;
ref uint ptr = ref Unsafe.Add(ref MemoryMarshal.GetReference(span), offset);
Vector<uint> vec = Unsafe.As<uint, Vector<uint>>(ref ptr);
if (vec != Vector<uint>.Zero)
break;
length = offset;
}
while (length > 0 && span[length - 1] == 0)
--length;
return span.Slice(0, length);
}
[Benchmark(Description = "Vector")]
public Span<uint> TrimUsingVector()
{
Span<uint> span = array.Value;
int length = span.Length;
while (length > Vector<uint>.Count)
{
int offset = length - Vector<uint>.Count;
Vector<uint> vec = new(span.Slice(offset, Vector<uint>.Count));
if (vec != Vector<uint>.Zero)
break;
length = offset;
}
while (length > 0 && span[length - 1] == 0)
--length;
return span.Slice(0, length);
}
}
Vector gives the most significant benefit when the array contains a lot of leading zeroes. And small overhead when the number of zeroes doesn't fit to vector size. However, this is constant price that do not depend on the number elements. Before I can start this work, it would be great to hear the opinion. /cc @jeffhandley , @tannergooding , @danmoseley , @GrabYourPitchforks |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
Thanks for the consideration here @sakno. I think its best if we hold off on this change at the moment. I'm currently looking at a more general refactoring/potential rewrite of This refactoring is being considered for many reasons, including that most of the current bugs stem from it being (since the original implementation from 2010) stored as If you're interested in being involved in such a refactoring; then we'd be happy to coordinate; but that is likely going to require me completing some initial investigations and doing a write-up on the intended roadmap for such an undertaking (which will probably happen in early November at the moment). |
Beta Was this translation helpful? Give feedback.
Thanks for the consideration here @sakno.
I think its best if we hold off on this change at the moment. I'm currently looking at a more general refactoring/potential rewrite of
BigInteger
to just store and do everything astwo's complement
directly.This refactoring is being considered for many reasons, including that most of the current bugs stem from it being (since the original implementation from 2010) stored as
sign
+magnitude
. The current approach also leads to other problems, such as the implementation being overall less performant (which prevented its usage forfloating-point parsing/formatting
and is why we have a separate: https://source.dot.net/#System.Private.CoreLib/Number.B…