Performance issue of AssetsTools.NET #86

Open
myocytebd opened this issue Oct 23, 2022 · 12 comments
Labels
enhancement New feature or request

Comments

@myocytebd

I tried an AssetsTools.NET hello-world that dumps an asset list, but found it slower than expected.
A minimal loop like this can end up with pretty bad performance.

foreach (var assetInfo in assetsFile.table.assetFileInfo) {
    var dummy = assetsManager.GetTypeInstance(assetsFile, assetInfo);
}

Some performance numbers:
(Linux 5.15.0, Mono 6.12; i7-8809G (Kaby Lake), 3.1-4.2 GHz)
(Resources size: around 4GiB, 64k assets)

With the loop above: ~20s
With the loop removed, skipping AssetsManager.GetTypeInstance(): < 2s
Sequential read of the assets file: < 0.5s (fully cached)

According to the Mono profiler:

85% of the time is spent in unmanaged code.
GC took 2/3 of that 85% unmanaged time. (I would guess that most of the rest is allocation.)
The allocation count is high, near 30M when I killed the run.
Top-down allocation summary (IMO bytes don't matter; count dominates).

Allocation summary
     Bytes      Count  Average Type name
 178768656    3724347       48 AssetsTools.NET.AssetTypeValueField
 169192824    2643434       64 System.Char[]
 147427880    3676029       40 AssetsTools.NET.AssetTypeValueField[]
 138437568    3840063       36 System.String
 129897912      54316     2391 System.Byte[]
 126876768    2643266       48 System.Text.StringBuilder
  89752352    2804761       32 AssetsTools.NET.AssetTypeValue
  76334176      28707     2659 AssetsTools.NET.ClassDatabaseTypeField[]
  71313368    1273453       56 AssetsTools.NET.AssetTypeTemplateField
  51226184    1282455       39 AssetsTools.NET.AssetTypeTemplateField[]
  50938120    1273453       40 System.Collections.Generic.List<System.Int32>
  44710440    1862935       24 System.Int32
  29761104    1240046       24 System.Single
  27338816     507606       53 System.Int32[]
  19587816     816159       24 System.Byte
  15064728     627697       24 System.Int64
  10482720     436780       24 AssetsTools.NET.AssetTypeArray
   6289536      98274       64 AssetsTools.NET.AssetFileInfoEx
   5592720     233030       24 System.UInt32
   3848256     160344       24 System.SByte

(Using mode=throughput is about 5-10% faster, but the situation does not really change.)
Is it possible to reduce the object count? (I guess most of them will be persistent?)
With so many objects, subsequent GCs could also be slow.

@nesrak1
Owner

nesrak1 commented Oct 23, 2022

Can you check which asset types in particular take a long time to load? I bet it's types whose byte arrays are declared as regular arrays and are therefore read as regular arrays, which turns each element of the byte array into an entire object with properties. One of the more common examples is shaders, IIRC.
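
To illustrate the cost described above, here is a minimal sketch that does not use AssetsTools.NET at all. `ReadPerElement` and `ReadBulk` are hypothetical stand-ins: the first boxes every byte into its own heap object (similar in spirit to the boxed `System.Byte` entries in the allocation summary), the second copies the same data into a single `byte[]`. The exact numbers depend on the runtime, so none are quoted here.

```csharp
using System;

// Hypothetical stand-in for a reader that creates one object per array element.
static object[] ReadPerElement(byte[] raw)
{
    var fields = new object[raw.Length];
    for (int i = 0; i < raw.Length; i++)
        fields[i] = raw[i]; // boxing: one small heap allocation per byte
    return fields;
}

// Hypothetical stand-in for a reader that keeps the data as one byte array.
static byte[] ReadBulk(byte[] raw)
{
    var copy = new byte[raw.Length];
    Buffer.BlockCopy(raw, 0, copy, 0, raw.Length);
    return copy;
}

// Measures the bytes allocated on the current thread while running the action.
static long AllocatedBy(Action action)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    action();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

var raw = new byte[1_000_000];
object[] perElement = Array.Empty<object>();
byte[] bulk = Array.Empty<byte>();

long perElementBytes = AllocatedBy(() => perElement = ReadPerElement(raw));
long bulkBytes = AllocatedBy(() => bulk = ReadBulk(raw));

Console.WriteLine($"per-element: {perElementBytes} bytes allocated for {perElement.Length} objects");
Console.WriteLine($"bulk:        {bulkBytes} bytes allocated for 1 array of {bulk.Length} bytes");
```

The per-element variant allocates one object per byte plus the reference array that holds them, which is the kind of object count that keeps the GC busy.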

@myocytebd
Author

myocytebd commented Oct 23, 2022

> Can you check which asset types in particular take a long time to load? I bet it's types whose byte arrays are declared as regular arrays and are therefore read as regular arrays, which turns each element of the byte array into an entire object with properties. One of the more common examples is shaders, IIRC.

Font is extremely slow: excluding 20 Fonts reduces total wall time by 40%.
The remaining types are relatively evenly distributed.
After excluding Fonts, Mono reports 75% unmanaged time (~9s), of which GC time is ~2s. It is unclear where the remaining 7s are spent (allocation is unlikely to take that long).
The allocation count is high; I would estimate it at 140M (extrapolated from running 1 in 10 objects per type).
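
A per-type breakdown like the one above can be collected with a small timing helper. This is only a sketch: `TimePerGroup` is a hypothetical helper, and the asset list and the `Thread.Sleep` workload are made up to stand in for real asset infos and the `GetTypeInstance` call.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

// Hypothetical helper: runs `work` on every item and sums the elapsed time per group key.
static Dictionary<string, TimeSpan> TimePerGroup<T>(
    IEnumerable<T> items, Func<T, string> groupOf, Action<T> work)
{
    var totals = new Dictionary<string, TimeSpan>();
    foreach (var item in items)
    {
        var sw = Stopwatch.StartNew();
        work(item);
        sw.Stop();

        string key = groupOf(item);
        totals.TryGetValue(key, out TimeSpan soFar); // soFar is zero for a new key
        totals[key] = soFar + sw.Elapsed;
    }
    return totals;
}

// Made-up workload: (type name, simulated load time in milliseconds).
var assets = new (string Type, int Millis)[]
{
    ("Font", 20), ("Texture2D", 1), ("Font", 20), ("Mesh", 2),
};

var totals = TimePerGroup(assets, a => a.Type, a => Thread.Sleep(a.Millis));

// Slowest types first.
foreach (var entry in totals.OrderByDescending(e => e.Value))
    Console.WriteLine($"{entry.Key,-10} {entry.Value.TotalMilliseconds:F1} ms");
```

In a real run, the group key would be the asset's type and the work delegate would be the deserialization call being measured.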

@nesrak1
Owner

nesrak1 commented Oct 23, 2022


The Font asset does indeed use a regular Array instead of a TypelessData one. You can modify the template to read it as a byte array instead of a normal array if you want.

(Note: I wrote this for the at3 branch. For at2, you'll need to change MakeValue to the AssetTypeInstance constructor.)

using System;
using System.Linq; // FirstOrDefault
using AssetsTools.NET;
using AssetsTools.NET.Extra;

static AssetTypeValueField? GetByteArrayFont(AssetsManager am, AssetsFileInstance inst, AssetFileInfo inf)
{
    AssetTypeTemplateField fontTemp = am.GetTemplateBaseField(inst, inf);
    AssetTypeTemplateField? m_FontData = fontTemp.Children.FirstOrDefault(f => f.Name == "m_FontData");
    
    if (m_FontData == null)
        return null;

    AssetTypeTemplateField m_FontData_Array = m_FontData.Children[0];

    m_FontData_Array.ValueType = AssetValueType.ByteArray;
    m_FontData_Array.Type = "TypelessData";

    AssetTypeValueField baseField = fontTemp.MakeValue(inst.file.Reader, inf.AbsoluteByteStart);
    return baseField;
}

void Load()
{
    var assetsManager = new AssetsManager();
    assetsManager.LoadClassPackage("classdata.tpk");

    var afileInst = assetsManager.LoadAssetsFile(args[0], false);
    var afile = afileInst.file;

    assetsManager.LoadClassDatabaseFromPackage(afile.Metadata.UnityVersion);

    foreach (var assetInfo in afile.GetAssetsOfType(AssetClassID.Font))
    {
        var dummy = GetByteArrayFont(assetsManager, afileInst, assetInfo);
        if (dummy == null)
            continue; // GetByteArrayFont found no m_FontData field in the template
        Console.WriteLine($"data size: {dummy["m_FontData.Array"].AsByteArray.Length} bytes");
    }
}

if (args.Length < 1)
{
    Console.WriteLine("need a file argument");
    return;
}

Load();

@myocytebd
Author

I tried to make an (assets-only) PoC to support thread-safe reading.
#87
The speed-up of the simple loop above is around 2.5x to 3x (excluding old-gen GC time) on 4C8T.
Do you think threading support is worth it?
(Unfortunately C# doesn't have something like a thread sanitizer, so verifying threading behavior is tough.)

@nesrak1
Owner

nesrak1 commented Oct 25, 2022

Maybe I can look at it later, but right now I'm still trying to fix up the at3 branch to be mostly bug free before I move onto any newer features like this. However, the bulk of the wait time isn't reading from disk but deserialization, so if I were doing this, I wouldn't touch the binary readers (plus that just adds more complexity).

nesrak1 added the enhancement (New feature or request) label Oct 25, 2022
@myocytebd
Author

> However, the bulk of the wait time isn't reading from disk but deserialization, so if I were doing this, I wouldn't touch the binary readers (plus that just adds more complexity).

The reader is the major (or only, if I'm correct) thing that prevents thread-safe reading and deserialization.

@nesrak1
Owner

nesrak1 commented Oct 25, 2022

What I meant was you could read a bunch of assets into individual memory streams/binary readers and not worry about sharing the same stream/reader. The only thing that would have to be single threaded would be disk->memory stream but that would be it.
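
The idea above can be sketched without AssetsTools.NET. Everything in this example is an assumption made for illustration: the temp file stands in for an assets file, each "asset" is just a run of 32-bit integers, and "deserialization" is summing them. Only the disk-to-memory step touches the shared file stream; every worker then gets its own `MemoryStream` and `BinaryReader`.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Build a small temp file of made-up "assets": asset i holds the ints 1..i+1.
string path = Path.GetTempFileName();
var ranges = new (long Offset, int Length)[8];
using (var writer = new BinaryWriter(File.Create(path)))
{
    for (int i = 0; i < ranges.Length; i++)
    {
        long start = writer.BaseStream.Position;
        for (int v = 1; v <= i + 1; v++)
            writer.Write(v);
        ranges[i] = (start, (int)(writer.BaseStream.Position - start));
    }
}

// Step 1 (single-threaded): copy each asset's bytes from disk into its own buffer.
var buffers = new byte[ranges.Length][];
using (var reader = new BinaryReader(File.OpenRead(path)))
{
    for (int i = 0; i < ranges.Length; i++)
    {
        reader.BaseStream.Position = ranges[i].Offset;
        buffers[i] = reader.ReadBytes(ranges[i].Length);
    }
}
File.Delete(path);

// Step 2 (parallel): each worker reads from a private stream, so no reader is shared.
var sums = new long[buffers.Length];
Parallel.For(0, buffers.Length, i =>
{
    using var ms = new MemoryStream(buffers[i], writable: false);
    using var r = new BinaryReader(ms);
    long sum = 0;
    while (ms.Position < ms.Length)
        sum += r.ReadInt32();
    sums[i] = sum; // each index is written by exactly one worker
});

Console.WriteLine(string.Join(", ", sums));
```

This sketch keeps every buffer in memory at once. For a multi-gigabyte resources file, the two steps would presumably need to run in batches to bound memory use.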

@BuslikDrev

And why are you declaring the variable "var dummy" inside the array?

@suphamster

Compressing/decompressing bundle files is very slow, especially LZMA ones. On an i7-8700, CPU usage stays below 10%. A 1 GB bundle takes about 5 minutes to compress or decompress, and it doesn't matter whether I do it in UABEA or with a hand-written tool based on AssetsTools.NET.dll.

@nesrak1
Owner

nesrak1 commented Feb 23, 2024

@suphamster LZMA decompression/compression uses the official 7z SDK, and unfortunately, unless we swap it out for the native version or rewrite it, it'll always be that slow. I don't really know of any other managed C# library that could get the job done.

@suphamster

It seems there is no fast C# LZMA implementation: https://stackoverflow.com/questions/12292593/why-is-lzma-sdk-7-zip-so-slow
I've found only this compressor, https://www.nuget.org/packages/EasyCompressor.LZMA/#readme-body-tab, but according to the benchmark at https://github.com/mjebrahimi/EasyCompressor?tab=readme-ov-file#benchmarks it's also slow as hell.
But maybe its LZ4 version could be used here, because LZ4 is also too slow on big files.

@nesrak1
Owner

nesrak1 commented Feb 23, 2024

EasyCompressor uses the exact same library for LZMA and the same library for LZ4, although the version I'm using for LZ4 is a bit old in order to support .NET 3.5. I'm sure the LZ4 stuff could be upgraded if you wanted more speed, but LZMA is the really slow one.


4 participants