Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operations.Send() is radically slow #3638

Open
nickger opened this issue Oct 8, 2024 · 4 comments
Open

Operations.Send() is radically slow #3638

nickger opened this issue Oct 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@nickger
Copy link

nickger commented Oct 8, 2024

What package are you referring to?

Operations.

Describe the bug

Operations.Send() demonstrates a very slow performance in general and particularly for SQLiteTransport. I've decided to report it as a bug because it is not a small issue: real projects require too much time to export. Currently the time to write locally 7500 elements is about 10 min. I've not checked server transport, but the issue is not in transport code (see below).

To Reproduce

A high level code to send the data collected in a queue from Revit (Task.Run is omitted):

bool started = true;
using (var transp = new CustomIFCExport.SQLiteTransport(@"C:\Users\GNI\Documents\GeomExports", null, doc.Title))
{
   while (started || !dataQueue.IsEmpty)
    {
        if (dataQueue.TryDequeue(out Base data))
        {
           await Operations.Send(data, transp, false);  // Call the method that sends data
                                            
        }
        else
        {
            // If no data is available, check if collection has finished
            if (await Task.WhenAny(collectionCompleted.Task, Task.Delay(100)) == collectionCompleted.Task && dataQueue.IsEmpty)
            {
                // Exit loop if collection is complete and queue is empty
                started = false;
                
            }
        }
        
    }
                           
}

Additional context

We use local transports to speed up the development and enable local collaboration instead of using Speckle Server for experiments. I am entirely sure this is also important for other users and developers.

Proposed Solution (if any)

The code wich explicitelly repeats the same steps but with modified awaiter for transport write finish:

bool started = true;
using (var transp = new CustomIFCExport.SQLiteTransport(@"C:\Users\GNI\Documents\GeomExports", null, doc.Title))
{
    BaseObjectSerializerV2 serializerV2 = new BaseObjectSerializerV2( new List<ITransport>() { transp }, null, default);
    transp.BeginWrite();
    while (started || !dataQueue.IsEmpty)
    {
        if (dataQueue.TryDequeue(out Base data))
        {
            serializerV2.Serialize(data);  // Call the method that sends data
                                            
        }
        else
        {
            // If no data is available, check if collection has finished
            if (await Task.WhenAny(collectionCompleted.Task, Task.Delay(100)) == collectionCompleted.Task && dataQueue.IsEmpty)
            {
                // Exit loop if collection is complete and queue is empty
                started = false;
                transp.EndWrite();
            }
        }
        
    }
    await transp.WriteComplete();  //this resolves the issue with degraded performance                      
}

The point is: Operations.SerializerSend() creates tasks to wait for EVERY object write operation. This is an overkill: we need to wait the transport globally, not for each object. Moving this method outside to the level, where the transport is instantiated resolves the issue. This must also be documented - that waiting for the writer is not a responsibility of the sender.

@nickger nickger added the bug Something isn't working label Oct 8, 2024
Copy link

linear bot commented Oct 8, 2024

@AlanRynne
Copy link
Member

Hi @nickger!

Thanks for bringing this up.

The Operations.Send function wasn't really designed to be called multiple times in sequence sending small objects, but rather to send one big fat object to the transports of choice. Both Operations.Send and Operations.Receive where designed as self-contained utility functions, hence why they wait for the writer to finish: It is not expected you'd recycle the write operation of the transport for successive sends.

That being said... 👇🏼

Does the performance issue still exist if you wrap your data objects inside a new base object and just send that? This would reduce the Operations.Send calls to just 1 instead of having a loop. This would require you to wait until the collection has completed (as per your loop suggests)

i.e.

var rootObject = new Base();
rootObject["@data"] = dataQueue.ToList()

await Operations.Send(data, transp, false);

If your use-case has some limitations that require you to use a queue instead for performance reasons, then your proposed code seems like it should do the trick just fine.

Out of curiosity, how many objects would you expect to come out of the queue?

@nickger
Copy link
Author

nickger commented Oct 8, 2024

Hi, Alan.
We are discovering a scenario to export IFC from Revit through an intermediate storage as Revit itself is horrable at this. As such the amount of objects can easily reach 100000. Doing collection, serialization and writing sequentially is not very smart and takes about 20-30Gb for nothing. We couldn't even imagine that it is a defaul scenario. Pipe processing is prefferable to extract the data. Converting to IFC or whatever is the next step.
Anyway, would be great to see such important details in the documentation. For now it looks false too simple.

@didimitrie
Copy link
Member

We're working on improving performance left and right throughout the codebase, so thanks for the feedback. We'll update the docs to reflect your comment, as it's true it can be misleading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants