How to reduce dds_write time #660

Closed
BH1SCW opened this issue Dec 21, 2020 · 3 comments
Labels
question Explanation on code or feature is requested

Comments

@BH1SCW

BH1SCW commented Dec 21, 2020

Hi, I ran some tests sending large buffers with Cyclone DDS, and I would like to reduce the time spent in dds_write.

Your advice is always welcome and appreciated.

t1 = get_time();
rc = dds_write(...);
t2 = get_time();
interval = t2 - t1;
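
A minimal sketch of how such a measurement might be taken in C, assuming a POSIX system. The helper names (`now_ns`, `write_stats_t`, `stats_add`, `stats_avg_ms`) are made up for illustration and are not part of Cyclone DDS; the `dds_write` call itself would sit between two `now_ns()` calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds; unlike wall-clock time it never jumps backwards. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Running count, sum and maximum of the measured intervals. */
typedef struct {
    int64_t count, sum_ns, max_ns;
} write_stats_t;

static void stats_add(write_stats_t *s, int64_t dt_ns)
{
    assert(dt_ns >= 0);
    s->count++;
    s->sum_ns += dt_ns;
    if (dt_ns > s->max_ns)
        s->max_ns = dt_ns;
}

/* Average interval in milliseconds, or 0 if nothing was recorded. */
static double stats_avg_ms(const write_stats_t *s)
{
    return s->count ? (double)s->sum_ns / (double)s->count / 1e6 : 0.0;
}
```

In the write loop this would be used as `t1 = now_ns(); rc = dds_write(...); stats_add(&stats, now_ns() - t1);`, with the average and maximum read out at the end of the run.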

Here we define two test structs, roughly 1.16 MiB (1,212,296 bytes) and 2.48 MiB (2,602,056 bytes):

typedef struct OGM_Map_Struct
{
    octet data[1212296];
} ogm_map_t;

typedef struct Roadmap_Vis
{
    octet data[2602056];
} roadmap_vis_t;

average time:

ogm_map  17.11 ms
roadmap_vis 40.95 ms

Max time:

ogm_map  26 ms
roadmap_vis 75.02 ms

First, I am not sure whether this is a normal range. I hope the time can be reduced, because inside a loop the call may block the whole processing logic.

I use best-effort mode, and data loss is acceptable.
The data should not need serialisation because it is just an array. Would it help to disable serialisation?
How can serialisation be disabled? By setting //CycloneDDS/Domain/Internal/MaxSampleSize to 0 B?

If an async write mode were supported, that would be a nice way to solve the problem, but according to ros2/rmw_cyclonedds#89 async write is not supported by Cyclone DDS at the moment.

Is there any other way to reduce the write time?

Thanks

@eboasson
Contributor

If you look at where the time is spent, it is spent waiting for sendmsg to complete, that is, the network can't keep up. The serialisation ends up being a memcpy and with modern memory subsystems doesn't usually become an issue until you exceed a GB/s. The reasoning behind that figure is simple: memory bandwidth is >>10GB/s, but there are some copies needed to serialise and send the data, so let's say you get 10GB/s effective. At 1GB/s, that means you're spending 10% of your time copying.
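
The arithmetic can be checked with a short sketch. The two helper functions are made up for illustration; the inputs are the 10 GB/s effective bandwidth assumed above and the sizes and average times reported in the question. With those numbers the writes reach roughly 71 MB/s and 64 MB/s (about 570 and 510 Mbit/s), far below the rate at which copying would matter.

```c
#include <assert.h>

/* Fraction of time spent copying when data is produced at `rate_gb_s`
   and the effective memory bandwidth is `bandwidth_gb_s`. */
static double copy_time_fraction(double rate_gb_s, double bandwidth_gb_s)
{
    assert(bandwidth_gb_s > 0.0);
    return rate_gb_s / bandwidth_gb_s;
}

/* Throughput in MB/s (10^6 bytes per second) of one write of `bytes`
   that took `millis` milliseconds. */
static double throughput_mb_s(double bytes, double millis)
{
    assert(millis > 0.0);
    return bytes / (millis * 1e-3) / 1e6;
}
```

For example, `copy_time_fraction(1.0, 10.0)` gives the 10% quoted above, and `throughput_mb_s(1212296, 17.11)` gives the achieved rate for the smaller struct.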

Indeed asynchronous writes would help, but it just isn't supported yet. What may help is that all network I/O is inherently asynchronous in most operating systems: one writes the data into a socket send buffer, the kernel transfers it from the socket buffer to the network. Cyclone defaults to a pretty small send buffer, but you can raise it to huge sizes, just like you can with the receive buffer. If you make it large enough to contain the sample, the kernel will do the asynchronous write for you.

There are downsides to that: one is that the other outgoing messages get stuck behind it in the buffer (it'll probably survive that with the delays you mention). The other is that it only works if the network is fast enough to keep up with the average rate at which you're trying to send, else you'll just fill up the larger buffer and end up in the same situation (but worse because of that large buffer).
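
To see what the operating system will actually allow, a small sketch like the following can help. It uses only the POSIX socket API on a throwaway UDP socket and does not touch Cyclone DDS; the function name `try_send_buffer_size` is made up for illustration. In Cyclone DDS itself the buffer sizes are set through the configuration file, and the setting names depend on the version, so check the configuration reference for the release in use.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request a send buffer of `requested` bytes on a fresh UDP socket and
   return the size the kernel reports afterwards, or -1 on error. */
static int try_send_buffer_size(int requested)
{
    assert(requested > 0);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int granted = -1;
    socklen_t len = sizeof granted;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof requested) != 0 ||
        getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) != 0)
        granted = -1;

    close(fd);
    return granted;
}
```

The reported size need not match the request. On Linux, for instance, the kernel caps the request at `net.core.wmem_max` and reports double the value it stored, so if the result is smaller than the sample, the system limit has to be raised before a larger buffer can take effect.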

Do-it-yourself async writes could be an option (yes ... really ...), if it is a one-off and you can live with a single-place buffer. Then the code to do it asynchronously would be no more than a few lines of code. Or, if you're feeling adventurous, you could take the asynchronous writing that does exist (but that I try not to tell anyone about because it really was just an experiment that made its way out in the initial commit by accident!) and modify it a bit so that it only handles data for some writers and never blocks. If you make xpack_send_async per-writer and change nn_xpack_send a bit, you'd be in business.
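
A minimal sketch of the do-it-yourself variant with a single-place buffer, assuming POSIX threads and that overwriting an unsent sample is acceptable (as it is with best-effort data). All names here are hypothetical, and `fake_send` merely stands in for a callback that would wrap the real `dds_write` call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* The slow call being offloaded (in practice a wrapper around dds_write). */
typedef void (*send_fn)(const void *data, size_t size, void *arg);

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    pthread_t thread;
    unsigned char *slot;    /* the single pending sample, filled by submit */
    unsigned char *sending; /* buffer owned by the worker while it sends */
    size_t capacity, size;
    bool full, stop;
    send_fn send;
    void *arg;
} async_writer_t;

static void *async_writer_main(void *p)
{
    async_writer_t *w = p;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->full && !w->stop)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->full)
            break; /* stop requested and nothing left to send */
        /* Swap buffers so the producer can refill the slot during the send. */
        unsigned char *out = w->slot;
        size_t n = w->size;
        w->slot = w->sending;
        w->sending = out;
        w->full = false;
        pthread_mutex_unlock(&w->lock);
        w->send(out, n, w->arg); /* slow part runs without the lock held */
        pthread_mutex_lock(&w->lock);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static int async_writer_start(async_writer_t *w, size_t capacity, send_fn send, void *arg)
{
    assert(capacity > 0 && send != NULL);
    memset(w, 0, sizeof *w);
    w->slot = malloc(capacity);
    w->sending = malloc(capacity);
    w->capacity = capacity;
    w->send = send;
    w->arg = arg;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    if (w->slot && w->sending && pthread_create(&w->thread, NULL, async_writer_main, w) == 0)
        return 0;
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
    free(w->slot);
    free(w->sending);
    return -1;
}

/* Never waits for the network; returns true if an unsent sample was overwritten. */
static bool async_writer_submit(async_writer_t *w, const void *data, size_t size)
{
    assert(size <= w->capacity);
    pthread_mutex_lock(&w->lock);
    bool dropped = w->full;
    memcpy(w->slot, data, size);
    w->size = size;
    w->full = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
    return dropped;
}

/* Sends any pending sample, then stops the worker and frees the buffers. */
static void async_writer_stop(async_writer_t *w)
{
    pthread_mutex_lock(&w->lock);
    w->stop = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
    free(w->slot);
    free(w->sending);
}

/* Hypothetical stand-in for the real send: it only records what it was given. */
typedef struct { int calls; size_t last_size; } fake_sink_t;

static void fake_send(const void *data, size_t size, void *arg)
{
    fake_sink_t *s = arg;
    (void)data;
    s->calls++;
    s->last_size = size;
}
```

The caller's cost per sample is one memcpy into the slot, while the blocking send happens on the worker thread. The trade-off is the one named above: there is only one place in the buffer, so a sample submitted while the previous one is still waiting replaces it. With Cyclone DDS, the callback would call `dds_write(writer, data)` on the fixed-size sample type, and `capacity` would be the size of that struct.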

@BH1SCW
Author

BH1SCW commented Jan 7, 2021

Hey eboasson,
Many thanks for such a detailed answer!
I profiled the writer with perf and generated a flame graph.

If I hand the dds_write calls for the large buffers to a thread pool, would that be a good way to deal with the problem?

Best wishes

thijsmie added the question label Jun 16, 2022
@thijsmie
Contributor

The main problem was solved! There was a secondary question with no follow-up, please open a new issue if that is something you still want to talk about.
