If you're using an async library, do you know that nothing else is queued ahead of your message and not yet sent (say, 10 other publishes)? The library could also be busy accepting and processing incoming messages. It depends on a lot of aspects of application design.
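To make the queueing point concrete, here is a minimal sketch using plain `asyncio` rather than any real MQTT library. The names `sender` and `demo`, the backlog of 10, and the 10 ms per-message send time are all made up for illustration. It only shows that with a single FIFO sender, your publish cannot go out until everything queued before it has gone out.

```python
import asyncio

async def sender(queue, sent):
    # Single worker that drains the outgoing queue strictly in FIFO order.
    while True:
        msg = await queue.get()
        await asyncio.sleep(0.01)  # assumed stand-in for writing one packet
        sent.append(msg)
        queue.task_done()

async def demo(backlog=10):
    queue = asyncio.Queue()
    sent = []
    # Publishes that were already waiting before ours was submitted.
    for i in range(backlog):
        queue.put_nowait(f"earlier-{i}")
    queue.put_nowait("mine")
    worker = asyncio.create_task(sender(queue, sent))
    await queue.join()  # wait until every queued message has been sent
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return sent

sent_order = asyncio.run(demo())
# Number of messages that were sent before "mine".
print(sent_order.index("mine"))
```

A real client library may prioritise or batch differently, so treat this as a picture of the worst case for a simple FIFO design, not a description of your library.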
It also depends on how much the library is abstracting. To just emit a PUBLISH packet, the library only has to queue the message for transmission, and the queueing above is about the only source of delay. But the library might be one where you wait until it is clear that the publish has actually reached the broker. For that to happen, the broker needs to answer, and the exchange of packets depends on QoS: with QoS 0 there is no acknowledgement at all, with QoS 1 you wait for a PUBACK, and with QoS 2 you wait for a PUBREC, send a PUBREL, and then wait for a PUBCOMP.
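As a rough way to reason about it, the sketch below lists the publish handshake for each QoS level as I read it in the MQTT 3.1.1 spec (QoS 0: nothing comes back; QoS 1: PUBACK; QoS 2: PUBREC, PUBREL, PUBCOMP) and counts the round trips a client has to wait for. `FLOWS`, `round_trips`, and `min_completion_time` are names I made up, and the result is only a lower bound: it ignores broker processing time, packet loss, and retries.

```python
# Packets exchanged for one publish, as (direction, packet) pairs.
# "->" is client to broker, "<-" is broker to client.
FLOWS = {
    0: [("->", "PUBLISH")],
    1: [("->", "PUBLISH"), ("<-", "PUBACK")],
    2: [("->", "PUBLISH"), ("<-", "PUBREC"), ("->", "PUBREL"), ("<-", "PUBCOMP")],
}

def round_trips(qos):
    # Each broker reply the client waits for costs one network round trip.
    return sum(1 for direction, _ in FLOWS[qos] if direction == "<-")

def min_completion_time(qos, rtt_s, queue_delay_s=0.0):
    # Lower bound only: queueing delay plus one RTT per awaited reply.
    return queue_delay_s + round_trips(qos) * rtt_s

for qos in (0, 1, 2):
    print(qos, round_trips(qos), min_completion_time(qos, rtt_s=0.05))
```

So if your library blocks until the publish is complete, a QoS 2 publish costs at least two round trips to the broker on top of whatever time it spent in the queue.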
There is also a process for re-sending publishes that were not acknowledged, and the library you are using may be waiting for the publish to be completed, which means finishing that whole dialog, rather than just sending the packet.
How busy is your broker? How is the network to your broker (lossy? quick?) What is the round-trip time?
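If you don't know the round-trip time, one crude way to get a feel for it is to time a TCP connect to the broker's port, which takes roughly one network round trip. This is a sketch using only the Python standard library; `tcp_connect_time` is a name I made up, and the host and port in the usage comment are placeholders, not a real broker. It says nothing about how busy the broker is once connected.

```python
import socket
import time

def tcp_connect_time(host, port, timeout=3.0):
    # Time how long the TCP handshake takes; roughly one network round trip.
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed again as soon as it is established
    return time.perf_counter() - start

# Hypothetical usage, substitute your own broker address:
# print(tcp_connect_time("broker.example.com", 1883))
```

Run it a few times: a large spread between runs hints at a lossy or congested path, and DNS lookup time is included in the first call if you pass a hostname.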