Hi Roger, Tatsuzo, Tifaifai, and anyone who is interested in the mosquitto cluster,
I've tested the mosquitto cluster with an appropriate number of simultaneous users and connect/subscribe/publish TPS, so that the system was put under proper pressure (carrying as much TPS as possible without causing latency).
A plain benchmark without any comparison is meaningless, so I've also tested a mosquitto bridge under the same scenario, which is quite similar to the architecture and workload of our smart home service platform (10 brokers, 20k subscribers, 1k publishes from 10 publishers which use persistent TCP from an HTTP server):
9 brokers run on 3 OpenStack VMs (4 cores, 8G RAM each). After 30k persistent subscribers are set up, the publishers send 10k publishes per second (actually only 2.5k publishes/s due to a client-side bottleneck; each client sends one publish over a non-persistent TCP connection), with a payload length of 744 bytes.
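Each publish client does essentially the following (a minimal libmosquitto sketch of the one-publish-per-connection pattern; the host and topic are placeholders, not the values used in the test):

/* one_shot_pub.c: connect, publish once, disconnect.
 * Build: gcc one_shot_pub.c -lmosquitto -o one_shot_pub */
#include <mosquitto.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char payload[744];                 /* 744-byte payload, as in the test */
    memset(payload, 'x', sizeof(payload));

    mosquitto_lib_init();

    /* clean_session=true: the publisher keeps no persistent state */
    struct mosquitto *mosq = mosquitto_new(NULL, true, NULL);
    if(!mosq) return 1;

    if(mosquitto_connect(mosq, "broker.example.com", 1883, 60) != MOSQ_ERR_SUCCESS){
        fprintf(stderr, "connect failed\n");
        return 1;
    }

    /* QoS=1: the broker has to store and acknowledge the message */
    mosquitto_publish(mosq, NULL, "home/device/42/state",
            sizeof(payload), payload, 1, false);

    /* run the network loop briefly so PUBLISH/PUBACK can complete */
    for(int i = 0; i < 10; i++){
        mosquitto_loop(mosq, 100, 1);
    }

    mosquitto_disconnect(mosq);
    mosquitto_destroy(mosq);
    mosquitto_lib_cleanup();
    return 0;
}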
With QoS=0, both the cluster and the bridge (topic # both 0) work as normal. With QoS=1 (topic # both 1), the CPU usage of each broker stabilises at 65%-75% in the cluster, but reaches 100% on the bridge broker during the publish phase, and meanwhile 30% of the messages are lost due to the bridge broker's overload (see appendix). More detailed test reports, including connect/request response times, network throughput, and server monitoring, are available at https://github.com/hui6075/mosquitto/tree/develop/benchmark .
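(For clarity, 'topic # both 0/1' refers to the bridge topic directive in mosquitto.conf; each bridged broker is configured with something like the following, where the connection name and address are placeholders:)

connection bridge-peer
address 192.0.2.10:1883
topic # both 1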
I believe the situation will be worse for the bridge under QoS=2, but will not deteriorate for the cluster, since publish messages are forwarded with their original QoS but processed with QoS=0 inside the cluster.
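(A hypothetical sketch of that forwarding rule; all the names below, including peer_send, are made up for illustration and are not the actual cluster code:)

#include <stdint.h>
#include <stdio.h>

struct cluster_msg {
    uint8_t origin_qos;    /* QoS used by the external publisher */
    const char *topic;
    const char *payload;
};

/* stub standing in for the inter-broker send */
static void peer_send(const struct cluster_msg *msg, int hop_qos)
{
    printf("forward '%s' hop_qos=%d origin_qos=%d\n",
           msg->topic, hop_qos, msg->origin_qos);
}

static void cluster_forward(const struct cluster_msg *msg)
{
    /* the inter-broker hop runs at QoS 0 (no PUBACK/PUBREC between
     * peers); origin_qos travels with the message so the remote
     * broker can still honour it for its local subscribers */
    peer_send(msg, 0);
}

int main(void)
{
    struct cluster_msg m = { 2, "home/device/42/state", "x" };
    cluster_forward(&m);    /* prints hop_qos=0 origin_qos=2 */
    return 0;
}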
The mosquitto cluster equalises the load across all brokers and, unlike a bridge, appears to external clients as one logical MQTT broker: duplicate client IDs are eliminated, a persistent session can be inherited after a client reconnects, and, most importantly, it is an autonomous system that keeps providing service under a single point of failure, which a bridge cannot do. So I sincerely hope that you can comment, review the code, run performance tests under your own scenarios, etc., to make the mosquitto cluster better.
Thanks!
BRs,
Jianhui
PS. An OProfile report is attached in the appendix; it shows that more efficient timer management should be introduced, to save the CPU cycles that come from expiration polling.
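What I have in mind is something like the following (a sketch only, not code from the mosquitto tree): keep pending expirations in a min-heap and derive the poll timeout from the earliest deadline, instead of re-checking every context on every loop pass. All names here are made up for illustration:

#include <stdio.h>
#include <time.h>

#define MAX_TIMERS 1024

struct timer { time_t deadline; int context_id; };

static struct timer heap[MAX_TIMERS];
static int heap_len = 0;

static void swap(int a, int b){ struct timer t = heap[a]; heap[a] = heap[b]; heap[b] = t; }

/* O(log n) insert, instead of touching every context per loop pass */
static void timer_push(time_t deadline, int context_id)
{
    int i = heap_len++;
    heap[i].deadline = deadline;
    heap[i].context_id = context_id;
    while(i > 0 && heap[(i-1)/2].deadline > heap[i].deadline){
        swap(i, (i-1)/2);
        i = (i-1)/2;
    }
}

/* pop the earliest timer once it is due */
static struct timer timer_pop(void)
{
    struct timer top = heap[0];
    heap[0] = heap[--heap_len];
    int i = 0;
    for(;;){
        int l = 2*i+1, r = 2*i+2, m = i;
        if(l < heap_len && heap[l].deadline < heap[m].deadline) m = l;
        if(r < heap_len && heap[r].deadline < heap[m].deadline) m = r;
        if(m == i) break;
        swap(i, m);
        i = m;
    }
    return top;
}

/* poll/epoll timeout: sleep exactly until the earliest deadline */
static int next_timeout_ms(time_t now)
{
    if(heap_len == 0) return -1;            /* nothing pending: block */
    if(heap[0].deadline <= now) return 0;   /* something already due */
    return (int)(heap[0].deadline - now) * 1000;
}

int main(void)
{
    time_t now = time(NULL);
    timer_push(now + 30, 1);    /* e.g. keepalive check for context 1 */
    timer_push(now + 5, 2);
    printf("sleep for %d ms\n", next_timeout_ms(now));               /* -> 5000 */
    printf("first expiry: context %d\n", timer_pop().context_id);    /* -> 2 */
    return 0;
}

With this, the main loop would wake only when a timer is actually due; in the annotated report below, the per-context time check (the time_count branch) alone accounts for about 29% of the samples.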
Appendix:
Bridge CPU usage snapshot (PID 18225 is the bridge):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18225 mosquitt 20 0 45232 5576 1776 R 100.0 0.1 1:59.91 mosquitto(bridge)
18224 mosquitt 20 0 44704 5052 1760 R 91.0 0.1 1:28.08 mosquitto
18223 mosquitt 20 0 44708 5076 1760 R 82.7 0.1 1:30.28 mosquitto
4869 mosquitt 20 0 44708 5008 1764 R 79.4 0.1 1:41.62 mosquitto
4875 mosquitt 20 0 44720 5004 1764 R 78.0 0.1 1:38.25 mosquitto
4876 mosquitt 20 0 44724 5008 1764 R 75.7 0.1 1:38.51 mosquitto
2900 mosquitt 20 0 24480 4892 1572 R 71.4 0.1 1:25.55 mosquitto
2898 mosquitt 20 0 24480 4872 1572 S 68.1 0.1 1:26.22 mosquitto
2899 mosquitt 20 0 24488 4860 1572 R 66.1 0.1 1:25.79 mosquitto
Cluster CPU usage snapshot:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19715 mosquitt 20 0 47632 7968 1796 R 73.4 0.1 2:17.25 mosquitto
19716 mosquitt 20 0 47660 7996 1796 R 72.4 0.1 2:20.55 mosquitto
19717 mosquitt 20 0 47512 7868 1796 R 70.7 0.1 2:16.84 mosquitto
6574 mosquitt 20 0 47796 8148 1800 R 64.4 0.1 2:14.92 mosquitto
6573 mosquitt 20 0 47928 8180 1800 S 63.1 0.1 2:17.60 mosquitto
6572 mosquitt 20 0 47808 8100 1800 R 62.8 0.1 2:15.44 mosquitto
3580 mosquitt 20 0 27364 7728 1604 R 62.8 0.1 1:48.19 mosquitto
3581 mosquitt 20 0 27552 7936 1604 R 62.1 0.1 1:48.50 mosquitto
3582 mosquitt 20 0 27824 8260 1604 S 60.8 0.1 1:47.63 mosquitto
OProfile report:
CPU: Intel Haswell microarchitecture, speed 3500 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 6000
samples % linenr info image name symbol name
5786584 52.8127 loop.c:101 mosquitto mosquitto_main_loop
2749353 25.0926 subs.c:388 mosquitto sub__search
546670 4.9893 database.c:856 mosquitto db__message_write
267198 2.4386 subs.c:692 mosquitto retain__search.isra.2
...
:int mosquitto_main_loop(struct mosquitto_db *db, mosq_sock_t *listensock, int listensock_count, int listener_max)
:{
/* mosquitto_main_loop total: 5786584 52.8127 */
26163 0.2388 : HASH_ITER(hh_sock, db->contexts_by_sock, context, ctxt_tmp){
3211672 29.3121 : if(time_count > 0){
...
540500 4.9330 : context->pollfd_index = -1;
...
439382 4.0101 : if(context->events & EPOLLOUT) {
...
691792 6.3138 : if(context->current_out_packet || context->state == mosq_cs_connect_pending || context->ws_want_write){
From: jianhui zhan
Sent: Friday, December 29, 2017 9:34
To: General development discussions for the mosquitto project
Subject: Re: [mosquitto-dev] A non-centralize Mosquitto cluster design.
Yes, the 2000 PUB/SUBs test is more of a functional test than a stress test; I will do some more testing to verify the performance.
From: mosquitto-dev-bounces@xxxxxxxxxxx <mosquitto-dev-bounces@xxxxxxxxxxx> on behalf of Tatsuzo Osawa <tatsuzo.osawa@xxxxxxxxx>
Sent: Friday, December 29, 2017 9:12
To: General development discussions for the mosquitto project
Subject: Re: [mosquitto-dev] A non-centralize Mosquitto cluster design.
Hi Jianhui,
Thank you for the further information, but I'm not sure the cluster can scale the performance.
The amount of '2000 PUB/SUBs' seems too small; it could be handled by a single broker.
Could you simplify the scenarios and show how the performance changes with the number of brokers?
Regards,
Tatsuzo