FORTE: Bug in CIPComLayer (The connection is not restored when you debug FORTE, break and continue)
FORTE: Bug in CIPComLayer [message #1754575] Mon, 20 February 2017 13:15
Lik Lik
Messages: 20
Registered: February 2017
Location: Russia
Junior Member
Hi,
We found a bug in CIPComLayer.
We added a closeSocket() call for the case where receiving data fails in the method handledConnectedDataRecv():
switch (nRetVal) {
  case 0:
    DEVLOG_INFO("Connection closed by peer\n");
    ...
    break;
  case -1:
    DEVLOG_INFO("ProcessDataRecvFaild\n");
    m_eInterruptResp = e_ProcessDataRecvFaild;
    closeSocket(&m_nSocketID);
    if (e_Server == m_poFb->getComServiceType()) {
      // Move server into listening mode again
      m_eConnectionState = e_Listening;
    }
    break;
  default:
    // we successfully received data
    m_unBufFillSize += nRetVal;
    m_eInterruptResp = e_ProcessDataOk;
    break;
}


This situation can occur when you debug FORTE, break and continue.
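
The three branches of the switch above mirror the return-value convention of the POSIX recv() call: 0 for an orderly shutdown by the peer, -1 for an error, and a positive byte count otherwise. Below is a minimal sketch, assuming nRetVal carries such a result (the post does not show where it comes from). RecvOutcome and classifyRecv are hypothetical names used only for illustration, not FORTE code:

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical classification of a single receive attempt (not a FORTE type).
enum class RecvOutcome { PeerClosed, Error, Data };

// Performs one recv() on fd and maps its return value onto the three cases
// that the switch in the post distinguishes.
RecvOutcome classifyRecv(int fd) {
  char buffer[64];
  const ssize_t received = recv(fd, buffer, sizeof(buffer), 0);
  if (received == 0) {
    return RecvOutcome::PeerClosed;  // orderly shutdown by the peer
  }
  if (received < 0) {
    return RecvOutcome::Error;       // errno holds the reason
  }
  return RecvOutcome::Data;          // 'received' bytes are now in buffer
}
```

On a connected stream socket whose peer has closed its end, classifyRecv reports PeerClosed; on an invalid descriptor it reports Error.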
Re: FORTE: Bug in CIPComLayer [message #1754607 is a reply to message #1754575] Mon, 20 February 2017 21:15
Alois Zoitl
Messages: 619
Registered: January 2014
Senior Member
Thanks for pointing this out. I applied your proposed fix in commit [1].

However, I'm not entirely sure that your fix is sufficient. Although it handles some error cases more correctly, should we really close the connection? Shouldn't we allow more messages to be received? Should we inform the application about the situation? Do we require reinitialization? Is it possible that just one message failed and the next one can be received? When you debug and continue, I would expect execution to end up in the first case of the switch.

Looking forward to your opinion.
Alois

[1] http://git.eclipse.org/c/4diac/org.eclipse.4diac.forte.git/commit/?h=1.8.x
Re: FORTE: Bug in CIPComLayer [message #1754616 is a reply to message #1754607] Tue, 21 February 2017 05:58
Lik Lik
>Shouldn't we allow to receive more messages?
If we allow more messages to be received, then CFDSelectHandler::run() will call "startNewEventChain(comLayer->getCommFB())", because "comLayer->recvData(&sockDes,0)" returns something other than "forte::com_infra::e_Nothing". This will cause a semaphore leak.
        if((0 != FD_ISSET(sockDes, &anFDSet)) && (0 != comLayer)){
          m_oSync.unlock();
          if(forte::com_infra::e_Nothing != comLayer->recvData(&sockDes,0)){
            startNewEventChain(comLayer->getCommFB());
          }
          m_oSync.lock();
        }

Why did you not add the log message DEVLOG_INFO("ProcessDataRecvFaild\n")?
Thank you for the fix.
Re: FORTE: Bug in CIPComLayer [message #1754636 is a reply to message #1754616] Tue, 21 February 2017 08:50
Alois Zoitl
But this is exactly what I wanted to point out: the "startNewEventChain(comLayer->getCommFB())" call is needed so that the comm FB can send output events to inform the application that something bad has happened. Typically we even want to close the whole communication stack in the comm FB and allow the application to reconnect and reinitialize. With your proposed fix, only the lowest layer closes its resources, and the comm FB can then be left in an invalid state. The more I think about it, the more I think that the close here may not be enough and that more is needed in the comm FB as well. Can you expand on why a semaphore should leak?

Sorry for forgetting the log info. In this case I think even a log error would be more appropriate. This is another reason why it would be great if you could submit improvements directly to our Gerrit, as described in the contributing to FORTE documentation.
Re: FORTE: Bug in CIPComLayer [message #1754648 is a reply to message #1754636] Tue, 21 February 2017 10:02
Lik Lik
>Can you expand why a semaphore should leak?
For example, we enable "Monitoring" in 4DIAC and then break FORTE. 4DIAC closes the connection and then sends SYN messages to FORTE many times.
Then we continue FORTE.
In "CFDSelectHandler::run()", retval = select(..) > 0.
FORTE will call "startNewEventChain(comLayer->getCommFB())" many times, creating a semaphore each time.
Re: FORTE: Bug in CIPComLayer [message #1754704 is a reply to message #1754648] Tue, 21 February 2017 16:43
Alois Zoitl
Ah, thanks for the clarification. I think what you call semaphores are called events in IEC 61499. But events are the only way the application can be informed that something bad happened in the communication. The only difference is that QO is set to false and the status should have some additional value. But I think less and less that the socket should be closed in that case. This is something the application has to decide, for example by using an E_SWITCH FB and deciding, based on the value of QO, whether the normal execution flow or an error handling flow should be taken. If we just close the socket, the application would starve without knowing why. Therefore, if there are no further arguments in favour of keeping the close, I would remove it again.
Re: FORTE: Bug in CIPComLayer [message #1754756 is a reply to message #1754704] Wed, 22 February 2017 05:18
Lik Lik
>But I less and less think that the socket should be closed in that case.
Is that also true for the case nRetVal = 0 ("Connection closed by peer\n")?
Re: FORTE: Bug in CIPComLayer [message #1754765 is a reply to message #1754756] Wed, 22 February 2017 08:32
Alois Zoitl
In this case closing is correct, as the connection really is gone. Furthermore, we inform the application about the closing with an INITO event with QO set to false, although this is only needed for CLIENTs, as a CLIENT needs to know that the server has gone away. A server switches back to listening mode so that another client, or the same one, can connect again.
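
The handling of a connection closed by the peer, as described above, can be sketched as a small decision function. This is only an illustration under stated assumptions: ServiceType, ConnectionState, CloseResult and onPeerClosed are hypothetical names rather than FORTE's real types, and the sketch assumes that the server case raises no event, since the post only says the notification is needed for clients.

```cpp
// Hypothetical, simplified stand-ins; these are not FORTE's real types.
enum class ServiceType { Client, Server };
enum class ConnectionState { Connected, Listening, Disconnected };

struct CloseResult {
  ConnectionState nextState;  // state of the layer after the peer closed
  bool notifyApplication;     // true: raise INITO with QO set to false
};

// Models the "connection closed by peer" branch as described in the post.
CloseResult onPeerClosed(ServiceType type) {
  if (type == ServiceType::Server) {
    // The server returns to listening so that a client can connect again.
    // Assumption: no event is raised for the server in this sketch.
    return {ConnectionState::Listening, false};
  }
  // The client has to learn that the server has gone away.
  return {ConnectionState::Disconnected, true};
}
```

The point of the split is that closing is safe here because the connection is known to be gone, which is not the case for a single failed receive.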
Re: FORTE: Bug in CIPComLayer [message #1756278 is a reply to message #1754765] Tue, 14 March 2017 21:04
Alois Zoitl
Following up on our discussion, I just reverted the behavior back to the original code.
Re: FORTE: Bug in CIPComLayer [message #1756374 is a reply to message #1756278] Thu, 16 March 2017 06:26
Lik Lik
Quote:
For example, we enable "Monitoring" in 4DIAC and then break FORTE. 4DIAC closes the connection and then sends SYN messages to FORTE many times.
Then we continue FORTE.
In "CFDSelectHandler::run()", retval = select(..) > 0.
FORTE will call "startNewEventChain(comLayer->getCommFB())" many times, creating a semaphore each time.

I don't understand what to do in this case without that patch.
Re: FORTE: Bug in CIPComLayer [message #1756434 is a reply to message #1756374] Thu, 16 March 2017 20:02
Alois Zoitl
The point I tried to make clear is that the situation you are describing is, from a FORTE perspective, not distinguishable from receiving a wrong message or any other communication error on the network. For such situations IEC 61499 defines that the comm FB should send an output event IND with QO set to false. An application can use this to take appropriate measures. In the attached image I tried to show you two possible solutions.

Let's assume you have a SUBSCRIBE_1 which is connected to an E_CTUD (leftmost element). When a wrong message comes in, QO is false and the application should not use this message. The simplest solution is shown in the middle FB network: an E_PERMIT is used to trigger the application (i.e., E_CTUD) only when QO is true.

The left part shows a more sophisticated approach. There an E_SWITCH is used to trigger a failure handling network. The failure handling network counts whether 3 consecutive error messages have been received and, if so, shuts down the SUBSCRIBE_1 block.

A next step could then be to trigger an E_DELAY to wait a certain amount of time and then try to reopen the connection, so that hopefully data is received again.
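
The logic of the two FB networks described above (gating on QO and shutting down after 3 consecutive errors) can also be expressed in plain C++ to make the intended behaviour concrete. This is a sketch only: IndicationFilter and its members are hypothetical names, it is not FORTE code, and in a real application this logic would be built from the function blocks named above.

```cpp
#include <cassert>

// Hypothetical model of the failure handling network; not FORTE code.
class IndicationFilter {
public:
  explicit IndicationFilter(int maxConsecutiveErrors)
      : mMaxErrors(maxConsecutiveErrors) {
    assert(maxConsecutiveErrors > 0);
  }

  // Called for every IND event; qo mirrors the QO output of the comm FB.
  // Returns true when the application (e.g. the counter) should be triggered.
  bool onIndication(bool qo) {
    if (mShutDown) {
      return false;  // the subscriber has already been shut down
    }
    if (qo) {
      mErrors = 0;   // a good message ends the run of consecutive errors
      return true;   // E_PERMIT-like gate: pass the event only when QO is true
    }
    if (++mErrors >= mMaxErrors) {
      mShutDown = true;  // too many consecutive errors: stop the subscriber
    }
    return false;
  }

  bool isShutDown() const { return mShutDown; }

private:
  int mMaxErrors;
  int mErrors = 0;
  bool mShutDown = false;
};
```

With a threshold of 3, two failed messages followed by a good one do not shut anything down, whereas three failures in a row do. The delayed reconnect mentioned above is left out of this sketch.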

I hope this helps.

Cheers,
Alois
[Attached image: index.php/fa/28775/0/]