FORTE: Bug in CIPComLayer (The connection is not restored when you debug FORTE, break and continue)
FORTE: Bug in CIPComLayer [message #1754575] Mon, 20 February 2017 13:15
Lik Lik
Messages: 32
Registered: February 2017
Location: Russia
Member
Hi,
We found a bug in CIPComLayer.
Our fix adds a closeSocket() call in the method handledConnectedDataRecv() for the case where receiving data fails:
switch (nRetVal) {
  case 0:
    DEVLOG_INFO("Connection closed by peer\n");
    ...
    break;
  case -1:
    DEVLOG_INFO("ProcessDataRecvFaild\n");
    m_eInterruptResp = e_ProcessDataRecvFaild;
    // proposed fix: release the socket on a receive error
    closeSocket(&m_nSocketID);
    if (e_Server == m_poFb->getComServiceType()) {
      // Move server into listening mode again
      m_eConnectionState = e_Listening;
    }
    break;
  default:
    // we successfully received data
    m_unBufFillSize += nRetVal;
    m_eInterruptResp = e_ProcessDataOk;
    break;
}


This situation can occur when you debug FORTE, break and continue.
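The three branches of the switch above can be modelled in isolation. The following is a minimal sketch, not FORTE code: `RecvOutcome`, `handleRecvResult` and both enums are hypothetical names introduced for illustration, and the `case 0` handling (elided in the post) is assumed to close the socket and return a server to listening mode, as described later in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for the FORTE types in the snippet above; the
// names echo the post, but the definitions are assumptions.
enum class ConnectionState { Connected, Listening };
enum class InterruptResp { DataOk, RecvFailed, ClosedByPeer };

struct RecvOutcome {
  InterruptResp resp;
  ConnectionState state;
  bool closeSocket;      // for -1 this is only true with the proposed patch
  unsigned bufFillSize;  // buffer fill level after this call
};

// Maps a recv()-style return value onto the three branches of the switch.
RecvOutcome handleRecvResult(int retVal, bool isServer,
                             ConnectionState state, unsigned bufFillSize) {
  assert(retVal >= -1);
  // A server goes back to listening; a client keeps its current state here.
  const ConnectionState next = isServer ? ConnectionState::Listening : state;
  switch (retVal) {
    case 0:   // orderly shutdown by the peer (assumed behaviour, see lead-in)
      return {InterruptResp::ClosedByPeer, next, true, bufFillSize};
    case -1:  // receive error: the branch the proposed patch changes
      return {InterruptResp::RecvFailed, next, true, bufFillSize};
    default:  // retVal bytes were received successfully
      return {InterruptResp::DataOk, state, false,
              bufFillSize + static_cast<unsigned>(retVal)};
  }
}
```

For example, `handleRecvResult(-1, true, ConnectionState::Connected, 10)` reports a failed receive, asks for the socket to be closed, and puts the server back into listening mode without touching the buffer fill level.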
Re: FORTE: Bug in CIPComLayer [message #1754607 is a reply to message #1754575] Mon, 20 February 2017 21:15
Alois Zoitl
Messages: 1584
Registered: January 2014
Senior Member

Thanks for pointing this out. I applied your proposed fix in commit [1].

However, I'm not entirely sure that your fix is sufficient. Although it handles some error cases more correctly, should we really close the connection? Shouldn't we allow more messages to be received? Should we inform the application about the situation? Do we require a reinitialization? Is it possible that just one message failed and the next one can be received? When you debug and continue I would expect to get into the first switch case element.

Looking forward to your opinion.
Alois

[1] http://git.eclipse.org/c/4diac/org.eclipse.4diac.forte.git/commit/?h=1.8.x
Re: FORTE: Bug in CIPComLayer [message #1754616 is a reply to message #1754607] Tue, 21 February 2017 05:58
Lik Lik
>Shouldn't we allow more messages to be received?
If we allow more messages to be received, then CFDSelectHandler::run() will call "startNewEventChain(comLayer->getCommFB())", because "comLayer->recvData(&sockDes,0)" returns something other than forte::com_infra::e_Nothing. This will leak semaphores.
        if((0 != FD_ISSET(sockDes, &anFDSet)) && (0 != comLayer)){
          m_oSync.unlock();
          if(forte::com_infra::e_Nothing != comLayer->recvData(&sockDes,0)){
            startNewEventChain(comLayer->getCommFB());
          }
          m_oSync.lock();
        }

Why did you not add the log message DEVLOG_INFO("ProcessDataRecvFaild\n")?
Thank you for the fix.
Re: FORTE: Bug in CIPComLayer [message #1754636 is a reply to message #1754616] Tue, 21 February 2017 08:50
Alois Zoitl

But this is exactly what I wanted to point out. The "startNewEventChain(comLayer->getCommFB())" call is needed so that the comm FB can send output events to inform the application that something bad has happened. Typically we even want to close the whole communication stack in the comm FB and allow the application to reconnect and reinitialize. With your proposed fix only the lowest layer closes its resources, which can leave the comm FB in an invalid state. The more I think about it, the more I think that the close here may not be enough and that more is needed in the comm FB as well. Can you expand on why a semaphore should leak?

Sorry for forgetting the log info. In this case I think even a log error would be more appropriate. This is another reason why it would be great if you could submit improvements directly to our Gerrit, as described in the contributing to FORTE documentation.
Re: FORTE: Bug in CIPComLayer [message #1754648 is a reply to message #1754636] Tue, 21 February 2017 10:02
Lik Lik
>Can you expand on why a semaphore should leak?
For example, we enable "Monitoring" in 4DIAC and then break FORTE. 4DIAC closes the connection and then sends SYN messages to FORTE many times.
Then we continue FORTE.
In "CFDSelectHandler::run()", retval = select(..) > 0.
FORTE will then call "startNewEventChain(comLayer->getCommFB())" many times, creating a semaphore each time.
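The effect described here can be illustrated with a small simulation. This is only a sketch of the argument, not the real handler: `FailingLayer` and `countEventChains` are hypothetical, and the model assumes that a descriptor with a pending error is reported as readable on every pass until it is closed.

```cpp
#include <cassert>

// Hypothetical model of one communication layer whose socket reports a
// receive error every time the select loop finds it readable.
struct FailingLayer {
  bool socketOpen = true;
  bool closeOnError;  // true models the proposed patch

  explicit FailingLayer(bool close) : closeOnError(close) {}

  // Returns true if the comm FB has to be notified, i.e. the result is
  // "not e_Nothing" in the terms of the snippet quoted earlier.
  bool recvData() {
    if (closeOnError) {
      socketOpen = false;  // the descriptor leaves the select set
    }
    return true;  // the error is reported to the comm FB either way
  }
};

// Runs `passes` iterations of a simplified select loop and counts how many
// event chains would be started.
int countEventChains(FailingLayer &layer, int passes) {
  assert(passes >= 0);
  int started = 0;
  for (int i = 0; i < passes; ++i) {
    if (!layer.socketOpen) {
      continue;  // a closed descriptor is no longer reported as readable
    }
    if (layer.recvData()) {
      ++started;  // stands in for startNewEventChain(...)
    }
  }
  return started;
}
```

With `closeOnError` set, 100 passes start a single event chain; without it, every pass starts another one, which is the pile-up described in this post.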
Re: FORTE: Bug in CIPComLayer [message #1754704 is a reply to message #1754648] Tue, 21 February 2017 16:43
Alois Zoitl

Ah, thanks for the clarification. I think what you call semaphores are called events in IEC 61499. But events are the only way the application can be informed that something bad happened in the communication. The only difference is that QO is set to false and the status should carry some additional value. But I think less and less that the socket should be closed in that case. This is something the application has to decide, for example by using an E_SWITCH FB to decide, based on the value of QO, whether the normal execution flow or an error handling flow should be taken. If we just close, the application would starve without knowing why. Therefore, if there are no further arguments in favour of keeping the close, I would remove it again.
Re: FORTE: Bug in CIPComLayer [message #1754756 is a reply to message #1754704] Wed, 22 February 2017 05:18
Lik Lik
>But I think less and less that the socket should be closed in that case.
Is that also true for the case nRetVal = 0 ("Connection closed by peer\n")?
Re: FORTE: Bug in CIPComLayer [message #1754765 is a reply to message #1754756] Wed, 22 February 2017 08:32
Alois Zoitl

In this case closing is correct, as the connection really is gone. Furthermore, we inform the application about the closing with an INITO event with QO set to false. This is only needed for clients, though, as a client needs to know that the server has gone away. A server switches back to listening mode so that another client, or the same one, can connect again.
Re: FORTE: Bug in CIPComLayer [message #1756278 is a reply to message #1754765] Tue, 14 March 2017 21:04
Alois Zoitl

Following up on our discussion, I just reverted the behavior back to the original code.
Re: FORTE: Bug in CIPComLayer [message #1756374 is a reply to message #1756278] Thu, 16 March 2017 06:26
Lik Lik
Quote:
For example, we enable "Monitoring" in 4DIAC and then break FORTE. 4DIAC closes the connection and then sends SYN messages to FORTE many times.
Then we continue FORTE.
In "CFDSelectHandler::run()", retval = select(..) > 0.
FORTE will then call "startNewEventChain(comLayer->getCommFB())" many times, creating a semaphore each time.

I don't understand what to do in this case without that patch.
Re: FORTE: Bug in CIPComLayer [message #1756434 is a reply to message #1756374] Thu, 16 March 2017 20:02
Alois Zoitl

The point I tried to make is that the situation you are describing is, from a FORTE perspective, not distinguishable from a wrong message being received or any other communication error on the network. For such situations IEC 61499 defines that the comm FB should send an output event IND with QO set to false. An application can then use this to take appropriate measures. In the attached image I tried to show you two possible solutions.

Let's assume you have a SUBSCRIBE_1 which is connected to an E_CTUD (leftmost element). When a wrong message comes in, QO is false and the application should not use this message. The simplest solution is shown in the middle FB network: there an E_PERMIT is used to trigger the application (i.e., E_CTUD) only when QO is true.

The left part shows a more sophisticated approach. There an E_SWITCH is used to trigger a failure handling network. The failure handling network counts whether 3 consecutive error messages have been received and, if so, shuts down the SUBSCRIBE_1 block.

A next step could then be to trigger an E_DELAY to wait a certain amount of time and then try to reopen the connection, so that hopefully data is received again.
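The logic of the two FB networks described above can be approximated in plain C++. This is only an illustrative sketch under assumptions: `IndHandler` is a hypothetical class, the counter stands in for the E_CTUD, and the real solution would be built from function blocks rather than code.

```cpp
#include <cassert>

// Hypothetical application-side handling of IND events. It models the FB
// networks from this post in plain C++ and is not FORTE code.
class IndHandler {
 public:
  explicit IndHandler(int maxConsecutiveErrors)
      : maxErrors_(maxConsecutiveErrors) {
    assert(maxConsecutiveErrors > 0);
  }

  // Called once per IND event with the QO value reported by the comm FB.
  void onInd(bool qo) {
    if (!subscribed_) {
      return;  // the comm FB has already been shut down
    }
    if (qo) {
      ++count_;       // good message: the event reaches the counter
      errorRun_ = 0;  // and ends the current run of errors
    } else {
      ++errorRun_;    // bad message: only the failure handling sees it
      if (errorRun_ >= maxErrors_) {
        subscribed_ = false;  // stands in for shutting down SUBSCRIBE_1
      }
    }
  }

  int count() const { return count_; }
  bool subscribed() const { return subscribed_; }

 private:
  int maxErrors_;
  int count_ = 0;
  int errorRun_ = 0;
  bool subscribed_ = true;
};
```

With `IndHandler handler(3)`, isolated errors are simply skipped, while three errors in a row stop the subscription. A reconnect after a delay, as suggested above, would reset `subscribed_` and is left out of this sketch.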

I hope this helps.

Cheers,
Alois
[Attached image: FB networks showing the E_PERMIT and E_SWITCH solutions]
Re: FORTE: Bug in CIPComLayer [message #1800215 is a reply to message #1754575] Fri, 21 December 2018 12:46
Lik Lik
Up.
And yet: start a debugging session for FORTE on the board, deploy the program from the 4DIAC IDE to FORTE, turn on Monitoring in the 4DIAC IDE, and add a variable to watch. Then pause (break) the program on the board, and you will see the monitoring in the 4DIAC IDE stop. Then resume (continue) the program on the board and restart the monitoring in the 4DIAC IDE. You will get DEVLOG_ERROR("External event queue is full, external event dropped!\n");
(file ecet.cpp, in void CEventChainExecutionThread::startEventChain(SEventEntry *paEventToAdd)).

You wrote:
"When you debug and continue I would expect to get into the first switch case element."

No, into the second switch case element (case -1).
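The "queue is full, event dropped" message is what one would expect from any fixed-capacity queue that is filled faster than it is drained. The following is a generic sketch of that behaviour; `BoundedEventQueue`, its capacity and its use of plain `int` event ids are assumptions and do not reflect how FORTE actually stores external events.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity FIFO that drops new entries when it is full.
template <std::size_t Capacity>
class BoundedEventQueue {
 public:
  // Returns false (event dropped) when the queue is already full.
  bool push(int eventId) {
    if (size_ == Capacity) {
      ++dropped_;
      return false;
    }
    events_[(head_ + size_) % Capacity] = eventId;
    ++size_;
    return true;
  }

  // Removes and returns the oldest event; the queue must not be empty.
  int pop() {
    assert(size_ > 0);
    const int eventId = events_[head_];
    head_ = (head_ + 1) % Capacity;
    --size_;
    return eventId;
  }

  bool empty() const { return size_ == 0; }
  std::size_t dropped() const { return dropped_; }

 private:
  std::array<int, Capacity> events_{};
  std::size_t head_ = 0;
  std::size_t size_ = 0;
  std::size_t dropped_ = 0;
};
```

Pushing ten events into a `BoundedEventQueue<4>` without draining it keeps the first four and drops the remaining six, which is analogous to the repeated event chains started after resuming from a break.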
Re: FORTE: Bug in CIPComLayer [message #1800287 is a reply to message #1800215] Sun, 23 December 2018 15:04
Alois Zoitl

Hi,


long time no see. Welcome back!

After rereading our discussion, I first wanted to ask what your intent is.

4diac FORTE, as a reactive real-time program, is rather hard to debug fully and correctly with breakpoints. I've had this experience with other reactive real-time programs I worked on as well. The reason is that in such programs several things have to run constantly in parallel, and stopping the program may lead to unexpected side effects (e.g., timers would also behave oddly). This normally requires planning the debugging carefully (e.g., logging to files instead of using breakpoints), although this is sub-optimal for finding issues.
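As one possible way to follow the "log instead of break" advice, a small trace helper can record what happened and when, without stopping the program. This is a sketch only: `TraceLog` is a hypothetical class and is unrelated to FORTE's own DEVLOG macros.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical trace helper that writes timestamped lines to any stream.
class TraceLog {
 public:
  explicit TraceLog(std::ostream &out)
      : out_(out), start_(std::chrono::steady_clock::now()) {}

  // Writes "<elapsed> us [where] what" as a single line.
  void trace(const std::string &where, const std::string &what) {
    assert(!where.empty());
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start_);
    // Build the whole line first so it reaches the stream in one write.
    std::ostringstream line;
    line << elapsed.count() << " us [" << where << "] " << what << '\n';
    out_ << line.str();
    ++lines_;
  }

  std::size_t lines() const { return lines_; }

 private:
  std::ostream &out_;
  std::chrono::steady_clock::time_point start_;
  std::size_t lines_ = 0;
};
```

Constructing it with a `std::ofstream` sends the trace to a file; a call such as `tracer.trace("handledConnectedDataRecv", "nRetVal=-1")` then records which branch was taken without halting timers or the select loop. The helper is not thread-safe as written.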

If you want to test and step through your FBs, putting them in a unit test and stepping through the unit test may be a better option. We've made some improvements to our FB unit test infrastructure in the development branch of 4diac FORTE.

I hope this helps.

Cheers,
Alois
Re: FORTE: Bug in CIPComLayer [message #1800379 is a reply to message #1800287] Wed, 26 December 2018 08:37
Lik Lik
Since we build our own controller, we sometimes need to debug the firmware.
Also, sometimes an FB unit test is not enough to find errors in our Service Interface Function Blocks.
Re: FORTE: Bug in CIPComLayer [message #1800404 is a reply to message #1800379] Wed, 26 December 2018 23:24
Alois Zoitl

I totally understand that point. You can imagine that we also needed that for finding issues in 4diac FORTE. I just wanted to share some hints that helped me track down strange behavior in 4diac FORTE in hard-to-debug situations. I definitely want to improve on unit tests here, as unit tests give us much more confidence and make it much easier to debug certain problems in isolation. But I totally understand that they are just one additional tool on the shelf and not suited for everything.