Re: [lyo-dev] TRS Server paging & persistence proposal

I believe it is unacceptable for a client to need to re-read the base when the server rebases and/or truncates the change log, unless this happens extremely rarely, perhaps once every several years. Restarting processing of the entire base plus change log can take 4 weeks or more in some existing user installations of IBM's Requirement Management applications with many millions of requirements. Such users are very unhappy when we tell them to rebuild the reporting index from scratch and to accept incomplete reports for the next month until the index catches up!

For this reason, I do not think it acceptable to replace all change log pages at the time of a rebase, unless old pages are also kept for a reasonable period of time (at least 15 days, and preferably double that). The proposal allows for this, but does not make it clear that implementations really should do this.

Under normal circumstances, the client should not need to re-read the base, and should be able to completely ignore the server's rebase procedure - it should not need to detect it, because normal processing of the change log should suffice.

Considering TRS client performance, having the number of change events in the TRS resource itself be reduced to 1, or a small number during the period following a rebase, is not ideal. Ideally, clients that poll the TRS feed at a reasonable frequency might expect to get all the change events they have not yet processed in the initial GET of the TRS resource - so keeping that first page fully populated with the most recent change events is more efficient for the clients. With a reduced number of change events in the TRS resource, the client has a higher chance of needing to read the next page of the change log to find the change event it last processed.
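
To illustrate the point about page reads, here is a minimal sketch of a client catching up from the last change event it processed. The function name `catch_up` and the representation of pages as lists of event identifiers (newest first) are assumptions made for illustration only; they are not part of Lyo or the TRS specification.

```python
def catch_up(pages, last_processed):
    """Collect unprocessed events, newest page first.

    `pages` is an iterable whose first item is the list of events in the
    TRS resource itself, followed by older change log pages (reached via
    trs:previous). Every list is ordered newest first.
    Returns (events to process in chronological order, pages read).
    """
    pending = []
    pages_read = 0
    for page in pages:
        pages_read += 1
        for event in page:
            if event == last_processed:
                # Found the sync point: everything collected so far is new.
                return list(reversed(pending)), pages_read
            pending.append(event)
    # The sync point is no longer in the log: only now is a re-read of
    # the base unavoidable.
    raise LookupError("last processed event not found in the change log")


# Right after a new page was created, the TRS resource holds one event,
# so the client has to fetch a second page:
sparse = catch_up(iter([["e7"], ["e6", "e5", "e4"]]), "e5")
# With a fully populated first page, a single GET is enough:
full = catch_up(iter([["e7", "e6", "e5"], ["e4", "e3", "e2"]]), "e5")
```

In both calls the client ends up with the same events to process; only the number of pages it had to read differs.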

Nick.



From:        Andrii Berezovskyi <andriib@xxxxxx>
To:        Lyo project developer discussions <lyo-dev@xxxxxxxxxxx>
Date:        06/05/2018 11:51 AM
Subject:        [lyo-dev] TRS Server paging & persistence proposal
Sent by:        lyo-dev-bounces@xxxxxxxxxxx




Hello,
 
The current TRS Server implementation in Lyo is rather naïve when it comes to long-term server operation. One thing that can be improved is how paging is done. Another is how Lyo Store and/or Redis can be used to persist the pages.
 
Currently, a TRS Server keeps its Change Log in memory and uses simple URI patterns:
 
  • /services/trs/ points to
    • /services/trs/base/1
        • /services/trs/base/n
    • /services/trs/changeLog/2
        • /services/trs/changeLog/m
 
When a rebase happens, the server rebuilds the Change Log completely. Finally, Change Log pages are formed on the fly.
 
Issues 1 & 2 can cause the following problems:
 
  • Keeping things in memory means rebase would happen on every restart.
  • A stateful TRS Server also prevents an OSLC microservice from being placed behind a load balancer.
  • When a new event happens, the contents of all Change Log pages would change and a TRS Client would see previously observed Change Events in the subsequent Change Log pages.
  • The trs:order property may be assigned to different resources upon rebase and/or restart, and the Cutoff Event would lose its meaning. A TRS Client would detect this and perform a full rebase.
 
Because of the URI patterns & issue 3, the only way a TRS Client can detect a rebase is to follow the TRS Base link and check on its first page whether the Cutoff Event URI has changed, or to wait until it fails to find the Cutoff Event (or the most recent Change Event it has observed).
 
Jad suggested using Lyo Store to persist Change Events in a triplestore. But without clean paging that allows pages to be persisted once and deleted completely once they are “evicted”, doing this would be challenging. After discussions in the OSLC Core committee (special thanks to Nick for extensive analysis and detailed examples in the slides), I came up with the following:
 
  1. The TRS Resource should display a variable number of the most recent changes; the pages should have a fixed size “n”.
  2. When the number of Change Events to be returned with the TRS Resource exceeds the page size, a new page is created and the number of Change Events returned in the TRS Resource should go from n+1 to 1.
  3. The Change Log pages should be numbered in reverse (see an example below).
  4. A truncated hash of the Cutoff Event URI is used so that the server can return 410 Gone or 404 Not Found when a rebase happens in the middle of the client’s traversal of the Change Log pages.
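
The four points above can be sketched roughly as follows. This is only an illustration, not Lyo code: the names `PagedChangeLog` and `cutoff_key` are hypothetical, SHA-256 and a 4-character key are assumptions, and “numbered in reverse” is read here as “the newest page gets the highest number”, so that the URI and contents of an existing page never change.

```python
import hashlib
from dataclasses import dataclass, field


def cutoff_key(cutoff_event_uri: str, length: int = 4) -> str:
    """Truncated hash of the Cutoff Event URI (hash choice is an assumption)."""
    digest = hashlib.sha256(cutoff_event_uri.encode("utf-8")).hexdigest()
    return digest[:length].upper()


@dataclass
class PagedChangeLog:
    page_size: int                              # fixed page size "n"
    key: str                                    # truncated Cutoff Event hash
    head: list = field(default_factory=list)    # events in the TRS Resource
    pages: dict = field(default_factory=dict)   # page number -> frozen events

    def append(self, event: str) -> None:
        self.head.append(event)
        if len(self.head) > self.page_size:
            # The head reached n+1 events: freeze the oldest n as a new
            # page and keep only the newest event in the TRS Resource.
            page_no = max(self.pages, default=0) + 1
            self.pages[page_no] = tuple(self.head[: self.page_size])
            self.head = self.head[self.page_size:]

    def previous_of_head(self):
        """Path that trs:previous of the root Change Log points to."""
        if not self.pages:
            return None
        return f"/services/trs/log/{self.key}/{max(self.pages)}"


log = PagedChangeLog(page_size=3, key="ABCD")
for n in range(1, 5):
    log.append(f"e{n}")
```

After the fourth event, the TRS Resource holds a single event and page 1 holds the three older ones; page 1 is written once and never touched again.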
 
Here is an example:
 
  • /services/trs includes a TrackedResourceSet resource with
    • a ChangeLog resource that
      • via trs:previous points to /services/trs/log/ABCD/9 which
        • via trs:previous points to /services/trs/log/ABCD/8…
 
When m events get added, and the “root” ChangeLog resource grows beyond the page size, we add a page log/ABCD/10:
 
  • /trs includes a TrackedResourceSet resource with
    • a ChangeLog resource that
      • via trs:previous points to /services/trs/log/ABCD/10 which
        • via trs:previous points to /services/trs/log/ABCD/9…
  • the contents of /services/trs/log/ABCD/9 are identical to its contents before page 10 was added
  • the ChangeLog resource under /trs now contains only one Change Event
 
When a rebase happens and we decide to keep 3 pages’ worth of events:
 
  • /trs includes a TrackedResourceSet resource with
    • a ChangeLog resource that
      • via trs:previous points to /services/trs/log/9FE2/2 which
        • via trs:previous points to /services/trs/log/9FE2/1…
  • a request to /services/trs/log/ABCD/5 would return 404 Not Found if we only keep the hash of the current base or 410 Gone if we also keep a set of older bases.
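
A minimal sketch of how a server might choose between the two status codes. The function `page_status` and its parameters are hypothetical; the only thing taken from the proposal is the rule that a key from an older base yields 410 Gone if the server remembers it and 404 Not Found otherwise.

```python
from http import HTTPStatus


def page_status(requested_key, page_no, current_key, current_pages,
                retired_keys=frozenset()):
    """Status for GET /services/trs/log/<requested_key>/<page_no>."""
    if requested_key == current_key:
        # Page of the current base: it either exists or it does not.
        if page_no in current_pages:
            return HTTPStatus.OK
        return HTTPStatus.NOT_FOUND
    if requested_key in retired_keys:
        # The server remembers this older base, so it can state that the
        # page existed once and has been removed.
        return HTTPStatus.GONE
    # Unknown key: the server cannot tell a stale link from a bad one.
    return HTTPStatus.NOT_FOUND


# Only the current key is kept:
forgotten = page_status("ABCD", 5, "9FE2", {1, 2})
# The set of older keys is kept as well:
remembered = page_status("ABCD", 5, "9FE2", {1, 2}, retired_keys={"ABCD"})
```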
 
At the cost of keeping the list of the most recent changes separate from the paged log, we get a perfectly cacheable solution with predictable behavior that can be persisted in a triplestore:
 
  • each page must be persisted once
  • all pages from a given key get removed upon rebase
    • or we keep all pages for the current and the last keys
  • Varnish can be employed to cache whole Change Log pages for extended periods of time (>60s) and the TRS Resource page for shorter periods (<10s).
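
One possible way to express the two cache lifetimes is through the Cache-Control response header, assuming the cache in front of the server is configured to honor it. The function name `cache_control` and the concrete values (one hour and five seconds) are assumptions; the proposal only says “>60s” and “<10s”.

```python
def cache_control(path: str) -> str:
    """Pick a Cache-Control header value for a TRS request path."""
    if "/log/" in path:
        # Persisted Change Log pages never change once written.
        return "public, max-age=3600"
    # The TRS Resource changes with every new Change Event.
    return "public, max-age=5"


page_header = cache_control("/services/trs/log/ABCD/9")
root_header = cache_control("/services/trs")
```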
 
Feedback is welcome!
 
/Andrew
 _______________________________________________
lyo-dev mailing list
lyo-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/lyo-dev


