Re: [eclipse-incubator-e4-dev] [resources] Asynchronous APIs for EFS


From: Scott Lewis <slewis@xxxxxxxxxxxxx>
Date: Tue, 07 Oct 2008 10:33:40 -0700

Hi Martin,

Interesting points...and good discussion to have, I believe. Some of my thoughts/comments below.


Oberhuber, Martin wrote:

Hi all,

I thought a little bit about what it means to have both synchronous and asynchronous APIs at the file system layer. Some questions come up:


    * What does it mean for clients: does each client need to be aware
      of both API variants? How do clients pick a variant? It seems
      like if we offer dual sync/async natures, that duplicate concept
      would bubble up through all our architecture, which does not
      seem desirable.

I think it's debatable whether it's desirable...I do see your point that one API is always better than two (i.e. less complexity, fewer client choices, etc).

But I think the evidence does show that both synchronous and asynchronous APIs for IO (in particular) are useful, and in some cases necessary. For example, Java's new IO (nio) is an asynchronous IO API that is (IMHO):

1) Harder to use than a blocking/synchronous API (i.e. using the normal java io/stream classes)
2) Useful/necessary for some API clients (e.g. those that require more scalability)
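To make the contrast concrete, here is a minimal sketch of the same small file read done in both styles. It assumes a modern JDK (11 or later) and uses AsynchronousFileChannel, which was added to the JDK after this message was written; the class name SyncVsAsyncRead and the temp file are illustrative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

// Illustrative class name; not part of EFS or ECF.
class SyncVsAsyncRead {

    // Blocking style: the calling thread waits inside the read until all data is in.
    static String readBlocking(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Asynchronous style: each read() returns a Future at once, and the
    // caller decides when (or whether) to wait for the result.
    static String readAsync(Path file) throws Exception {
        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            long pos = 0;
            while (buf.hasRemaining()) {
                Future<Integer> pending = ch.read(buf, pos); // returns immediately
                // The caller could do unrelated work here while the read is in flight.
                int n = pending.get(); // wait only once the bytes are needed
                if (n < 0) {
                    break; // end of file
                }
                pos += n;
            }
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("efs-demo", ".txt");
        Files.writeString(tmp, "hello");
        System.out.println(readBlocking(tmp));
        System.out.println(readAsync(tmp));
        Files.delete(tmp);
    }
}
```

The asynchronous version needs more bookkeeping (buffer, position, Future), which is the "harder to use" cost; what it buys is that the calling thread is not tied up while the read is pending.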

    * What is the granularity of being synchronous / asynchronous? Can
      a provider choose returning synchronously or asynchronously with
      each call, or does it need to pick one strategy once and for all?

This is a very interesting question...which I don't have a ready answer for :). I think that the most I can say about this is that clients that 'know' what their scalability, performance, and reliability requirements of the communications layer are should be able to use one or the other (sync and/or async) as appropriate to their application. I do think that 'hiding' one behind the other at the file system layer ultimately creates very hard issues of performance (e.g. running Eclipse over EFS-ftp), and/or reliability (e.g. having asynchronous messaging with no failure detection). In some ways this is related to the issue of 'transparency' in networked applications...i.e. whether the network's characteristics (e.g. slower by orders of magnitude, much more likely to partially fail than a local file system) can/should be 'hidden' behind a single API that allows clients to use the same calls whether the file system is local (File) or over the network (FTP, etc).

So by personal disposition I would be inclined to allow call-level decisions about synchronous vs asynchronous patterns. For example, ECF's IRemoteService interface allows several 'styles' of invocation of a remote service:


http://www.eclipse.org/ecf/org.eclipse.ecf.docs/api/org/eclipse/ecf/remoteservice/IRemoteService.html

This does mean more complexity/decisions for clients (i.e. it can be a proxy, but it doesn't *have* to be), but it does add a layer of flexibility in use of a remote API (e.g. with AsynchResults...i.e. 'Futures').

It's always possible to write a bridge for an asynchronous API to drive synchronous providers, or the other way round. But the benefit of being synchronous or asynchronous in a particular situation can only be leveraged if it bubbles up right into the application layer!
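The bridge in both directions can be sketched in a few lines. This is only an illustration under stated assumptions: SyncStore and AsyncStore are hypothetical stand-ins (not EFS or ECF types), and CompletableFuture is used as the 'Future' abstraction.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative class name; the nested interfaces are hypothetical.
class SyncAsyncBridge {

    interface SyncStore {
        String[] childNames(String path);
    }

    interface AsyncStore {
        CompletableFuture<String[]> childNames(String path);
    }

    // Async API over a sync provider: the blocking call is moved onto a pool thread.
    static AsyncStore toAsync(SyncStore sync, ExecutorService pool) {
        return path -> CompletableFuture.supplyAsync(() -> sync.childNames(path), pool);
    }

    // Sync API over an async provider: the caller blocks until the result or a timeout.
    static SyncStore toSync(AsyncStore async, long timeoutMillis) {
        return path -> {
            try {
                return async.childNames(path).get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                throw new IllegalStateException("interrupted while listing " + path, e);
            } catch (ExecutionException | TimeoutException e) {
                throw new IllegalStateException("listing failed: " + path, e);
            }
        };
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        SyncStore local = path -> new String[] {path + "/a", path + "/b"};
        SyncStore roundTrip = toSync(toAsync(local, pool), 1000);
        System.out.println(String.join(", ", roundTrip.childNames("/tmp")));
        pool.shutdown();
    }
}
```

Note that each wrapper keeps the cost of the style it hides: toAsync still occupies a pool thread for the whole blocking call, and toSync still parks the caller. That is the sense in which the benefit is only realized when the chosen style reaches the application layer.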


True, I agree.

    * Asynchronous APIs add considerable overhead for fast queries
      (i.e. have at least one Thread switch for the callback, even if
      the result is available immediately); also, in terms of system
      consistency and resource locking, what kinds of locks can really
      be given up while an asynchronous request is pending but before
      its result is in?

Good questions...but the use case you describe (i.e. fast/local queries) smells like a good use case for synchronous invocation. And further, thread context switching is almost always of very low cost relative to actual inter-process communication (e.g. serialization, transmission/io over wire, etc).

    * Synchronous APIs add considerable overhead for slow queries
      (i.e. explosion in the number of Threads in wait state, thus
      locking Resources).


Agreed.

To be more concrete, let's look at the current primary EFS methods in IFileStore that can be slow on a remote FS:
    * Information Retrieval
          o String[] childNames(int options, IProgressMonitor);   //
            and its relatives: childInfos(), childStores()
          o IFileInfo fetchInfo(int options, IProgressMonitor);
    * Manipulation
          o void copy(IFileStore destination, int options,
            IProgressMonitor);
          o void delete(int options, IProgressMonitor);
          o void mkdir(int options, IProgressMonitor);
          o void move(IFileStore destination, int options,
            IProgressMonitor);
          o void putInfo(IFileInfo, int options, IProgressMonitor);
It certainly makes sense to have asynchronous variants of these, if the provider is inherently asynchronous (like the ECF filetransfer API). But how high would we allow this to bubble up, how would we treat requests on the synchronous API if the provider is asynchronous or vice versa?
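One possible shape for such a variant, sketched for fetchInfo only. Everything here is hypothetical: FileInfo, FileStore, AsyncFileStore and fetchInfoAsync are simplified stand-ins invented for illustration, not existing EFS types or methods, and the options and IProgressMonitor parameters are left out for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Illustrative sketch; none of these types exist in EFS.
class AsyncFileStoreSketch {

    // Simplified stand-in for IFileInfo.
    static final class FileInfo {
        final String name;
        final long length;

        FileInfo(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    // Synchronous view, mirroring the existing style of API.
    interface FileStore {
        FileInfo fetchInfo();
    }

    // Hypothetical asynchronous variant offered next to the synchronous one.
    interface AsyncFileStore extends FileStore {
        CompletableFuture<FileInfo> fetchInfoAsync();
    }

    // Fast local provider: the async call completes at once on the calling
    // thread, so no thread switch is paid when the answer is already known.
    static AsyncFileStore local(FileInfo info) {
        return new AsyncFileStore() {
            public FileInfo fetchInfo() {
                return info;
            }

            public CompletableFuture<FileInfo> fetchInfoAsync() {
                return CompletableFuture.completedFuture(info);
            }
        };
    }

    // Inherently asynchronous provider: the sync call blocks on the async one.
    static AsyncFileStore remote(Supplier<FileInfo> slowFetch, Executor executor) {
        return new AsyncFileStore() {
            public CompletableFuture<FileInfo> fetchInfoAsync() {
                return CompletableFuture.supplyAsync(slowFetch, executor);
            }

            public FileInfo fetchInfo() {
                return fetchInfoAsync().join();
            }
        };
    }

    public static void main(String[] args) {
        AsyncFileStore near = local(new FileInfo("a.txt", 3));
        AsyncFileStore far = remote(() -> new FileInfo("b.txt", 42), ForkJoinPool.commonPool());
        System.out.println(near.fetchInfoAsync().isDone());
        far.fetchInfoAsync().thenAccept(info -> System.out.println(info.name)).join();
    }
}
```

In this shape the provider, not the client, decides how each call is fulfilled, while the client still chooses per call whether to block or to receive a future.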

A good question...though I believe the answer is application-specific (even in the context of 'app' as 'Eclipse plugin/bundle/tool/RCP app, etc'). Which is why I do think having both sync and async variants is useful, rather than trying to say to clients 'this one API (sync or async) is your access to the file system' (remote and/or local). I agree the simplicity of such a file system model is very appealing, but I'm not convinced it is effective for all applications (i.e. some applications require some other things).

Of course, with both sync and async APIs the respective advantages and disadvantages (as Martin begins to lay out above) of both approaches (performance/blocking behavior, OS-level resource usage, locking/synchronization requirements, etc) need to be specified as clearly as possible in the API and implementations, so that clients can make informed choices about which approaches make sense in any given situation.

One other thing to mention...one approach that we've (ECF) found helpful for the creation of model-based editors is replication. That is, for some use cases (when read access to a model must be very fast...e.g. for rendering, but write access can reasonably be slower) it makes sense to replicate the state of the model, and then use async messaging plus a synchronization approach...e.g. optimistic, pessimistic [three-phase commit, etc], or something in between [e.g. ECF's cola for real-time shared editing]. Then (frequent) read accesses to the model will be very fast (accessing the local replica), and write accesses can be non-blocking (async). We've begun work on an ECF 'synchronization strategy' API that, for documents, provides a way to synchronize/resolve conflicting local and/or remote changes that are delivered asynchronously. It uses ECF's cola algorithm (which is based upon a concept called 'operational transforms') to resolve those conflicts. If people are interested in the development of this API please see:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=234142

Note for the moment it's focused on resolving changes in documents (i.e. Strings), but the operational transform notion can be extended to other model forms. We just haven't done so (yet).
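For readers unfamiliar with operational transforms, here is a toy sketch of the core idea for two concurrent String insertions. It is not ECF's cola algorithm or its API; the Insert type, the site ids and the tie-break rule are assumptions made for illustration, and deletions are ignored.

```java
// Illustrative toy; not ECF code.
class InsertTransform {

    // One insertion made at a given site (replica) against a shared base document.
    static final class Insert {
        final int pos;
        final String text;
        final int site;

        Insert(int pos, String text, int site) {
            this.pos = pos;
            this.text = text;
            this.site = site;
        }
    }

    // Rewrites `op` so it can be applied after `other` has already been applied.
    // Both operations were created concurrently against the same base document.
    // Ties on position are broken by site id so every replica orders them alike.
    static Insert transform(Insert op, Insert other) {
        boolean otherGoesFirst = other.pos < op.pos
                || (other.pos == op.pos && other.site < op.site);
        return otherGoesFirst
                ? new Insert(op.pos + other.text.length(), op.text, op.site)
                : op;
    }

    static String apply(String doc, Insert op) {
        return doc.substring(0, op.pos) + op.text + doc.substring(op.pos);
    }

    public static void main(String[] args) {
        String base = "abc";
        Insert a = new Insert(1, "X", 1);  // made locally at site 1
        Insert b = new Insert(2, "YY", 2); // made locally at site 2

        // Each site applies its own change at once, then the transformed remote one.
        String atSite1 = apply(apply(base, a), transform(b, a));
        String atSite2 = apply(apply(base, b), transform(a, b));
        System.out.println(atSite1.equals(atSite2));
    }
}
```

Each replica applies its local edit without waiting (fast reads, non-blocking writes), and the transform step is what lets the asynchronously delivered remote edit land in the right place so that both replicas end up with the same text.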


My $0.03.

Scott

