Re: [jetty-users] Jetty Client 9 for Iudex crawler

On Wed, 2012-12-12 at 15:50 +0100, Simone Bordet wrote:
> Hi,
> On Wed, Dec 12, 2012 at 2:20 AM, David Kellum <dek94@xxxxxxxxxxxxx> wrote:
> > I'm working on upgrading use of Jetty Client in the Iūdex web crawler to
> > 9.0.0.M3.  Firstly, despite the work of absorbing a rewrite, client 9.x
> > looks to simplify several aspects of my integration.  I like the callback
> > pure-interfaces and move to use nio ByteBuffers (which matches my
> > internals.)  Thanks very much for the open source!
> >
> > I have a couple of questions below, mostly related to the changes in client
> > 9.x vs. client 7.x.  My current Client code is here:
> >
> >
> >
> > Timeouts
> >
> > Client 7.x had settings for timeout, soTimeout, connectTimeout and
> > idleTimeout.  Client 9.x only has idleTimeout and connectTimeout. My timeout
> > related integration tests do appear to work: is idleTimeout essentially used
> > as an soTimeout and (catch all) timeout in client 9.x?
> Documentation link:
> Please report if you miss some documentation sections or things you
> would like to see in the docs.

Thanks, Simone, for the doc link. I hadn't seen this before.

> In 7.x soTimeout was basically unused, especially with the NIO
> connector. With this timeout out of the picture, in 9.x you get the
> same timeouts.
> The idleTimeout works with the same semantics as soTimeout for sockets,
> while the global timeout for the whole request/response conversation
> is either achieved with the Future returned by send()

Saw this in the blog post summary. It raises the question: is the request
actually aborted, and the connection freed, in the case of this Future.get()
timing out?

> or via the utility class TimedResponseListener for asynchronous usage.
> So, idleTimeout and the global timeout have 2 different meanings: you
> can have a slow connection that sends 1 byte every second, so the
> idleTimeout never fires, but the total timeout does fire.

I am using it async, and this was the case that had me concerned about
the seemingly missing global timeout.  Adding a doc section on timeouts
might help others.  Looking at the TimedResponseListener implementation:
shouldn't it cancel itself in onComplete()? I can't find any caller of
Schedulable.cancel() in M3?
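
To make the concern concrete, here is a minimal sketch of a total (global)
timeout for asynchronous use that cancels its own timer once the exchange
completes. It uses only JDK classes; TotalTimeout, guard() and the abort
callback are hypothetical names for illustration, not Jetty APIs, and the
CompletableFuture merely stands in for a real request/response exchange.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Jetty): bounds the total duration of an exchange.
final class TotalTimeout {
    private final ScheduledExecutorService scheduler;

    TotalTimeout(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // Fails `exchange` with a TimeoutException unless it completes within the limit.
    // `abort` is an assumed hook standing in for whatever aborts the real request.
    <T> CompletableFuture<T> guard(CompletableFuture<T> exchange,
                                   long timeout, TimeUnit unit, Runnable abort) {
        ScheduledFuture<?> timer = scheduler.schedule(() -> {
            // Only abort if the timeout actually won the race against completion
            if (exchange.completeExceptionally(new TimeoutException("total timeout expired"))) {
                abort.run();
            }
        }, timeout, unit);
        // The step asked about above: cancel the pending timer on completion,
        // whether the exchange succeeded or failed
        exchange.whenComplete((result, failure) -> timer.cancel(false));
        return exchange;
    }
}
```

Unlike an idle timeout, this fires even when bytes keep trickling in, and
the timer task does not linger in the scheduler after a normal completion.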

> > Retry
> >
> > Client 7.x had a setting for maxRetries.  Is retry as a feature no longer
> > supported in Client 9.x? Any plans to add it?
> Not supported in 9.x.
> Can you explain how you would use it, and how would it work to be
> useful for you ?

I was under the impression that there were certain error conditions with
idempotent requests that the retry feature of client 7.x could
automatically recover from.  An example might be if a (e.g. keep-alive)
socket is closed while sending the request? Or am I misunderstanding
what enabling this feature does in 7.x?
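
As an illustration of the behavior I have in mind, here is a minimal sketch
of application-level retry, assuming nothing about how client 7.x actually
implemented maxRetries. IdempotentRetry and send() are hypothetical names,
and the Callable stands in for one attempt at sending the request. Only
methods that HTTP defines as idempotent are retried, and only on
IOException.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.Callable;

// Hypothetical helper (not a Jetty API): retries idempotent requests on I/O failure.
final class IdempotentRetry {
    // Methods defined as idempotent by the HTTP specification
    private static final Set<String> IDEMPOTENT = new HashSet<>(
            Arrays.asList("GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"));

    // Runs `attempt` once, plus up to `maxRetries` more times after an IOException.
    static <T> T send(String method, int maxRetries, Callable<T> attempt) throws Exception {
        boolean retryable = IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT));
        int failures = 0;
        while (true) {
            try {
                return attempt.call();
            } catch (IOException e) {
                // Give up if the method is not idempotent or the retry budget is spent
                if (!retryable || failures++ >= maxRetries) {
                    throw e;
                }
            }
        }
    }
}
```

With maxRetries set to 2, a GET whose pooled socket was closed by the server
gets up to three attempts in total, while a POST fails on the first error.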

> > Cookies
> >
> > I think I may have a use case for reuse of the new Cookie support,
> > however, this being a crawler I need more control over it.  Is there a way
> > to control what cookies are sent on a per-request basis while still using
> > Jetty's pooled connections?  In other words I would like to introspect
> > cookies from a response, possibly filtering, and then apply these to a
> > subsequent request to the same registration-level domain but possibly via a
> > different connection.   I believe this is not completely unlike how browsers
> > behave.
> >
> > Short of this, is there a way to disable cookie storage and sending
> > entirely, with pooled connections?
> Cookie handling will change in M4, since we decided to base it on JDK
> classes and the like, to avoid code duplication
> and for integration with WebSocket.
> You can inspect cookie headers in Response.HeadersListener callbacks,
> where all headers have arrived.
> To filter cookies that are stored, you can "wrap" the CookieStore
> implementation in this way:
> HttpClient client = ...;
> client.setCookieStore(new FilteringCookieStore(new HttpCookieStore()));
> HttpCookieStore is a utility class from the jetty-util module, and
> FilteringCookieStore will be a class you write that filters cookies
> based on your logic, delegating to the inner CookieStore.
> This is available in current master branch only, and shortly in M4.
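
A minimal sketch of such a wrapper, assuming the store interface ends up
being the JDK's java.net.CookieStore as described for M4. The predicate
parameter is my own addition for illustration, and the JDK's in-memory store
(from java.net.CookieManager) could stand in for HttpCookieStore when trying
this without Jetty on the classpath.

```java
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;
import java.util.function.BiPredicate;

// Sketch of the FilteringCookieStore idea: stores only cookies the predicate accepts
// and delegates everything else to the wrapped store.
final class FilteringCookieStore implements CookieStore {
    private final CookieStore delegate;
    private final BiPredicate<URI, HttpCookie> accept; // assumed filter hook

    FilteringCookieStore(CookieStore delegate, BiPredicate<URI, HttpCookie> accept) {
        this.delegate = delegate;
        this.accept = accept;
    }

    @Override
    public void add(URI uri, HttpCookie cookie) {
        // Silently drop cookies rejected by the filter
        if (accept.test(uri, cookie)) {
            delegate.add(uri, cookie);
        }
    }

    @Override
    public List<HttpCookie> get(URI uri) {
        return delegate.get(uri);
    }

    @Override
    public List<HttpCookie> getCookies() {
        return delegate.getCookies();
    }

    @Override
    public List<URI> getURIs() {
        return delegate.getURIs();
    }

    @Override
    public boolean remove(URI uri, HttpCookie cookie) {
        return delegate.remove(uri, cookie);
    }

    @Override
    public boolean removeAll() {
        return delegate.removeAll();
    }
}
```

For a crawler the predicate could, for example, reject cookies whose domain
falls outside the registration-level domain of the request URI.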

OK I'll wait for M4 before investigating cookie support further. I'd
like to disable cookies with M3, but it looks like this (per doc) is
only on master as well? 

httpClient.setCookieStore(new CookieStore.Empty());

Though I can't find an Empty class in either?
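
If no such class exists, a no-op store seems easy enough to write by hand.
A minimal sketch, again assuming the JDK's java.net.CookieStore interface
(which may not match what M3 expects); EmptyCookieStore is a hypothetical
name, not a Jetty class:

```java
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;
import java.util.Collections;
import java.util.List;

// Hypothetical no-op store: never keeps a cookie, so none are ever sent.
final class EmptyCookieStore implements CookieStore {
    @Override
    public void add(URI uri, HttpCookie cookie) {
        // Intentionally discard every cookie
    }

    @Override
    public List<HttpCookie> get(URI uri) {
        return Collections.emptyList();
    }

    @Override
    public List<HttpCookie> getCookies() {
        return Collections.emptyList();
    }

    @Override
    public List<URI> getURIs() {
        return Collections.emptyList();
    }

    @Override
    public boolean remove(URI uri, HttpCookie cookie) {
        return false; // nothing is ever stored, so nothing can be removed
    }

    @Override
    public boolean removeAll() {
        return false; // the store is always empty, so it never changes
    }
}
```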


