Re: [ecf-dev] E-intro [Was Efficient downloads]
Hi Scott,
- retrieving information from special headers (like
Content-Disposition)
- detecting URL redirections to final mirrors
I'm not sure what you are going to use to implement this, but would
be curious to find out.
If you download a file from a URL, you have to discover the filename
if the user doesn't specify it explicitly. The most precise solution is
parsing the Content-Disposition header if it's available (browsers
use it for determining the name of the file to save). Unlike other
HTTP headers, Content-Disposition has a very complex syntax. We should
be able to parse it properly.
OK. Do all http x.y servers support Content-Disposition? Could you
also point to the spec for it (w3c?) just for my information? And do
you know if Apache httpclient 3.0.1 implements the parsing of
Content-Disposition? If so, then perhaps the existing
org.eclipse.ecf.provider.filetransfer.httpclient could simply be
modified.
This document describes the Content-Disposition syntax:
http://www.faqs.org/ftp/rfc/pdf/rfc2183.txt.pdf
There might be some more official document, I'd have to search for it.
I looked into the HttpClient documentation. I guess that this API could
be used for parsing such a header:
http://jakarta.apache.org/commons/httpclient/apidocs/org/apache/commons/httpclient/HeaderElement.html
(not tested by me yet, I might be wrong).
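To make the parsing problem concrete, here is a minimal sketch that pulls
the filename parameter out of a Content-Disposition value using only the
Java standard library. It is not the HttpClient API mentioned above, and
the class and method names (ContentDispositionSketch, filenameFrom) are
made up for illustration. It assumes only the plain and quoted forms of
the filename parameter; the encoded "filename*" form and other corner
cases of the full syntax are deliberately ignored.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ECF or HttpClient.
final class ContentDispositionSketch {
    // Matches filename="quoted value" (backslash escapes allowed) or filename=token.
    // Assumption: no other quoted parameter contains "; filename=" inside its value.
    private static final Pattern FILENAME = Pattern.compile(
        "(?i)(?:^|;)\\s*filename\\s*=\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^;\\s]+))");

    static Optional<String> filenameFrom(String headerValue) {
        if (headerValue == null) {
            return Optional.empty();
        }
        Matcher m = FILENAME.matcher(headerValue);
        if (!m.find()) {
            return Optional.empty();
        }
        // Group 1 is the quoted form (unescape \x to x), group 2 the bare token.
        String name = m.group(1) != null
            ? m.group(1).replaceAll("\\\\(.)", "$1")
            : m.group(2);
        // Keep only the last path component so a server cannot suggest a directory.
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(cut + 1);
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    public static void main(String[] args) {
        System.out.println(filenameFrom(
            "attachment; filename=\"my-wonderful-piece-of-work-1.0.0.zip\""));
    }
}
```

An empty result means the header gave no usable name and the caller has
to fall back to something else.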
The Content-Disposition header doesn't have to be supported directly by
the server. Let me introduce two basic use cases of pointing to a file
with a URL:
a) Direct download
There's a physical file saved on the server. The web server is able to
serve this file directly. No Content-Disposition header is sent to the
client.
URL example:
http://my.downloads.com/filestorage/my-wonderful-piece-of-work-1.0.0.zip
The browser/application doesn't find the Content-Disposition header, but
it can use the last segment of URL to determine the file name (i.e.
my-wonderful-piece-of-work-1.0.0.zip)
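The direct-download case can be sketched with java.net.URI from the
standard library. The class and method names (UrlFilenameSketch,
lastSegment) are hypothetical, and the sketch assumes the URL is
syntactically valid (URI.create throws IllegalArgumentException
otherwise).

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper, not part of ECF.
final class UrlFilenameSketch {
    static Optional<String> lastSegment(String url) {
        // getPath() returns the decoded path without query string or fragment.
        String path = URI.create(url).getPath();
        if (path == null) {
            return Optional.empty(); // opaque URI such as mailto:
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        // A path ending in '/' (or an empty path) has no usable last segment.
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    public static void main(String[] args) {
        System.out.println(lastSegment(
            "http://my.downloads.com/filestorage/my-wonderful-piece-of-work-1.0.0.zip"));
    }
}
```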
b) Download of a virtual file
There's no physical file saved on the server (the data can be stored as
a blob in a database or generated on demand at the moment of the
download request), or its real location is secret. The web server is
able to serve the data dynamically (using PHP, JSP or whatever else). No
filename is visible in the URL, but Content-Disposition contains
information about the file.
URL example:
http://my.downloads.com/virtualstorage/download.php?id=142355&use_best_mirror=1
Content-Disposition header: attachment; filename="my-wonderful-piece-of-work-1.0.0.zip"
The browser/application finds the Content-Disposition header. The
information retrieved from there has higher priority than any
information from the URL.
What to do if there's no Content-Disposition header in this case? That
is an open question. Saving the file as "download.php" is probably not a
very good idea. There must be some other way to tell the downloader what
to do if no reasonable filename can be retrieved from the URL or HTTP
headers (e.g. specifying an explicit filename that overrides the
automatically determined one).
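One possible resolution order, sketched under assumptions: an explicit
name from the caller wins, then the filename taken from
Content-Disposition, then the last URL segment. Only "header beats URL"
comes from the use cases above; putting the explicit name first is my
guess at how the override would behave. The names (DownloadNameSketch,
resolve) are hypothetical, and the header filename is assumed to have
been extracted already.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper, not part of ECF.
final class DownloadNameSketch {
    /**
     * explicitName: name requested by the caller, or null.
     * headerName:   filename already extracted from Content-Disposition, or null.
     * url:          the download URL (assumed syntactically valid).
     */
    static Optional<String> resolve(String explicitName, String headerName, String url) {
        if (notBlank(explicitName)) {
            return Optional.of(explicitName); // assumed: caller override wins
        }
        if (notBlank(headerName)) {
            return Optional.of(headerName);   // header has priority over the URL
        }
        String path = URI.create(url).getPath();
        if (path == null) {
            return Optional.empty();
        }
        String segment = path.substring(path.lastIndexOf('/') + 1);
        return segment.isEmpty() ? Optional.empty() : Optional.of(segment);
    }

    private static boolean notBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }

    public static void main(String[] args) {
        String url = "http://my.downloads.com/virtualstorage/download.php?id=142355&use_best_mirror=1";
        System.out.println(resolve(null, "my-wonderful-piece-of-work-1.0.0.zip", url));
        System.out.println(resolve(null, null, url));
    }
}
```

With neither an explicit name nor a header, this sketch still falls back
to "download.php" for the virtual-file URL, which is exactly the poor
result described above; that is the situation where the caller would
need to supply the explicit name.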
Now I'm talking about use cases, not about the API. As I've said before,
I have to look into what you already have before I can judge what can
be done right now with the current API and what needs to be added or
modified.
Regards
Filip