Re: [ecf-dev] E-intro [Was Efficient downloads]
Hi Scott,
>>>> - retrieving information from special headers (like
>>>> Content-Disposition)
>>>> - detecting URL redirections to final mirrors
>>> I'm not sure what you are going to use to implement this, but would
>>> be curious to find out.
>> If you download a file from a URL, you have to discover the filename
>> if the user doesn't specify it explicitly. The most precise solution is
>> parsing the Content-Disposition header if it's available (browsers
>> use it for determining the name of the file to save). Unlike most other
>> HTTP headers, Content-Disposition has a very complex syntax. We should
>> be able to parse it properly.
> OK.  Do all http x.y servers support Content-Disposition?  Could you
> also point to the spec for it (w3c?) just for my information?  And do
> you know if Apache httpclient 3.0.1 implements the parsing of
> Content-Disposition?  If so, then perhaps the existing
> org.eclipse.ecf.provider.filetransfer.httpclient could simply be
> modified.
This document describes the Content-Disposition syntax: 
http://www.faqs.org/ftp/rfc/pdf/rfc2183.txt.pdf
There might be a more official document; I'd have to search for it.
I looked into the HttpClient documentation. I guess that this API could 
be used for parsing such a header: 
http://jakarta.apache.org/commons/httpclient/apidocs/org/apache/commons/httpclient/HeaderElement.html
(not tested by me yet, I might be wrong).
The Content-Disposition header doesn't have to be supported directly by
the server. Let me introduce two basic use cases of pointing to a file
with a URL:
a) Direct download
There's a physical file saved on the server. The web server is able to 
serve this file directly. No Content-Disposition header is sent to the 
client.
URL example: 
http://my.downloads.com/filestorage/my-wonderful-piece-of-work-1.0.0.zip
The browser/application doesn't find the Content-Disposition header, but
it can use the last segment of the URL to determine the file name (i.e.
my-wonderful-piece-of-work-1.0.0.zip).
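
A minimal sketch of the "last segment of the URL" fallback, using only
java.net.URI. The class and method names are hypothetical and not part
of ECF or HttpClient; the sketch assumes the URL is syntactically valid
and ignores the query string.

```java
import java.net.URI;

class UrlFileName {
    // Returns the last path segment of the URL, or null when the path
    // is empty or ends with '/' (no usable file name in the URL).
    static String lastSegment(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return null;
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

For the example URL above this yields the zip file name. For a URL that
points at a script it yields the script name, which is why this can
only be a fallback.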
b) Download of a virtual file
There's no physical file saved on the server (the data can be stored as
a blob in a database or generated on demand at the moment of the
download request), or its real location is secret. The web server is
able to serve the data dynamically (using php, jsp or whatever else).
No filename is visible in the URL, but Content-Disposition contains
information about the file.
URL example: 
http://my.downloads.com/virtualstorage/download.php?id=142355&use_best_mirror=1
Content-Disposition header:
attachment; filename="my-wonderful-piece-of-work-1.0.0.zip"
The browser/application finds the Content-Disposition header. The 
information retrieved from there has higher priority than any 
information from the URL.
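
A rough sketch of pulling the filename parameter out of a
Content-Disposition value with a regular expression. This is not a full
RFC 2183 parser: it assumes a single plain filename parameter, either
quoted or as a bare token, and does not handle encoded or continued
parameters. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentDispositionSketch {
    // Group 1: filename="quoted value"; group 2: filename=token.
    private static final Pattern FILENAME = Pattern.compile(
        "(?i)(?:^|;)\\s*filename\\s*=\\s*"
        + "(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^;\\s]+))");

    // Returns the filename parameter, or null if the header value is
    // null or carries no filename parameter.
    static String fileName(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(headerValue);
        if (!m.find()) {
            return null;
        }
        String quoted = m.group(1);
        if (quoted != null) {
            // Undo backslash escapes inside the quoted string.
            return quoted.replaceAll("\\\\(.)", "$1");
        }
        return m.group(2);
    }
}
```

A real implementation would more likely rely on an existing header
parser than on a hand-written pattern like this one.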
What should happen if there's no Content-Disposition header in this
case? That's an open question. Saving the file as "download.php" is
probably not a very good idea. There must be some way to tell the
downloader what to do if no reasonable filename can be retrieved from
the URL or the HTTP headers (e.g. specify an explicit filename that
overrides the automatically detected one).
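
The priority order described in these use cases could be sketched as
follows: an explicit name from the caller wins, then the name from
Content-Disposition, then the last URL segment. The class, method and
parameter names are hypothetical, and the sketch does not try to decide
whether a URL-derived name such as "download.php" is reasonable.

```java
class FileNameResolver {
    // Returns the first non-blank candidate in priority order, or null
    // when none is available and the caller has to supply a name.
    static String resolve(String explicitName, String headerName,
                          String urlName) {
        String[] candidates = { explicitName, headerName, urlName };
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return null;
    }
}
```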
For now I'm talking about use cases, not about the API. As I've said
before, I have to look into what you already have to be able to imagine
what can be done right now with the current API and what needs to be
added or modified.
Regards
 Filip