Eclipse Community Forums
Possible caching issue? [message #823575] Sun, 18 March 2012 10:46
Barbara Rosi-Schwartz
Messages: 448
Registered: July 2009
Senior Member
Hello.

As everybody on this forum probably knows by now, I am running a desktop application on top of a Virgo kernel. The kernel communicates with a remote Virgo provisioning server to dynamically download all the bundles required by whatever piece of functionality the user requests via the application's user interface. I invoke the kernel startup command with the "-clean" option.

I am facing a weird, sporadic and difficult to reproduce problem. Most of the time when the user makes her request, everything ticks along nicely. Very occasionally, however, the system downloads the required bundles to the user's PC but, when it tries to activate them, it falls over with a ClassNotFound exception, even though all required bundles (including the one containing the class that is not found) are present locally. If the user stops the application, removes the local kernel's work directory by hand and restarts the application, everything works fine.

When I look at the timestamp of the work directory, it is what I expect it to be, which seems to indicate that the -clean command line argument has worked correctly.

I think (although I am not completely certain) that the problem tends to occur after a new deployment of bundles to the provisioning server (regardless of whether they are new bundles with a new version or updates of existing bundles).

Any idea as to what this might be caused by?

TIA,
B.
Re: Possible caching issue? [message #824092 is a reply to message #823575] Mon, 19 March 2012 04:53
Glyn Normington
Messages: 1222
Registered: July 2009
Senior Member
My suspicion is that the output stream of a newly downloaded artefact is not closed before (or strictly speaking, in the correct "happens before" relationship to) its use in the bundle class loader.

I assume you are using the normal Virgo mechanism for downloading and caching the remote repository contents. One of the key cache classes is StandardRepositoryCache (in the artefact repository git repo), which uses StandardSingleArtifactCache (SSAC) to cache a specific artefact and its hash. Note that only JARs have hashes computed for them, so only JARs are subject to caching. SSAC.refresh downloads the artefact using Downloader, which in turn uses FileCopyUtils.copy(InputStream, OutputStream), which has the following stream closing logic:

    public static int copy(InputStream in, OutputStream out) throws IOException {
        Assert.notNull(in, "No InputStream specified");
        Assert.notNull(out, "No OutputStream specified");
        try {
            int byteCount = 0;
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead = -1;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
                byteCount += bytesRead;
            }
            out.flush();
            return byteCount;
        }
        finally {
            try {
                in.close();
            }
            catch (IOException ex) {
            }
            try {
                out.close();
            }
            catch (IOException ex) {
            }
        }
    }

So if in.close() throws an unchecked exception, out.close() would not be called. I'm not sure what would cause an unchecked exception in in.close(), but that might be an area to explore.
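
To make the concern concrete, here is a minimal sketch of a copy routine in which a failure while closing one stream cannot stop the other from being closed. It is not Virgo or Spring code: the class name SafeCopy and the buffer size are invented for the example. It assumes Java 7 or later, where try-with-resources closes the resources in reverse declaration order and still attempts the remaining close if one of them throws.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Objects;

// Hypothetical helper for illustration only; not part of Virgo or Spring.
final class SafeCopy {

    // Arbitrary buffer size chosen for this sketch.
    private static final int BUFFER_SIZE = 4096;

    private SafeCopy() {
    }

    /** Copies everything from in to out, then closes both streams. */
    static int copy(InputStream in, OutputStream out) throws IOException {
        Objects.requireNonNull(in, "No InputStream specified");
        Objects.requireNonNull(out, "No OutputStream specified");
        // The sink is closed first, then the source. If either close()
        // throws (checked or unchecked), the other close() is still attempted.
        try (InputStream source = in; OutputStream sink = out) {
            int byteCount = 0;
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead;
            while ((bytesRead = source.read(buffer)) != -1) {
                sink.write(buffer, 0, bytesRead);
                byteCount += bytesRead;
            }
            sink.flush();
            return byteCount;
        }
    }
}
```

One behavioural difference from the code quoted above: exceptions thrown by close() are propagated to the caller here rather than swallowed, so a failed close of the output file would at least become visible in the log.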
Re: Possible caching issue? [message #824115 is a reply to message #824092] Mon, 19 March 2012 05:21
Barbara Rosi-Schwartz
Thank you very much Glyn.

I will have to try and figure out why the exception is thrown, of course, but in the meantime, is there a workaround that I can apply to my code to recover from this?
Re: Possible caching issue? [message #824314 is a reply to message #824115] Mon, 19 March 2012 10:23
Glyn Normington
Remember that we haven't any evidence yet that such an exception is being thrown. (Another possibility, for example, is that the JRE or the Windows file system is not closing the file synchronously under out.close.)

Have you checked the client-side trace log (log.log) for exceptions?

I can't think of a workaround in your code as your code isn't really involved in the failing code path.
Re: Possible caching issue? [message #824355 is a reply to message #824314] Mon, 19 March 2012 11:15
Barbara Rosi-Schwartz
One of the mysterious symptoms is that, once it fails, it continues failing across subsequent Virgo kernel launches, despite using the -clean option each time, until the work directory is wiped out, at which point it starts working fine again.

If the cause is one of those you suggest, it would be more random than that, correct?

And no, there is no exception in the trace log.
Re: Possible caching issue? [message #824390 is a reply to message #824355] Mon, 19 March 2012 12:12
Glyn Normington
Yes, it would be more random. Failing after -clean suggests that the downloaded artefacts are corrupt, but you said earlier that they are intact. It might be worth doing a binary compare of the downloaded JARs and those on the server to make sure they are byte equal.
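
A byte-equality check of this kind could be scripted along the following lines. This is only a sketch: the class JarCompare and its method names are invented for the example, and SHA-256 is an arbitrary choice here, not necessarily the hash that the Virgo cache computes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Virgo.
final class JarCompare {

    private JarCompare() {
    }

    /** Returns the SHA-256 digest of the file's contents. */
    static byte[] sha256(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                digest.update(buffer, 0, bytesRead);
            }
        }
        return digest.digest();
    }

    /** True if both files have the same size and the same digest. */
    static boolean sameContent(Path downloaded, Path reference) throws IOException {
        return Files.size(downloaded) == Files.size(reference)
                && MessageDigest.isEqual(sha256(downloaded), sha256(reference));
    }
}
```

Calling JarCompare.sameContent with a JAR from the client's cache and a copy of the same JAR taken from the server would show whether the download itself is intact.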
Re: Possible caching issue? [message #824402 is a reply to message #824390] Mon, 19 March 2012 12:20
Barbara Rosi-Schwartz
What I have already done is take a snapshot of the work directory at the time of a failure, then delete it by hand and restart the system, which of course succeeded and created a new work directory. I then did a binary comparison of the JARs in the two work directories, and everything appears to be the same.

Why would deleting the work dir by hand fix the problem anyway???

[Updated on: Mon, 19 March 2012 12:29]

Re: Possible caching issue? [message #825399 is a reply to message #823575] Tue, 20 March 2012 16:08
Miles Parker
Messages: 1338
Registered: July 2009
Senior Member
For what it's worth, when you define the -clean option using tooling*, we delete the "work" and "serviceability" directories. So this might be helpful at least until you get the deeper issue worked out. I'm not sure if the -clean option does this on the runtime side as well?

*Server Editor:Overview Page:Server Startup Configuration
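
If the application launcher were to do the same thing before starting the kernel, it might look roughly like the sketch below. The class WorkDirCleaner, its methods and the kernelHome parameter are hypothetical names for this example; only the "work" and "serviceability" directory names come from the description above. Bear in mind that removing "serviceability" also removes the existing logs.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; not part of Virgo or its tooling.
final class WorkDirCleaner {

    private WorkDirCleaner() {
    }

    /** Deletes a directory tree; does nothing if the path does not exist. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // The directory is empty by now, so it can be removed.
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /** Removes the "work" and "serviceability" directories under kernelHome. */
    static void cleanKernelHome(Path kernelHome) throws IOException {
        deleteRecursively(kernelHome.resolve("work"));
        deleteRecursively(kernelHome.resolve("serviceability"));
    }
}
```

This would need to run while the kernel is stopped, since files that are still open may not be deletable on Windows.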