Eclipse Community Forums: Virgo » Possible caching issue?

Possible caching issue? [message #823575]

Sun, 18 March 2012 14:46

Barbara Rosi-Schwartz

Messages: 448
Registered: July 2009

Senior Member

Hello.

As everybody on this forum probably knows by now, I am running a desktop application on top of a Virgo kernel that communicates with a remote Virgo provisioning server to dynamically download all the bundles required by a specific piece of functionality that the user requests via the application's user interface. I invoke the kernel startup command with the "-clean" option.

I am facing a weird, sporadic and difficult-to-reproduce problem. Most of the time when the user makes her request, everything ticks along nicely. Very occasionally, however, the system downloads the required bundles to the user's PC but, when it tries to activate them, it falls over with a ClassNotFoundException, even though all required bundles (including the one containing the class that is not found) are present locally. If the user stops the application, removes the local kernel's work directory by hand and restarts the application, everything works fine.

When I look at the timestamp of the work directory, it is what I expect it to be, which seems to indicate that the -clean command line argument has worked correctly.

I think (although I am not completely certain) that the problem tends to occur after a new deployment of bundles to the provisioning server (regardless of whether they are new bundles, with a new version, or updates of existing bundles).

Any idea as to what this might be caused by?

TIA,
B.

Re: Possible caching issue? [message #824092 is a reply to message #823575]

Mon, 19 March 2012 08:53

Glyn Normington

Messages: 1222
Registered: July 2009

Senior Member

My suspicion is that the output stream of a newly downloaded artefact is not closed before (or strictly speaking, in the correct "happens before" relationship to) its use in the bundle class loader.

I assume you are using the normal Virgo mechanism for downloading and caching the remote repository contents. One of the key cache classes is StandardRepositoryCache (in the artefact repository git repo), which uses StandardSingleArtifactCache to cache a specific artefact and its hash. Note that only JARs have hashes computed for them, so only JARs are subject to caching. StandardSingleArtifactCache.refresh downloads the artefact using Downloader, which in turn uses FileCopyUtils.copy(InputStream, OutputStream), which has the following stream closing logic:

    public static int copy(InputStream in, OutputStream out) throws IOException {
        Assert.notNull(in, "No InputStream specified");
        Assert.notNull(out, "No OutputStream specified");
        try {
            int byteCount = 0;
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead = -1;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
                byteCount += bytesRead;
            }
            out.flush();
            return byteCount;
        }
        finally {
            try {
                in.close();
            }
            catch (IOException ex) {
            }
            try {
                out.close();
            }
            catch (IOException ex) {
            }
        }
    }

So if in.close throws an unchecked exception, out.close would not be called, because only IOException is caught around in.close. I'm not sure what would cause an unchecked exception in in.close, but that might be an area to explore.
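For illustration, here is a minimal sketch of a copy routine without that gap, assuming Java 7 or later. The class name SafeCopy is hypothetical; this is neither Spring's FileCopyUtils nor code that Virgo ships. It only shows that try-with-resources still attempts to close the output stream when closing the input stream throws, even if the exception is unchecked. Unlike the method quoted above, it propagates exceptions from close instead of swallowing them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Objects;

// Hypothetical helper, for illustration only.
final class SafeCopy {
    private static final int BUFFER_SIZE = 4096;

    private SafeCopy() {
    }

    static int copy(InputStream in, OutputStream out) throws IOException {
        Objects.requireNonNull(in, "No InputStream specified");
        Objects.requireNonNull(out, "No OutputStream specified");
        // Resources are closed in reverse declaration order: managedIn first,
        // then managedOut. managedOut.close() is still called if
        // managedIn.close() throws, whether the exception is checked or not.
        try (OutputStream managedOut = out; InputStream managedIn = in) {
            int byteCount = 0;
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead;
            while ((bytesRead = managedIn.read(buffer)) != -1) {
                managedOut.write(buffer, 0, bytesRead);
                byteCount += bytesRead;
            }
            managedOut.flush();
            return byteCount;
        }
    }
}
```

Whether this would change the behaviour seen here is unknown, since there is no evidence yet that in.close actually throws.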

Re: Possible caching issue? [message #824115 is a reply to message #824092]

Mon, 19 March 2012 09:21

Barbara Rosi-Schwartz

Messages: 448
Registered: July 2009

Senior Member

Thank you very much Glyn.

I will have to try and figure out why the exception is thrown, of course, but in the meantime, is there a workaround that I can apply to my code to recover from this?

Re: Possible caching issue? [message #824314 is a reply to message #824115]

Mon, 19 March 2012 14:23

Glyn Normington

Messages: 1222
Registered: July 2009

Senior Member

Remember that we haven't any evidence yet that such an exception is being thrown. (Another possibility, for example, is that the JRE or the Windows file system is not closing the file synchronously under out.close.)

Have you checked the client-side trace log (log.log) for exceptions?

I can't think of a workaround in your code as your code isn't really involved in the failing code path.

Re: Possible caching issue? [message #824355 is a reply to message #824314]

Mon, 19 March 2012 15:15

Barbara Rosi-Schwartz

Messages: 448
Registered: July 2009

Senior Member

One of the mysterious symptoms is that, once it fails, it continues failing across subsequent Virgo kernel launches, despite using the -clean option each time, until the work directory is wiped out, at which point it starts working fine again.

If the cause is one of those you suggest, it would be more random than that, correct?

And no, there is no exception in the trace log.

Re: Possible caching issue? [message #824390 is a reply to message #824355]

Mon, 19 March 2012 16:12

Glyn Normington

Messages: 1222
Registered: July 2009

Senior Member

Yes, it would be more random. Failing after -clean suggests that the downloaded artefacts are corrupt, but you said earlier that they are intact. It might be worth doing a binary compare of the downloaded JARs and those on the server to make sure they are byte equal.
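One way to run such a byte-equality check is to compare file sizes and digests, as in the sketch below (assuming Java 7 or later). The class name JarCompare is hypothetical, and SHA-256 is chosen only for illustration; it is not necessarily the hash that the Virgo repository cache computes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class JarCompare {
    private JarCompare() {
    }

    // Returns the SHA-256 digest of the file as a lowercase hex string.
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True when both files have the same size and the same digest.
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return Files.size(a) == Files.size(b) && sha256(a).equals(sha256(b));
    }
}
```

Running it over each downloaded JAR and the corresponding file on the provisioning server would confirm or rule out a corrupt download.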

Re: Possible caching issue? [message #824402 is a reply to message #824390]

Mon, 19 March 2012 16:20

Barbara Rosi-Schwartz

Messages: 448
Registered: July 2009

Senior Member

What I have already done is take a snapshot of the work directory at the time of a failure. I then deleted it by hand and restarted the system, which of course succeeded and created a new work directory. I then did a binary comparison of the JARs in the two work directories, and everything appears to be the same.

Why would deleting the work dir by hand fix the problem anyway???

[Updated on: Mon, 19 March 2012 16:29]

Re: Possible caching issue? [message #825399 is a reply to message #823575]

Tue, 20 March 2012 20:08

Miles Parker

Messages: 1341
Registered: July 2009

Senior Member

For what it's worth, when you define the -clean option using tooling*, we delete the "work" and "serviceability" directories. So this might be helpful at least until you get the deeper issue worked out. I'm not sure if the -clean option does this on the runtime side as well?

* Server Editor: Overview Page: Server Startup Configuration
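As a stopgap along the same lines, a launcher could delete those directories itself before starting the kernel. The sketch below assumes Java 8 or later; the class name WorkDirCleaner is hypothetical, and the location of the "work" and "serviceability" directories relative to the kernel home is an assumption to check against the actual installation. It is not a description of what the runtime's -clean option does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
final class WorkDirCleaner {
    private WorkDirCleaner() {
    }

    // Deletes dir and everything beneath it; does nothing if dir is absent.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse path order puts children before their parent directory.
            deepestFirst = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }

    // kernelHome is assumed to contain the "work" and "serviceability" directories.
    static void cleanBeforeLaunch(Path kernelHome) throws IOException {
        deleteRecursively(kernelHome.resolve("work"));
        deleteRecursively(kernelHome.resolve("serviceability"));
    }
}
```

This only automates the manual step already known to clear the failure; it does not explain why the failure survives a -clean start.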
