Discussion:
PERL_INLINE_JAVA_SHARED_JVM and mod_perl
Desilets, Alain
2012-10-04 15:16:32 UTC
Hi all,

This is my first post on this list.

I have a pretty complex problem to solve, and I was wondering if some of you have some advice for me.

I have a Perl CGI script running under mod_perl. The script uses WWW::HtmlUnit to do some "on the fly crawling" of a small set of pages. In turn, WWW::HtmlUnit uses Inline::Java to connect to the Java implementation of HtmlUnit.

We have set PERL_INLINE_JAVA_SHARED_JVM=1, because we don't want to start a new JVM for every request. Plus, the documentation of Inline::Java says that this variable is meant specifically for use under mod_perl (so we assumed that it would not work under mod_perl without it).

Shortly after we started using HtmlUnit in our script, we noticed that it would stop working after a while, with an error to the effect that the JVM ran out of memory (I don't have the exact message anymore). Once the error has occurred, it recurs on every invocation of the script until we restart the shared JVM by closing all the applications that use it (in our case, the Apache service and Eclipse).

Our current understanding is that this happens because our script keeps accumulating HtmlUnit objects for which closeAllWindows() is never called, which, from what I have seen on the web, causes memory to keep growing in the JVM.

We're in the process of plugging that memory leak, but it has me worried about using PERL_INLINE_JAVA_SHARED_JVM=1, because it essentially means that if one process fills up the shared JVM, then none of the other processes (either existing or future) will work. In contrast, with a non-shared JVM, a process might fill its own JVM and then die, but that would not make all the other processes fail.
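
To see how close a JVM is to its heap limit before it actually runs out, the standard java.lang.Runtime counters can be queried from inside that JVM. The following is only a minimal sketch: the class name HeapHeadroom, its method names, and the 0.9 threshold are hypothetical, and it assumes the class is loaded into the same JVM whose memory is of interest.

```java
public class HeapHeadroom {
    // Fraction of the maximum heap currently in use (0.0 to 1.0).
    // If the JVM reports no heap limit, maxMemory() is Long.MAX_VALUE
    // and the result is close to 0.
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // True when heap usage has reached the given fraction of the maximum.
    static boolean nearLimit(double threshold) {
        return usedFraction() >= threshold;
    }

    public static void main(String[] args) {
        System.out.printf("heap in use: %.1f%%%n", usedFraction() * 100);
        // 0.9 is an arbitrary, illustrative threshold.
        if (nearLimit(0.9)) {
            System.out.println("heap is nearly full; consider restarting this JVM");
        }
    }
}
```

A check like this does not fix a leak; it only gives an earlier warning than the out-of-memory error itself.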

But I made a couple of attempts to use Inline::Java without the shared JVM in a script that runs under mod_perl, and it doesn't seem to work (as the Inline::Java doc implies).

So my question is twofold:


- Is it possible to run Inline::Java under mod_perl, without using a shared JVM (i.e. without setting PERL_INLINE_JAVA_SHARED_JVM=1)?

- Even if it is "possible", what would be the downside of doing this? I would guess:

o Higher memory consumption (but the advantage is that if one process runs out of JVM memory, it won't affect the other processes)

o Overhead of starting the JVM happens N times, where N is the number of child processes that Apache allows (instead of 1 time if the JVM is shared)



Thanks.



Alain Désilets

Research Officer
Information and Communication Technology
National Research Council Canada
Telephone: 613-993-0610 | Fax: 613-952-0215
alain.desilets-GPT7cTdnlGT+***@public.gmane.org
Patrick LeBoutillier
2012-10-04 23:28:32 UTC
Hi Alain,

It's been a good while since I've worked with Inline::Java in a
CGI/mod_perl environment, but here's what I can tell you:

On Thu, Oct 4, 2012 at 11:16 AM, Desilets, Alain wrote:
> Hi all,
> ...
> - Is it possible to run Inline::Java under mod_perl, without
> using a shared JVM (i.e. without setting PERL_INLINE_JAVA_SHARED_JVM=1)?

I /think/ the answer is yes, but it may be tricky. What platform are you
running all this on?

Basically, if you don't set PERL_INLINE_JAVA_SHARED_JVM=1, each Perl
script/Apache child will try to start its own JVM server. The catch is
that communication with the JVM server goes over a TCP socket. If your
platform supports the "next available port number" feature, each JVM
server will allocate a new port number to listen on. If it doesn't
(i.e. Windows), then all servers will try to start up on the same port
number (7890) and you will get errors.
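
The two cases above can be illustrated with plain Java sockets. This is only a sketch of the general socket behaviour, not Inline::Java's actual server code; the class and method names are hypothetical. Asking for port 0 lets the operating system pick a free port, while a second listener on a port that is already taken is rejected.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

public class PortAllocationDemo {
    // Port 0 asks the operating system for any free port
    // (the "next available port number" case).
    static int bindToNextAvailablePort() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            return server.getLocalPort();
        }
    }

    // Opens one listener, then tries to open a second one on the same
    // port. Returns true if the second bind is rejected.
    static boolean secondBindFails() throws IOException {
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return false;
            } catch (BindException e) {
                return true; // typically "Address already in use"
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("OS-assigned port: " + bindToNextAvailablePort());
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

The second case is the kind of failure to expect when several Apache children each start a JVM server on the same fixed port.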

What errors are you getting when you try it out?
> - Even if it is "possible", what would be the downside of doing this? I would guess:
> o Higher memory consumption (but the advantage is that if one process
> runs out of JVM memory, it won't affect the other processes)
> o Overhead of starting the JVM happens N times, where N is the number of
> child processes that Apache allows (instead of 1 time if the JVM is shared)

I think that pretty much sums it up. You can also configure your Apache
children to die after a certain number of requests, to make sure you never
reach your memory limit.

Patrick
--
=====================
Patrick LeBoutillier
Rosemère, Québec, Canada