Desilets, Alain
2012-10-04 15:16:32 UTC
Hi all,
This is my first post on this list.
I have a pretty complex problem to solve, and I was wondering if some of you have some advice for me.
I have a Perl CGI script running under mod_perl. The script uses WWW::HtmlUnit to do some "on the fly crawling" of a small set of pages. In turn, WWW::HtmlUnit uses Inline::Java to connect to the Java implementation of HtmlUnit.
We have set PERL_INLINE_JAVA_SHARED_JVM=1, because we don't want to start a new JVM for every request. Plus, the documentation of Inline::Java says that this variable is meant specifically for use under mod_perl (so we assumed that it would not work under mod_perl without it).
Shortly after we started using HtmlUnit in our script, we noticed that it would stop working after a while, with an error to the effect that the JVM ran out of memory (I don't have the exact message anymore). Once the error has occurred, it recurs on every invocation of the script until we restart the shared JVM by closing all the applications that use it (in our case, the Apache service and Eclipse).
Our current understanding is that this happens because our script keeps accumulating HtmlUnit objects for which closeAllWindows() is never called, which, from what I have seen on the web, causes memory to keep growing in the JVM.
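To make the kind of fix concrete, here is a minimal sketch of the cleanup pattern, not our actual code. It assumes the client object exposes getPage() and closeAllWindows() as HtmlUnit's WebClient does; FakeWebClient and with_client are hypothetical names used only so the sketch runs without a JVM.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a WWW::HtmlUnit client so the sketch runs
# without a JVM. Only the method names getPage() and closeAllWindows()
# mirror HtmlUnit's WebClient; everything else is made up.
package FakeWebClient;

sub new { my ($class) = @_; return bless { closed => 0 }, $class; }

sub getPage {
    my ($self, $url) = @_;
    die "cannot fetch $url\n" if $url =~ /unreachable/;
    return "content of $url";
}

sub closeAllWindows { my ($self) = @_; $self->{closed}++; return; }

package main;

# Run $work->($client), then always call closeAllWindows(), even when
# $work dies. The original exception is rethrown after the cleanup.
sub with_client {
    my ($client, $work) = @_;
    my $result;
    my $ok  = eval { $result = $work->($client); 1 };
    my $err = $@;
    $client->closeAllWindows();
    die $err unless $ok;
    return $result;
}

my $client = FakeWebClient->new;
my $page = with_client($client, sub { $_[0]->getPage('http://example.com/') });
print "$page (closed $client->{closed} time(s))\n";
```

With the real module, the assumption is that the object returned by WWW::HtmlUnit->new would take the place of FakeWebClient, so that a failed fetch can no longer skip the cleanup call.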
We're in the process of plugging that memory leak, but it has me worried about using PERL_INLINE_JAVA_SHARED_JVM=1, because it essentially means that if one process fills up the shared JVM, then none of the other processes (existing or future) will work. In contrast, with a non-shared JVM, a process might fill its own JVM and then die, but that would not make all the other processes fail.
However, I made a couple of attempts to use Inline::Java without the shared JVM in a script that runs under mod_perl, and it doesn't seem to work (as the Inline::Java doc implies).
So my question is twofold:
- Is it possible to run Inline::Java under mod_perl without using a shared JVM (i.e. without setting PERL_INLINE_JAVA_SHARED_JVM=1)?
- Even if it is "possible", what would be the downside of doing this? I would guess:
o Higher memory consumption (but the advantage is that if one process runs out of JVM memory, it won't affect the other processes)
o Overhead of starting the JVM happens N times, where N is the number of child processes that Apache allows (instead of 1 time if the JVM is shared)
Thanks.
Alain Désilets
Agent de recherche
Technologies de l'information et des communications
Conseil national de recherches Canada
Tél. : 613-993-0610 | Téléc. : 613-952-0215
alain.desilets-GPT7cTdnlGT+***@public.gmane.org
Research Officer
Information and Communication Technology
National Research Council Canada
Telephone: 613-993-0610 | Fax: 613-952-0215
alain.desilets-GPT7cTdnlGT+***@public.gmane.org