Intersession Caching of Downloaded Java Classes

David Wallace Croft

1998-04-01


1997-10-04

Java Developers:

Abstract:  I propose that the implementation of inter-session
class caching mechanisms in Java applications and browsers
is the technical solution to the current raging debate
concerning Java standardization and monopolization.
Efficiencies would be improved in both the developer market
and in the consumption of computer resources.

Microsoft has recently made its position clear with regard
to Java 1.1 and 1.2 support:  it will not support the core
components of RMI, JNI, and, most important to my mind,
the JFC.  This is somewhat understandable as Microsoft
is fighting to preserve its monopoly; truly portable
run-time code makes the choice of the operating system
somewhat irrelevant.

http://www.javaworld.com/jw-10-1997/jw-10-javalobby.html

Looking farther down the road, we as developers have to
be concerned about the prospect of an effective monopoly
from Sun.  The number of standard (core) Java packages
and classes required of a compliant Java implementation
increased from 8 and 211 in Java 1.0 to 23 and 503 in
Java 1.1, respectively.  If we see another doubling
of those numbers with the release of Java 1.2, one has
to wonder if Sun isn't simply creating its own virtual
monopoly by continuously adding new components to the
core Java specification which other companies must struggle
to implement, continuously lagging behind Sun's
self-awarded head start.

In my mind, the most significant advantage of the Java
programming language, the feature that makes it a 
revolutionary leap above and beyond all other programming
languages to date, is its guaranteed cross-platform
run-time portability.  Having a sufficiently rich set of
classes as standard available components is vital to the
wide-spread acceptance of any portable language.  Whereas 
I applaud Sun's apparent movement toward including 
specifications in the Java core for a vast number of APIs
including what would normally be considered operating 
system functionality, I have to wonder what risks we as 
developers are taking by allowing the Java language to 
become too bloated.  At some point it has to stop.

One always has the option of using non-core classes
in any Java application or applet.  The Java language
does not restrict the use of the IFC or the AFC simply
because the JFC has become standardized.  However, having
any particular set of classes pre-installed on the
client is such a significant performance improvement, in
terms of class downloading time and access to trusted
native code optimizations, that it is clear one standard
will come to dominate.  My vote is for cross-platform,
cross-vendor solutions, of which I believe the JFC will be one.

What do we do, however, when several competitors are
offering cross-platform, cross-vendor specifications?
If we simply wait until Sun incorporates a variant of
one of the specifications into its core, we as developers
will be punishing future innovators in the market.
Whereas we might rely on the goodwill and generosity
of Sun to work with competing vendors of superior
technologies to integrate the APIs into future versions
of Java, we know that the small vendors will lose
their time-to-market advantage in the process.  What
we need is a technical fix.

I propose that we rewrite our class loader implementations
in our browsers and our Java applications to cache
Java classes between sessions.  Infrequently used
classes would be automatically purged from local storage
over time, according to algorithms and user-selected
settings that constrain long-term storage consumption while
still meeting class downloading time requirements.  Native code
optimizations, once accepted given a deliberate
acknowledgment of trust by the receiver, would remain
resident over multiple sessions until expired through
lack of use.  End users would have the option to select
which classes to permanently maintain, which would
normally but not necessarily include a significant number
of the core Java classes, and which would be expired over
time according to some heuristic.  End users would keep
the ability, as an option, to always expire classes at the
termination of an application or session as is done now
for non-pre-installed classes.  The new inherent class
versioning feature of Java 1.1 could be used to ensure that
cached classes are updated as needed between sessions or
possibly even within a session at the launch of a new class
loader.
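
A minimal sketch of the purging idea above, in Java, may make it
more concrete.  Everything named here is an assumption made for
illustration: the class ClassCacheIndex, the byte budget, and the
least-recently-used policy are hypothetical and not part of any
existing class loader.  It tracks only class names and sizes, not
the bytecode itself, and treats "permanently maintained" classes
as pinned entries that are never purged.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical index of cached classes; tracks sizes only, not the bytes.
final class ClassCacheIndex {
    private static final class Slot {
        final long sizeBytes;
        final boolean pinned; // true if the user chose to keep this class permanently

        Slot(long sizeBytes, boolean pinned) {
            this.sizeBytes = sizeBytes;
            this.pinned = pinned;
        }
    }

    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Slot> slots =
            new LinkedHashMap<>(16, 0.75f, true);
    private final long budgetBytes;
    private long usedBytes;

    ClassCacheIndex(long budgetBytes) {
        this.budgetBytes = budgetBytes;
    }

    // Record a newly downloaded class, then purge if over budget.
    void put(String className, long sizeBytes, boolean pinned) {
        Slot old = slots.remove(className);
        if (old != null) {
            usedBytes -= old.sizeBytes;
        }
        slots.put(className, new Slot(sizeBytes, pinned));
        usedBytes += sizeBytes;
        purge();
    }

    // Mark a class as just used; returns false on a cache miss.
    boolean touch(String className) {
        return slots.get(className) != null;
    }

    // Membership test that does not count as a use.
    boolean contains(String className) {
        return slots.containsKey(className);
    }

    long usedBytes() {
        return usedBytes;
    }

    // Evict unpinned classes, least recently used first, until within budget.
    private void purge() {
        Iterator<Map.Entry<String, Slot>> it = slots.entrySet().iterator();
        while (usedBytes > budgetBytes && it.hasNext()) {
            Slot slot = it.next().getValue();
            if (!slot.pinned) {
                usedBytes -= slot.sizeBytes;
                it.remove();
            }
        }
    }
}
```

Note that if the pinned classes alone exceed the budget, this sketch
simply stays over budget; a real implementation would have to decide
whether to warn the user or refuse the pin.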

Microsoft sparked this current conflict with the decision
to remove hundreds of Java applets from its web pages citing
the bandwidth consumption and delays.  Regardless of their
true motivations in this action, I find that there is a
significant element of truth to this statement.  I myself
am tempted to pull an applet off of my own home page after
seeing the same classes downloaded every single time I open
my browser.  Over time, the delay adds up to a costly
expenditure in terms of both man-hours and bandwidth.
Caching downloaded classes between sessions would solve this
problem.

Indeed, this leaves the door open to a superior class library
gaining a de facto advantage over established or standardized
class libraries supported by one or some of the early players:
older, inferior class libraries will make fewer and fewer cache
hits over the Internet, requiring the expenditure of bandwidth
and time.  Caching classes between sessions
promotes competitiveness, rewards innovation and efficiency,
and returns freedom of choice to the developers.  As it now
stands, even if a cross-platform, cross-vendor class that
consumes 1/10 of the resources in byte size and execution time
is available, there is a strong force against using anything
other than the established, pre-installed Java classes.

One might consider that the law of increasing returns
would play out in this proposed scenario as well, simply
replacing favoritism toward those class libraries with the
best marketing and largest pre-installed base with favoritism
toward those with the best marketing and largest cached
base.  This is not true, however, as those applications
and applets with loyal repeat customers, such as for
in-house applications and popular innovative applets,
will have in effect created a niche market where the use
of their class libraries can survive until they can
pass the threshold required to be recognized and accepted
as a superior product by the mass market.

As an example of this, consider a Java game engine class library.
Rather than relying on the pre-installation of classes through the
expensive and limited distribution of a CD-ROM, game developers
can deliver hundreds of kilobytes of class and data files to
the players over the Internet once, and only once, with the
reliance upon the user's efficient caching class loader.
Necessary class version updates would still take place and the
end user, not the distributor of a CD-ROM, would then have the
option of deciding just how much long-term storage he wanted to
dedicate to the game given his preferences for minimizing download
time, reliance on optimizing smart caching mechanisms, and
frequency of play.  On the advice of the game provider, the
player might even opt to increase cache storage or mark the
downloaded classes in the library as immune from purges for
a fixed duration.
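
The "immune from purges for a fixed duration" option could be
expressed as a small rule, sketched here in Java.  The class
PurgePolicy, the maxIdle setting, and the pinnedUntil timestamp are
hypothetical names chosen for illustration; the rule assumes a class
expires only after it has gone unused for longer than maxIdle and
any pin on it has lapsed.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical purge rule: expire on lack of use, unless still pinned.
final class PurgePolicy {
    private final Duration maxIdle; // how long a class may go unused

    PurgePolicy(Duration maxIdle) {
        this.maxIdle = maxIdle;
    }

    // pinnedUntil may be null, meaning the class was never pinned.
    boolean shouldPurge(Instant lastUsed, Instant pinnedUntil, Instant now) {
        if (pinnedUntil != null && now.isBefore(pinnedUntil)) {
            return false; // the user marked it immune for now
        }
        // Purge only when the idle time strictly exceeds the limit.
        return Duration.between(lastUsed, now).compareTo(maxIdle) > 0;
    }
}
```

A player following the game provider's advice would set pinnedUntil
some weeks ahead; once that date passes, the ordinary idle-time rule
applies again.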

Continuing this example, which is applicable to the vast
number of Java development industries, successful game engines
would have an advantage due to the increased probability of
successful cache hits.  This promotes licensing opportunities
to game engine developers separate and apart from the game
content developers.  The layering of an industry, just as
in software engineering, promotes efficiency, reliability,
and increased application opportunities.  One might even imagine
that a popular public domain game engine, seeded by the
downloads from academic universities over time, might be adopted
by a number of for-profit game developers due to its established
customer base.  These same industries might even find that
there is less disincentive to collaborate on a standard set
of class libraries now that the monopolistic advantage of
pre-installing class libraries has been removed.

To summarize, the use of efficient inter-session Java class
caching mechanisms would improve resource consumption efficiency,
return choice to developers, and make this whole Java language
standardization and monopolization debate moot.

David Wallace Croft, croft@alumni.caltech.edu
http://www.alumni.caltech.edu/~croft/research/java/cache/


1997-10-10

Java Developers:

I suggest that it would serve our best interests if major
browsers would cache JAR files between sessions.

Earlier I had suggested that Java classes should be cached
between sessions to prevent stifling developer innovation.
As it currently stands, the makers of a particular browser
can pre-install a proprietary Java class library on the
client and gain a significant advantage over distributors
of classes which must be downloaded with each new session.
http://www.alumni.caltech.edu/~croft/research/java/cache/

After reading the e-mail discussions on this topic, I see
that there are technical issues to be resolved with regard
to security, namespace conflicts, and version control.
While I am certain that these problems can be worked out
for the distribution of individual class files if there is
a will, I have recently come to the conclusion that the
intersession caching of JAR files may make resolving these
issues trivial.

By quickly comparing just the hash digest of a requested
JAR file to that of a cached JAR file, the browser can avoid
a download while remaining reasonably assured that security,
namespace, or version problems are highly improbable.  So long as a
hash digest can be recomputed and validated the first time
a JAR file is downloaded and cached, it should be quite
safe to download a large foundation class library from any
random server and use it as called by applets from another
server, trusted or untrusted.

If that hypothetical JAR file were identically copied on
a multitude of servers, only the first web server encountered
delivering an HTML page with an applet tag requiring that
specific archive would actually have to deliver it to the
client browser.  From then on, it would not need to be
downloaded as it would already be cached.

To sum it up, if the JAR hash digests match, use the cache.
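
The rule can be sketched in a few lines of Java.  This is only an
illustration under stated assumptions: the class JarDigestCache is
hypothetical, the cache is an in-memory map rather than disk storage,
SHA-256 is my choice of digest algorithm, and I assume the requesting
page somehow advertises the digest of the JAR it wants (how it would
do so is not specified here).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical JAR cache keyed by hash digest rather than by URL.
final class JarDigestCache {
    private final Map<String, byte[]> jarsByDigest = new HashMap<>();

    // Hex-encoded SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // First download: recompute the digest and cache only if it matches.
    boolean store(String advertisedDigest, byte[] jarBytes) {
        if (!sha256Hex(jarBytes).equals(advertisedDigest)) {
            return false; // corrupted or substituted file; do not cache
        }
        jarsByDigest.put(advertisedDigest, jarBytes.clone());
        return true;
    }

    // Later requests from any server: a digest match means no download.
    byte[] lookup(String advertisedDigest) {
        byte[] cached = jarsByDigest.get(advertisedDigest);
        return cached == null ? null : cached.clone();
    }
}
```

Because the key is the digest and not the server address, a copy
fetched from one server satisfies a request made by an applet on any
other server, which is the point of the proposal.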

David Wallace Croft
Mountaineer Java Users Group (MtJUG) Moderator


1998-04-01

David Macias, Group Marketing Manager, Sun:

We talked briefly immediately after the online Java gaming
session Friday at the JavaOne conference.  I had wanted to
bring to your attention how crucial it is to the online
Java gaming industry for JavaSoft to provide "persistent JAR
caching" in Java 1.3 or earlier.  As you requested, I am
e-mailing you with some information that you may use to study
this opportunity.

In a nutshell, persistent JAR caching is the ability to cache
Java bytecode class files on the client disk drive in the same
manner that your web browser caches graphics images and web
pages.  This is crucial for "thin pipe" clients such as
applet users coming in over a modem or even mobile computers.
As it currently stands, requiring a download of all of the
Java resources over a slow modem every time an applet is
loaded, even on a daily basis, severely hinders online Java
gaming development as well as other applet applications.

Bytecode class file caching could have a significant impact on
the acceptance of Java especially if it is introduced in the
JDK.  With Netscape introducing their new Open API for replacing
the virtual machine in their browser with one from JavaSoft,
end-users would suddenly see a dramatic improvement in loading
applets by simply making the virtual machine switch to a JDK
with persistent JAR caching.  This would make Java far more
palatable to many browser users.

To be clear here, persistent JAR caching makes the distribution
problem go away.  An end-user can spend a few hours downloading
a Java 3D interactive applet game the first time the program is run
but have it load instantly every day thereafter.  The bytecode
class files will be updated in the cache only as needed,
eliminating any need for a pre-install or update of standard Java
extensions or third-party libraries for performance.

For more extensive arguments on how this technology
could drive this industry, see
http://www.alumni.caltech.edu/~croft/research/java/cache/


--
David Wallace Croft, Senior Intelligent Systems Engineer